Actor-Critic without Actor: Critic-Guided Denoising for RL
Best AI papers explained - A podcast by Enoch H. Kang

This paper introduces Actor-Critic without Actor (ACA), a lightweight reinforcement learning framework positioned as an efficient alternative to traditional actor-critic methods. ACA eliminates the explicit actor network, instead generating actions from the gradient field of a noise-level critic via a diffusion-style denoising process. This reduces algorithmic and computational overhead relative to both standard and diffusion-based actor-critic approaches: ACA requires substantially fewer parameters while achieving competitive performance on online RL benchmarks such as the MuJoCo tasks. A key feature is the noise-level critic, which conditions value estimates on the diffusion timestep; this stabilizes gradients and keeps the policy immediately aligned with the critic's latest value updates while preserving multi-modal action coverage. Overall, ACA offers a simplified, expressive, and parameter-efficient approach to online reinforcement learning.
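
To make the critic-guided denoising idea concrete, here is a minimal PyTorch sketch: an action is drawn by starting from Gaussian noise and repeatedly nudging it along the gradient of a noise-level critic Q(s, a, t). The network sizes, step count, step size, and the final tanh squashing are illustrative assumptions, not the paper's exact design.

```python
# A minimal sketch of critic-guided action denoising, assuming a noise-level
# critic Q(s, a, t). Shapes and hyperparameters below are illustrative.
import torch
import torch.nn as nn

class NoiseLevelCritic(nn.Module):
    """Value estimate conditioned on state, action, and diffusion timestep."""
    def __init__(self, state_dim, action_dim, n_steps):
        super().__init__()
        self.t_embed = nn.Embedding(n_steps, 16)
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + 16, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action, t):
        return self.net(torch.cat([state, action, self.t_embed(t)], dim=-1))

def sample_action(critic, state, action_dim, n_steps=10, step_size=0.1):
    """Generate actions by denoising noise along the critic's gradient field."""
    action = torch.randn(state.shape[0], action_dim)  # start from pure noise
    for t in reversed(range(n_steps)):
        action = action.detach().requires_grad_(True)
        t_batch = torch.full((state.shape[0],), t, dtype=torch.long)
        q = critic(state, action, t_batch).sum()
        grad = torch.autograd.grad(q, action)[0]   # gradient of the noise-level critic
        action = action + step_size * grad         # move toward higher estimated value
    return torch.tanh(action.detach())             # squash into a bounded action range

# Example usage with made-up dimensions:
critic = NoiseLevelCritic(state_dim=17, action_dim=6, n_steps=10)
state = torch.randn(4, 17)
actions = sample_action(critic, state, action_dim=6)
```

Because the sampler queries the critic directly at every denoising step, any update to the critic immediately changes the action distribution, which is the mechanism behind the "immediate alignment" property described above.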