
This academic paper introduces Non-Stationary Natural Actor-Critic (NS-NAC), a novel model-free, policy-based reinforcement learning algorithm designed for time-varying environments in which rewards and transition probabilities change. Traditional reinforcement learning often assumes a stationary environment, but real-world applications frequently involve dynamic systems. NS-NAC addresses this by incorporating restart-based exploration and adaptive learning rates to balance forgetting outdated information with learning the new environmental dynamics. The paper also presents BORL-NS-NAC, a parameter-free extension that requires no prior knowledge of how much the environment varies, and provides theoretical guarantees for both algorithms through dynamic regret analysis, supported by empirical simulations.
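To give a feel for the restart idea described in the summary, here is a minimal sketch of a tabular actor-critic loop on a toy drifting-reward environment with periodic restarts. Everything in it is illustrative: the toy environment, the constants (`restart_every`, `alpha`, `beta`), and the vanilla (non-natural) policy-gradient update are assumptions for exposition, not the NS-NAC algorithm or step-size schedule specified in the paper.

```python
import numpy as np

# Illustrative sketch (not NS-NAC itself): tabular actor-critic on a toy
# non-stationary MDP, with periodic restarts so stale estimates are forgotten.

n_states, n_actions = 4, 2
rng = np.random.default_rng(0)

def step(state, action, t):
    # Toy non-stationary environment: the reward drifts slowly with time t.
    reward = np.sin(0.001 * t + state) + 0.1 * action
    next_state = rng.integers(n_states)
    return next_state, reward

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

T = 20_000
restart_every = 2_000        # restart-based exploration: reset estimates periodically
alpha, beta = 0.05, 0.01     # critic / actor step sizes (constants chosen for illustration)

theta = np.zeros((n_states, n_actions))   # policy parameters (softmax policy)
V = np.zeros(n_states)                    # critic: state-value estimates
state = 0
for t in range(T):
    if t % restart_every == 0:            # periodic restart of actor and critic
        theta[:] = 0.0
        V[:] = 0.0
    probs = softmax(theta[state])
    action = rng.choice(n_actions, p=probs)
    next_state, reward = step(state, action, t)
    td_error = reward + 0.99 * V[next_state] - V[state]
    V[state] += alpha * td_error          # critic update
    grad = -probs
    grad[action] += 1.0                   # d log pi(a|s) / d theta for a softmax policy
    theta[state] += beta * td_error * grad  # actor update (vanilla gradient, not natural)
    state = next_state
```

The sketch uses a plain policy gradient where the paper uses the natural gradient, and fixed step sizes where the paper derives adaptive ones; it only illustrates how restarts let a learner discard outdated estimates in a changing environment.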