Posts
Empowerment: A New Reward-Free Paradigm for Human-AI Collaboration?
Introducing CleanIL for Imitation and Inverse Reinforcement Learning
Observations and Implementation Tricks for Imitation and Inverse Reinforcement Learning
Resource Rational Adaptive Inference Time Compute
RNNs are Switching State Space Models?
Simple Alchemy for Meta Reinforcement Learning
A Tutorial on Dual Reinforcement Learning - Mostly Intuitions
Do We Need Reward in RLHF? DPO and the Unlikelihood Family Curse
Why do we need RLHF? Imitation, Inverse RL, and the role of reward
On the Exploration-Exploitation Tradeoff and Identifying Epistemic Actions in POMDPs
Another Attempt to Rationalize Expected Free Energy: Insights From Reinforcement Learning
Making Sense of Active Inference: Optimal Control Without Cost Function
The Uniqueness of Agent Beliefs in Meta and Bayesian Reinforcement Learning
The Uniqueness of Meta Learning and Autoregressive Pre-training
Bayesian Theory of Mind for RLHF: Towards Richer Human Models for Alignment
Dummy First Post