Posts
- Notes on Linear Gaussian Models
- Empowerment: A New Reward-Free Paradigm for Human-AI Collaboration?
- Introducing CleanIL for Imitation and Inverse Reinforcement Learning
- Observations and Implementation Tricks for Imitation and Inverse Reinforcement Learning
- Resource Rational Adaptive Inference Time Compute
- RNNs are Switching State Space Models?
- Simple Alchemy for Meta Reinforcement Learning
- A Tutorial on Dual Reinforcement Learning - Mostly Intuitions
- Do We Need Reward in RLHF? DPO and the Unlikelihood Family Curse
- Why do we need RLHF? Imitation, Inverse RL, and the role of reward
- On the Exploration-Exploitation Tradeoff and Identifying Epistemic Actions in POMDPs
- Another Attempt to Rationalize Expected Free Energy: Insights From Reinforcement Learning
- Making Sense of Active Inference: Optimal Control Without Cost Function
- The Uniqueness of Agent Beliefs in Meta and Bayesian Reinforcement Learning
- The Uniqueness of Meta Learning and Autoregressive Pre-training
- Bayesian Theory of Mind for RLHF: Towards Richer Human Models for Alignment
- Dummy First Post