⚡ Reward Prediction Error

Dopamine neurons signal reward prediction error (RPE): the difference between the reward received and the reward expected. This signal matches the error term of temporal-difference (TD) learning and is thought to drive reinforcement learning in the brain.

Learning Rate (α): 0.1

Discount (γ): 0.9

Reward Magnitude: 1.0

Generation: 0

Expected Value: 0.00

RPE (δ): 0.00

DA Firing: baseline

TD Learning:
• δ = r + γV(s') - V(s)
• δ > 0: Better than expected
• δ < 0: Worse than expected
• δ = 0: As expected
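
The TD rule above can be sketched in a few lines of Python. This is a minimal illustration, not the simulation's actual code: the function names, the single-state setup, the assumption that the next state is terminal (V(s') = 0), and the update V(s) ← V(s) + αδ are all choices made for this example. The defaults mirror the panel values (α = 0.1, γ = 0.9, reward 1.0).

```python
def td_error(reward, v_next, v_current, gamma=0.9):
    """Reward prediction error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * v_next - v_current


def td_update(v_current, delta, alpha=0.1):
    """Move the value estimate a fraction alpha toward the TD target (assumed rule)."""
    return v_current + alpha * delta


def classify(delta, tol=1e-9):
    """Map the sign of delta onto the three cases listed above."""
    if delta > tol:
        return "better than expected"
    if delta < -tol:
        return "worse than expected"
    return "as expected"


def run_trials(n_trials, reward=1.0, alpha=0.1, gamma=0.9):
    """Repeat a one-step trial: a cue state followed by reward, then a terminal state.

    Assumption for this sketch: the next state is terminal, so V(s') = 0.
    Returns the final value estimate and a list of (value, delta) per trial.
    """
    value = 0.0
    history = []
    for _ in range(n_trials):
        delta = td_error(reward, 0.0, value, gamma)
        history.append((value, delta))
        value = td_update(value, delta, alpha)
    return value, history


if __name__ == "__main__":
    final_value, history = run_trials(50)
    for trial, (value, delta) in enumerate(history[:5]):
        print(f"trial {trial}: V={value:.3f}  delta={delta:.3f}  ({classify(delta)})")
    print(f"final V after 50 trials: {final_value:.3f}")
```

Under these assumptions the expected value climbs toward the reward magnitude while δ shrinks toward zero, which is the pattern of a fully predicted reward no longer evoking a response. Calling `td_error(0.0, 0.0, value)` after learning models an omitted reward and yields a negative δ.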

About This Simulation

Key Concepts

Why It Matters

How to Explore