Tetris AI: DQN Agent

Deep Q-Network trained on 1,000 episodes · PostHog Engineering Challenge
Score: 0
Lines: 0

Training Progress

Episodes: 1,000  ·  Best Score: -  ·  Parameters: 103,300

AI Decision: Current Action

← Left
→ Right
↻ Rotate
↓ Drop

Q-Values: What the Agent Thinks

Higher Q-value = the agent expects this action to lead to a higher cumulative future reward

← Left: -
→ Right: -
↻ Rotate: -
↓ Drop: -
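
As a rough illustration of how four Q-values turn into a single displayed action, the sketch below picks the action with the largest value. The action names, their order, and the example numbers are assumptions made for illustration. The page does not say whether the agent also explores (for example with epsilon-greedy) while playing.

```python
# Hypothetical action order; the page lists Left, Right, Rotate, Drop,
# but the index each one maps to in the network output is an assumption.
ACTIONS = ["left", "right", "rotate", "drop"]


def greedy_action(q_values):
    """Return the name of the action with the highest Q-value.

    Ties resolve to the earliest action in ACTIONS.
    """
    if len(q_values) != len(ACTIONS):
        raise ValueError("expected one Q-value per action")
    best_index = max(range(len(q_values)), key=lambda i: q_values[i])
    return ACTIONS[best_index]


# Made-up Q-values, not output from the trained agent.
example_q = [-1.2, 0.4, 3.1, 2.7]
print(greedy_action(example_q))
```

With these made-up values the highest entry is the third one, so the sketch selects "rotate".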

Reward Function

The only human judgment in the system; the agent learned everything else itself

Line clear: +100
Tetris (4 lines): +800
Per hole: −5
Bumpiness: −0.5
Height: −0.3
Death: −500
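
The weights above could be combined into a per-step reward roughly as follows. This is a minimal sketch, not the project's actual code. It assumes that one to three cleared lines earn +100 each, that a four-line clear earns a flat +800 instead of the per-line amount, that "Height" refers to the maximum column height, and that the penalties are computed on the board after the piece lands. The function and parameter names are hypothetical.

```python
def step_reward(lines_cleared, holes, bumpiness, height, died):
    """Combine the listed reward terms into one number.

    Assumptions (not confirmed by the page): a Tetris pays a flat 800,
    other clears pay 100 per line, and `height` is the max column height.
    """
    if lines_cleared == 4:
        reward = 800.0
    else:
        reward = 100.0 * lines_cleared

    reward -= 5.0 * holes        # per hole: -5
    reward -= 0.5 * bumpiness    # bumpiness: -0.5 per unit
    reward -= 0.3 * height       # height: -0.3 per row
    if died:
        reward -= 500.0          # death: -500
    return reward


# One cleared line on a board with 2 holes, bumpiness 6, height 8:
# 100 - 10 - 3 - 2.4, i.e. roughly 84.6
print(round(step_reward(1, 2, 6, 8, False), 1))
```

Under these assumptions a single hole costs as much as twenty lines' worth of... nothing near a line clear, so clearing lines dominates the signal and the shaping terms mainly break ties between otherwise similar placements.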

Network Architecture

Input: 15 board features
Hidden: 256 neurons
Hidden: 256 neurons
Hidden: 128 neurons
Output: 4 Q-values
Trained with experience replay + target network stabilisation
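
The layer sizes above can be checked against the reported parameter count. The sketch below assumes every layer is fully connected with a bias term, which the page does not state explicitly; under that assumption the total comes out to the 103,300 parameters shown in the training panel.

```python
# Layer widths as listed: input, three hidden layers, output.
LAYER_SIZES = [15, 256, 256, 128, 4]


def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network.

    Each layer contributes fan_in * fan_out weights plus fan_out biases.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out
    return total


print(count_parameters(LAYER_SIZES))  # 103300
```

Layer by layer this is 4,096 + 65,792 + 32,896 + 516 = 103,300.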

15 Input Features

• 10 column heights
• Total holes
• Surface bumpiness
• Max column height
• Average height
• Lines cleared so far
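
One way the 15 features listed above could be computed from a board grid is sketched below. The board representation (a list of rows from top to bottom, with truthy cells filled), the definition of a hole as an empty cell below the topmost filled cell in its column, bumpiness as the sum of absolute height differences between adjacent columns, and the feature order are all assumptions. The page does not say whether features are normalised, so none is applied here.

```python
def column_heights(board):
    """Height of each column, measured from the floor to the top filled cell."""
    rows, cols = len(board), len(board[0])
    heights = []
    for c in range(cols):
        height = 0
        for r in range(rows):
            if board[r][c]:
                height = rows - r
                break
        heights.append(height)
    return heights


def count_holes(board, heights):
    """Empty cells that sit below the top filled cell of their column."""
    rows = len(board)
    holes = 0
    for c, height in enumerate(heights):
        for r in range(rows - height, rows):
            if not board[r][c]:
                holes += 1
    return holes


def extract_features(board, lines_cleared):
    """Return the 15-value feature vector in an assumed order."""
    heights = column_heights(board)
    holes = count_holes(board, heights)
    bumpiness = sum(abs(a - b) for a, b in zip(heights, heights[1:]))
    average = sum(heights) / len(heights)
    return heights + [holes, bumpiness, max(heights), average, lines_cleared]


# Hypothetical 20x10 board: column 0 is two cells high, column 1 has a
# filled cell resting over an empty one (one hole).
board = [[0] * 10 for _ in range(20)]
board[19][0] = 1
board[18][0] = 1
board[18][1] = 1

features = extract_features(board, lines_cleared=3)
print(len(features))
print(features)
```

For a 10-column board this yields 10 heights plus 5 summary values, matching the 15 inputs the network expects.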