Learning Dynamics Lab

This lab compares how a few standard optimizers move across the same 2D loss surface. It runs entirely in the browser: change the surface, move the start point, and watch the trajectories update.

Loss surface

Click anywhere on the plot to move the shared start point.

Run A

Run B

Run metrics

Compare how each optimizer is moving in the current run.

Run A

SGD

Exploring

Current step: 0
Loss: 16.5636
Position: (-3.400, 2.300)
Gradient magnitude: 13.8060

Run B

Adam

Exploring

Current step: 0
Loss: 16.5636
Position: (-3.400, 2.300)
Gradient magnitude: 13.8060

Surface notes

A narrow valley whose curvature differs sharply between the two axes. This is a classic setup for seeing zig-zagging SGD paths and the stabilizing effect of adaptive methods.
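As a rough illustration, the sketch below defines one surface of this kind in Python. The lab's actual surface definition is not shown here, so the quadratic form and the coefficients 0.06 and 3 are assumptions, chosen only because they reproduce the step-0 loss and gradient magnitude displayed in the run metrics for the start point (-3.400, 2.300).

```python
import math

# Assumed coefficients (not taken from the lab's source): picked because they
# match the step-0 metrics shown for the start point (-3.4, 2.3).
A, B = 0.06, 3.0


def loss(x, y):
    """Anisotropic quadratic bowl: shallow along x, steep along y."""
    return A * x * x + B * y * y


def grad(x, y):
    """Analytic gradient of the loss above."""
    return (2 * A * x, 2 * B * y)


x0, y0 = -3.4, 2.3
gx, gy = grad(x0, y0)
print(round(loss(x0, y0), 4))        # 16.5636
print(round(math.hypot(gx, gy), 4))  # 13.806
```

Under these assumed coefficients the curvature along y (2B = 6) is 50 times the curvature along x (2A = 0.12), which is what makes the valley narrow.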

Optimizer notes

Run A: SGD

SGD follows the raw gradient directly. It is easy to reason about, but it can zig-zag badly in narrow valleys and is sensitive to the learning rate.
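The zig-zag and the learning-rate sensitivity can be sketched in a few lines of Python. This is a minimal illustration, not the lab's implementation: the surface 0.06x^2 + 3y^2, the function names, and the learning rates are all assumptions. Because the surface is deterministic, the update shown is plain gradient descent with no sampling noise.

```python
def grad(x, y, a=0.06, b=3.0):
    """Gradient of the assumed surface a*x^2 + b*y^2 (coefficients are hypothetical)."""
    return (2 * a * x, 2 * b * y)


def sgd_path(start, lr, steps):
    """Follow the raw gradient: p <- p - lr * grad(p)."""
    x, y = start
    path = [(x, y)]
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y - lr * gy
        path.append((x, y))
    return path


# lr = 0.3: each step multiplies y by (1 - 0.3 * 6) = -0.8, so y flips sign
# every step (the zig-zag) while x only shrinks by a factor of 0.964.
stable = sgd_path((-3.4, 2.3), lr=0.3, steps=5)

# lr = 0.35: the y multiplier becomes -1.1, so the zig-zag grows and diverges.
unstable = sgd_path((-3.4, 2.3), lr=0.35, steps=20)
```

On this assumed surface, any learning rate above 1/3 makes the steep axis diverge, while the shallow axis would still tolerate much larger steps. A single learning rate has to be set for the steep direction, which leaves progress along the valley floor slow.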

Run B: Adam

Adam combines momentum with per-coordinate step scaling. It usually settles quickly on these toy surfaces, which makes the effect of adaptive scaling easy to compare against plain SGD.
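The two ingredients can be sketched as follows. This is a minimal Python version of the standard Adam update, not the lab's own code: the surface 0.06x^2 + 3y^2, the function names, and the learning rate of 0.1 are assumptions, and beta1, beta2, and eps are set to the commonly used defaults.

```python
import math


def grad(x, y, a=0.06, b=3.0):
    """Gradient of the assumed surface a*x^2 + b*y^2 (coefficients are hypothetical)."""
    return (2 * a * x, 2 * b * y)


def adam_path(start, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=100):
    """Adam: momentum (m) plus per-coordinate scaling by sqrt of squared-gradient average (v)."""
    p = list(start)
    m = [0.0, 0.0]  # running mean of gradients (momentum term)
    v = [0.0, 0.0]  # running mean of squared gradients (scaling term)
    path = [tuple(p)]
    for t in range(1, steps + 1):
        g = grad(*p)
        for i in range(2):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] * g[i]
            m_hat = m[i] / (1 - beta1 ** t)  # bias correction for early steps
            v_hat = v[i] / (1 - beta2 ** t)
            p[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        path.append(tuple(p))
    return path


first = adam_path((-3.4, 2.3), steps=1)[1]
print(tuple(round(c, 3) for c in first))  # (-3.3, 2.2)
```

On the first step the bias-corrected ratio is essentially the sign of each gradient component, so both coordinates move by about lr = 0.1 even though the y gradient is roughly 34 times larger than the x gradient at the start point. That per-coordinate normalization is what keeps the steep axis from dominating the path.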