Learning Dynamics Lab

This lab compares how a few standard optimizers move across the same 2D loss surface. It runs entirely in the browser: change the surface, move the start point, and watch the trajectories update.

Loss surface

Click anywhere on the plot to move the shared start point.

Run metrics

Compare how each optimizer is moving in the current run.

Run A: SGD

Status: Exploring
Current step: 0
Loss: 16.5636
Position: (-3.400, 2.300)
Gradient magnitude: 13.8060

Run B: Adam

Status: Exploring
Current step: 0
Loss: 16.5636
Position: (-3.400, 2.300)
Gradient magnitude: 13.8060

Surface notes

A narrow valley with very different curvature along each axis. This is a classic setup for seeing zig-zagging SGD paths and the stabilizing effect of adaptive methods.
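As a rough illustration of such a valley, the sketch below uses an anisotropic quadratic, f(x, y) = 0.06x² + 3y². The coefficients are an assumption: they are not taken from the lab's source, but were chosen because they reproduce the step-0 loss and gradient magnitude shown in the run metrics.

```python
import math

# Assumed coefficients (not from the lab's source): shallow along x, steep along y.
A, B = 0.06, 3.0

def loss(x, y):
    """Anisotropic quadratic: a long, narrow valley along the x axis."""
    return A * x**2 + B * y**2

def grad(x, y):
    """Analytic gradient of the loss above."""
    return (2 * A * x, 2 * B * y)

start = (-3.4, 2.3)              # shared start point from the run metrics
start_loss = loss(*start)
gx, gy = grad(*start)
grad_norm = math.hypot(gx, gy)

print(f"loss={start_loss:.4f}  grad=({gx:.3f}, {gy:.3f})  |grad|={grad_norm:.4f}")
```

Under this assumption the y component of the gradient (13.8) is roughly 34 times the x component (-0.408) at the start point, which is why the steepest-descent direction points almost straight across the valley rather than along it.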

Optimizer notes

Run A: SGD

SGD follows the raw gradient directly. It is easy to reason about, but it can zig-zag badly in narrow valleys and is sensitive to the learning rate.
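The zig-zag and the learning-rate sensitivity can be sketched in a few lines. This is a minimal example, not the lab's implementation: the surface f(x, y) = 0.06x² + 3y² and both learning rates are assumptions picked for illustration, and the update uses the exact gradient with no noise.

```python
def sgd_path(start, lr, steps, a=0.06, b=3.0):
    """Plain gradient descent on the assumed surface f(x, y) = a*x^2 + b*y^2."""
    x, y = start
    path = [(x, y)]
    for _ in range(steps):
        gx, gy = 2 * a * x, 2 * b * y      # raw gradient
        x, y = x - lr * gx, y - lr * gy    # step directly against it
        path.append((x, y))
    return path

# Hypothetical learning rates on either side of the stability limit for y.
stable = sgd_path((-3.4, 2.3), lr=0.30, steps=20)
unstable = sgd_path((-3.4, 2.3), lr=0.35, steps=20)

print("lr=0.30 final position:", stable[-1])
print("lr=0.35 final position:", unstable[-1])
```

With these assumed values, each step multiplies y by 1 - 6·lr. At lr = 0.30 that factor is -0.8, so y flips sign on every step while shrinking slowly, and x has covered only about half the distance to the minimum after 20 steps. At lr = 0.35 the factor is -1.1, so the oscillation in y grows instead of decaying.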

Run B: Adam

Adam combines momentum with per-coordinate scaling. It usually settles quickly on these toy surfaces and makes adaptive behavior easy to compare.
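A minimal sketch of the Adam update follows, using the standard formulation with bias correction. The surface f(x, y) = 0.06x² + 3y² and the hyperparameters (lr = 0.1, beta1 = 0.9, beta2 = 0.999, eps = 1e-8) are assumptions for illustration and may differ from what the lab uses.

```python
import math

def adam_path(start, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8,
              steps=200, a=0.06, b=3.0):
    """Adam on the assumed surface f(x, y) = a*x^2 + b*y^2."""
    pos = list(start)
    m = [0.0, 0.0]   # first-moment estimate (momentum)
    v = [0.0, 0.0]   # second-moment estimate (per-coordinate scale)
    path = [tuple(pos)]
    for t in range(1, steps + 1):
        g = [2 * a * pos[0], 2 * b * pos[1]]
        for i in range(2):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)   # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            pos[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        path.append(tuple(pos))
    return path

path = adam_path((-3.4, 2.3))
first_step = (path[1][0] - path[0][0], path[1][1] - path[0][1])
print("first step:", first_step)
print("final position:", path[-1])
```

On the first step the bias-corrected ratio reduces to the sign of each gradient component, so both coordinates move by about lr even though the raw gradient components differ by a factor of roughly 34 under the assumed surface. That per-coordinate normalization is what keeps the path from bouncing across the valley the way the raw-gradient update does.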

Copyright © 2020 - 2026 Alex Leung