optimizer-memory-profiles

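I ran the template PyTorch memory profile code in several settings: locally and remotely on Modal, on CPU and GPU, with different optimizers, and with and without gradient accumulation.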

| Execution Environment | Hardware Type | Optimizer | Gradient Accumulation | Memory Profile |
| --- | --- | --- | --- | --- |
| Local | CPU | SGD | No | View |
| Remote | CPU | SGD | No | View |
| Remote | GPU | SGD | No | View |
| Remote | GPU | SGD + Momentum | No | View |
| Remote | GPU | Adam | No | View |
| Remote | GPU | Adam | Yes | View |
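
As a rough guide to what the optimizer rows might be expected to show, here is a minimal sketch that estimates the persistent memory for weights, gradients, and optimizer state. It is a back-of-envelope calculation, not a measurement from these runs. It assumes fp32 values and PyTorch's default optimizer behaviour (plain SGD keeps no per-parameter state, SGD with momentum keeps one buffer, Adam keeps two), and it ignores activations, Adam's small step counters, and allocator overhead. The function name `estimate_training_memory` and the parameter count are hypothetical and chosen only for illustration.

```python
# Back-of-envelope estimate of persistent training memory per optimizer.
# This is an illustrative sketch, not output from the profiling runs.

BYTES_PER_FP32 = 4

# Extra per-parameter buffers each optimizer keeps (assumed defaults):
STATE_BUFFERS_PER_PARAM = {
    "SGD": 0,             # no per-parameter state
    "SGD + Momentum": 1,  # one momentum buffer
    "Adam": 2,            # first and second moment estimates
}


def estimate_training_memory(num_params, optimizer, bytes_per_value=BYTES_PER_FP32):
    """Return estimated bytes for weights, gradients, and optimizer state.

    Activations and allocator overhead are deliberately left out.
    """
    weights = num_params * bytes_per_value
    gradients = num_params * bytes_per_value
    state = STATE_BUFFERS_PER_PARAM[optimizer] * num_params * bytes_per_value
    return {
        "weights": weights,
        "gradients": gradients,
        "optimizer_state": state,
        "total": weights + gradients + state,
    }


if __name__ == "__main__":
    num_params = 25_000_000  # hypothetical model size, not taken from the runs
    for name in STATE_BUFFERS_PER_PARAM:
        est = estimate_training_memory(num_params, name)
        print(f"{name:15s} total = {est['total'] / 2**20:8.1f} MiB "
              f"(optimizer state = {est['optimizer_state'] / 2**20:.1f} MiB)")
```

Under these assumptions the optimizer state is zero for SGD, one extra copy of the parameters for SGD with momentum, and two extra copies for Adam. Gradient accumulation is not modelled here because it does not add persistent per-parameter buffers.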

Running the profiling

modal run profiling.py

Notes

These are some observations

Memory Profiling Results

Local CPU SGD

[Figure: Local CPU SGD memory profile]

[Figure: Modal CPU SGD memory profile]

[Figure: Modal GPU SGD memory profile]

[Figure: Modal GPU SGD + Momentum memory profile]

[Figure: Modal GPU Adam memory profile]

[Figure: Modal GPU Adam with gradient accumulation memory profile]