How would you speed up slow single-GPU training?

June 23, 2026 · Source: interview · intermediate

WHAT IT TESTS: knowledge of scaling training. OUTLINE: vertical scaling to bigger or multi-GPU instances, then data-parallel or model-parallel distributed training across nodes.

WHAT IT TESTS: whether you can scale ML training thoughtfully. ANSWER OUTLINE: first, scale up: move to a larger or multi-GPU instance, and enable mixed precision and a larger batch size so the hardware is fully used. Second, scale out with distributed training: data parallelism replicates the model on every GPU, feeds each replica a different slice of the batch, and syncs gradients via all-reduce; model parallelism splits the model itself across GPUs when it is too large to fit on one. Mention the trade-off: every added device adds communication overhead, so the speedup is usually less than linear.
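To make the data-parallel step concrete, here is a minimal pure-Python sketch rather than a real distributed setup. It assumes a toy one-parameter model y = w * x, treats "workers" as plain lists, and uses hypothetical helper names (`local_gradient`, `all_reduce_mean`, `ring_all_reduce_bytes`) that do not come from any library. The last helper uses the standard ring all-reduce estimate of 2(N-1)/N times the gradient size sent per worker, to illustrate where the communication overhead comes from.

```python
# Toy simulation of data-parallel training; no GPUs or communication
# library involved. All names here are illustrative assumptions.

def local_gradient(w, shard):
    """Gradient of mean squared error w.r.t. w over one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    """Stand-in for all-reduce: every worker receives the same average."""
    avg = sum(values) / len(values)
    return [avg] * len(values)

def ring_all_reduce_bytes(param_bytes, num_workers):
    """Approximate bytes each worker sends in one ring all-reduce."""
    return 2 * (num_workers - 1) / num_workers * param_bytes

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2x
num_workers = 2
# Each worker gets an equal-sized slice of the batch.
shards = [data[i::num_workers] for i in range(num_workers)]

w, lr = 0.0, 0.01
for _ in range(200):
    grads = [local_gradient(w, s) for s in shards]  # computed in parallel on real hardware
    synced = all_reduce_mean(grads)                 # replicas now agree on the gradient
    w -= lr * synced[0]                             # identical update on every replica

print(f"learned w = {w:.4f}")
print(f"bytes sent per worker for 1000 gradient bytes on 4 workers: {ring_all_reduce_bytes(1000, 4)}")
```

With equal-sized shards, the averaged gradient matches the gradient over the full batch, which is why data parallelism can keep the same training behavior while splitting the work. The per-worker traffic approaches twice the gradient size as workers are added, so the cost is paid on every step regardless of how small each worker's shard becomes.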


#distributed-training
#gpu
#scaling
#cloud
#machine-learning
