tezvyn:

Optimize cost of a big-data analytics platform

Source: interview · intermediate

WHAT IT TESTS: practical cloud cost optimization. OUTLINE: storage tiering and lifecycle plus compression and partitioning; compute via spot instances, right-sizing, and efficient file formats; query and pipeline optimization to scan less data.
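To make "scan less data" concrete, here is a minimal sketch of partition pruning. It assumes a date-partitioned layout with paths like `events/dt=YYYY-MM-DD/`. The catalog contents, the sizes, and the `scanned_gb` helper are hypothetical and exist only for illustration. A real query engine does this pruning itself when the filter is on the partition column.

```python
from datetime import date

# Hypothetical catalog: partition path -> size in GB (made-up numbers).
partitions = {
    "events/dt=2024-01-01/": 120.0,
    "events/dt=2024-01-02/": 95.0,
    "events/dt=2024-01-03/": 110.0,
    "events/dt=2024-01-04/": 130.0,
}


def scanned_gb(catalog, start, end):
    """Sum the size of partitions whose date falls in [start, end]."""
    total = 0.0
    for path, size_gb in catalog.items():
        # Parse the partition value out of a path like ".../dt=2024-01-03/".
        partition_date = date.fromisoformat(path.split("dt=")[1].rstrip("/"))
        if start <= partition_date <= end:
            total += size_gb
    return total


# Without a filter on the partition column, every partition is read.
full_scan = sum(partitions.values())
# With a filter on dt, only the matching partition is read.
pruned_scan = scanned_gb(partitions, date(2024, 1, 3), date(2024, 1, 3))
print(f"full scan: {full_scan} GB, pruned scan: {pruned_scan} GB")
```

On engines that bill per byte scanned, the cost falls roughly in proportion to the data skipped.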

WHAT IT TESTS: whether you can attack cloud cost on multiple fronts. ANSWER OUTLINE: for storage, apply lifecycle policies that move cold data to cheaper tiers or archive and delete stale data, and shrink the footprint with compression and columnar formats like Parquet. For compute, run batch jobs on spot or preemptible instances, which are heavily discounted but can be reclaimed at short notice, so the jobs must tolerate interruption. Right-size clusters and use auto-scaling so you pay only while jobs run.
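The storage lifecycle idea above can be sketched as a small rule builder. This assumes Amazon S3 as the object store, which the card itself does not name. The dictionary follows the shape of an S3 lifecycle configuration, while the `raw/events/` prefix, the day thresholds, and the `build_lifecycle_rule` helper are hypothetical choices for illustration.

```python
def build_lifecycle_rule(prefix, warm_after_days=30, archive_after_days=90,
                         delete_after_days=365):
    """Build one lifecycle rule shaped like an S3 lifecycle configuration rule.

    The thresholds are illustrative defaults, not recommendations.
    """
    if not 0 < warm_after_days < archive_after_days < delete_after_days:
        raise ValueError("thresholds must be positive and strictly increasing")
    return {
        "ID": "tier-" + prefix.strip("/").replace("/", "-"),
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            # Cold data moves to an infrequent-access tier first...
            {"Days": warm_after_days, "StorageClass": "STANDARD_IA"},
            # ...then to archive storage.
            {"Days": archive_after_days, "StorageClass": "GLACIER"},
        ],
        # Stale data is deleted outright.
        "Expiration": {"Days": delete_after_days},
    }


# Assumed usage: with boto3, a dict like this could be passed as the
# LifecycleConfiguration argument of put_bucket_lifecycle_configuration.
lifecycle_config = {"Rules": [build_lifecycle_rule("raw/events/")]}
print(lifecycle_config)
```

Keeping the thresholds as parameters lets each dataset get its own retention schedule without duplicating the rule structure.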
