tezvyn:

Data lake versus data warehouse

Source: interview · beginner

WHAT IT TESTS: storage architecture fundamentals. OUTLINE: lakes store raw, schema-on-read data of any type cheaply; warehouses store curated, schema-on-write structured data for fast SQL; choose a lake for varied raw data and ML.

WHAT IT TESTS: whether you grasp the schema and use-case differences between the two stores. ANSWER OUTLINE: a data warehouse holds curated, structured data under schema-on-write (the schema is enforced when data is loaded) and is optimized for fast BI and SQL analytics; a data lake stores raw data of any shape (structured, semi-structured, or unstructured) cheaply on object storage under schema-on-read (structure is applied only when the data is queried). Choose a lake when you have diverse or unstructured sources, large volumes, or exploratory and machine-learning needs, and want to defer schema decisions.
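
To make the schema-on-write versus schema-on-read distinction concrete, here is a minimal sketch in Python using only the standard library. It is a toy illustration, not how a production warehouse or lake is built: an in-memory SQLite table stands in for the warehouse, a list of JSON lines stands in for files on object storage, and the names `RAW_EVENTS`, `load_warehouse`, and `read_lake` are hypothetical.

```python
import json
import sqlite3

# Hypothetical raw events, one JSON document per line, as they might land in a lake.
RAW_EVENTS = [
    '{"user": "ana", "amount": 12.5, "device": {"os": "ios"}}',
    '{"user": "bo", "amount": "7"}',
    '{"user": "cy", "clicks": [1, 2, 3]}',
]


def load_warehouse(lines):
    """Schema-on-write: shape and validate every record before it is stored."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (user TEXT NOT NULL, amount REAL NOT NULL)")
    rejected = []
    for line in lines:
        rec = json.loads(line)
        try:
            # Records that do not fit the fixed schema never reach the table.
            conn.execute(
                "INSERT INTO sales VALUES (?, ?)",
                (rec["user"], float(rec["amount"])),
            )
        except (KeyError, ValueError, TypeError):
            rejected.append(rec)
    return conn, rejected


def read_lake(lines, field):
    """Schema-on-read: raw lines stay untouched; structure is applied per query."""
    for line in lines:
        rec = json.loads(line)
        if field in rec:
            yield rec[field]


conn, rejected = load_warehouse(RAW_EVENTS)
total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
clicks = list(read_lake(RAW_EVENTS, "clicks"))
```

In this sketch the warehouse path answers the SQL aggregate directly but rejects the record that has no `amount`, while the lake path keeps every record and lets a later reader decide which fields matter, at the cost of parsing the raw data on every read.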
