tezvyn:

Explain data lineage and how you would implement it

Source: Wikipedia: Data lineageintermediate

This tests your ability to design for data observability. Define lineage (origin, transformation, movement), then propose a solution using metadata extraction (OpenLineage) and a central graph store/UI (Marquez) to trace data from microservices to analytics.

This tests your ability to translate a data governance concept into a practical system design for a distributed environment. A great answer first defines lineage (origin, transformations, movement), then outlines an implementation. Propose using a standard like OpenLineage to emit metadata from services and ETL jobs. This data is sent to a backend like Marquez, which builds a graph of dependencies. This allows for visualization and tracing errors back to their source. The red flag is only defining the term or suggesting manual documentation.

Read the original → Wikipedia: Data lineage

Get five bites like this every day.

Tezvyn delivers a daily feed of 60-second tech bites with quizzes to lock in what you learn.

Explain data lineage and how you would implement it · Tezvyn