Train-test split vs. time-series cross-validation?

June 11, 2026 | Source: otexts.com | intermediate

Tests whether you see why temporal data breaks random splits. Contrast random sampling with sequential 'walk-forward' validation, where the model is trained only on past data and evaluated on the observations that follow.

This tests your understanding of data leakage in time-series models. A strong answer first defines a traditional random split, in which observations are shuffled and assigned to the training or test set regardless of when they occurred. It then contrasts this with time-series cross-validation (such as evaluation on a 'rolling forecasting origin'), which preserves temporal order by training only on observations that precede the test period. The key is explaining that random splits violate causality by leaking future information into the training set: the model is scored on a task easier than real forecasting, so its performance metrics are invalid and overly optimistic. A red flag is suggesting standard shuffled k-fold CV for a forecasting task.
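The contrast can be made concrete with a minimal sketch in plain Python. It is an illustration under stated assumptions, not the source's implementation: the helper names `rolling_origin_splits` and `leaks_future`, the ten-observation series, and the window sizes are all hypothetical choices made for this example.

```python
import random


def rolling_origin_splits(n_samples, initial_train_size, horizon=1, step=1):
    """Yield (train_idx, test_idx) pairs with an expanding training window.

    Hypothetical helper: every training index precedes every test index,
    so no future observation is visible when the model is fit.
    """
    origin = initial_train_size
    while origin + horizon <= n_samples:
        train_idx = list(range(0, origin))                 # everything before the origin
        test_idx = list(range(origin, origin + horizon))   # the next `horizon` points
        yield train_idx, test_idx
        origin += step                                     # roll the origin forward


def leaks_future(train_idx, test_idx):
    """Return True if any training point is later than the earliest test point."""
    return max(train_idx) > min(test_idx)


n = 10  # ten observations in time order, indexed 0..9 (illustrative size)

# Traditional random split: shuffle the indices, hold out 30% for testing.
rng = random.Random(0)
shuffled = list(range(n))
rng.shuffle(shuffled)
random_train, random_test = sorted(shuffled[:7]), sorted(shuffled[7:])

# Walk-forward validation: train on the past, test on the next two points.
walk_forward = list(rolling_origin_splits(n, initial_train_size=6, horizon=2, step=2))

print("random split leaks future data:", leaks_future(random_train, random_test))
for train_idx, test_idx in walk_forward:
    print("train up to", train_idx[-1], "-> test", test_idx,
          "| leaks:", leaks_future(train_idx, test_idx))
```

A random split avoids leakage only in the unlikely case that the shuffle happens to hold out exactly the final observations, whereas the walk-forward folds avoid it by construction. In practice a library utility such as scikit-learn's `TimeSeriesSplit` serves the same purpose as the hand-rolled generator here.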

Read the original → otexts.com

#machine learning
#time series
#model evaluation
#data science
