1. Why is shuffling a dataset before conducting k-fold CV generally a bad idea in finance? What is the purpose of shuffling? Why does shuffling defeat the purpose of k-fold CV in financial datasets?
2. Take a pair of matrices (X, y), representing observed features and labels. These could be one of the datasets derived from the exercises in Chapter 3.
(a) Derive the performance from a 10-fold CV of an RF classifier on (X, y), without shuffling.
(b) Derive the performance from a 10-fold CV of an RF on (X, y), with shuffling.
(c) Why are both results so different?
(d) How does shuffling leak information?