Switching from Prophet (MAE 36.63) to gradient boosting reduced error by 96%. Tabular models dominate for this task.
Ensemble MAE fell as models were added: single model (1.12) → 2-model (1.02) → 3-model (0.70) → 7-model (0.46). Each diversity source helps.
Adding HGB models with different seeds (42/123/777) provided cheap diversity. Returns diminished beyond 3 seeds.
HGB with quantile loss at q=0.45 and q=0.55 added asymmetric diversity that seed changes alone couldn't achieve. This was the final breakthrough.
SciPy Nelder-Mead on val MAE found optimal weights. Equal weights ≈ optimized weights for similar-strength models, but optimization matters for diverse ones.
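A sketch of the weight search using `scipy.optimize.minimize` with `method="Nelder-Mead"`. The validation predictions are simulated, and the helper names (`normalize`, `val_mae`) and the abs-then-normalize trick for keeping weights non-negative and summing to 1 are assumptions for illustration; the original constraint handling is not documented.

```python
import numpy as np
from scipy.optimize import minimize

# Simulated validation targets and predictions from three models of
# different strength (assumption: stands in for real model outputs).
rng = np.random.default_rng(0)
y_val = rng.normal(size=200)
preds = np.stack(
    [y_val + rng.normal(scale=s, size=200) for s in (0.3, 0.5, 1.0)]
)  # shape: (n_models, n_val)

def normalize(w):
    """Map raw parameters to non-negative weights that sum to 1."""
    w = np.abs(w)
    return w / w.sum()

def val_mae(w):
    """Validation MAE of the weighted blend."""
    return np.mean(np.abs(y_val - normalize(w) @ preds))

n_models = preds.shape[0]
w0 = np.full(n_models, 1.0 / n_models)  # start from equal weights

result = minimize(val_mae, w0, method="Nelder-Mead")
weights = normalize(result.x)

equal_mae = val_mae(w0)
opt_mae = val_mae(result.x)
print("weights:", np.round(weights, 3))
print("equal-weight MAE:", round(equal_mae, 4), "optimized MAE:", round(opt_mae, 4))
```

Nelder-Mead is derivative-free, which suits the non-smooth MAE objective. Weights tuned this way fit the validation set, so the gain should be confirmed on a separate held-out split.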
Neural nets (MLP), stacking, Fourier features, feature interactions, and target encoding all failed. GBMs likely capture most of what these add internally.