Dance Score Predictions
It’s time to predict some scores!

Last time, I walked through how I trained three optimized machine learning models to predict Strictly Come Dancing scores, using all available data through Week 3 of the current series, Series 16. All three models performed decently, but the gradient boosting regressor tended to achieve the best accuracy scores in the cross-validation. So, I used that model as the “official” prediction of the week’s scores, though I was interested to see how the other models’ predictions compared.
With the trained models in hand, all that’s necessary is to feed them the inputs for Week 4 and ask them to predict the total score out of 40. I assembled the prediction inputs using the information available on Ultimate Strictly. For example, Kate and Aljaž were scheduled to dance a Samba (to Toto’s “Africa”, though the model doesn’t know what to do with that information. Poor model!). I also added the information that all of the dances take place during Series 16, 0.23 of the way through the series. I imported the inputs into a Python session as a DataFrame called predict_inputs, et voilà:
features = predict_inputs.copy()  # snapshot of the model inputs, so prediction columns are never fed back in as features
predict_inputs['gbr'] = gbr.predict(features).round()
predict_inputs['xgbr'] = xgbr.predict(features).round()
predict_inputs['rfr'] = rfr.predict(features).round()
where gbr, xgbr, and rfr denote the gradient boosting, XGBoost, and random forest regressors, respectively. I rounded the predictions to the nearest integer since judges can’t award fractional scores.
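For a sense of what the inputs might look like before prediction, here is a minimal sketch using three of the week's couples. The column names (including `series_progress`) are my own placeholders, since the exact feature names the models were trained on aren't listed here, and any categorical encoding the models need is left out.

```python
import pandas as pd

# Three of the Week 4 couples and their scheduled dances (from the tables in this post).
week4_dances = [
    ("Kate Silverton", "Aljaž Skorjanec", "Samba"),
    ("Katie Piper", "Gorka Márquez", "Jive"),
    ("Stacey Dooley", "Kevin Clifton", "Foxtrot"),
]

# Hypothetical column names: the real feature names are an assumption.
predict_inputs = pd.DataFrame(
    week4_dances, columns=["celebrity", "professional", "dance"]
)
predict_inputs["series"] = 16              # every dance takes place in Series 16
predict_inputs["series_progress"] = 0.23   # fraction of the way through the series

print(predict_inputs)
```

Each row is one dance to be scored, and the last two columns are constant because every dance happens in the same week of the same series.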
The predictions were as follows:
| celebrity | professional | dance | gbr | xgbr | rfr |
|---|---|---|---|---|---|
| Ashley Roberts | Pasha Kovalev | Tango | 32 | 33 | 28 |
| Charles Venn | Karen Clifton | Salsa | 24 | 24 | 25 |
| Danny John-Jules | Amy Dowden | Viennese Waltz | 29 | 30 | 27 |
| Faye Tozer | Giovanni Pernice | Rumba | 30 | 28 | 29 |
| Graeme Swann | Oti Mabuse | Jive | 22 | 25 | 23 |
| Joe Sugg | Dianne Buswell | Cha-cha-cha | 26 | 28 | 25 |
| Kate Silverton | Aljaž Skorjanec | Samba | 25 | 26 | 26 |
| Katie Piper | Gorka Márquez | Jive | 17 | 17 | 20 |
| Lauren Steadman | AJ Pritchard | Quickstep | 23 | 24 | 26 |
| Dr. Ranj Singh | Janette Manrara | Paso Doble | 24 | 26 | 25 |
| Seann Walsh | Katya Jones | Charleston | 23 | 23 | 23 |
| Stacey Dooley | Kevin Clifton | Foxtrot | 27 | 27 | 25 |
| Vick Hope | Graziano Di Prima | Quickstep | 25 | 26 | 25 |
None of the predictions seemed absolutely bonkers, which was great! The random forest predictions did all seem to be smushed together in the middling 20s, which seemed less plausible based on experience; I would agree with the other two models in expecting a few totals in the thirties for standout dances. A quick seaborn distplot makes the difference in distributions clear:
Plot: histogram comparing predicted model score distributions for Week 4
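The same difference shows up in a few summary numbers. This is a small sketch using only the standard library, with the rounded predictions copied from the table above; it is a quick sanity check rather than part of the original analysis.

```python
from statistics import pstdev

# Rounded Week 4 predictions, copied from the table above (same row order).
predictions = {
    "gbr":  [32, 24, 29, 30, 22, 26, 25, 17, 23, 24, 23, 27, 25],
    "xgbr": [33, 24, 30, 28, 25, 28, 26, 17, 24, 26, 23, 27, 26],
    "rfr":  [28, 25, 27, 29, 23, 25, 26, 20, 26, 25, 23, 25, 25],
}

summary = {}
for model, scores in predictions.items():
    summary[model] = {
        "min": min(scores),
        "max": max(scores),
        "range": max(scores) - min(scores),
        "std": pstdev(scores),              # population standard deviation
        "thirties": sum(s >= 30 for s in scores),  # how many totals of 30 or more
    }
    print(model, summary[model])
```

The random forest's predictions span 20 to 29, while the gradient boosting and XGBoost predictions run from 17 up to 32 and 33 respectively.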
I was then lucky enough to have the assistance of the biggest fan of Strictly in the U.S., who was willing to contribute fan-predicted scores for the upcoming show (without having seen the model predictions). A comparison of the gradient boosting model and the fan predictions:
| celebrity | professional | dance | fan | gbr |
|---|---|---|---|---|
| Ashley Roberts | Pasha Kovalev | Tango | 36 | 32 |
| Charles Venn | Karen Clifton | Salsa | 25 | 24 |
| Danny John-Jules | Amy Dowden | Viennese Waltz | 29 | 29 |
| Faye Tozer | Giovanni Pernice | Rumba | 36 | 30 |
| Graeme Swann | Oti Mabuse | Jive | 30 | 22 |
| Joe Sugg | Dianne Buswell | Cha-cha-cha | 28 | 26 |
| Kate Silverton | Aljaž Skorjanec | Samba | 28 | 25 |
| Katie Piper | Gorka Márquez | Jive | 21 | 17 |
| Lauren Steadman | AJ Pritchard | Quickstep | 25 | 23 |
| Dr. Ranj Singh | Janette Manrara | Paso Doble | 25 | 24 |
| Seann Walsh | Katya Jones | Charleston | 30 | 23 |
| Stacey Dooley | Kevin Clifton | Foxtrot | 30 | 27 |
| Vick Hope | Graziano Di Prima | Quickstep | 28 | 25 |
Taken as a whole, the fan predictions ran a little higher: the fan’s score matched or exceeded the model’s for every couple. The largest discrepancies were for Graeme and Oti’s Jive, with an eight-point spread, and Seann and Katya’s Charleston, with a seven-point spread between the two predictions. Could it be the model was overweighting the celebrities’ results from the early weeks of the series, when scores can be very low, and not sufficiently accounting for week-to-week improvement? Even so, the fan and the model agreed that Danny and Amy’s Viennese Waltz was likely to score a 29, and that Katie and Gorka were at high risk of finding themselves at the bottom of the leaderboard.
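To put numbers on the comparison, here is a short sketch that computes the gap between the fan and gradient boosting predictions for each couple, using the values from the table above. The couple labels are shorthand I've introduced for readability.

```python
from statistics import mean

# (couple, fan prediction, gbr prediction), copied from the comparison table above.
rows = [
    ("Ashley & Pasha", 36, 32),
    ("Charles & Karen", 25, 24),
    ("Danny & Amy", 29, 29),
    ("Faye & Giovanni", 36, 30),
    ("Graeme & Oti", 30, 22),
    ("Joe & Dianne", 28, 26),
    ("Kate & Aljaž", 28, 25),
    ("Katie & Gorka", 21, 17),
    ("Lauren & AJ", 25, 23),
    ("Dr. Ranj & Janette", 25, 24),
    ("Seann & Katya", 30, 23),
    ("Stacey & Kevin", 30, 27),
    ("Vick & Graziano", 28, 25),
]

# Positive gap means the fan predicted a higher total than the model.
gaps = {couple: fan - gbr for couple, fan, gbr in rows}
mean_gap = mean(list(gaps.values()))

# Couples sorted from largest to smallest disagreement.
ranked = sorted(gaps.items(), key=lambda item: item[1], reverse=True)

print(f"mean gap (fan - gbr): {mean_gap:.1f} points")
for couple, gap in ranked:
    print(f"{couple}: {gap:+d}")
```

Sorting the gaps makes it easy to see which couples the fan and the model disagree on most, and the mean gap gives a rough sense of how much more generous the fan is overall.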
All that was left to do was wait until Saturday’s show. When the judges revealed their scores, how would the results compare?
Find out next time, and keeeeeeeeeeeeeeeeeeeeep data-ing!