Final 3rd analysis – more follow ups


Thanks a lot for all the feedback and discussion regarding the final third analysis. Here are a few follow-ups on the feedback.

Feedback: The correlation between goals scored and passes in the final third is driven by the top 5 goal-scoring clubs. If they are removed from the data set, the correlation might be weak.
This was brought up by @WillTGM & @Chumolo.

Follow-up:

  • The correlation is not nearly as strong if the top 5 clubs (by goals scored) are removed. However, 5 teams make up 25% of the sample. If we cherry-pick the top 5 for removal, it is not surprising that the correlation becomes much weaker.
  • I ran an experiment choosing 15 clubs at random from the 20. Across several such runs, the correlation was strong: R² varied between 0.56 and 0.87, and the regression was significant (F-test).
  • On a similar note, if outliers like Liverpool and Newcastle are excluded, the correlation becomes much stronger.
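
The random-subset experiment above can be sketched roughly as follows. This is a minimal illustration, not the exact procedure used for the post: the `passes` and `goals` lists are randomly generated placeholders (not the real club figures), and the function names are made up for this example. For a simple one-variable regression, the F statistic can be computed directly from R² as F = R² / (1 − R²) × (n − 2).

```python
import random

def r_squared(xs, ys):
    """R^2 of a simple linear regression of ys on xs (squared Pearson correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)

def f_statistic(r2, n):
    """F statistic for a one-predictor regression with n observations."""
    return r2 / (1.0 - r2) * (n - 2)

def subsample_experiment(passes, goals, k=15, trials=10, seed=0):
    """Pick k clubs at random, `trials` times, and return (R^2, F) for each pick."""
    rng = random.Random(seed)
    clubs = list(range(len(passes)))
    results = []
    for _ in range(trials):
        chosen = rng.sample(clubs, k)
        xs = [passes[i] for i in chosen]
        ys = [goals[i] for i in chosen]
        r2 = r_squared(xs, ys)
        results.append((r2, f_statistic(r2, k)))
    return results

# PLACEHOLDER data for 20 clubs: NOT the real season figures, just noise
# around a straight line so the sketch has something to run on.
_rng = random.Random(42)
passes = [_rng.uniform(2000, 6000) for _ in range(20)]
goals = [0.01 * p + _rng.gauss(0, 8) for p in passes]

for r2, f in subsample_experiment(passes, goals):
    print(f"R2 = {r2:.2f}, F = {f:.1f}")
```

With 15 clubs the F statistic has 1 and 13 degrees of freedom, so a value above roughly 4.67 would be significant at the 5% level.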

Feedback: Significance of the regression

@rui_xu brought up a great point about the importance of the significance of the regression and how R² alone might not tell the whole story.

Follow-up:
I did the F-test for all the regressions with the following results

  • However, when I did the same analysis using data from all 380 games of last season (760 samples), the correlation was weak (as observed for the 38 games of Man City), yet the regression was significant for the larger sample.
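
A weak correlation can still produce a significant regression once the sample is large, because the F statistic grows with the number of observations. The sketch below illustrates this under one assumption: it reuses the R² of 0.04 reported for Man City's 38 games as a stand-in for the 760-sample case, since the exact R² for the full season is not given here.

```python
def f_statistic(r_squared, n):
    """F statistic for a one-predictor regression with n observations."""
    return r_squared / (1.0 - r_squared) * (n - 2)

# ASSUMPTION: R^2 = 0.04 for both sample sizes, purely for illustration.
r2 = 0.04
for n in (38, 760):
    print(f"n = {n:3d}: F = {f_statistic(r2, n):.2f}")
# n =  38: F = 1.50
# n = 760: F = 31.58
```

The 5% critical value is roughly 4.1 for 38 samples and roughly 3.9 for 760, so under this assumption the same weak R² would be non-significant for one team's season but clearly significant for the whole league.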

Please keep the feedback coming!


Follow-up analysis: Final third passing and Goals scored per game


This is a follow-up to my post regarding the strong correlation between passes completed in the final third and goals scored.

Question

Is there a correlation between the final third completions & goals scored at the game level?

Analysis

I investigated whether this correlation exists at the game level using the #MCFCAnalytics data set. I plotted completions in the final third vs. goals scored for Manchester City in all 38 of their English Premier League games.
Blue = Away; Orange = Home

Manchester City Goals vs. Pass completions in the final 3rd on a per game basis

Findings:

  • Linear regression had an R² of 0.04, implying that there is virtually no linear correlation between passes completed in the final third and goals scored at the game level.
    I did the plot for a few other teams and got similar results.
     
  • Arsenal – Away and Liverpool – Home. In both cases, Manchester City had very little success completing passes in the final 3rd. However, they lost 1-0 at the Emirates and won 3-0 at home vs. Liverpool.
    Against Liverpool, City had 6 shots on target and 2 off target.
    Against Arsenal, City had 0 shots on target and 3 off target.
  • QPR – Home and QPR – Away. City scored 3 goals against QPR both home and away. However, they had a season-high 326 completed passes in the final 3rd at home vs. just 74 in the away fixture.
    Shots vs. QPR Away – 5 on target & 10 off target.
    Shots vs. QPR Home – 15 on target and 10 off target.
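
To put the four games above side by side, here is a small sketch that computes goals per shot from the shot counts and scorelines quoted in the bullets. The `goals_per_shot` helper and the tuple layout are just illustrative choices for this example.

```python
# (fixture, City goals, shots on target, shots off target), as quoted above
games = [
    ("Liverpool (H)", 3, 6, 2),
    ("Arsenal (A)", 0, 0, 3),
    ("QPR (A)", 3, 5, 10),
    ("QPR (H)", 3, 15, 10),
]

def goals_per_shot(goals, on_target, off_target):
    """Goals divided by total shots; 0.0 if no shots were taken."""
    total = on_target + off_target
    return goals / total if total else 0.0

for fixture, goals, on_t, off_t in games:
    rate = goals_per_shot(goals, on_t, off_t)
    print(f"{fixture:14s} shots = {on_t + off_t:2d}  goals/shot = {rate:.2f}")
```

The same 3 goals came from 8 shots against Liverpool, 15 away at QPR, and 25 at home to QPR, which shows how much conversion can swing from one game to the next.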

The City – QPR home fixture was that crazy season finale. City fell behind and threw everyone forward to go for the win and the Premier League title. QPR were a man down from the 55th minute and defended at the edge of their 18-yard box for most of the 2nd half. This explains the unusually high number of completed passes in the final third.

The above examples underline the rarity of the “goal” event. In any given game, factors like bad shooting, luck, or the opponent’s goalkeeper having a great game can influence the number of goals scored. However, over a season those things seem to even out.

In the next step of the analysis, I will add a 2nd variable to the model and see how it changes the results.
