Final 3rd analysis – more follow ups


Thanks a lot for all the feedback and discussion regarding the final third analysis. Here are a few follow-ups on the feedback.

Feedback: The correlation between goals scored and passes in the final third is driven by the top 5 goal scoring clubs. If they are removed from the data set, the correlation might be weak.
This was brought up by @WillTGM & @Chumolo

Follow-up:

  • The correlation is not nearly as strong if the top 5 (goals scored) are removed. However, 5 teams constitute 25% of the sample. If we cherry-pick the top 5, it is not surprising that the correlation becomes much weaker.
  • I ran an experiment choosing 15 clubs at random from the 20. Across several such draws, the correlation stayed strong: R² varied between 0.56 and 0.87, and the regression was significant (F-test).
  • On a similar note, if outliers like Liverpool and Newcastle are excluded, the correlation becomes much stronger.
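
The random-subset experiment above can be sketched as follows. This is a minimal sketch, not the code actually used for this post: the function names `r_squared` and `subsample_r_squared` are made up for illustration, and the demo data is synthetic placeholder data, not the #MCFCAnalytics figures.

```python
import random

def r_squared(x, y):
    """R^2 of a simple linear regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    # For one predictor, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)

def subsample_r_squared(x, y, k=15, trials=1000, seed=0):
    """R^2 for `trials` random draws of k teams, chosen without replacement."""
    rng = random.Random(seed)
    indices = list(range(len(x)))
    results = []
    for _ in range(trials):
        chosen = rng.sample(indices, k)
        results.append(r_squared([x[i] for i in chosen], [y[i] for i in chosen]))
    return results

if __name__ == "__main__":
    # Synthetic placeholder data for 20 "teams" (NOT the real data set).
    rng = random.Random(42)
    passes = [rng.uniform(2000, 5000) for _ in range(20)]
    goals = [0.015 * p + rng.gauss(0, 8) for p in passes]
    values = subsample_r_squared(passes, goals)
    print(f"R^2 range over subsets: {min(values):.2f} to {max(values):.2f}")
```

Looking at the whole spread of R² over many random subsets, rather than a handful of draws, gives a better sense of how much the result depends on which 5 clubs are left out.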

Feedback: Significance of the regression

@rui_xu brought up a great point about the importance of the significance of the regression and how R² alone might not tell the whole story.

Follow-up:
I did the F-test for all the regressions with the following results

  • However, when I ran the same analysis on data from all 380 games of last season (760 samples, one per team per game), the correlation was weak (as observed for Man City's 38 games), but the regression was significant for the larger sample.
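
For readers who want to check significance themselves, here is a minimal sketch of how the F statistic relates to R² and sample size. It assumes a simple regression with one predictor and an intercept, so the degrees of freedom are 1 and n - 2; the function name `f_statistic` is made up for illustration and is not from any library.

```python
def f_statistic(r2, n):
    """F statistic for a one-predictor linear regression with intercept.

    F = (R^2 / 1) / ((1 - R^2) / (n - 2)), with (1, n - 2) degrees of freedom.
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must be in [0, 1)")
    if n <= 2:
        raise ValueError("need more than 2 observations")
    return r2 / ((1.0 - r2) / (n - 2))

# R^2 values quoted in these posts:
season_level = f_statistic(0.671, 20)  # 20 teams, season totals
game_level = f_statistic(0.04, 38)     # Man City, 38 games
```

Under these assumptions the season-level fit gives F of about 36.7, while the Man City game-level fit gives F = 1.5, which is below the 5% critical value of roughly 4.1 for (1, 36) degrees of freedom. The same formula shows why a weak correlation can still be significant with 760 samples: the (n - 2) factor grows with the sample size.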

Please keep the feedback coming!

Follow-up analysis: Final third passing and Goals scored per game


This is a follow-up to my post regarding the strong correlation between completed passes in the final third and goals scored.

Question

Is there a correlation between the final third completions & goals scored at the game level?

Analysis

I investigated whether this correlation exists at the game level using the #MCFCAnalytics data set. I plotted the completions in the final third vs. goals scored for Manchester City in all 38 of their English Premier League games.
Blue = Away; Orange = Home

Manchester City Goals vs. Pass completions in the final 3rd on a per game basis

Findings:

  • Linear regression had an R² of 0.04, implying that there is essentially no linear correlation between passes completed in the final third and goals scored at the game level.
    I did the plot for a few other teams and got similar results.
     
  • Arsenal – Away and Liverpool – Home. In both cases, Manchester City had very little success completing passes in the final 3rd. However, they lost 1-0 at the Emirates and won 3-0 at home vs. Liverpool.
    Against Liverpool, City had 6 shots on target and 2 off target.
    Against Arsenal, City had 0 shots on target and 3 off target.
  • QPR – Home and QPR – Away. City scored 3 goals against QPR both home and away. However, they had a season-high 326 completed passes in the final 3rd at home vs. just 74 in the away fixture.
    Shots vs. QPR Away – 5 on target & 10 off target.
    Shots vs. QPR Home – 15 on target and 10 off target.

The City – QPR home fixture was that crazy season finale. City fell behind and threw everyone forward to go for the win and the Premier League title. QPR were a man down from the 55th minute and defended at the edge of their 18-yard box for most of the 2nd half. This explains the unusually high number of completed passes in the final third.

The above examples underline the rarity of the “goal” event. In any given game, there could be factors like bad shooting, luck, the opponent’s goalkeeper having a great game etc., which could influence the # of goals scored. However, over a season those things seem to even out.

In the next step of analysis I will add a 2nd variable to the model and analyze.

Passing in the final third and goals – EPL 2011-12 #MCFCAnalytics


Question:

Is there a correlation between passing in the final third and the goals scored?

I used the #MCFCAnalytics data set to find the answer.

Analysis

Plot of total # of completed passes in the final third vs. goals scored for all 20 teams in the 2011-12 season of the Barclays Premier League

Findings:

  • Linear regression had an R² of 0.671, indicating a strong correlation between passes completed in the final third and goals scored.
    Excluding the outlier of Liverpool from the dataset, the R² jumped to 0.827.
  • Liverpool is ranked 3rd in the # of passes completed in the final third. However, they are only ranked 15th in goals scored.
  • 75.73 – Liverpool’s expected goals scored based on the above regression. However, they managed to score only 42 goals.
    • What is the reason for the huge negative difference?
  • Swansea’s case is interesting. You may remember the term “Swansealona” was one of the favorites with EPL analysts and reporters last season due to their reputation for passing style and high amounts of possession. However, they are below the league average on passes completed in the final third.
  • Newcastle is ranked 18th in passes completed in the final third. However, Newcastle is ranked 7th in goals scored. Expected goals scored for Newcastle is 29.6. They managed to score 51!
  • Blackburn is ranked last in passes completed in the final third. However, Blackburn scored a lot more goals (44) than their expected goals scored (24.2).
  • Stoke is at the bottom – lowest # of goals scored and 2nd lowest # of passes completed in the final third. Not surprising based on their style of play.
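
To make the over- and under-performance explicit, here is a small sketch that computes the residual (actual minus expected goals) for the three outliers, using only the figures quoted in the findings above. The names `teams` and `residual` are made up for illustration.

```python
# (expected goals from the regression, actual goals), as quoted in the findings.
teams = {
    "Liverpool": (75.73, 42),
    "Newcastle": (29.6, 51),
    "Blackburn": (24.2, 44),
}

def residual(expected, actual):
    """Positive means the team scored more than the regression predicts."""
    return actual - expected

residuals = {name: round(residual(exp, act), 2) for name, (exp, act) in teams.items()}

for name, value in sorted(residuals.items(), key=lambda item: item[1]):
    print(f"{name}: {value:+.2f}")
```

Liverpool fell short of the regression line by about 34 goals, while Newcastle and Blackburn beat it by roughly 21 and 20 goals respectively.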

Liverpool

I hypothesized that

  1. Liverpool might be crossing a lot and
  2. Most crosses occur in the final third. (I would love to look at (X,Y) data to establish this fact.)
  3. Poor shot quality (which might or might not be related to their propensity to cross)

Findings:

  • 1103 – Liverpool attempted the highest # of crosses + corners of all teams in 2011-12
  • 840 – Liverpool attempted the highest # of open play crosses in 2011-12
  • 19th in overall crossing efficiency (# of successful crosses + corners / (# of successful + # of unsuccessful crosses + corners))
  • 14th in open play crossing efficiency (# of successful open play crosses / (# of successful + # of unsuccessful open play crosses))
  • 18th in overall shooting efficiency (shots on target / (shots on target + shots off target + blocked shots))
  • 15th in shooting efficiency not including blocked shots (shots on target / (shots on target + shots off target))
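
The efficiency ratios in the list above can be written out as a short sketch. The function names are made up for illustration, and the example inputs are Manchester City's shot counts vs. QPR quoted in the game-level post, used only to show the arithmetic.

```python
def crossing_efficiency(successful, unsuccessful):
    """Successful crosses as a share of all attempted crosses."""
    attempts = successful + unsuccessful
    return successful / attempts if attempts else 0.0

def shooting_efficiency(on_target, off_target, blocked=0):
    """Shots on target as a share of all shots.

    Leave `blocked` at 0 for the variant that ignores blocked shots.
    """
    total = on_target + off_target + blocked
    return on_target / total if total else 0.0

# City vs. QPR, shots on target / off target as quoted in the game-level post.
qpr_home = shooting_efficiency(15, 10)
qpr_away = shooting_efficiency(5, 10)
```

Note that this is the on-target definition used for Liverpool here; the Newcastle and Blackburn sections use goals in the numerator instead.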

A glance at the top 10 open play crossers of Liverpool in 2011-12:

Player               Attempts    Efficiency
Downing              148         0.209
José Enrique         138         0.210
Henderson            72          0.125
Adam                 70          0.157
Gerrard              69          0.203
Bellamy              67          0.194
Johnson              65          0.185
Kuyt                 57          0.246
Suárez               47          0.149
Kelly                38          0.105
Liverpool Average                0.192
League Average                   0.202

  • 2 – According to this article on EPLIndex, Liverpool scored just 2 goals from 840 open play crosses all season. That is 1 goal for every 420 open play crosses.
  • 79 – The average # of open play crosses per goal scored in the 2011-12 season. Liverpool are almost 10 times worse than Man United (44.5) and Norwich (45.1) in the open play crosses per goal category. If there ever was a stat that would (or should) regress to the mean, this is it.
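
The crosses-per-goal arithmetic above can be checked with a few lines. All inputs are the figures quoted in the two bullets; the function name `crosses_per_goal` is made up for illustration.

```python
def crosses_per_goal(crosses, goals):
    """Open play crosses needed per goal scored from a cross."""
    return crosses / goals if goals else float("inf")

liverpool = crosses_per_goal(840, 2)  # 420 crosses per goal

# Crosses-per-goal figures quoted above for comparison.
benchmarks = {"League average": 79, "Man United": 44.5, "Norwich": 45.1}

# How many times more crosses Liverpool needed per goal than each benchmark.
ratios = {name: round(liverpool / value, 2) for name, value in benchmarks.items()}
```

On these figures Liverpool needed a little over 9 times as many crosses per goal as Man United or Norwich, and a bit more than 5 times the league average.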

Liverpool had a very talented team in 2011-12. This manifested itself in their high # of completions in the final third, where the defensive pressure is highest. Once they were in possession in the final third, they seem to have relied heavily on “crossing the ball” to enable their center-forward Andy Carroll to take a shot (or header) OR knock it down for their attacking midfielders and wide forwards to take a shot. One big problem was that delivering crosses is not a very efficient way of passing the ball. Another problem was that they did not seem to have a plan B. It is also quite possible that opponents had figured out Liverpool’s crossing strategy and their lack of a plan B. The combination of these three factors contributed significantly to Liverpool’s poor offensive display last season.

Newcastle United

  • 4th – Newcastle is 4th best in shooting efficiency (goals scored/(shots on target + shots off target)). They stayed 4th even when I included blocked shots in the denominator.
    • This could be the reason why they are an outlier in the final-third completions vs. goals scored plot.
    • Manchester City, Arsenal and Manchester United are the top 3 in shooting efficiency.

Newcastle had two great strikers in Demba Ba and Papiss Cissé, who accounted for 29 goals between them. These two were the focus of Newcastle’s attack and were very efficient with their shots. They did not need a high # of completed passes in the final third to score their goals, as they were able to convert a higher % of their shots into goals.

Blackburn Rovers

  • 7th – Blackburn are 7th best in shooting efficiency inside the box (goals scored from inside the box/(shots on target inside the box + shots off target inside the box)).
  • Yakubu scored 17 goals for Blackburn and has the 2nd best goals-to-shots ratio among all the forwards who have scored more than 10 goals.
    • This could be one of the reasons for their big positive differential between actual goals scored (44) and the expected goals scored (24.2).

Summary

The # of successful passes in the final third has a strong correlation with goals scored.

Final third is a “high-value” area for scoring goals. More completions in the final third means a team is spending more time in the high-value area. This translates into more opportunities to take a shot or draw errors from defenders to win set pieces from close range, which further increase scoring opportunities.

A high number of completions in the final third alone might not guarantee goals. Liverpool and Newcastle, two examples from the two extremes of the outlier spectrum, are cases in point. However, it is one of the key contributing factors to scoring goals. The fact that R² jumped from 0.671 to 0.827 when Liverpool’s data was excluded from the data set strengthens this point.