MCFC Analytics – Summary of blog posts #3


Thanks for the amazing response to Summary of blog posts #1 & Summary of blog posts #2

I also want to thank people who have reached out to me via twitter with links to their blogs & posts.

Goalscorer ‘footedness’ by @DavidAHopkins measures the footedness or the foot favoured by Premier League goalscorers.

How do the more successful clubs keep the ball in EPL by @JDewitt talks about how the top teams in EPL keep possession. Also by John is Successful Passing and Winning

A sneak peek of a very interesting carto by @Kennethfield  Charlie Adam’s “passing wheel”

Football Philosophy – Long passes by @Poolq1984 explores the importance of long ball in football.

@We_R_PL has a nice post on how to use the MCFC dataset more efficiently. He also has a spreadsheet with the own goals calculated per team.

@footballfactman has a post on Darron Gibson using a mix of data from MCFC dataset, whoscored and statszone

The always excellent MarkTaylor0 has a detailed post, Analysing the quality of shots in Bolton – Manchester City game, using the advanced dataset.

@ChrisJLilley has 3 posts on his blog using MCFC data

GK positional analysis

Premier league game changers Part I & Part II

@DanJHarrington has cranked up a lot of things using the advanced dataset

1. An interactive Tableau viz showing each player's touches on the pitch in the Bolton – City game.

2. Passing visualization using D3.js

3. Dan also has some interesting visualization work in progress. There is a cool video in the link showing ball movement.

Network passing diagrams by @DevinPleuler

Bolton – http://t.co/mcRQ0oHU

Man City – http://t.co/6mtGgJQS

Extracting data from XML

There have been some questions regarding this and some folks have come up with solutions

1. If you have MS Excel 2007 or a later version, you can open the XML file directly in Excel. The only issue is that the XML is nested and Excel converts it into a very flat format, so you will see multiple rows for the same event. For example, a successful pass has multiple rows indicating the direction and the x,y coordinates of where it was passed to. Read the data spec thoroughly to understand how the data is formatted in the XML. It will help you understand the data much better.

2. Code for R users to extract the F-24 XML by @MarchiMax

3. Code snippets from @JBrisson to extract events from the F-24 XML

4. If you are into programming, most languages have XML parsers. A simple search will get you code snippets to start with.
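For anyone going the programming route, here is a minimal Python sketch of how a parser can keep one row per event instead of the flattened multi-row output Excel produces. The element and attribute names in the toy XML (`Event`, `Q`, `name`, `value`, and so on) are assumptions for illustration only and are not taken from the F-24 data spec, so check the spec for the real names before adapting it.

```python
import xml.etree.ElementTree as ET

# Toy document. Element and attribute names are hypothetical,
# NOT copied from the F-24 data spec.
SAMPLE = """
<Game>
  <Event id="1" type="pass" outcome="1" x="35.2" y="60.1">
    <Q name="end_x" value="48.7"/>
    <Q name="end_y" value="55.0"/>
  </Event>
  <Event id="2" type="shot" outcome="0" x="88.0" y="47.5"/>
</Game>
"""


def events_one_row_each(xml_text):
    """Return one dict per event, folding nested children into that row."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for event in root.iter("Event"):
        row = dict(event.attrib)        # the event's own attributes
        for q in event.findall("Q"):    # nested children become extra columns
            row[q.get("name")] = q.get("value")
        rows.append(row)
    return rows


rows = events_one_row_each(SAMPLE)
for row in rows:
    print(row)
```

Each pass ends up as a single row with its end coordinates as extra columns, which is usually easier to analyse than several rows per event.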
If I missed any links, please let me know via Twitter or comment on the blog post. Always use #MCFCAnalytics tag in twitter so I can pick them up easily!


Stoke City vs. Manchester City–Opposition Analysis


This is an “Opposition analysis” of Stoke City, Manchester City’s opponent on Saturday 15th September at the Britannia Stadium. I used the #MCFCAnalytics Lite data set to do this analysis.

Stoke

Stoke – Offense

Offensively bad
Goals scored: 20th
Shots on target: 20th
Shots off target: 14th

Strong in the air
Headed goals: 14 – 3rd in the League – 40% (14/35) of their goals are from headers
Headed shots (on + off target): 2nd

Poor final 3rd passing
Completions in final 3rd: 19th

Lots of “throw ball”
Attempts on goal from throws (on + off target): 1st

 

Stoke – Key attacking players

 

Goals: Peter Crouch 10 (5 from headers), Jonathan Walters 7
Shots: Jonathan Walters took the highest # of shots for Stoke (55), followed by Peter Crouch (49). Robert Huth leads in headed shots with 13 (on + off target).
Assists: Matthew Etherington 7, Jermaine Pennant 6, Jonathan Walters 5
Final third passing: Peter Crouch (286 – 46% completion rate) had the most completions in the final 3rd. Glenn Whelan (268 – 63%) and Jonathan Walters (253 – 54%) are the next best passers in the final 3rd.
Other interesting aspects: 1st in Assists to Goals ratio

 

Stoke – Offensive summary

Stats from last season indicate that Stoke is very “direct” in its attack (my “Eureka” moment right there!). Headed goals constitute 40% of their total goals scored. They were last in goals scored and shots on target. Overall, a very poor offensive record.

Peter Crouch is Stoke’s top-scorer. Jonathan Walters is the most valuable offensive player who can score (7 goals) as well as provide (5 assists) and is one of their most active passers in the final third.

Peter Crouch has the highest # of completions in the final third, but that is not saying much. His completion % is below par. It is very likely that Stoke’s offensive game plan is to lob long balls in the direction of Peter Crouch, whose subsequent pass (or header) is easily intercepted. Stoke ranked last in the # of corners won. This is partly explained by the fact that they are 19th in final 3rd completions. Could it be because Stoke are not spending enough time in the final 3rd to force clearances or mistakes from opposing defenders?

Stoke have the league’s highest Assist-to-Goals ratio. Stoke are very poor shooting from outside the box. The two stats put together imply that they neither have someone who can make dangerous solo runs at the defence and create a goal scoring opportunity on their own nor possess a goal-scoring threat from outside the box. Pulis has addressed the latter by signing Charlie Adam, whose 40-yard boomers will at least add another dimension to their attack. Adam’s signing should also improve their passing in the final 3rd.

They signed free agent Michael Owen last week in an effort to improve their goal scoring. It is likely that Tony Pulis is looking for someone who can pounce on the knockdowns by Peter Crouch in and around the 18-yard box and take high percentage shots on goal. Not a bad idea in theory, but I am skeptical about the kind of impact Michael Owen is going to have in the Stoke system.
New signing US defender Geoff Cameron will provide cover for Rory Delap with his powerful throw.
All their new signings are geared towards upgrading personnel for their direct approach rather than try something different.

Stoke – Defence

Goals conceded: 53 – tied for 7th
Penalties conceded: 7 – tied for 4th
Shots conceded: 2nd – from outside the box; 13th – from inside the box
Corners conceded: 5th
Aerial duels: 2nd – total duels; 1st – duels won; 1st – duels winning %
Ground duels won %: 20th
Tackles: 20th – tackles won; 20th – tackles winning %
Clearances: 1st – headed clearances; 1st – total clearances
Fouls committed: 5th
Pitch size: 100 x 64 m (vs. Man City’s 105 x 68 m) – smallest in EPL

 

Stoke – Defensive summary

Stoke rely heavily on their strong and physical aerial game in defence as well. Stoke create the 2nd highest # of aerial duel situations in the league and are the best in the league in winning % of aerial duels. This shows Stoke’s clear preference for playing the ball in the air to take advantage of the physical attributes of its players. On the flip side, Stoke are the worst in the league in winning ground duels & tackles.

Stoke have the highest # of clearances with 1910 (459 more than Norwich, who are second; the league average is 1128). They also have the highest # of headed clearances with 959 (159 more than QPR, who are second; the league average is 575). This indicates that Stoke defenders are probably slow, tend to react late and get into situations where they have to make a clearance. They conceded the 5th highest # of corners in the league.

The Britannia Stadium has the smallest pitch in the Premier League. It is 4 meters narrower, 5 meters shorter and 10.36% smaller in area than Manchester City’s. This means less space to work with on the ground. The passing angles for players like David Silva, Tevez et al. will be restricted. It will make it easier for Stoke defenders to close down the opposition’s attacking players despite their inferior technique and slowness. It also makes the aerial game a bit easier, as the likelihood of completing a pass through the air is probably higher than that of completing one on the ground in a small and crowded field. (This could also be the reason why Stoke defenders are forced to make so many headed clearances, as opposing teams are forced to play the ball in the air to have a better chance of completing a pass in and around the 18-yard box.)
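The 10.36% figure follows directly from the two pitch sizes listed in the defence table. A quick Python check (the function name is just a helper for this illustration):

```python
def pitch_area(length_m, width_m):
    """Area of a rectangular pitch in square meters."""
    return length_m * width_m


stoke = pitch_area(100, 64)   # Britannia Stadium
city = pitch_area(105, 68)    # Manchester City's pitch

# Share of City's pitch area that is "missing" at Stoke.
pct_smaller = (city - stoke) / city * 100
print(f"Stoke: {stoke} sq m, City: {city} sq m, {pct_smaller:.2f}% smaller")
```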

City should field as many players as possible who can pass the ball in tight spaces to move the ball on the ground close to their 18-yard box to force hurried clearances, defensive mistakes and set pieces. Stoke give up the 2nd highest # of shots on target from outside the box. Yaya Toure must fancy his chances of scoring a goal in this game.

Stoke are reasonably good at restricting shots from inside the box (13th in shots conceded from inside the box). This could be due to their physical defending style. If the ref is “letting them play”, then the Stoke defence could frustrate the opposition’s attackers and force them to settle for shots from outside.

 

Stoke – Goalkeeping (Asmir Begovic)

Goals conceded: 31 – 1.41 goals per game, 7th (among all GKs with 20 or more starts)
GK distribution efficiency (successful GK distribution / total GK distribution): 67% – 11th
Long passes completion rate: 51% – 1st
Short passes completion rate: 75% – 19th (only 24 short passes attempted)
Proportion of long passes among all passes: 95% – 1st (league average 76%)

 

Stoke – Goalkeeping Summary

I have considered only Asmir Begovic’s numbers for this analysis, as he is the starter this season. He is the best in the league at completing long passes and one of the worst goalkeepers at completing short passes. 95% of all passes attempted by Begovic are long (1st in the league). He gave up 1.41 goals per game, 7th highest in the league. However, at home he conceded only 0.727 goals per game (8 goals in 11 games) and kept 4 clean sheets. Overall, Stoke gave up only 20 goals in 19 home games. This was one of the main reasons for their survival last season.

 

City vs. Stoke head-to-head 2011-12

  • City drew 1-1 at the Britannia and won 3-0 at home
  • Can you guess who scored the away goal for City and from where? Yaya Toure, from outside the box. Peter Crouch scored for Stoke.
  • In the home game at the Etihad, Aguero scored 2 and Adam Johnson 1.

 

Final word

City will find the going tough but should win this game. The key for City is to stay patient with their passing game and not be drawn into the physical and aerial battle that Stoke are so comfortable with. Stoke do not create many clear scoring chances. If the City defence can keep their errors to a minimum, Stoke will most likely not score.

Passing in the final third and goals – EPL 2011-12 #MCFCAnalytics


Question:

Is there a correlation between passing in the final third and the goals scored?

I used the #MCFCAnalytics data set to find the answer.

Analysis

Plot of total # of completed passes in the final third vs. goals scored for all 20 teams in the 2011-12 season of the Barclays Premier League

 Findings:

  • Linear regression had an R² of 0.671, indicating a strong correlation between passes completed in the final third and goals scored.
    Excluding the outlier of Liverpool from the dataset, the R² jumped to 0.827.
  • Liverpool is ranked 3rd in the # of passes completed in the final third. However, they are only ranked 15th in goals scored.
  • 75.73 – Liverpool’s expected goals scored based on the above regression. However, they managed to score only 42 goals.
    • What is the reason for the huge negative difference?
  • Swansea’s case is interesting. You may remember the term “Swansealona” was one of the favorites with EPL analysts and reporters last season due to their reputation for passing style and high amounts of possession. However, they are below the league average on passes completed in the final third.
  • Newcastle is ranked 18th in passes completed in the final third. However, Newcastle is ranked 7th in goals scored. Expected goals scored for Newcastle is 29.6. They managed to score 51!
  • Blackburn is ranked last in passes completed in the final third. However, Blackburn scored a lot more goals (44) than their expected goals scored (24.2)
  • Stoke is at the bottom – Lowest # of goals scored and 2nd lowest # of passes completed in the final third.  Not surprising based on their style of play.
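For readers who want to try this kind of analysis themselves, here is a minimal Python sketch of a least-squares fit and its R², computed with and without one outlier. The numbers in `toy` are made up purely for illustration; they are NOT the MCFC Analytics figures, and the team labels are placeholders.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys):
    """Share of the variance in ys explained by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up (final-third completions, goals) pairs, for illustration only.
toy = {
    "A": (1000, 30),
    "B": (1500, 42),
    "C": (2000, 51),
    "D": (2500, 63),
    "E": (3000, 70),
    "Outlier": (2800, 35),  # lots of completions, few goals
}

xs_all = [x for x, _ in toy.values()]
ys_all = [y for _, y in toy.values()]
r2_all = r_squared(xs_all, ys_all)

kept = {team: v for team, v in toy.items() if team != "Outlier"}
xs_kept = [x for x, _ in kept.values()]
ys_kept = [y for _, y in kept.values()]
r2_without = r_squared(xs_kept, ys_kept)

print(f"R2 with outlier: {r2_all:.3f}, without: {r2_without:.3f}")
```

A team's expected goals from the fit is then `intercept + slope * completions`, which is the kind of figure quoted in the findings above.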

Liverpool

I hypothesized that

  1. Liverpool might be crossing a lot and
  2. Most crosses occur in the final third. (I would love to look at (X,Y) data to establish this fact.)
  3. Poor shot quality (which might or might not be related to their propensity to cross)

Findings:

  • 1103 – Liverpool attempted the highest # of crosses + corners of all teams in 2011-12
  • 840 – Liverpool attempted the highest # of open play crosses in 2011-12
  • 19th in overall crossing efficiency (# of successful crosses + corners / (# of successful + # of unsuccessful crosses + corners))
  • 14th in open play crossing efficiency (# of successful open play crosses / (# of successful + # of unsuccessful open play crosses))
  • 18th in overall shooting efficiency (shots on target / (shots on target + shots off target + blocked shots))
  • 15th in shooting efficiency not including blocked shots (shots on target / (shots on target + shots off target))

    A glance at the top 10 open play crossers of Liverpool in 2011-12.

Player          Attempts   Efficiency
Downing         148        0.209
José Enrique    138        0.210
Henderson       72         0.125
Adam            70         0.157
Gerrard         69         0.203
Bellamy         67         0.194
Johnson         65         0.185
Kuyt            57         0.246
Suárez          47         0.149
Kelly           38         0.105

Liverpool average efficiency: 0.192
League average efficiency: 0.202

  • 2 – According to this article on EPLIndex, Liverpool scored just 2 goals from 840 open play crosses all season. That is 1 goal per 420 open play crosses.
  • 79 – The average # of open play crosses per goal scored in the 2011-12 season. Liverpool are almost 10 times worse than Man United (44.5) and Norwich (45.1) in the open play crosses per goal category. If there ever was a stat that would (or should) regress to the mean, this is it.
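The crossing numbers above are simple ratios. Here is a small Python sketch of the two calculations; the function names are my own helpers for this illustration, and the inputs are the figures quoted in the bullets.

```python
def efficiency(successful, unsuccessful):
    """Successful attempts as a share of all attempts."""
    return successful / (successful + unsuccessful)


def crosses_per_goal(crosses, goals):
    """How many crosses it took, on average, to produce one goal."""
    return crosses / goals


# Liverpool: 2 goals from 840 open play crosses.
liverpool = crosses_per_goal(840, 2)

# Man United needed 44.5 open play crosses per goal.
ratio_vs_united = liverpool / 44.5

print(f"Liverpool: {liverpool:.0f} crosses per goal, "
      f"{ratio_vs_united:.1f}x Man United's figure")
```

The same `efficiency` helper covers both the crossing and the shooting efficiency definitions, since each is successes divided by total attempts.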

Liverpool had a very talented team in 2011-12. This manifested itself in their high # of completions in the final third, where the defensive pressure is highest. Once they were in possession in the final third, they seem to have relied heavily on “crossing the ball” to enable their center-forward Andy Carroll to take a shot (or header) or knock it down for their attacking midfielders and wide forwards to take a shot. One big problem was that delivering crosses is not a very efficient way of passing the ball. Another problem was that they did not seem to have a plan B. It is quite possible that opponents figured out Liverpool’s crossing strategy and their lack of a plan B. The combination of these three factors contributed significantly to Liverpool’s poor offensive display last season.

Newcastle United

  • 4th – Newcastle is 4th best in shooting efficiency (goals scored/(shots on target + shots off target)). They stayed 4th even when I included blocked shots in the denominator.
    • This could be the reason why they are an outlier in the final-third completions vs. goals scored plot.
    • Manchester City, Arsenal and Manchester United are the top 3 in shooting efficiency.

Newcastle had two great strikers in Demba Ba and Papiss Cisse, who accounted for 29 goals between them. These two were the focus of the Newcastle attack and were very efficient with their shots. They did not need a high # of completed passes in the final third to score their goals, as they were able to convert a higher % of their shots into goals.

Blackburn Rovers

  • 7th – Blackburn are 7th best in shooting efficiency inside the box (goals scored from inside the box/(shots on target inside the box + shots off target inside the box)).
  • Yakubu scored 17 goals for Blackburn and has the 2nd best Goals to Shots ratio among all the forwards who have scored more than 10 goals.
    • This could be one of the reasons for their big positive differential between actual goals scored (44) and the expected goals scored (24.2).
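To put the three outliers side by side, here is a short Python sketch of the gap between actual and expected goals. The actual and expected values are the ones quoted in this post; the dictionary layout and variable names are just for illustration.

```python
# (actual goals, expected goals from the regression), as quoted in this post.
teams = {
    "Liverpool": (42, 75.73),
    "Newcastle": (51, 29.6),
    "Blackburn": (44, 24.2),
}

# Positive residual: scored more than the fit predicts. Negative: fewer.
residuals = {
    team: round(actual - expected, 2)
    for team, (actual, expected) in teams.items()
}

for team, diff in sorted(residuals.items(), key=lambda item: item[1]):
    print(f"{team}: {diff:+.2f}")
```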

Summary

# of successful passes in the final third has a strong correlation to goals scored.

Final third is a “high-value” area for scoring goals. More completions in the final third means a team is spending more time in the high-value area. This translates into more opportunities to take a shot or draw errors from defenders to win set pieces from close range, which further increase scoring opportunities.

A high number of completions in the final third alone might not guarantee goals. Liverpool and Newcastle, two examples from the two extremes of the outlier spectrum, are cases in point. However, it is one of the key contributing factors to scoring goals. The fact that R² jumped from 0.671 to 0.827 when Liverpool’s data was excluded from the data set strengthens this case.
