MCFC Analytics blogposts Summary #9


In the past week, I found the following posts written using the #MCFCAnalytics data

  1. Some interesting work by @PedroAfonso85, building on previous work, breaking down the importance of ball possession, plus some discussion of the oft-discussed yet hard-to-quantify momentum.
  2. @MarkTaylor0 analyzed blocked shots to find out whether blocking shots is a talent.
  3. @hpstats visualized the points difference “with/without” a player in the starting lineup. The same blog also has a post profiling players based on their shooting.
  4. @SportsViz has a video with examples of 3D visualization of passes using the data from the Bolton vs. City game.

Previous Summaries

Summary #8

Summary #7

Summary #6

Summary #5

Summary #4

Summary #3

Summary #2

Summary #1

MCFC Analytics blogposts – Summary #8


Here is the list of interesting posts I found in the past week

  1. An interesting post on home advantage and how it manifests itself into football stats by @FbPerspectives. The post also has a link to a detailed paper from 2009 on home advantage.
  2. Guardian Data blog has an interactive visualization of the Bolton – City game by @jburnmurdoch. The viz has a pitch map + a radial diagram that captures the pass direction and length.
  3. The man in the yellow shirt – an analysis of the refs by @PedroAfonso85
  4. An interactive visualization of the direction of a player’s passes by @alekseynp. Some of the outliers are very interesting.
  5. Momentum in the Bolton – City game by @SoccerStatistic. This is a different approach from the previous attempts at visualizing momentum using this data set.

I did not publish anything last week, although I did start writing. Hopefully I will publish something later this week.

Previous Summaries

Summary #7

Summary #6

Summary #5

Summary #4

Summary #3

Summary #2

Summary #1

If I missed any, please post them in the comments section or tweet them to me!

MCFC Analytics – blogposts summary #7


I did not see too many new posts in the past week. I didn’t publish any as I was busy with a different project.

  1. An interactive viz of the Bolton – Manchester City match data by @JBurnMurdoch on the @GuardianData blog
  2. @HPStats attempts to define metrics for clustering players based on their style. Here is a good first step on passing.
  3. @shots_on_target made a summary of vital stats regarding goals, shooting accuracy, penalties, etc.
  4. Scouting report on Tim Howard by @footballfactman
  5. An interactive visualization of the full dataset by @PhilyB1976. I posted this in one of the first few summary posts, but there is additional information on the site. Worth revisiting!
  6. An

Previous Summaries

Summary #6

Summary #5

Summary #4

Summary #3

Summary #2

Summary #1

If I missed any, please post them in the comments section or tweet them to me!

MCFC Analytics-Summary of blogposts #6


This week I saw a few more new bloggers getting into the act with the data.

First up, there was this article by @RWhittall of TheScore.com where Richard talked about “soccer data abuse by some bloggers using the MCFC data”. The gist of the article is that some of the bloggers are extrapolating too much with their conclusions based on one year’s worth of data from one league. The other point made in the article is that the output of the majority of  the work in soccer analytics isn’t groundbreaking and it is just adding a data context to what we already knew.

While I see where Richard is coming from, I don’t quite agree either with his assessment of the state of soccer analytics or the “data abuse” bit.

Unquestionably, we haven’t even scratched the surface of what we can do with data in soccer. The majority of the research work in soccer analytics is carried out in the private domain. That is because soccer data is not a commodity like it is in other sports such as baseball. The MCFC & Opta project could be a significant step toward making soccer data accessible to a wider audience, if it can get enough passionate people interested in the project. However, as with any type of writing in the public domain, there is the good and the not so good. One of the things we discussed with Gavin Fleig, Head of Performance Analysis at Manchester City, Simon Farrant, Marketing Coordinator at Opta, and others is building a community that fosters communication, collaboration and open feedback among the members and the readers. This should help everyone get better over time.

Without further ado, here are links to some interesting work I found in this past week.

@MarkTaylor0 has a comprehensive piece on the state of soccer analytics and where it stands vis-à-vis other sports like NFL and Baseball. – The case for data analysis in football. This is a must read.

Analytics posts

  1. @PedroAfonso85 has a couple of posts using the advanced data set
  2. @ChrisJLilley continues his positional analysis series with strikers and central attacking midfielders
  3. @FootballFactman’s piece talks about what to look for in Premier League goalkeepers
  4. @shots_on_target talks about the correlation between points in fantasy football and attacking stats
  5. In my weekly opposition analysis series I analyzed Sunderland using last season’s data.

Visualization posts

  1. Earlier today I saw Voetstat, a neat blog by @Voetstat_craig which has some visualizations of pass completion + heatmaps. There are multiple posts. I haven’t had a chance to read all of them yet.
  2. @TomBerthon has this visualization of how goals were scored in the Bolton – City game from last season

If I missed any links, post them in the comments section and tweet them with the hashtag #MCFCAnalytics. I will retweet them.

Previous Summaries

Summary #5

Summary #4

Summary #3

Summary #2

Summary #1

Sunderland – Opposition Analysis


This is an “Opposition analysis” of Sunderland, City’s opponent on Saturday 2012/10/6 at the Etihad. I used the #MCFCAnalytics Lite data set to do this analysis.

Disclaimer: The analysis is primarily based on data from 2011-12 season with some data points from the first six games of 2012-13 season.

Sunderland – Offense

Goals: 1.13 per game – 12th (excluding own goals)
Strong on direct free kicks: 5 goals – 1st
% of open play goals: 76% – 4th
% of goals from inside the box: 69.7% – 19th
% of goals from outside the box: 30% – 2nd
Shots on target: 3.71 per game – 18th
Efficiency (goals / shots on + off target): 7th
Efficiency inside the box: 16th
Efficiency outside the box: 3rd
Assists per goal scored: 18th
Poor from inside the box: 19th in proportion of goals from inside & 16th in efficiency
Strong from outside the box: 5th most goals and 3rd most efficient from outside
Final 3rd completions / comp %: 16th / 14th
Poor in the final third and opposition box: 18th in final 3rd touches & 19th in touches in opp. box
Poor in short passing (completions / comp %): 15th / 15th
Good in long balls (# of successful / success %): 9th / 8th
Good in open play crosses: 7th in # of crosses & crossing accuracy
Very few through balls: 18th – less than 1 through ball per game
Other: Sunderland took just 6 short corners all season, fewest in the league.

Sunderland – Key attacking players (2011-12)

Goals: Bendtner – 8, Larsson & Sessegnon – 7 each, McClean – 5
Shots on target: Bendtner – 23, Sessegnon – 21, Larsson – 17
Efficiency: Larsson – 23%, McClean – 17% and Bendtner – 16%
Assists: Sessegnon – 9, Bendtner – 5
Final 3rd completions: Sessegnon – 401, Larsson – 281, Bardsley – 253
Final 3rd completion %: Sessegnon – 77.4%, Larsson – 66.27%, Bendtner – 60.6%
Touches in opposition box: Sessegnon – 120, Bendtner – 98

Sunderland – Offensive summary

Major personnel changes for 2012-13

IN – Steven Fletcher; OUT – Nicklas Bendtner

Bendtner was Sunderland’s highest goal-scorer last season with eight. He also had five assists. Steven Fletcher is doing more than enough to replace him: he has scored all five of Sunderland’s goals so far. There have not been any major changes apart from this.

What the numbers say

A mixture of long balls, great long-range shooting, some great free kicks and accurate crossing was the mainstay of Sunderland’s offense last season. Their attack ran through Stéphane Sessegnon, Sebastian Larsson and Nicklas Bendtner.

Sunderland were poor in the final third and even worse from inside the box (19th in touches inside the opposition box). They scored 30% of their goals (13) from outside the box. They did not play many through balls (less than 1 per game, 18th in the EPL) or register many assists (18th in assists per goal scored). Sunderland were also poor in short passing (15th in # of completions and completion %). These stats indicate that Sunderland were very direct in attack. The low # of assists per goal is likely due to Sunderland playing counterattacking football, with less emphasis on interplay between multiple players in the final third to create a chance. They used long balls to good effect to get close to the opponents’ goal and take shots from outside the box. Their shooting and shooting efficiency from inside the box were poor.
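The goal shares quoted above can be re-derived with a few lines of arithmetic. The sketch below is only an illustration, not the workflow used for this post: it assumes a 38-game season and infers the season goal total from the rounded 1.13 goals-per-game figure. The `shot_efficiency` helper is a hypothetical name that simply encodes the goals / (shots on + off target) definition from the table.

```python
GAMES_IN_SEASON = 38         # assumption: a full Premier League season
GOALS_PER_GAME = 1.13        # from the table (excluding own goals)
GOALS_FROM_OUTSIDE_BOX = 13  # from the summary above


def shot_efficiency(goals: int, shots_on_target: int, shots_off_target: int) -> float:
    """Goals divided by all shots (on target + off target)."""
    total_shots = shots_on_target + shots_off_target
    if total_shots <= 0:
        raise ValueError("need at least one shot")
    return goals / total_shots


# Season total inferred from the rounded per-game figure (an estimate).
total_goals = round(GOALS_PER_GAME * GAMES_IN_SEASON)
goals_from_inside_box = total_goals - GOALS_FROM_OUTSIDE_BOX

share_outside = GOALS_FROM_OUTSIDE_BOX / total_goals
share_inside = goals_from_inside_box / total_goals

print(f"inferred season goals: {total_goals}")
print(f"outside the box: {share_outside:.1%}, inside the box: {share_inside:.1%}")
```

Under those assumptions the shares come out at roughly 30% and 70%, in line with the table up to rounding.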

So far this season

Steven Fletcher has accounted for all five goals Sunderland have scored this season. As last season, they have a hard time keeping possession of the ball. They are unbeaten this season with four draws and a win. Their inability to hold on to leads (or to score an extra goal) has cost them dearly: they took a lead into the second half in four of the five games but have won only once. Their problems with keeping the ball mean that opponents find it easier to break through, especially in the second half, when Sunderland are most likely trying to protect a lead or a point.

Steven Fletcher – One man army, so far. Picture courtesy – dailyrecord.co.uk

Sunderland – Defence

Goals conceded: 1.21/game – 5th fewest
Final 3rd pass completions allowed: 100/game – 6th most
Short passes allowed: 343/game – 2nd most
Shots on target conceded: 8th fewest
Lots of headed clearances: 7th most
Fouls conceded: 10.8/game – 8th fewest
Tackling machines!: 1st in tackles won; 76% tackle success rate – 5th highest; 5th in last man tackles
Weak in aerial duels, strong in ground duels: 19th in % of aerial duels won; 5th in % of ground duels won
Corners: 7th most corners conceded but conceded just 1 goal from corners, fewest in the league
Make it easy for opponent GKs: 3rd highest GK distribution success for opponent GKs
Opponents get a lot of clean sheets: 3rd highest # of clean sheets for opponents

Sunderland – Defensive summary

Based on the numbers, Sunderland are a clean-tackling defence that does not concede many shots on target. However, they allow opponents a lot of short passes and pass completions in the final third. This indicates that they are likely not pressing and are defending deep. Opponent GKs have great success (over 70%, 3rd in the league) distributing the ball against Sunderland, another indicator that they do not press much and stand off the player on the ball. Their relatively low foul count probably reflects this too. They concede a high number of corners but conceded just one goal from corners last season. They are strong in ground duels and among the worst teams in aerial duels. They also make a high number of headed clearances.

City had a lot of success against Sunderland in the final third with 181 & 167 completions away and home respectively (average: 135). However, this advantage did not translate into shots on target for City. This could be a side effect of Sunderland’s clean tackling and high # of headed clearances.

Sunderland – Goalkeeping – Simon Mignolet

Goals conceded overall: 1.13/game – 6th fewest
Goals from outside: 0.31/game – 4th most in the league
Saves made: 3.2/game – 7th most
GK distribution efficiency (successful GK distribution / total GK distribution): 17th of 18 GKs with 29 or more starts
Long passes completion: 34% – 16th of 18
Short passes completion rate: 77.4% – 17th of 18 (53 attempts, 2nd fewest)
Ratio of long to short passes: 90-10

Sunderland – Goalkeeping Summary

Mignolet is good with saves and does not allow many goals (which is probably a reflection of the overall defensive scheme, not just the goalkeeper). However, he seems to have trouble distributing and passing the ball. His passing is heavily skewed toward long passes. These numbers indicate that Mignolet hoofs the ball as far as possible, and most of the time his passes end in a loss of possession.

The low number of short passes and pass completion rate of short passes could be indicative of an overall scheme and/or that Mignolet & the Sunderland central defenders are not very good at passing short from their goal.

This means pressing the ball high in the defensive third of Sunderland could be a very productive strategy for opponents. City forwards might enjoy a lot of success prolonging their possessions in the final third by keeping the pressure on the Sunderland GK and defence.
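As a rough illustration of the goalkeeper ratios used above, here is a minimal sketch. The helper name `distribution_efficiency` is hypothetical, the 41 completed short passes are inferred from the quoted 77.4% of 53 attempts, and the long-pass volume is only an estimate implied by the 90-10 split. None of these are raw counts taken from the data set.

```python
def distribution_efficiency(successful: int, total: int) -> float:
    """Successful distributions divided by total distributions."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successful / total


SHORT_ATTEMPTS = 53    # from the table
SHORT_COMPLETED = 41   # inferred: 77.4% of 53 is about 41
LONG_SHARE, SHORT_SHARE = 90, 10  # long-to-short ratio from the table

short_rate = distribution_efficiency(SHORT_COMPLETED, SHORT_ATTEMPTS)

# If short passes are about 10% of all passes, the long-pass volume is
# roughly nine times the short-pass volume (an estimate only).
estimated_long_attempts = round(SHORT_ATTEMPTS * LONG_SHARE / SHORT_SHARE)

print(f"short pass completion: {short_rate:.1%}")
print(f"estimated long pass attempts: {estimated_long_attempts}")
```

The point of the estimate is scale: a goalkeeper with only 53 short passes in a season is sending the ball long several hundred times.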

City vs. Sunderland head-to-head 2011-12

  • Sunderland had great success against City last season. They took four points from the champions.
  • At the Etihad, City needed a big comeback from 1-3 down to salvage a point.
  • Sebastian Larsson x 2 and Nicklas Bendtner were the scorers for Sunderland. Mario Balotelli x 2 and Aleksandar Kolarov scored for City.
  • At the Stadium of Light, Sunderland upset City 1-0 with a late goal from Ji Dong-won.
  • City had 181 (away) and 167 (home) completions in the final third, both higher than their average of 135/game. Shots were close to their game averages.

Final word

City is very likely to have a lot of success pressing Sunderland in their defensive third. They might not find it very difficult to pass short and have lengthy possession spells in the Sunderland final third. However, they need to stay patient as Sunderland defend very well as a team. Sessegnon, Fletcher and Larsson are the three players to watch out for at the other end of the pitch.

MCFC Analytics – Summary of blog posts #5


We had a great meeting this weekend to discuss how to move our community forward. We discussed some great ideas. As @MCFCGavinFleig pointed out on twitter, the next big announcements and steps forward will be public in late November/early December when the “CityAnalyticsCommunity” will be launched. Until then, keep blogging away with the data.

Here is a summary of the blog posts based on #MCFCAnalytics data.

Analytics posts

  1. @MarkTaylor0: How passing sequences create chances – the title is self-explanatory. Great post.
  2. @JDewitt: Long passing in the Premiership – John looks at long passing and its correlation with finishing position in the league table. Interesting post. A question that came up when I read it: how does long passing correlate with points or goals scored instead of position in the league table?
  3. @TheWestStandO digs deeper into Fernando Torres’ struggles in front of goal last season
  4. @ChrisJLilley defines metrics and rates the attacking midfielders, central midfielders and defensive midfielders of last season.
  5. @We_R_PL: Comparison of top scorers in the EPL
  6. @Hpstats – A better passing statistic. This was posted in the comments of summary #4.
  7. Fulham – opposition analysis by me

Visualization posts

  1. @AlexThamks – a neat viz of Assists, chances created & key passes per formation + the best 11 for each formation using Tableau Public
  2. @Tomberthon  – a visualization of the advanced data set and how to make sense out of it

Other

  1. @MarchiMax has a refined version of the #Rstats code for parsing the F-24 XML
  2. @DannyPage has implemented Ruby on Rails code for importing the F-24 XML

Past summaries

Summary #4

Summary #3

Summary #2

Summary #1

MCFC Analytics – Summary of blog posts #4


It has been about a month since the basic MCFC data set was released, and it is great to see lots of people churning out work using both the basic and advanced data sets.

Based on the tweets with the #MCFCAnalytics tag, quite a few people have projects in progress. Good luck to all of you. Make sure you share your project/blog links with the hashtag.

Some people are looking for partners and contributors for the projects they are working on. If you are interested, please keep tabs on the #MCFCAnalytics tag and get in touch with folks directly.

Analysis posts

  1. @MarkTaylor0: Analyzing passes by comparing them to their expected pass completion rates, using James Milner’s passes in Bolton vs. Manchester City from the 2011-12 season.
  2. Mark also has a post on how Man City and Bolton passed the ball
  3. @Jdewitt: How goals are scored in the EPL
  4. @ChrisJLilley: Analyzing center-backs of the Premier League
  5. @analysefooty (this blog!): Opposition analysis of Arsenal

Visualization posts

  1. @DanJHarrington – a very interesting visualization of passes using vector diagrams in Tableau Public
  2. @MarchiMax – a visualization of where the ball is a few seconds before a shot is taken
  3. @Ongoalsscored: Visualization of the goalscorer’s body parts. Very neat!

If I missed any please post your links in the comments section.

Links to previous summaries

Summary #1

Summary #2

Summary #3

Feel free to tweet me or email me if you want to chat with me on something specific!

Arsenal – Opposition Analysis


This is an “Opposition analysis” of Arsenal, City’s opponent on Sunday 23rd September at the Etihad Stadium. I used the #MCFCAnalytics Lite data set to do this analysis.

Arsenal – Offense

Open play goals – bread & butter
Goals scored: 3rd in aggregate, from inside the box and from open play
% of open play goals: 1st
Shots on target: 3rd
Shots on target inside the box: 1st
Shot efficiency (goals / shots on + off target): 3rd overall, 7th outside the box, 5th inside the box
Assists per goal scored: 5th

Strong from inside the box: 1st in # of shots on target
Weak from outside the box: 16th in % of goals from outside the box

Passing
Final 3rd completions / comp %: 3rd / 4th
Short passes completions / comp %: 1st / 6th
Long passes completions / comp %: 16th / 7th
Long balls completions / comp %: 20th / 19th

Other
2nd in open play touches in the opposition’s 18-yard box
18th in open play crossing efficiency (successful open play crosses / successful + unsuccessful open play crosses)

Importance of 1st goal
Scored the first goal 23 times – 3rd in EPL
Record when scoring first: 16 W – 3 D – 4 L
Record when not scoring first: 5 W – 4 D – 6 L

Arsenal – Key attacking players

Goals: Van Persie – 30, Walcott – 8, Arteta & Vermaelen – 6 each
Shots on target: Van Persie – 82, Walcott – 34, Ramsey – 18, Gervinho – 17
Efficiency: Van Persie – 21.2%, Arteta & Vermaelen – 23%, Walcott – 13.7%
Assists: Song – 11, Van Persie – 9, Walcott – 8, Gervinho – 6
Final 3rd passing completions: Arteta – 617, Ramsey – 502, Rosicky – 501
Final 3rd passing completion %: Arteta – 85.7%, Gervinho – 80.9%, Sagna – 80.1%
Other interesting aspects: Immediate impact of Santi Cazorla, Lukas Podolski and Olivier Giroud

Arsenal – Offensive summary

Personnel changes

RVP was colossal for Arsenal last season with 30 goals. Arsenal’s 2nd highest goal-scorer was Theo Walcott with 8. The Dutchman is no longer with the club. He has been replaced by the three-headed monster, Podolski – Giroud – Cazorla.

At first sight it might seem like an RVP-less Arsenal would be a lot easier to defend. It might even be true for the first handful of games of the season. However, once Giroud, Podolski and Santi Cazorla are in-sync with each other and with Arsene Wenger’s scheme, they will be a much harder team to defend.

As Arsene Wenger pointed out after the 6-1 win over Southampton, when you have someone like RVP who scored 30 goals, the opposition knows who will get the ball. Arsenal have added variety to their attack with Giroud, Podolski and Cazorla up front. All three can shoot, score, assist and work to create space for the others.

While Giroud has not scored yet, his movement has been intelligent and he has been unlucky on occasion. Santi Cazorla has slotted in seamlessly at Arsenal (and in the EPL), and much the same goes for Lukas Podolski. Cazorla leads the EPL in completions in the final third and already has a goal and 2 assists. Podolski has 2 goals and an assist.

What the 2011-12 numbers say

Based on last year’s numbers, Arsenal’s attack is primarily based on short passing and taking high-percentage shots from close range. They were 1st in short passes completed and 1st in shots on target from inside the box. Arsenal also get the majority of their goals from open play. They were 2nd in touches inside opponents’ 18-yard box. Arsenal also have a high assist-to-goal ratio. They were bottom of the table in long balls and 16th in long pass completions. They also do not cross particularly well.

All this put together: Arsenal pass, pass and pass some more until they get inside the area. Once inside the area they try to pass again before taking a high percentage shot (or miss the shooting opportunity).

They were average to mediocre at converting corners and set pieces, although that might change with the arrival of Steve Bould as Wenger’s deputy. Steve is known for his preparation and tactical work on set pieces. We have already seen some of it this season, with Cazorla making signals while holding up the ball before taking corners. Both Cazorla and Lukas Podolski are very good free-kick takers, and Cazorla has a powerful shot from outside the box. He led La Liga last season with 5 goals from outside the box (including direct free kicks).

Santi Cazorla – Genius : Photo Courtesy – Guardian

I wrote a piece about Santi Cazorla’s impact on a football team a few weeks ago. He has already had a big impact at Arsenal. Not only does he add bite to the attack up front, his arrival also allows Arteta to play much deeper in central midfield, which seems to suit him better. This also allows Arsenal to transition quickly to their defensive shape when not in possession. Cazorla and Podolski both track back to defend when they lose the ball, something that RVP was not very good at.

To slow the Arsenal offense, City needs to find a way to minimize the impact of Cazorla and Podolski. Arsenal are a bit weak at fullback due to the absence of right back Bacary Sagna. Carl Jenkinson is playing in his place and has looked suspect. They do not attack much on the right, as Jenkinson stays conservative for the most part. Gibbs on the left side has been much more adventurous. If you did a heat map of Arsenal’s attacks so far this season, I would not be surprised if it were skewed to the left.

Slowing down Cazorla will not be easy. During his time at Villarreal, teams like Barça would push their fullbacks up and force Cazorla to defend the fullback, thus pushing him deep and further away from the high-value areas.

Arsenal – Defence

Goals conceded: 49 – 8th lowest
Touches in final 3rd allowed: Lowest in EPL
Shots conceded: 3rd lowest overall; 3rd lowest from inside the box; 2nd lowest from outside the box
Tackles: 1st in last man tackles
Clearances: 2nd lowest in all clearances & headed clearances
Blocks: Lowest

Arsenal – Defensive summary

After their early-season funk and the 8-2 loss at Old Trafford, Arsenal defended really well last season. They allowed the lowest # of touches in the final 3rd and the 3rd lowest # of shots in the league.

Arsenal were also 1st in last man tackles with 25 (12 more than the 2nd best). This implies that they most likely defended with a high backline and tried to recover possession as quickly as possible. Since they keep the ball a lot, the opposition gets fewer touches in Arsenal’s defensive third. The last man tackles came from the center-backs cutting out through balls (Koscielny – 9, Vermaelen – 5 & Mertesacker – 3). With such a defensive scheme, it is not surprising that Arsenal forced the highest # of offsides and let in the 4th highest # of through balls. The Arsenal defence also had the lowest # of blocks and the 2nd lowest # of clearances. They defended far away from their area, so there was less need for clearances.

This season so far has been a slightly different story. Arsenal are defending deeper (opinion based on watching games) and more compactly (2 lines of 4 very close to each other). There is more emphasis on defending set pieces and corners. This could all be due to Steve Bould, but it could also be due to the absence of Bacary Sagna, or probably a bit of both. They have conceded just once so far (on what seemed like a gaffe by Szczesny).

By defending deeper, Arsenal might concede a lot more corners, crosses and throw-ins close to the area, but they also give up fewer breakaway attacks and through ball opportunities.

Arsenal – Goalkeeping – Wojciech Szczesny

Goals conceded overall: 49 – tied for 11th lowest
Saves: Lowest in EPL
Clean sheets: 13 – tied for 5th most
GK distribution efficiency (successful GK distribution / total GK distribution): 2nd best
Long passes completion: 39%
Short passes completion rate: 95.5% – 3rd best
Proportion of long to short passes: 51-49

Arsenal – Goalkeeping Summary

Szczesny is one of the best young goalkeepers in the league, though prone to the occasional error (like last week vs. Southampton). He is one of the best short passers and ranks 2nd in distribution efficiency. He also has one of the most balanced long-to-short pass ratios at 51:49. This stat further underlines Arsenal’s philosophy of short passes.

He did concede a lot of goals (49) but a lot of it is down to Arsenal’s defensive scheme. They used a high backline, which means when the opposition forwards beat the high line, they were more likely to have a favourable match-up in terms of numbers and a clear sight of the goal. Szczesny’s league lowest # of total saves could very likely be a side effect of the overall defensive scheme.

City vs. Arsenal head-to-head 2011-12

  • City won at home 1 – 0 and Arsenal won at home 1 – 0
  • City missed Yaya Toure in the game at the Emirates and failed to register a shot on target for the only time all season.
  • City also had a season-low 53 successful passes in the final third in the game at the Emirates (season average: 135)
  • Even in the game at the Etihad, City only managed 105 successful passes.
  • Importance of 1st goal – Both teams have impressive records when scoring first, especially City
    • City’s record when scoring first is 25 Wins 2 Draws and 1 Loss
    • Arsenal’s record when scoring first is 16 Wins 3 Draws and 4 Losses
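To compare these win-draw-loss records on a single scale, one option is points per game. The sketch below assumes the standard 3 points for a win and 1 for a draw; `points_per_game` is a hypothetical helper name, and the records are the ones quoted in this post.

```python
def points_per_game(wins: int, draws: int, losses: int) -> float:
    """League points per game, with 3 points for a win and 1 for a draw."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("record must contain at least one game")
    return (3 * wins + draws) / games


# Records quoted in this post (2011-12 season).
city_scoring_first = points_per_game(25, 2, 1)
arsenal_scoring_first = points_per_game(16, 3, 4)
arsenal_not_scoring_first = points_per_game(5, 4, 6)

print(f"City when scoring first:        {city_scoring_first:.2f}")
print(f"Arsenal when scoring first:     {arsenal_scoring_first:.2f}")
print(f"Arsenal when not scoring first: {arsenal_not_scoring_first:.2f}")
```

On this scale City take 2.75 points per game when scoring first, while Arsenal drop from about 2.22 when scoring first to about 1.27 when they do not.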

Final word

Last season Arsenal gave Manchester City two of its toughest games of the season. They did not allow City to enjoy the possession dominance in the final 3rd that they are used to against the rest of the teams in the EPL. The games were very close. Small details and moments of individual brilliance (or an error) determined the results.

To win, City needs to:

  • Limit the influence of Cazorla and Podolski.
  • Take advantage of one of Arsenal’s few weaknesses, the fullbacks – especially on the right side.
  • Minimize Arsenal’s touches in the final 3rd – Arsenal will enjoy a lot of possession due to the nature of their game. However, limiting their possession in the high-value areas will be key to City’s success.
  • Score first – City has an impeccable record of 25W 2D 1L when scoring first.
  • Get great games from David Silva, Yaya Toure, Balotelli and Tevez. The injury to Samir Nasri at the Bernabeu could be a big blow if it forces him to miss Sunday’s clash.

MCFC Analytics – Summary of blog posts # 3


Thanks for the amazing response to Summary of blog posts #1 & Summary of blog posts #2

I also want to thank people who have reached out to me via twitter with links to their blogs & posts.

Goalscorer ‘footedness’ by @DavidAHopkins measures the footedness or the foot favoured by Premier League goalscorers.

How do the more successful clubs keep the ball in EPL by @JDewitt talks about how the top teams in EPL keep possession. Also by John is Successful Passing and Winning

A sneak peek of a very interesting carto by @Kennethfield  Charlie Adam’s “passing wheel”

Football Philosophy – Long passes by @Poolq1984 explores the importance of long ball in football.

@We_R_PL has a nice post on how to use the MCFC dataset more efficiently. He also has a spreadsheet with the own goals calculated per team.

@footballfactman has a post on Darron Gibson using a mix of data from MCFC dataset, whoscored and statszone

The always excellent @MarkTaylor0 has a detailed post analysing the quality of shots in the Bolton – Manchester City game using the advanced dataset.

@ChrisJLilley has 3 posts on his blog using MCFC data

GK positional analysis

Premier league game changers Part I & Part II

@DanJHarrington has cranked up a lot of things using the advanced dataset

1. An interactive Tableau viz showing each player’s touches on the pitch in Bolton – City.

2. Passing visualization using D3.js

3. Dan also has some interesting visualization work in progress. There is a cool video in the link showing ball movement.

Network passing diagrams by @DevinPleuler

Bolton – http://t.co/mcRQ0oHU

Man City – http://t.co/6mtGgJQS

Extracting data from XML

There have been some questions regarding this and some folks have come up with solutions

1. If you have MS Excel 2007 or a later version, you can open the XML file directly in Excel. The only issue is that the XML is nested and Excel converts it into a very flat format, so you will see multiple rows for the same event. For example, a successful pass has multiple rows indicating the direction and the x,y coordinates of where it was passed to. Read the data spec thoroughly to understand how the data is formatted in the XML. It will help you understand the data much better.

2. Code for R users to extract the F-24 XML by @MarchiMax

3. Code snippets from @JBrisson to extract events from the F-24 XML

4. If you are into programming, most languages have XML parsers. A simple search will get you code snippets to start with.
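
As a starting point for option 4, here is a minimal Python sketch using the standard library’s `xml.etree.ElementTree`. The element and attribute names in the sample (`Event`, `Q`, `name`, `value`, and so on) are hypothetical placeholders chosen for illustration, not the actual F-24 schema, so check the data spec for the real names. The idea is to keep one row per event and attach the nested child elements as a dictionary, which avoids the multiple-rows-per-event flattening described in option 1.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the tag and attribute names are illustrative only.
SAMPLE_XML = """
<Games>
  <Game id="1">
    <Event id="10" type="pass" outcome="1" x="50.0" y="40.0">
      <Q name="end_x" value="70.5"/>
      <Q name="end_y" value="35.2"/>
    </Event>
    <Event id="11" type="shot" outcome="0" x="88.0" y="52.0"/>
  </Game>
</Games>
"""


def events_to_rows(xml_text: str) -> list[dict]:
    """Return one dict per event, with nested children kept under 'qualifiers'."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for event in root.iter("Event"):
        row = dict(event.attrib)  # the event's own attributes
        row["qualifiers"] = {
            q.get("name"): q.get("value") for q in event.findall("Q")
        }
        rows.append(row)
    return rows


rows = events_to_rows(SAMPLE_XML)
for row in rows:
    print(row["id"], row["type"], row["qualifiers"])
```

Each event stays a single record here, so a pass and its end coordinates travel together instead of being spread over several spreadsheet rows.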
If I missed any links, please let me know via Twitter or comment on the blog post. Always use #MCFCAnalytics tag in twitter so I can pick them up easily!

MCFC Analytics – summary of blog posts # 2


I got a good response to the first summary post I did last week. Here is a summary of articles written using MCFC Analytics data in the past week.

@MarkTaylor0 did a great post called “How teams win”. Mark calculated a list of various correlations that lead to wins.

Mark also did another interesting post  on  Newcastle’s 2011/12 season and the role of luck in their success.

@JimmyCoverdale did a post on why “Will he score goals in the Premier League?” is the wrong question to ask of newcomers to the league.

Jimmy also has a great post discussing the “Effectiveness of Crossing and the correlation with chances created”

@Zahlenwerkstatt did a post ranking goalkeepers in the 2011-12 season based on minutes played, save % and goals conceded.
I have made a couple of follow-ups based on the feedback on the Final 3rd Analysis: Follow up #1 & Follow up #2

I have a couple of new posts lined up for later this week.

@OptaPro & Gavin Fleig‘s update on the Advanced data & T & Cs

Simon and Gavin released the updated T & Cs this past week allaying apprehensions of some of the bloggers regarding some of the language in the original T & Cs.

They have also announced that the first installment of the advanced data set will be released this week! I am excited.

If you find an article that is using MCFC Analytics data and is not posted here, please let me know. I will add it in the next week’s summary.