What do you do after 18,000+ games of cricket?

Hey there everyone!

The Twenty20 Cricket World Cup is under way, and the final 10 teams have now been named and will be competing over the next 2-3 weeks for the world title.

What better excuse to dust off my web crawler and do another iteration of Cricket Team and Player stats? So yesterday I logged back into import.io and fired up my previously created automatic data extraction web crawlers, which source information from http://www.espncricinfo.com/.

Now, my previous cricket crawling was focused on Test matches, so I knew I needed to make some adjustments to extract data specific to Twenty20 matches. [For the uninitiated, Twenty20 cricket is a compressed format of 20 overs per side that takes only half a day to play, versus the traditional 5-day game.] To my pleasant surprise, I found that import.io now has a new crawling method (still in beta, but working OK for me) that specialises in pulling multiple tables of data. WOW, now I didn’t have to limit myself to one type of game stats: I could crawl them ALL!

[Image: import.io beta]

Each of the 10 teams has 15 or 16 players in its squad, and it turns out that the 155 players in the tournament have over 18,700 games of cricket recorded between them. That’s around 120 games each, and a really rich source of data.
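For anyone who wants to check the arithmetic, here is a quick sketch in Python using only the figures quoted above. The variable names are mine, and since the games total is "over 18,700", the result should be read as a rough lower bound rather than an exact figure.

```python
# Figures quoted in the post; "over 18,700" is treated as a lower bound.
teams = 10
total_players = 155
total_games = 18_700

# If every squad has 15 or 16 players, this many squads must have 16.
squads_of_16 = total_players - teams * 15
squads_of_15 = teams - squads_of_16

# Average number of recorded games per player.
games_per_player = total_games / total_players

print(f"{squads_of_15} squads of 15, {squads_of_16} squads of 16")
print(f"About {games_per_player:.1f} games per player")
```

So the numbers hang together: a mix of 15 and 16 player squads does add up to 155, and the average lands just over 120 games each.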

So with the introduction of a ‘Game Type’ dimension, and some tweaking of a few calculations, I have now transformed my Ashes Players Tableau viz into a Twenty20 International World Cup viz.
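The real work here was done inside Tableau, but the idea behind a ‘Game Type’ dimension is easy to sketch in Python: every row carries a game type, and each calculation is grouped by it instead of being computed across all games. The rows below are made-up placeholders (not real stats), and the function and field names are hypothetical. The batting average is the usual runs scored divided by times dismissed.

```python
from collections import defaultdict

# Illustrative placeholder rows, NOT real statistics:
# (player, game_type, runs, was_dismissed)
innings = [
    ("Player A", "Test", 80, True),
    ("Player A", "Test", 20, True),
    ("Player A", "T20I", 30, False),
    ("Player A", "T20I", 50, True),
    ("Player B", "T20I", 10, True),
]

def batting_averages(rows):
    """Return {(player, game_type): runs per dismissal}, or None if never dismissed."""
    runs = defaultdict(int)
    outs = defaultdict(int)
    for player, game_type, scored, dismissed in rows:
        key = (player, game_type)
        runs[key] += scored
        outs[key] += int(dismissed)
    return {key: (runs[key] / outs[key] if outs[key] else None) for key in runs}

averages = batting_averages(innings)
for (player, game_type), avg in sorted(averages.items()):
    print(player, game_type, avg)
```

Keeping the game type as part of the grouping key means the same calculation works for Tests, Twenty20 Internationals, or any other format without being rewritten.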

Check it out on Tableau Public.

[Image: Twenty20 International viz, screenshot 1]

[Image: Twenty20 International viz, screenshot 2]

Enjoy responsibly...


