Crosspost: The Hindu’s Rape Statistics Story

A few weeks ago, The Hindu’s Data Blog ran a three-part series on data about rape cases in Delhi. It was a powerful story that got a lot of people talking, and a good example of what can be done with the data that is available. Rukmini S has written a piece detailing how she combed through the data to get the story.

Below is an excerpt.

How we put together the statistics that went into our investigation

“Delhi is better than most Indian cities for legal data journalism because it puts all district court judgements online – and promptly – and these can be text-searched. Ideally, I should have been able to scrape all judgements for ‘376’, the IPC section related to rape. However, I encountered a ton of issues that would have rendered a scraping tool useless (as far as I know – if you think there was a way I could have done it, do leave me a comment).

For one, while rape cases are sessions-triable, and so should show up as ‘sessions case’ in the nomenclature, for some judges the cases were inexplicably classified as ‘criminal cases’. Then, while a simple text-search for ‘376’ should have been enough to get me all cases, the text-search function inexplicably collapsed around March 2014. With elections coming up, I had limited time to work on this and had to essentially open every single sessions court judgement and search for ‘376’ in each one. Luckily, the search function revived after two months.”
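For readers wondering what the manual search for ‘376’ might look like if automated, here is a minimal sketch in Python. It is not how the story was reported. It assumes the judgements have already been saved locally as plain-text files, which the excerpt does not say is possible in bulk, and the function names and the folder layout are hypothetical. Because it looks at the text of each judgement rather than its label, it would not depend on whether a case was filed as a ‘sessions case’ or a ‘criminal case’.

```python
import re
from pathlib import Path

# Match "376" as a standalone number. This accepts forms such as
# "Section 376", "u/s.376" or "376(2)(g)", but not longer numbers
# such as "1376" or "3760" that merely contain those digits.
SECTION_376 = re.compile(r"(?<!\d)376(?!\d)")


def mentions_section_376(text: str) -> bool:
    """Return True if the text contains a standalone '376'."""
    return SECTION_376.search(text) is not None


def find_matching_judgements(folder: str) -> list[str]:
    """List the .txt files in `folder` whose text mentions '376'.

    `folder` is a hypothetical local directory of judgements that
    have already been saved as plain text.
    """
    matches = []
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if mentions_section_376(text):
            matches.append(path.name)
    return matches
```

A search like this would still produce false positives, for example a case number or FIR number that happens to be 376, so every match would need to be opened and read, much as described in the excerpt.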

Read the rest here.