Notes from DataMeet-Up in Delhi, 22 November 2013

We held a DataMeet-Up on Friday, November 22, 2013, at the Akvo office in Yusuf Sarai Community Centre, Delhi.

Here are the notes from the meet-up [additional information in square brackets]:

Election Data Hackathon

  • We will undertake a collaborative mapping of datasets relevant to the election data hackathon, using GitHub and Google Drive. More details about this below.
  • Datasets that we are trying to locate include: election results data (total vote count, vote count per party/candidate, etc.), total utilisation and composition of utilisation of MP Local Area Development funds, parliamentary activities of MPs (presence/absence, questions asked, bills discussed, committees joined, etc.), crime data corresponding to constituencies, etc.
  • We will identify organisations that might hold additional relevant data, such as PRS Legislative Research, Association for Democratic Reforms (and MyNeta.info), Gramener, and Hindustan Times [Anika used to work at HT].
  • Two caveats: (1) we may not get unique and standard identifiers across datasets, and (2) calculations may get difficult in the case of by-elections [the Lok Sabha Secretariat will have details of all by-elections, which can be accessed through an RTI request].
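To illustrate the first caveat, here is a minimal sketch of one possible workaround when two datasets share no common identifier: building a rough join key from the constituency name. This was not discussed at the meet-up. The function names, the `constituency` field, and the sample rows are all hypothetical, and dropping the "(SC)"/"(ST)" reservation suffix is an assumption that may not suit every dataset.

```python
import re
import unicodedata


def normalise_name(name: str) -> str:
    """Reduce a constituency name to a lowercase ASCII key (hypothetical helper)."""
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Assumption: reservation suffixes such as "(SC)" or "(ST)" are not part of the name.
    without_suffix = re.sub(r"\((sc|st)\)", "", ascii_name, flags=re.IGNORECASE)
    # Remove everything that is not a letter or digit (spaces, hyphens, dots).
    return re.sub(r"[^a-z0-9]+", "", without_suffix.lower())


def join_on_name(results, funds):
    """Pair rows from two datasets whose constituency names normalise to the same key."""
    funds_by_key = {normalise_name(row["constituency"]): row for row in funds}
    matched, unmatched = [], []
    for row in results:
        key = normalise_name(row["constituency"])
        if key in funds_by_key:
            matched.append((row, funds_by_key[key]))
        else:
            # Keep unmatched rows so they can be reviewed by hand.
            unmatched.append(row)
    return matched, unmatched


# Made-up rows, for illustration only.
results = [{"constituency": "New Delhi"}, {"constituency": "Karol Bagh (SC)"}]
funds = [{"constituency": "NEW-DELHI"}]
matched, unmatched = join_on_name(results, funds)
```

Name-based keys are fragile (spelling variants, renamed or redrawn constituencies), so the unmatched rows are returned rather than silently dropped.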

Hack for Change on Women’s Rights

  • Shobha, from Breakthrough.tv, led the discussion on the planned Hack for Change event being organised by Breakthrough and Hacks/Hackers, as part of the 16 days of activism against violence against women.
  • The hackathon is organised around urban safety data from Whypoll, multimedia evidence of early marriage practices in Bihar and Jharkhand gathered by Gramvaani, etc. It will also include a Wikipedia Edit-athon facilitated by Noopur Raval.
  • There were multi-directional discussions around other datasets of relevance for the hack event, which I have not kept track of very well. Overall, there were discussions around datasets available from , those published by National Crime Records Bureau, FIR and call database of Delhi police (and how to access that), and data on violence against women gathered by Tata Institute of Social Sciences from police stations across seven states.

Presentation on IPython

  • Konark Modi presented a detailed introduction to using IPython to undertake data cleaning in a very organised manner, as well as the collaboration features and workflow of IPython.
  • There emerged a demand for a tutorial on OpenRefine (previously Google Refine), which will be organised in a later meeting.

Mapping Indian Election Data

  • We will start documenting publicly available datasets relevant for studying past General (Lok Sabha) elections in India and the current activities of the elected members. One can contribute to this mapping exercise in two ways, as mentioned below.
  • GitHub: We have created a repository for this data mapping exercise under the DataMeet organisation on GitHub. The organisation page can be accessed here, and the (india-election-data) repository can be accessed here. In the repository, I have created a draft format for documenting the identified datasets. This draft format can be accessed here. Please feel free to suggest changes to the draft format by opening an issue.
  • To document a dataset, use the format given in the repository, fill in the details, and rename the file according to the dataset’s name, such as “election-results-delhi-1995.md”. Then, if you notice that the dataset needs cleaning or reorganisation, or that something about it is unclear, open an issue (mentioning the name of the dataset) to note that task.
  • Google Drive spreadsheet: Alternatively, you can access this spreadsheet on Google Drive and add the relevant information about the dataset documented by you.
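The file-naming step above can be sketched in a few lines of Python. This is only an illustration: the kind-region-year pattern is inferred from the single example “election-results-delhi-1995.md”, and the helper name `dataset_filename` is hypothetical, not something defined in the repository.

```python
import re


def dataset_filename(kind: str, region: str, year: int) -> str:
    """Build a lowercase, hyphen-separated .md filename (hypothetical helper).

    Assumption: filenames follow a kind-region-year pattern, inferred from
    the single example given in the notes.
    """
    parts = [kind, region, str(year)]
    # Replace each run of non-alphanumeric characters with one hyphen.
    slugs = [re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-") for part in parts]
    return "-".join(slugs) + ".md"


name = dataset_filename("Election Results", "Delhi", 1995)
print(name)  # election-results-delhi-1995.md
```

Keeping names lowercase and hyphen-separated makes the files easy to sort and to refer to by name in issues.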

Please comment here or post to the DataMeet mailing list for any clarifications and suggestions.