Notes from DataMeet-Up in Delhi, 31 May 2013

Authors: Satyakam Goswami & Nasr ul Hadi

Venue

Akvo Foundation, Yusuf Sarai, New Delhi

Discussion

This was our second meet-up in Delhi. We began with a round of introductions and a brief discussion of DataMeet’s (Bangalore-centric) history.

Data Journalism

Nasr shared his experiences trying to apply data science to journalism and how, in most cases, the data was either incomplete, not specific enough or simply inaccessible. One of his projects involved patrolling with the Delhi Police and tracking the number of accident victims they transported to the Jai Prakash Narayan Trauma Center. Out of curiosity, Satyakam asked him more about how the trauma centre worked, because they have been using a FOSS SMS service to reduce clogging of their registration/reception counters.

Nasr also discussed the work of organisations like Human Rights Law Network and the immense amount of data they have on instances of abuse, child incarceration, etc. in Indian prisons. Unfortunately, these records aren’t digitised. Guneet gave a similar description of police reform work done by CHRI.

Collaborations with Other Groups

We also discussed opportunities to collaborate with other meet-ups that share our interest in data science. Hacks/Hackers’ Delhi chapter conducted a news apps hackathon in February. Their next event will focus on data journalism, probably along the lines of the recent Editors’ Lab. We still need to discuss how to:

  1. educate journalists on the importance of data-driven journalism, and
  2. equip them with tools and connect them with potential collaborators in this group.

We are also open to other such collaborations, but couldn’t think of any other community with a similar interest. If you know of any, please do let us know.

Elections 2014 and Public Infrastructure

Everyone was interested in building apps around the upcoming Delhi elections and then the General Elections in 2014. If you are interested in taking the lead on this, please do so as soon as possible.

Akvo’s Isha told us how they are working closely with organisations that collect substantial amounts of district-level data about water/sanitation infrastructure. To collect data in the field, they use and promote FLOW, an Android-based tool they developed themselves.

Vivek also brought up data.gov.in and how he found some of its data to be incomplete. Satyakam suggested we work closely with the data.gov.in team in order to get the data we want. Developers in attendance suggested we ask the portal for a web services API. Surendran described how they do this at Change.org. Google’s Raman Jit joined us close to the wrap-up and offered to help with sourcing any data we need from various government stakeholders.

We still have to decide on a date and venue for the June meet-up.

Participants

Guneet Narula, Sputznik

Isha Parihar, Akvo

Ashim Kapoor

Nasr ul Hadi

Vivek Khurana, Mintango Technologies

Surendran Balachandran, Change.org

Gaurav, Film Maker

Gora Mohanty

Irshad Reyaz, Landshark Labs

Rahul

Raman Jit Singh Chima, Google India

Satyakam Goswami, Consultant

DataMeet-Up in Delhi, Friday 31st May

Note: This post contains the revised date and location of the meeting. Earlier, it was set to be held on 24th May. Sorry for the inconvenience.

We are organising the next Delhi DataMeet-Up at 5 pm, Friday 31st May, 2013.

This time we are meeting at the Akvo Foundation office. The address and map are given below.

We have decided to focus on the following topics in this meeting:

  • Developing and organising the DataMeet community in Delhi, possible collaboration with other technical communities, research organisations etc.
  • Possibility of a job/project portal to share, develop and discuss work opportunities.
  • Creating a list of datasets that we can request as a group to be made available on data.gov.in.

RSVP: If you are joining us for the DataMeet-Up, please let us know at mail[at]ajantriks.net.

Location: Akvo Foundation, 3rd Floor, Ramnath House, Plot #18, Yusuf Sarai Community Centre, Yusuf Sarai. Nearest metro is Green Park.

Akvo Foundation (28.557765, 77.208158): 3rd Floor, Ramnath House, Plot #18, Yusuf Sarai Community Centre, Yusuf Sarai

Notes from DataMeet-Up in Delhi, 12 April 2013

Last Friday (12th April), a DataMeet-Up was organised at Sarai-CSDS, New Delhi.

There has been talk of a meeting like this in Delhi for a long time now. A few hackathons and data-related events have been organised in the past (see this and this). With the recently concluded Open Data Camp in Bangalore and the substantial buzz (at least in Delhi) created by the 12th Plan Hackathon earlier this month, this DataMeet-Up was well timed to take a step back, focus on big-picture issues, and keep building the open data community in Delhi.

The meet-up began with a round of introductions, and a brief discussion about the DataMeet group and its (Bangalore-centric) history.

This was followed by a discussion of the 12th Plan Hackathon experience. The presence of members of prize-winning teams (from the IIT Delhi venue) and representatives of the NDSAP-PMU (NDSAP Project Management Unit) team energised the conversation. The hackathon was found to be a positive experience overall. It especially succeeded in creating a set of initiatives to address the difficult task of making planning documents, statistical evidence and proposals for allocation more accessible to (at least some of) the people of the country. The demanding nature of the datasets and documents made available for the hackathon (in terms of the required background understanding of the themes) perhaps led a number of submissions to engage with issues not directly associated with the 12th Plan.

From the experiences of the hackathon, we quickly moved on to talk about ‘what’s next?’. Responding to the demand for a public API to access the datasets hosted at the data portal, Subhransu and Varun from NIC talked about the ongoing efforts to develop the second version of the OGPL software that the data portal uses. Unlike the existing version, the second version will not host uploaded datasets as individual files but as structured/linked data (following RDF specifications). While this was of great interest to some of the participants in the meet-up, it was not immediately clear to everybody why this shift (from separate data files to structured/linked data) is such a big deal. So we spent some time discussing the semantic web, linked data and APIs.
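For readers who were not at the meet-up, here is a rough Python sketch of why the shift from separate files to linked data can matter. It is only an illustration and not how OGPL works: the `http://example.org/` namespace, the property names, the helper functions and all the figures are made up for this example. The idea is that once each row becomes a set of subject/predicate/object statements that share identifiers, two datasets that mention the same district can be queried together without first matching column names by hand.

```python
# Illustrative only: the namespace, property names and values are hypothetical.
BASE = "http://example.org/"  # placeholder namespace, not a real vocabulary


def slug(text):
    """Turn a label into a URI-friendly token."""
    return text.strip().replace(" ", "_")


def row_to_triples(row, dataset):
    """Convert one tabular row into (subject, predicate, object) statements."""
    district_uri = BASE + "district/" + slug(row["district"])
    obs_uri = f"{BASE}{dataset}/{slug(row['district'])}/{row['year']}"
    triples = [
        (obs_uri, BASE + "prop/district", district_uri),
        (obs_uri, BASE + "prop/year", row["year"]),
    ]
    # Every remaining column becomes its own statement about the observation.
    for column, value in row.items():
        if column not in ("district", "year"):
            triples.append((obs_uri, BASE + "prop/" + column, value))
    return triples


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Two separate "files" with different columns but the same district.
literacy_rows = [{"district": "Example District", "year": "2011", "literacy_rate": "70.0"}]
water_rows = [{"district": "Example District", "year": "2011", "handpumps": "120"}]

graph = []
for row in literacy_rows:
    graph += row_to_triples(row, "literacy")
for row in water_rows:
    graph += row_to_triples(row, "water")

# One query finds every observation about the district, whichever file it came from.
district = BASE + "district/Example_District"
linked = match(graph, predicate=BASE + "prop/district", obj=district)
for subject, _, _ in linked:
    print(subject)
```

A real deployment would use an RDF library and published vocabularies rather than plain tuples. The sketch is only meant to show the shared-identifier idea that the discussion kept returning to.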

Another thread of the ‘what’s next?’ discussion explored official and unofficial processes for requesting government agencies to publish specific datasets on the data portal, and also how (and whether) non-government agencies can share cleaned-up government datasets through the data portal. We talked about approaching Data Controllers for the agencies concerned, endorsing each other’s data requests to drive community-based demands, and also the possibility of an alternative portal (perhaps using OGPL itself) to share governmental datasets cleaned up and reformatted by individuals and organisations. The group also noted and celebrated the initiative from NIC to use GitHub for storing and sharing code and data used in visualisations, data cleaning processes and data-based applications.

The challenge of unclear licensing of data, both in the case of bought data products (such as the Census of India and the National Sample Survey) and publicly available datasets, was flagged but was not discussed fully.

Next, we had a round of inputs from the participants on what kind of data they have been collecting and working with, and what software they use.

Akvo is working closely with organisation(s) that collect substantial amounts of district-level data in certain focus regions, including water infrastructure, quality and usage data. For field-level data collection, they use and promote FLOW, an Android-based tool they developed themselves. They mentioned that the absence of good-quality basemaps (either vector or satellite imagery-based) for the areas they are working in makes environmental data collection rather difficult. The changed Google Maps API terms of use are forcing them to consider other options and move off Google (Satellite) Maps. It was suggested that they should try using imagery from Bhuvan. They also expressed interest in using data from and contributing to the OpenStreetMap project.

Accountability Initiative, located in the Centre for Policy Research in New Delhi, collects, digitises, cleans up, uses, analyses and archives a substantial volume of national and state budget data and utilisation reports. They, however, cannot share the (digitised and cleaned-up) data publicly due to ambiguous and missing licence agreements. They are producing a large volume of data analysis but relatively few visualisations. They mostly use Microsoft Excel and Stata for their data operations. Picking up the thread from Vibhu (of Accountability Initiative), Subhransu (of NIC) talked about the data cleaning challenges NIC is facing while working with various government agencies to open up their respective datasets.

Ravi and Pratap talked about the data usage situation in the journalism world. They mentioned that most journalists prefer accessing government data in hard (printed) copies, as that is seen as a permanent, easily archivable, and easily accessible format (one that needs no programming or data-wrangling skills). RTI remains one of the backbones of investigative journalism, and almost the entire volume of government data obtained through RTI gets stored in printed format (all over the journalists’ offices). The barrier of programming skills is the most important factor keeping Indian journalists away from more explorative and in-depth usage of government data.

In his quick update on the students’ scene in Delhi, Parin told us that there is little excitement around data analysis, management and visualisation. The group found this troubling. In a later discussion, maybe we can talk about it more and develop plans for engaging students to work with government (and non-government) data.

We briefly discussed GapMinder and Ushahidi, and data visualisation work by the teams at the New York Times and the Guardian. These examples are well-known but how to recreate them (in a different context) is often not very clear.

At the end we went back to a question regarding the quality of data published on the data.gov.in portal that was raised earlier in the community interaction session of the NDSAP workshop held on 4th April 2013. We were informed by Subhransu and Varun that the data shared on the portal goes through a three-stage quality-checking procedure: (1) first, the data to be shared is put together and rechecked by the Data Creators (who are headed by a Data Controller) in the government agency concerned, (2) the Data Controller of the agency undertakes the second stage of quality checking, and (3) finally the data is shared with the NDSAP-PMU team at NIC, which rechecks the data before approving its upload to the portal. If required, the NIC team asks the agency to share the raw data for comparison with the (formatted) shared data. Vibhu raised a crucial question about how dependable and representative such ‘raw data’ collected by the central government agencies are.

As the questions were getting tougher and the evening older, we concluded the meeting. The next meeting will be sometime in mid-May. The exact date and venue are yet to be decided.

Participants:

Guneet Narula, Sputznik

Amitangshu Acharya, Akvo

Isha Parihar, Akvo

Ravi Bajpai, Indian Express

Vibhu Tewary, Accountability Initiative

Pratap Vikram Singh, Governance Now

Shashank Srinivasan, Independent

Subhransu, NIC

Varun, NIC

Parin Sharma, Independent

Sumandro Chattapadhyay, Sarai-CSDS

DataMeet-Up in Delhi, Friday 12th April

With the recently held Open Data Camp 2013 in Bangalore and the upcoming #12thPlan hackathon being organised by the Planning Commission et al., the time seems right to organise a gathering of data enthusiasts in Delhi.

The DataMeet gatherings started in Bangalore in mid-2011 initially as periodic meet-ups, which slowly became a wider loose network of data enthusiasts in Bangalore and elsewhere, eventually leading up to Open Data Camps and data visualisation workshops.

I would like to invite you to a DataMeet gathering to be held at Sarai, CSDS, Civil Lines, Delhi on Friday, 12th April at 5 pm. The purpose of this gathering is to meet fellow (government and non-government) data enthusiasts in Delhi, share challenges and opportunities, and begin discussions around future projects and initiatives. We hope to have a good mix of researchers, programmers, activists and advocates.

RSVP: If you are joining us for the DataMeet-Up, please let us know at mail[at]ajantriks.net.

Location: The map below shows the location of Sarai. The closest metro station is Civil Lines (on the Yellow Line), located at the other end of Underhill Road. Let us know if you need any further details regarding reaching CSDS or anything else.

Sarai, Centre for the Study of Developing Societies, #29 Rajpur Road, New Delhi (28.678258, 77.218137)

Photo Gallery Open Data Camp 2013 – Day 1

Pictures from Day 1 of OpenDataCamp 2013. There are about 400 of them, taken by Meera Sankar. I have purposely not filtered any pictures. It’s the complete set. You can use them as you like as long as the usage follows the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Please credit Meera Sankar.

DataMeet Events and Major Links

Major Links this week!

Projects

Mentions of Open Data

Events:

Questions of the week

I was wondering whether there exists a mapping of PIN Codes to MP/MLA/Corporator constituencies/wards, please? Also, does a similar mapping exist to districts/taluks etc.? – via Gautam John

Data.Gov.In Beta is Up

Data.gov.in just launched its beta version. With 13 datasets already online and downloadable, it is a good start. Granted, not everything is working smoothly, but given how new it is, I’m feeling pretty good about the effort.

As with all things everywhere, it is never really about intentions or first steps but about implementation and continued support. This is where the difficulty will lie for the Indian Government’s IT department, the National Informatics Centre. They will be running the site and implementing the portal, and I hope they can also be an intermediary between citizens and the ministries that are providing the data.

Unlike data.gov or data.gov.uk, the data does not have to stay on data.gov.in, according to the governing policy, the National Data Sharing and Accessibility Policy. The individual Ministry has control of the data and can still charge, not make data downloadable, and also restrict data and not tell you why. Also, incredibly valuable datasets like the Census and the National Sample Survey (official link not working) will not be available for download.

Reflections of Chennai’s Data Workshop from India Water Portal

Cross Posted from India Water Portal

Written by Aarti Kelkar-Khambete

This workshop, organised by Transparent Chennai at The Institute of Financial Management and Research, Chennai, was the outcome of the experiences of the earlier open data camp events organised by Transparent Chennai in Bangalore and Hyderabad, where there was a wide discussion among attendees who were excited by the potential of data and the open data movement, but who did not have the necessary skills or technical background to work effectively with it. It was felt that there was a much larger community of activists, researchers, and non-profits who could benefit from learning to use the kinds of tools presented at the camps. Thus, this event was planned differently from a data camp and focused on training activists, researchers and students to work with data. Participants would learn about open data, data visualisation, spatial data and practical issues that come up when working with data in various forms.

The workshop thus aimed at helping the participants to:

  • Understand various formats of data, diverse possibilities of data visualisation and effective tools for doing so, with a special focus on web-based tools
  • Understand how to think through projects involving collection, processing and visualisation of data
  • Develop a basic understanding of software packages and methods for visualising quantitative data, creating geo-visualisation and undertaking participatory mapping
  • Understand the connection between data technologies and rights to access and use data.

Read the rest of the summary here.

DataMeet is a community of Data Science and Open Data enthusiasts.