Category Archives: City

DataMeet-Up in Delhi, Friday 31st May

Note: This post contains the revised date and location of the meeting, which was earlier set for 24th May. Sorry for the inconvenience.

We are organising the next Delhi DataMeet-Up at 5 pm, Friday 31st May, 2013.

This time we are meeting at the Akvo Foundation office. The address and map are given below.

We have decided to focus on the following topics in this meeting:

  • Developing and organising the DataMeet community in Delhi, possible collaboration with other technical communities, research organisations etc.
  • Possibility of a job/project portal, to share, develop and interact about work opportunities.
  • Start creating a list of datasets that we can request as a group to be made available on data.gov.in.

RSVP: If you are joining us for the DataMeet-Up, please let us know at mail[at]ajantriks.net.

Location: Akvo Foundation, 3rd Floor, Ramnath House, Plot #18, Yusuf Sarai Community Centre, Yusuf Sarai. Nearest metro is Green Park.

[mapsmarker marker=”3″]

Notes from DataMeet-Up in Delhi, 12 April 2013

Last Friday (12th April), a DataMeet-Up was organised at Sarai-CSDS, New Delhi.

There has been talk of a meeting like this in Delhi for a long time now. A few hackathons and data-related events have been organised in the past (see this and this). With the recently concluded Open Data Camp in Bangalore and the substantial buzz (at least in Delhi) created by the 12th Plan Hackathon earlier this month, the timing of this DataMeet-Up was apt for taking a step back, focusing on big picture issues, and continuing to build the open data community in Delhi.

The meet-up began with a round of introductions, and a brief discussion about the DataMeet group and its (Bangalore-centric) history.

This was followed by a discussion of the 12th Plan Hackathon experience. The presence of members of prize-winning teams (from the IIT Delhi venue) and representatives of the NDSAP-PMU (NDSAP Project Management Unit) team energised the conversation. The hackathon was found to be a positive experience overall. It especially succeeded in creating a set of initiatives to address the difficult task of making planning documents, statistical evidence and proposals for allocation more accessible to (at least some of) the people of the country. The demanding nature of the datasets and documents made available for the hackathon (in terms of the required background understanding of the themes) perhaps led a number of submissions to engage with issues not directly associated with the 12th Plan.

From the experiences of the hackathon, we quickly moved to talk about ‘what’s next?’. Responding to the demand for a public API to access the datasets hosted at the data portal, Subhransu and Varun from NIC talked about the ongoing efforts to develop the second version of the OGPL software that the data portal uses. Unlike the existing version, the second version will not host uploaded datasets as individual files but as structured/linked data (following RDF specifications). While this was of great interest to some of the participants in the meet-up, it was not immediately clear to everybody why this shift (from separate data files to structured/linked data) is such a big deal. So we spent some time discussing the semantic web, linked data and APIs.
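To make the difference between separate data files and structured/linked data a little more concrete, here is a minimal sketch in Python. It is not how OGPL stores data; it only illustrates the idea behind RDF-style statements using plain tuples. The `http://example.org/` namespace, the column names, the state names and the figures are all made up for illustration.

```python
import csv
import io

# Made-up figures for illustration only; these are not real statistics.
CSV_TEXT = """state,year,indicator,value
StateA,2011,literacy_rate,80.0
StateB,2011,literacy_rate,70.0
"""

BASE = "http://example.org/"  # placeholder namespace, not a real portal URL


def rows_to_triples(csv_text):
    """Turn each CSV row into (subject, predicate, object) statements."""
    triples = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        obs = f"{BASE}observation/{i}"
        triples.append((obs, BASE + "state", BASE + "state/" + row["state"]))
        triples.append((obs, BASE + "year", row["year"]))
        triples.append((obs, BASE + "indicator", row["indicator"]))
        triples.append((obs, BASE + "value", float(row["value"])))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def values_for_state(triples, state):
    """Find the observations about a state, then read their values."""
    state_uri = BASE + "state/" + state
    observations = [t[0] for t in match(triples, p=BASE + "state", o=state_uri)]
    return [t[2] for obs in observations
            for t in match(triples, s=obs, p=BASE + "value")]


triples = rows_to_triples(CSV_TEXT)
print(values_for_state(triples, "StateA"))
```

Because every state is identified by a URI rather than a column value in one particular file, statements from different datasets about the same state can be merged into one list and queried together, which is roughly what makes an API over linked data more useful than a folder of downloads.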

Another thread of the ‘what’s next?’ discussion explored official and unofficial processes for requesting government agencies to publish specific datasets on the data portal, and how (and whether) non-government agencies can share cleaned-up government datasets through the data portal. We talked about approaching Data Controllers for the agencies concerned, endorsing each other’s data requests to drive community-based demands, and also the possibility of an alternative portal (perhaps using OGPL itself) to share governmental datasets cleaned up and reformatted by individuals and organisations. The group also noted and celebrated the initiative from NIC to use GitHub for storing and sharing code and data used in visualisations, data cleaning processes and data-based applications.

The challenge of unclear licensing of data, both in the case of bought data products (such as Census of India and National Sample Survey) and publicly available datasets, was flagged but was not discussed fully.

Next, we had a round of inputs from the participants on what kind of data they have been collecting and working with, and what software they use.

Akvo is working closely with organisation(s) that collect a substantial amount of district-level data in certain focus regions, including water infrastructure, quality and usage data. For field-level data collection, they use/promote a self-developed Android-based software called FLOW. They mentioned that the absence of good quality basemaps (either vector or satellite imagery-based) for the areas they are working in makes environmental data collection rather difficult. The changed Google Maps API terms of use are forcing them to consider other options and move off Google (Satellite) Maps. It was suggested that they should try using imagery from Bhuvan. They also expressed interest in using data from and contributing to the OpenStreetMap project.

Accountability Initiative, located in the Centre for Policy Research in New Delhi, collects, digitises, cleans up, uses, analyses and archives a substantial volume of national and state budget data and utilisation reports. They, however, cannot share the (digitised and cleaned up) data publicly due to ambiguous and missing licence agreements. They are producing a large volume of data analysis but relatively few visualisations. They mostly use Microsoft Excel and Stata for their data operations. Picking up the thread from Vibhu (of Accountability Initiative), Subhransu (of NIC) talked about the data cleaning challenges NIC is facing while working with various government agencies to open up their respective datasets.

Ravi and Pratap talked about the data usage situation in the journalism world. They mentioned that most journalists prefer accessing government data in hard (printed) copies, as that is seen as a permanent, easily archivable, and easily accessible (without needing programming and data-wrangling skills) format. RTI remains one of the backbones of investigative journalism, and almost the entire volume of government data obtained through RTI gets stored in printed format (all over the journalists’ offices). The barrier of programming skills is the most important factor keeping Indian journalists away from more explorative and in-depth usage of government data.

In his quick update on the students’ scene in Delhi, Parin told us that there is little excitement around data analysis, management and visualisation. The group found this troubling. In a later discussion, maybe we can talk about it more and develop plans for engaging students to work with government (and non-government) data.

We briefly discussed GapMinder and Ushahidi, and data visualisation work by the teams at the New York Times and the Guardian. These examples are well-known but how to recreate them (in a different context) is often not very clear.

At the end we went back to a question regarding the quality of data published on the data.gov.in portal, which was raised earlier in the community interaction session of the NDSAP workshop held on 4th April 2013. Subhransu and Varun informed us that the data shared on the portal goes through a three-stage quality-checking procedure: (1) first, the data to be shared is put together and rechecked by the Data Creators (who are headed by a Data Controller) in the government agency concerned, (2) the Data Controller of the agency undertakes the second stage of quality checking, and (3) finally the data is shared with the NDSAP-PMU team at NIC, which rechecks the data before approving its upload to the portal. If required, the NIC team asks the agency to share the raw data for comparison with the (formatted) shared data. Vibhu raised a crucial question about how dependable and representative such ‘raw data’ collected by the central government agencies are.

As the questions were getting tougher and the evening older, we concluded the meeting. The next meeting will be sometime in mid-May. The exact date and venue are yet to be decided.

Participants:

Guneet Narula, Sputznik

Amitangshu Acharya, Akvo

Isha Parihar, Akvo

Ravi Bajpai, Indian Express

Vibhu Tewary, Accountability Initiative

Pratap Vikram Singh, Governance Now

Shashank Srinivasan, Independent

Subhransu, NIC

Varun, NIC

Parin Sharma, Independent

Sumandro Chattapadhyay, Sarai-CSDS

DataMeet-Up in Delhi, Friday 12th April

With the recently held Open Data Camp 2013 in Bangalore and the upcoming #12thPlan hackathon being organised by the Planning Commission et al., the time seems right to organise a gathering of data enthusiasts in Delhi.

The DataMeet gatherings started in Bangalore in mid-2011 initially as periodic meet-ups, which slowly became a wider loose network of data enthusiasts in Bangalore and elsewhere, eventually leading up to Open Data Camps and data visualisation workshops.

Please let me invite you to a DataMeet gathering to be held in Sarai, CSDS, Civil Lines, Delhi on Friday, 12th April at 5 pm. The purpose of this gathering is to meet fellow (government and non-government) data enthusiasts in Delhi, share challenges and opportunities, and begin discussions around future projects and initiatives. We hope to have a good mix of researchers, programmers, activists and advocates.

RSVP: If you are joining us for the DataMeet-Up, please let us know at mail[at]ajantriks.net.

Location: The map below shows the location of Sarai. The closest metro station is Civil Lines (on the Yellow Line), located at the other end of Underhill Road. Let us know if you need any further details regarding reaching CSDS or other things.

[mapsmarker marker=”1″]

Photo Gallery Open Data Camp 2013 – Day 1

Pictures from Day 1 of OpenDataCamp 2013. There are about 400 of them, taken by Meera Sankar. I have purposely not filtered any pictures. It’s the complete set. You can use them as you like as long as the usage follows the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. Please credit Meera Sankar.

Reflections of Chennai’s Data Workshop from India Water Portal

Cross Posted from India Water Portal

Written by Aarti Kelkar-Khambete

This workshop, organised by Transparent Chennai at The Institute of Financial Management and Research, Chennai, was the outcome of the experiences of the earlier open data camp events organised by Transparent Chennai in Bangalore and Hyderabad, where there was a wide discussion among attendees who were excited by the potential of data and the open data movement, but who did not have the necessary skills or technical background to work effectively with it.
It was felt that there was a much larger community of activists, researchers, and non-profits who could benefit from learning to use the kinds of tools presented at the camps. Thus, this event was planned differently from a data camp and focused on training activists, researchers and students to work with data. Participants would learn about open data, data visualisation, spatial data and practical issues that come up when working with data in various forms.

The workshop thus aimed at helping the participants to:

  • Understand various formats of data, diverse possibilities of data visualisation and effective tools for doing so, with a special focus on web-based tools
  • Understand how to think through projects involving collection, processing and visualisation of data
  • Develop a basic understanding of software packages and methods for visualising quantitative data, creating geo-visualisation and undertaking participatory mapping
  • Understand the connection between data technologies and rights to access and use data.

Read the rest of the summary here.

Data Science meets Data Technology

The Big Picture
August 18th, 2012
10:00 AM – 12:30 PM
Conference Room 2
NIAS, Bangalore

While data is increasingly important in academia as well as in industry, the two worlds do not intersect each other all that often. DSDT is a monthly forum for sharing ideas about data across disciplines and industries. Each DSDT meeting will consist of two talks on a common theme, pairing a data scientist with a data technologist along with time for discussion. From the second session onward, we will have a tutorial and hacking session after the talks where we will learn how to work on understanding and analysing data sets relevant to that meeting’s theme. The schedule for the first meeting on the 18th at NIAS is given below.

10:00 AM – 10:30 AM: Tea
10:30 AM – 11:00 AM: Analysis: The Big Picture. Rajesh Kasturirangan, NIAS.
11:00 AM – 11:15 AM: Discussion of talk 1.
11:15 AM – 11:45 AM: Data and Visualization: The Big Picture. S. Anand, Gramener.
11:45 AM – Noon: Discussion of talk 2.
Noon – 12:30 PM: General Discussion.
For more information, visit http://analysis.knofu.org/

PIN code mapping

Where: Skype ID: datameet

Agenda:

Summary:

  • We’ll go for bulk geo-coding as opposed to crowd-sourcing
  • We’ll bulk source addresses. Please add any other sources you can think of:
      • The Postal College’s list of post offices
      • Branch lists from banks such as SBI, or organisations like BSNL
      • Telephone directories
  • We’ll run them through Yahoo’s Placefinder, which is liberal in API limits and in licensing
  • We’ll create Voronoi treemaps out of those (ideally as OpenStreetMap XML files)
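
The last two steps of the plan above can be sketched roughly as follows. This is only an illustration in Python under several assumptions: the geocoder call itself is not shown (the list `GEOCODED` stands in for whatever the geocoding service returns), the PIN codes and coordinates are made up, and the sketch covers only plain Voronoi-cell membership (a point belongs to the cell of its nearest site), not the construction of cell polygons or their export as OpenStreetMap XML.

```python
import math
from collections import defaultdict

# Hypothetical geocoder output: (pin_code, latitude, longitude).
# PIN codes and coordinates are made up for illustration.
GEOCODED = [
    ("000001", 12.90, 77.50),
    ("000001", 12.92, 77.52),
    ("000002", 13.10, 77.60),
]


def pin_centroids(records):
    """Average the geocoded addresses of each PIN code into one site."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for pin, lat, lon in records:
        acc = sums[pin]
        acc[0] += lat
        acc[1] += lon
        acc[2] += 1
    return {pin: (a[0] / a[2], a[1] / a[2]) for pin, a in sums.items()}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_pin(lat, lon, centroids):
    """Return the PIN code whose site is closest to the given point.

    This is the Voronoi-cell membership test: the cell of a site is
    the set of points that are nearer to it than to any other site.
    """
    return min(centroids,
               key=lambda pin: haversine_km(lat, lon, *centroids[pin]))


centroids = pin_centroids(GEOCODED)
print(nearest_pin(12.95, 77.50, centroids))
```

Averaging addresses into one site per PIN code is a simplification chosen for brevity; with enough geocoded addresses, each address could be kept as its own site and cells with the same PIN code merged afterwards.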

Links mentioned during the meet:

Text & Geo processing

Where: Skype ID: datameet

Agenda:

  • Introductions [everyone, 10 seconds each]
  • Discussion on the most interesting visualisation you’ve seen recently
  • Discussion on any sources of data you’ve come across
  • A recording of the talk is available

Links mentioned during the meet:

Quote of the call: “this call started with no agenda, but ended with quite hands full. happy. nothing more to add” — Balaganesh