Category Archives: DataMeet-Up

Mumbai Meet 4: Data Cleaning

Mumbai had its fourth data meet on December 6, 2014, with a total of 11 participants. Due to scheduling issues, the November meet-up was moved from the last Saturday of the month to the first Saturday of December. This time the meet-up was held at Pykih’s office on the 8th floor of the Sardar Patel Institute Of Technology.

The speaker was Bhavin Dalal, Senior Technology Manager at Hansa Cequity.
At Cequity, he plays multiple roles, including solution architect, consultant and project manager. While he has strong product framework knowledge, his expertise lies in data warehousing technologies.

Bhavin spoke on two main topics:

1. Data Cleaning – he explained what data quality is and which factors determine the quality of data. He went through the common data quality problems faced while cleaning data, and showed us an example where his team faced problems while cleaning car data and how they solved them. He also explained data cleaning methods, which helped us understand the approaches to data cleaning, the importance of doing it, and some dos and don’ts while capturing data.


2. Visualising census data for a better understanding of India – here he gave us a list of eye-popping facts drawn from the census data. This topic showed us that there is a plethora of data points that can be meaningfully used to come up with really good insights on the Indian population.
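
As a minimal sketch of the kind of data cleaning described in the first topic, the Python snippet below standardises spellings, normalises registration numbers, drops duplicates and sets aside records that cannot be repaired. The car makes, field names and sample records are all hypothetical; they are not taken from Bhavin's example.

```python
import re

# Hypothetical lookup of known spelling variants to one canonical make.
CANONICAL_MAKES = {
    "maruti": "Maruti Suzuki",
    "maruti suzuki": "Maruti Suzuki",
    "maruthi": "Maruti Suzuki",
    "hyundai": "Hyundai",
    "hundai": "Hyundai",
}


def normalise_make(raw):
    """Return the canonical make, or None if the value is not recognised."""
    if raw is None:
        return None
    key = re.sub(r"[^a-z ]", "", raw.lower())   # keep letters and spaces only
    key = re.sub(r"\s+", " ", key).strip()      # collapse repeated whitespace
    return CANONICAL_MAKES.get(key)


def clean_records(records):
    """Split records into cleaned rows and rows rejected for manual review."""
    cleaned, rejected, seen = [], [], set()
    for rec in records:
        make = normalise_make(rec.get("make"))
        reg = (rec.get("registration") or "").replace(" ", "").upper()
        if make is None or not reg:
            rejected.append(rec)                # incomplete or unrecognised
            continue
        if reg in seen:
            continue                            # duplicate of an earlier row
        seen.add(reg)
        cleaned.append({"make": make, "registration": reg})
    return cleaned, rejected


sample = [
    {"make": " MARUTI ", "registration": "mh 01 ab 1234"},
    {"make": "Maruthi", "registration": "MH01AB1234"},
    {"make": "Hundai", "registration": "MH02CD5678"},
    {"make": "Unknown Motors", "registration": "MH03EF9012"},
    {"make": "Hyundai", "registration": None},
]
cleaned, rejected = clean_records(sample)
```

Keeping the rejected rows instead of silently discarding them makes it possible to review them by hand and extend the lookup table over time.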


The next data meet will be held on the last Saturday of December 2014. Please follow the Mumbai Meet-Up Group for details.

Mumbai Meet 3: Mapping Schools In Karnataka

Mumbai saw its third data meet on 26th October, 2014 with a total of 14 participants, in spite of it being a Diwali weekend. This time around we decided to try out a new place, and the venue was a rooftop space located at Chium Village, Khar West. A nice cozy place, but a tad difficult to find for people who are not familiar with the area.



This time, too, the crowd was tilted heavily towards the tech side.

The speaker was Sanjay Bhangar, co-founder of CAMP, who has been a web developer for the past 8 years, with extensive experience in online video and mapping technologies. He first gave a small introduction to DataMeet, its founders Thejesh and Nisha, how it now operates as a trust, and its aim of encouraging the open data movement among data enthusiasts.

Sanjay spoke on two main topics:
1. Introduction to their video archival platforms – they have been running these for the last five years. He explained how they gather metadata about all Indian films ever made, general video analysis tools (timeline generation / cut detection), etc.

He also explained the use of an online tool for saving videos.


2. Mapping schools in Karnataka – he explained how they have been collecting data on schools in Karnataka and are working with the Akshara Foundation, which runs a lot of programs in schools and has a lot of child-level data that makes it possible to track the performance of children in schools across the state. It was suggested that they could also map crime data, highlighting the recent crimes against children in Bangalore schools.

3. He showed us an example of how he worked on a project mapping historical data for the New York Public Library.

The next data meet will be held on 29th November, 2014. Please follow the Mumbai Meet-Up Group for details.

Mumbai’s Data Meet Kicks Off With A Bang

Written by Sanjit Oberai
Mumbai’s first Data Meet kicked off on 30th August at the Sardar Patel Institute of Technology with a total of 26 people attending the event. It started off with a round of introductions by all the attendees, who were a mix of developers, journalists, students and data enthusiasts.
There was an introduction by Ritvvij, the founder of Pykih, in which he spoke about how important it is to have a data group in Mumbai.
The first talk was by Ajaj Kelkar, a co-founder, who gave an introduction to how the recent Open Data movement started about 5-7 years ago in the US, as there was a need to move data from the private space into the public space; this was made possible by an active push from transparency groups. The idea spread, and many progressive countries realised that this could be a powerful movement that can be used for public good.
Data can be used to help take decisions on social or personal platforms. He highlighted that there are barriers and that we need to overcome them. He explained how many cities abroad have appointed Chief Data Officers, whose job is to monitor data in each city, such as municipal budgets.
Another important point he spoke about was privacy. As consumers we are leaving a lot of data out there on social media platforms. However, in the absence of proper laws, we need to be careful about what we put out there, and this is an area we need to think about carefully.
The second talk was by Ritvvij, founder of Pykih, who explained the importance of how one should visualise. He emphasised that a lot of people are not aware of how to use the correct tools to visualise data, and that the most common mistake people make is with the humble pie chart. He went on to explain the visualisation mistakes that many news organisations are making today and what they could do to improve their charts and graphs.
He also gave examples of his recent work, where he made custom visualisations for the elections and the IPL.
He also spoke about the issues most journalists face when dealing with government data and how difficult it is for them to get access to it. He ended by stating that there is a dire need for a tool that could become the CMS for data journalism, thereby allowing journalists to focus on the story rather than doing data janitorial work.
The third talk was by Sanjit Oberai, Deputy Editor of IndiaSpend, a non-profit that uses data to tell stories. He spoke about how there are tonnes of stories buried in government data and how to write articles around them. He spoke about how they research articles, what the sources of data are, and how one can visualise data using free-to-use tools like Datawrapper, Knoema, Tableau, etc.
He also spoke about a new initiative called Fact Check, which can be used to bring about accountability and raise the common man’s awareness. He cited the example of the Goa MLAs who were going to Brazil and how that created quite a stir, with the Congress calling it a wasteful expenditure. A quick fact check was done to see the assets they had declared in the sworn affidavits provided by candidates before the legislative elections in 2012.
He also spoke of a Data Room, which would be a first-of-its-kind resource for students, journalists and researchers, allowing comparison of state-wise data on population, health, education, etc.
The last talk was by Srinivas Kodali, an IIT-Madras graduate, who is researching the transport data of cities. He explained how one could scrape data from websites and demoed tools like Selenium for scraping from sites that generate data on the fly using AJAX. However, Selenium is a front-end tool typically used for testing and requires an open browser session, so it is not suited to large-scale scraping. He then went on to showcase PhantomJS, which allows scraping as Selenium does but in a headless fashion, i.e. without the browser UI.
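
To illustrate the basic idea of scraping, here is a minimal sketch that pulls the rows of an HTML table using only Python's standard library. It is not the code shown in the talk: the `TableScraper` class and the sample route table are hypothetical, and the approach only works when the data is already present in the HTML. Pages that build their content on the fly with AJAX need a browser-driving tool such as the ones Srinivas demoed.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample page standing in for a downloaded transport timetable.
SAMPLE_HTML = """
<table>
  <tr><th>Route</th><th>Stops</th></tr>
  <tr><td>1A</td><td>12</td></tr>
  <tr><td>2B</td><td> 7 </td></tr>
</table>
"""

scraper = TableScraper()
scraper.feed(SAMPLE_HTML)
scraper.close()
rows = scraper.rows
```

In practice the HTML string would come from an HTTP request, and the extracted rows would be written out to a machine-readable format such as CSV.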
The next meeting will be held at the end of September. Details will be posted on the site soon.

Screening: The Internet’s Own Boy – Aaron Swartz Documentary and discussions

All of us on the DataMeet group are aware of Aaron Swartz, so I thought it would be a great idea to watch the documentary together and share ideas. We are screening the movie this coming Thursday. Please make it if possible, and RSVP on the meetup page. If you can’t make it, the movie is available for everyone to download and watch. You can also visit the movie project page for more details.

Screening: The Internet’s Own Boy – Aaron Swartz Documentary

Thursday, Aug 28, 2014, 6:30 PM

ThoughtWorks Technologies
Ground Floor, ACR Mansion, 147/f, 8th Main Road, 3rd Block, Koramangala, Bangalore – 560034 Bangalore, IN

24 DMers Went

For the August Meetup we will be screening the Aaron Swartz documentary “The Internet’s Own Boy”. We will have a short discussion on the Indian experience around open access and what we can learn. With the advent of the Goonda Act and other vague policies regarding who owns knowledge created using pu…


Notes from DataMeet-Up in Delhi, 31 July 2014

After a long hiatus, we had a DataMeet-Up in Delhi on Friday, July 31. Thanks to the Centre for Internet and Society for hosting us.

The meet-up had a small but very productive mix of old and new faces. Here is the list of participants:

* Deeptanshu
* Guneet Narula, Sputznik
* Isha Parihar, Akvo Foundation
* Namrata Mehta, Center for Knowledge Societies
* Praachi Misra, Competition Commission of India
* Rajat Das, Contify
* Riju / Sumandro Chattapadhyay,
* Rohith Jyotish, Centre for Budget and Governance Accountability
* Shobha SV, Breakthrough

We started with a round of ‘what is DataMeet’ and moved into ‘what should DataMeet do in Delhi.’ Here are the suggestions that came up in the meeting:

1. Data Liberation Strategy: We can work towards creating a strategy and workflow to undertake data liberation tasks. These tasks can focus on two types of data – (1) data that is not yet publicly available and needs to be brought out by requesting the authorities concerned and/or speaking to them about it, and (2) data that is publicly available but not in an open / directly-usable / machine-readable manner. We have of course done some work on the second type of data especially, such as with the MP constituency boundaries shapefile and with scraping of weather data. It will be useful to prepare and document strategies for such tasks.

Deeptanshu suggested that an important available-but-not-machine-readable dataset that we can work with in the near future is the proceedings of the parliament published on the parliament’s website. We can possibly ask ADR and PRS if they have done any work towards converting that data to machine-readable formats.

2. Learning and Sharing: We felt that DataMeet should undertake pedagogic functions – from internal training / sharing sessions among DataMeet members, to public workshops on data and visualisation tools and techniques, to online documentation of the same. It seems that the existing (regular or otherwise) members of the Delhi chapter of DataMeet are a good mix of those who look forward to picking up data / visualisation / programming skills and those who can offer to teach them. Often the latter group looks forward to learning about available datasets, ways of interpreting government data (from NSSO to budget sheets), and legal considerations associated with data; all of this the former group (who want to learn data / visualisation / programming skills) can offer to help with. Hence it makes a lot of sense to convert our monthly meet-ups into short learning and sharing sessions.

Further, we can document the learning and sharing taking place in the meet-ups and put it up as online references. This will slowly create a knowledge base, with contributions from across the city chapters. There was a short discussion if we should use a Wiki to create such a knowledge base or a WordPress blog. The programming group is more comfortable with the former, while the non-programming group is more comfortable with the latter. With WordPress providing detailed ‘edit history,’ I guess it is alright to use WordPress for the sake of general ease of use.

Let us start the documentation over the next 3-4 meet-ups and think about the best way to upload it – either as a section of the DataMeet blog / wiki / GitHub or as a sub-site.

3. DataMeet-Ups as Tiny Hackathons: It was suggested that at each DataMeet-Up we take up a particular task (either data liberation or data visualisation), focus on a particular topic and dataset, and spend time together working on the task. This will include thinking about the task, creating a workflow, sharing the skills concerned, and doing the task. Finally, we showcase the work done through the DataMeet blog and elsewhere.

Further, this will also produce visible evidence that the government data made available at the portal is actually being used, and thus raise awareness of the available data and the demand for it.

4. Legal and Policy Discussion: It was briefly mentioned that some members of the group often face questions related to legal and policy context of open government data, and also regarding opening of non-governmental data. We should look for resource persons and organisations to advise on such issues. The DataMeet mailing list can also function as a primary discussion space for these topics. However, the mailing list can be too public a space for certain discussions.

Open Data Camp Delhi 2014

We had an initial chat about organising the Open Data Camp in Delhi in November 2014. The date and venue discussion is pending. We will take that up in the next DataMeet-Up.

The two primary objectives of the Open Data Camp Delhi are (1) to hold a social and networking event for open data people (those who are talking about and/or working with open data) in Delhi, and (2) to learn about their interests and challenges and prepare the roadmap for the Delhi chapter of DataMeet. Clearly, the first objective is more community-facing, and the second one is DataMeet-facing.

Here is the draft agenda for the Open Data Camp Delhi:

09:30-10:00 Ice-Breaker
10:00-10:30 Open Data and DataMeet [What is open data? What is DataMeet? Why is DataMeet? Why is open data relevant?]
10:30-11:30 Lightning Talks #1 [6 talks of 8 minutes each]
11:30-12:00 Tea/Coffee
12:00-13:00 Lightning Talks #2 [6 talks of 8 minutes each]
13:00-14:00 Lunch
14:00-16:00 Open Data Matchmaking Session [We set up two boards at the beginning of the day. One for writing down what data project one has in mind and what skills are required, and the other for writing down what data skills one can offer. On the basis of this, people meet up during the matchmaking session and talk about their plans.]
16:00-17:00 Closing and Thanks followed by Tea/Coffee
17:00-18:00 DataMeet Roadmap Discussion [Open to anyone who wants to participate]

It was suggested that lightning talks should be chosen as a combination of directly selected (by organisers) and community selected (through a submission and voting mechanism) modes.

DataMeet-Up in August 2014

We planned the next Delhi DataMeet-Up to take place on Wednesday, August 27, afternoon, where we will work on visualising datasets related to budget 2014. Rohith from CBGA, and his colleagues, will help us select the datasets and interpret them.

The venue is yet to be decided. Possible options are Akvo, CKS, Sarai, and Youth Ki Awaaz. Maybe CBGA can host it too.

Further, this also works as a warm-up session towards the Hack the Budget event being organised by World Bank in September.


DataMeetUp – Bangalore – July/2014

It’s been quite some time since we had a physical meetup in Bangalore. We are planning to host one at the end of this month, i.e. on Thursday, July 31, from 7:00 PM to 9:00 PM. Like all our previous monthly meetups, we have a rough plan:

Main Talk: Large Scale Telephonic Polls for Real Time Data Collection
Thejesh GN will share his experiences of conducting large scale telephonic polls for real time data collection.
This could be employed for
– Conducting exit polls
– Real time sentiment gathering during events like the Budget
He will share his experiences, learnings and tips.

Group updates – DataMeet group updates and discussions related to future plans.

If you would like to show your data project or talk about it, let us know and we will add it to the schedule.

Please do RSVP on our Meetup group. It helps us in planning the event.

July 2014 – Bangalore – Meetup

Thursday, Jul 31, 2014, 7:00 PM

Ashoka Innovators for the Public
54, 1st Cross, Domlur Layout

26 Members Attending



DataMeet-Up in Delhi, Friday, November 22

After a hiatus, the Delhi DataMeet-Up is back. We are meeting today, Friday, November 22, at the Akvo Foundation office, at 5:00 pm.

Here is the tentative agenda of the meeting:

  • Updates: Sharing news across the network.
  • Discussion: Discussing how we can support Hack for Change around women’s rights being organised by and Hacks/Hackers New Delhi. Shobha and Anika would begin the discussion by talking about the planned event.
  • Discussion: Beginning a discussion towards an election data hackathon. It will be led by Satyakam and Surendran.
  • Presentation: Using IPython for exploratory data visualisation – Konark Modi,
  • Discussion: Brief chat about the overall agenda of the Delhi chapter of DataMeet.

Location: Akvo Foundation, 3rd Floor, Ramnath House, Plot #18, Yusuf Sarai Community Centre, Yusuf Sarai. Nearest metro is Green Park.


DataMeet-Up in Delhi, Saturday, 31st August

We are organising the next Delhi DataMeet-Up at 5:30 pm, Saturday 31st August, 2013.

This time we are meeting at the Mimir Technologies office. The address and map are given below.

As suggested in earlier meetings, we will begin with three short presentations (around 5 minutes each, followed by 5 minutes of discussion for each presentation) and then get into the core discussion.

Here is the structure of the meeting:

RSVP: If you are joining us for the DataMeet-Up, please let us know by commenting on this post (see below).

Location: Mimir Technologies Office, Basement, #U1, Green Park Extension.
Note: The office is located just behind the main road that goes by the Green Park metro station. Please contact Gora at 9868527992 in case of any issues with finding the place.


DataMeet-Up in Delhi, Friday 19th July

We are organising the next Delhi DataMeet-Up at 6 pm, Friday 19th July, 2013.

This time we are meeting again at the Akvo Foundation office. The address and map are given below.

We have decided to focus on the following topics in this meeting:

  • Update on activities from Satyakam
  • Update on the ‘Opening Government Data by Mediation: Exploring the Roles, Practices and Strategies of Data Intermediary Organisations in India’ project by Sumandro and Zainab
  • Update from other organisations and individuals
  • Update from Nisha Thompson on plans for DataMeet Foundation/Trust
  • Setting the future agenda for DataMeet Foundation/Trust in general, and the Delhi chapter in particular

RSVP: If you are joining us for the DataMeet-Up, please let us know at mail[at]

Location: Akvo Foundation, 3rd Floor, Ramnath House, Plot #18, Yusuf Sarai Community Centre, Yusuf Sarai. Nearest metro is Green Park.


DataMeet-Up in Delhi, Friday 31st May

Note: The post contains revised date and location of the meeting. Earlier, it was set to be held on 24th May. Sorry for the inconvenience.

We are organising the next Delhi DataMeet-Up at 5 pm, Friday 31st May, 2013.

This time we are meeting at the Akvo Foundation office. The address and map are given below.

We have decided to focus on the following topics in this meeting:

  • Developing and organising the DataMeet community in Delhi, possible collaboration with other technical communities, research organisations etc.
  • Possibility of a job/project portal, to share, develop and interact about work opportunities.
  • Start creating a list of datasets that we can request as a group to be made available on

RSVP: If you are joining us for the DataMeet-Up, please let us know at mail[at]

Location: Akvo Foundation, 3rd Floor, Ramnath House, Plot #18, Yusuf Sarai Community Centre, Yusuf Sarai. Nearest metro is Green Park.
