Tag Archives: featured

Investing in Data: Pre Budget Consultation with the Finance Minister

Last Thursday DataMeet was lucky to be invited to a Pre Budget Consultation with the Finance Minister Arun Jaitley. We were invited to attend with the IT sector group and give some suggestions on how the next budget could invest in open data.

After consulting with the various city chapter organizers, we came up with recommendations that could appeal to this audience. We decided to emphasize that government data is a financial asset that needs investment in order to reach its optimal economic impact, a stance the US government took in its open data policy.

You can read the note we submitted here:

The meeting was on Thursday morning in Delhi at the Finance Ministry offices. Sumandro came to represent CIS and I attended to represent DataMeet.

The Finance Minister was there along with the Secretaries:
Shri R.N. Watal, Finance Secretary, Shri Shaktikanta Das, Secretary, DEA, Dr. Hasmukh Adhia, Revenue Secretary, Ms Anjuli Chib Duggal, Secretary, Financial Services and Dr. Arvind Subramanian, Chief Economic Adviser (CEA).

It was a round table and the participants were organized by software and hardware, and we presented in the order we were seated.

  1. Shri Ramadas Kamath, Infosys,
  2. Shri P.V.Srinivasan, WIPRO,
  3. Shri Anil Chanana, CFO, HCL,
  4. Shri Pauroos D Karkaria, TCS,
  5. Shri R. Chandrashekhar, Chief Economist, NASSCOM,
  6. Ms Nisha Tompson, Founder, Datameet,
  7. Shri Vinod Sharma, Chairman, Electronics and Computer Software Export Promotion Council,
  8. Shri Nitin Kunkolienker, Vice President, Manufactures Association for Information Technology (IT),
  9. Shri Rajoo Goel, ELCINA Electronic Industries Association of India,
  10. Shri Hari Om Rai, Co-Chairman Task Force on Mobile Phone Manufacturing,
  11. Shri Suraj Saharan Ajit Pai, COO, Delhivery,
  12. Shri Sumandro, the Centre for Internet & Society and
  13. Shri Vikas Jain, Member, Task Force on Mobile Phone Manufacturing

While most of the suggestions were related to tax breaks, subsidies, and trade issues, I was able to introduce the idea that the Government of India’s data is an economic asset that can help create markets, increase innovation, and allow for more accountability in scheme implementation. For the data to do these things it has to be opened up, which means the government must invest in the NDSAP and focus on data standardization, cleanup, and collection. Policies also need to be reviewed and revamped to keep up with the demand for and use of data. For example, the mapping policy should allow for more contributions from private sources and crowdsourcing so that the Survey of India can keep up with demand for geospatial information. The Copyright Act also needs a clarification on the status of data, and the Ministries must be willing to release data under open licenses.

In all, the meeting was short, with the main focus on how to encourage manufacturing sectors in light of the Make in India initiative. I was happy to be there, to mention ideas and concepts that are not usually discussed in rooms like that one, and to offer a perspective on open data.

We hope to keep in touch with the Ministry and continue to take advantage of any opportunity to share our experiences and views on how an investment in data can be a huge economic asset to India.

You can see the Government’s Press Release here.

2nd Open Data Camp Delhi!

Last Sunday DataMeet Delhi hosted their 2nd Open Data Camp! 60 people decided to spend their Sunday with us to discuss Digital India and find ways to make this programme more Open and Transparent.

The Delhi chapter decided to examine the role of openness in Digital India, especially how the open data agenda should be integrated into the initiative.  Digital India is the flagship programme of the Government of India to harness the possibilities of information technologies for accountable governance, effective citizenship, and a productive and job-creating digital economy.

This event also explored the recent international push towards better global availability of interoperable and comparable data, such as the Data Revolution for Sustainable Development initiative of the UN and the International Open Data Charter introduced by the Open Data Working Group of the Open Government Partnership. The keynote and the morning panels looked at these wider conversations.

Keynote: Honourable MP from Sikkim P.D. Rai.

The MP from Sikkim started off the day by talking about his experience setting up the first state-level open data policy, the Sikkim Open Data Acquisition and Accessibility Policy (SODAAP), and why it was important for the state to take control of its data through openness.

He stated that the “lack of reliable, structured, and proactively available data is a key barrier to good governance.” The SODAAP would allow state legislators to get access to data as they need it instead of having to go through the current process of asking the Centre for data. “Why is it that we have fancy phones but we can’t get data on public policy & schemes on it for good decisions.”

When asked how to get government to change, he stated, “I’m not the executive, I’m a lawmaker. I don’t represent the government.  I question it as much as you do.”

Open Data and Digital Governance

Anoop Aravind, Konatham Dileep, and Nikhil Pahwa

This panel looked at Digital India from government and journalistic points of view. It had a representative from the Telangana government, one from KPMG who is implementing e-Panchayats, and one from MediaNama.

Dileep, the Digital Media Director for Telangana, pointed out that the government is the biggest creator of data but is neither set up nor encouraged to share it. Anoop from e-Panchayats pointed out that there are technical issues with implementation and with technology penetration at the local level. He said the biggest problem for them is the lack of mapping data that can be used to help with planning.

Nikhil from MediaNama made the point that the government should proactively disclose data: “why do we need to get personal relations to get the data?” Open data, however, does not replace people’s right to ask for information; they should not have to rely only on what is published. Right to Information is still vital, and this includes an expanded effort to protect people’s privacy.

When asked about the challenges of openness for Digital India, the panelists said that despite the big fanfare there is uneven implementation, that issues have to be solved before the dreams of Digital India are realized, and that people have to work with the government to show it the reasons to be open.

Open Data and Digital Citizenship 

Bhanupriya Rao, Dr. Biplav Srivastava, Nic Dawes, and Shashank Srinivasan

Bhanupriya Rao, an RTI activist, described how RTI has a proactive disclosure requirement; however, it is not followed in practice, and until it is, filing RTI requests is the best tool for now. There is no concept of a right to data.

Nic Dawes described journalism as a constitutional mandate and went on to say that the open data and journalism communities must work together more. Journalists can deal with biases, data interpretation issues, and graphic presentation, and can tell compelling stories using tech and design.

Biplav Srivastava spoke about the need to move toward smart data consumption for both policy decisions and individual decisions. He said the next steps are data integration, re-use, and standards, and linked data for analytics.

Shashank Srinivasan shared his experience with open data for conservation at WWF, where they use OSM data to help protect wildlife. He asked what the risks of crowdsourcing are for wildlife conservation: open data can be a problem for conservation, since some control over the end user is needed.

Questions to consider:

How can open data improve our work? How can academia and open data converge? Can donors influence the release of data? What does it mean to be a digital citizen?

Lightning talks

Guneet from Akvo shared their smartphone app that detects fluoride levels in water.

Maning from HotOSM shared their work around the world providing maps during natural disasters, including the Nepal earthquake.

The Transport Working Group shared their work looking at bus data in Delhi.

Bihar Gender Watch shared their work looking at the gender split in elected bodies.

NewsPie, an online news site, shared the data work they have done on roads and on net neutrality.

Aditya Dipankar shared his work designing information.

Aruna from MapBox shared their work mapping road naming.

Turam shared his project that built more data collection tools on the Open Data Kit.

Yogesh from Random Hacks of Kindness (RHOK) spoke about his vision for an open revolution, and about the work of RHOK in India bridging gaps between organizations on the ground and technologists.

Monish Khetrimayum, a PhD student, spoke about big data, governance and citizenship.

Rakesh from Factly described how they use RTI information and open data to make sense of information for journalists and citizens.

Group Activity: Response to Digital India

Groups were formed to discuss each pillar and come up with questions.

We have gathered all the questions and put them in the DataMeet hackpad; you can find each pillar here.

Please feel free to take a look and add more questions and dataset requests.

After a week’s time we will gather everything and write a letter to Digital India and the relevant departments, including DeitY, requesting that they make this information available.

It was a fantastic day! DataMeet Delhi did an amazing job putting together really interesting speakers to make this a well rounded interactive event.

Thank you especially to the sponsors for helping make this event great!

  • SARAI for the space
  • AKVO for travel
  • ICFJ for food and other support.
  • RHOK for travel

Map of Electoral districts of Sri Lanka

Maps of Sri Lanka’s electoral districts are now available for download. I initially made these for a friend who wanted to analyze the election results. The electoral districts are derived from the administrative maps.

You can check the diff on github to see how the maps were changed.

The GADM database of Global Administrative Areas is the source of the administrative data. I used three simple online tools:

  • GeoJSON.io for converting from KML to GeoJSON and adding attributes.
  • MapShaper for merging the areas
  • GitHub for storing the map files.
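
The merge step can be sketched in code. This is a minimal illustration and not the process actually used for these maps: it assumes each administrative feature carries a hypothetical `electoral_district` property (added, for instance, while editing attributes) and groups the polygons of each district into one MultiPolygon. It does not dissolve shared internal borders the way a merge in a dedicated tool would, and the areas below are made-up unit squares.

```python
from collections import defaultdict


def group_by_district(collection, key="electoral_district"):
    """Combine Polygon features sharing the same `key` into MultiPolygons."""
    grouped = defaultdict(list)
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            raise ValueError("this sketch only handles Polygon features")
        # A MultiPolygon is a list of Polygon coordinate arrays.
        grouped[feature["properties"][key]].append(geometry["coordinates"])

    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {key: district},
                "geometry": {"type": "MultiPolygon", "coordinates": polygons},
            }
            for district, polygons in sorted(grouped.items())
        ],
    }


def square(x, y):
    """Return a unit-square Polygon geometry with its lower-left corner at (x, y)."""
    ring = [[x, y], [x + 1, y], [x + 1, y + 1], [x, y + 1], [x, y]]
    return {"type": "Polygon", "coordinates": [ring]}


def area(district, x, y):
    """Build a placeholder administrative feature assigned to `district`."""
    return {
        "type": "Feature",
        "properties": {"electoral_district": district},
        "geometry": square(x, y),
    }


# Made-up administrative areas; district labels and shapes are placeholders.
admin_areas = {
    "type": "FeatureCollection",
    "features": [area("A", 0, 0), area("A", 1, 0), area("B", 0, 1)],
}

electoral = group_by_district(admin_areas)
for feature in electoral["features"]:
    print(feature["properties"]["electoral_district"],
          len(feature["geometry"]["coordinates"]))
```

Grouping keeps every original boundary, which makes it easy to check which areas ended up in which district before doing a real geometric merge.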

Note: I don’t provide any guarantee on the accuracy of the maps, so don’t use them if you need accurate maps. I have made notes on how these maps were derived. Use them if you think the process is right, and raise an issue if you find anything.

Guest Post: Varun Goel- Releasing Data for Agriculture

Varun serves as the chief data scientist on a research team led by Dr. Ashwini Chhatre that serves as the Research Node of the Revitalizing Rainfed Agricultural Network, an India-wide network of NGOs, civil society organizations, researchers, policy makers and think-tanks that aims to reconfigure the nature, amount and delivery of public investments for productive and resilient rainfed agriculture.

The Combined Finance and Revenue Accounts (CFRA) report is an annual report prepared by the office of the Comptroller and Auditor General (CAG) of India that provides comprehensive Union and State government data on audited receipts, revenue expenditures and capital outlay for different major, minor and sub-minor heads.

Since the figures for actual expenditure on different heads may differ from the budget allocation by as much as 15 to 20 percent, and since each state might have different auditing procedures, the CFRA data provides reliable and fairly disaggregated figures of public expenditure, audited by a central authority.

The research team at the Revitalizing Rainfed Agricultural Network (RRAN) has scraped and processed the CFRA data from 2005-06 to 2010-11 for all general and economic services to understand statewide public investments in agriculture and allied activities, and highlight the mismatch in investment and needs on the ground.

The processed data, along with detailed information for each head can be forked here.

Although the data is only available at the state level, it can provide valuable insight into public expenditure in other domains, such as urban development, health, and central and state sponsored schemes, and can also highlight the differences between budget allocation and actual spending for various government heads.
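
As a rough illustration of the allocation-versus-spending comparison, the sketch below computes the percentage deviation of actual expenditure from the budget allocation for each head. The head names and all figures are placeholders chosen for the example; they are not taken from the CFRA data, and the layout of the processed files may well differ.

```python
def deviation_percent(allocated, actual):
    """Percentage by which actual spending differs from the allocation."""
    if allocated == 0:
        raise ValueError("allocation must be non-zero")
    return (actual - allocated) / allocated * 100.0


# Placeholder figures (allocation, actual spending); NOT real CFRA numbers.
heads = {
    "Crop Husbandry": (1000.0, 850.0),
    "Soil and Water Conservation": (400.0, 460.0),
}

report = {
    head: round(deviation_percent(allocated, actual), 1)
    for head, (allocated, actual) in heads.items()
}

for head, deviation in report.items():
    print(f"{head}: {deviation:+.1f}%")
```

A negative value means the head was underspent relative to its allocation; a positive value means it was overspent.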

The Revitalizing Rainfed Agricultural Network (RRAN) has a practice and policy node that generates ground-based evidence at the block, district and state levels for policy engagement. The research node’s objective is to generate evidence for testing key hypotheses, to enable an articulation of the nature and magnitude of public support needed to fuel the growth of India’s rainfed agriculture. To facilitate this, a Data Center has been set up with the aim of acquiring, reconciling, processing, visualizing and disseminating pan-India datasets to assist in exploratory analysis and the development of research hypotheses, backing up policy advocacy through scientifically rigorous data analysis, and implementing data-driven decision-making tools for program implementation by grassroots-level organizations.

Open Access Week – Open a Dataset with Srinivas Kodali

Cross post from Lost Programmer

International Open Access Week starts today. I have been associated with the concepts of open data and open access since 2012 and was hoping to bring some serious attention to them in India. This week I intend to showcase a series of datasets that several departments of the Govt. of India publish on their web portals under NDSAP, apart from the Open Government Data Platform.

The dataset I want to bring attention to today is from Indian Customs. Indian Customs maintains records of every product imported and exported through land, sea and air, and publishes this data through its commerce portal. They should be highly appreciated for maintaining this website and publishing the data. The data is published as per Notification No. 18/2012-Customs (N.T) dated 5th Mar, 2012.

The data being published includes origin, destination ports, name of the product, Harmonized System code of the product, quantity of product, unit quantity of the product, customs valuation of the product. For imported goods, the origin country is published instead of the port, while for export you get to know the exact destination city.
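
To make the published fields concrete, here is a minimal sketch of how one record could be modelled and summarised. The field names, sample records and values are assumptions made for illustration, not the portal’s actual schema or data. The only domain fact relied on is that the first two digits of a Harmonized System code identify its chapter.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomsRecord:
    direction: str     # "import" or "export"
    origin: str        # country for imports, port for exports
    destination: str   # port for imports, city for exports
    product: str
    hs_code: str       # kept as text so leading zeros survive
    quantity: float
    unit: str
    value: float       # customs valuation


def value_by_hs_chapter(records):
    """Sum customs valuation per HS chapter (first two digits of the code)."""
    totals = defaultdict(float)
    for record in records:
        totals[record.hs_code[:2]] += record.value
    return dict(totals)


# Made-up sample records; HS headings 0901/0902 (coffee/tea) and 8517
# (telephone sets) are real headings, everything else is a placeholder.
records = [
    CustomsRecord("export", "Example Port", "Example City", "Tea", "0902", 500.0, "KGS", 100000.0),
    CustomsRecord("export", "Example Port", "Example City", "Coffee", "0901", 800.0, "KGS", 200000.0),
    CustomsRecord("import", "Example Country", "Example Port", "Phones", "8517", 50.0, "NOS", 500000.0),
]

totals = value_by_hs_chapter(records)
print(totals)
```

Storing the HS code as a string matters: parsing `0902` as a number would drop the leading zero and put the record in the wrong chapter.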

Read the rest over at Srinivas’s blog here

And if you are using the data for anything please let us know! Stay tuned for tomorrow’s release!

Open Access Week 2015 India Events

It’s Open Access Week! This week there are events around the country to celebrate openness and explore how far we have to go.

MapBox is putting up an amazing Open Data Gallery on Tuesday the 20th in Bangalore. Come hang out and look at incredible art and projects from around the country!

In celebration DataMeet is doing its first MULTI CITY EVENT!

Join us Saturday 24th at 6:30pm for talks from Data.Gov.In, Ahmedabad and Bangalore with livestreaming between the cities!

  • Data.Gov.In will talk about the latest updates to Open Data in India.
  • Bangalore will discuss open access in general and open data projects.
  • Ahmedabad will talk about the status of Open Access in their part of the world.
  • Srinivas Kodali will talk about releasing datasets.

Bangalore’s event will be at Centre for Internet and Society.

Ahmedabad will be at CEPT University. 

Please RSVP on Facebook or Meetup.

Let’s celebrate all we have been able to accomplish as a community and look forward to continuing to promote a culture of openness, sharing, learning and collaboration.

Nobel Prize winner Angus Deaton on the importance of Open Data in India

On Data{Meet} we have been talking about the importance of open data and its quality. This year’s winner of the Nobel Prize for Economics, Angus Deaton, has a similar point of view on the quality of open data. The whole article is worth reading; I am quoting a paragraph.

My work shows how important it is that independent researchers should have access to data, so that government statistics can be checked, and so that the democratic debate within India can be informed by the different interpretations of different scholars. High quality, open, transparent, and uncensored data are needed to support democracy.

I have used data from India’s famous National Sample Surveys to measure poverty. Perhaps the biggest threat to these measures is that there is an enormous discrepancy between the National Accounts Statistics and the surveys. The surveys “find” less consumption than do the national accounts, whose measures also grow more rapidly. While I am sure that part of the problem lies with the surveys—as more people spend more on a wider variety of things, the total is harder to capture—but there are weaknesses on the NAS side too, and I have been distressed over the years that critics of the surveys have got a lot more attention than critics of the growth measures. Perhaps no one wants to risk a change that will diminish India’s spectacular (at least as measured) rate of growth?

Source: TheWire
Picture credit: Nobel Prize

The first GeoDel meetup

On the 2nd of September, 2015, DataMeet-Delhi spun off a small side project known as GeoDel. Following GeoBLR‘s example, GeoDel is a Delhi-based group/community that meets to discuss open spatial data in the Indian context.

Akvo very kindly hosted us at their beautiful Delhi office, and we began with a very short talk by me (Shashank) on a quilt my mother made, based on OpenStreetMap data of South Delhi. Riju then spoke about mental maps, using a slideshow with some beautiful maps. He ended his talk with a participatory mapping exercise using FieldPaper maps of Delhi, where everyone who attended the meet had a chance to shout out a random place in Delhi, and everyone else had to mark it on their maps. It was a good way to learn about places in Delhi with arcane names such as ‘Rohini‘ and ‘Patparganj‘, and to end our first GeoDel as well.

GeoDel will have bi-monthly meets, so stay updated on its spatio-temporal coordinates via the MeetUp and FaceBook groups!

Data{Meet} Pune, Second Meetup – Let’s talk Mapping

The 9th of August, 2015 marked 11 years of the OSM project. On the same weekend Datameet Pune fittingly held its second meetup, ‘Let’s talk Mapping’. The session was led by Devdatta (Dev) Tengshe, a veteran of the Bangalore Datameet group who has several years of experience in GIS and remote sensing, having previously worked for ISRO. Dev began with a primer on what spatial data is and what can be done with it, then followed with an introduction to GIS, a demonstration of OSM, and information on sources of spatial data in the Indian context. His presentation can be found here. Below are the highlights of the session.

What is spatial data? Its uses?

Spatial (data) is not necessarily ‘special’, as many say. It is simply data with a spatial element to it. This could be latitude-longitude, but pin codes and postal addresses can serve as spatial references too. There are numerous advantages to viewing and analyzing social sector data spatially, whether it is census data, land records, city water supply and sewerage networks, or other datasets. Spatial representation helps detect patterns and trends that may otherwise go unnoticed. Spatial data in the social sector also comes with its own set of challenges. Maps of land parcels, for example, are not recorded in any standardized way across the country but are instead described using local landmarks (turn left at this tree, go straight for 50m, then turn right and head towards the banyan tree). Much census data is also not easily available at finer local levels, only at the district level.

Spatial data can be used to solve spatial problems. Spatial data visualizations work with the strength of the human eye, which is to detect patterns visually. In the exploratory stage you may visualize data to detect patterns: for example, a map of a user’s Facebook friends may unintentionally reveal areas of low internet penetration, and a comparison of Bangalore’s bus routes with Pune’s shows a stark difference in connectivity. In further analysis you may also find spatial correlations. Spatial modelling is yet another application. These processes are in fact the same ones you would use with regular data and, like all other data, spatial data requires a lot of cleaning.

GIS 101

The real world is infinitely complex. To represent this spatial world in data we have to develop simplified models. These can be either vector or raster models. In vector models, we use points, lines and polygons to represent real world features (e.g. bus stops, bus routes, ward boundaries), whereas in raster models we use images of the earth’s surface, taken by satellites or UAVs and composed of pixels.
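
The two models can be sketched with plain Python structures. The coordinates below are approximate, illustrative values around Pune, and the elevation grid is invented; the helper names are hypothetical.

```python
# Vector model: features are described by (longitude, latitude) coordinates.
bus_stop = (73.8567, 18.5204)                                  # a point
bus_route = [(73.85, 18.52), (73.86, 18.53), (73.87, 18.53)]   # a line
ward_boundary = [                                              # a polygon ring
    (73.84, 18.51), (73.88, 18.51), (73.88, 18.54),
    (73.84, 18.54), (73.84, 18.51),
]

# Raster model: a grid of pixels, each holding one value
# (here an invented elevation in metres).
elevation = [
    [560, 562, 565],
    [558, 561, 566],
    [555, 559, 563],
]


def is_closed_ring(ring):
    """A polygon ring must end on the vertex it started from."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def pixel_value(grid, row, col):
    """Read the value stored in one raster cell."""
    return grid[row][col]


print(is_closed_ring(ward_boundary), is_closed_ring(bus_route))
print(pixel_value(elevation, 2, 0))
```

The vector structures store only the vertices that define each feature, while the raster stores a value for every cell whether or not anything of interest is there.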

File formats for spatial data:

Vector

Shapefiles are used within desktop software (QGIS, ArcGIS); GeoJSON is used for web mapping (it is light, and both human and machine readable); KML (first developed by Keyhole, later bought by Google) is also a common format.
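
To show why GeoJSON counts as both human and machine readable, here is a small sketch that builds a one-feature collection and serialises it with the standard `json` module. The helper name `point_feature` and the sample stop are made up; note that GeoJSON stores coordinates as longitude first, then latitude.

```python
import json


def point_feature(lon, lat, **properties):
    """Build a GeoJSON Point feature; coordinates are [longitude, latitude]."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


collection = {
    "type": "FeatureCollection",
    "features": [point_feature(73.8567, 18.5204, name="Example bus stop")],
}

text = json.dumps(collection, indent=2)
print(text)

# The same text can be read back by any JSON parser.
parsed = json.loads(text)
```

Because the file is plain JSON, it can be inspected in a text editor and diffed in version control, which is harder with binary formats such as shapefiles.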

Raster

TIFF (with multiple bands) allows for storage of larger datasets.

Spatial databases can store spatial data and answer spatial queries about it, so a user doesn’t have to write out the logic for such operations (an example of a spatial query: which school or hospital is nearest to this village?). Spatial databases are used by retail businesses, housing, utilities and many other commercial ventures.
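
For comparison, this is roughly the logic a user would otherwise write by hand for the nearest-school question. It is a brute-force sketch in plain Python using the haversine great-circle distance; a spatial database would typically answer the same question with a spatial index instead. The village and school locations are invented.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest(origin, places):
    """Return the place closest to `origin`, given as (lat, lon)."""
    lat, lon = origin
    return min(places, key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))


# Invented locations for illustration.
village = (18.60, 73.90)
schools = [
    {"name": "School A", "lat": 18.52, "lon": 73.85},
    {"name": "School B", "lat": 18.61, "lon": 73.92},
    {"name": "School C", "lat": 18.75, "lon": 73.70},
]

closest = nearest(village, schools)
print(closest["name"])
```

This checks every school for every query, which is fine for a handful of points but is exactly the work a spatial index avoids on large datasets.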

Where do I get spatial data?

The Beg-Borrow-Steal theory

Beg

Create it yourself. In the process of field work you can use field kits to collect spatial data for your area of interest. Tools available for this include Locus map free – Outdoor GPS (app) or Open Data Kit (software suite). As an alternative, you may also digitize from satellite maps.

Borrow and convert it

Some data is freely available but not in an easily usable form; it may need to be converted or georeferenced.
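
Georeferencing means attaching real-world coordinates to the pixels of an image such as a scanned map. The sketch below covers only the simplest case, under stated assumptions: a north-up image whose top-left and bottom-right corners have known longitude and latitude, with positions in between interpolated linearly. Real georeferencing in GIS software usually fits a transform to several ground control points; the image size and corner coordinates here are invented.

```python
def make_pixel_to_lonlat(width, height, top_left, bottom_right):
    """Return a function mapping (col, row) pixel positions to (lon, lat).

    Assumes a north-up image and linear scaling between the two corners.
    """
    lon0, lat0 = top_left
    lon1, lat1 = bottom_right
    lon_per_px = (lon1 - lon0) / width
    lat_per_px = (lat1 - lat0) / height  # negative when north is up

    def pixel_to_lonlat(col, row):
        return (lon0 + col * lon_per_px, lat0 + row * lat_per_px)

    return pixel_to_lonlat


# Invented example: a 1000 x 800 pixel scan covering 73.0-74.0 E, 18.2-19.0 N.
to_lonlat = make_pixel_to_lonlat(1000, 800, (73.0, 19.0), (74.0, 18.2))

lon, lat = to_lonlat(500, 400)
print(round(lon, 4), round(lat, 4))
```

Once every pixel has a coordinate, features traced on the static image can be saved as ordinary vector data.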

“Steal”

Spatial data can be ‘scraped’ from websites that contain it but do not make it easily available; see the DataMeet maps repository on GitHub for examples of data collected from census websites. Although permission may not explicitly be given for this, since the data is already up on the web and no copyright exists on it, it is implicitly understood to be open.

Open Street Map (OSM)

OSM, the Wikipedia of spatial data, counts more than two million users who voluntarily contribute to the project. OSM first aimed to collect just street data, but it has since expanded tremendously. City data in OSM is of high quality; for rural areas, however, only major roads can be relied on.

Unlike Google Maps, which does not allow a user direct access to its data, OSM makes its raw data available for download as well as editing. Within OSM, users can tag different aspects of any object, giving others more information about it. Users can also introduce new key:value pairs if needed. OSM scripts monitor changes and an IRC chat room verifies these changes. OSM updates frequently and is therefore used in humanitarian situations (HOT OSM). Only 12 servers run all of OSM.
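
The key:value tagging can be pictured as a small dictionary attached to each object. The sketch below filters a few made-up elements by tag; `amenity=school`, `amenity=hospital` and `highway=residential` are commonly used OSM tags, while the element list and the helper name `has_tags` are hypothetical.

```python
def has_tags(element, **wanted):
    """True if the element carries every requested key=value pair."""
    tags = element.get("tags", {})
    return all(tags.get(key) == value for key, value in wanted.items())


# Made-up elements; real OSM objects also carry coordinates, versions, etc.
elements = [
    {"id": 1, "tags": {"amenity": "school", "name": "Example School"}},
    {"id": 2, "tags": {"highway": "residential"}},
    {"id": 3, "tags": {"amenity": "hospital"}},
    {"id": 4},  # an untagged object
]

schools = [e["id"] for e in elements if has_tags(e, amenity="school")]
print(schools)
```

Because tags are free-form, a new key can be introduced simply by adding another entry to the dictionary, which is what makes the scheme easy to extend.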

Wikimapia is limited in comparison: it allows you to draw on Google Maps, but there is no verification of additions and only limited data download.

There are independent initiatives that make raw OSM data available for download [see slide 47]. Similarly, other apps use and make available OSM data; MapQuest, for instance, gives directions based on OSM data. If you are unsure of the final use of your data you can download it in OSM XML format, since it contains everything. GeoJSON is useful only when you need shapes, not other features of spatial data.
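
As a rough idea of what working with OSM XML looks like, the sketch below parses a tiny hand-written sample with Python’s standard `xml.etree.ElementTree` and keeps only the nodes that carry tags. The sample document is invented and far smaller than a real extract, which would usually call for a streaming parser.

```python
import xml.etree.ElementTree as ET

# A tiny, invented sample in the OSM XML layout (nodes with nested tags).
SAMPLE = """<osm version="0.6">
  <node id="1" lat="18.5204" lon="73.8567">
    <tag k="amenity" v="school"/>
    <tag k="name" v="Example School"/>
  </node>
  <node id="2" lat="18.5300" lon="73.8600"/>
</osm>"""


def tagged_nodes(xml_text):
    """Return the nodes that have at least one tag, as plain dictionaries."""
    root = ET.fromstring(xml_text)
    result = []
    for node in root.iter("node"):
        tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
        if tags:
            result.append({
                "id": int(node.get("id")),
                "lat": float(node.get("lat")),
                "lon": float(node.get("lon")),
                "tags": tags,
            })
    return result


nodes = tagged_nodes(SAMPLE)
print(nodes)
```

Untagged nodes are usually just vertices of ways, so dropping them is a common first filtering step when only points of interest are wanted.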

Sources

  • Downloading OSM data for a country: Geofabrik
  • Downloading OSM data for any custom polygon: BBBike
  • Raw data based on particular data queries: Overpass Turbo
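
For the query-based option, a request is written in Overpass QL. The sketch below only builds the query text and does not contact any server; the helper name and the bounding box (a rough, illustrative box around Pune) are assumptions. Overpass expects the bounding box in south, west, north, east order.

```python
def overpass_node_query(key, value, south, west, north, east, timeout=25):
    """Build an Overpass QL query for nodes with key=value inside a bounding box."""
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout}];\n"
        f'node["{key}"="{value}"]({bbox});\n'
        "out body;"
    )


# Illustrative bounding box; adjust to your own area of interest.
query = overpass_node_query("amenity", "school", 18.4, 73.7, 18.7, 74.0)
print(query)
```

The resulting text can be pasted into the query editor to preview the matching objects on a map before exporting them.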

Spatial data in the Indian context

Districts/Tehsils

Shapefiles for districts and tehsils are available in the DataMeet maps repository on GitHub. However, maps must be verified against other sources of data. In reality there is dispute even within the Indian government on how many districts India has.

Village boundaries

In reality, in many cases no fixed village boundaries exist; the Census uses blocks and settlements for reference. Some states, however, make available static maps showing village boundaries that can be georeferenced.

Pin codes

Can we divide the country into pin codes? Pin codes do not represent an area; they are points along a line where the postman will deliver. The assignment of addresses to the last three digits of a pin code is a decentralized decision, made by the lowest level of post offices. Pin codes also do not cover the entire country, and post offices and pin codes do not have a one-to-one relation.
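
The structure of a pin code and the many-to-one relation can be illustrated with a short sketch. It follows the commonly described convention that the first digit is the zone, the first two digits the sub-zone, the first three the sorting district, and the last three the delivery post office. The office names are placeholders and the pin values are used only as examples.

```python
from collections import defaultdict


def split_pin(pin):
    """Split a six-digit pin code into its conventional parts."""
    if len(pin) != 6 or not (pin.isascii() and pin.isdigit()):
        raise ValueError(f"not a six-digit pin code: {pin!r}")
    return {
        "zone": pin[0],
        "sub_zone": pin[:2],
        "sorting_district": pin[:3],
        "delivery_office": pin[3:],
    }


# Placeholder office names; several offices can share one pin code.
post_offices = [
    ("Example Head Post Office", "411001"),
    ("Example Market Sub Office", "411001"),
    ("Example Village Branch Office", "412105"),
]

offices_by_pin = defaultdict(list)
for name, pin in post_offices:
    offices_by_pin[pin].append(name)

print(split_pin("411001"))
print(dict(offices_by_pin))
```

Grouping by pin code rather than assuming one office per code avoids silently dropping offices when the data is joined to other tables.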

Census data

Census data at the finest spatial level comes down to census ward boundaries. Nobody outside the census department actually knows these boundaries. Pune city has 700 census ward boundaries (which do not correspond to administrative/electoral ward boundaries) mostly hand drawn. District level offices may have maps with these boundaries as hard copies.

Nothing in national policy disallows them from sharing them, but government officials nevertheless aren’t inclined to share such information. Certain limitations do exist on government data sharing, however: sharing data on protected military areas, areas near the national boundaries, topographic maps, etc. is prohibited.

Basemaps and DEMs (Digital Elevation Models)

The Open Data initiative of the Government of India has created some 5400-odd ‘Open Series maps’, i.e. toposheets without height information. None of these are done digitally or printed. They can however be used with GPS data since the lat-long is accurate.

Since GoI topography data isn’t made openly available, the available alternatives are SRTM, ASTER and Bhuvan Cartosat. These are good for larger rural areas, for example, but not feasible for urban areas. Private companies work with UAVs for very high resolution elevation data. For satellite imagery as basemaps, Landsat imagery going back to the 1970s is available.

Closing Remarks

In following up on our discussions on mapping, for those of you who are interested, we have several Pune-specific mapping tasks that individuals can contribute to. E-mail us at pune@datameet.org for more information. We hope that everyone found the discussion useful. Thank you for coming, thanks to Dev for the informative session, and thanks to Shraddha and Thoughtworks Pune for hosting us. Do connect with us via social media [Twitter] or join our mailing list for information on the next meeting.

{Ahmedabad} – 3rd Meetup

This meetup was special as it took place on my way back from a long drive. Since I did quite a bit of open data work on the trip, I thought I would talk about that. So we had a long conversation about how we can contribute while on a long drive.

The presentation is embedded below or you can check the presentation.

We discussed in detail the following services to which anyone can contribute.

We also discussed the Android apps that can be used to collect and submit data.