Tag Archives: Water Data

Global Open Data Index: Water Quality

Last year I helped assess the water quality section of the Global Open Data Index (GODI). Given the news of lead poisoning in Flint, Michigan, and increasingly beyond, safe drinking water is no longer assured even in countries where it has long been guaranteed, so I am very glad water quality was included in GODI.

GODI is a survey of 122* countries that looks at the status of ‘high priority datasets’ and whether they are truly open according to the Open Data Criteria. Water quality was included last year for the first time. My job was to examine each country’s submission and assess whether the data submitted was what was asked for and met the criteria for being open. This was a daunting task, but I figured that if I could find water quality data in India, of all places, it wouldn’t be impossible.

Assessment Criteria/Methodology

GODI looked for very specific parameters:

While a lot more parameters could be asked for, these were a good sample for assessing whether a country has robust water monitoring.

After the initial submission phase there were a lot of questions about why the survey didn’t just ask for drinking water quality data or environmental monitoring data.

Choosing parameters instead of programmes is important because monitoring the environment and monitoring drinking water quality are connected. Some countries haven’t really established large nationalized water treatment strategies; drinking water comes directly from a natural resource, so the environmental monitoring data inadvertently applies to the drinking water scenario. This means that if a country really has robust water quality data, it must have these 5 parameters, because they cover surface and ground water sources and also reflect safe drinking water standards.

A submission would be rejected if, for instance, the submitter only found data from surface water monitoring stations (environmental water monitoring), because arsenic and fluoride are primarily groundwater contaminants. Ideally the submitter would find either the treated drinking water quality data, which covers all the parameters, or the source water quality data for both surface and ground water.
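The acceptance rule described here can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the tool GODI reviewers used. The `Dataset` class, the `submission_is_acceptable` function and the source labels are all hypothetical. Of the parameter names, only arsenic and fluoride appear in this post; the other three are assumed placeholders standing in for the rest of the list.

```python
from dataclasses import dataclass

# Assumption: only "arsenic" and "fluoride" are named in the post.
# The other three names are illustrative placeholders for the remaining parameters.
REQUIRED_PARAMETERS = frozenset({
    "arsenic",
    "fluoride",
    "fecal_coliform",
    "nitrates",
    "total_dissolved_solids",
})


@dataclass(frozen=True)
class Dataset:
    """One dataset found by a submitter (hypothetical structure)."""
    source: str            # "treated", "surface", or "ground"
    parameters: frozenset  # parameter names the dataset reports


def submission_is_acceptable(datasets, required=REQUIRED_PARAMETERS):
    """Return True if the datasets satisfy the rule sketched in the text."""
    # Option 1: treated drinking water data that covers every parameter.
    if any(d.source == "treated" and required <= d.parameters for d in datasets):
        return True

    # Option 2: source water data for BOTH surface and ground water
    # that together cover every parameter.
    source_data = [d for d in datasets if d.source in ("surface", "ground")]
    sources_present = {d.source for d in source_data}
    covered = frozenset().union(*(d.parameters for d in source_data))
    return sources_present == {"surface", "ground"} and required <= covered


# Surface water stations alone miss the groundwater parameters.
surface_only = [
    Dataset("surface", frozenset({"fecal_coliform", "nitrates", "total_dissolved_solids"})),
]
print(submission_is_acceptable(surface_only))  # False
```

Adding a groundwater dataset that reports arsenic and fluoride to `surface_only` would make the same check pass, which is the case the rule is meant to reward.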

For a full look at the methodology of the entire survey go here.

Some background

There is no one way to create water management systems, but there are two major ways by which people get water: directly from the source, or piped in from a source or a treatment facility. The origin of the water is important. Water from the ground has different quality issues than surface water (a lake or river). If water is from a treatment plant, that plant may be drawing on surface water, ground water, and in some cases recycled water. Usually water quality is measured at the source and after treatment (treatment plants also take multiple water quality samples during the treatment process).

A full water quality assessment means lots of parameters, and not all of them are tested the same way; some take several days and require specific conditions, while others can be measured easily with filters or litmus paper. Measuring water quality is a deliberate process of sampling and testing, and it is not as easy as sticking a sensor into the water and monitoring a continuous feed of data (although the potential for these approaches is quickly growing as technology improves).

What I looked for

Since water quality monitoring is a scientific process, I figured that if I found any proof of water treatment or quality monitoring, a dataset would not be far off. After going through a few countries I noticed that the different water management approaches and policies affected where you would find the data.

Most countries give drinking water treatment responsibilities to local bodies, but sometimes treatment is monitored by the central government under public health regulation, so aggregated data could lie with the public health ministry or the environmental protection body. In most cases responsibility for environmental monitoring fell to a central government environmental ministry.

This means that multiple datasets exist. A centralized dataset for surface and groundwater usually lies with the environmental ministry; it could have all the parameters but sometimes doesn’t, or it doesn’t have real time data (data may be available, but from less frequent collection such as quarterly or half-yearly efforts). Alternatively, the public health ministry has reports of water quality with all the parameters, but these are aggregated, usually in report form (not a dataset), and not updated in a timely manner.

The US, for instance, falls into this group and can produce confusing submissions. The US has a robust geological survey of surface and ground water sources. The drinking water reports, however, are supposed to go to the Environmental Protection Agency, and no one seems to be updating the database with information. In my assessment I reduced the score because both are supposed to be available in the public domain.

There are countries like Belgium where water management and monitoring are left completely to local bodies and there is no central role in monitoring at all, which means there is no dataset.

There are countries, like France, where there is a strong central role in water management and a dataset could be made open. Korea stood out because it has real time water quality information from its treatment plants that is published to a website.

Then there are the ‘unsures’: countries that seem to treat water to some degree or have national drinking water monitoring programmes but don’t have data online, reports, or any mention of data at all. This is not restricted to the developing world. I was very frustrated with several European countries whose newspapers are full of reports of how pristine and delicious their water is, yet which don’t have a single public-facing dataset.

Takeaways

The United Kingdom and the US, both pioneers of the open data movement, had terrible water quality data for water treatment, and no effort has been made to bring the data together or make it available in real time. It is also not clear to citizens who holds local bodies accountable for updating their reports, making them public, and finding ways to bring this data into the light so it can be used. It is no wonder that the US is now on the cusp of a public health crisis.

It is frustrating that the open data movement hasn’t quite been able to reconcile decentralization and local responsibility with national level accountability and transparency. Public health is a national level issue even though local and regional contexts are required for management. How do we push for openness and transparency in systems like this?

In places like India, where water treatment is largely left to private players and huge populations are not receiving treated water, data must be available, open, and in the hands of both central bodies and local players, because people need to find solutions and work out where to intervene. Given the huge problems with water borne diseases, the slow but epic arsenic and fluoride poisonings gripping parts of India, and the effects this will have for generations, making this data public, usable and demystified is no longer optional.

All in all, I have to say this was an enlightening experience; it was cool to be able to learn something about each country. In our continuous push for open data we sometimes get lost in standards, formats, and machine readability, but it is essential to take a moment to prioritize our values as a society and have open data reflect them. Engaging with public health outcomes and complex issues like them is an essential part of growing the open data movement and making it relevant to millions more.

*(Correction: A previous version said the survey included 148 countries; the actual number is 122.)

Delhi Jal Board and Open Water Data: Report from DataMeet-Up

A DataMeet-Up was held on Thursday, April 30, 2015, at the Akvo office to discuss the Summer Action Plan prepared by the Delhi Jal Board (DJB) and the data concerns thereof. Sundeep Narwani of Delhi Dialogue Commission presented the Action Plan. Kapil Mishra, MLA and Vice-Chairman of DJB, participated in the discussions and described the planned activities at DJB.

Here are the minutes of the meeting, prepared by Sandeep Mertia.

Sundeep Narwani started with a presentation on Delhi Jal Board’s Summer Action Plan (SAP) 2015. Some important points from his presentation were:

  • SAP is a short term measure for three months of summer
  • Big problem: 40% of Delhi does not have enough piped water networks, and thus tanker services and unauthorised supply exist
  • They have planned several measures to improve tube wells, infrastructure (replacing old lines) and repairs, grievance redressal, and sewage treatment.

The details are available in the Summer Action Plan document.

This was followed by a long discussion session on several issues and concerns related to water problems in Delhi. Some of the important questions, comments and suggestions were:

  • How authentic is the data that DJB has?
    Answer: Doubtful.
  • What’s the organisational structure of DJB, and its relationship with the MCD?
    Answer: DJB is a state body, independent from MCD
  • There is no data on bulk supply to colonies
  • Very little end user data. Lack of meter reading and averaged bills are part of the reasons for this problem
  • There is no way to interconnect supply between localities
  • No data on quality of water
  • Dr. Rajinder Kaur spoke about using existing spatial data maps of NCT/NCR, and the research conducted by the Indian Agricultural Research Institute on using spatial data for classifying ground water depth and quality
  • Dr. Renu Khosla, from Centre for Urban and Regional Excellence spoke about using GIS data for slums
  • Mr. Kapil Mishra, the Vice-Chairman of the DJB, spoke at length about how they plan to transform the DJB. He also promised full data sharing from DJB’s side.

After a general discussion on various water related issues in Delhi, in the last segment we focused on framing the data problems associated with SAP 2015.

  • Need to think about the water data which already exists.
  • Sundeep will put up a list of people and organisations which have data on water in Delhi
  • A suggestion was made to focus on Gram Sabha level data as well
  • Need to prioritize the issue of water access to all, the missing data on ‘access’ related problems and appropriate mechanisms
  • Some private bodies have been collecting data from GPRS meters; let’s try to open this data
  • Need to map the borewells
  • Need to interpret and understand the data which DJB requires.

To Do list

  • We will list out all data sets that have informed the SAP document [Time: 2 weeks]
  • Sundeep will share the data sets already available from DJB
  • Once the list of data sets is prepared by us, it will be submitted to DJB via Sundeep and Kapil, who will then see if the mentioned data sets can be opened up. [Approximate time: 1 month]
  • Once these data sets are available, we will evaluate the quality of these data sets
  • Identify data sets that are missing and the ones that require a better collection process


We are using a Google spreadsheet to list out all data sets that informed the Summer Action Plan.

We are using HackPad to collect various resources.

Images of notes taken by Namrata Mehta and Sumandro Chattapadhyay at the meeting: