
Olacabs at GeoBLR

Last week, we gathered at the Paradigm Shift cafe in Koramangala to learn about the location data infrastructure at Olacabs.com. The meetup was particularly interesting in light of Ola’s recent move to add autorickshaws to their offering. Location is at the center of Ola’s business.

Vijayaraghavan Amirisetty, Director of Engineering at Olacabs, introduced how they collect data in real time from cars fitted with smartphones. With over a lakh vehicles online at any given time, Ola’s primary challenge is to build an infrastructure that allocates taxis to customers quickly and reliably. Vijay highlighted some of the issues around collecting location data via GPS and cell networks. Even though both technologies have matured since their inception, they are highly unreliable in various scenarios. Ola uses a combination of algorithms to build a reliable layer over GPS and the network. One thing to note is that the smartphones vary widely in quality, and the system needs to work regardless.


Even though Ola uses Google Play services as its location aggregator, the network is a bigger challenge in India: quality varies from city to city, and even reception within a city is unpredictable. Ola falls back to SMS, the driver’s phone and a set of offline algorithms if the network is unavailable. Ola’s infrastructure is built using technologies like MongoDB, MySQL, Cassandra, Redis and Elasticsearch. They are also exploring WebSockets and an experimental custom Android mod.
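The talk did not go into the specifics of Ola’s algorithms, so purely as an illustration (not Ola’s actual method), here is a minimal Python sketch of two common tricks for taming noisy GPS fixes: dropping physically implausible jumps and smoothing the rest with an exponential moving average. All names and thresholds below are assumptions.

```python
# Illustrative sketch only -- not Ola's actual algorithm.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in metres."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def smooth_track(fixes, max_speed_mps=40.0, alpha=0.3):
    """fixes: list of (timestamp_s, lat, lon). Returns a cleaned, smoothed track."""
    smoothed = []
    for t, lat, lon in fixes:
        if smoothed:
            pt, plat, plon = smoothed[-1]
            dt = max(t - pt, 1e-3)
            if haversine_m(plat, plon, lat, lon) / dt > max_speed_mps:
                continue  # implausible jump for a city vehicle; drop the fix
            # exponential moving average pulls the new fix towards the track
            lat = alpha * lat + (1 - alpha) * plat
            lon = alpha * lon + (1 - alpha) * plon
        smoothed.append((t, lat, lon))
    return smoothed
```

Real systems layer far more on top of this (map matching, cell-tower hints, per-device calibration), but the sketch shows why a “reliable layer over GPS” is mostly filtering and fusion rather than trusting raw fixes.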

There was a lot of feedback from the audience, specifically around why it is difficult for drivers to locate the customer. Driver training is not an easy task – there are a lot of logistical and operational challenges. Vijay emphasised the amount of work Ola does to improve the drivers’ experience with the whole process of on-boarding their cars.

Everything at Ola is real-time – why would anyone book an auto through Ola if they could just walk out and hail one in less than a minute? They are continuing to improve and innovate to revolutionize transportation in Indian cities.

Autorickshaw photo CC 2.0 Spiros Vathis

{Delhi} Jan 28th Planning Meeting

Delhi DataMeet – Jan 28th


On the 28th of January, Delhi DataMeet met to discuss its plans for the next year and to pick new organizers, since Sumandro is leaving Delhi.

  • Nisha summarized what was happening with DataMeet central and the other chapters.
  • Amitangshu – when we think of DataMeet, we think about it through the sector we work in. How does that translate to the overall goal and become a common idea for all of us?
  • Updates on NIC
    • The NDSAP Cell’s new head is BN Satyapati.
    • The Data.gov.in team is also managing MyGov.in – resources have been moved to MyGov – can we push MyGov to do open data things? People have to suggest open data activities on MyGov.in. Is that something we can do?

We started out with introductions and with the following questions.

  1. What should be the purpose of DataMeet Delhi?
  2. How can DataMeet Delhi add value to your own work?
  3. What activities/events should DataMeet Delhi do during the next year?

To answer these questions, post-it notes were passed out and people were asked what they thought the purpose of DataMeet Delhi should be and what activities they wanted to do in the following year.


Purpose/Value

  1. awareness and knowledge about the global movement of open data – nuances/policy/politics
  2. open data advocacy and intervention in policy decision-making (think about other policies and portals in other ministries; if you get buy-in, it can move forward) – workshops, [bridging to OSM – Satya] – advocating with various public officials is an important thing to do
  3. DataMeet has members with excellent data skills – students want to learn data tools – people are interested in teaching and students in learning more about data skills – help people use and learn data tools – help students – basic and small workshops – Ravi
    1. working with students forces more due diligence – do it with an outside audience
    2. hold them at North and South Campus – in a college
    3. results-focused workshops with journalists – with practical ends – training
    4. Student workshops – Ravi, Krishnan, Nisha, and Guneet will work together to plan a few for March and April
  4. open community, talks, organization, regular talks, conference
  5. act like a platform – teaching, learning, and doing
  6. talks, half-day and one-day hacks, areas to work on, data entrepreneurship, public policy (budgets, state and local) – WASH, tie-ups with startups – startups and the private sector come in and talk about their issues and how they use data – startup tie-ins help with advocacy – lobbying/legal/political advisories – open source data conversation
    1. let’s think about specific audiences; there are so many catalogues – list all the catalogues of different sources and mash up data
    2. use government data, collect public data and speak with them about publishing data, data tools service
  7. team up with not-for-profits and help them solve their problems – Breakthrough – women’s group – support them – Ashoka, IDRC – go through them and see who needs help – later in the year – happy to solve problems – take them up as they are challenging, etc.
    1. Non-profits submit problems and we can take them up as they come up and make that a focus of the DataMeet
    2. helping NGOs – get on board with what they want to achieve – figure out the larger things
  8. data problems – data successes – inspire, learn and connect
  9. work more with startups – spread open data ideas
    1. side meetings – working with government, NGOs and startups – Sumandro, Raman
  10. work with LAMP Fellows and Prime Minister’s Rural Fellows
    1. Government – we should be willing to help the government as well – government isn’t easy to work with and is time-intensive

TEACHING, LEARNING, DOING – the above purposes and values can be categorized under this mantra.

  • teaching students
  • doing advocacy and solving problems for NGOs
  • compiling case studies
  • give rewards – push it to happen
  • make sure DataMeet gets credited
  • open data success stories

CALENDAR 

Feb –

    • Gurgaon budget data – Namrehta – interested in knowing more about what is happening – ongoing – state PRS people – ADR – state chapter – city budget – ties into the overall state budget – income into the city from the state and vice versa – long campaign – filing a PIL – ongoing – do a non-profit learning/training thing
    • Assembly election in Feb

March

  • budget data – Ramanjit Cheema – bring groups who work on budgets together – AI, CBGA
  • “Out in the Open” themed Pecha Kucha

June

  • capacity building with civil society groups
  • session of learning – need to demystify the tech-heavy agenda of the group

July

  • Akvo event – water and sanitation – a manifesto on what the data gaps are – bring them together – talk about data, sharing and advocacy – formats – India WASH Forum

Oct

  • Public transport data – Guneet, Namrehta

Nov

  • Open Data Camp

Dec

  • Malnutrition

New Organizers of Delhi DataMeet!!!

Guneet, Isha, and Prachi!!!

Thanks to everyone for a great meeting! If I missed anything please add to the comments!

 

Open Transit Data for India

(Suvajit is a member of DataMeet’s Transportation working group; along with Srinivas Kodali, he is working on how to make more transit-related data available.)

Mobility is one of the fundamental needs of humanity, and mobility by a shared mode of transport is undoubtedly the best option from all quarters – social, economic and environmental. The key to an effective shared mode of transport (termed Public Transport) is information. In Indian cities, lack of information has been cited as the primary reason people are deterred from using Public Transport.

Transport agencies are commissioning Intelligent Transport Systems (ITS) in various modes and capacities to make their systems better and to meet new transport challenges. Vehicle tracking systems, electronic ticketing machines, and planning & scheduling software are all engines of data creation. On the other side, the advent of smart mobile devices in everyone’s hands is bringing new opportunities to make people much more information-reliant.

But the demand for transit data is remarkably low. Transit users, and even transit data users like city planners, should be demanding it.
The demand for Public Transport data in India should cover the following aspects:

A. Availability
To make the operational and infrastructure data of transport operators easily available as information to passengers, in a well-defined order, so they can plan their trips using the available modes of Public Transport.

B. Interoperability
To make transit data provided by multiple agencies for different modes (bus, metro, rail) usable together and make multi-modal trip planning possible.

C. Usability
To publish transit-oriented data in a standard exchange format across agencies, at regular frequencies, to provide comprehensive, accurate and up-to-date data for study, research, analysis, planning and system development.

D. Standardisation
To make publishing data in a standard format and at a standard frequency part of the Passenger Charter of transport operators. This can also serve as a guideline for transport operators while commissioning any system like a vehicle tracking system, ITS, a passenger information system, a website etc.

What kind of transit data is needed?

  • Service Planning data

This comprises data on bus stops, stations, routes, geographic alignments, timetables and fare charts. With this dataset, general information on the transit service can easily be gathered to plan a journey. Trip-planning mobile apps, portals etc. can consume this data to provide ready and usable information for commuters (see the sketch after this list).

  • Real time data

A commuter is driven by a lot of anxieties when they depend on public transport. Some common queries: “When will the bus arrive?”, “Where is my bus now?”, “Will I get a seat on the bus?”, “I hope the bus has not been diverted and will still serve my stop.”

All of these queries can be answered with real-time data such as estimated time of arrival (ETA), vehicle position, occupancy level, and alert and diversion messages. Transport operators equipped with tracking systems should be able to provide this data.

  • Operational & Statistical Data

A transport operator’s operational data comprises ticket sales and data on operational infrastructure and resources like depots, buses, crew, workshops etc. As operators move towards managing these data digitally, publishing them at regular intervals also becomes a good option.

A general commuter might not be interested in this data, but it will be very useful for city planners to analyse commuting trends in the city and make informed decisions. City transport infrastructure can then be planned around transit needs and demands.
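The post does not prescribe a specific exchange format, but GTFS (General Transit Feed Specification) is the de facto standard for the service-planning data described above, and GTFS-realtime covers ETAs and vehicle positions. As a rough illustration only (the feed path and usage are assumptions), here is how little Python it takes to consume a GTFS stops.txt file:

```python
import csv

def load_stops(path):
    """Return {stop_id: (name, lat, lon)} from a GTFS stops.txt file."""
    with open(path, newline="", encoding="utf-8") as f:
        return {
            row["stop_id"]: (
                row["stop_name"],
                float(row["stop_lat"]),
                float(row["stop_lon"]),
            )
            for row in csv.DictReader(f)
        }

if __name__ == "__main__":
    # Hypothetical path to an unzipped GTFS feed published by an operator.
    stops = load_stops("gtfs_feed/stops.txt")
    print(len(stops), "stops loaded")
```

Once agencies publish in a standard like this, the same loader works for any operator’s feed, which is exactly the interoperability argument made above.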

The transport agency can benefit greatly by demonstrating accountability and transparency. It can lift its image as a committed service provider, thereby gaining passengers for its service.

Together, it will make for a thriving landscape if the data creators of Public Transport in India provide their data in the open, where it can be consumed by a larger set of people to build platforms, applications and solutions for transport study, analysis and planning across different sections of users.

Open Transit Data is the tipping point for Smart Mobility in India.

That is why we have started putting our thoughts together and have begun writing an Open Transport Data Manifesto.

GeoBLR in 2015 – Mapping Unmapped Places!

Dholera, Ahmedabad

To kick things off in 2015, we met at the offices of the Centre for Internet and Society (CIS), Bengaluru, to map the unmapped and less-mapped settlements along the proposed Delhi Mumbai Industrial Corridor (DMIC) project. The DMIC, a 1,483 km-long development corridor spanning several states in northern and western India, has been attracting a lot of curiosity and criticism from national and international participants and observers. At completion, the project will have built a dedicated freight corridor, several industrial and logistics hubs, and smart cities. The project is structured to be constructed in phases. The pilot project for an integrated smart city, the Dholera Special Investment Region (SIR), is underway.


The quality of mapping in many regions relies on a very active mapping community, or strong interest from collectives and local networks. We think it is important, regardless, to map the assets that already exist around the proposed development sites. With this in mind, we decided to take a look at the areas earmarked for Dholera SIR (Gujarat), Shendra (Maharashtra), Mhow (Madhya Pradesh), and Dadri/Greater Noida (NCR). The evening began with Tejas introducing the DMIC project, the scale of new development, and the need to capture these changes on OpenStreetMap (OSM) for years to come. Sajjad provided a rapid tutorial on signing up for OSM and using the browser-based map editor. The party was attended by guests at CIS as well as remote participants from Bangalore and Dharamsala.


As the party progressed, several guests ended up mapping roads, buildings and water bodies in the Dholera region. Others chose to similarly map Shendra and Dadri.

OpenDataCamp Delhi 2014 in Tweets


https://twitter.com/ajantriks/status/533225676774449152


https://twitter.com/Sramach9/status/533465685624913920


https://twitter.com/Shobha_SV/status/533473456147664896


https://twitter.com/Sramach9/status/533475979570585600


https://twitter.com/Shobha_SV/status/533478678382919682


https://twitter.com/Shobha_SV/status/533479268425023488


https://twitter.com/Shobha_SV/status/533479741232119808


https://twitter.com/Shobha_SV/status/533480948361228290
https://twitter.com/Shobha_SV/status/533484852734349312


https://twitter.com/Sramach9/status/533491064523743232
https://twitter.com/ZahirKoradia/status/533491133335105536


https://twitter.com/Sramach9/status/533501991201153026


https://twitter.com/Shobha_SV/status/533504601379442688


https://twitter.com/Shobha_SV/status/533505132206370817


https://twitter.com/Shobha_SV/status/533506490640777218


https://twitter.com/Shobha_SV/status/533507470870589441


https://twitter.com/Shobha_SV/status/533508246233829376


https://twitter.com/Shobha_SV/status/533509041427722242


https://twitter.com/Shobha_SV/status/533511798700654593


https://twitter.com/Sramach9/status/533513382054199296


https://twitter.com/ZahirKoradia/status/533517016200904705


https://twitter.com/Sreechand/status/533522447799042049


https://twitter.com/Shobha_SV/status/533523564872220672
https://twitter.com/Shobha_SV/status/533524293879988225


https://twitter.com/mtwestra/status/533526864233771009


https://twitter.com/ZahirKoradia/status/533528993450844161


https://twitter.com/ysprem/status/533530134859374593
https://twitter.com/Sramach9/status/533549017167179776
https://twitter.com/Sramach9/status/533549424761245696


https://twitter.com/Sramach9/status/533557403997200384


https://twitter.com/ayushkray/status/533585062512443392


https://twitter.com/rohithjyo/status/533628393678319616

Open Data India Watch – 16

Stories

  • SoilGrids1km is a collection of updatable soil property and class maps of the world at a relatively coarse resolution of 1 km, produced using state-of-the-art model-based statistical methods: 3D regression with splines for continuous soil properties and multinomial logistic regression for soil classes. It is a global soil information system based on automated mapping.

Tech

Events

Rebuilding the Karnataka Learning Partnership Platform

The Karnataka Learning Partnership recently launched a new version of their platform. This post talks about why they built it, along with some of its features and details. It is cross-posted from their blog.

Over the past five months we have been busy rearchitecting our infrastructure at Karnataka Learning Partnership. Today, we are launching the beta version of the website and the API that powers most of it. There are still a few rough edges and incomplete features, but we think it is important to release early and get your feedback. We wanted to write this blog post along with the release to give you an overview of what has changed and some of the details of why we think this is a better way of doing it.

Data

We have a semi-federated database architecture. There is data from Akshara, Akshaya Patra, DISE and other partners; geographic data, aggregations and metadata to help make sense of a lot of this. In our experience, PostgreSQL is perhaps the most versatile open-source database management system out there, especially when we have large amounts of geographic data. As part of this rewrite, we upgraded to PostgreSQL 9.3, which means better performance and new features.

Writing a web application that reads from multiple databases can be a difficult task. The trick is to make sure there is the right amount of cohesiveness. We are using materialized views in PostgreSQL. A materialized view is a database object that stores the result of a query in an on-disk table structure. They can be indexed separately and offer higher performance and flexibility compared to ordinary database views. We bring the data from multiple databases together using materialized views and refresh them periodically.
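As a minimal sketch of the idea (the view, table and column names below are hypothetical, not KLP’s actual schema), a materialized view that aggregates per-district school data can be created from a Django migration and refreshed on a schedule:

```python
# Hypothetical Django migration; not KLP's actual code or schema.
from django.db import migrations

CREATE_SQL = """
CREATE MATERIALIZED VIEW mvw_district_aggregates AS
SELECT district_id,
       COUNT(*)          AS num_schools,
       SUM(num_students) AS total_students
FROM schools_school
GROUP BY district_id;
"""

DROP_SQL = "DROP MATERIALIZED VIEW IF EXISTS mvw_district_aggregates;"

class Migration(migrations.Migration):
    dependencies = [("schools", "0001_initial")]  # hypothetical app/migration
    operations = [migrations.RunSQL(CREATE_SQL, reverse_sql=DROP_SQL)]
```

The periodic refresh is then a single statement – `REFRESH MATERIALIZED VIEW mvw_district_aggregates;` – run from a cron job or a management command, after which the web application reads the pre-aggregated table like any other.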

We have a few new datasets – MP/MLA geographic boundaries, PIN code boundaries and aggregations of various parameters for schools.

API

The majority of the effort during the rewrite went into the API, user interface and experience. We started by writing down some background. The exhaustive list of things the API can do is here.

We have a fairly strong Python background and it has proven to be sustainable at many levels. Considering the skill-sets of our team and our preference for readable, maintainable code, Django was an obvious choice as our back-end framework. Django is a popular web development framework for Python.

Since we were building a fairly extensive API including user authentication, etc., we quickly realized that it would be useful to use one of the many API frameworks built on top of Django. After some experimentation with a few different frameworks, we settled on Django REST Framework. Our aim was to build on a clean, RESTful API design, and the paradigms offered by REST Framework suited that perfectly. There was a bit of a learning curve to get used to concepts like Serializers, API Views, etc. that REST Framework provides, but we feel it has allowed us to accomplish a lot of complex behaviours while maintaining a clean, modular, readable codebase.
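To make the Serializer / API View pattern concrete, here is a minimal sketch of a read-only endpoint in Django REST Framework. The School model, its fields and the URL are hypothetical stand-ins, not KLP’s actual code:

```python
# Hypothetical example, not KLP's actual code.
from rest_framework import generics, serializers

from schools.models import School  # hypothetical app and model


class SchoolSerializer(serializers.ModelSerializer):
    """Converts School instances to and from JSON for the API."""

    class Meta:
        model = School
        fields = ("id", "name", "district", "num_students")


class SchoolListView(generics.ListAPIView):
    """Read-only list endpoint, e.g. GET /api/schools/ returns paginated JSON."""

    queryset = School.objects.all()
    serializer_class = SchoolSerializer
```

Wired up with one urls.py entry pointing at `SchoolListView.as_view()`, REST Framework handles serialization, pagination and content negotiation, which is why the pattern keeps the codebase small.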

Design

For our front-end, we worked with the awesome folks at Uncommon, who provided us with gorgeous templates. After lengthy discussions and evaluating various front-end frameworks, we felt none of them quite suited what we were doing and involved too much overhead. Most front-end frameworks are geared toward making single-page apps, and while each of our individual pages has a fair amount of complexity, we did not want to convert everything into a giant single-page app, as our experience has shown that can quickly lead to spiralling complexity, regardless of the framework one uses.

We decided to keep things simple and use basic modular JavaScript concepts and techniques to provide a wrapper around the templates that Uncommon had provided and talk to our API to get and post data. This worked out pretty well, allowing us to keep various modules separated, re-use code provided by the design team as much as possible, and avoid spending additional hours and days fighting to fit our code into the conventions of a framework.

All code, design and architecture decisions are in the open, much like how the rest of our organisation works. You can see the code and the activity log in our GitHub account.

Features

For the most part, this beta release attempts to duplicate what we had in v10.0 of the KLP website. However, there are a few new features, a few features that have not yet made it through, and a number of features and improvements due in future revisions.

Aside from the API, there are a few important new features worth exploring:

  1. The compare feature available at the school and pre-school level. This allows you to compare any two schools or pre-schools.

    1. Planned Improvements: The ability to compare at any and all levels of the hierarchy – a block to a block, or even a block to a district, etc.

  2. The volunteer feature allows partner organisations to post volunteer opportunities and events at schools and pre-schools. It also allows users to sign up for such events.

    1. Planned Improvements: Richer volunteer and organisation profiles and social sharing options.

  3. The search box on the map now searches through school names, hierarchy (district, block etc.) names, elected representative constituency names and PIN Codes.

    1. Planned Improvements: To add neighbourhood and name based location search.

  4. An all new map page powered by our own tile server.

  5. Our raw data page is now powered by APIs and the data is always current, unlike our previous version, which had static CSV files.

    1. Planned Improvements: To add timestamps to the files and to provide more data sources for download.

Now that we have a fairly stable new code base for the KLP website, there are a few features from the old site that we still need to add:

  1. Assessment data and visualisations of class, school and hierarchy performance in learning assessments need to be added. The reason we have chosen not to add them just yet is that we are modifying our assessment analysis and visualisation methodology to be simpler to understand.

  2. Detail pages for higher levels of aggregation – like a cluster, block and district with information aggregated to that level.

  3. A refresh of the KLP database to bring it up to date with the current academic year. All three of these have not been done for the same reason: they require an exhaustive refactor of the existing database to support the new assessment schemas and the aggregation and comparison logic.

 

Aside from the three above, we have a few more features that have been designed and written but did not make it into the current release.

  1. Like the volunteer workflow, we have a donation workflow that allows partner organisations to post donation requirements on behalf of the schools and pre-schools they work with – for things these schools and pre-schools require, and other in-kind donations. For example, a school might want to set up a computer lab and needs a number of individual items to make it happen. Users can choose to donate either the entire lab or individual items, and the partner organisation will help deal with the logistics of the donation.

 

Our next release is due mid-October, to include the volunteer workflow and squish bugs. After that, we will have a major release in mid-January with the refactored databases, all of the changes that enables, and all the planned improvements listed above. And yes, we do have a mobile application on our minds too.

The DISE application will be updated with the current year’s data by November. We will also add the ability to compare any two schools or hierarchies by December.

So that’s where we are, four years on. The KLP model continues to grow and we now believe we have a robust base on which to build rapidly and deploy continuously.

For the record, this is version 11. 🙂

Meet a DMer: Siddharth Desai


Meet a DMer.

On the DataMeet list we have started referring to each other as DMers. So I wanted to start highlighting people who are pretty interesting and have great insights into open data.

Siddharth Desai is one of our super volunteers; he is steadfast in his commitment to helping out with Open Data Camps and coming to any event in Bangalore that he can. I was really happy to interview him and learn why open data is such an interest of his.

Where are you from? What do you do?

I am from a town in Goa called Vasco da Gama. I moved to Bangalore 10 years ago for professional reasons. Currently, I am working as a Software Architect with Nokia (formerly NSN). My job involves building solutions in the telecom domain. I do quite a bit of data analysis and visualization as part of my work. The data involved is mostly engineering- and planning-related.

How did you find out about DataMeet?

I have been following the Open Data Movement for some time now. I realized there were some interesting things happening here in India when I saw the event notification for the first Open Data Camp in 2012. That’s when I heard about the DataMeet and have been on the list ever since.

Do you believe in open data? and why?

I believe in open data. It’s simply a great leveler. For most of human history, the masses have been fooled and controlled because they didn’t have access to information that a select few did. Then along came Gutenberg, who invented the printing press. Suddenly, knowledge could get out of the confines of a few and into the hands of many. And that empowered people and eventually led to greater equity.

The Internet and Wikipedia have done something similar in our times. The Open Data movement is another (huge) step forward in putting an end to all unnecessary information asymmetry.

What do you hope to learn? Contribute?

As part of my work, I have acquired the skills for making sense of complex data sets. I am hoping to put those skills to good use by contributing to any initiative that requires support.

Every time I am at a DataMeet or data camp, I get to learn so much about life – about challenges in different non-technical areas of data, like the social and political contexts around data and information.

What is your impression of the datameet community?

Where else do people from such diverse backgrounds meet? We have academics and hackers, NGOs and bureaucrats, journalists and businessmen, designers and more. With such an impressive line-up, there is huge potential to make an impact.

What kind of civic projects do you work on? What kinds of civic projects are you interested in working on?

Really, anything that does good. In particular, if anyone has any ideas in the medical or healthcare space, I’d be glad to join. I’ve noticed, during various illnesses in the family, that a lot of information on treatment efficacy, side effects and doctor/hospital failures is shrouded in secrecy. This really needs to be openly available to all for closer scrutiny.

Share a visualization that you saw recently that made a big impression? Share an article you have read recently that made a big impression? (does not have to be data related)

There is this visualization by David McCandless that I love (partly because I enjoy sci-fi a lot). It visualizes time travel in popular films and TV series. The approach to displaying a non-linear timeline is pretty creative.

Tool Review: WebScraper

Usually when I have any scraping to do, I ask Thej if he can do it and then take a nap. However, Thej is on vacation, so I was stuck either waiting for him to come back or trying to do it myself. It was basic text, not much HTML, no images, and only a few pages, so I went for it with some non-coder tools.

I checked the School of Data scraping section for some tools, and they have a nice little section on using browser-based scraping tools. I did a Chrome Web Store search and came across WebScraper.

I glanced through the video, sort of paying attention, got the gist of it, and started to play with the tool. It took a while for me to figure out. I highly recommend going through the tutorials very carefully. The videos take you through the process but are not very clear for complete newbies like me, so it took a few views to understand the hierarchy concept and how to adapt their example to the site I was scraping.

I got the hang of doing one page and then figured out how to tell it to go to another page; again, I had to spend quite a bit of time rewatching the tutorial.

At the end of the day, I got the data into neat columns in a CSV without too much trouble. I would recommend WebScraper for people who want to do some basic scraping.

It is as visual as you can get, though the terminology is still very technical. You have to go into the developer tools panel, which can feel intimidating, but it is ultimately satisfying in the end.

Though I’ll probably still call Thej.