
To Hack or Not to Hack…

Hackathons are a source of confusion and frustration for us. DataMeet deliberately does not run them unless there is a very specific outcome the community wants, like freeing a whole dataset or introducing open data to a new audience. We feel that they cause burnout, are not productive, and in general don’t help create a healthy community of civic tech and open data enthusiasts.

That is not to say we feel others shouldn’t do them. They are very good opportunities to spark discussion and introduce new audiences to problems in the social sector. DataKind, RHOK and numerous others regularly host hackathons, or variations of them, to stir the pot and bring new people into civic tech, and they can be successful starts to long-term connections and experiments. A lot of people in the DataMeet community participate in and enjoy hackathons.

However, with great data access comes great responsibility. When a dataset is opened, we always want to make sure that even if no output is achieved, at least no harm is done.

Last October an open data hackathon, Urban Hack, run by Hacker Earth, NASSCOM, XEROX, IBM and World Resource Institute India, set out to bring out open data and spark innovation in the transport and crime space by making datasets from Bangalore Metropolitan Transport Corporation (BMTC) and the Bangalore City Police available to work with. A DataMeet member, Srinivas Kodali, was participating. He is a huge transport data enthusiast and wanted to take a look at what was being made available.

In the morning, shortly after the event started, I received a call from him saying that a dataset had been made available that seemed to violate privacy and data security. We contacted the organizers and they took it down. Later we realized that it was quite a sensitive dataset and that a few hundred people had already downloaded it. We were also distressed that they had not clarified the ownership or license of the data, and had linked to sources like Open Bangalore without specifying their licensing, which violated the license.

The organizers were well known and had been involved with hackathons before, so it was a little distressing to see these mistakes being made. We were concerned that the government partners, who had not participated in these types of events before, were also being exposed to poor practices. As smart cities initiatives take over the Indian urban space, we began to realize that this is a mistake that shouldn’t happen again.

Along with the Centre for Internet and Society and Random Hacks of Kindness, we sent the organizers, the Bangalore City Police and BMTC a letter about the breach in protocol. We wanted to make sure everyone was aware of the issues and that measures were taken to avoid repeating these mistakes.

You can see the letter here:

We are very proud of the DataMeet community and Srinivas for bringing this violation to the attention of the organizers. As people who participate in hackathons and other data events, it is imperative that we keep privacy and security in mind at all times. In a space like India, where a lot of these concepts are new to institutions like the Government, it is essential that we always use these opportunities not only to showcase the power of open data but also to demonstrate good practices for protecting privacy and ensuring security.

Mumbai Meet 6: Data Science Hackathon

DataMeet 6 was a two-day data science hackathon organised by a BFSI company, Zone Startups and DataMeet Mumbai. The hackathon took place in the Bombay Stock Exchange Building at the Zone Startups office. Twelve teams participated. These included teams of young data enthusiasts and specialist data science teams from companies like TCS and Housing.com.

The BFSI company opened up 80GB of its real transactional data in a secure environment to the participating data enthusiasts.

The teams were expected to analyze the data and draw out insights relevant to their use case scenarios, such as Health Bankruptcy, or to pull out a hidden trend unknown to the BFSI company. Teams were free to use any tool of their choice, such as R, Python or Tableau.

Each team was given its own secure Oracle DB connection, through which it could query the data but not download it. The Oracle DB connections were open only to the static IPs of the Zone Startups office, and traffic to and from the servers was monitored to guard against downloading of the data.
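For readers curious what restricting access to a venue's static IPs can look like in practice, here is a minimal Python sketch of an allowlist check. It is not the setup used at the hackathon, which the organisers did not publish. The addresses are placeholder documentation ranges and the function name `is_allowed` is purely illustrative.

```python
import ipaddress

# Hypothetical allowlist: documentation-only addresses (RFC 5737) standing in
# for the venue's static IPs, which the post does not list.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.10/32"),
    ipaddress.ip_network("203.0.113.11/32"),
]


def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allowlisted network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed input is rejected rather than raising an error.
        return False
    return any(address in network for network in ALLOWED_NETWORKS)
```

In a real deployment this kind of rule would normally be enforced at the firewall or database listener rather than in application code. The sketch only illustrates the idea of accepting connections from a fixed set of addresses.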

Day 1

The day started with the teams studying the raw data, the tables and the meaning of the columns. Representatives from the BFSI company also gave a briefing on the objectives.


Day 2

Many of the young teams did not turn up on Day 2 because of the complexity of the problem. At the end of Day 2, the judges from the BFSI company evaluated each team’s progress and gave feedback and suggestions.
