Hong Kong’s smart city goals thwarted by lack of access to open, user-friendly data
By Rivers Zhang, Brianna To
Anyone who has tried to work out how to get to a new destination will realise there is no single smartphone application in Hong Kong that will tell you which public transport to take, how long the journey will take and how much it will cost. Despite all the talk about developing Hong Kong into a smart city, the lack of easily accessible and usable open data remains an impediment to that goal.
Charles Mok, the legislative councillor representing the IT functional constituency, says the government should be leading the way when it comes to promoting and providing open data. “Usually the first step [to open up data] should be done by the government, because they possess the largest amount of data,” says Mok.
Mok says that the government has the most data because it is also the major service provider and is responsible for oversight of data sources such as the Hong Kong Observatory and various public transport franchises and operations.
Four years ago, the government took the initiative to open up data sets through the platform Data One, which has now been renamed data.gov.hk. But Mok says the data sets on the site are neither sufficient nor user-friendly.
According to the Open Data Handbook published by Open Knowledge International, open data should be free for anyone to use, re-use and redistribute. The format of data sets is also important: the most significant feature of open data is that the data set be “machine readable”.
Mart Van de Ven is a co-founder of Open Data Hong Kong, a group of volunteers who advocate for greater availability and quality of open data. Van de Ven explains that in order to be machine-readable, data should be in a format which can be understood and processed by a computer. By contrast, if data is presented in PDF format, for example, it first has to be extracted before it can be used. This is costly and time-consuming.
Van de Ven says the government’s annual budget is an example of a document which is in PDF format and is not machine readable. “You can imagine a government budget, if the table has many different columns and many different rows, to get the right level in the right row becomes very problematic,” he says.
As the conversion process is complex and errors are inevitably made during the process, Van de Ven adds, “you will lose confidence about [whether] the data you use is correct”.
Van de Ven says some of the preferred ways of providing data are XML, CSV and Excel files, and application programming interfaces (APIs). Data provided in these ways can be used in different programming systems without conversion. On the other hand, PDFs and even pictures produced by scanning hard copies present problems to those who want to use the data because they require conversion.
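To illustrate what “machine readable” means in practice, here is a minimal sketch in Python, assuming a budget table were published as CSV. The department names, column names and figures are invented for illustration and do not come from any real government data set.

```python
import csv
import io

# Hypothetical budget figures, invented purely for illustration.
BUDGET_CSV = """department,year,amount_hkd_million
Education,2016,100
Health,2016,80
Education,2017,110
"""


def total_by_department(csv_text):
    """Sum the amount column for each department in the CSV text."""
    totals = {}
    # DictReader uses the header row to label every value, so each
    # figure stays attached to the right row and column.
    for row in csv.DictReader(io.StringIO(csv_text)):
        dept = row["department"]
        totals[dept] = totals.get(dept, 0) + int(row["amount_hkd_million"])
    return totals


totals = total_by_department(BUDGET_CSV)
print(totals)
```

Because the file itself labels its rows and columns, a few lines of code are enough to read it. The same table in a PDF would first need an extraction step, which is where misaligned rows and columns tend to creep in.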
He also emphasises the importance of obtaining raw data. Raw data refers to data that has not been converted or processed after it is collected. Processed data can refer to data that has undergone calculations such as summarising and averaging. Taking housing prices in Hong Kong as an example, Van de Ven says web and software developers can often only get processed data sets. That is, they usually get summaries of housing prices in an area that can be as big as Kowloon, the New Territories or Hong Kong Island.
“Ideally, what we want to have is, this house sold for this much on this date and this address, because then you have thousands of data points, and you can make your model, and you can do your own analysis.”
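The difference between raw and processed data can be sketched in a few lines of Python. The sale records below are hypothetical, invented only to show the idea, and the field names are assumptions rather than the layout of any real data set.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw sale records (prices in HK$ million), for illustration only.
SALES = [
    {"date": "2017-01-05", "district": "Sha Tin", "region": "New Territories", "price": 6.0},
    {"date": "2017-01-12", "district": "Sha Tin", "region": "New Territories", "price": 8.0},
    {"date": "2017-01-20", "district": "Tai Po", "region": "New Territories", "price": 5.0},
    {"date": "2017-01-22", "district": "Mong Kok", "region": "Kowloon", "price": 7.0},
]


def average_price(records, key):
    """Average sale price grouped by any field present in the raw records."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["price"])
    return {name: mean(prices) for name, prices in groups.items()}


# With raw records, the analyst chooses the level of detail.
by_region = average_price(SALES, "region")
by_district = average_price(SALES, "district")
print(by_region)
print(by_district)
```

With the individual records, averages can be computed by region, by district or by any other field. If only the regional averages were published, the district-level picture could not be recovered from them.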
Unfortunately, many developers are unable to access the data sets they want. Cyrus Wong Chun-yin is an advisor for data-hk.com, a cloud service that serves as a database to store open data. He is also the developer of an online self-diagnostic medical tool called AWS Cloud Virtual Smart Doctor (Dr Cloud). Wong and his collaborators created Dr Cloud to make simple assessments of patients’ health through facial analysis, and information provided through linked medical instruments such as thermometers, glucose and cholesterol monitors and blood pressure monitors.
Analysis of the patient’s information is based on medical school training scripts and on information from medical data sets. However, the biggest flaw of the system is that it is inapplicable in Hong Kong. Wong says that despite their best efforts, he and his team cannot obtain suitable data sets from the Hospital Authority.
“The system now works well to diagnose foreign patients but not Chinese ones, because the units of the height and weight [used in the foreign data] are totally different,” he says.
Wong and his associates were so frustrated by delays and their inability to get suitable data that they have given up on trying to get it for now.
Gene Soo, the general manager of Citymapper in Hong Kong, has also encountered difficulties in getting data. Citymapper is a well-known urban navigation app that provides information on how to get around in 39 cities around the world. Soo says it has been difficult to run the app in Hong Kong due to incomplete data.
He takes bus fares as an example. On the government website, only fares from the terminals are shown, but not from different stops. Individual bus companies do post detailed prices but they are in PDF format. Unlike the Singapore and UK versions of the app, the column for estimated fares is left blank in the Hong Kong version.
Another difficulty is the lack of access to real-time data. Soo says that so far, Hong Kong Tramways is the first and only public transport operator in Hong Kong to release official real-time data to them. Real-time data refers to data that is collected and released instantly. The application of real-time data in Citymapper would help users calculate how long they will need to complete their journey, as well as when the next bus, train or tram will arrive at a particular stop.
Soo says real-time data would have been very useful during the Tsim Sha Tsui station MTR fire incident on February 10. The station had to be closed, severely affecting services, but because there was no open real-time data, the Citymapper app was unable to respond to the incident effectively.
“We have to manually put [information] into our system, that some problems occurred on the Tsuen Wan line, the train is running with delays, and, because there’s no real-time arrival data, we cannot show the train is running slower to another stop,” says Soo.
He adds that the release of real-time data would also help to prevent overcrowding when such incidents occur because people would be able to check transport apps for alternative routes and receive updates about the delays. At the moment, Soo says, Citymapper has to rely on updates from an unofficial MTR page to get data on MTR services.
Apart from app developers, journalists, and especially data journalists, have also been thwarted by the difficulties of obtaining data in Hong Kong, particularly from the government.
Cedric Sam, an interactive graphic journalist at Bloomberg who was based in Hong Kong until recently, says what data the government does provide is insufficient and it is hard to ask for information from official sources.
Sam worked as a data journalist at the South China Morning Post from 2013 to 2015, and as a researcher at the University of Hong Kong (HKU) between 2010 and 2012. At HKU, he worked on projects about land use, planning and new housing estates. For one of these projects, Sam and his colleagues at HKU created an online map showing the agricultural and other sites earmarked for redevelopment. In 2012, they requested raw data from the Town Planning Board (TPB) and received an Excel version of the PDF files they needed from the TPB website.
However, the same request was made and declined in 2015 when Sam was working for the South China Morning Post. The reason given for the rejection was a “lack of spare capacity”.
In an interview conducted through email, Sam says: “Access to datasets of public interest is of great importance for citizens to participate in their society.” For him, data journalism is work that challenges authority. To be able to do this work well, he adds, we need to have legislation on freedom of information access and open data sets from financial and real estate sectors.
Mak Yin-ting, a former chairperson of the Hong Kong Journalists Association (HKJA), says the association and other journalists have been advocating for freedom of information legislation since the 1990s. The government introduced the Code on Access to Information in 1995 but it lacks the authority of a law and is of little use when it comes to requesting data from different government departments.
Even if the code were to be upgraded into a law, Mak says the presence of exemptions in the code would make it unacceptable. These include exemptions from disclosure of any information where disclosure would harm or prejudice the proper and efficient conduct of the operations of a department, information which could only be made available by unreasonable diversion of a department’s resources and information that involves external affairs, privacy concerns, defence and security.
“Our stance on legislation is to maximise data disclosure, and minimise the exceptional conditions and scopes, and an appeal system is compulsory,” says Mak. An appeal system would involve an independent party that could monitor the quantity and quality of data disclosure, and could act as a channel to resolve disputes.
The candidates for the Chief Executive election in 2012 all signed the HKJA’s charter promising to support legislation on freedom of information during their campaigns but, to date, the government has not taken any initiative to do so.
Mak says the only reason she can see for the government’s inaction is “that they do not want the public to access the data, so that they have the ability to hide.” She adds that legislation for freedom of information access is crucial to safeguard the public’s right to know about the internal decisions made by the government.
“If we do not have access to this data, we could be easily fooled by what the government says; but if we get the data, we can figure things out for ourselves.”
According to the Open Knowledge Foundation’s Global Open Data Index, Hong Kong is currently ranked 37th among 97 countries in terms of the openness of data, while Taiwan is at the top of the list. Despite promoting the development of a “smart city” and “open city”, Hong Kong has yet to be seen as really “open” in terms of the disclosure and utilisation of data.
Edited by Maggie Suen