Should the bikeshare industry adopt an open data standard? As bikesharing spreads to more cities, having a common method for accessing and analyzing data will become more important.
We know that transit systems work best when agencies concentrate on their core mission. Transit agencies are not in the information technology business; they should release their data so that third parties can build apps that help passengers use the systems.
To use open data, programmers need to know: Where is the data? What are the files called? Which fields are available? What are the fields called?
Bikesharing systems should adopt the convention of a “data” page that can be found by appending “data” immediately after the main URL. Many U.S. government websites already do this (justice.gov/data, dot.gov/data, state.gov/data, etc.). It would be awesome to have consistent URLs like capitalbikeshare.com/data and velib.paris.fr/data.
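To show how little a programmer would need to know under this convention, here is a minimal sketch in Python. The function name `data_url` is hypothetical, and defaulting to `https` is my assumption; the addresses it produces are the proposed ones, not pages that are guaranteed to exist today.

```python
from urllib.parse import urlsplit, urlunsplit


def data_url(main_url):
    """Build the proposed "/data" address from a system's main URL.

    Hypothetical helper: it only constructs the address that the
    convention would imply; it does not check that the page exists.
    """
    # Bare hostnames like "capitalbikeshare.com" have no scheme, so
    # assume https (an assumption, not part of the proposal).
    if "://" not in main_url:
        main_url = "https://" + main_url
    parts = urlsplit(main_url)
    # Keep the scheme and host, replace any path with "/data".
    return urlunsplit((parts.scheme, parts.netloc, "/data", "", ""))


if __name__ == "__main__":
    for site in ("capitalbikeshare.com", "velib.paris.fr"):
        print(data_url(site))
```

With a consistent address, an app could support a new city by knowing nothing more than the system's home page.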
To standardize what the files are called, we have to decide how many files are used, and what formats to use. Some systems do not separate the station information data (which is static) from the station status data (which is dynamic). The Capital Bikeshare XML file and the Bixi Montreal XML file are examples of combining both static and dynamic data in a single file (both use the Bixi public bike system). This might be more convenient in some cases, but for systems that frequently update their displays, it wastes a lot of bandwidth. This process could be made more efficient by using two files. JCDecaux, which manages many bikesharing systems in Europe, separates the static data from the dynamic real-time data.
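The split between static and dynamic data can be sketched in a few lines of Python. The sample feed and the element names (`id`, `name`, `lat`, `long`, `nbBikes`, `nbEmptyDocks`) are assumptions chosen for illustration, not a definitive description of any operator's file.

```python
import xml.etree.ElementTree as ET

# Assumed field names, for illustration only.
STATIC_FIELDS = ("name", "lat", "long")        # rarely changes
DYNAMIC_FIELDS = ("nbBikes", "nbEmptyDocks")   # changes constantly

# Made-up sample of a combined feed holding both kinds of data.
SAMPLE_XML = """<stations>
  <station>
    <id>1</id><name>Main St</name><lat>38.9</lat><long>-77.0</long>
    <nbBikes>4</nbBikes><nbEmptyDocks>7</nbEmptyDocks>
  </station>
</stations>"""


def split_feed(xml_text):
    """Split a combined feed into station information and station status."""
    info, status = {}, {}
    for station in ET.fromstring(xml_text).iter("station"):
        station_id = station.findtext("id")
        info[station_id] = {f: station.findtext(f) for f in STATIC_FIELDS}
        status[station_id] = {f: int(station.findtext(f)) for f in DYNAMIC_FIELDS}
    return info, status


if __name__ == "__main__":
    info, status = split_feed(SAMPLE_XML)
    print(info)
    print(status)
```

An app would download the information part once and then poll only the much smaller status part, which is where the bandwidth saving comes from.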
I’ve often encountered systems with no data at all. The website for Denver’s B-cycle has no links to open data, though Denver’s Open Data Catalog does offer a variety of formats for data about B-cycle Stations. I doubt this is the true, live system data, because the station locations are given as street addresses rather than latitude and longitude coordinates.
In addition to information needed by apps, we also need historic data in order to analyze how people use the system. The most common kind is system metrics, such as the type released by Bay Area Bikeshare. This typically shows ridership and membership totals, and is good for showing how the system has grown. It would be updated at the end of each day.
Planners and analysts rely on two other types of historic data: trip history information shows every trip made within a certain period, and station history data shows the status of the stations within a certain period. The best example of the former is the Capital Bikeshare trip history data page, which releases a new data set every quarter. The latter is sometimes recorded by enthusiasts on their own initiative, such as the CaBi Tracker website. In San Francisco, Eric Fisher keeps a daily log of Bay Area Bikeshare stats at trafficways.org/babs (I used his data in my article entitled Probing Data from Bay Area Bikeshare).
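As a rough illustration of why trip history data is useful, the sketch below counts trips by starting station. The column names (`start_station`, `end_station`, `duration_sec`) and the sample rows are made up for this example; real trip files use their own headers, which is exactly the problem a standard would solve.

```python
import csv
import io
from collections import Counter

# Made-up trip history sample; the column names are assumptions.
SAMPLE_TRIPS = """start_station,end_station,duration_sec
A,B,600
A,C,300
B,A,900
"""


def trips_per_start_station(csv_text):
    """Count how many trips began at each station."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["start_station"] for row in reader)


if __name__ == "__main__":
    counts = trips_per_start_station(SAMPLE_TRIPS)
    # List stations from busiest to quietest.
    for station, trips in counts.most_common():
        print(station, trips)
```

If every system used the same field names, the same few lines would work on any city's trip file.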
When bikesharing open data is published in a standard form, developers and analysts can make it easier for the public to discover and use bikesharing systems across the globe, as the Bike Share Map by Oliver O’Brien already does. The vendors, operators, and managing jurisdictions should work together to create a standard that can be used by everyone.
For more detailed technical information about how the tech community can help make better bikeshare open data happen, check out this longer article from my blog Mystery Incorporated.
Photo by M.V. Jantzen