Data Sharing


This past weekend, I had a little bit of time to work on a hobby project:

Sony Ericsson MBW-150 Bluetooth watch showing San Francisco Muni arrival times

This is my Sony Ericsson MBW-150 Bluetooth watch, showing the next few SF Muni bus arrival times for a nearby stop. The code to fetch the arrival times runs on my Droid phone and communicates with the watch using Marcel Dopita’s OpenWatch software for the Android platform.

Using a secondary display like a watch could allow a rider to keep tabs on when their bus is coming without constantly having to take their phone out of their pocket and unlock its display, which is particularly nice when it’s cold enough that they’re wearing gloves.

It’s also worth mentioning that a few months ago, I wouldn’t have been blogging about this. On November 7, the San Francisco MTA finally gave formal permission to developers to build apps using their realtime arrival data. Prior to that, developers who spoke publicly about their experiments with the Muni realtime data risked threats from a company that claimed a contractual right to charge for access to the arrival data for Muni’s vehicles. People were still building interesting things, but because of these chilling effects, no one outside of their circle of trusted friends would ever know about them.

Moral of the story for agencies: if you want to encourage innovative realtime transit apps in your city, read your contracts carefully, and insist on the right to provide realtime data about your vehicles to creative and energetic developers. You’ll be in good company, alongside Chicago’s CTA, San Francisco’s Muni and BART, Boston’s MBTA, and Portland’s TriMet.

Note: this post was updated to replace the original image with an improved one on December 18, 2009.

18 Comments

This site has been in semi-retirement for a while, as I focus more on things like the Transit Developers group, but I wanted to tell you about a site that just launched:
City-Go-Round

City-Go-Round is in many ways a successor to this site’s own Headway Wiki, in that it makes it easier to find transit apps that people have built for a particular area or agency. However, CGR adds things like location-based search, screenshots, and better information about platforms and locations for each app. Even more importantly, it helps show how open data can really enable developers to build apps that help transit agencies and their riders, and provides a way for transit riders to tell their local agencies that they want open data.

The site came about because the guys at Front Seat started adding public transit information to their Walk Score site, and they started wondering how they could get access to more transit information to make their site more useful. So they rounded up some of the most passionate open data advocates in the Transit Developers community (Brandon Martin-Anderson, Jehiah Czebotar, Dave Peck, and Josh Livni) and put together a great resource in a few short weeks. (I also helped in an advisory role.)

The site has the most data about the US, but there’s already a bit of information about Canadian and Australian agencies. The site is also open source, and the team hopes that developers in other countries will help expand the site towards more global coverage.

1 Comment

A couple of months ago, a Chicago software developer named Harper Reed (who also happens to be CTO of skinnycorp, purveyors of some of my favorite t-shirts) did some reverse-engineering of the Chicago Transit Authority’s Bus Tracker web applications to figure out how to access the realtime information that they displayed.  He documented what he learned in a blog post, so that other developers could use that information to build their own bus tracking applications.
He also set up a proxy site for the API, to add a few functions, to return faster responses, and to reduce the load on the CTA’s servers.
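
To illustrate why a proxy can both speed up responses and reduce load on the upstream servers, here is a minimal Python sketch of a time-limited cache. It is not Harper's implementation: the class name `CachingProxy`, the 30-second lifetime, and the `fake_upstream` function are all assumptions made up for this example.

```python
import time


class CachingProxy:
    """Answer repeated requests from a short-lived cache instead of the upstream."""

    def __init__(self, fetch_upstream, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch_upstream = fetch_upstream  # callable: key -> response
        self._ttl = ttl_seconds                # how long a response stays fresh
        self._clock = clock                    # injectable so it can be tested
        self._cache = {}                       # key -> (expiry_time, response)

    def get(self, key):
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no upstream request is made
        response = self._fetch_upstream(key)
        self._cache[key] = (now + self._ttl, response)
        return response


if __name__ == "__main__":
    upstream_calls = []

    def fake_upstream(stop_id):
        # Hypothetical stand-in for a slow request to the real service.
        upstream_calls.append(stop_id)
        return f"arrivals for stop {stop_id}"

    proxy = CachingProxy(fake_upstream, ttl_seconds=30.0)
    proxy.get("1234")
    proxy.get("1234")  # served from the cache
    print(len(upstream_calls))  # one upstream request for two lookups
```

The trade-off is freshness: a longer lifetime saves more upstream requests but can show riders slightly stale arrival times.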

His effort enabled others to create useful things like an iPhone application, a Mac OS X dashboard widget, a text messaging service, and a website that automatically shows you the stop you care about depending on the time of day. All of these things were developed in the space of two months, at no cost to the CTA, and at almost no cost to the riders (the iPhone app costs 99 cents to download—the other things I mentioned are free).

Of course, as Harper himself points out, his API is unofficial and illegitimate, done in one developer’s free time without the involvement or approval of the CTA or Clever Devices (its bus-tracking vendor). These sorts of efforts are generally vulnerable to being shut down on the grounds of contractual, copyright, or server load concerns.

For instance, here in San Francisco, developers of unofficial SF Muni tracking applications have received legal threats in the past: Steven Peterson (author of the handy Routesy iPhone app) has written about it more than once, and Robert Dampho of muniriders.net has told his own story.

Some of this is motivated by the fear of giving away what you might be able to sell.  It’s hard to blame perpetually underfunded transportation agencies for looking for additional sources of income—if they could find someone who’s willing to pay significant sums for access to their realtime information, then cutting off a few software developers (for whom transit applications are often a side project) might seem like a small price to pay.

However, this could be a false economy: consider how much time and money it would take for the CTA to contract out the development of all the applications I mentioned above. By allowing third-party developers to work with their information, a transit agency effectively gains a very motivated external R&D lab at almost no cost or risk. A few forward-looking agencies have come to this conclusion, and have started offering official developer resources: Portland’s TriMet and the Bay Area’s BART have been the most progressive so far.

Until the day that the CTA decides to join them in explicitly offering access to their transit data, however, Harper and other Chicago transit developers are innovating on borrowed time.

4 Comments

Update (3/5/08): TriMet sent me an updated version of the presentation; I’ve updated the version embedded on this page, or you can download the PDF.

Earlier today at the APTA TransITech conference, TriMet’s Tim McHugh gave a heartening talk about their experiences with making their raw schedules and real-time information available to developers. Here are the slides:

Since you don’t get to hear the spoken half of the talk, here are a few points that he made that aren’t in the slides:

  • Riders always want more ways of accessing transit information, but TriMet has limited development cycles; releasing schedule feeds and APIs is a way to allow outside developers to close the gap.
  • Chances are, outside developers are already scraping your transit site anyway, so why not give them a less error-prone direct feed of the information?
  • In the future, they plan to release an API to their trip planner.
  • Since they’ve launched their developer site, they’ve only received positive feedback on the resources; there’s been no negative impact on them from doing this!

The significance of this talk lay partly in the audience of technical staff from other agencies and transit vendors: this is the strongest endorsement that I’ve ever seen from an agency of the virtues of working with outside developers. In time, I hope that stories like TriMet’s will convince other agencies that they have much more to gain than they have to lose by sharing their data.

1 Comment

One of the biggest benefits of transit agencies making their raw schedule data publicly available, as TriMet and others have done, is that riders are free to do interesting things with the information that the agency itself might not have thought of or taken the time to build.

Case in point: Brett Warden in Portland is using TriMet’s GTFS feed to create a POI (points of interest) file for his dashboard-mounted GPS. This means that the very latest TriMet stop data now forms a clickable layer on his Garmin StreetPilot c580. Here are a few screenshots:

TriMet bus stops on the map
Bus stops are shown alongside driving directions.

Clickable stop icons
Stop icons on the GPS map can be clicked on to show…

Stop details
…the stop name and description, into which Brett has packed the stop ID, fare zone, and lines serving that stop.

Brett told me how he got started on the project:

At first I saw a POI collection, made by hand, of
all TriMet’s light rail stops. That got me thinking — if they made
the data available to Google, maybe they’d let me see it too, and make
a comprehensive map of ALL transit stops. They responded, and pointed
me to the GTFS developer site… by far the easiest experience I’ve
had getting information from a public agency.

To generate the file, he imports the GTFS feed into an SQLite DB and runs a few simple queries to generate the POI file. He plans to post the code soon, which will allow it to be used with other agencies’ GTFS feeds. In the meantime, the resulting TriMet stops POI file is available on the POI Factory site.
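
Until Brett's code is posted, here is a rough Python sketch of the same general approach: load a GTFS `stops.txt` into SQLite, then run a query that emits one POI row per stop. This is not his code. The sample rows are invented, the column names follow the GTFS `stops.txt` fields, and the output layout (longitude, latitude, name, description) is an assumption about what a CSV-based POI loader expects. Listing the lines that serve each stop would take further joins against other GTFS files and is left out.

```python
import csv
import io
import sqlite3

# Invented sample data standing in for a real GTFS stops.txt file.
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_desc,stop_lat,stop_lon,zone_id
1001,Main St & 1st Ave,Eastbound stop,45.5200,-122.6800,1
1002,Main St & 2nd Ave,Westbound stop,45.5210,-122.6790,2
"""


def load_stops(conn, stops_file):
    """Import the rows of a GTFS stops.txt file into an SQLite table."""
    conn.execute(
        "CREATE TABLE stops (stop_id TEXT PRIMARY KEY, stop_name TEXT, "
        "stop_desc TEXT, stop_lat REAL, stop_lon REAL, zone_id TEXT)"
    )
    rows = [
        (r["stop_id"], r["stop_name"], r.get("stop_desc", ""),
         float(r["stop_lat"]), float(r["stop_lon"]), r.get("zone_id", ""))
        for r in csv.DictReader(stops_file)
    ]
    conn.executemany("INSERT INTO stops VALUES (?, ?, ?, ?, ?, ?)", rows)


def export_poi_csv(conn, out_file):
    """Write one POI row per stop: longitude, latitude, name, description."""
    writer = csv.writer(out_file, lineterminator="\n")
    query = (
        "SELECT stop_lon, stop_lat, stop_name, "
        "'Stop ID ' || stop_id || ', zone ' || zone_id "
        "FROM stops ORDER BY stop_id"
    )
    for row in conn.execute(query):
        writer.writerow(row)


def build_poi_csv(stops_txt):
    """Run the whole pipeline in memory and return the POI file as text."""
    conn = sqlite3.connect(":memory:")
    load_stops(conn, io.StringIO(stops_txt))
    out = io.StringIO()
    export_poi_csv(conn, out)
    conn.close()
    return out.getvalue()


if __name__ == "__main__":
    print(build_poi_csv(SAMPLE_STOPS_TXT), end="")
```

Because the query only touches standard GTFS columns, the same sketch should in principle work on any agency's feed, which is the portability Brett is aiming for.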

1 Comment

TriMet Developer Resources

Last month, Portland, Oregon’s TriMet agency became one of the first transit agencies to open a dedicated site for third-party users of their data. This site (along with BART’s GTFS page) marks a milestone for the transit field, demonstrating that agencies are starting to understand the benefits of sharing their data with outside developers.

To be fair to the folks at TriMet, they’ve been making this information available informally, on request, for quite some time now. However, it’s significant that they’ve chosen to invest the time to publish a dedicated site with the necessary CYA legal text and API key mechanisms; it will no doubt encourage developers who weren’t previously aware of TriMet’s forward-looking stance on data sharing.

Right now, TriMet is providing the following:

They’re off to a great start. Applying for an API key is painless (I got mine within 5 minutes of signing up), and the fact that the services are in REST form makes it easy to experiment with them by just typing in different URLs. (Still, it would be nice to have more sample queries, or perhaps even an interactive web form, to demonstrate the expected query parameters and corresponding output before even having to sign up.)

Congratulations to TriMet on their launch—I’m looking forward to seeing what creative uses developers will have for these offerings!

2 Comments

Jaap Weel’s recent posts about data sharing in public transit are worth a read. Here are some excerpts:

Dutch transit data locked up

Under traditional (and current American) copyright law, public transit timetables cannot be copyrighted (IANAL, but I’m fairly sure of this). With the European database directive (the one that was supposed to stimulate the knowledge economy and bladiblah), though, it is probably true that REISinformatiegroep can indeed control the timetable data, not only by refusing to provide it to Google and others in easily readable form, but also by suing you if you try to extract it from publicly available timetable books or web sites. One guy who tried to run “spoorboekje.nl” to provide an alternative to the heavyrail trip planner at ns.nl, which was not accessible to the disabled or to Linux users at the time, got nastygrammed into shutting it down.

Open transit data is good for transit agencies

Making transit data accessible means that the agencies can get trip planners, integrated with cell phones, portable navigation devices and other gizmos, accessible to the disabled, easy to use, and so on, all without lifting a finger. This saves them the cost of having to develop these things, which can be quite high and distracting from core business, and at the same time they get more real, paying customers for the service that they were set up to provide in the first place, viz. transit. This should also be especially interesting for for-profit transit companies such as Greyhound and Eurolines that have to compete with subsidized train and bus services.

Transit agencies and operators should think of third-party transit efforts as extremely cost-effective marketing and outreach programs, since the marginal cost of each new effort is practically zero (particularly if the agency is exporting their schedule data in a well-known format).
