Robert's Tech Lab

trains

#transit #trains #python #sideprojects #raspberrypi #led

Seattle is celebrating its newest light rail extension this upcoming Friday. In just 3 days, 4 new stations will open, connecting the northern suburb of Lynnwood to Link's 1 line, which runs down to Seattle and further on to SeaTac.

Now, as many of you know, I like trains. That's right, I'm 100% a true train nerd. Ever since I was a kid, I've just thought they were neat. Growing up in Iowa, though, we didn't really have any, so moving to Washington and watching our light rail get built out has been a ton of fun for me.

I like trains meme

So, I was sitting at home and realized I had a spare string of programmable LED lights. I was probably watching a video of the new expansion when it popped into my head: I want to build a real-time transit map of our light rail system.

General Approach

To build a real-time transit map, I knew I'd be dealing with my stack of Raspberry Pis and LED strips. I could run pretty much any language I wanted on the Pi, but I decided on Python. I'm still relatively new to Python, and this would shore up some of my learning. So far I've only written scripts that run on my own computer, so having an actual “deployed” application would be a step up for me.

For now, I will be doing a simple 10-light strip as a proof of concept. If I'm happy, I'll extend it out to the entire line. I also want to build it in a way that works not only for our current extension, but for the upcoming extensions as well. (The 2 line should connect to the 1 line, hopefully late next year, and I don't want to have to rebuild the whole thing.)

Finally, I'm not planning on labeling the strip. I think it would look better on the wall as more of an art project than a map in this case, so I want the LEDs to do the talking for me. So to do that I want to have LEDs that are constantly on for each station, with a different color LED indicating the trains on the line.

Gathering Data

So the first step in this problem is the data itself. Where do we get data on transit networks? Well, funny enough, this isn't my first transit project. (Shocking, right?)

Transit orgs publish their data to Google and other map providers using a standard called GTFS, the General Transit Feed Specification. The feed has essentially 2 main parts.

GTFS Static File

The first part is the Static File. This file contains data that is (mostly) static – Routes, Stops, Schedules, and Trips. If you think about your standard transit agency, these are mostly all standard and unchanging. (Unchanging in this case means you could pull the feed daily without much worry).

The file is delivered to you as a .zip file containing several .txt files, which are honestly more like .csv files. These .txt files are broken into their respective parts (routes, stops, trips) and describe every detail about the system. The 1 line, our light rail line, has a route in the routes file, there are many trips on the route, and those trips contain multiple stops. So as you learn how the data works, you start to see how it all interconnects. (If you're curious, the full spec is here.)
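To make the "zip of CSVs" idea concrete, here's a minimal sketch that reads one table out of a GTFS zip using only the standard library. It is not the code from this project: the helper names and the tiny in-memory feed are made up for illustration, and a real feed would be downloaded from the agency instead.

```python
import csv
import io
import zipfile


def read_gtfs_table(zip_bytes: bytes, table_name: str) -> list[dict]:
    """Return the rows of one GTFS table (e.g. 'routes') as a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Each table is a plain CSV file with a .txt extension.
        with archive.open(f"{table_name}.txt") as raw:
            # utf-8-sig strips the byte-order mark that some feeds include.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))


def build_sample_feed() -> bytes:
    """Build a tiny, made-up GTFS zip in memory so the sketch runs offline."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("routes.txt", "route_id,route_short_name\nR1,1 Line\n")
        archive.writestr("trips.txt", "route_id,trip_id\nR1,T1\nR1,T2\n")
    return buffer.getvalue()


sample_feed = build_sample_feed()
routes = read_gtfs_table(sample_feed, "routes")
trips = read_gtfs_table(sample_feed, "trips")

# Trips point back at their route through route_id.
trips_on_route = [t["trip_id"] for t in trips if t["route_id"] == routes[0]["route_id"]]
print(trips_on_route)
```

The same join-by-ID pattern continues down the chain: stop_times.txt links trips to stops, and stops.txt holds each stop's name and coordinates.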

So after retrieving the static file and building that functionality, I built out several concrete classes for each item: Route, Trip, etc. My original approach was to load each item into memory: on each start of the application, pull down the static file, load the routes into memory, and go. Examining the data more, however, I realized the static file is small (30-40MB) but holds a lot of data. Those trip stops above? That's a list of stops for every bus/train/tram/streetcar in the system, for every trip, for every route. It turns out even my main development machine took 5 minutes to load it all into memory. This wasn't going to work on a tiny Raspberry Pi.

So, I went online to see if anyone had invented this wheel yet by pre-loading GTFS data into some easier-to-query format, and wouldn't you know it, someone did. I found GTFSDB, which is exactly what I needed. GTFSDB, cleverly enough, loads the static file of the GTFS feed directly into a DB. In this case I chose sqlite because I didn't really want to host an entire database server for this; sqlite on the Pi would be fine.

The tool worked extremely well. I simply pointed it at a file, gave it the URL to the zip for the feed, and within a few minutes I had a fully formed sqlite database of the entire feed.
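Once the feed is in sqlite, "which stops does this route serve?" becomes a single query. Here's a hedged sketch using Python's built-in sqlite3 module. The table and column names follow the GTFS file names (routes, trips, stop_times, stops), which I'm assuming is how the loaded database is laid out; the second stop's name is made up, and the in-memory database is only there so the example runs on its own.

```python
import sqlite3

# Stand-in for the real database file: a tiny in-memory copy with made-up rows.
connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE routes (route_id TEXT, route_short_name TEXT);
    CREATE TABLE trips (trip_id TEXT, route_id TEXT);
    CREATE TABLE stop_times (trip_id TEXT, stop_id TEXT, stop_sequence INTEGER);
    CREATE TABLE stops (stop_id TEXT, stop_code TEXT, stop_name TEXT);

    INSERT INTO routes VALUES ('R1', 'E Line');
    INSERT INTO trips VALUES ('T1', 'R1'), ('T2', 'R1');
    INSERT INTO stop_times VALUES ('T1', 'S1', 1), ('T1', 'S2', 2),
                                  ('T2', 'S1', 1), ('T2', 'S2', 2);
    INSERT INTO stops VALUES ('S1', '538', '3rd Ave & Columbia St'),
                             ('S2', '558', 'Made-up Stop');
""")


def get_stops_for_route(conn: sqlite3.Connection, route_short_name: str) -> list[tuple]:
    """Return the distinct (stop_code, stop_name) pairs served by a route."""
    query = """
        SELECT DISTINCT s.stop_code, s.stop_name
        FROM routes r
        JOIN trips t ON t.route_id = r.route_id
        JOIN stop_times st ON st.trip_id = t.trip_id
        JOIN stops s ON s.stop_id = st.stop_id
        WHERE r.route_short_name = ?
        ORDER BY s.stop_code
    """
    return conn.execute(query, (route_short_name,)).fetchall()


e_line_stops = get_stops_for_route(connection, "E Line")
print(e_line_stops)
```

The DISTINCT matters: every trip repeats the same stops, which is exactly why loading the raw files into memory was so slow.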

GTFS Realtime File

The realtime file is the “right now” file for the transit feed. This is where we get data on vehicles and their current locations. This file is much smaller, usually only a couple of KB, and it's in protobuf format. To read it, there is a Python module called gtfs_realtime_pb2 (part of the gtfs-realtime-bindings package) that parses the protobuf format. Another package, protobuf3_to_dict (the 3 is important for Python 3), then converts the result into a dictionary. Finally, we can use pandas to iterate through and pull out our vehicles.

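    # Imports (normally at the top of the module)
    import pandas as pd
    import requests
    from flatten_json import flatten  # assumed source of flatten()
    from google.transit import gtfs_realtime_pb2
    from protobuf_to_dict import protobuf_to_dict

    # Download and parse the protobuf feed
    feed = gtfs_realtime_pb2.FeedMessage()
    response = requests.get(realtime_url, allow_redirects=True)
    feed.ParseFromString(response.content)

    # Convert to a plain dict (named feed_dict so it doesn't shadow the built-in dict)
    feed_dict = protobuf_to_dict(feed)

    # One DataFrame row per entity, with nested keys joined by '.'
    df = pd.DataFrame(flatten(record, '.')
        for record in feed_dict['entity'])

    vehicles_by_route = {}
    for index, row in df.iterrows():
        vehicle = Vehicle(row)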
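The snippet above stops before the grouping step, so here's a standalone sketch of one way flattened records could be bucketed by route. It is an illustration, not the project's code: the function name and the sample records are made up, while the dotted keys follow the GTFS-realtime field names (vehicle.trip.route_id, vehicle.position.latitude, and so on) as they would look after flattening with '.'.

```python
from collections import defaultdict


def group_vehicles_by_route(records: list[dict]) -> dict[str, list[dict]]:
    """Bucket flattened GTFS-realtime vehicle records by their route_id."""
    grouped = defaultdict(list)
    for record in records:
        route_id = record.get("vehicle.trip.route_id")
        if route_id is None:
            # Not every entity carries a vehicle position with a trip attached.
            continue
        grouped[route_id].append({
            "label": record.get("vehicle.vehicle.label"),
            "latitude": record.get("vehicle.position.latitude"),
            "longitude": record.get("vehicle.position.longitude"),
            "stop_id": record.get("vehicle.stop_id"),
            "direction_id": record.get("vehicle.trip.direction_id"),
        })
    return dict(grouped)


# Made-up sample records, shaped like flattened feed entities.
sample_records = [
    {"id": "1", "vehicle.trip.route_id": "R1", "vehicle.trip.direction_id": 0,
     "vehicle.position.latitude": 47.60, "vehicle.position.longitude": -122.33,
     "vehicle.stop_id": "S2", "vehicle.vehicle.label": "6001"},
    {"id": "2", "vehicle.trip.route_id": "R2", "vehicle.vehicle.label": "7002"},
    {"id": "3"},  # an entity with no vehicle data is skipped
]

vehicles_by_route_id = group_vehicles_by_route(sample_records)
print(sorted(vehicles_by_route_id))
```

Note that the realtime feed identifies routes by route_id, so mapping to a short name like "E Line" still needs a lookup against the static data.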

At this point I have a sqlite database with all of the static data, and with this function I can get the current realtime info from the transit agency. It's finally time to start writing my own code!

Light Strip configuration

The configuration probably took the most thought: how do I lay out my light strips in a way that is dynamic, but not so dynamic that I can't be precise about it? My first approach was to just use approximations. If there are 23 stations and 100 lights in a strip, just take round(100/23) as the spacing between stations and then move the trains between them. This would certainly work, and it would even keep working when Sound Transit opens the Federal Way extension soon and our station count becomes 24, and again when 130th Street opens and it becomes 25. However, it does not support the 2 line.

The 2 line will merge into the 1 line at International District and share the track all the way up to Northgate. That means at some point in the strip there is an LED that will light up for a 2 line train heading south from ID, and that train then needs to take a right and seamlessly merge onto a second, horizontal light strip. No amount of math or approximation will make that work, so I need to define a config file.
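For reference, the even-spacing approximation described above boils down to a couple of lines. This is a sketch with a made-up function name, not code from the project:

```python
def evenly_spaced_station_leds(station_count: int, led_count: int) -> list[int]:
    """Place stations every round(led_count / station_count) LEDs, starting at LED 0."""
    spacing = round(led_count / station_count)
    return [index * spacing for index in range(station_count)]


station_leds = evenly_spaced_station_leds(23, 100)
print(station_leds[:3], station_leds[-1])
```

With 23, 24, or 25 stations on a 100-LED strip the spacing rounds to 4 every time, which leaves 3 LEDs between neighboring stations for trains in motion. It only works for a single straight line, though.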

So I landed on this rough schema:

{
    "E Line": [{
        "direction": 0,
        "stops": [
            {
                "code": 538,
                "led": "1:0"
            },
            {
                "code": 558,
                "led": "1:2",
                "loading": [
                    {
                        "led": "1:1",
                        "percentage": 1
                    }
                ]
            },
            {
                "code": 575,
                "led": "1:4",
                "loading": [
                    {
                        "led": "1:3",
                        "percentage": 1
                    }
                ]
            },

This schema has a few basic components. First, it defines which line we're paying attention to. For testing, that's the King County Metro RapidRide E line. Good service, every 5-10 minutes or so. Inside that is an array: GTFS splits the trips into direction: 0 and direction: 1 to designate which direction on the line a trip is traveling. More on how I'm handling that later. Then there are stops, which map the stops in the feed to the LEDs we care about.

The code is a short code from the GTFS feed. 538 is a stop at 3rd Ave & Columbia St in Seattle.
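Here's a small sketch of how a config shaped like this could be queried. Two loud assumptions: the helper names are made up, and I'm reading the "led" value as "strip:index" (so "1:2" would be LED 2 on strip 1), which the post doesn't spell out. The sample config is a trimmed-down copy of the fragment above.

```python
import json

SAMPLE_CONFIG = json.loads("""
{
    "E Line": [{
        "direction": 0,
        "stops": [
            {"code": 538, "led": "1:0"},
            {"code": 558, "led": "1:2",
             "loading": [{"led": "1:1", "percentage": 1}]}
        ]
    }]
}
""")


def find_stop_config(config: dict, route_short_name: str, direction_id: int, stop_code: int):
    """Return the config entry for a stop on a route/direction, or None if it's off the strip."""
    for direction_config in config.get(route_short_name, []):
        if direction_config["direction"] != direction_id:
            continue
        for stop in direction_config["stops"]:
            if stop["code"] == stop_code:
                return stop
    return None


def parse_led_address(address: str) -> tuple[int, int]:
    """Split an address like '1:2' into (strip, index). Assumed meaning of the format."""
    strip, index = address.split(":")
    return int(strip), int(index)


stop_entry = find_stop_config(SAMPLE_CONFIG, "E Line", 0, 558)
print(stop_entry["led"], parse_led_address(stop_entry["led"]))
```

Returning None for unknown stops or directions gives the caller an easy way to skip vehicles that are outside the part of the route the strip covers.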

Inter-stop algorithm

“loading” is where things get really interesting. If I can't use an approximation between stops, then I need to define each LED between each pair of stops in some way. My original plan was to use bounding boxes on a map: make a box from this lat/lon to that lat/lon, and if the vehicle's lat/lon is in that box, light up this pixel. I may still do that, but I realized that it is much simpler if I just use percentages.

We have two lat/lon coordinates: the stop before, and the stop the vehicle is currently moving towards. For now, assume a straight line between them. Then we have a lat/lon for the vehicle itself. Given that point, calculate roughly what percentage of the way it has traveled between the two stops. Then, using that percentage, take the largest possible entry in the stop's loading block and return its LED.

In this case, they are all at percentage: 1, 100%. I only have 10 LEDs and I'm using them sparingly, but for the next one I could have say, 5 LEDs between each station, and then each light would be 0.2, 0.4, 0.6, 0.8, and 1. Then each one should light up as it moves.
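Here's a rough sketch of the percentage idea, not the project's actual code. The function names are made up, lat/lon are treated as flat x/y coordinates (a simplification that is reasonable over a few blocks), and the selection rule is my assumption: each entry's percentage is treated as the upper bound of its segment, and the first entry whose bound covers the fraction traveled wins. The real helper may choose differently.

```python
def fraction_travelled(prev_stop: tuple, next_stop: tuple, vehicle: tuple) -> float:
    """Project the vehicle onto the straight line prev_stop -> next_stop, clamped to 0..1."""
    (x1, y1), (x2, y2), (px, py) = prev_stop, next_stop, vehicle
    dx, dy = x2 - x1, y2 - y1
    length_squared = dx * dx + dy * dy
    if length_squared == 0:
        return 1.0  # both stops share a location, so treat the vehicle as arrived
    t = ((px - x1) * dx + (py - y1) * dy) / length_squared
    return max(0.0, min(1.0, t))


def pick_loading_led(loading: list[dict], fraction: float):
    """Return the LED of the first entry whose percentage covers the fraction (assumed rule)."""
    for entry in sorted(loading, key=lambda e: e["percentage"]):
        if fraction <= entry["percentage"]:
            return entry["led"]
    return None


# Five made-up LEDs between two stations, each covering 20% of the trip.
five_leds = [
    {"led": "1:1", "percentage": 0.2},
    {"led": "1:2", "percentage": 0.4},
    {"led": "1:3", "percentage": 0.6},
    {"led": "1:4", "percentage": 0.8},
    {"led": "1:5", "percentage": 1},
]

halfway = fraction_travelled((0.0, 0.0), (0.0, 10.0), (0.0, 5.0))
print(halfway, pick_loading_led(five_leds, halfway))
```

With a single entry at percentage 1, as in the 10-LED config, the same rule lights that one LED for the whole trip between stops.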

This should hold me over for the proof of concept. There's a couple more approaches I want to look into, but this will be good enough for now.

The Primary Loop

The entire meat of the program exists in a large loop, looping every 10 seconds to retrieve the latest info. In there, we grab the GTFS realtime feed, parse it, and grab each vehicle. If the vehicle is on a trip that belongs to one of the lines in our config file, we process it. The main code is:

while True:

    vehicles_by_route = get_latest_feed()

    for route_short_name in led_config:
        route_config = led_config.get(route_short_name)
        # Default to an empty list so a route with no active vehicles doesn't break the loop below
        vehicles = vehicles_by_route.get(route_short_name, [])
        route_stops = get_all_route_stops(route_short_name)

        clear_lights()
        for route_stop in route_stops:
            set_single_led(route_stop.get('led'), LightStatus.STATION)

        for vehicle_item in vehicles:
            vehicle: Vehicle = vehicle_item.get('vehicle')
            route: Route = vehicle_item.get('route')
            stop: Stop = route.stops.get(vehicle.stop_id)
            stop_bounding_area = BoundingArea.FromPoint(stop.latitude, stop.longitude, stop_radius)
            vehicle_is_at_stop = stop_bounding_area.contains(vehicle.latitude, vehicle.longitude)
            stop_config = get_stop_config_by_stop_code(route.short_name, vehicle.direction_id, stop.code)
            label = 'is at' if vehicle_is_at_stop else 'is heading to'
            if stop_config is not None:
                print('Vehicle {} {} stop {}'.format(vehicle.label, label, stop.name))
                if vehicle_is_at_stop:
                    set_single_led(stop_config.get('led'), LightStatus.OCCUPIED)
                else:
                    # Calculate the distance from the last stop to this one
                    prev_stop_config = get_prev_stop_config_by_current_stop_code(route.short_name, vehicle.direction_id, stop.code)
                    if prev_stop_config is None:
                        continue
                    prev_stop = get_stop_by_code(prev_stop_config.get('code'))
                    prev_bounding_area = BoundingArea.FromPoint(prev_stop.latitude, prev_stop.longitude, stop_radius)
                    percentage = stop_bounding_area.calculate_percentage(prev_bounding_area, (vehicle.latitude, vehicle.longitude))
                    # We know we're not at the stop, now just figure out which light to light up
                    led = find_largest_object(stop_config.get('loading'), percentage)
                    if led is not None:
                        set_single_led(led.get('led'), LightStatus.OCCUPIED)
    time.sleep(loop_sleep)

Let's step through that piece by piece.

For each route in our config, get the vehicles on that route, its stops, and the config for that route. Then, clear our light strip. This could be cleaned up later for sure (I could load those once, before the loop starts), but we're in proof of concept land.

    vehicles_by_route = get_latest_feed()

    for route_short_name in led_config:
        route_config = led_config.get(route_short_name)
        vehicles = vehicles_by_route.get(route_short_name, [])
        route_stops = get_all_route_stops(route_short_name)

        clear_lights()

For each stop in all of the route's stops, set the LEDs for the station colors. In my case it's a gold color.

        for route_stop in route_stops:
            set_single_led(route_stop.get('led'), LightStatus.STATION)

For each vehicle in the vehicles in the feed, get the route the vehicle is on and the stop it's at. Then, get the rough area that the stop is at (a radius of about half a city block) and that will be our guide for if a vehicle is actually at a stop. (The feed doesn't tell you it's made it to a stop, it only tells you what the next stop is, so we have to do some lifting here).

Load the config for the stop and this route from the original json. This will tell us if we care about this stop (maybe it's too far off our light strip, and we don't care).

        for vehicle_item in vehicles:
            vehicle: Vehicle = vehicle_item.get('vehicle')
            route: Route = vehicle_item.get('route')
            stop: Stop = route.stops.get(vehicle.stop_id)
            stop_bounding_area = BoundingArea.FromPoint(stop.latitude, stop.longitude, stop_radius)
            vehicle_is_at_stop = stop_bounding_area.contains(vehicle.latitude, vehicle.longitude)
            stop_config = get_stop_config_by_stop_code(route.short_name, vehicle.direction_id, stop.code)
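The "is the vehicle at the stop?" check could be sketched like this. It is not the project's BoundingArea class: the function names are made up, the 40 m radius is my guess at "about half a city block", and the coordinates are approximate downtown Seattle values used only for the demo.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two lat/lon points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_at_stop(stop: tuple, vehicle: tuple, radius_m: float = 40.0) -> bool:
    """True when the vehicle is within radius_m of the stop."""
    return distance_m(*stop, *vehicle) <= radius_m


stop_location = (47.6000, -122.3300)
print(is_at_stop(stop_location, (47.6001, -122.3300)))  # roughly 11 m away
print(is_at_stop(stop_location, (47.6010, -122.3300)))  # roughly 111 m away
```

A circle is the simplest "bounding area". A rectangle in lat/lon would work too, but its real-world width shrinks with latitude, so a distance in meters is easier to reason about.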

If the vehicle is at the stop (in our bounding area), set the LED associated with that stop from the station color to the occupied color.

Otherwise, find the previous stop (which I do by looking in my config, rather than the trip, maybe the last stop wasn't on our strip?). If we don't have it in our config, continue on, it's before our strip.

If we do have it, get the previous stop from the config, grab its bounding area, and then run the percentage-based algorithm between the two areas.

We then find the largest config we can for the inter-station light's travel percentage, and return that LED.

Finally, if that found LED exists, we set that LED to the occupied state.

LED configuration

The last part of this post (a long one, I know) is actually setting the LEDs. This ended up being a bit trying for me. I started with a very old Raspberry Pi 1B. It turns out it's so old that even just installing Python dependencies took hours, so unfortunately that one wouldn't do. I then moved over to one of my Raspberry Pi Zeros, but there was an issue with its wifi, SSH kept dropping, and I just got frustrated using it. After 5 hours of trying to figure out why SSH keeps dropping, you realize it might be worth doing something else.

So I switched to my LibreComputer Renegade. Usually I really love these guys, and actually, even in this case I do. These things are rock solid, they're fast, and they're cheap. Unfortunately, after setting everything up I learned that even though the underlying GPIO pins are exactly the same as the Pi's, there was a hard-coded check in the LED Python library that blocks anything that isn't a Pi. I didn't feel like building my own library for this project, and I had already sunk a weekend into this, so I headed over to my electronics shop.

Now, quick plug: there's a small shop in Bellevue, WA on Northup Way called Vetco Electronics. This place is a nerd's dream. If you're in the Seattle area and interested in electronics, retro tech, side projects, or anything in between, go check them out.

I went over there and picked up a Raspberry Pi 5. I had decided at this point that I didn't care about the cost, I just wanted something that would work.

And it didn't. At first. It turns out the Pi 5 changed most of the underlying architecture around the pins, which rendered most libraries obsolete. I was about ready to give up on running the whole project on a Pi when I finally found this little Python library, neopixel_spi. It turns out the original neopixel project needs to be overhauled for the Pi 5, but the SPI version (which, as I understand it, drives the LED data line from the Pi's SPI hardware) works perfectly. So finally, I could set some lights.

Setting the LED states is surprisingly easy: once you can light up one light at all, everything else follows. So much so that the actual code for setting a light is simply:

import board
import neopixel_spi

# Define a board, this one is using the SPI interface, and I'm using 10 LEDs in my strip
pixels = neopixel_spi.NeoPixel_SPI(board.SPI(), 10)

# Set the color of LED 2 to a hex color value
pixels[2] = 0x00ff00
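To tie this back to the main loop, here's a sketch of how light states could map to colors. These are hypothetical versions of the helpers, with different signatures from the ones used earlier, and the colors are my picks (gold for stations as mentioned above, green for occupied). A plain list stands in for the strip so it runs without hardware; the only thing it relies on is index assignment, as in the snippet above.

```python
from enum import Enum


class LightStatus(Enum):
    """Hex colors for each state an LED can be in (assumed values)."""
    EMPTY = 0x000000
    STATION = 0xFFD700   # gold
    OCCUPIED = 0x00FF00  # green


def set_single_led(pixels, led_index: int, status: LightStatus) -> None:
    """Set one LED to the color for the given status."""
    pixels[led_index] = status.value


def clear_lights(pixels) -> None:
    """Turn every LED off."""
    for index in range(len(pixels)):
        pixels[index] = LightStatus.EMPTY.value


# A 10-element list stands in for the real strip.
demo_pixels = [0xFFFFFF] * 10
clear_lights(demo_pixels)
set_single_led(demo_pixels, 0, LightStatus.STATION)
set_single_led(demo_pixels, 2, LightStatus.STATION)
set_single_led(demo_pixels, 1, LightStatus.OCCUPIED)  # a vehicle between the two stations
print([hex(color) for color in demo_pixels[:3]])
```

Keeping the colors in one enum means changing the look of the whole map is a one-line edit.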

Proof of concept

With this simple 10-LED strip I finally have a working PoC. This is real data of King County Metro's E Line, from Columbia to Bell, sped up about 10x. If you watch closely you'll see a bug I'm tracking down too, but we'll get there.

In the next update, I hope to have a fuller demo with a full-size light strip. I'm also waiting on an API key from Sound Transit to access their data instead of KC Metro's. (That's why I had to use the E line for now.)

I also have some really fun LED strips on order (at Vetco of course), and we're working on finding a neat way to display it. Baby steps!

This one turned out to be a long one. I'll have more updates soon, and as the project matures I'll fill out more info. If you want a sneak peek at the code, it's not ready yet and there is no documentation, but it can be found here.

Have a good day, and don't forget if you're in Seattle, on Friday our light rail extends north!