My newest certification this year gave me an opportunity to learn more about graph databases in Neo4j. I highly recommend the training: it’s rare to find useful training and a cert that is absolutely free (with a free shirt to boot!). I really enjoyed the graph data modeling course. It’s nice to learn a slightly different way of thinking about data. Maybe I’ll get more into my thoughts on that at a later date, but my intention for this blog post was for it to be the kickoff of a fun personal project.
To be clear, this is more of an exploration than a tutorial. I’ve earned my Neo4j cert, but other than the included tutorials, I haven’t had much experience with the software. This series is going to be me applying my data modeling chops to an interesting problem in a new piece of software.
Inspiration
Since taking the training and getting my cert, an idea has been sticking with me. In my time off, I enjoy playing a variety of board games, and I’ve got a decent collection. One of the modern classics of the board game hobby is Ticket to Ride. In Ticket to Ride, players compete to build the greatest rail network by collecting and playing cards of different suits to pay for the right to place train cars connecting cities. In addition to earning points per train car laid down, players draw destination cards which challenge them to connect two specific cities for a certain number of points. Ticket to Ride has spun off several map packs covering different regions of the globe, each with slight rules variations. The original game focuses on North America, and has sold multiple millions of copies.

On its face, the Ticket to Ride board already looks something like a graph, and indeed, connecting different nodes of geography is the origin of graph theory, which underpins the entire technology of graph data modeling. So it makes a lot of sense to attack the problem of understanding this game and its strategy from a graph modeling perspective.
On the other hand, the more I thought about the use cases for such a model, the more challenges I ran into, and it became clear that building a data model based on Ticket to Ride would be a more interesting task than I had thought. What follows is my thought process in approaching this data model. There are definitely other valid ways to approach creating a model like this, and I doubt that what I wind up creating will be the optimal way to do it. However, I hope you will enjoy joining me on this journey.
The Goal
To begin, I want to briefly outline some of the key questions that I want my model to be able to answer, as well as the key features that I want to keep in mind as I’m building.
In the end, a fully built-out graph model would be able to serve as the basis for a Ticket to Ride AI (in the video game design sense, not the plagiarism machine sense, but bully for me for including a hot searchable buzzword). Obviously, we have a long way to go before we get there, but in the meantime, modeling out the map and accurately representing the current game state will allow us to assist a human player with key intel that offers a strategic edge in play.
Design Considerations
In general, there are some things I want my data model to be capable of, and an end vision that I have for it. It’s useful to state these here so that I can use these as guideposts to influence the design.
- The model should be generic to all expansions. This means that we should be able to tackle not just the North American board, but also any other one we wish to feed into it. Maybe we would even want to design the model so that it holds multiple maps at once, or write scripts to load the correct map data when it is needed. Some maps may require additional properties due to the rule tweaks. We should be aware of this ahead of time and make it as easy as possible to accommodate.
- The model should provide useful information to players. This should go without saying, but it’s important enough to reiterate. Ticket to Ride has a complicated board, and it’s sometimes difficult to weigh options and different possibilities. The goal of this model should be to help a player digest all that info and make better decisions. The “Final Form” of that might include a full-blown AI with differing strategies, but in the meantime, there are simpler questions that a person would find useful, for example:
- What destination cards do I have that are blocked by other players?
- Which of these destination cards am I closest to completing?
- Is this destination card completed?
- What number of train cars are required to complete this destination card?
- How many different routing options are available between two cities?
- What cards are most likely to be left in the deck? (card counting)
- Are there any routes that I must purchase to avoid being blocked?
- What networks of cities do I have connected on the board?
- The model must reflect the board at its current state. This is implied a few times above, but just showing the cities on the board and their connections is not very useful in helping win the game. Knowing which connections are available for purchase and tracking one’s own score against the other players’ are critical to providing any useful intel.
The Map
With all that being said, I’ll begin with a stab at the most enticing bit of the data model, the map. Here is a snippet of the relationships between five cities on the original North American map.

This is a good start, but it doesn’t tell the full story. These cities are actually connected by different routes, each requiring cards of a particular color for payment. Here’s a view of the actual board:

Connecting New York and Washington, for example, is a double route, one Orange, one Black. The map is more than just connections between cities. The connections themselves have properties that we need to store. And it’s at this point that we need to make the first few important decisions about our data model. While it’s true that relationships can have properties, and that it would probably be possible to create properties that describe the differing routes, there are a few reasons that I am leaning toward making the routes themselves nodes.
- Relationships in Neo4j always have a direction. Queries can ignore it, but a route runs both ways, so we’d either have to pick an arbitrary direction or store two relationships for each individual route.
- Duplicate relationships may lead to inconsistencies in their descriptive fields or status.
- We’re planning on tracking the status of the board over the course of a game, so we need to know accurately who owns a given route, or whether it is still available. This is another good reason to design our model to treat routes as their own objects.
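To make the trade-off a bit more concrete, here is a rough sketch of what the relationship-only alternative might look like for the New York to Washington double route. The `ROUTE` relationship type and the `color` property are names I’m assuming purely for illustration; they are not part of the model I end up building. (The notation itself is covered in the overview further down.)

```cypher
// Hypothetical relationship-only model: the route lives on the relationship.
MERGE (ny:City {name: 'New York'})
MERGE (dc:City {name: 'Washington'})
// A direction has to be picked at creation time, even though a route has none.
MERGE (ny)-[:ROUTE {color: 'Orange'}]->(dc)
MERGE (ny)-[:ROUTE {color: 'Black'}]->(dc);

// Leaving the arrowhead off the pattern matches either direction,
// so a query can still treat the routes as undirected.
MATCH (:City {name: 'Washington'})-[r:ROUTE]-(:City {name: 'New York'})
RETURN r.color;
```

This reads back fine, but everything about a route’s state would have to live in relationship properties, which is part of why I’m leaning toward nodes instead.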
Designing Labels and Relationships
Quick Graph Notation Overview
In Neo4j, nodes are thought of as nouns, and relationships are thought of as transitive verbs. Take “Frankie goes to Hollywood” as an example of a real-world fact. We break this fact down into its constituent parts: Frankie, goes to, and Hollywood. Cypher, Neo4j’s query language, notates nodes with () and relationships with -[]->. So in terms of modeling the graph from the example fact, we could write it out as:
(Frankie) -[:goes_to]->(Hollywood)
While it’s possible to just use unlabeled nodes with the required properties, we can define labels to make our model generic. Say we need a model to store where different people go. We would create labels for when we need to query *ANY* person or *ANY* location. So now, in addition to Frankie, we can have Mr. Smith goes to Washington, or Ernest goes to School. By writing a query, we can search and return nodes based on their relationship to other nodes.
MATCH (p:Person)-[g:goes_to]->(c:Location)
RETURN p, c
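A pattern like this needs some data to match against. Here is a minimal sketch for trying it out, assuming a `name` property on both labels (that is my own naming, not something Neo4j requires):

```cypher
// Sample facts: each person goes to one location.
MERGE (frankie:Person {name: 'Frankie'})
MERGE (smith:Person {name: 'Mr. Smith'})
MERGE (hollywood:Location {name: 'Hollywood'})
MERGE (washington:Location {name: 'Washington'})
MERGE (frankie)-[:goes_to]->(hollywood)
MERGE (smith)-[:goes_to]->(washington);

// Who goes to Washington? The label narrows the search to Location nodes,
// and the property map picks out the one we care about.
MATCH (p:Person)-[:goes_to]->(:Location {name: 'Washington'})
RETURN p.name;
```

The labels are what let the same query shape work for any person and any location, rather than for one specific pair of nodes.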
First Draft Model
With all that groundwork laid, I’m just about ready to merge some nodes and actually create some relationships. I’ve identified two labels I want to create for now. I’ve established that I want to treat routes as nodes of their own.
So, for the bare minimum for this map, I want to have City and Route as my two labels. Later on as I work on adding more elements of the actual game, I will add things like Player, Deck, Destination tickets, etc.
Cities will connect to each other through Route nodes. The cities will be connected to the routes with a relationship. For now I’ll call that CONNECTED_BY, since we’re describing the relationship between cities and routes. In the game there isn’t much we need to track related to the cities themselves, while for the routes we will eventually need to create some properties to describe them. Given the length of this so far, I might dedicate a future entry to fleshing that out.
For now, let me merge some city nodes into the database, and connect them. Here’s some simple code to merge a few nodes representing two cities and connect them.
// Create the Pittsburgh node.
MERGE (p:City {name: 'Pittsburgh', code: 'PIT'});
// Create the Toronto node.
MERGE (t:City {name: 'Toronto', code: 'TOR'});
// Create the Route joining them,
// and the relationships between all three nodes.
// Each piece is merged on its own so that re-running the script
// cannot create a second Route when only part of the pattern exists.
MATCH (p:City {code: 'PIT'})
MATCH (t:City {code: 'TOR'})
MERGE (x:Route {code: 'PIT_TOR_2'})
MERGE (p)-[:CONNECTED_BY]->(x)
MERGE (t)-[:CONNECTED_BY]->(x)
RETURN p, x, t;
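With those nodes in place, a quick way to sanity-check the shape of the model is to walk from a city, through its routes, to the cities on the other end. This is only a sketch against the labels, relationship type, and `code` values created above:

```cypher
// From Pittsburgh, hop onto each Route, then off to the city at the other end.
MATCH (a:City {code: 'PIT'})-[:CONNECTED_BY]->(r:Route)<-[:CONNECTED_BY]-(b:City)
RETURN a.name AS origin, r.code AS route, b.name AS destination;
```

Every city-to-city hop takes two relationships in this model, which is the price of promoting routes to nodes.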
At this point, I ran into the first snag in my model, which I’ll resolve next time. I had planned on having each Route node contain an array of subroute objects, but when I tried to add that property, Neo4j was kind enough to remind me that properties cannot be arrays of objects. With that in mind, I’ll have to find another way to account for double routes. But with those few nodes in place, I’ll close out this post with a screenshot of my graph database so far. A lot of work to do, but a decent start.


