A graph is the most natural representation of connected information. In a graph, there are nodes (or “entities”) which are connected by edges (or “relationships”). For example, take a moment to imagine what your Facebook network looks like: there’s you in the middle, connected to all of your friends, who are, in turn, connected to each other in various combinations. Representing these friendships in a table is an arbitrary (and detrimental) way to express this information. Friendships don’t look like a bunch of rows in a table; they look like this:
Predicting Node Properties
Now that we have your Facebook friends represented as a graph, how do we go about predicting what your friends are interested in? We use graph convolutional networks (GCNs). A GCN is a relatively new deep learning architecture which leverages both the information contained in the data and the information contained in the relationships between data points.
By way of example, let’s say we want to predict which of your friends are Republicans and which are Democrats. We’d apply a GCN to your Facebook network above to assign the predicted node property, “political stance,” to the nodes in the graph whose labels are missing. The GCN takes two inputs. First, it takes a list of the “features” of each of your friends, which could include things like their alma mater, their zip code, the groups they’re a member of, etc. This information is useful for learning a person’s political leaning. Second, the GCN takes a condensed form of the graph’s structure. This helps the GCN learn how the friends’ friendships influence their political stance.
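To make the two inputs concrete, here is a minimal sketch in plain Python. The names, the three binary features, and the friendships are all made up for illustration, and the adjacency matrix is only one assumed choice for the “condensed form” of the graph’s structure.

```python
# Hypothetical toy data: four friends with three binary features each
# (say: attended a given college, lives in a given zip code, member of a given group).
friends = ["ana", "ben", "cy", "dee"]
features = [
    [1, 0, 1],  # ana
    [1, 1, 0],  # ben
    [0, 1, 0],  # cy
    [0, 0, 1],  # dee
]

# Hypothetical friendships, stored as an edge list.
friendships = [("ana", "ben"), ("ana", "cy"), ("cy", "dee")]


def adjacency_matrix(nodes, edges):
    """Return a symmetric 0/1 matrix with a 1 wherever two nodes share an edge."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # friendship is undirected
    return matrix


# Input 1 is `features` (one row per friend); input 2 is `adjacency`.
adjacency = adjacency_matrix(friends, friendships)
```

Row i of `features` and row i of `adjacency` describe the same friend, so the two inputs always have the same number of rows.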
Arguably, either of these pieces of information alone is enough to make predictions about a person’s political leaning. Traditionally, deep learning systems do not even make use of the relationships between entities to make predictions; they only use the properties of individuals. But by incorporating both the properties and the structure of the data in a GCN, we build an incredibly powerful predictive tool.
Applying GCNs is pretty simple. They operate in layers, which you can stack together to be as deep as you want. Inside each layer, three things happen. First, the structure of the graph is normalized. Second, the normalized graph structure is multiplied by the node properties. Finally, we multiply the result by the layer’s weights and apply a nonlinear activation function:
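The three steps can be sketched in a few lines of dependency-free Python. This is an illustration under stated assumptions rather than the exact layer used here: the text does not say which normalization is applied, so the sketch assumes the common choice of adding self-loops and scaling each entry by the inverse square root of the two nodes’ degrees, and it assumes ReLU as the nonlinearity. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize(adjacency):
    """Assumed normalization: add self-loops, then scale entry (i, j)
    by 1 / sqrt(degree_i * degree_j)."""
    n = len(adjacency)
    with_loops = [
        [adjacency[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)
    ]
    degree = [sum(row) for row in with_loops]  # always >= 1 thanks to self-loops
    return [
        [with_loops[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def relu(matrix):
    """Element-wise max(0, x)."""
    return [[max(0.0, v) for v in row] for row in matrix]


def gcn_layer(adjacency, features, weights, activation=relu):
    a_hat = normalize(adjacency)                    # step 1: normalize the structure
    aggregated = matmul(a_hat, features)            # step 2: mix in neighbours' properties
    return activation(matmul(aggregated, weights))  # step 3: weights, then nonlinearity


# Two connected nodes, one feature each, and a 1x1 weight matrix.
adjacency = [[0, 1], [1, 0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
output = gcn_layer(adjacency, features, weights)  # [[2.0], [2.0]]
```

In the tiny example, each node ends up with the average of its own feature and its neighbour’s, which is the sense in which a layer lets friends’ properties inform each other.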
In practice, we add a dropout layer and use leaky ReLUs with a softmax output activation function to build something which looks like this:
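A forward pass through such a stack might look like the following sketch. The depth (two layers), the position of the dropout layer, the leaky ReLU slope of 0.2, the dropout rate, and the normalization are all assumptions made for illustration; the weights are random rather than trained, and every name is hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize(adjacency):
    """Assumed normalization: self-loops plus 1 / sqrt(degree_i * degree_j) scaling."""
    n = len(adjacency)
    loops = [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in loops]
    return [[loops[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)] for i in range(n)]


def leaky_relu(matrix, slope=0.2):
    """Pass positive values through; scale negative values by `slope`."""
    return [[v if v > 0 else slope * v for v in row] for row in matrix]


def softmax_rows(matrix):
    """Turn each row of scores into probabilities that sum to 1."""
    result = []
    for row in matrix:
        peak = max(row)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def dropout(matrix, rate, rng):
    """Zero each value with probability `rate`; rescale survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [[v / keep if rng.random() < keep else 0.0 for v in row] for row in matrix]


def gcn_forward(adjacency, features, w_hidden, w_out, rate=0.5, rng=None):
    """Two stacked GCN layers. Dropout is applied only when `rng` is given (training)."""
    a_hat = normalize(adjacency)
    hidden = leaky_relu(matmul(matmul(a_hat, features), w_hidden))
    if rng is not None:
        hidden = dropout(hidden, rate, rng)
    logits = matmul(matmul(a_hat, hidden), w_out)
    return softmax_rows(logits)  # one probability per class, per node


# Hypothetical toy graph: 4 nodes, 3 features, 4 hidden units, 2 classes.
seed = random.Random(0)
adjacency = [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 1], [0, 0, 1, 0]]
features = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
w_hidden = [[seed.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
w_out = [[seed.uniform(-1, 1) for _ in range(2)] for _ in range(4)]
probabilities = gcn_forward(adjacency, features, w_hidden, w_out)
```

Each row of `probabilities` holds one node’s predicted class probabilities (for instance, the two political leanings). Passing `rng=random.Random(1)` switches dropout on, as it would be during training.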
I want to highlight the simplicity of this neural network design by including some TensorFlow code which implements a GCN layer. I’ve annotated it for readers unfamiliar with TensorFlow:
The cartoon example above is too simple to demonstrate the complexity of this problem. Below is a real Facebook friendship network, colored by political leaning.
Remember that the GCN doesn’t care what the input data is for each node, or what the relationships between nodes look like. We could use anything! Using information about the financial transactions between businesses and the financial standing of the businesses themselves, we can use a GCN to tell us which transactions are fraudulent. Even more powerfully, by building a supernode graph of businesses, we can use GCNs to tell us where analysts should look for money laundering rings.
We can also apply GCNs to banks’ customer data to predict which customers should not be approved for loans, based on customer characteristics and risk thresholds. Again, the power of using between-customer relationships is that we can leverage the structure of historical bank data, not just the properties of each customer.
I’ve argued that leveraging the natural structure of data improves the predictive power of machine learning analysis, so it’s important to recognize the ubiquity of graphs. I’d be willing to bet your organization works every day with data that is perfect for a graph structure. In any situation in which pieces of information are connected together, there’s a graph problem. And anywhere there’s a graph, there’s a beautiful, powerful analytical insight waiting to be discovered.
Want to chat about your use case? Send Graham an email!