# Node Classification by Graph Convolutional Network

By leveraging the natural structure of data, graph convolutional networks can produce more accurate predictions of node properties.


A graph is a natural representation of connected information. In a graph, there are nodes (or “entities”) which are connected by edges (or “relationships”). For example, take a moment to imagine what your Facebook network looks like: there’s you in the middle, connected to all of your friends, who are, in turn, connected to each other in various combinations. Representing these friendships in a table is an arbitrary (and detrimental) way to express this information. Friendships don’t look like a bunch of rows in a table; they look like a web of people joined by connections.

# Predicting Node Properties

Now that we have your Facebook friends represented as a graph, how do we go about predicting what your friends are interested in? We use graph convolutional networks (GCNs). GCNs are a relatively new deep learning architecture that leverages both the information contained in the data and the information contained in the relationships between data points.

By way of example, let’s say we want to predict which of your friends are Republicans and which are Democrats. We’d apply a GCN to your Facebook network to assign the predicted node property, “political stance,” to the nodes with missing labels in the graph. The GCN takes two inputs. First, it takes a list of the “features” of each of your friends, which could include things like their alma mater, their zip code, and the groups they’re a member of. This information is plausibly informative about a person’s political leaning. Second, the GCN takes a condensed form of the graph’s structure, typically an adjacency matrix. This helps the GCN learn how the friends’ friendships influence their political stance.

Arguably, either of these pieces of information alone is enough to make predictions about a person’s political leaning. Traditionally, deep learning systems do not use the relationships between entities to make predictions; they use only the properties of individuals. But by incorporating both the properties and the structure of the data in a GCN, we build a more powerful predictive tool.
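The two inputs described above can be sketched in a few lines of NumPy. This is a toy illustration only: the five friends, the two feature columns, the edge list, and the helper name `build_adjacency` are all hypothetical and are not taken from any real dataset.

```python
import numpy as np

# Hypothetical toy data: five friends, two binary features each.
# Assumed columns (for illustration): [attended_state_school, member_of_hiking_group]
features = np.array([
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 1.0],
    [0.0, 0.0],
    [1.0, 0.0],
])

# Friendships as an undirected edge list of (friend_i, friend_j) index pairs.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]


def build_adjacency(num_nodes, edge_list):
    """Build a dense, symmetric adjacency matrix from an undirected edge list."""
    adjacency = np.zeros((num_nodes, num_nodes))
    for i, j in edge_list:
        adjacency[i, j] = 1.0
        adjacency[j, i] = 1.0  # friendship goes both ways
    return adjacency


adjacency = build_adjacency(len(features), edges)

# Known labels: 0 or 1 for the two parties, -1 for friends whose stance is unknown.
labels = np.array([0, -1, 1, -1, -1])
```

The feature matrix has one row per friend, and the adjacency matrix has a 1 wherever two friends are connected. The `-1` entries mark the nodes whose label the GCN would be asked to predict.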

The application of GCNs is pretty simple. They operate in layers, which you can stack together to be as deep as you want. Inside each layer, three things happen. First, the structure of the graph is normalized. Second, the normalized graph structure is multiplied by the node properties and by a matrix of learned weights. Finally, a nonlinearity function is applied to the result.
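The three steps of a layer can be sketched in NumPy as follows. The text does not specify the normalization, so this sketch assumes the commonly used symmetric normalization with self-loops, D^-1/2 (A + I) D^-1/2. The function names, the random weights, and the three-node path graph are illustrative assumptions.

```python
import numpy as np


def normalize_adjacency(adjacency):
    """Step 1: add self-loops, then scale by D^-1/2 on both sides (assumed scheme)."""
    a_tilde = adjacency + np.eye(adjacency.shape[0])
    degree = a_tilde.sum(axis=1)
    d_inv_sqrt = np.diag(degree ** -0.5)
    return d_inv_sqrt @ a_tilde @ d_inv_sqrt


def leaky_relu(x, alpha=0.2):
    """Pass positive values through and scale negative values by alpha."""
    return np.where(x > 0, x, alpha * x)


def gcn_layer(a_hat, features, weights):
    """Steps 2 and 3: multiply structure, features, and weights, then apply the nonlinearity."""
    return leaky_relu(a_hat @ features @ weights)


# Toy path graph 0 - 1 - 2 with one-hot node features.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
features = np.eye(3)

rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 2))  # untrained; would normally be learned

a_hat = normalize_adjacency(adjacency)
hidden = gcn_layer(a_hat, features, weights)  # shape (3, 2): one row per node
```

Multiplying by the normalized adjacency matrix mixes each node's features with those of its neighbors, which is how the graph structure enters the prediction.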

In application, we add a dropout layer and use leaky ReLUs in the hidden layers, with a softmax activation function on the output.
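A minimal sketch of such a stack, in NumPy, might look like the following. The number of layers, the placement of dropout between them, the layer sizes, and the dropout rate are assumptions made for illustration; the weights are random rather than trained.

```python
import numpy as np


def normalize_adjacency(adjacency):
    """Symmetric normalization with self-loops (an assumed, commonly used scheme)."""
    a_tilde = adjacency + np.eye(adjacency.shape[0])
    d_inv_sqrt = np.diag(a_tilde.sum(axis=1) ** -0.5)
    return d_inv_sqrt @ a_tilde @ d_inv_sqrt


def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)


def softmax(x):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    exp = np.exp(x - x.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


def dropout(x, rate, rng, training):
    """Inverted dropout: zero random entries while training, rescale the rest."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)


def gcn_forward(a_hat, features, w_hidden, w_out, rng, rate=0.5, training=False):
    """Hidden GCN layer (leaky ReLU) -> dropout -> output GCN layer (softmax)."""
    hidden = leaky_relu(a_hat @ features @ w_hidden)
    hidden = dropout(hidden, rate, rng, training)
    return softmax(a_hat @ hidden @ w_out)


rng = np.random.default_rng(0)
adjacency = np.array([[0, 1, 1, 0],
                      [1, 0, 1, 0],
                      [1, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
features = rng.normal(size=(4, 3))   # 4 nodes, 3 features each
w_hidden = rng.normal(size=(3, 8))   # 8 hidden units
w_out = rng.normal(size=(8, 2))      # 2 output classes

a_hat = normalize_adjacency(adjacency)
probs = gcn_forward(a_hat, features, w_hidden, w_out, rng)
predicted = probs.argmax(axis=1)     # predicted class index for each node
```

Each row of `probs` is a probability distribution over the two classes for one node. In a real model the weights would be fit by minimizing a loss on the nodes whose labels are known.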

I want to highlight the simplicity of this neural network design by including some TensorFlow code which implements a GCN layer. I’ve marked it up with annotation for those readers unfamiliar with TensorFlow:
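What follows is a minimal sketch of what such a layer might look like, assuming TensorFlow 2.x, a dense adjacency matrix, and symmetric normalization with self-loops. The names `normalize_adjacency`, `GCNLayer`, and `kernel` are illustrative and are not taken from the original listing.

```python
import tensorflow as tf


def normalize_adjacency(adjacency):
    """Return D^-1/2 (A + I) D^-1/2 for a dense float adjacency matrix."""
    num_nodes = adjacency.shape[0]
    # Add self-loops so each node keeps its own features when aggregating.
    a_tilde = adjacency + tf.eye(num_nodes, dtype=adjacency.dtype)
    # Node degrees (including the self-loop) are the row sums.
    degree = tf.reduce_sum(a_tilde, axis=1)
    d_inv_sqrt = tf.linalg.diag(tf.pow(degree, -0.5))
    return d_inv_sqrt @ a_tilde @ d_inv_sqrt


class GCNLayer(tf.Module):
    """One graph convolutional layer: leaky_relu(A_hat @ H @ W)."""

    def __init__(self, in_features, out_features, alpha=0.2, name=None):
        super().__init__(name=name)
        self.alpha = alpha
        # Trainable weight matrix W, initialized with small random values.
        self.kernel = tf.Variable(
            tf.random.normal((in_features, out_features), stddev=0.1, seed=0),
            name="kernel",
        )

    def __call__(self, a_hat, features):
        # Mix each node's features with those of its neighbors.
        aggregated = tf.matmul(a_hat, features)
        # Project the mixed features with the learned weights.
        projected = tf.matmul(aggregated, self.kernel)
        # Apply the nonlinearity.
        return tf.nn.leaky_relu(projected, alpha=self.alpha)


# Toy path graph 0 - 1 - 2 with one-hot node features.
adjacency = tf.constant([[0.0, 1.0, 0.0],
                         [1.0, 0.0, 1.0],
                         [0.0, 1.0, 0.0]])
features = tf.eye(3)

layer = GCNLayer(in_features=3, out_features=2)
hidden = layer(normalize_adjacency(adjacency), features)  # shape (3, 2)
```

Because the normalized adjacency matrix does not change during training, it can be computed once and reused by every layer in the stack.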

# Applications

The above cartoon example is too simple to demonstrate the complexity of this problem. Below, find a real Facebook friendship network, colorized by political leaning.

Remember that the GCN doesn’t care what the input data is for each node, or what the relationships between nodes look like. We could use anything! Using information about the financial transactions between businesses and the financial standing of the businesses themselves, we can use a GCN to tell us which transactions are fraudulent. Even more powerfully, by building a supernode graph of businesses, we can use GCNs to tell us where analysts should look for money laundering rings.

We can also apply GCNs to banks’ customer data to predict which customers should not be approved for loans, based on customer characteristics and risk thresholds. Again, the power of using between-customer relationships is that we can leverage the structure of historical bank data.

I’ve argued that leveraging the natural structure of data improves the predictive power of machine learning analysis, so it’s important to recognize the ubiquity of graphs. I’d be willing to bet your organization works every day on data which is perfect for a graph structure. In any situation in which pieces of information are connected together, there’s a graph problem. And anywhere there’s a graph, there’s a beautiful, powerful analytical insight waiting to be discovered.

Want to chat about your use case? Send Graham an email!

Graham Ganssle, Ph.D., P.G.

January 30, 2018
