Message passing, graph embeddings, and link prediction

Graph neural networks

Graph neural networks are machine-learning models built for data made of nodes and edges. They learn from both the attributes of each item and the pattern of relationships around it, which makes them useful for recommendations, molecules, knowledge graphs, fraud detection, traffic systems, and other connected data.

Input shape

Nodes, edges, and optional node, edge, or graph-level features

Core idea

Update each node from information aggregated across its neighborhood

Common outputs

Labels for nodes, scores for edges, or predictions for whole graphs

Figure: A simple graph diagram with nodes and edges, used to illustrate how graph structure can be fed into a neural model. Graph neural networks map node and edge structure into embeddings to support graph-level predictions. Image source: Wikimedia Commons.

What graph neural networks are

A graph neural network, or GNN, is a neural model that works directly with graph-structured data. A graph represents items as nodes and relationships as edges: people connected by friendships, atoms connected by bonds, pages connected by links, products connected by purchases, or roads connected by intersections. The key difference from a plain table or image model is that the connections are part of the input. A GNN does not only ask what a node looks like by itself. It also asks what surrounds that node, how information might travel through nearby edges, and what larger pattern the node sits inside.

Why graphs need special handling

Graphs are awkward for ordinary neural networks because they do not have a fixed rectangular shape. Two graphs can have different numbers of nodes, each node can have a different number of neighbors, and the same graph can be written with many different node orderings. A useful graph model should give the same answer even if the node IDs are renamed. It also needs to avoid wasting work on a giant, mostly empty adjacency matrix when the real graph has far fewer edges than possible node pairs. GNNs address this by learning operations over neighborhoods rather than relying on one fixed input layout.
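The two requirements above, storage that grows with the number of edges and answers that survive renaming, can be illustrated with a small sketch. It is not a GNN, only a neighbor-sum over a toy graph; the function names, node IDs, and feature values are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Store an undirected graph as neighbor lists (two entries per edge)."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def neighbor_sums(edges, features):
    """For every node, sum the scalar features of its neighbors."""
    adj = build_adjacency(edges)
    return {node: sum(features[n] for n in adj[node]) for node in features}


# Hypothetical toy graph: a path a - b - c with one scalar feature per node.
edges = [("a", "b"), ("b", "c")]
features = {"a": 1.0, "b": 2.0, "c": 4.0}
original = neighbor_sums(edges, features)

# Rename every node ID, flip each edge, and list the edges in reverse order.
rename = {"a": "x", "b": "y", "c": "z"}
renamed_edges = [(rename[v], rename[u]) for u, v in reversed(edges)]
renamed_features = {rename[k]: value for k, value in features.items()}
renamed = neighbor_sums(renamed_edges, renamed_features)

# Each node receives the same value under its new name.
same = all(original[k] == renamed[rename[k]] for k in features)
```

Because a sum does not depend on the order in which neighbors are visited, the result follows each node to its new ID. Neighbor lists also avoid allocating a slot for every possible node pair.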

Message passing and embeddings

Many GNNs are built around message passing. In one layer, each node gathers messages from its neighbors, combines them with an aggregation rule such as sum, mean, max, or attention, and then updates its own representation with a learned function. After one layer, a node representation contains information from immediate neighbors. After two layers, it can include information from neighbors of neighbors. The learned vector that results is called an embedding, and it can represent a node, an edge, or an entire graph depending on the task.
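A minimal sketch of one message-passing layer with mean aggregation may make this concrete. It assumes a tiny path graph, two-dimensional features, and fixed identity matrices standing in for weights that a real model would learn; all names here are hypothetical.

```python
def mean_aggregate(vectors, dim):
    """Average equal-length vectors; return zeros when there are no neighbors."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def message_passing_layer(adj, h, w_self, w_neigh):
    """One layer: h_v' = ReLU(W_self h_v + W_neigh mean(h_u for u in N(v)))."""
    dim = len(next(iter(h.values())))
    new_h = {}
    for v, h_v in h.items():
        agg = mean_aggregate([h[u] for u in adj.get(v, [])], dim)
        combined = [s + n for s, n in zip(matvec(w_self, h_v), matvec(w_neigh, agg))]
        new_h[v] = [max(0.0, x) for x in combined]
    return new_h


# Path graph 0 - 1 - 2. Only node 0 starts with a nonzero feature.
adj = {0: [1], 1: [0, 2], 2: [1]}
h0 = {0: [1.0, 0.0], 1: [0.0, 0.0], 2: [0.0, 0.0]}

# Assumption: identity weights, so the effect of aggregation is easy to read.
identity = [[1.0, 0.0], [0.0, 1.0]]
h1 = message_passing_layer(adj, h0, identity, identity)
h2 = message_passing_layer(adj, h1, identity, identity)
```

After the first layer, node 2 is still all zeros because node 0 is two hops away. After the second layer, node 2 carries a nonzero value that originated at node 0, which is the neighbors-of-neighbors effect described above.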

What GNNs predict

GNNs are often grouped by the level of prediction. A node-level model might classify papers in a citation network, accounts in a fraud graph, or proteins in an interaction network. An edge-level model might predict whether two users, products, drugs, or documents should be connected. A graph-level model might predict whether a molecule has a property of interest or whether a whole network belongs to a category. The same basic machinery can support several tasks. What changes is the readout step: the model may keep one embedding per node, compare two node embeddings for a link score, or pool many node embeddings into one graph representation.
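The three readout styles can be sketched over the same set of node embeddings. The embeddings, weights, and function names below are hypothetical; a dot product and mean pooling are common simple choices, not the only ones.

```python
import math


def node_readout(embedding, weights):
    """Node-level: logistic score computed from one node's own embedding."""
    z = sum(w * x for w, x in zip(weights, embedding))
    return 1.0 / (1.0 + math.exp(-z))


def link_score(emb_u, emb_v):
    """Edge-level: compare two node embeddings with a dot product."""
    return sum(a * b for a, b in zip(emb_u, emb_v))


def graph_readout(embeddings):
    """Graph-level: mean-pool all node embeddings into one vector."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]


# Assumed embeddings, as if produced by earlier message-passing layers.
emb = {"u": [1.0, 2.0], "v": [3.0, 0.0], "w": [-1.0, 1.0]}

node_prob = node_readout(emb["u"], [0.0, 0.0])  # zero weights give 0.5
score_uv = link_score(emb["u"], emb["v"])       # 1*3 + 2*0 = 3.0
pooled = graph_readout(list(emb.values()))      # [1.0, 1.0]
```

Only the last step differs between the three tasks; the embeddings feeding it could come from the same stack of layers.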

Architecture choices

Different GNN families make different tradeoffs. Graph convolutional networks use efficient neighborhood aggregation inspired by convolution. GraphSAGE samples and aggregates local neighborhoods so a model can generalize to nodes or graphs not seen during training. Graph attention networks learn weights for neighbors instead of treating all neighbors the same. Real systems may also need heterogeneous graphs with multiple node and edge types, temporal graphs whose edges change over time, or geometric graph models that respect distances and 3D structure. The right design depends on the graph, the labels, the compute budget, and how much the model must generalize beyond the training graph.
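To show what "learning weights for neighbors" changes relative to a plain mean, here is a simplified sketch of attention-style aggregation. It scores each neighbor with a plain dot product, which is an assumption made for illustration; graph attention networks learn their scoring function. All names and values are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(h_v, neighbor_vectors):
    """Weight each neighbor by softmax(dot(h_v, h_u)), then take the weighted sum."""
    scores = [sum(a * b for a, b in zip(h_v, h_u)) for h_u in neighbor_vectors]
    weights = softmax(scores)
    dim = len(h_v)
    aggregated = [
        sum(w * h_u[i] for w, h_u in zip(weights, neighbor_vectors))
        for i in range(dim)
    ]
    return weights, aggregated


# The first neighbor points in the same direction as h_v, the second does not.
h_v = [1.0, 0.0]
neighbors = [[2.0, 0.0], [0.0, 2.0]]
weights, aggregated = attention_aggregate(h_v, neighbors)
```

A mean would give both neighbors a weight of 0.5. Here the neighbor that aligns with the center node receives the larger weight, so it contributes more to the aggregated vector.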

Where they help

GNNs are useful when relationships carry signal that would be lost in a flat table. Recommendation systems can use user-item graphs, fraud teams can look for suspicious transaction neighborhoods, search and knowledge systems can model entities and relations, and chemists can represent molecules as atoms and bonds. They are not magic relationship detectors. A GNN can only learn from the graph it is given, and that graph reflects measurement choices, missing data, sampling rules, and sometimes human bias. In practice, graph construction is often as important as model selection.

Limits and evaluation

GNNs can struggle when graphs are noisy, labels are sparse, or high-degree nodes dominate the neighborhood signal. Deeper models can also run into oversmoothing, where node embeddings become too similar, or oversquashing, where too much distant information is compressed into too little representational space. Evaluation needs care. Random train-test splits can leak information in connected data, because test nodes and edges often sit next to training ones, while time-based splits may reveal that a model performs worse on future graph states. For sensitive uses such as credit, security, hiring, health, or policing, teams also need privacy review, bias checks, uncertainty estimates, and explanations that domain experts can inspect.
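Oversmoothing can be illustrated with a toy experiment: apply mean aggregation repeatedly, with no learned weights or nonlinearity, and watch the node values converge. This is a deliberately stripped-down sketch, not a trained GNN; the graph, starting values, and function names are hypothetical.

```python
def smooth_once(adj, values):
    """Replace each node's value with the mean over itself and its neighbors."""
    return {
        v: (values[v] + sum(values[u] for u in adj[v])) / (1 + len(adj[v]))
        for v in adj
    }


def spread(values):
    """Gap between the largest and smallest node value."""
    return max(values.values()) - min(values.values())


# Path graph 0 - 1 - 2 - 3 with clearly different starting values at the ends.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = {0: 1.0, 1: 0.0, 2: 0.0, 3: -1.0}

spreads = [spread(values)]
for _ in range(20):
    values = smooth_once(adj, values)
    spreads.append(spread(values))
```

The spread starts at 2.0 and shrinks toward zero as rounds accumulate, so the nodes become hard to tell apart. Real architectures include learned transformations, and some add skip connections or normalization, so the effect is usually less stark than in this sketch.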

Why it matters

Graph neural networks make relationships a first-class part of machine learning. That matters because many important systems are not just lists of independent examples; they are networks of interactions, dependencies, paths, groups, and flows. Used carefully, GNNs can improve recommendations, support scientific discovery, flag unusual behavior, and connect information across knowledge graphs. Their value comes from matching the model to the real structure of the problem, not from adding a graph label to data that does not need one.

Key concepts

Graph: a collection of nodes connected by edges, often with features on nodes, edges, or the whole graph.
Message passing: a repeated process where nodes receive, aggregate, and update information from their neighbors.
Embedding: a learned vector representation of a node, edge, or graph.
Permutation invariance: the model's predictions should not change just because node IDs are reordered.
Readout: the final step that turns node or edge representations into a prediction.

Model families

Graph convolutional networks aggregate local neighborhoods with efficient graph-aware operations.
GraphSAGE learns aggregation functions that can produce embeddings for previously unseen nodes.
Graph attention networks learn which neighboring nodes deserve more weight during aggregation.
Heterogeneous and temporal GNNs handle multiple relation types or changing graph structure.

Common misconceptions

A GNN is not just a normal neural network with an adjacency matrix attached.
More layers are not automatically better; depth can make node representations blur together.
A graph model does not prove causation simply because it uses relationships.
Good benchmark scores do not guarantee performance on future graph states or messy production data.

Open questions

How can GNNs scale to massive, fast-changing graphs without losing important structure?
Which architectures best handle long-range dependencies without oversquashing information?
How can graph models provide explanations that are useful to domain experts?
What privacy methods work best when relationships themselves may reveal sensitive information?