🚀 LEVEL UP TO SENIOR:Unlock 500+ Advanced Practical Challenges & Exercises.
🎓 COURSERA PARTNER:Earn professional Google, Meta, and IBM certificates to supercharge your resume.
HTML MASTER CLASS /// LEARN TAGS /// BUILD STRUCTURE /// SEMANTIC WEB /// HTML MASTER CLASS /// LEARN TAGS ///
Total XP: 0|💻 artificialintelligence XP: 0

Graph Pooling and Hierarchical Reps in AI & Artificial Intelligence

Master the architectures used for graph-level summarization. Explore Global Readout operations (Sum, Mean, Max) and their theoretical expressivity limitations. Dive into advanced Hierarchical Pooling techniques like DiffPool and SAGPool, learning how to preserve structural motifs and community groupings while compressing network dimensionality.

LOADING ENGINE...

Skill Matrix

UNLOCK NODES BY LEARNING NEW TAGS.

Pooling Hub

Summary logic.

Quick Quiz //

What is the primary purpose of a Readout Layer in a GNN?


How do you represent a complex, million-node graph with a single vector? Pooling is the critical bridge from local node-level features to global, graph-level intelligence.

1Global Readout: The Flat Summary

For graph-level tasks — like predicting whether a molecule is toxic, or classifying a sub-reddit's community type — you eventually need to compress all node embeddings into a single fixed-size vector. This operation is called the Readout Layer or Global Pooling.

The simplest approaches are order-invariant aggregations over the entire graph: Global Mean, Global Sum, and Global Max. Each has different properties. Mean calculates the average feature distribution but is blind to graph size (a graph with 10 carbon atoms looks identical to a graph with 100 carbon atoms). Sum preserves graph volume, which is critical for chemistry tasks where atom count dictates physical properties. Max extracts the strongest isolated signals (useful for detecting a single toxic substructure). However, all global readouts are 'flat' — they instantly destroy all community and motif structures. You compress a complex topology into a single dense vector in one brutal step.

+
// Simple Global Readout
// h_nodes shape: [num_nodes, features]

function graphReadout(h_nodes, method) {
  if (method === 'sum') {
    return sumCols(h_nodes); // Volume aware
  }
  if (method === 'mean') {
    return meanCols(h_nodes); // Distribution
  }
  if (method === 'max') {
    return maxCols(h_nodes); // Strongest signal
  }
}
// Result: single [features] vector
// Topology is completely forgotten.
localhost:3000
localhost:3000/global-readout
Molecule Property Task
Mean Pooling: FAILS on mass prediction
Sum Pooling: PASSES on mass prediction
Always match pooling to the physics.

2Hierarchical Pooling (DiffPool & SAGPool)

Hierarchical Pooling attempts to mirror the natural organization of networks, much like a CNN pools pixels into edges, edges into textures, and textures into objects. A graph is pooled down progressively across multiple layers.

DiffPool (Differentiable Pooling) trains a secondary GNN to learn an assignment matrix *S*. This matrix maps the N nodes of the current graph into C clusters (where C < N). The features of the nodes in a cluster are aggregated to form a new super-node, and the edges between clusters form a new coarsened adjacency matrix. This preserves structural motifs (like functional groups in a molecule) at intermediate layers.

SAGPool (Self-Attention Graph Pooling) takes a different approach: node dropping. It uses a GNN to calculate an attention score for every node. It simply sorts the nodes by score, drops the bottom 50% (the 'noise'), and keeps the top 50% (the 'signal'), preserving the edges between the retained nodes. This is much more memory efficient than DiffPool, which requires dense N×C assignment matrices.

+
// SAGPool: Top-K Node Dropping
function sagPool(X, A, ratio = 0.5) {
  // 1. Score nodes using a GCN
  const scores = sigmoid(A_norm @ X @ W_att);
  
  // 2. Find cut-off threshold
  const k = Math.floor(X.length * ratio);
  const topKIndices = argsort(scores).slice(0, k);
  
  // 3. Drop uninformative nodes
  const X_pooled = X[topKIndices] * scores[topKIndices];
  const A_pooled = A[topKIndices][topKIndices];
  
  return { X: X_pooled, A: A_pooled };
}
localhost:3000
localhost:3000/sag-pool
SAGPool Reduction (Ratio = 0.5)
Input: 10,000 nodes, 45,000 edges
Output: 5,000 nodes, 11,200 edges
Computations halved for next layer ✓

?Frequently Asked Questions

Pascual Vila

Pascual Vila

Frontend Instructor // Code Syllabus

Lesson Glossary

[01]Global Pooling (Readout)

The final layer of a graph-level GNN that combines all node features into a single graph vector.

Code Preview
TOTAL_SUM

[02]DiffPool

Differentiable Pooling; a method that learns to cluster nodes into a hierarchy of sub-graphs.

Code Preview
LEARNED_CLUSTERS

[03]SAGPool

Self-Attention Graph Pooling; uses a GNN to score node importance and drops the low-scoring nodes.

Code Preview
SELECTIVE_DROP

[04]Assignment Matrix

A matrix learned by DiffPool that defines which nodes belong to which clusters.

Code Preview
NODE_TO_GRP

[05]Structural Motif

Recurring sub-graph patterns (like triangles or cliques) that carry functional meaning.

Code Preview
GRAPH_PATTERN

[06]Global Attention

A readout mechanism that weights the importance of nodes based on their global significance to the task.

Code Preview
WEIGHTED_READOUT

Continue Learning