Python · SQL · Web Dev · Java · AI/ML tracks launching soon — your one platform for all of IT
DSAAdvanced+200 XP
Unit 15 — Graphs
Nodes connected by edges in any direction. Maps, social networks, flight routes, dependency graphs — every complex relationship is a graph. BFS, DFS, Dijkstra's, topological sort — all from scratch.
UNIT 15Prerequisite: Units 05 + 07 + 14120 min read
Every data structure we have covered — arrays, linked lists, trees — has been a special case of a graph. A linked list is a graph where each node connects to exactly one other. A tree is a graph with no cycles and one root. A graph removes all these restrictions. Any node can connect to any other node. Connections can go in both directions or just one. There can be cycles. This generality makes graphs the most powerful and most widely used data structure in computer science.
In this unit we build graphs from scratch, implement BFS and DFS, find shortest paths with Dijkstra's algorithm, detect cycles, and sort dependencies with topological sort — all in C.
// Section 1
What is a Graph?
A graph is a collection of vertices (also called nodes) connected by edges (also called links or connections). That is all. No root, no parent-child, no left-right. Just nodes and connections.
🗺️
Maps
Cities are nodes. Roads are edges. Distance is the edge weight.
👥
Social networks
People are nodes. Friendships are edges. LinkedIn, Instagram, Facebook.
✈️
Flight routes
Airports are nodes. Direct flights are edges. Cost is the weight.
🌐
The internet
Web pages are nodes. Hyperlinks are directed edges. Google PageRank uses this.
📦
Build systems
Tasks are nodes. Dependencies are edges. Build A before B before C.
🧬
Neural networks
Neurons are nodes. Synaptic connections are weighted edges.
// Section 2
Types of Graphs — Know These Cold
Undirected Graph
Edges have no direction — if A connects to B then B also connects to A. Friendships on Facebook. Roads you can drive both ways.
A — B — C
Directed Graph (Digraph)
Edges have a direction — A → B does NOT mean B → A. Twitter follows. One-way streets. Web page links.
A → B → C
Weighted Graph
Each edge has a number (weight) — typically representing distance, cost, or time. Used in shortest path algorithms.
A —5— B —3— C
Cyclic Graph
Contains at least one cycle — a path that starts and ends at the same node. Most real-world graphs have cycles.
A → B → C → A
Acyclic Graph (DAG)
No cycles. Directed Acyclic Graph. Used for dependency resolution, build systems, course prerequisites.
A → B → C (no way back)
Connected Graph
Every node can be reached from every other node. If any node is isolated — the graph is disconnected.
All nodes reachable
// Section 3
Graph Representation — Adjacency Matrix vs Adjacency List
How do you store a graph in memory? There are two standard ways. Each has different tradeoffs for space and time. Choosing the right one depends on whether your graph is dense (many edges) or sparse (few edges).
// Example undirected graph — 5 nodes, 6 edges
Edges: 0-1, 0-2, 0-3, 1-3, 1-4, 2-3
Method 1 — Adjacency Matrix
A 2D array where matrix[i][j] = 1 means there is an edge from i to j. For weighted graphs, store the weight instead of 1.
// Adjacency matrix for our 5-node graph
0
1
2
3
4
0
0
1
1
1
0
1
1
0
0
1
1
2
1
0
0
1
0
3
1
1
1
0
0
4
0
1
0
0
0
Space: O(V²)Edge lookup: O(1)List all neighbours: O(V)
Method 2 — Adjacency List
An array of linked lists. Each index holds the list of neighbours for that node. Only actual edges are stored — much more space-efficient for sparse graphs. This is what almost every real graph algorithm uses.
// Adjacency list for same graph
0
1→2→3→ NULL
1
0→3→4→ NULL
2
0→3→ NULL
3
0→1→2→ NULL
4
1→ NULL
Space: O(V + E)Edge lookup: O(degree)List all neighbours: O(degree)
Use Matrix when
Graph is dense (many edges). Need O(1) edge lookup. V is small (fewer than ~1000).
Use List when
Graph is sparse (few edges). Memory matters. V is large. Almost every real-world case.
c
#include <stdio.h>
#include <stdlib.h>
#define V 5 /* number of vertices */
/* node in the adjacency list */
struct AdjNode {
int dest;
int weight; /* 0 if unweighted */
struct AdjNode *next;
};
typedef struct AdjNode AdjNode;
/* graph: array of adjacency list heads */
AdjNode *graph[V];
void initGraph() {
int i;
for (i = 0; i < V; i++) graph[i] = NULL;
}
/* add an undirected edge between u and v */
void addEdge(int u, int v, int weight) {
/* add v to u's list */
AdjNode *newU = malloc(sizeof(AdjNode));
newU->dest = v; newU->weight = weight;
newU->next = graph[u]; graph[u] = newU;
/* add u to v's list (undirected — both directions) */
AdjNode *newV = malloc(sizeof(AdjNode));
newV->dest = u; newV->weight = weight;
newV->next = graph[v]; graph[v] = newV;
}
void printGraph() {
int i;
for (i = 0; i < V; i++) {
printf("%d: ", i);
AdjNode *current = graph[i];
while (current) {
printf("%d ", current->dest);
current = current->next;
}
printf("\n");
}
}
int main() {
initGraph();
addEdge(0, 1, 0);
addEdge(0, 2, 0);
addEdge(0, 3, 0);
addEdge(1, 3, 0);
addEdge(1, 4, 0);
addEdge(2, 3, 0);
printGraph();
return 0;
}
// Section 4
BFS — Breadth First Search
BFS explores a graph level by level — all neighbours at distance 1 first, then distance 2, then distance 3, and so on. It uses a queue — exactly like level order traversal of a tree from Unit 11. BFS finds the shortest path in an unweighted graph — the first time it reaches a node, it has found the fewest-step route.
DFS goes as deep as possible before backtracking. Start at a node, pick any unvisited neighbour and go there, then repeat. When you reach a dead end (all neighbours visited), backtrack and try another path. DFS uses the call stack naturally through recursion — or an explicit stack if iterative. DFS is used for cycle detection, topological sort, maze solving, and finding connected components.
// DFS from node 0 — go deep first
visit(0) → neighbours: 1, 2, 3 → go to 1 visit(1) → neighbours: 0✓, 3, 4 → go to 3 visit(3) → neighbours: 0✓, 1✓, 2 → go to 2 visit(2) → neighbours: 0✓, 3✓ → backtrack → backtrack from 3 → go to 4 visit(4) → neighbours: 1✓ → backtrack DFS order: 0 → 1 → 3 → 2 → 4
c
int visitedDFS[V];
void dfsRecursive(int node) {
visitedDFS[node] = 1;
printf("%d ", node);
AdjNode *curr = graph[node];
while (curr != NULL) {
if (!visitedDFS[curr->dest]) {
dfsRecursive(curr->dest); /* go deep */
}
curr = curr->next;
}
/* when all neighbours visited — backtrack (return) */
}
void dfs(int start) {
int i;
for (i = 0; i < V; i++) visitedDFS[i] = 0;
printf("DFS from %d: ", start);
dfsRecursive(start);
printf("\n");
}
/* DFS on ALL nodes — handles disconnected graphs */
void dfsAll() {
int i;
for (i = 0; i < V; i++) visitedDFS[i] = 0;
printf("DFS (all): ");
for (i = 0; i < V; i++) {
if (!visitedDFS[i]) {
dfsRecursive(i); /* start new DFS from unvisited node */
}
}
printf("\n");
}
int main() {
initGraph();
addEdge(0,1,0); addEdge(0,2,0); addEdge(0,3,0);
addEdge(1,3,0); addEdge(1,4,0); addEdge(2,3,0);
dfs(0); /* DFS from 0: 0 1 3 2 4 */
return 0;
}
BFS — use when
Finding shortest path in unweighted graph. Level-order exploration. Minimum steps between nodes.
DFS — use when
Detecting cycles. Topological sort. Finding connected components. Exploring all possible paths (backtracking).
// Section 6
Cycle Detection — Is There a Loop?
Detecting whether a graph has a cycle is a classic problem. The approach is different for undirected and directed graphs. For an undirected graph: during DFS, if you visit a neighbour that is already visited AND it is not the parent you just came from — you have found a cycle.
c
int hasCycleHelper(int node, int parent, int visited[]) {
visited[node] = 1;
AdjNode *curr = graph[node];
while (curr != NULL) {
if (!visited[curr->dest]) {
/* unvisited — continue DFS */
if (hasCycleHelper(curr->dest, node, visited)) return 1;
} else if (curr->dest != parent) {
/* visited AND not the parent → back edge → CYCLE */
Dijkstra's Algorithm — Shortest Path in Weighted Graph
BFS finds the shortest path by number of hops in an unweighted graph. But what if edges have different costs — distances, travel times, prices? Dijkstra's algorithm finds the shortest path from one source node to all other nodes in a weighted graph with non-negative edge weights.
// Weighted graph — find shortest path from 0 to all nodes
Dijkstra's idea — always process the closest unvisited node:
1
Set distance to source = 0, all others = ∞
2
Pick the unvisited node with smallest distance (greedy choice)
3
For each neighbour: if dist[current] + edge weight < dist[neighbour], update dist[neighbour]
4
Mark current node as visited. Repeat until all visited.
// Dijkstra trace from node 0
Step
Current
dist[0]
dist[1]
dist[2]
dist[3]
Init
—
0
∞
∞
∞
1
0
0
4
2
∞
2
2
0
3
2
5
3
1
0
3
2
5
4
3
0
3
2
5
Final: 0→0=0, 0→1=3, 0→2=2, 0→3=5
c
#include <stdio.h>
#include <limits.h>
#define V 4
#define INF INT_MAX
/* weighted adjacency matrix for Dijkstra */
int wGraph[V][V] = {
{0, 4, 2, 0}, /* 0→1: weight 4, 0→2: weight 2 */
{4, 0, 1, 5}, /* 1→0: 4, 1→2: 1, 1→3: 5 */
{2, 1, 0, 3}, /* 2→0: 2, 2→1: 1, 2→3: 3 */
{0, 5, 3, 0}, /* 3→1: 5, 3→2: 3 */
};
/* find the unvisited node with minimum distance */
int minDistance(int dist[], int visited[]) {
int min = INF, minIdx = -1, i;
for (i = 0; i < V; i++) {
if (!visited[i] && dist[i] <= min) {
min = dist[i]; minIdx = i;
}
}
return minIdx;
}
void dijkstra(int src) {
int dist[V], visited[V], i, count;
for (i = 0; i < V; i++) { dist[i] = INF; visited[i] = 0; }
dist[src] = 0;
for (count = 0; count < V - 1; count++) {
int u = minDistance(dist, visited); /* pick closest unvisited */
visited[u] = 1;
/* relax all neighbours of u */
for (int v = 0; v < V; v++) {
if (!visited[v] && wGraph[u][v] && dist[u] != INF
&& dist[u] + wGraph[u][v] < dist[v]) {
dist[v] = dist[u] + wGraph[u][v]; /* shorter path found */
}
}
}
printf("Shortest distances from node %d:\n", src);
for (i = 0; i < V; i++) {
printf(" to %d: %d\n", i, dist[i]);
}
}
int main() {
dijkstra(0);
/* Output:
to 0: 0
to 1: 3 (0→2→1, cost 2+1=3)
to 2: 2 (0→2, cost 2)
to 3: 5 (0→2→3, cost 2+3=5)
*/
return 0;
}
Time (matrix):O(V²)
Time (with min-heap):O((V+E) log V)
⚠️ Important
Dijkstra does not work with negative edge weights. If any edge weight is negative, use the Bellman-Ford algorithm instead. Negative weights can trick Dijkstra into missing shorter paths it already marked as "visited."
// Section 8
Topological Sort — Ordering Dependencies
Topological sort orders the nodes of a Directed Acyclic Graph (DAG)such that for every directed edge u → v, node u comes before v in the ordering. It is the algorithm that figures out the correct order to do things when some tasks depend on others.
Real examples of topological sort:
📚
Course prerequisites
Must take DSA before System Design. Must take OS before Compilers. Topo sort gives a valid study order.
🔧
Build systems (make, npm)
Compile module A before B because B depends on A. npm install resolves package dependencies using topo sort.
🧾
Task scheduling
Task C requires A and B to be done first. Topo sort gives the valid execution order.
The DFS-based approach: run DFS on every node. When you finish processing a node (all its descendants are done), push it onto a stack. When all nodes are processed, pop the stack — that is the topological order.
c
#include <stdio.h>
#include <stdlib.h>
#define VDAG 6
/* directed graph for topological sort */
/* Represents: 5→2, 5→0, 4→0, 4→1, 2→3, 3→1 */
int dagAdj[VDAG][VDAG] = {
{0,0,0,0,0,0}, /* node 0 — no outgoing edges */
{0,0,0,0,0,0}, /* node 1 — no outgoing edges */
{0,0,0,1,0,0}, /* node 2 → 3 */
{0,1,0,0,0,0}, /* node 3 → 1 */
{1,1,0,0,0,0}, /* node 4 → 0, 4 → 1 */
{1,0,1,0,0,0}, /* node 5 → 0, 5 → 2 */
};
int topoStack[VDAG];
int topoTop = -1;
int topoVisited[VDAG];
void topoSort(int node) {
topoVisited[node] = 1;
int i;
for (i = 0; i < VDAG; i++) {
if (dagAdj[node][i] && !topoVisited[i]) {
topoSort(i); /* go deeper first */
}
}
/* all descendants done — push this node */
topoStack[++topoTop] = node;
}
int main() {
int i;
for (i = 0; i < VDAG; i++) topoVisited[i] = 0;
/* run DFS from every unvisited node */
for (i = 0; i < VDAG; i++) {
if (!topoVisited[i]) topoSort(i);
}
printf("Topological order: ");
while (topoTop >= 0) {
printf("%d ", topoStack[topoTop--]);
}
printf("\n");
/* Output: 5 4 2 3 1 0 — one valid topological order */
return 0;
}
💡 Note
Topological sort only works on DAGs. If the graph has a cycle, there is no valid topological order — a cycle means A depends on B which depends on A which is impossible. If you try to topo-sort a graph with a cycle (like circular npm dependencies), the algorithm detects the cycle instead of producing an order.
// Section 9
Graph Algorithms — Side by Side
Algorithm
Data structure
Time
What it solves
BFS
Queue
O(V + E)
Shortest path (unweighted), level traversal
DFS
Stack / recursion
O(V + E)
Cycle detection, topological sort, connectivity
Dijkstra's
Min-heap + visited
O((V+E) log V)
Shortest path (weighted, non-negative)
Topological Sort
Stack + DFS
O(V + E)
Dependency ordering on DAGs
// Errors You Will Hit
Common Graph Mistakes
⚠ Forgetting to mark nodes as visited in BFS/DFS
Symptom: Infinite loop — the algorithm visits the same node repeatedly and never terminates
Fix: Always initialise a visited[] array before BFS or DFS and mark visited[node] = 1 immediately when you enqueue (BFS) or enter (DFS) a node.
⚠ Using Dijkstra with negative edge weights
Symptom: Incorrect shortest paths — some routes are missed because their source was already marked visited
Fix: Dijkstra requires all edge weights to be non-negative. For negative weights, use Bellman-Ford algorithm instead.
⚠ Running topological sort on a cyclic graph
Symptom: Produces an incomplete or incorrect ordering — not all nodes appear
Fix: Check for cycles first (using DFS cycle detection). Topological sort is only valid on DAGs.
// What's Next
You Are Ready for Unit 16
You now understand graphs completely — types, representations, BFS, DFS, cycle detection, Dijkstra's shortest path, and topological sort. These algorithms appear in every serious technical interview and power every map application, social network, and build system on earth.
In Unit 16 we cover Dynamic Programming — the hardest and most rewarding topic in all of DSA. Remember what you already computed so you never compute it twice. We go from naive recursion to memoization to tabulation, solving the knapsack, LCS, coin change, and edit distance problems from first principles.
UP NEXT → UNIT 16
Dynamic Programming — Remember, Don't Recompute
Memoization, tabulation, knapsack, LCS, coin change, edit distance — in C.