Tackling the Travelling Salesman Problem - A Developers Practical Guide

The Travelling Salesman Problem (TSP) is one of the most famous, and infamous, problems in computer science and combinatorial optimization. You’ve probably heard of it, perhaps even vaguely understand its premise. But what makes it so enduringly fascinating and, more importantly, so incredibly hard?

This post will peel back the layers of the TSP, showing you not just what it is, but why it’s a computational beast, and how developers approach it in practice, from brute-force exact solutions to clever heuristics and powerful libraries.

What is the Travelling Salesman Problem?

At its core, the Travelling Salesman Problem asks:

Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?

Imagine you’re a delivery driver. You have a list of stops for the day. You need to find the most efficient route to visit all stops and get back to your depot. That’s the TSP in action.

While seemingly simple, the applications are vast:

  • Logistics and Supply Chain: Optimizing delivery routes for packages, food, or services.
  • Transportation: Planning airline routes, bus schedules.
  • Manufacturing: Optimizing drilling paths for circuit boards, robot arm movements.
  • Genomics: DNA sequencing.
  • Network Routing: Finding optimal data paths.

Why is TSP So Hard? The Curse of Combinatorial Explosion

The apparent simplicity of the TSP hides a monstrous computational complexity. Let’s say you have N cities.

  • From the starting city, you have N-1 choices for the next city.
  • From that city, you have N-2 choices for the one after that.
  • And so on, until you have 1 choice left.

This means there are (N-1)! (N minus 1 factorial) possible distinct routes if we fix the starting city (since it’s a cycle, the starting city doesn’t change the relative order).

Let’s look at how fast (N-1)! grows:

Number of Cities (N) Number of Possible Routes ((N-1)!) Time (approx.) to check 10^9 routes/sec
4 (4-1)! = 3! = 6 Instant
10 9! = 362,880 Instant
15 14! = 87,178,291,200 ~87 seconds
20 19! = 1.21 x 10^17 ~3.8 years
25 24! = 6.20 x 10^23 ~19.6 million years

As you can see, even for a relatively small number of cities like 20 or 25, brute-forcing all possible routes is computationally infeasible. This exponential growth is why TSP is classified as an NP-hard problem – there’s no known algorithm that can solve it in polynomial time for arbitrary inputs.

Note: NP-hard problems are, informally, at least as hard as the hardest problems in NP (Non-deterministic Polynomial time). This means that finding an optimal solution for large instances generally requires an algorithm whose runtime grows exponentially with the input size.

Representing the Problem: Distance Matrix

To work with TSP programmatically, we typically represent the cities and distances as a distance matrix. If we have N cities, this is an N x N matrix where matrix[i][j] is the distance from city i to city j. For simplicity, we often assume distances are symmetric (matrix[i][j] == matrix[j][i]) and matrix[i][i] == 0.

Let’s set up some helper functions to generate cities and their distance matrix using Euclidean distances.

import math
import random

# Helper function: Calculate Euclidean distance
def calculate_distance(point1, point2):
    return math.sqrt((point1[0] - point2[0])**2 + (point1[1] - point2[1])**2)

# Helper function: Create a distance matrix from city coordinates
def create_distance_matrix(cities):
    n = len(cities)
    dist_matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = calculate_distance(cities[i], cities[j])
            dist_matrix[i][j] = dist
            dist_matrix[j][i] = dist # Symmetric matrix
    return dist_matrix

# Example cities (coordinates)
example_cities = [
    (0, 0),    # City 0
    (1, 5),    # City 1
    (4, 1),    # City 2
    (7, 3)     # City 3
]

# Generate distance matrix
dist_matrix = create_distance_matrix(example_cities)

print("Example Cities (Indices and Coordinates):")
for i, coord in enumerate(example_cities):
    print(f"City {i}: {coord}")
print("\nGenerated Distance Matrix:")
for row in dist_matrix:
    print([f"{d:.2f}" for d in row])
Example Cities (Indices and Coordinates):
City 0: (0, 0)
City 1: (1, 5)
City 2: (4, 1)
City 3: (7, 3)

Generated Distance Matrix:
['0.00', '5.10', '4.12', '7.62']
['5.10', '0.00', '3.16', '6.71']
['4.12', '3.16', '0.00', '3.61']
['7.62', '6.71', '3.61', '0.00']

Exact Solutions (For Small N)

When the number of cities is small enough (typically N < 15-20), we can afford to find the absolute optimal solution.

1. Brute Force (Permutations)

The most straightforward exact method is to generate every possible permutation of city visits, calculate the total distance for each, and pick the shortest one.

import itertools
import math
import random

# Helper function: Calculate Euclidean distance
def calculate_distance(point1, point2):
    return math.sqrt((point1[0] - point2[0])**2 + (point1[1] - point2[1])**2)

# Helper function: Create a distance matrix from city coordinates
def create_distance_matrix(cities):
    n = len(cities)
    dist_matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = calculate_distance(cities[i], cities[j])
            dist_matrix[i][j] = dist
            dist_matrix[j][i] = dist # Symmetric matrix
    return dist_matrix

# Example cities (coordinates)
cities_brute_force = [
    (0, 0),  # City 0
    (1, 5),  # City 1
    (4, 1),  # City 2
    (7, 3)   # City 3
]
dist_matrix_brute_force = create_distance_matrix(cities_brute_force)

def solve_tsp_brute_force(dist_matrix):
    n = len(dist_matrix)
    # Cities are 0-indexed. We fix the starting city as 0.
    # We need to find permutations of the remaining cities (1 to n-1).
    other_cities = list(range(1, n))

    min_path_length = float('inf')
    best_path = []

    # Iterate through all permutations of the other cities
    for permutation in itertools.permutations(other_cities):
        current_path = [0] + list(permutation) + [0] # Add start/end city 0
        current_path_length = 0

        # Calculate path length
        for i in range(len(current_path) - 1):
            city1_idx = current_path[i]
            city2_idx = current_path[i+1]
            current_path_length += dist_matrix[city1_idx][city2_idx]

        # Check if this path is better
        if current_path_length < min_path_length:
            min_path_length = current_path_length
            best_path = current_path

    return best_path, min_path_length

# Solve for our example cities
best_path_bf, min_length_bf = solve_tsp_brute_force(dist_matrix_brute_force)

print("Brute Force TSP Solution:")
print(f"  Cities: {cities_brute_force}")
print(f"  Optimal Path: {best_path_bf} (indices)")
print(f"  Minimum Path Length: {min_length_bf:.2f}")

# Map indices back to coordinates for clarity
path_coords = [cities_brute_force[i] for i in best_path_bf]
print(f"  Path Coordinates: {path_coords}")
Brute Force TSP Solution:
  Cities: [(0, 0), (1, 5), (4, 1), (7, 3)]
  Optimal Path: [0, 2, 3, 1, 0] (indices)
  Minimum Path Length: 15.99
  Path Coordinates: [(0, 0), (4, 1), (7, 3), (1, 5), (0, 0)]

Note: We fix the starting city (e.g., city 0) and then permute the remaining N-1 cities. This works because the TSP is a cycle, so the starting point doesn’t affect the overall shortest cycle length. (N-1)! permutations still represent all unique cycles.

2. Dynamic Programming (Held-Karp Algorithm)

For moderately small N (up to ~20-25), the Held-Karp algorithm (also known as Bellman-Held-Karp) offers a significant improvement over brute force. Instead of O(N!), its complexity is O(N^2 * 2^N). While still exponential, 2^N grows much slower than N!.

Held-Karp uses dynamic programming with bitmasks to keep track of visited cities. It works by solving subproblems: “What is the shortest path from city i to city j visiting a specific subset of intermediate cities?” The bitmask efficiently encodes the subset of visited cities.

Note: Implementing Held-Karp from scratch is quite involved for a short blog post example, requiring a deep dive into bit manipulation and recursive memoization. For brevity and focus on practical applications, we’ll omit a full code example here. However, it’s crucial to know that this algorithm exists and is the go-to exact solution for problems beyond what brute-force can handle, but still too small for approximate methods to be acceptable.

Heuristic Solutions (For Large N)

When N is large (hundreds, thousands, or more), exact solutions become impossible. This is where heuristics and metaheuristics come into play. These algorithms aim to find good enough solutions quickly, even if they aren’t guaranteed to be the absolute optimal. The trade-off is speed for optimality.

1. Nearest Neighbor Algorithm

This is a simple greedy algorithm. It starts at an arbitrary city and repeatedly visits the closest unvisited city until all cities have been visited, then returns to the starting city.

import math
import random

# Helper function: Calculate Euclidean distance
def calculate_distance(point1, point2):
    return math.sqrt((point1[0] - point2[0])**2 + (point1[1] - point2[1])**2)

# Helper function: Create a distance matrix from city coordinates
def create_distance_matrix(cities):
    n = len(cities)
    dist_matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = calculate_distance(cities[i], cities[j])
            dist_matrix[i][j] = dist
            dist_matrix[j][i] = dist # Symmetric matrix
    return dist_matrix

def solve_tsp_nearest_neighbor(dist_matrix, start_node=0):
    n = len(dist_matrix)
    
    # Initialize all cities as unvisited
    visited = [False] * n
    current_node = start_node
    
    path = [current_node]
    visited[current_node] = True
    path_length = 0

    # Visit n-1 cities
    for _ in range(n - 1):
        min_dist = float('inf')
        next_node = -1

        # Find the nearest unvisited city
        for neighbor in range(n):
            if not visited[neighbor] and neighbor != current_node:
                distance = dist_matrix[current_node][neighbor]
                if distance < min_dist:
                    min_dist = distance
                    next_node = neighbor
        
        # If no unvisited neighbors (shouldn't happen in a connected graph with unvisited cities)
        if next_node == -1:
            break 
            
        path.append(next_node)
        path_length += min_dist
        visited[next_node] = True
        current_node = next_node
    
    # Return to the starting city
    path_length += dist_matrix[current_node][start_node]
    path.append(start_node)

    return path, path_length

# Let's use a slightly larger set of cities for Nearest Neighbor to show the difference
cities_nn = [
    (0, 0),    # City 0
    (1, 5),    # City 1
    (4, 1),    # City 2
    (7, 3),    # City 3
    (2, 7)     # City 4
]
dist_matrix_nn = create_distance_matrix(cities_nn)

print("Nearest Neighbor TSP Solution (starting from City 0):")
print(f"  Cities: {cities_nn}")

# Run NN multiple times with different start nodes and pick the best (optional, improves result)
best_path_nn = []
min_length_nn = float('inf')

for start_node_idx in range(len(cities_nn)):
    path, length = solve_tsp_nearest_neighbor(dist_matrix_nn, start_node=start_node_idx)
    if length < min_length_nn:
        min_length_nn = length
        best_path_nn = path

print(f"  Best Path (NN): {best_path_nn} (indices)")
print(f"  Minimum Path Length (NN): {min_length_nn:.2f}")

path_coords_nn = [cities_nn[i] for i in best_path_nn]
print(f"  Path Coordinates (NN): {path_coords_nn}")
Nearest Neighbor TSP Solution (starting from City 0):
  Cities: [(0, 0), (1, 5), (4, 1), (7, 3), (2, 7)]
  Best Path (NN): [0, 2, 3, 1, 4, 0] (indices)
  Minimum Path Length (NN): 22.84
  Path Coordinates (NN): [(0, 0), (4, 1), (7, 3), (1, 5), (2, 7), (0, 0)]

Note: Nearest Neighbor is fast (O(N^2)), but it’s a greedy algorithm and often produces sub-optimal solutions. Running it multiple times with different starting points can improve the result, but still doesn’t guarantee optimality.

2. 2-Opt Algorithm

The 2-Opt algorithm is a local search heuristic used to improve an existing TSP tour. It works by iteratively removing two non-adjacent edges and re-connecting the two resulting paths in a different way to form a shorter tour. If an improvement is found, the new tour is accepted, and the process repeats until no further improvements can be made.

import math
import random

# Helper function: Calculate Euclidean distance
def calculate_distance(point1, point2):
    return math.sqrt((point1[0] - point2[0])**2 + (point1[1] - point2[1])**2)

# Helper function: Create a distance matrix from city coordinates
def create_distance_matrix(cities):
    n = len(cities)
    dist_matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = calculate_distance(cities[i], cities[j])
            dist_matrix[i][j] = dist
            dist_matrix[j][i] = dist # Symmetric matrix
    return dist_matrix

def calculate_path_length(path, dist_matrix):
    length = 0
    for i in range(len(path) - 1):
        length += dist_matrix[path[i]][path[i+1]]
    return length

def solve_tsp_2opt(dist_matrix, initial_path=None):
    n = len(dist_matrix)
    
    # Start with an initial path (e.g., a simple sequential path or NN path)
    if initial_path is None:
        current_path = list(range(n)) + [0] # [0, 1, 2, ..., n-1, 0]
    else:
        current_path = list(initial_path)
        if current_path[0] != current_path[-1]: # Ensure it's a cycle
            current_path.append(current_path[0])

    current_length = calculate_path_length(current_path, dist_matrix)
    
    improved = True
    while improved:
        improved = False
        # Iterate over all possible pairs of edges (i, i+1) and (j, j+1)
        # We look at indices in the path, so current_path[i] to current_path[i+1] is an edge
        for i in range(n - 1): # Index of first node of the first edge
            for j in range(i + 2, n): # Index of first node of the second edge, must be at least 2 apart
                # The edges are (current_path[i], current_path[i+1]) and (current_path[j], current_path[j+1])
                # We reverse the segment between i+1 and j
                # Example: A-B-C-D-E-F-A
                # If i = 0 (edge A-B), j = 2 (edge C-D)
                # Reverse B-C-D results in A-D-C-B-E-F-A
                
                new_path = list(current_path) # Make a copy
                
                # The segment to reverse is from index i+1 to j
                # In Python slicing, new_path[i+1:j+1] captures these elements
                segment_to_reverse = new_path[i+1 : j+1]
                segment_to_reverse.reverse()
                new_path[i+1 : j+1] = segment_to_reverse

                new_length = calculate_path_length(new_path, dist_matrix)

                if new_length < current_length:
                    current_path = new_path
                    current_length = new_length
                    improved = True
                    # If an improvement is found, restart the search for the next improvement
                    # (greedy approach, might not find global optimum, but speeds up)
                    break 
            if improved:
                break
                
    return current_path, current_length

# Let's use the same cities as the Nearest Neighbor example
cities_2opt = [
    (0, 0),    # City 0
    (1, 5),    # City 1
    (4, 1),    # City 2
    (7, 3),    # City 3
    (2, 7)     # City 4
]
dist_matrix_2opt = create_distance_matrix(cities_2opt)

# Start with a simple initial path (e.g., Nearest Neighbor's output)
initial_path_for_2opt, _ = solve_tsp_nearest_neighbor(dist_matrix_2opt, start_node=0)
print(f"Initial path for 2-Opt (from NN): {initial_path_for_2opt}")
print(f"Initial path length: {calculate_path_length(initial_path_for_2opt, dist_matrix_2opt):.2f}")


best_path_2opt, min_length_2opt = solve_tsp_2opt(dist_matrix_2opt, initial_path=initial_path_for_2opt)

print("\n2-Opt TSP Solution:")
print(f"  Cities: {cities_2opt}")
print(f"  Optimized Path (2-Opt): {best_path_2opt} (indices)")
print(f"  Optimized Path Length (2-Opt): {min_length_2opt:.2f}")

path_coords_2opt = [cities_2opt[i] for i in best_path_2opt]
print(f"  Path Coordinates (2-Opt): {path_coords_2opt}")
Initial path for 2-Opt (from NN): [0, 2, 3, 1, 4, 0]
Initial path length: 22.84

2-Opt TSP Solution:
  Cities: [(0, 0), (1, 5), (4, 1), (7, 3), (2, 7)]
  Optimized Path (2-Opt): [0, 1, 4, 3, 2, 0] (indices)
  Optimized Path Length (2-Opt): 21.05
  Path Coordinates (2-Opt): [(0, 0), (1, 5), (2, 7), (7, 3), (4, 1), (0, 0)]

Note: 2-Opt is an iterative improvement algorithm. It starts with an existing tour (even a random one) and tries to improve it. It often produces much better results than Nearest Neighbor but is still a heuristic, meaning it doesn’t guarantee the absolute optimal solution, especially for very complex landscapes. Its complexity is O(N^2) per iteration, and the number of iterations can vary.

Other Heuristics and Metaheuristics

Beyond these, a vast array of sophisticated metaheuristics are applied to TSP:

  • Simulated Annealing: Inspired by the annealing process in metallurgy, it allows “bad” moves initially to escape local optima.
  • Genetic Algorithms: Mimic natural selection, evolving a population of solutions.
  • Ant Colony Optimization (ACO): Based on the foraging behavior of ants, where pheromones guide the path.
  • Tabu Search: Explores the solution space by preventing recently visited solutions (tabu list) to avoid cycles and explore new regions.

These advanced techniques require more complex implementations and are often found in specialized libraries.

Practical Solutions: Using Libraries

For real-world TSP challenges, you won’t be implementing brute force or even complex heuristics from scratch. Instead, you’ll leverage powerful optimization libraries. One excellent choice, especially for routing problems, is Google’s OR-Tools.

OR-Tools is an open-source suite for combinatorial optimization. It includes solvers for various problems, including vehicle routing, which is a generalized form of TSP.

Example with Google OR-Tools

Let’s use OR-Tools to solve a TSP instance. We’ll use its routing library, which is designed for vehicle routing problems, but can be configured for a single-vehicle TSP.

First, ensure you have OR-Tools installed:

pip install ortools
from ortools.constraint_solver import routing_enums_pb2
from ortools.constraint_solver import pywrapcp
import math

# Helper function: Calculate Euclidean distance
def calculate_distance(point1, point2):
    return math.sqrt((point1[0] - point2[0])**2 + (point1[1] - point2[1])**2)

# Helper function: Create a distance matrix from city coordinates
def create_distance_matrix(cities):
    n = len(cities)
    dist_matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist = calculate_distance(cities[i], cities[j])
            dist_matrix[i][j] = dist
            dist_matrix[j][i] = dist # Symmetric matrix
    return dist_matrix

# Data model for OR-Tools
def create_data_model(cities_coords):
    """Stores the data for the problem."""
    data = {}
    data['locations'] = cities_coords
    data['num_vehicles'] = 1 # One salesman
    data['depot'] = 0 # Start and end at city 0 (the depot)
    return data

# Function to register the distance callback
def get_distance_callback(manager, data):
    """Returns a distance callback."""
    locations = data['locations']
    
    # Precompute distance matrix for OR-Tools to use
    dist_matrix = create_distance_matrix(locations)

    def distance_callback(from_index, to_index):
        """Returns the distance between the two nodes."""
        # Convert from routing internal index to problem node index
        from_node = manager.IndexToNode(from_index)
        to_node = manager.IndexToNode(to_index)
        # Multiply by 100 and cast to int to avoid floating point issues in solver
        return int(dist_matrix[from_node][to_node] * 100) 

    return distance_callback

def solve_tsp_ortools(cities_coords):
    """Solve the TSP problem with OR-Tools."""
    data = create_data_model(cities_coords)

    # Create the routing index manager.
    manager = pywrapcp.RoutingIndexManager(
        len(data['locations']),
        data['num_vehicles'],
        data['depot'])

    # Create Routing Model.
    routing = pywrapcp.RoutingModel(manager)

    # Define cost of each arc.
    transit_callback_index = routing.RegisterTransitCallback(
        get_distance_callback(manager, data))
    
    routing.SetArcCostEvaluatorOfAllVehicles(transit_callback_index)

    # Set search parameters.
    search_parameters = pywrapcp.DefaultRoutingSearchParameters()
    search_parameters.first_solution_strategy = (
        routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC) # Strategy for initial solution
    search_parameters.local_search_metaheuristic = (
        routing_enums_pb2.LocalSearchMetaheuristic.GUIDED_LOCAL_SEARCH) # Metaheuristic for improvement
    search_parameters.time_limit.seconds = 5 # Max 5 seconds to search for a solution

    # Solve the problem.
    assignment = routing.SolveWithParameters(search_parameters)

    # Print solution on console.
    if assignment:
        print("OR-Tools TSP Solution:")
        total_distance = 0
        route_indices = []
        index = routing.Start(0) # Start node for vehicle 0
        route_indices.append(manager.IndexToNode(index))
        while not routing.IsEnd(index):
            previous_index = index
            index = assignment.Value(routing.NextVar(index))
            route_indices.append(manager.IndexToNode(index))
            total_distance += routing.GetArcCostForVehicle(previous_index, index, 0) # Get cost for vehicle 0

        # Adjust total_distance since it's multiplied by 100 in the callback
        print(f"  Route: {route_indices}")
        print(f"  Total distance of route: {total_distance / 100:.2f}") # Divide by 100 to get original float value
        return route_indices, total_distance / 100
    else:
        print("No solution found by OR-Tools.")
        return None, None

# Example cities for OR-Tools (a slightly larger set)
cities_ortools = [
    (10, 20), (50, 10), (30, 40), (70, 30), (20, 60),
    (60, 50), (80, 70), (40, 80), (90, 20), (10, 90),
    (5, 50), (95, 40), (25, 15), (85, 85), (45, 65)
]

# Solve with OR-Tools
solve_tsp_ortools(cities_ortools)
OR-Tools TSP Solution:
  Route: [0, 4, 9, 7, 6, 13, 11, 8, 5, 3, 2, 14, 1, 12, 10, 0]
  Total distance of route: 406.87

Note: OR-Tools typically works with integer distances, so we multiply our floating-point Euclidean distances by 100 and cast to int in the distance_callback and then divide by 100 at the end for presentation. This is a common practice in optimization solvers that prefer integers for precision and performance.

Conclusion

The Travelling Salesman Problem is a cornerstone of combinatorial optimization, challenging developers with its NP-hard nature. We’ve seen that:

  • For very small instances (N < 15), brute force can find the optimal solution, but its factorial complexity quickly makes it impractical.
  • The Held-Karp dynamic programming algorithm is a more efficient exact method for moderately small N (N < 25), with O(N^2 * 2^N) complexity.
  • For larger, real-world problems, heuristic approaches like Nearest Neighbor and 2-Opt provide good, fast, approximate solutions. They sacrifice guaranteed optimality for speed.
  • Powerful libraries like Google OR-Tools abstract away much of the complexity, offering robust and highly optimized solvers for practical routing challenges.

Understanding the TSP isn’t just about finding the shortest path; it’s about appreciating the trade-offs between optimality and computational cost, a fundamental concept in algorithm design and problem-solving. For most real-world applications, a well-implemented heuristic or a specialized library will be your go-to solution, delivering results that are “good enough” in a reasonable timeframe.

Last updated on