The world of algorithms is full of intriguing problems that, on the surface, seem simple but quickly reveal their depth. One such challenge is the Minimum Partition Problem. If you’ve ever dealt with balancing workloads, distributing resources, or even just trying to split a group of numbers as evenly as possible, you’ve implicitly touched upon this concept.

In this post, we’ll break down the Minimum Partition Problem. We’ll start with its definition, explore a straightforward (but inefficient) recursive approach, and then significantly optimize it using dynamic programming. All with practical, runnable Python code examples.

Let’s dive in.

What is the Minimum Partition Problem?

The Minimum Partition Problem, often simply called the “Partition Problem” or “Number Partitioning Problem,” is a classic combinatorial optimization problem.

Given a set of positive integers, the goal is to partition it into two subsets such that the absolute difference between their sums is minimized.

Let’s clarify with an example:

Suppose you have the set S = {3, 1, 4, 2}. The total sum of elements is 3 + 1 + 4 + 2 = 10.

We want to divide these numbers into two subsets, say S1 and S2, such that |sum(S1) - sum(S2)| is as small as possible.

Consider these partitions:

  • S1 = {3, 1, 2} (sum = 6), S2 = {4} (sum = 4). Difference = |6 - 4| = 2.
  • S1 = {3, 4} (sum = 7), S2 = {1, 2} (sum = 3). Difference = |7 - 3| = 4.
  • S1 = {3, 2} (sum = 5), S2 = {1, 4} (sum = 5). Difference = |5 - 5| = 0.

In this example, the minimum difference is 0, achieved by S1 = {3, 2} and S2 = {1, 4}.

Why is this important? This problem has applications in various fields:

  • Load Balancing: Distributing tasks (with varying processing times) among two processors to minimize the difference in their total workload.
  • Resource Allocation: Dividing resources with different costs or capacities to two teams.
  • Manufacturing: Cutting a raw material into pieces of specific lengths to minimize waste.

Note: The Partition Problem is known to be NP-hard. This means that no known polynomial-time algorithm exists to solve it for arbitrarily large inputs. However, for practical inputs where the sum of numbers isn’t excessively large, dynamic programming provides an efficient pseudo-polynomial time solution.

Brute-Force Approach: Recursion

The most intuitive way to approach this problem is to consider each element and decide whether to include it in the first subset (S1) or the second subset (S2).

Let’s define a recursive function min_diff(index, current_sum1, current_sum2):

  • index: The current element we are considering.
  • current_sum1: The sum of elements chosen for S1 so far.
  • current_sum2: The sum of elements chosen for S2 so far.

The base case is when index reaches the end of the array. At this point, current_sum1 and current_sum2 represent the sums of the two subsets, and we return their absolute difference.

For each element, we have two choices:

  1. Include nums[index] in S1.
  2. Include nums[index] in S2.

We then take the minimum of the results from these two choices.

def min_partition_recursive(nums):
    """
    Solves the Minimum Partition Problem using a brute-force recursive approach.
    """
    n = len(nums)

    # Helper recursive function
    def solve(index, sum1, sum2):
        # Base case: If all elements are processed
        if index == n:
            return abs(sum1 - sum2)

        # Option 1: Include current element in sum1
        choice1 = solve(index + 1, sum1 + nums[index], sum2)

        # Option 2: Include current element in sum2
        choice2 = solve(index + 1, sum1, sum2 + nums[index])

        return min(choice1, choice2)

    return solve(0, 0, 0)

# --- Examples ---

print("--- Recursive Examples ---")

# Example 1: Basic case
nums1 = [3, 1, 4, 2]
result1 = min_partition_recursive(nums1)
print(f"Numbers: {nums1}")
print(f"Minimum difference: {result1}")

print("-" * 20)

# Example 2: Larger set
nums2 = [10, 20, 15, 5, 25]
result2 = min_partition_recursive(nums2)
print(f"Numbers: {nums2}")
print(f"Minimum difference: {result2}")

print("-" * 20)

# Example 3: Single element
nums3 = [42]
result3 = min_partition_recursive(nums3)
print(f"Numbers: {nums3}")
print(f"Minimum difference: {result3}")

print("-" * 20)

# Example 4: All same numbers
nums4 = [7, 7, 7, 7]
result4 = min_partition_recursive(nums4)
print(f"Numbers: {nums4}")
print(f"Minimum difference: {result4}")
--- Recursive Examples ---
Numbers: [3, 1, 4, 2]
Minimum difference: 0
--------------------
Numbers: [10, 20, 15, 5, 25]
Minimum difference: 5
--------------------
Numbers: [42]
Minimum difference: 42
--------------------
Numbers: [7, 7, 7, 7]
Minimum difference: 0

Analysis of the Recursive Approach:

  • Time Complexity: For each of the n elements, we have two choices. This leads to 2^n possible combinations. Thus, the time complexity is O(2^n), which is exponential. This becomes prohibitively slow for n greater than around 20-25.
  • Space Complexity: O(n) due to the recursion stack depth.

This approach, while correct, is not practical for larger sets of numbers. We need a more efficient way.

Optimizing with Dynamic Programming

The Minimum Partition Problem has optimal substructure and overlapping subproblems, making it a perfect candidate for dynamic programming.

The key insight for the DP approach is to transform the problem into a variation of the Subset Sum Problem.

Let total_sum be the sum of all elements in the given set nums. If we form one subset S1 with a sum s1, then the other subset S2 will automatically have a sum s2 = total_sum - s1. Our goal is to minimize |s1 - s2| = |s1 - (total_sum - s1)| = |2 * s1 - total_sum|.

To minimize |2 * s1 - total_sum|, we need s1 to be as close as possible to total_sum / 2.

So, the problem boils down to: Find a subset S1 whose sum s1 is the largest possible, but not exceeding total_sum / 2.

We can use a boolean DP array (or set) to track all possible sums that can be formed using a subset of the given numbers.

Dynamic Programming Steps:

  1. Calculate total_sum of all elements in nums.
  2. Create a boolean DP array, dp, of size total_sum + 1.
    • dp[i] will be True if a sum i can be formed using a subset of nums, otherwise False.
    • Initialize dp[0] to True (an empty set sums to 0). All other dp[i] are False.
  3. Iterate through each number num in nums:
    • For each num, iterate backwards from total_sum down to num:
      • If dp[current_sum - num] is True, it means current_sum - num can be formed.
      • Then, by adding num, current_sum can also be formed. So, set dp[current_sum] = True.
    • Why iterate backwards? To ensure that each number num is used at most once to form a sum. If we iterate forwards, num could be used multiple times within the same inner loop iteration (e.g., dp[num] becomes true, then dp[2*num] becomes true using num again in the same pass).
  4. After populating the dp array, find the largest s1 (a sum for S1) such that dp[s1] is True and s1 <= total_sum / 2.
  5. The minimum difference will be total_sum - 2 * s1.

Let’s see this in action:

def min_partition_dp(nums):
    """
    Solves the Minimum Partition Problem using dynamic programming.
    """
    if not nums:
        return 0

    total_sum = sum(nums)
    
    # dp[i] will be True if sum 'i' can be formed by a subset of nums.
    # The maximum possible sum for one subset is total_sum // 2.
    # We only need to check up to total_sum // 2 because if sum 's1' is achievable,
    # then total_sum - s1 is also achievable as the sum of the other subset.
    dp = [False] * (total_sum // 2 + 1)
    dp[0] = True # An empty set can form a sum of 0

    # Iterate through each number in the input list
    for num in nums:
        # Iterate backwards from the max possible sum down to the current number
        # This ensures each number is considered only once for forming a sum
        for current_sum in range(total_sum // 2, num - 1, -1):
            if dp[current_sum - num]:
                dp[current_sum] = True
    
    # Find the largest s1 (sum of one subset) that is achievable
    # and is less than or equal to total_sum // 2
    s1 = 0
    for i in range(total_sum // 2, -1, -1):
        if dp[i]:
            s1 = i
            break
            
    # The sum of the other subset will be s2 = total_sum - s1
    # The minimum difference is |s1 - s2| = |s1 - (total_sum - s1)| = |2 * s1 - total_sum|
    return abs(total_sum - 2 * s1)

# --- Examples ---

print("\n--- Dynamic Programming Examples ---")

# Example 1: Basic case
nums1 = [3, 1, 4, 2]
result1 = min_partition_dp(nums1)
print(f"Numbers: {nums1}")
print(f"Total sum: {sum(nums1)}")
print(f"Minimum difference: {result1}")

print("-" * 20)

# Example 2: Larger set
nums2 = [10, 20, 15, 5, 25]
result2 = min_partition_dp(nums2)
print(f"Numbers: {nums2}")
print(f"Total sum: {sum(nums2)}")
print(f"Minimum difference: {result2}")

print("-" * 20)

# Example 3: Set with no perfect partition
nums3 = [1, 2, 7, 8] # Total sum = 18. Half = 9. Can we make 9? (1,8) (2,7). Yes. Diff = 0.
result3 = min_partition_dp(nums3)
print(f"Numbers: {nums3}")
print(f"Total sum: {sum(nums3)}")
print(f"Minimum difference: {result3}")

print("-" * 20)

# Example 4: Another example
nums4 = [5, 6, 8, 11] # Total sum = 30. Half = 15.
# Possible sums: 5, 6, 8, 11, 11(5+6), 13(5+8), 14(6+8), 16(5+11), 17(6+11), 19(8+11), 
# 19(5+6+8), 20(5+6+11), 22(5+8+11), 25(6+8+11)
# Largest sum <= 15 is 14 (6+8). S1=14, S2=16. Diff = 2.
result4 = min_partition_dp(nums4)
print(f"Numbers: {nums4}")
print(f"Total sum: {sum(nums4)}")
print(f"Minimum difference: {result4}")
--- Dynamic Programming Examples ---
Numbers: [3, 1, 4, 2]
Total sum: 10
Minimum difference: 0
--------------------
Numbers: [10, 20, 15, 5, 25]
Total sum: 75
Minimum difference: 5
--------------------
Numbers: [1, 2, 7, 8]
Total sum: 18
Minimum difference: 0
--------------------
Numbers: [5, 6, 8, 11]
Total sum: 30
Minimum difference: 2

Explanation of dp array population:

Let’s trace nums = [3, 1, 4, 2] with total_sum = 10. total_sum // 2 = 5. dp array of size 5 + 1 = 6. Initially dp = [True, False, False, False, False, False]

  1. num = 3:

    • current_sum = 5 (range 5 down to 3)
      • dp[5-3] (dp[2]) is False. dp[5] remains False.
    • current_sum = 4
      • dp[4-3] (dp[1]) is False. dp[4] remains False.
    • current_sum = 3
      • dp[3-3] (dp[0]) is True. Set dp[3] = True.
    • dp is now [True, False, False, True, False, False]
  2. num = 1:

    • current_sum = 5 (range 5 down to 1)
      • dp[5-1] (dp[4]) is False. dp[5] remains False.
    • current_sum = 4
      • dp[4-1] (dp[3]) is True. Set dp[4] = True.
    • current_sum = 3
      • dp[3-1] (dp[2]) is False. dp[3] remains True.
    • current_sum = 2
      • dp[2-1] (dp[1]) is False. dp[2] remains False.
    • current_sum = 1
      • dp[1-1] (dp[0]) is True. Set dp[1] = True.
    • dp is now [True, True, False, True, True, False]
  3. num = 4:

    • current_sum = 5 (range 5 down to 4)
      • dp[5-4] (dp[1]) is True. Set dp[5] = True.
    • current_sum = 4
      • dp[4-4] (dp[0]) is True. Set dp[4] = True. (already true, but conceptually refreshed)
    • dp is now [True, True, False, True, True, True]
  4. num = 2:

    • current_sum = 5 (range 5 down to 2)
      • dp[5-2] (dp[3]) is True. Set dp[5] = True. (already true)
    • current_sum = 4
      • dp[4-2] (dp[2]) is False. dp[4] remains True.
    • current_sum = 3
      • dp[3-2] (dp[1]) is True. Set dp[3] = True. (already true)
    • current_sum = 2
      • dp[2-2] (dp[0]) is True. Set dp[2] = True.
    • dp is now [True, True, True, True, True, True]

Finally, find the largest s1 <= 5 where dp[s1] is True. This is s1 = 5. min_diff = abs(total_sum - 2 * s1) = abs(10 - 2 * 5) = abs(10 - 10) = 0. Correct!

Edge Cases and Considerations

  • Empty input array: The DP solution handles this by returning 0, which is correct (difference between two empty sets is 0).
  • Single element array: nums = [42]. total_sum = 42. dp up to 21. Only dp[0] is true. Largest s1 is 0. Diff = |42 - 2*0| = 42. Correct, one set is {42}, other is {}.
  • All numbers are the same: nums = [7, 7, 7, 7]. total_sum = 28. total_sum // 2 = 14. The DP will correctly find that sum 14 is achievable (e.g., {7, 7}), leading to a difference of 0.
  • Negative numbers: The problem, as typically defined, assumes positive integers. If negative numbers are allowed, the total_sum can be negative, and the target total_sum / 2 might also be negative. The DP array index (representing sums) would need to accommodate negative values, often by offsetting the indices or using a dictionary/set instead of a boolean array. For this specific post, we adhere to positive integers as per the common problem definition.
  • Large sum: The DP approach’s time and space complexity depend on the total_sum. If total_sum is extremely large (e.g., 10^9), the dp array would be too big for memory. In such cases, one would need approximation algorithms or heuristics.

Time and Space Complexity of DP Approach

  • Time Complexity:

    • Calculating total_sum: O(n)
    • Initializing dp array: O(total_sum)
    • Populating dp array: n iterations (for each number) and total_sum // 2 iterations in the inner loop. So, O(n * total_sum).
    • Finding s1: O(total_sum)
    • Overall: O(n * total_sum).
    • Note: This is considered a pseudo-polynomial time complexity because it depends on the numerical value of the input (total_sum) rather than just the number of inputs (n). If the numbers in nums can be very large, total_sum can also be very large, making this algorithm slow despite being better than exponential.
  • Space Complexity: O(total_sum) for the dp array.

When to Use Which Approach

  • Recursive (Brute-Force): Almost never for practical applications, unless n is extremely small (e.g., n <= 20) or you’re just demonstrating the concept without optimization. It’s too slow.
  • Dynamic Programming: This is the standard solution for the Minimum Partition Problem when the total_sum of the numbers is within a manageable range (e.g., up to 10^5 to 10^6 for typical memory constraints). It’s efficient and guaranteed to find the optimal solution.
  • Heuristics/Approximation Algorithms: If n is very large, and especially if total_sum is also very large (e.g., numbers are 10^9), then the DP approach becomes infeasible due to memory constraints for the dp array. In such scenarios, you would need to use approximation algorithms (like greedy approaches, simulated annealing, or genetic algorithms) that do not guarantee the absolute optimal solution but find a “good enough” solution within reasonable time/memory limits. These are beyond the scope of this particular post.

Conclusion

The Minimum Partition Problem is a classic example of how a seemingly simple problem can quickly lead to an exploration of algorithmic complexity. While the brute-force recursive solution provides a clear conceptual understanding, its exponential time complexity renders it impractical for most real-world scenarios.

Dynamic programming, by transforming the problem into a variation of the Subset Sum Problem, offers a significantly more efficient solution, exhibiting pseudo-polynomial time complexity. This makes it a powerful tool for solving this problem when the sum of numbers is not excessively large.

Understanding such foundational problems and their efficient solutions using techniques like dynamic programming is crucial for any developer building robust and performant applications. Keep experimenting, keep learning!

Last updated on