The world of algorithms is full of intriguing problems that, on the surface, seem simple but quickly reveal their depth. One such challenge is the Minimum Partition Problem. If you’ve ever dealt with balancing workloads, distributing resources, or even just trying to split a group of numbers as evenly as possible, you’ve implicitly touched upon this concept.
In this post, we’ll break down the Minimum Partition Problem. We’ll start with its definition, explore a straightforward (but inefficient) recursive approach, and then significantly optimize it using dynamic programming. All with practical, runnable Python code examples.
Let’s dive in.
What is the Minimum Partition Problem?
The Minimum Partition Problem, often simply called the “Partition Problem” or “Number Partitioning Problem,” is a classic combinatorial optimization problem.
Given a set of positive integers, the goal is to partition it into two subsets such that the absolute difference between their sums is minimized.
Let’s clarify with an example:
Suppose you have the set S = {3, 1, 4, 2}
.
The total sum of elements is 3 + 1 + 4 + 2 = 10
.
We want to divide these numbers into two subsets, say S1
and S2
, such that |sum(S1) - sum(S2)|
is as small as possible.
Consider these partitions:
S1 = {3, 1, 2}
(sum = 6),S2 = {4}
(sum = 4). Difference =|6 - 4| = 2
.S1 = {3, 4}
(sum = 7),S2 = {1, 2}
(sum = 3). Difference =|7 - 3| = 4
.S1 = {3, 2}
(sum = 5),S2 = {1, 4}
(sum = 5). Difference =|5 - 5| = 0
.
In this example, the minimum difference is 0
, achieved by S1 = {3, 2}
and S2 = {1, 4}
.
Why is this important? This problem has applications in various fields:
- Load Balancing: Distributing tasks (with varying processing times) among two processors to minimize the difference in their total workload.
- Resource Allocation: Dividing resources with different costs or capacities to two teams.
- Manufacturing: Cutting a raw material into pieces of specific lengths to minimize waste.
Note: The Partition Problem is known to be NP-hard. This means that no known polynomial-time algorithm exists to solve it for arbitrarily large inputs. However, for practical inputs where the sum of numbers isn’t excessively large, dynamic programming provides an efficient pseudo-polynomial time solution.
Brute-Force Approach: Recursion
The most intuitive way to approach this problem is to consider each element and decide whether to include it in the first subset (S1
) or the second subset (S2
).
Let’s define a recursive function min_diff(index, current_sum1, current_sum2)
:
index
: The current element we are considering.current_sum1
: The sum of elements chosen forS1
so far.current_sum2
: The sum of elements chosen forS2
so far.
The base case is when index
reaches the end of the array. At this point, current_sum1
and current_sum2
represent the sums of the two subsets, and we return their absolute difference.
For each element, we have two choices:
- Include
nums[index]
inS1
. - Include
nums[index]
inS2
.
We then take the minimum of the results from these two choices.
def min_partition_recursive(nums):
"""
Solves the Minimum Partition Problem using a brute-force recursive approach.
"""
n = len(nums)
# Helper recursive function
def solve(index, sum1, sum2):
# Base case: If all elements are processed
if index == n:
return abs(sum1 - sum2)
# Option 1: Include current element in sum1
choice1 = solve(index + 1, sum1 + nums[index], sum2)
# Option 2: Include current element in sum2
choice2 = solve(index + 1, sum1, sum2 + nums[index])
return min(choice1, choice2)
return solve(0, 0, 0)
# --- Examples ---
print("--- Recursive Examples ---")
# Example 1: Basic case
nums1 = [3, 1, 4, 2]
result1 = min_partition_recursive(nums1)
print(f"Numbers: {nums1}")
print(f"Minimum difference: {result1}")
print("-" * 20)
# Example 2: Larger set
nums2 = [10, 20, 15, 5, 25]
result2 = min_partition_recursive(nums2)
print(f"Numbers: {nums2}")
print(f"Minimum difference: {result2}")
print("-" * 20)
# Example 3: Single element
nums3 = [42]
result3 = min_partition_recursive(nums3)
print(f"Numbers: {nums3}")
print(f"Minimum difference: {result3}")
print("-" * 20)
# Example 4: All same numbers
nums4 = [7, 7, 7, 7]
result4 = min_partition_recursive(nums4)
print(f"Numbers: {nums4}")
print(f"Minimum difference: {result4}")
--- Recursive Examples ---
Numbers: [3, 1, 4, 2]
Minimum difference: 0
--------------------
Numbers: [10, 20, 15, 5, 25]
Minimum difference: 5
--------------------
Numbers: [42]
Minimum difference: 42
--------------------
Numbers: [7, 7, 7, 7]
Minimum difference: 0
Analysis of the Recursive Approach:
- Time Complexity: For each of the
n
elements, we have two choices. This leads to2^n
possible combinations. Thus, the time complexity isO(2^n)
, which is exponential. This becomes prohibitively slow forn
greater than around 20-25. - Space Complexity:
O(n)
due to the recursion stack depth.
This approach, while correct, is not practical for larger sets of numbers. We need a more efficient way.
Optimizing with Dynamic Programming
The Minimum Partition Problem has optimal substructure and overlapping subproblems, making it a perfect candidate for dynamic programming.
The key insight for the DP approach is to transform the problem into a variation of the Subset Sum Problem.
Let total_sum
be the sum of all elements in the given set nums
.
If we form one subset S1
with a sum s1
, then the other subset S2
will automatically have a sum s2 = total_sum - s1
.
Our goal is to minimize |s1 - s2| = |s1 - (total_sum - s1)| = |2 * s1 - total_sum|
.
To minimize |2 * s1 - total_sum|
, we need s1
to be as close as possible to total_sum / 2
.
So, the problem boils down to: Find a subset S1
whose sum s1
is the largest possible, but not exceeding total_sum / 2
.
We can use a boolean DP array (or set) to track all possible sums that can be formed using a subset of the given numbers.
Dynamic Programming Steps:
- Calculate
total_sum
of all elements innums
. - Create a boolean DP array,
dp
, of sizetotal_sum + 1
.dp[i]
will beTrue
if a sumi
can be formed using a subset ofnums
, otherwiseFalse
.- Initialize
dp[0]
toTrue
(an empty set sums to 0). All otherdp[i]
areFalse
.
- Iterate through each number
num
innums
:- For each
num
, iterate backwards fromtotal_sum
down tonum
:- If
dp[current_sum - num]
isTrue
, it meanscurrent_sum - num
can be formed. - Then, by adding
num
,current_sum
can also be formed. So, setdp[current_sum] = True
.
- If
- Why iterate backwards? To ensure that each number
num
is used at most once to form a sum. If we iterate forwards,num
could be used multiple times within the same inner loop iteration (e.g.,dp[num]
becomes true, thendp[2*num]
becomes true usingnum
again in the same pass).
- For each
- After populating the
dp
array, find the largests1
(a sum forS1
) such thatdp[s1]
isTrue
ands1 <= total_sum / 2
. - The minimum difference will be
total_sum - 2 * s1
.
Let’s see this in action:
def min_partition_dp(nums):
"""
Solves the Minimum Partition Problem using dynamic programming.
"""
if not nums:
return 0
total_sum = sum(nums)
# dp[i] will be True if sum 'i' can be formed by a subset of nums.
# The maximum possible sum for one subset is total_sum // 2.
# We only need to check up to total_sum // 2 because if sum 's1' is achievable,
# then total_sum - s1 is also achievable as the sum of the other subset.
dp = [False] * (total_sum // 2 + 1)
dp[0] = True # An empty set can form a sum of 0
# Iterate through each number in the input list
for num in nums:
# Iterate backwards from the max possible sum down to the current number
# This ensures each number is considered only once for forming a sum
for current_sum in range(total_sum // 2, num - 1, -1):
if dp[current_sum - num]:
dp[current_sum] = True
# Find the largest s1 (sum of one subset) that is achievable
# and is less than or equal to total_sum // 2
s1 = 0
for i in range(total_sum // 2, -1, -1):
if dp[i]:
s1 = i
break
# The sum of the other subset will be s2 = total_sum - s1
# The minimum difference is |s1 - s2| = |s1 - (total_sum - s1)| = |2 * s1 - total_sum|
return abs(total_sum - 2 * s1)
# --- Examples ---
print("\n--- Dynamic Programming Examples ---")
# Example 1: Basic case
nums1 = [3, 1, 4, 2]
result1 = min_partition_dp(nums1)
print(f"Numbers: {nums1}")
print(f"Total sum: {sum(nums1)}")
print(f"Minimum difference: {result1}")
print("-" * 20)
# Example 2: Larger set
nums2 = [10, 20, 15, 5, 25]
result2 = min_partition_dp(nums2)
print(f"Numbers: {nums2}")
print(f"Total sum: {sum(nums2)}")
print(f"Minimum difference: {result2}")
print("-" * 20)
# Example 3: Set with no perfect partition
nums3 = [1, 2, 7, 8] # Total sum = 18. Half = 9. Can we make 9? (1,8) (2,7). Yes. Diff = 0.
result3 = min_partition_dp(nums3)
print(f"Numbers: {nums3}")
print(f"Total sum: {sum(nums3)}")
print(f"Minimum difference: {result3}")
print("-" * 20)
# Example 4: Another example
nums4 = [5, 6, 8, 11] # Total sum = 30. Half = 15.
# Possible sums: 5, 6, 8, 11, 11(5+6), 13(5+8), 14(6+8), 16(5+11), 17(6+11), 19(8+11),
# 19(5+6+8), 20(5+6+11), 22(5+8+11), 25(6+8+11)
# Largest sum <= 15 is 14 (6+8). S1=14, S2=16. Diff = 2.
result4 = min_partition_dp(nums4)
print(f"Numbers: {nums4}")
print(f"Total sum: {sum(nums4)}")
print(f"Minimum difference: {result4}")
--- Dynamic Programming Examples ---
Numbers: [3, 1, 4, 2]
Total sum: 10
Minimum difference: 0
--------------------
Numbers: [10, 20, 15, 5, 25]
Total sum: 75
Minimum difference: 5
--------------------
Numbers: [1, 2, 7, 8]
Total sum: 18
Minimum difference: 0
--------------------
Numbers: [5, 6, 8, 11]
Total sum: 30
Minimum difference: 2
Explanation of dp
array population:
Let’s trace nums = [3, 1, 4, 2]
with total_sum = 10
. total_sum // 2 = 5
.
dp
array of size 5 + 1 = 6
. Initially dp = [True, False, False, False, False, False]
-
num = 3
:current_sum = 5
(range 5 down to 3)dp[5-3]
(dp[2]
) is False.dp[5]
remains False.
current_sum = 4
dp[4-3]
(dp[1]
) is False.dp[4]
remains False.
current_sum = 3
dp[3-3]
(dp[0]
) is True. Setdp[3] = True
.
dp
is now[True, False, False, True, False, False]
-
num = 1
:current_sum = 5
(range 5 down to 1)dp[5-1]
(dp[4]
) is False.dp[5]
remains False.
current_sum = 4
dp[4-1]
(dp[3]
) is True. Setdp[4] = True
.
current_sum = 3
dp[3-1]
(dp[2]
) is False.dp[3]
remains True.
current_sum = 2
dp[2-1]
(dp[1]
) is False.dp[2]
remains False.
current_sum = 1
dp[1-1]
(dp[0]
) is True. Setdp[1] = True
.
dp
is now[True, True, False, True, True, False]
-
num = 4
:current_sum = 5
(range 5 down to 4)dp[5-4]
(dp[1]
) is True. Setdp[5] = True
.
current_sum = 4
dp[4-4]
(dp[0]
) is True. Setdp[4] = True
. (already true, but conceptually refreshed)
dp
is now[True, True, False, True, True, True]
-
num = 2
:current_sum = 5
(range 5 down to 2)dp[5-2]
(dp[3]
) is True. Setdp[5] = True
. (already true)
current_sum = 4
dp[4-2]
(dp[2]
) is False.dp[4]
remains True.
current_sum = 3
dp[3-2]
(dp[1]
) is True. Setdp[3] = True
. (already true)
current_sum = 2
dp[2-2]
(dp[0]
) is True. Setdp[2] = True
.
dp
is now[True, True, True, True, True, True]
Finally, find the largest s1 <= 5
where dp[s1]
is True
. This is s1 = 5
.
min_diff = abs(total_sum - 2 * s1) = abs(10 - 2 * 5) = abs(10 - 10) = 0
. Correct!
Edge Cases and Considerations
- Empty input array: The DP solution handles this by returning 0, which is correct (difference between two empty sets is 0).
- Single element array:
nums = [42]
.total_sum = 42
.dp
up to21
. Onlydp[0]
is true. Largests1
is0
. Diff =|42 - 2*0| = 42
. Correct, one set is{42}
, other is{}
. - All numbers are the same:
nums = [7, 7, 7, 7]
.total_sum = 28
.total_sum // 2 = 14
. The DP will correctly find that sum14
is achievable (e.g.,{7, 7}
), leading to a difference of0
. - Negative numbers: The problem, as typically defined, assumes positive integers. If negative numbers are allowed, the
total_sum
can be negative, and the targettotal_sum / 2
might also be negative. The DP array index (representing sums) would need to accommodate negative values, often by offsetting the indices or using a dictionary/set instead of a boolean array. For this specific post, we adhere to positive integers as per the common problem definition. - Large sum: The DP approach’s time and space complexity depend on the
total_sum
. Iftotal_sum
is extremely large (e.g.,10^9
), thedp
array would be too big for memory. In such cases, one would need approximation algorithms or heuristics.
Time and Space Complexity of DP Approach
-
Time Complexity:
- Calculating
total_sum
:O(n)
- Initializing
dp
array:O(total_sum)
- Populating
dp
array:n
iterations (for each number) andtotal_sum // 2
iterations in the inner loop. So,O(n * total_sum)
. - Finding
s1
:O(total_sum)
- Overall:
O(n * total_sum)
. - Note: This is considered a pseudo-polynomial time complexity because it depends on the numerical value of the input (
total_sum
) rather than just the number of inputs (n
). If the numbers innums
can be very large,total_sum
can also be very large, making this algorithm slow despite being better than exponential.
- Calculating
-
Space Complexity:
O(total_sum)
for thedp
array.
When to Use Which Approach
- Recursive (Brute-Force): Almost never for practical applications, unless
n
is extremely small (e.g.,n <= 20
) or you’re just demonstrating the concept without optimization. It’s too slow. - Dynamic Programming: This is the standard solution for the Minimum Partition Problem when the
total_sum
of the numbers is within a manageable range (e.g., up to10^5
to10^6
for typical memory constraints). It’s efficient and guaranteed to find the optimal solution. - Heuristics/Approximation Algorithms: If
n
is very large, and especially iftotal_sum
is also very large (e.g., numbers are10^9
), then the DP approach becomes infeasible due to memory constraints for thedp
array. In such scenarios, you would need to use approximation algorithms (like greedy approaches, simulated annealing, or genetic algorithms) that do not guarantee the absolute optimal solution but find a “good enough” solution within reasonable time/memory limits. These are beyond the scope of this particular post.
Conclusion
The Minimum Partition Problem is a classic example of how a seemingly simple problem can quickly lead to an exploration of algorithmic complexity. While the brute-force recursive solution provides a clear conceptual understanding, its exponential time complexity renders it impractical for most real-world scenarios.
Dynamic programming, by transforming the problem into a variation of the Subset Sum Problem, offers a significantly more efficient solution, exhibiting pseudo-polynomial time complexity. This makes it a powerful tool for solving this problem when the sum of numbers is not excessively large.
Understanding such foundational problems and their efficient solutions using techniques like dynamic programming is crucial for any developer building robust and performant applications. Keep experimenting, keep learning!