Photo: Two zebras
Photo by Geran de Klerk on Unsplash

The two-sum puzzle

Given an array of integers and an integer target, return indices of the two numbers such that they add up to the target.

Summing up two elements to add to a specified integer is a classic (and basic) interview problem.

A good interviewer will discuss a few variants for this problem explain time-complexity and space complexity implications, identify edge cases, write a few test cases and write working code.

Let's look at an input and its expected output:

// input
int[] data = {1, 3, -1, 5, 8, 10};
int sum = 9;

// output
int[] solution = {0, 4};

The two elements, at indexes 0 and 4, sum up to the searched value 9.

Let's see some code.

Simplest solution

The simplest solution would be to iterate over each element in the array twice and stop when we found a solution.

public static int[] findSumTwo(int[] data, int sum) {
    // iterate until the next to last element
    for (int i = 0; i < data.length-1; i++) {
        // iterate to the last element, starting from the second element
        for (int j = 1; i < data.length; j++) {
            // (the double loop will start from the
            // {0, 1} index pair and stop iterating at {n-1, n})

            // if the elements add up, return their indices
            if (data[i]+data[j] == sum) return new int[]{i,j};
        }
    }

    // not found
    return new int[]{-1};
}

This implementation has terrible performance (O(N^2)) since we could be looking for 18, and the result would be the last two numbers in the array.

It works, but I wouldn't use it in an interview.

We can do better!

A fast but costly solution

Since we're only looking at two changing values, we can precompute differences to the sum we've already seen.

Going through our initial array ({1, 3, -1, 5, 8, 10}), we can compute and store the difference to the sum for each element; given the desired value of 9, we can add the following to a set:

  • 9-1 = 8
  • 9-3 = 6
  • 9-(-1) = 10
  • 9-5 = 4
  • 9-8...

When we reach 8 at position 4, we can determine that it forms a solution pair since it already exists in our set (position 0).

We can now stop and return.

But since the problem is asking for indexes, we also need to store the position of the elements making up the solution. We can do this by using a HashMap<Integer, Integer> data structure or by using an extra int[] array.

Let's see some code:

public static int[] findSumTwo(int[] data, int sum) {
    // value->index map
    Map<Integer, Integer> seen = new HashMap<>();

    // iterate through the array once (O(N))
    for (int pos=0; i < data.length; i++) {
        // assigning the element for convenience
        // because it reads better
        int element = data[i];

        // if we've already seen the current element
        if (seen.containsKey(element)) {
            // form a solution
            return new int[]{seen.get(element), pos};
        }

        // otherwise mark its difference as 'seen'
        seen.put(sum-element, i);
    }

    // not found
    return new int[]{-1};
}

This looks pretty clean, but what's the cost?

We're only iterating through the input once, so time-complexity-wise it's O(N), but we're also storing all the differences in the seen map.

We are accruing a space-complexity code of at least O(N).

With a sufficiently large input, this would take up a lot of extra heap, or it wouldn't work at all.

Can we do better?

No extra space-complexity

You might have noticed that the input array in the example above is already sorted.

This may be a trap, don't fall for it! Don't assume things and ask clarifying questions to determine what the interviewer is interested in. (I will be writing more about interview best practices in a later article.)

If the input is sorted, we can rely on a bit of math to find a more optimal solution. If we chose two starting indices, one at the beginning of the array (left) and one at the end (right), we could be sure that incrementing the left index or decrementing the right index would result in either a larger sum or a lower sum, respectively.

Given left + right = found, by comparing found to sum, we would know that the searched element needed to be larger, smaller, or the one we were looking for.

Here's how the implementation looks like:

public static int[] findSumTwo(int[] data, int sum) {
    int left=0;
    int right=data.length-1;

    // only iterate until the positions are valid
    // O(N)
    while (left < right) {
        int found = data[left] + data[right];

        if (found == sum) {
            // we have found a solution
            return new int[]{left, right};
        } else if (found < sum) {
            // we need a larger number
            left++;
        } else {
            // or a lower one
            right--;
        }
    }

    // not found
    return new int[]{-1};
}

Since we can't assume the input is already sorted, we would also need to sort it:

// most sorting algorithms run in O(N*lgN) time
Arrays.sort(data);

And there you have it - Another solution to this problem.

At this point, you may be congratulating yourself for a job well done. Don't...

Edge cases

Never call a coding interview complete before talking about edge-cases! Figuring out how your code can break is a crucial trait of a talented software engineer!

Let's consider some inputs:

// empty input
int[] data = {};
int sum = /*irrelevant*/;

// null input
int[] data = null;
int sum = /*irrelevant*/;

// overflow
int[] data = {Integer.MAX_VALUE, Integer.MAX_VALUE};
int sum = -2; // doesn't seem correct, but it is given the Integer input

// not enough elements
int[] data = {9};
int sum = 9; // we need at least two elements for a valid solution

// equal elements
int[] data = {8,8};
int sum = 16; // this would cause the first solution to wrongly return {0,0}

// unsorted input
int[] data = {3,1};
int sum = 4; // the correct solution is {0,1}

// no solution
int[] data = {1,1};
int sum = 3; // no pair can sum up to 3

A few tips to keep in mind:

  • always assume the input can be tainted unless explicitly told otherwise
  • think of null values and how they can break the code
  • do not assume the input is sorted
  • clarify what the output should look like
  • avoid printing outputs - it's hard to test and results in messy code
  • understand the consequences of making a choice (e.g., "if I need to sort the input, the time-cost is at least N*lgN)
  • explicitly state assumptions (e.g., "I assume the input is sorted", or "I assume the input has a valid solution")
  • show a good understanding of big-O notation and algorithmic performance

That's all for now!

I'll leave you with some food for thought: How would you solve this problem if the input could not fit in the memory of a single instance?

Let me know on this Twitter thread!

If you enjoyed this post, please share it with your friends!