Xor Trick to Solve Odd Numbered Integers

Johann “Myrkraverk” Oskarsson

September 19, 2022

*Problem description.* A friend online, George, gave this coding exercise problem
the other day.

*Create a function f that when given an array of numbers, returns the number
that appears an odd amount of times.*

*Only one number will ever appear an odd number of times, and there will
always be one number that appears an odd amount of times.*

*e.g.*

*f([4]) //=> 4*

*f([0]) //=> 0*

*f([1,1,2,3,3]) //=> 2*

*f([1,7,1,7,3,7,3]) //=> 7*

Now, there is a neat Xor trick to solve this problem. What follows is the proof. To keep things simple, yet interesting, we will only concern ourselves with two-bit integers.

I did not think of this trick myself, my own solution was the naïve implementation described below, coded in C. I am merely explaining how and why it works.

Rather than a truly rigorous proof like we’d see in a math journal this one is
rather informal, and can probably be looked at as a *sketch of a proof*, more than a
mathematical proof on its own.

*A naïve way to solve the problem.* Let’s look at the last example from George
adapted to two-bit integers with a 7→2 substitution.

f([1,2,1,2,3,2,3]) //=> 2

The way a human would probably do this, is to sort first,

f([1,1,2,2,2,3,3]) //=> 2

and then count, we have 2×1, which is even, so we continue, and count 3×2 which is odd, so we stop. We have found our answer, 2, and don’t have to count the rest.

*Introduction to Xor.* People intimately familiar with the Xor function can skip this
section.

Xor is a bitwise function, and is *one* when both bits are different, and *zero*
otherwise. The long form name of it is *exclusive or*, and is almost always referred
to as *xor* in comman parlance. Let’s look at a truth table for two-bit integers.

a |
b |
a ⊕ b |

00 |
00 |
00 |

00 |
01 |
01 |

00 |
10 |
10 |

00 |
11 |
11 |

01 |
00 |
01 |

01 |
01 |
00 |

01 |
10 |
11 |

01 |
11 |
10 |

10 |
00 |
10 |

10 |
01 |
11 |

10 |
10 |
00 |

10 |
11 |
01 |

11 |
00 |
11 |

11 |
01 |
10 |

11 |
10 |
01 |

11 |
11 |
00 |

The above table is in binary. It covers the numbers in the set {0, 1, 2, 3}. Hand
checking this table, we can confirm that *xor* is commutative. That means it does
not matter which number we start with, the result is the same left to right, or right
to left. We will use this property later.

Another important property of *xor* is that a number *xor*ed with itself is always
zero. We will make use of this property too.

*The proof*. Now, let’s take a look at the example problem above,

f([1,1,2,2,2,3,3]) //=> 2

and if we then go ahead and *xor* together all the numbers separately, we get

1⊕1⊕2⊕2⊕2⊕3⊕3

which we can simplify, by noting that every two numbers that are the same, are zero,

0⊕0⊕2⊕0 where 1⊕1=0, 2⊕2=0, and 3⊕3=0.

Now, we also know that 0⊕0=0, so this further simplifies to

0⊕2⊕0

and since a number *xor*ed with zero is the number itself, it further simplifies to

2.

All of this sounds rather pointless until we realize that *xor* is commutative, and we
don’t have to sort first. We can simply *xor* all the numbers together, and the last
one standing is the number we want,

1⊕2⊕1⊕2⊕3⊕2⊕3=2. □

*Optimality*. While at the surface, counting the individual numbers means we can
stop as soon as we hit the odd numbers, and therefore do not have to look at the
rest of the numbers, bringing the average case down to something less than O(n)
1,
the fact remains that a computer must sort first, and therefore look at all the
numbers anyway.

That means the *xor* method is the fastest we have.