One of the assignments in my introductory computer systems course was to write a memory allocator. We were to be graded on speed and memory usage (as well as correctness, obviously), and one of the best optimizations I made was to switch out my countLeadingZeros function with the equivalent GCC built-in.

That experience is one of many that has taught me not to re-invent the wheel. But as an exercise, I think implementing a bit-counting function can quickly demonstrate someone’s proficiency with concepts that still come up in low-level systems programming. Whether it is useful day-to-day is another question, sadly.

For my toy implementation in Python, I started by defining the simplest possible bit-counting function.

def countBits(n):
    count = 0
    for i in range(32):
        if (n >> i & 1):
            count += 1
    return count

I could try to do clever things like calculate the number of times I need to shift (no more than \(\lceil \log_{2}(n+1) \rceil\)), but I am only going to use this function a few times to pre-compute values for a lookup table.

lookup = [countBits(i) for i in range(16)]

Once I have my lookup table, I can break any integer I care about into pieces and look up the number of ones in each piece.

def countBitsPrecomputed(n):
    sum = 0
    for shift in range(0,32,4):
        idx = (n >> shift) & 0b1111
        sum += lookup[idx]
    return sum