One of the assignments in my introductory computer systems course was to write a memory allocator. We were to be graded on speed and memory usage (as well as correctness, obviously), and one of the best optimizations I made was to switch out my countLeadingZeros function with the equivalent GCC built-in.

That experience is one of many that has taught me not to re-invent the wheel. But as an exercise, I think implementing a bit-counting function can quickly demonstrate someone’s proficiency with concepts that still come up in low-level systems programming. Whether it is useful day-to-day is another question, sadly.

For my toy implementation in Python, I started by defining the simplest possible bit-counting function.

def countBits(n):
count = 0
for i in range(32):
if (n >> i & 1):
count += 1
return count


I could try to do clever things like calculate the number of times I need to shift (no more than $$\lceil \log_{2}(n+1) \rceil$$), but I am only going to use this function a few times to pre-compute values for a lookup table.

lookup = [countBits(i) for i in range(16)]


Once I have my lookup table, I can break any integer I care about into pieces and look up the number of ones in each piece.

def countBitsPrecomputed(n):
sum = 0
for shift in range(0,32,4):
idx = (n >> shift) & 0b1111
sum += lookup[idx]
return sum