Steven Hur

How I Fixed a Confusing Bug in NumPy

Contributing to a massive open-source project like NumPy can feel intimidating. You imagine complex C code, advanced math, and scary build processes. But sometimes, a bug is just a simple logic error hiding in plain sight.

I just submitted a Pull Request to NumPy to fix a bug that was causing misleading error messages in numpy.convolve. Here’s the story of the bug, the fix, and how I verified it.

"Wait, What?"
Imagine you are using numpy.convolve. You accidentally pass an empty array as your first argument, but your second argument is perfectly fine.

import numpy as np

a = np.array([])      # Empty!
v = np.array([1, 2])  # Not empty!

np.convolve(a, v)

You would expect an error saying a cannot be empty, right? Instead, NumPy screams at you:

ValueError: v cannot be empty

Wait... what? I know v isn't empty. I just double-checked it! This is the kind of error message that sends developers down a rabbit hole for an hour, debugging the wrong variable.

Keep Calm, Just Find the Bug
I searched through the NumPy source code, in numpy/_core/numeric.py, to see what was happening under the hood. The logic looked something like this:

# The original buggy logic
def convolve(a, v, mode='full'):
    # ...

    if (len(v) > len(a)):
        a, v = v, a  # <--- The SWAP happens here!

    # Validation
    if len(a) == 0:
        raise ValueError('a cannot be empty')
    if len(v) == 0:
        raise ValueError('v cannot be empty') # <--- The error triggers here

Do you see the problem?

  1. The function sees that v is longer than a.
  2. It decides to swap them for performance reasons.
  3. Now, internally, variable v holds the empty array.
  4. The check if len(v) == 0 triggers, raising ValueError: v cannot be empty.

The function swapped the contents of the two variables, but each error message is hardcoded to a parameter name. After the swap, the message points at the wrong argument. It was basically gaslighting the user.
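To make the steps above concrete, here is a minimal pure-Python sketch of the swap-then-validate order. The helper name buggy_empty_check is made up for illustration, it works on plain lists, and it returns the message instead of raising; it is a model of the snippet above, not NumPy's actual code.

```python
def buggy_empty_check(a, v):
    """Model of the swap-then-validate order; returns the error text or None.

    Hypothetical helper for illustration only, not part of NumPy.
    """
    # The swap runs first, so the names a and v may no longer match
    # the arguments the caller passed in.
    if len(v) > len(a):
        a, v = v, a

    # The messages are tied to the local names, not the caller's arguments.
    if len(a) == 0:
        return "a cannot be empty"
    if len(v) == 0:
        return "v cannot be empty"
    return None


cases = {
    "a empty, v non-empty": ([], [1, 2]),
    "a non-empty, v empty": ([1, 2], []),
    "both empty": ([], []),
    "neither empty": ([1, 2, 3], [1, 2]),
}

for label, (a, v) in cases.items():
    print(f"{label}: {buggy_empty_check(a, v)}")
```

In this model, only the first case blames the wrong argument: the empty a is swapped into v before the checks run. When v is the empty one, no swap happens, so the message is accurate.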

Check First, Optimize Later
The fix was simple. We just needed to ensure the input validation happens before any internal swapping takes place.

I changed the order of operations:

# The fixed logic
def convolve(a, v, mode='full'):
    # ...

    # 1. Check for empty inputs FIRST
    if len(a) == 0:
        raise ValueError('a cannot be empty')
    if len(v) == 0:
        raise ValueError('v cannot be empty')

    # 2. THEN perform the optimization swap
    if (len(v) > len(a)):
        a, v = v, a

Now, if a is empty, it gets caught immediately, and the user gets the correct error message, a cannot be empty.
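One way to check this kind of reordering is a small regression-style test. The sketch below uses a stand-in function on plain lists; validate_then_swap and error_message are hypothetical names for illustration, and this is not the test from the PR.

```python
def validate_then_swap(a, v):
    """Stand-in for the reordered checks (hypothetical, not NumPy's code)."""
    # Validate while a and v still refer to the caller's arguments.
    if len(a) == 0:
        raise ValueError("a cannot be empty")
    if len(v) == 0:
        raise ValueError("v cannot be empty")

    # Only then swap so the longer sequence comes first.
    if len(v) > len(a):
        a, v = v, a
    return a, v


def error_message(a, v):
    """Return the ValueError text for the given inputs, or None if they pass."""
    try:
        validate_then_swap(a, v)
    except ValueError as exc:
        return str(exc)
    return None


# Each message should name the argument that was actually empty.
assert error_message([], [1, 2]) == "a cannot be empty"
assert error_message([1, 2], []) == "v cannot be empty"
assert error_message([], []) == "a cannot be empty"

# Valid inputs pass, and the swap still puts the longer one first.
assert error_message([1], [1, 2]) is None
assert validate_then_swap([1], [1, 2]) == ([1, 2], [1])
```

The last assertion matters as much as the first three: moving the checks should not change what happens to valid inputs.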

This was a small change, just moving a few lines of code, but it significantly improves the developer experience. No one likes misleading error messages.

It was a great reminder that you don't need to be a math genius to contribute to libraries like NumPy. Sometimes, you just need to spot a logic bug and move some if statements around.

My PR is up! Fingers crossed for the merge.
