DEV Community

Florian Dietz
What anomaly and bug detections would you like to see automated?

I am working on a debugging tool for neural networks (https://github.com/FlorianDietz/comgra). Currently it is useful for visualization and in-depth manual analysis, capabilities that TensorBoard and other tools lack.
I want to extend it to automate many of the common analyses and anomaly detections in order to save developer time.

I am looking for suggestions on what would be the most useful for you.

How it would work:

You run a number of trials on similar networks and tasks, varying the hyperparameters. The tool logs all relevant data and automatically detects anomalies such as "vanishing gradients", "the loss has unusually high variance", or "the classification is imbalanced and works poorly on targets of type X".
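As a concrete illustration of the first step, here is a minimal sketch of one such check. It flags "vanishing gradients" in layers whose average gradient norm falls far below the network-wide median. The logging format (a dict of per-layer gradient norms) and the threshold are assumptions for illustration, not comgra's actual API:

```python
import numpy as np

def detect_vanishing_gradients(grad_norms_per_layer, rel_threshold=1e-3):
    """Flag layers whose gradients appear to vanish.

    grad_norms_per_layer: dict mapping layer name -> list of gradient norms
    recorded over training steps (hypothetical logging format).
    Returns the layers whose mean norm is below rel_threshold times the
    median of all layers' mean norms.
    """
    # Average each layer's gradient norm over the recorded steps.
    means = {layer: float(np.mean(norms))
             for layer, norms in grad_norms_per_layer.items()}
    # Compare each layer against the network-wide median as a robust baseline.
    overall_median = float(np.median(list(means.values())))
    return [layer for layer, m in means.items()
            if m < rel_threshold * overall_median]
```

Comparing against a relative baseline rather than an absolute cutoff keeps the check meaningful across models with very different gradient scales.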

In a second step, it performs a correlation analysis between the hyperparameters of each trial and the anomalies detected in those trials. It then generates a list of warnings for each statistically significant finding. For example:

"30% of trials with learning rate above 3e-4 had vanishing gradients, versus 0% of trials with learning rate below 3e-4."

"50% of trials with architectural variant X had unusually high variance in the loss, versus 10% of trials with other architectural variants."

A large list of automatically generated warnings like these would let you identify bugs very quickly. Conversely, if no warnings are generated, you can be much more confident in the stability of your model.

Of course, many warnings would also be false positives that aren't worth investigating, but I imagine it's better to be warned for no reason than to miss a problem that actually matters.

What do you think of the idea?

What types of anomalies do you think would make the most sense to look for?
