You will never achieve a 100% accurate machine learning algorithm. If you have one, please take the trouble to prove that it is not over-fitting the training dataset.
Even the best-performing machine learning algorithms for a given task top out well below perfection. For example, for the part-of-speech tagging problem in natural language processing, the state of the art is about 97%. Similarly, in the well-known ImageNet challenge for large-scale visual recognition, the 2016 results show about 97% accuracy for object detection.
Thus the conclusion: you will never achieve 100% accuracy, which implies that some of the outputs of a machine learning algorithm will be incorrect.
Based on these two facts, there is one important conclusion we can draw: we should not write purely deterministic code on top of the outputs of machine learning algorithms. Currently, most of our code is written in a deterministic style; basically, we know whether a bit is 0 or 1.
int i = 5;
We know that the value of i is 5, fully deterministic. But if the value of i came from a machine learning algorithm, we could not be sure that the value is 5.
For a real-world example: if an image classifier says an object is a table, we cannot be fully certain that the object is a table, because there is still roughly a 3–4% chance that it is not.
This has a significant impact on any software whose architecture relies on the predictions of machine learning algorithms.
We faced this problem ourselves during our participation in the RoboCup@Work competition, in which a mobile robot must solve pick-and-place tasks in an industrial environment.
The problem was that objects were sometimes mis-classified. Because we did not handle this corner case, we ended up picking the wrong objects and losing a lot of points during the competition. Such problems can be addressed with solutions that accommodate these corner cases.
There are different ways to use machine learning predictions robustly in your software architecture. Some solutions we have used are:
Most state-of-the-art machine learning algorithms (SVMs, random forests, CNNs, etc.) provide a confidence value for their predictions. This should be used to decide which predictions to act on.
But not all algorithms provide reliable confidence measures. For example, a CNN classifier sometimes mis-classifies with near-100% confidence, which makes it difficult to use confidence alone to discard faulty predictions.
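A minimal sketch of confidence gating: act on a prediction only when its confidence clears a threshold, and otherwise fall back to safe behaviour (re-sensing, asking for help, or skipping the object). The `(label, confidence)` format and the threshold value are assumptions for illustration, not the authors' actual code.

```python
CONFIDENCE_THRESHOLD = 0.8  # assumed value; tune per task and per model

def accept_prediction(label, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Return the label if the model is confident enough, else None.

    Returning None signals the caller to trigger a fallback behaviour
    (e.g. take another image) instead of acting on a shaky prediction.
    """
    return label if confidence >= threshold else None

# The robot only attempts a pick when classification is confident.
print(accept_prediction("table", 0.96))  # -> table
print(accept_prediction("table", 0.55))  # -> None (re-sense instead)
```

As noted above, this gate is only as good as the confidence values themselves, so it works best combined with the multi-prediction approach below.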
Never rely on a single prediction in the real world. For example, when doing image classification, take multiple images from different angles or with different cameras and make multiple predictions. Then use a knowledge-fusion method to combine these predictions and determine the most likely correct one.
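The simplest fusion method is a majority vote over the predicted labels, optionally requiring a minimum level of agreement before committing to a decision. This sketch assumes one label per captured image; the agreement threshold is an illustrative choice.

```python
from collections import Counter

def fuse_predictions(predictions, min_agreement=0.6):
    """Majority vote over labels predicted from multiple views/cameras.

    predictions: list of predicted labels, one per image.
    Returns the winning label only if it reaches min_agreement,
    otherwise None (signalling "not sure, re-sense or skip").
    """
    if not predictions:
        return None
    label, count = Counter(predictions).most_common(1)[0]
    return label if count / len(predictions) >= min_agreement else None

print(fuse_predictions(["table", "table", "chair", "table"]))  # -> table
print(fuse_predictions(["table", "chair"]))                    # -> None
```

Voting on hard labels discards the per-class probabilities; averaging the probability vectors across views is a slightly richer variant of the same idea.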
These are simple solutions, but more sophisticated methods using filters and Bayesian techniques will improve the results further.
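To make the Bayesian idea concrete, here is a minimal sketch of sequential Bayesian fusion: treat each classifier output as a likelihood over classes and update a posterior belief observation by observation. The class set and the per-view probability vectors are made-up numbers, not measured data.

```python
def bayes_update(prior, likelihood):
    """One Bayes step: multiply prior by likelihood per class, renormalise."""
    unnorm = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnorm.values())
    return {c: p / total for c, p in unnorm.items()}

belief = {"table": 0.5, "chair": 0.5}  # uniform prior over classes

# Classifier outputs (class probabilities) from three different views.
observations = [
    {"table": 0.7, "chair": 0.3},
    {"table": 0.6, "chair": 0.4},
    {"table": 0.8, "chair": 0.2},
]

for obs in observations:
    belief = bayes_update(belief, obs)

print(max(belief, key=belief.get))  # -> table
```

Unlike a plain vote, this keeps a running belief that weights consistent evidence more heavily, and it extends naturally to filters that also model how the scene changes over time.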
The main takeaway from this discussion is that the outputs of machine learning algorithms are not deterministic. Software that uses these outputs (predictions) cannot be based on simple if-else conditions; on the contrary, it should have mechanisms to accommodate faulty predictions.
Please comment on the methods you have used to accommodate faults in the predictions of your algorithms.