You will never achieve a 100% accurate machine learning algorithm. If you have one, then please take the trouble to prove to us that it is not over-fitting the training dataset.
The best performing machine learning algorithms for any particular task achieve around 90% accuracy. For example, for the parts-of-speech tagging problem in natural language processing, the state of the art is around 97%. Similarly, in the well-known ImageNet Large Scale Visual Recognition Challenge, the 2016 results show around 97% accuracy for object detection.
Thus the conclusion is that you will never achieve 100% accuracy, which implies that some of the outputs of a machine learning algorithm will be incorrect.
Based on these two facts, the most important conclusion we can draw is that we should not write deterministic code based on the outputs of machine learning algorithms. Currently all our code is written in a deterministic style: basically, we know whether a bit is 0 or 1.
For example
int i = 5;
we know that the value of i is 5, fully deterministic. But if the value of i came from a machine learning algorithm, we cannot be sure that the value is 5.
For a real-world example, if an image classifier says the object is a table, we cannot be fully certain that the object is a table, because there is a 4% chance that it is not a table.
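To make the difference concrete, here is a minimal sketch in Python (the classifier output shown is made up for illustration): an ML prediction should be carried through the software as a label together with its probability, not as a bare value.

i = 5                          # deterministic: we know i is exactly 5
prediction = ("table", 0.96)   # classifier output: a label plus a probability, never a certainty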
This property of machine learning has a major impact on software that uses the predictions of machine learning algorithms in its architecture.
We faced this problem during our participation in the RoboCup@Work competition, in which a mobile robot has to solve pick-and-place tasks in an industrial environment.
Before picking, the robot perceives the location and decides which object to pick.
We use a CNN-based deep learning algorithm to classify the objects that need to be picked, with 99% accuracy on our training dataset.
The problem is that sometimes the objects are mis-classified. Because we were not handling this corner case, we ended up picking the wrong objects and losing a lot of points during the competition. These problems can be addressed with solutions that accommodate such corner cases.
There are different solutions for robustly using machine learning predictions in your software architecture. Some solutions we have used are:
Using the Confidence Parameter of the Machine Learning Algorithm
Most state-of-the-art machine learning algorithms (SVM, random forest, CNN, etc.) provide a confidence parameter for their predictions. This can be used to determine which predictions to trust.
But not all algorithms provide well-behaved confidence measures. For example, a CNN classifier sometimes mis-classifies with 100% confidence. This makes it difficult to use confidence alone as a parameter for discarding faulty predictions.
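As a rough sketch of how such a confidence check can look (assuming the classifier exposes a per-prediction confidence such as a softmax score; the 0.8 threshold is purely illustrative and has to be tuned per task):

def accept_prediction(label, confidence, threshold=0.8):
    # Discard predictions the classifier itself is unsure about.
    # This only helps if the confidence is reasonably calibrated;
    # an over-confident mis-classification will still slip through.
    if confidence >= threshold:
        return label
    return None  # the caller must handle "no reliable prediction"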
Repeated Predictions
Never rely on a single prediction in the real world. For example, when doing image classification we should take multiple images from different angles or different cameras and make multiple predictions. Then use a knowledge fusion method to combine these predictions and determine the correct one, as sketched below.
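A minimal sketch of one such fusion method, simple majority voting over repeated predictions (the function name and example labels are illustrative):

from collections import Counter

def fuse_by_majority(labels):
    # labels: predicted classes from several images / angles / cameras
    counts = Counter(labels)
    best_label, votes = counts.most_common(1)[0]
    return best_label, votes / len(labels)  # winning label and its vote share

# Example: three of four views agree, so we trust "table".
label, agreement = fuse_by_majority(["table", "table", "box", "table"])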
These are simple solutions, but more sophisticated methods using filters and Bayesian techniques will improve the results further.
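For instance, under the (strong) assumption that the individual predictions are independent, the per-class probabilities from repeated predictions can be fused by multiplying them and renormalising, a naive Bayes style fusion; the numbers below are made up:

def bayes_fuse(prob_dicts):
    # prob_dicts: list of {class: probability} from independent predictions
    classes = prob_dicts[0].keys()
    fused = {c: 1.0 for c in classes}
    for probs in prob_dicts:
        for c in classes:
            fused[c] *= probs[c]
    total = sum(fused.values())
    return {c: p / total for c, p in fused.items()}

# Two noisy views of the same object reinforce the "table" hypothesis.
fused = bayes_fuse([{"table": 0.7, "box": 0.3}, {"table": 0.6, "box": 0.4}])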
Conclusion
The main takeaway from this discussion is that the outputs of machine learning algorithms are not deterministic. So software that uses these outputs (predictions) cannot be based on simple if-else conditions; on the contrary, it should have mechanisms through which it can accommodate faulty predictions.
Please comment on the methods you have used to accommodate faults in the predictions of your algorithms.
Top comments (6)
On some level, determinism is a fleeting goal in software development in general. Things can quickly get so complicated that strict determinism that we can actually keep track of is pretty unlikely.
I would say it was the fleeting goal in software development, especially in the field I am from, robotics. Here we cannot be sure of anything (sensor readings, algorithm outputs); it's always a probability with each value. We feel it's because of the real world the robots interact with, which cannot be measured deterministically.
In robotics we are moving away from deterministic approaches and converging towards probabilistic approaches. A good example is the problem of navigation in robots, which is solved using probabilistic algorithms (SLAM).
Complexity has been the major drawback, but we are building methods and tools to address it.
For example, probabilistic programming languages are new tools for creating software when you have non-deterministic inputs.
Oh yeah, I definitely agree.
@deebuls , cool post.
and @ben ... from Halifax? if so, the world is small, and cool site!
Ben from Halifax indeed. I'm actually going home tomorrow if you're around. 😄
Have a good time back home! It'd be great to catch up, but I'm living in Germany, and unfortunately I won't be heading home this year; I'm using my holidays to go to Vancouver.