Without getting into the weeds, here are my high-level thoughts:
Given the input data and approach, are you sure you should be able to get accuracy in the 0.94 range? Does the exercise indicate that results like this are achievable with the algorithm you tried?
This ☝️ brings up a second important point: it would be useful to decompose the problem into its key components and check each one separately. Are you sure the loading and broadcasting steps are working the way you think they should? Have you tried your distance function on some simple data and gotten the result you expected? Finally, if all of the above works as expected, could you put together some data that should produce 100% accuracy, and then test that?
By looking at each piece individually (ideally with a lightweight unit test), you can start to narrow down the list of potential causes.
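To make that concrete, here is a rough sketch of the kind of lightweight checks I mean. I haven't seen your code, so everything here is an assumption: `euclidean_distance`, `predict_nearest`, and `accuracy` are hypothetical stand-ins, and I'm guessing at a nearest-neighbour style classifier purely for illustration. Swap in your own functions and data.

```python
import math


def euclidean_distance(a, b):
    """Hypothetical stand-in for your own distance function."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_nearest(train_points, train_labels, query):
    """Return the label of the single closest training point (assumed 1-NN)."""
    distances = [euclidean_distance(p, query) for p in train_points]
    return train_labels[distances.index(min(distances))]


def accuracy(train_points, train_labels, test_points, test_labels):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(
        predict_nearest(train_points, train_labels, q) == label
        for q, label in zip(test_points, test_labels)
    )
    return correct / len(test_points)


# Check 1: the distance function on inputs where the answer is known by hand.
assert euclidean_distance((0, 0), (3, 4)) == 5.0  # 3-4-5 triangle
assert euclidean_distance((1, 2), (1, 2)) == 0.0  # identical points

# Check 2: data that must score 100%. Two well-separated clusters, evaluated
# on the training points themselves, so every query has a zero-distance match.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels = ["a", "a", "b", "b"]
assert accuracy(points, labels, points, labels) == 1.0
```

If the first check fails, the bug is in the distance function. If only the second fails, look at how predictions are selected or how the data is loaded and lined up with its labels.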
Thanks for your help. I had been putting off tests, but that was the next step in the plan.
Actually, running through a much simpler case in a REPL ended up doing it for me. See my comment about my solution.
But your comments about going back to debugging basics and slowly and methodically validating one piece of logic at a time were what put me back on the right track, so thanks!
Glad you were able to figure it out, nice work!