Discussion on: Help Me Understand This Vectorized Logic


Without getting into the weeds, here are my high-level thoughts:

Given the input data and approach, are you sure that you should be able to get accuracy in the 0.94 range? Does the exercise indicate that you should be able to get results like this with the approach you tried?

That ☝️ brings up a second important point: it would be useful for you to decompose the problem into its key components and check each of those components separately. Are you sure the loading and broadcasting steps are working the way you think they should? Have you tried your distance function on some simple data and gotten the result you expected? Finally, if all of the above are working as expected, could you put together some data which should receive 100% accuracy, and then test that out?

By looking at each piece individually (ideally with a lightweight unit test), you can start to narrow down the list of potential causes.
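As a rough sketch of what those checks could look like: the snippet below assumes a nearest-neighbor style classifier with a Euclidean distance computed via NumPy broadcasting, which may not match your code exactly. The names `pairwise_distances` and `predict_nearest` are made up for illustration and are not taken from the original post.

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distance between every row of a and every row of b.

    a has shape (n, d), b has shape (m, d); the result has shape (n, m).
    (Hypothetical helper, assumed for this sketch.)
    """
    # a[:, None, :] is (n, 1, d) and b[None, :, :] is (1, m, d);
    # broadcasting expands both to (n, m, d) before the subtraction.
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def predict_nearest(train_x, train_y, test_x):
    """Label each test row with the label of its closest training row.

    (Hypothetical helper, assumed for this sketch.)
    """
    nearest = pairwise_distances(test_x, train_x).argmin(axis=1)
    return train_y[nearest]


# Check 1: the distance function on data small enough to verify by hand
# (a 3-4-5 right triangle).
a = np.array([[0.0, 0.0], [3.0, 4.0]])
b = np.array([[0.0, 0.0]])
dists = pairwise_distances(a, b)
assert dists.shape == (2, 1)
assert np.allclose(dists, [[0.0], [5.0]])

# Check 2: data that must score 100%. Classifying the training set
# against itself means every point's nearest neighbor is itself.
train_x = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
train_y = np.array([0, 0, 1, 1])
accuracy = (predict_nearest(train_x, train_y, train_x) == train_y).mean()
assert accuracy == 1.0
```

If the shape assertion fails, the broadcasting step is the first suspect. If the shapes are right but the 100% case fails, the bug is more likely in the distance math or in how labels are looked up.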

Ryan Palo • Aug 1 '19

Thanks for your help. I had been putting off tests, but that was the next step in the plan.

Actually, running through a much simpler case in a REPL ended up doing it for me. See my comment about my solution.

But your comments about going back to debugging basics and slowly and methodically validating one piece of logic at a time were what put me back on the right track, so thanks!

Evan Oman • Aug 1 '19

Glad you were able to figure it out, nice work!