re: Amazon's Interview Question: Count Island VIEW POST

FULL DISCUSSION
 

My approach to deal with this kind of problems :

  • first isolate typical cases : the matrix of size 0x0, all 1s, all 0s, a doughnut-shaped island with other small islands inside. This will be useful, because you can quickly discard bad implementations with a bit of thought by testing against your list of cases.
  • When you have an idea of algorithm, check that it's compatible with your cases. Then stash this idea, and try to come up with another one. Don't stop at the first idea you have. Try to find several implementations (at least 2, 3 is best).
  • Weight the pros and cons of each, then select the most promising, and work out the details.

For this specific task my solution is :
Use a double-buffered row (matching the matrix row length), in which I'll put "colours" (a natural will do fine) for each "1" found. Use an invalid colour (-1 for naturals) for the "0"s in the row. You fill one row buffer form left to right, then you swap buffers. To know what colour to assign, you have to check :
1) Is the matching matrix cell land or water ? If it's water, use invalid_colour
2) Else use colour of the cell directly to the left, or the colour of the cell directly above (that's why you need to keep 2 colour rows, but you don't need more). If both are invalid, use a new colour.

Do all the rows until you reach the bottom of the matrix. Initialize the colour buffers with invalid_colour.

By doing that, you will have "coloured" all the landmasses. But it may happen that a single island has 2 colours (Try with a U-shaped island). To avoid counting double you'll use a map that store, for each colour, the "name" of the island. (of course, the "names" too can be naturals). The idea is : when you add a new colour (see step 2 above), also add it to the map with a new island name. And when you detect a 2 different adjacent colours, change the name mapped to one of the colours, so they both map to the same name (by doing that, you're "freeing" an island name, that you can store for reuse later). Detecting adjacent colours is quite easy. In step 2 of the algorithm, if the cell above, and the cell on the left both have a valid colour, and they're different, then you should "merge" names.

At the end, you just want to count the number of different names, which is easy if you used integers, and reused "free"d ones : that's your "highest" name, minus the number of free'd ones remaining at the end.

This would be quite efficient if the matrix is given as a continuous stream of rows of booleans. If the matrix is given as columns, you may want to adapt the algorithm.

 

This is pretty detail. To be honest, I have to read it a couple times for it to make sense. I just want to do good enough for now to pass an interview. :-p

code of conduct - report abuse