loading...
Cover image for Amazon's Interview Question: Count Island

Amazon's Interview Question: Count Island

rattanakchea profile image Rattanak Chea ・3 min read

Last year when I was interviewing with Amazon, the first question that I was asked is to write a function get_number_of_islands(matrix) . This was the first time I had seen this problem. It was hard for me. In retrospect, I want to share what I had learned and approaches to solving this problem.

Problem

Given a 2 dimension array matrix of 0s and 1s, count the number of islands of 1s. An island is surrounded by a group of adjacent cells that are all 1s. A cell can only be adjacent to each other horizontally and vertically.

Example

input:  binaryMatrix = [ [0,    1,    0,    1,    0],
                         [0,    0,    1,    1,    1],
                         [1,    0,    0,    1,    0],
                         [0,    1,    1,    0,    0],
                         [1,    0,    1,    0,    1] ]

output: 6  # there are six islands

Approaches

First, you should ask the interviewer questions for clarification, paraphrase the problem to make sure you understand it correctly, and/or state any assumptions.

  • Does the matrix have the same number of columns or rows? (Interviewer: it should not matter)
  • We assume that the cell can be adjacent only horizontally and vertically, not diagonally? (Correct)
  • The function should handle the empty matrix and out of bound data access correctly.

Second, now you can talk about the processes to solve this problem out loud.

Okay, so we want to iterate over each cell in the matrix. Then we can check if the cell is 1 or 0.

It is a 0, it is not an island and we can continue. If it is a 1, it is guaranteed that we found 1 island. (sound like a base case? ) At this point, we need a mechanism to make this cell as visited, so next time around we can skip it to avoid repeated counting. After that, we want to look at all the adjacent cells to repeat the process (recursion).

Pseudocode

def main():
    visited [] # array to keep track of visited cell
    num_islands = 0 # start with zero
    for each row, col in matrix
          num_island += get_number_of_islands(matrix, row, col) #return 1 if it is an island
return num_island
def get_number_of_islands(matrix, row, col, visited):
    check if row and col is out of bound of the matrix
    check if we already visited the cell with row, col
    check if the cell is 0
    => return 0

    mark the cell as visited in the visited array
    recursive call get_number_of_islands() on each adjacent cell
    return 1

Solution (Python)

def get_number_of_islands(binaryMatrix):
    rows = len(binaryMatrix)
    cols = len(binaryMatrix[0])
    # you can use Set if you like
    # or change the content of binaryMatrix as it is visited
    visited = [[0 for col in range(cols)] for r in range(rows)]
    number_of_island = 0
    for row in range(rows):
        for col in range(cols):
            number_of_island += get_island(binaryMatrix, row, col, visited)
    return number_of_island


# get a continuous island
def get_island(binaryMatrix, row, col, visited):
    if not is_valid(binaryMatrix, row, col)
        or visited[row][col] == 1 or binaryMatrix[row][col] == 0:
        return 0

    # mark as visited
    visited[row][col] = 1
    get_island(binaryMatrix, row, col + 1, visited)
    get_island(binaryMatrix, row, col - 1, visited)
    get_island(binaryMatrix, row + 1, col, visited)
    get_island(binaryMatrix, row - 1, col, visited)
    return 1


def is_valid(binaryMatrix, row, col):
    rows = len(binaryMatrix)
    cols = len(binaryMatrix[0])
    return row >= 0 and row < rows and col >= 0 and col < cols

Extra

  • What is the time and space complexities?
  • Question variation: Find the island with the largest size? In this case above, return 5
  • This problem can also be solved without using recursion as well. But it is required a Queue or Stack to keep track of adjacent cells to visit.

Notes

This question can be hard and make an interviewee being stuck (including myself) due to a number of reasons.

  • unfamiliarity with basic Graph data structure and the concepts of DFS or BFS traversal. This is considered a graph with the adjacency matrix.
  • being stuck in handling the base case and recursive case.
  • interview time constraint

*Cover image is obtained from unsplash.com

Posted on by:

Discussion

pic
Editor guide
 

My approach to deal with this kind of problems :

  • first isolate typical cases : the matrix of size 0x0, all 1s, all 0s, a doughnut-shaped island with other small islands inside. This will be useful, because you can quickly discard bad implementations with a bit of thought by testing against your list of cases.
  • When you have an idea of algorithm, check that it's compatible with your cases. Then stash this idea, and try to come up with another one. Don't stop at the first idea you have. Try to find several implementations (at least 2, 3 is best).
  • Weight the pros and cons of each, then select the most promising, and work out the details.

For this specific task my solution is :
Use a double-buffered row (matching the matrix row length), in which I'll put "colours" (a natural will do fine) for each "1" found. Use an invalid colour (-1 for naturals) for the "0"s in the row. You fill one row buffer form left to right, then you swap buffers. To know what colour to assign, you have to check :
1) Is the matching matrix cell land or water ? If it's water, use invalid_colour
2) Else use colour of the cell directly to the left, or the colour of the cell directly above (that's why you need to keep 2 colour rows, but you don't need more). If both are invalid, use a new colour.

Do all the rows until you reach the bottom of the matrix. Initialize the colour buffers with invalid_colour.

By doing that, you will have "coloured" all the landmasses. But it may happen that a single island has 2 colours (Try with a U-shaped island). To avoid counting double you'll use a map that store, for each colour, the "name" of the island. (of course, the "names" too can be naturals). The idea is : when you add a new colour (see step 2 above), also add it to the map with a new island name. And when you detect a 2 different adjacent colours, change the name mapped to one of the colours, so they both map to the same name (by doing that, you're "freeing" an island name, that you can store for reuse later). Detecting adjacent colours is quite easy. In step 2 of the algorithm, if the cell above, and the cell on the left both have a valid colour, and they're different, then you should "merge" names.

At the end, you just want to count the number of different names, which is easy if you used integers, and reused "free"d ones : that's your "highest" name, minus the number of free'd ones remaining at the end.

This would be quite efficient if the matrix is given as a continuous stream of rows of booleans. If the matrix is given as columns, you may want to adapt the algorithm.

 

This is pretty detail. To be honest, I have to read it a couple times for it to make sense. I just want to do good enough for now to pass an interview. :-p

 

An island is surrounded by a group of adjacent cells that are all 1s.

Do you mean it's surrounded by adjacent 0's?

 

a correct way to say it would be "an island is a group of adjacent cells that are all 1s, surrounded by 0s or array's margins"

 

This picture should clarify what it means. Each color group represents each island.
island

 

Thanks, that's what I thought it meant, but wasn't sure.

 

I was asked this question in an interview(not Amazon). Initially I thought that I won't be able to solve it, but was able to with some hints from the interviewer. My solution was in C# and was similar to yours.

 

awesome. It always felt great to be able to find the solution in the end. Did you end up getting that job?

 

Umm, call me dumb, but according to the definition you have provided, there is only one island - centered at the second row fourth column.
The solution rather says an island is an isolated one, or a group of adjacent ones, diagonals excluded; that would give six as a result indeed.

 

Ha! I actually asked the exact same challenge by Amazon few months back. Sadly unable to solve it during the interview itself. But 1.5 hours afterwards, I solved it and then sent my solution to them. Didn't land the job though :)

 

I think you could also solve it using Flood Fill
Algorithm, which is similar to your approach. Thank you for sharing!

 

I have no idea what Flood Fill algorithm is. Will take a look at it. Thanks

 

Tbh, this is a very basic question, solvable in linear time with graph traversal algorithms like BFS/DFS.