tommy-3

Posted on

# Learning Algorithms with JS, Python and Java 6: Array Chunking

This is the sixth article of my attempts to follow Stephen Grider's Udemy course in three different languages. JavaScript solutions are by Stephen. I try to "translate" it into Python and Java.

Today's question is:

--- Directions
Given an array and chunk size, divide the array into many subarrays
where each subarray is of length size
--- Examples
chunk([1, 2, 3, 4], 2) --> [[ 1, 2], [3, 4]]
chunk([1, 2, 3, 4, 5], 2) --> [[ 1, 2], [3, 4], [5]]
chunk([1, 2, 3, 4, 5, 6, 7, 8], 3) --> [[ 1, 2, 3], [4, 5, 6], [7, 8]]
chunk([1, 2, 3, 4, 5], 4) --> [[ 1, 2, 3, 4], [5]]
chunk([1, 2, 3, 4, 5], 10) --> [[ 1, 2, 3, 4, 5]]

I add to each solution the time (ms) it took to divide the array with 10,000,000 elements into subarrays with 1,000 elements.

JavaScript:

``````function chunk1(array, size) { // 227.480ms
const chunked = [];

for (let element of array) {
const lastChunk = chunked[chunked.length - 1];

if (!lastChunk || lastChunk.length === size) {
chunked.push([element]);
} else {
lastChunk.push(element);
}
}

return chunked;
}
``````

Python:

``````def chunk1a(lst: list, size: int) -> list: # 2409.636ms
chunked = []

for element in lst:
if not chunked or len(chunked[-1]) == size:
chunked.append([])
last_chunk = chunked[-1]
last_chunk.append(element)

return chunked
``````

Like in exercise 4, we can't start with `last_chunk = chunked[-1]` as in JS because it would cause an IndexError.

Since this looks for the last element of `chunked` twice, it gets a little faster by rewriting it as:

``````def chunk1b(lst: list, size: int) -> list: # 2014.493ms
chunked = []

for element in lst:
if not chunked:
chunked.append([])
last_chunk = chunked[-1]
if len(last_chunk) == size:
last_chunk = []
chunked.append(last_chunk)
last_chunk.append(element)

return chunked
``````

I also thought of using collections.deque instead of a list:

``````from collections import deque

def chunk1c(lst: list, size: int) -> list: # 2618.956ms
chunked = deque()

for element in lst:
if not chunked or len(chunked[-1]) == size:
chunked.append([])
last_chunk = chunked[-1]
last_chunk.append(element)

return list(chunked)
``````

but this resulted in a bit longer execution time than the first solution.

Java:

``````import java.util.ArrayList;
import java.util.List;

public static List<List<Integer>> chunk1a(List<Integer> list, int size) { // 2072.358ms
List<List<Integer>> chunked = new ArrayList<>();

for (int element : list) {
if (chunked.isEmpty() || chunked.get(chunked.size() - 1).size() == size) {
}
List<Integer> lastChunk = chunked.get(chunked.size() - 1);
}

return chunked;
}
``````

A solution like the Python 1b is much faster than the first one.

``````import java.util.ArrayList;
import java.util.List;

public static List<List<Integer>> chunk1b(List<Integer> list, int size) { // 404.818ms
List<List<Integer>> chunked = new ArrayList<>();

for (int element : list) {
if (chunked.isEmpty()) {
}
List<Integer> lastChunk = chunked.get(chunked.size() - 1);
if (lastChunk.size() == size) {
lastChunk = new ArrayList<>();
}
}

return chunked;
}
``````

It can be improved even more when I use LinkedLists:

``````import java.util.LinkedList;
import java.util.List;

public static List<List<Integer>> chunk1c(List<Integer> list, int size) { // 295.885ms

for (int element : list) {
if (chunked.isEmpty()) {
}
List<Integer> lastChunk = chunked.getLast();
if (lastChunk.size() == size) {
lastChunk = new ArrayList<>();
}
}

return chunked;
}
``````

Incidentally, here's the LinkedList version of the first Java code, and it's much slower than any solution. I wonder why because it looks to me essentially the same as the 1c above.

``````public static List<List<Integer>> chunk1d(List<Integer> list, int size) { // 4556.835ms

for (int element : list) {
if (chunked.isEmpty() || chunked.getLast().size() == size) {
}
List<Integer> lastChunk = chunked.getLast();
}

return chunked;
}
``````

This post has got longer than I'd expected, but now comes the second set of solutions, which are more concise and also faster.

# 2: Using a slice method

JavaScript:

``````function chunk2(array, size) { // 83.652ms
const chunked = [];
let index = 0;

while (index < array.length) {
chunked.push(array.slice(index, index + size));
index += size;
}

return chunked;
}
``````

Python:

``````def chunk2a(lst: list, size: int) -> list: # 240.898ms
chunked = []
index = 0

while index < len(lst):
chunked.append(lst[index:index+size])
index += size

return chunked
``````

A Pythonic one-liner:

``````def chunk2b(lst: list, size: int) -> list: # 234.880ms
return [lst[i:i+size] for i in range(0, len(lst), size)]
``````

Java:

``````import java.util.ArrayList;
import java.util.List;
import java.lang.Math;

public static List<List<Integer>> chunk2(List<Integer> list, int size) { // 1.250ms
int index = 0;

while (index < list.size()) {