Discussion on: Removing duplicates in an Array of Objects in JS with Sets


Interesting challenge. I have three very similar solutions. They all rest on the same principle: reduce the array into a key-value structure, then re-create the array from the values only.

Approach 1: Classical Reducer

(the reducer keeps the accumulator immutable)

/**
 * classic reducer
 **/
const uniqByProp = prop => arr =>
  Object.values(
    arr.reduce(
      (acc, item) =>
        item && item[prop]
          ? { ...acc, [item[prop]]: item } // just include items with the prop
          : acc,
      {}
    )
  );

// usage:

const uniqueById = uniqByProp("id");

const unifiedArray = uniqueById(arrayWithDuplicates);
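To make the behavior concrete, here is a small run on made-up sample data (the contents of `arrayWithDuplicates` below are my assumption, purely for illustration). It is only a sketch, but it shows that the last item with a given id wins and that items without the prop are skipped.

```javascript
// Approach 1, repeated so this snippet runs on its own.
const uniqByProp = prop => arr =>
  Object.values(
    arr.reduce(
      (acc, item) =>
        item && item[prop] ? { ...acc, [item[prop]]: item } : acc,
      {}
    )
  );

// Hypothetical sample data, for illustration only.
const arrayWithDuplicates = [
  { id: 3, name: "c" },
  { id: 1, name: "a" },
  { id: 3, name: "c2" }, // duplicate id: replaces the earlier { id: 3 }
  null,                  // skipped by the `item &&` guard
  { name: "no id" },     // skipped: no `id` prop
];

const unifiedArray = uniqByProp("id")(arrayWithDuplicates);
console.log(unifiedArray);
// [ { id: 1, name: 'a' }, { id: 3, name: 'c2' } ]
```

Two things to be aware of: the guard is a truthiness check, so an item whose id is 0 or "" is skipped as well; and because the keys here are integer-like, the object returns them in ascending numeric order rather than in input order.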

Depending on your array size, this approach can easily become a bottleneck in your app, because the spread copies the entire accumulator on every iteration. Mutating the accumulator object directly in the reducer is faster.

Approach 2: Reducer with object-mutation

/**
 * using object mutation
 **/
const uniqByProp = prop => arr =>
  Object.values(
    arr.reduce(
      (acc, item) => (
        item && item[prop] && (acc[item[prop]] = item), acc
      ), // using object mutation (faster)
      {}
    )
  );

// usage (same as above):

const uniqueById = uniqByProp("id");

const unifiedArray = uniqueById(arrayWithDuplicates);

The larger your input array, the more you gain from the second approach. In my benchmark (an input array of length 500, with a duplicate-element probability of 0.5), the second approach is ~440x as fast as the first.
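If you want to check the gap on your own machine, a rough timing harness could look like the sketch below. The helper names (`uniqImmutable`, `uniqMutating`, `makeInput`, `time`) and the way the input is generated are my assumptions, not the original benchmark setup, and the numbers will vary by engine and hardware.

```javascript
// Approach 1: copies the accumulator on every iteration.
const uniqImmutable = prop => arr =>
  Object.values(
    arr.reduce(
      (acc, item) =>
        item && item[prop] ? { ...acc, [item[prop]]: item } : acc,
      {}
    )
  );

// Approach 2: mutates one accumulator object.
const uniqMutating = prop => arr =>
  Object.values(
    arr.reduce(
      (acc, item) => (item && item[prop] && (acc[item[prop]] = item), acc),
      {}
    )
  );

// Hypothetical input generator: ids are drawn from 1..length/2 so that
// duplicates are common. This is not the distribution of the original benchmark.
const makeInput = length =>
  Array.from({ length }, (_, index) => ({
    id: 1 + Math.floor(Math.random() * (length / 2)),
    index,
  }));

// Calls fn(input) `runs` times and returns the elapsed milliseconds.
const time = (fn, input, runs = 200) => {
  const start = Date.now();
  for (let i = 0; i < runs; i++) fn(input);
  return Date.now() - start;
};

const input = makeInput(500);
const msImmutable = time(uniqImmutable("id"), input);
const msMutating = time(uniqMutating("id"), input);
console.log({ msImmutable, msMutating });
```

Treat the output as a rough indication only; a dedicated benchmarking tool will give more reliable figures than a single timed loop.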

Approach 3: Using ES6 Map

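My favorite approach uses a Map instead of an object to accumulate the elements. This has the advantage of preserving the ordering of the original array, whereas a plain object enumerates integer-like keys in ascending numeric order regardless of insertion order: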

/**
 * using ES6 Map
 **/
const uniqByProp_map = prop => arr =>
  Array.from(
    arr
      .reduce(
        (acc, item) => (
          item && item[prop] && acc.set(item[prop], item),
          acc
        ), // using map (preserves ordering)
        new Map()
      )
      .values()
  );

// usage (still the same):

const uniqueById = uniqByProp_map("id");

const unifiedArray = uniqueById(arrayWithDuplicates);
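The ordering difference is easiest to see side by side. The sketch below runs the object-based and the Map-based version on the same made-up data (`sample` and `uniqByProp_object` are names I introduce here for illustration).

```javascript
// Object-based version (approach 2), renamed so both can coexist.
const uniqByProp_object = prop => arr =>
  Object.values(
    arr.reduce(
      (acc, item) => (item && item[prop] && (acc[item[prop]] = item), acc),
      {}
    )
  );

// Map-based version (approach 3).
const uniqByProp_map = prop => arr =>
  Array.from(
    arr
      .reduce(
        (acc, item) => (item && item[prop] && acc.set(item[prop], item), acc),
        new Map()
      )
      .values()
  );

// Hypothetical sample data, for illustration only.
const sample = [
  { id: 3, name: "c" },
  { id: 1, name: "a" },
  { id: 3, name: "c2" },
];

const viaObject = uniqByProp_object("id")(sample).map(x => x.name);
const viaMap = uniqByProp_map("id")(sample).map(x => x.name);

console.log(viaObject); // [ 'a', 'c2' ]  (integer-like keys sorted ascending)
console.log(viaMap);    // [ 'c2', 'a' ]  (order of first occurrence)
```

Note that with the Map, a key keeps the position of its first occurrence while the stored value is the one from its last occurrence.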

Using the same benchmark conditions as above, this approach is ~2x as fast as the second approach and ~900x as fast as the first.

Conclusion

Even though all three approaches look quite similar, they have surprisingly different performance footprints.

You'll find the benchmarks I used here: jsperf.com/uniq-by-prop