DEV Community

Adam Nathaniel Davis

Learn To Clone Like A Sith Lord

[NOTE: The cloning utilities discussed in this article are now in their own NPM package. You can find them here: https://www.npmjs.com/package/@toolz/clone]

I'm going to highlight the strengths-and-weaknesses of "native" methods for cloning objects/arrays. Then I'm going to show how to create a custom, recursive approach that will faithfully clone ALL THE THINGS.

In most programming languages, objects (and their nephews, arrays) are passed by reference. This is an incredibly useful (and powerful) concept that can be leveraged to do all sorts of impressive things. But one instance where it can feel like a hindrance is when we need a full, fresh, clean, standalone copy of an object/array. In other words, there are times when you want a full-fledged clone of an object/array. But this process is not exactly straightforward.

Tricky References

The simplest version of an object might look something like this:

const phantomMenace = { master: 'palpatine', apprentice: 'maul' };

One of the first gotchas that new devs run into is when they try to "copy" the object, like this:

const phantomMenace = { master: 'palpatine', apprentice: 'maul' };
const attackOfTheClones = phantomMenace;
attackOfTheClones.apprentice = 'dooku';
console.log(phantomMenace.apprentice);  // dooku(!)

Code like this is a common source of confusion. Just by giving it a quick read-through, it's easy to come to the (mistaken) conclusion that phantomMenace and attackOfTheClones are each independent entities. Continuing with this (flawed) logic, it's tempting to think that console.log(phantomMenace.apprentice); will output 'maul', because the value was set to 'maul' in the phantomMenace object, and it was only set to 'dooku' on the attackOfTheClones object, and not on the phantomMenace object.

Of course, the reality's quite different. attackOfTheClones is not a standalone entity. Instead, it's nothing but a pointer referring back to the original phantomMenace object. So when we update the contents of attackOfTheClones, the change is also reflected in phantomMenace.
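One quick way to confirm that both names point at the same object is a strict-equality check, which compares object identity rather than contents. This is only a minimal sketch; the lookalike variable is a hypothetical name added for comparison and is not part of the original example.

```javascript
const phantomMenace = { master: 'palpatine', apprentice: 'maul' };
const attackOfTheClones = phantomMenace; // copies the reference, not the object

// A separately-built object with identical contents (hypothetical, for comparison)
const lookalike = { master: 'palpatine', apprentice: 'maul' };

console.log(attackOfTheClones === phantomMenace); // true - same object in memory
console.log(lookalike === phantomMenace);         // false - same contents, different object
```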

For this reason, it can sometimes be desirable to have a true, clean, standalone copy of an object/array. An entity that has all the same information as its source - but will act independently after we've copied it. In other words, sometimes we need a full clone of an object/array.

Spread Operators

One very fast, very easy way to clone objects is with the new(ish) spread operator. That would look like this:

const phantomMenace = { master: 'palpatine', apprentice: 'maul' };
const attackOfTheClones = {...phantomMenace};
attackOfTheClones.apprentice = 'dooku';
console.log(phantomMenace.apprentice);  // maul

This is so simple that it's tempting to throw out all of your "old" object-cloning tools in favor of spread operators. Unfortunately, this is only "simple" when the object you're cloning is simple. Consider this slightly-more-complex example:

const phantomMenace = { 
  master: 'palpatine', 
  apprentice: 'maul',
  henchmen: {
    one: 'nute gunray',
    two: 'rune haako',
  },
};
const attackOfTheClones = {...phantomMenace};
attackOfTheClones.henchmen.one = 'jar jar binks';
console.log(phantomMenace.henchmen.one);  // jar jar binks(!)

We're back to the original problem. We "cloned" phantomMenace. Then we made a change to attackOfTheClones. And then the change was reflected in the original phantomMenace object. Why did this happen?

The problem occurs because all objects are passed by reference, not just the parent object. In the example above, there are two objects - one nested inside the other.

Using the spread operator, a brand new object was created as attackOfTheClones. However, when the spread operator was doing its magic, it encountered another object when it reached the henchmen key. So it copied that object over by reference. This brings us right back to square one.
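The shared nested reference can be made visible with the same kind of identity check. A small sketch, reusing the object from the example above:

```javascript
const phantomMenace = {
  master: 'palpatine',
  apprentice: 'maul',
  henchmen: { one: 'nute gunray', two: 'rune haako' },
};
const attackOfTheClones = {...phantomMenace};

// The outer object is brand new...
console.log(attackOfTheClones === phantomMenace);                   // false
// ...but the nested object is the very same one in both
console.log(attackOfTheClones.henchmen === phantomMenace.henchmen); // true
```

In other words, the spread operator only copies one level deep.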

Theoretically, you can address this problem by doing this:

const phantomMenace = { 
  master: 'palpatine', 
  apprentice: 'maul',
  henchmen: {
    one: 'nute gunray',
    two: 'rune haako',
  },
};
const attackOfTheClones = {
  ...phantomMenace,
  henchmen: {...phantomMenace.henchmen},
};
attackOfTheClones.henchmen.one = 'jar jar binks';
console.log(phantomMenace.henchmen.one);  // nute gunray

But this solution is far from scalable. We can't use attackOfTheClones = {...phantomMenace} with universal confidence that it will "just work". We have to manually reconfigure our use of the spread operator every time we're dealing with a multilevel object. Yech... And if our object has many nested layers, we need to recreate all those layers with many nested spread operators. Many nested Yechs...

JSON.parse(JSON.stringify())

This is the solution that I've used for all of my "lightweight" object/array cloning. It uses JSON serialization/de-serialization to break the "connection" between a copied object and its source object. JSON.stringify() converts it into a plain-ol' string - with no knowledge of the originating object. (Because strings are passed by value, not by reference.) JSON.parse() converts it back into a full-fledged JavaScript object, that still bears no connection to the originating object.

This approach looks like this:

const phantomMenace = { 
  master: 'palpatine', 
  apprentice: 'maul',
  henchmen: {
    one: 'nute gunray',
    two: 'rune haako',
  },
};
const attackOfTheClones = JSON.parse(JSON.stringify(phantomMenace));
attackOfTheClones.henchmen.one = 'jar jar binks';
console.log(phantomMenace.henchmen.one);  // nute gunray

It has some strong features in its favor:

  • It maintains scalar data types. So if a value was a Boolean, or a number, or null before it was copied, the cloned version will have those same data types.

  • It's perfectly fine if the source object contains other objects (or arrays).

  • It's inherently recursive. So if your source object has 100 nested layers of objects, those will be fully represented in the cloned object.
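The points above can be checked with a short round trip. This is just an illustrative sketch; the source object and its keys are made up for the demonstration.

```javascript
// Hypothetical sample data, for illustration only
const source = {
  isSith: true,
  midichlorians: 20000,
  redeemed: null,
  apprentices: ['maul', 'dooku', { name: 'vader' }],
};
const copy = JSON.parse(JSON.stringify(source));

console.log(typeof JSON.stringify(source));                 // string
console.log(typeof copy.isSith);                            // boolean
console.log(typeof copy.midichlorians);                     // number
console.log(copy.redeemed);                                 // null
console.log(Array.isArray(copy.apprentices));               // true
console.log(copy.apprentices[2] === source.apprentices[2]); // false - nested object was rebuilt
```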

So is this the ultimate answer?? Umm... not really. I leverage this technique on a fairly regular basis, but it fails entirely when you have more "complex" items in your object.

Consider this example:

const phantomMenace = { 
  master: 'palpatine', 
  apprentice: 'maul',
  henchmen: {
    one: 'nute gunray',
    two: 'rune haako',
    fearLeadsTo: () => console.log('the dark side'),
  },
};
const attackOfTheClones = JSON.parse(JSON.stringify(phantomMenace));
console.log(attackOfTheClones.henchmen.fearLeadsTo()); 

Oops.

The console tells us Uncaught TypeError: attackOfTheClones.henchmen.fearLeadsTo is not a function. This happens because functions don't survive the serialization process. This is a pretty big gotcha because most modern JavaScript frameworks - like React - are heavily based upon the idea that our objects can contain functions.
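Functions aren't the only casualties. The same round trip also drops undefined values, turns Date instances into strings, and turns NaN into null. A small sketch, with made-up keys, that shows each case:

```javascript
// Hypothetical sample data, for illustration only
const source = {
  fearLeadsTo: () => 'anger',  // a function
  master: undefined,           // an undefined value
  orderIssued: new Date(0),    // a Date instance
  powerLevel: NaN,             // not-a-number
};
const copy = JSON.parse(JSON.stringify(source));

console.log('fearLeadsTo' in copy);   // false - functions are dropped
console.log('master' in copy);        // false - undefined values are dropped
console.log(typeof copy.orderIssued); // string - the Date came back as an ISO string
console.log(copy.powerLevel);         // null - NaN became null
```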

There's another nasty problem with this approach that presents itself in React. It comes up when you try to do this:

export default function StarWars() {
  const phantomMenace = { key: <Prequel1/>};
  const attackOfTheClones = JSON.parse(JSON.stringify(phantomMenace));
  return <div>A long time ago, in a galaxy far far away...</div>;
}

This example fails at runtime. It throws an error that reads TypeError: Converting circular structure to JSON. Explaining exactly why that happens would require an entirely new post. Suffice it to say that you can't serialize React components. And in a large enough app, it's not uncommon to find that you occasionally have objects that contain React components.
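The same class of error can be reproduced without React. Any structure that eventually refers back to itself can't be serialized. The sketch below uses two plain objects that point at each other; whether React elements hit this error for exactly the same reason is an assumption on my part about React's internals, not something this example proves.

```javascript
const master = { name: 'palpatine' };
const apprentice = { name: 'vader', master };
master.apprentice = apprentice; // master -> apprentice -> master (a cycle)

let errorName = null;
try {
  JSON.stringify(master);
} catch (error) {
  errorName = error.name;
}
console.log(errorName); // TypeError
```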

Third-Party Cloning Tools

Obviously, I'm not the first person to ponder these challenges. And there are a number of NPM utilities that will allow you to get a deep clone of an object or an array. I don't have any "problem" with such utilities. I'm not going to review them all here. You can have fun googling all those solutions on your own. Some of them are quite good.

But one of my pet peeves is when we import all sorts of outside packages/libraries to do something in JavaScript that we could easily do on our own with plain ol' programming. The reason why most people don't code this up on their own is because, to do it properly, you need to use recursion. And recursion feels to many devs like... the dark side.

Cloning the Sith Way

If we want to "clone like a Sith lord", there's no way that I know to accomplish it without going to the dark side. In other words, we must utilize recursion. Since every object/array can contain a theoretically-endless number of nested objects/arrays, we can't get by with a simple for/while loop. We need something that has the ability to call itself. This isn't "hard". But it steps outside of some devs' comfort zones.

First, let's create a decent test object that will ensure our cloning utilities will truly rise to the task. I'll be using this:

const original = {
  one: '1',
  two: '2',
  nest1: {
    four: '4',
    five: '5',
    header: <SiteHeader/>,
    nest2: {
      seven: '7',
      eight: '8',
      function1: () => console.log('the function'),
    },
    nest3: [
      {
        john: 'doe',
        mary: 'mack',
      },
      {
        butcher: 'brown',
        karen: 'conroy',
      },
      <AnotherComponent/>,
    ],
  },
};

This is a fairly robust object. We have objects inside objects. We have an array inside a (nested) object. We have a function inside one of the nested objects. We have a React component inside one of the nested objects. We have another React component inside the nested array.

First, I want a convenient way to test whether something is an object or an array. To do that, I'm going to use my is() utility. I wrote about that here:
https://dev.to/bytebodger/javascript-type-checking-without-typescript-21aa

Second, the logic for recursively cloning an object is slightly different than the logic for recursively cloning an array. So I'm going to create two separate, but interdependent, functions.

The code looks like this:

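import React from 'react'; // needed for React.isValidElement()
// `is` is the type-checking utility from the article linked above

const cloneArray = (originalArray = []) => {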
  const suppressError = true;
  if (!is.anArray(originalArray))
    return;
  return originalArray.map(element => {
    if (React.isValidElement(element))
      return element; // valid React elements are pushed to the new array as-is
    if (is.anObject(element, suppressError))
      return cloneObject(element); // push the CLONED object to the new array
    if (is.anArray(element, suppressError))
      return cloneArray(element);  // push the CLONED array to the new array
    return element;  // if it's neither an array nor an object, just push it to the new array
  });
};

const cloneObject = (originalObject = {}) => {
  const suppressError = true;
  if (!is.anObject(originalObject))
    return;
  const clonedObject = {};
  Object.keys(originalObject).forEach(key => {
    const currentValue = originalObject[key];
    if (React.isValidElement(currentValue))
      clonedObject[key] = currentValue; // valid React elements are added to the new object as-is
    else if (is.anObject(currentValue, suppressError))
      clonedObject[key] = cloneObject(currentValue);  // set this key to the CLONED object
    else if (is.anArray(currentValue, suppressError))
      clonedObject[key] = cloneArray(currentValue);  // set this key to the CLONED array
    else
      clonedObject[key] = currentValue;  // if it's neither an object nor an array, just set this key to the value
  });
  return clonedObject;
};

Notice that when we're drilling through an object/array, and we find another object/array, we need to (again) call cloneObject() or cloneArray(). This ensures that we keep calling cloneObject() or cloneArray() until we finally reach an object/array that has no child objects/arrays. In other words, we have to do this recursively.
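If you want to watch the recursion work without React or the is() utility in scope, here is a stripped-down sketch of the same two-function idea. The names isArray, isObject, sketchCloneArray, and sketchCloneObject are hypothetical stand-ins, not the functions shown above. It skips the React element check, and it would flatten class instances such as Date into plain objects, so treat it as a demonstration rather than a replacement.

```javascript
// Hypothetical stand-ins for the is() checks
const isArray = value => Array.isArray(value);
const isObject = value => value !== null && typeof value === 'object' && !isArray(value);

const sketchCloneArray = (originalArray = []) =>
  originalArray.map(element => {
    if (isObject(element)) return sketchCloneObject(element); // recurse into nested objects
    if (isArray(element)) return sketchCloneArray(element);   // recurse into nested arrays
    return element;                                           // scalars and functions pass through
  });

const sketchCloneObject = (originalObject = {}) => {
  const clonedObject = {};
  Object.keys(originalObject).forEach(key => {
    const value = originalObject[key];
    if (isObject(value)) clonedObject[key] = sketchCloneObject(value);
    else if (isArray(value)) clonedObject[key] = sketchCloneArray(value);
    else clonedObject[key] = value;
  });
  return clonedObject;
};

// Small sample with an object, an array, and a function nested inside
const sample = { nest1: { list: [{ john: 'doe' }, 2], fn: () => 'the function' } };
const sampleClone = sketchCloneObject(sample);
sample.nest1.list[0].john = 'changed';

console.log(sampleClone.nest1.list[0].john); // doe - the clone kept the old value
console.log(sampleClone.nest1.fn());         // the function
```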

So let's put this to the test:

const original = {
  one: '1',
  two: '2',
  nest1: {
    four: '4',
    five: '5',
    header: <SiteHeader/>,
    nest2: {
      seven: '7',
      eight: '8',
      function1: () => console.log('the function'),
    },
    nest3: [
      {
        john: 'doe',
        mary: 'mack',
      },
      {
        butcher: 'brown',
        karen: 'conroy',
      },
      <AnotherComponent/>,
    ],
  },
};
const clone = cloneObject(original);
original.nest1.nest2.eight = 'foo';
console.log(clone);
clone.nest1.nest2.function1();

This passes the test. Merely by calling cloneObject(), we created a true, deeply-nested clone of the original object.

The cloning process throws no errors. The function sitting at clone.nest1.nest2.function1 has survived the cloning process and can be called directly as part of clone. The React components that were in original are now transferred over to clone and can be used in any standard way you would expect to use a React component. Even though we made a subsequent change to original.nest1.nest2.eight, that change is not reflected in clone.

In other words: clone is a true, deep clone of original, reflecting the exact state of original at the time we created the clone (but not reflecting any future changes that were made to original).

Also, by leveraging two inter-dependent functions, there's no need to start the cloning process with an object. If you need to clone an array, you can call cloneArray(), and that should work the same way, even if the array has many, complex, nested layers - and even if some of those layers consist of objects.

Top comments (17)

Pragmatic Maciej

You should almost never do a deep clone. In most cases, what you want is to change one, maybe a few, elements of the structure. A deep clone is the simplest but also the wrong answer for the problem.

Adam Nathaniel Davis

Interesting point. I can certainly agree that a deep clone is not needed or appropriate 100% of the time. But I can't give a blanket agreement to the idea that, whenever you only want to change one or a few elements of the structure, a deep clone is "the simplest but also the wrong answer".

It depends, at least partially, on individual coding style. Because I often find myself in scenarios where I need a fresh clone of an object. I've seen some devs' code where they're rarely storing data in objects. Rarely passing them around. Rarely updating their contents. For devs like that, cloning an object is probably an esoteric concept. But that wouldn't make it "wrong".

There are more decision factors that could be involved here. But for me, it usually comes down to a few basic questions:

Will we need to reference the original data structure, in its original form, at some future point in the program? Will we need another copy of that data structure which will, at some point in the future, be altered from its original form? If both of these conditions are TRUE, there's a decent chance that we might need a true, deep clone of the original data structure.

Pragmatic Maciej

First of all, I am not addressing here mutation vs immutability thing (it's a huge subject on its own), I am just against unneeded deep copies.

If you have 1000 elements in array let's say and you update one element, do you clone the whole array (what mean whole is copy of every item)? If you have, let's say, a user object with comments as a nested array inside, do you clone the user and all the other comments when you want to add one more comment?

From my side, both answers are no. And there are many reasons. The first is memory efficiency, and many will say - haha, on the FE side it's not important. It's a fair argument, but if we go into such a deep clone every time, we can certainly get a headache on some low-end device, and maybe someday some of this code will land in node.js, and there you will for sure have a much bigger memory issue.

The second is the equality check. As I see you are a React dev, you are aware of such things as useMemo or PureComponent. With any such deep clone, any component which should not be rendered will be rendered, as its equality check will get a different reference, since we deep cloned the whole tree of things. So a component which, let's say, was rendering the data of the second comment, and is a pure one, will be re-rendered even though there is no change in this comment but some other comment was added. Finally we can end by quite a lot of rendering issues, and believe me I have been there and have seen a lot of such bugs related to unwanted reference change.

Thirdly and most importantly fact that we need deep clone means that probably our data structure is wrong, or we pass too much to consumers which should not be aware of the whole.

Let's say I have a function addComment. Should such a function get the user data with its comments? No, such a function should get as arguments the array of comments and the new comment data. When we have that, we are left with a simple flat copy [...comments, newComment]. Such a function should not be responsible for dealing with the user and copying its data. The same is true for a React component: you probably don't pass the whole tree to every component, and every component doesn't dig for its part. The same should be true for changes. A good example of this is the decomposition of updates into many reducers in Redux (like it or not, it's just a good example of updating big objects).

What I want to say is - when we decompose the data into smaller pieces there is no need for deep copying. The consumer is responsible for it's part is aware of only small piece, and copy this piece in a most flat way.
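As a rough illustration of the flat-copy idea described above, an addComment function could look something like this. The user shape and the comment fields are hypothetical and only there to make the sketch runnable.

```javascript
// Hypothetical data shape, for illustration only
const user = {
  name: 'Joe',
  comments: [{ id: 1, text: 'first' }, { id: 2, text: 'second' }],
};

// Receives only the comments array and returns a flat copy with one more item
const addComment = (comments, newComment) => [...comments, newComment];

const updatedUser = {
  ...user,
  comments: addComment(user.comments, { id: 3, text: 'third' }),
};

console.log(updatedUser.comments.length);                  // 3
console.log(user.comments.length);                         // 2 - the original array is untouched
console.log(updatedUser.comments[1] === user.comments[1]); // true - untouched items keep their reference
```

Because the existing comment objects keep their references, an equality check on any one of them still passes after the update.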

Adam Nathaniel Davis • Edited

"If you have 1000 elements in array let's say and you update one element, do you clone the whole array (what mean whole is copy of every item)?"

This is a straw-man argument that I never made. The original article never said, "You should perform a deep clone on all of your objects - regardless of the size of those objects or the scope of changes that you plan to make." Cloning objects is a tool. Sometimes a hammer is the perfect tool for the job. Sometimes it's ridiculous to use a hammer for a particular job. Showing someone how to clone !== telling someone to always clone.

"Finally we can end by quite a lot of rendering issues, and believe me I have been there and have seen a lot of such bugs related to unwanted reference change."

But... this is exactly the point of cloning (where appropriate). Sometimes we need to ensure that we've broken the original references. I don't always clone an object. But when I want to avoid unwanted reference changes, I absolutely reach for cloning.

"Thirdly and most importantly fact that we need deep clone means that probably our data structure is wrong, or we pass too much to consumers which should not be aware of the whole."

I get that. You don't want to pass Google to the consumer if the consumer has no need for All Of Google. No argument there. But there are absolutely times when the consumer needs All The Data from an object - but I don't want changes to the consumer's object to be reflected back in the source object.

"What I want to say is - when we decompose the data into smaller pieces there is no need for deep copying. The consumer is responsible for it's part is aware of only small piece, and copy this piece in a most flat way."

This isn't "wrong" - but it makes a lot of assumptions about the code.

For one, it makes assumptions about who is the consumer, and who is the provider. For example, I often have to deal with APIs - as a consumer. I can't dictate the shape in which data is provided to me. In this scenario, I am the consumer. For example, an API response that I consume might look something like this:

{
  data: {
    user: {
      title: 'QA',
      name: 'Joe',
      id: '92354',
    },
    supervisor: {
      title: 'Tech Lead',
      name: 'Mary',
      id: '92042',      
    },
    creator: {
      title: 'Developer',
      name: 'Susan',
      id: '82991',      
    },
  }
}

Furthermore, when I'm dealing with API results, it's often helpful (or even necessary) that I maintain a record of the original result that was returned to me. That's why I wouldn't simply pass apiResult.data to downstream consumers. Because if the downstream consumer changes that data directly on the object that was passed to them, I've now lost the record of the API response's "original state".

In that scenario, I could assign every dang value from the API into their own standalone variables, and then pass every dang standalone variable down the line to the next consumer. But that's often a verbose, and unnecessary waste of code.

So, I could pass that down to the next consumer as:

const response = {
  userTitle: apiResult.data.user.title,
  userName: apiResult.data.user.name,
  userId: apiResult.data.user.id,
  supervisorTitle: apiResult.data.supervisor.title,
  supervisorName: apiResult.data.supervisor.name,
  supervisorId: apiResult.data.supervisor.id,
  creatorTitle: apiResult.data.creator.title,
  creatorName: apiResult.data.creator.name,
  creatorId: apiResult.data.creator.id,
};
processResponse(response);

Or... I could just do this:

processResponse(cloneObject(apiResult.data));

And be done with it. And I'll be honest with you. I don't, in any way, feel "bad" about the second approach.

Adam Nathaniel Davis

BTW, if any of my responses sound "combative", they're absolutely not meant to be. You make many valid points. I just felt compelled to answer in the way that I did because, in my personal experience, it's almost never accurate to look at any coding technique and denounce it as "wrong". Are there times when it's a bad idea to clone an object?? Sure. Of course. But just because there are instances where cloning is a bad idea doesn't mean that all uses of cloning are a "code smell".

sanderdebr

Thanks for this informative post, learned a lot. A lot of times I see React devs copying their state with the spread operator and then modifying a property on their new copy. Did they also change the original state while thinking they had made a new copy?

Adam Nathaniel Davis • Edited

This is a good point! And, no, the answer is that they are not updating the original state. This happens for several reasons.

First, state is a "special" kinda protected instance in React. Once it's created, you can only update it via this.setState() (in class-based components), or via the custom setter that was created in the useState() Hook (in function-based components). [Little trivia fact here: Even if you're using useState() in a function-based component, it's still using setState() under the covers.]

Second, you have to think about the type of data, and the structure of data that is usually saved in state. Specifically, what I'm talking about is the fact that the state object is usually holding only scalar values. This also means that state data is often only one "layer" deep. This means that you can safely clone it with just a spread operator.

Look at the first example given above under "Spread Operators". This works fine:

let phantomMenace = { master: 'palpatine', apprentice: 'maul' };
const attackOfTheClones = {...phantomMenace};

attackOfTheClones is a true clone of phantomMenace - because phantomMenace contained only a single layer of simple, scalar values. But this does not work fine:

let phantomMenace = { 
  master: 'palpatine', 
  apprentice: 'maul',
  henchmen: {
    one: 'nute gunray',
    two: 'rune haako',
  },
};
const attackOfTheClones = {...phantomMenace};

Now, attackOfTheClones is no longer a true clone, because part of its data - the henchmen object - is a pointer referring back to the original object. In this case, a simple spread operator failed to achieve the objective.

Now think about what you typically see in state values:

export default class Foo extends React.Component {
  state = {
    isLoggedIn: false,
    username: '',
    showLeftNav: true,
  };

  render = () => {
    return (<div>...all the display in here...</div>);
  }
}

In this scenario, if we were to do this: const newObject = {...this.state}; would it result in a perfect clone that has no ties back to the original object? Yes. It would.

Here's another way to think about it. This always creates a fresh, new object: const newObject = {...oldObject}; However, sometimes the oldObject has additional data structures nested inside it. In that case, the additional data structures won't be cloned with a simple spread operator. The newObject will just get a pointer to the additional data structures that are nested inside of oldObject.

So when you are dealing with a "typical" state object that looks like this:

state = {
  isLoggedIn: false,
  username: '',
  showLeftNav: true,
};

The spread operator will always give you a nice, clean clone because there are no nested data structures.

sanderdebr • Edited

A compressed version, which also works with methods and complex objects:

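const deepClone = obj => {
  // Arrays need an array target; Object.assign({}, arr) would turn them into plain objects
  const clone = Array.isArray(obj) ? [...obj] : Object.assign({}, obj);
  Object.keys(clone).forEach(key => {
    const value = obj[key];
    // typeof null is 'object', so guard against recursing into null
    clone[key] = value !== null && typeof value === 'object' ? deepClone(value) : value;
  });
  return clone;
};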
Greg Fletcher

Nice! I'll join the DarkSide for this.
