loading...

All the ways to collect garbage

ishanigupta27 profile image Ishani Gupta Updated on ・2 min read

As promised in my previous blog, I am writing about some of the techniques used by different programming languages to automatically collect the dead/no-more-used objects during runtime.

There are broadly three schemes/techniques which is the core of garbage collection. I will try writing about these three in the as straightforward way as possible. Also, I will go one step ahead and discuss the fourth technique, developed by the combination of the three schemes. Let's start!

1. Reference Counting

The idea behind this type of garbage collection is to keep track of usage of an object at every point of the program. This idea means as soon as an object becomes garbage (To know when an object becomes eligible for waste, read geeksforgeeks), the space occupied by the object is rehabilitated by putting the free space address in a free list. Then, new objects are allocated by matching the required space with the available spaces on the free list. This concept appears very attractive because it makes sense to do garbage collection as soon as garbage is there. Does it? Take a step back from the virtual world, and enter the real world. Do you run to the dumpster at the end of your lane every time you have to throw away the gum packet? You don't, right? You collect the garbage in your dustbin, and when the bin is full, you go to the dumpster. In computer science, this is known as a lazy approach, i.e., to take action only when needed. This leads us to the second technique available for garbage collection.

Run to throw away your garbage

2. Mark and Sweep.

Let's enter the real world again. Whenever you dustbin is full, you stop all your essential works ( Well, that could be watching Beyonce video too) and dump the dustbin to the dumpster at the end of your lane. This concept that process stops doing anything else and throws away the garbage is known as "Stop-the-World." Mark and Sweep is one of the algorithms which builds upon Stop the World. Whenever the free memory for allocation is finished, the garbage collector kicks in and starts working. This being clear, let's see how the algorithm works once the process is stopped from working. It works by having two phases. Firstly, it goes to all the registers, global variables and local variables {This is known as root set} and goes to all the objects reachable from the root set. It marks these reachable objects ( the live objects) and exits. Then there is the second phase of sweeping the whole heap. During this sweeping, everything not marked is the garbage, and their corresponding addresses are inscribed in the free list for future object allocations.

Writing this blog, I feel I have given a lot of information in very few words with few dangling pieces of information. So, I will stop here for this blog and continue about the other two types of garbage collectors in my next blog.

Till then, which languages you use and do you have any idea which type of garbage collectors they use.

Happy coding.

Discussion

pic
Editor guide
Collapse
610yesnolovely profile image
Harvey Thompson

Great easy to read summary of these types of Garbage Collection, look forward to the sequel.

Some languages I've used:

  • C++ - programmer is responsible. With great power comes great responsibility.
  • Java - fancy mark sweep (parallel, generational, compacting?)
  • C# - fancy mark sweep (parallel, generational, compacting)
  • Python - reference counting (mark sweep for reference cycles).
  • Lisp - mark sweep typically (eg. Emacs Lisp)

I'm also a programming language design/implementation enthusiast and have written several (non-published) languages. Typically I prefer stop-the-world mark/sweep because it's the least intrusive and simplest to implement for compile languages. For interpreted languages, reference counting can be easier to start with. Either way you can refine these later when it becomes an issue (it will do if you want to get serious).

A very interesting language to look at is Rust which puts the responsibility of memory management in the hands of the programmer, but really enforces this through life time annotations. Initially mind blowing, but it's essentially the power of something like C++ without so much danger (though at the cost of hurting your brain and hands more).

If anyone wishes to read more (or too much perhaps), I can recommend this book which I have read:

Garbage Collection Algorithms - 1996

Or it's newer sequel, which I have not read (yet):

Garbage Collection Handbook - 2011

See also Wikipedia - Garbage Collection which is a good jumping off point to find out more information.

For most programmers I recommend at least having a high level understanding of the garbage collection used by each language, and be aware of the trade-offs.

Collapse
ishanigupta27 profile image
Ishani Gupta Author

Telepathy!

I am working on Rust nowadays to analyze its performance quantitatively and whether the feature of "Ownership" is worth sacrificing performance!

Thanks for these awesome links !!

Collapse
jcsvveiga profile image
João Veiga

You might be interested in that, check Pony's capabilities tutorial.ponylang.org/capabilities...