In computer programming, tracing garbage collection is a form of automatic memory management that consists of determining which objects should be deallocated ("garbage collected") by tracing which objects are reachable by a chain of references from certain "root" objects, and considering the rest as "garbage" and collecting them. Tracing is the most common type of garbage collection – so much so that "garbage collection" often refers to the tracing method, rather than others such as reference counting – and there are a large number of algorithms used in implementation.
Tracing collectors are so called because they trace through the working set of memory. These garbage collectors perform collection in cycles. It is common for cycles to be triggered when there is not enough free memory for the memory manager to satisfy an allocation request. But cycles can often be requested by the mutator directly or run on a time schedule. The original method involves a naïve mark-and-sweep in which the entire memory set is touched several times.
In the naive mark-and-sweep method, each object in memory has a flag (typically a single bit) reserved for garbage collection use only. This flag is always cleared, except during the collection cycle.
The first stage is the mark stage which does a tree traversal of the entire 'root set' and marks each object that is pointed to by a root as being 'in-use'. All objects that those objects point to, and so on, are marked as well, so that every object that is reachable via the root set is marked.
In the second stage, the sweep stage, all memory is scanned from start to finish, examining all free or used blocks; those not marked as being 'in-use' are not reachable by any roots, and their memory is freed. For objects which were marked in-use, the in-use flag is cleared, preparing for the next cycle.
This method has several disadvantages, the most notable being that the entire system must be suspended during collection; no mutation of the working set can be allowed. This can cause programs to 'freeze' periodically (and generally unpredictably), making some real-time and time-critical applications impossible. In addition, the entire working memory must be examined, much of it twice, potentially causing problems in paged memory systems.
Top comments (0)