Garbage collection

When the heap is full, a garbage collector runs to free memory, after which the program continues.

The Clean run time system has two main garbage collectors: a copying collector and a mark-scan collector. By default, the copying collector is used. The mark-scan collector can be turned on by compiling with clm -gcm, by setting Application.MarkingCollector in a cpm project file, or by running a program with -gcm.

Additionally, either collector switches automatically to a compacting collector when this is determined to be more efficient. This switch can be neither disabled nor forced.

Running a program with -gc (or compiling with clm -gc, or setting Application.ShowGC in a cpm project file) makes it show the size of the heap after each garbage collection run. Each collector prints this information in a different format, so the message tells you which one has run:

Collector type   String with -gc
Copying          Heap use after garbage collection: XXX Bytes.
Mark-scan        Marked: XXX Bytes.
Compacting       Heap use after compacting garbage collection: XXX Bytes.
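
The strings above are distinct enough to be matched mechanically, for example when post-processing a program's -gc output. The following Python sketch is only an illustration: the function name classify_gc_line is hypothetical, and it assumes that XXX is printed as a plain run of digits, which this page does not specify.

```python
import re

# Message formats taken from the table above. Assumption: the byte count
# that replaces XXX is a plain run of decimal digits.
_PATTERNS = {
    "copying": re.compile(r"^Heap use after garbage collection: (\d+) Bytes\.$"),
    "mark-scan": re.compile(r"^Marked: (\d+) Bytes\.$"),
    "compacting": re.compile(
        r"^Heap use after compacting garbage collection: (\d+) Bytes\.$"
    ),
}


def classify_gc_line(line):
    """Return (collector, byte_count) for a -gc report line, else None."""
    text = line.strip()
    for collector, pattern in _PATTERNS.items():
        match = pattern.match(text)
        if match:
            return collector, int(match.group(1))
    return None
```

Lines that are not garbage collection reports yield None, so the function can be applied to the full output of a program.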

The copying collector

Clean’s copying collector is similar to Cheney’s algorithm (see the Wikipedia article on Cheney’s algorithm).

In this algorithm the heap is divided into two semi-spaces, and only one is used at any moment. As a result, only half of the heap is available to the program, which effectively doubles memory consumption.

This garbage collector is especially efficient when a program allocates many short-lived nodes. It becomes less efficient when the size of the live set begins to approach the heap size, at which point the switch to the compacting collector is made.
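
To make the idea concrete, here is a minimal Python sketch of a Cheney-style copying collection. It is a toy model and not Clean's implementation: the heap is a Python list, objects are (value, refs) pairs, and forwarding addresses are kept in a separate dictionary, whereas a real collector would typically store them in the old objects themselves. The name cheney_collect is hypothetical.

```python
def cheney_collect(from_space, roots):
    """Copy everything reachable from `roots` into a fresh to-space.

    `from_space` is a list of objects. Each object is a (value, refs) pair,
    where `refs` is a list of indices into `from_space`.
    Returns (to_space, new_roots) with all indices rewritten for the to-space.
    """
    to_space = []     # len(to_space) plays the role of the "free" pointer
    forwarding = {}   # old index -> new index for objects already copied

    def forward(old):
        # Copy an object on its first visit; later visits reuse the new index.
        if old not in forwarding:
            value, refs = from_space[old]
            forwarding[old] = len(to_space)
            to_space.append((value, list(refs)))  # refs still point into from-space
        return forwarding[old]

    new_roots = [forward(r) for r in roots]

    # Scan the copied objects in order, forwarding their references.
    # Objects copied during the scan are appended and scanned later.
    scan = 0
    while scan < len(to_space):
        value, refs = to_space[scan]
        to_space[scan] = (value, [forward(r) for r in refs])
        scan += 1

    return to_space, new_roots
```

Note that the loop only ever touches reachable objects, so the work done is proportional to the size of the live set rather than to the heap size. This is consistent with the observation above that the collector does well when most nodes are short-lived and worse as the live set grows.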

The mark-scan collector

(There is no description for this garbage collector yet.)

The compacting collector

The compacting collector is more complex, and hence less efficient, than the copying collector, but has the advantage that memory consumption is not doubled.

In the mark-scan setup, the marking phase does not compact memory, so every now and then a compacting phase is needed to make sure the mark-scan iterations are sufficiently efficient.
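
One common reason a non-moving collector benefits from occasional compaction is fragmentation: freed cells end up scattered between live ones. The toy Python model below illustrates that effect only. It is an assumption that this is the inefficiency meant here, the function names are hypothetical, and the sketch ignores the pointer updates that a real compacting collector has to perform.

```python
def free_unmarked(heap, marked):
    """Return a heap where unmarked cells are freed (None); survivors stay put."""
    return [cell if i in marked else None for i, cell in enumerate(heap)]


def largest_free_run(heap):
    """Length of the longest run of consecutive free (None) cells."""
    best = run = 0
    for cell in heap:
        run = run + 1 if cell is None else 0
        best = max(best, run)
    return best


def compact(heap):
    """Slide surviving cells to the start of the heap, keeping their order."""
    live = [cell for cell in heap if cell is not None]
    return live + [None] * (len(heap) - len(live))
```

For a six-cell heap in which every other cell survives, three cells are free after marking but no two of them are adjacent; after compact, the same three free cells form a single contiguous block.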