Sunday 23 October 2011

.NET Garbage Collection, Fundamentals

The Managed Heap
The managed heap is an area of memory controlled by the .NET CLR. It holds the memory allocated for each object created in the .NET managed environment.
The heap consists of two sub-heaps, the small object heap and the large object heap. Objects of 85,000 bytes or more go to the large object heap.
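A quick way to see the two sub-heaps in action is to ask the runtime which generation it reports for a freshly allocated object. This is only a minimal sketch: the class name HeapDemo and the array sizes are made up for illustration, and it assumes the default 85,000-byte threshold.

```csharp
using System;

public static class HeapDemo
{
    // Returns the generation the GC reports for a newly allocated byte array.
    public static int GenerationOfNewArray(int length)
    {
        byte[] buffer = new byte[length];
        return GC.GetGeneration(buffer);
    }

    public static void Main()
    {
        // A small array is allocated on the small object heap, in generation 0.
        Console.WriteLine(GenerationOfNewArray(1000));

        // An array well above the threshold goes to the large object heap,
        // which the runtime reports as generation 2.
        Console.WriteLine(GenerationOfNewArray(100000));
    }
}
```

On a default configuration this would typically print 0 followed by 2.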

The Garbage Collector
The garbage collector (GC from now on) controls the allocation of memory on the managed heap, and the release and recycling of memory that is no longer in use.

Object Roots
The JIT compiler and the CLR maintain a list of roots; a root is effectively a holder of a reference to an object on the heap. Roots include, among others, static fields, local variables, method parameters and object pointers held in CPU registers.

Garbage Collection
When the GC performs a collection, it starts from the roots and marks every object it can reach. Because a reachable object may itself reference other objects on the heap, the search follows those references recursively. Any object that is not marked is deemed unreachable, and its memory can be reclaimed. Once the reachable objects have been identified, the GC compacts them into a contiguous block at the lowest possible memory addresses (think defragmentation of a hard drive). All references to the moved objects are updated to the new memory addresses, and the pointer to the next available memory address is updated. The large object heap is not compacted by default, because moving large objects is expensive.
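The difference between a rooted and an unrooted object can be observed with a WeakReference, which tracks an object without keeping it alive. The following is a sketch under a few assumptions: the class and method names are hypothetical, and the helper methods are marked NoInlining so that no stray local variable in Main acts as a root.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ReachabilityDemo
{
    // A static field acts as a root for whatever it references.
    private static object rooted;

    // Allocates an object that nothing references once this method returns,
    // and hands back only a weak reference to it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateUnrooted()
    {
        return new WeakReference(new object());
    }

    // Allocates an object, stores it in the static field (a root),
    // and returns a weak reference to the same object.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateRooted()
    {
        rooted = new object();
        return new WeakReference(rooted);
    }

    public static void Main()
    {
        WeakReference unrooted = CreateUnrooted();
        WeakReference stillRooted = CreateRooted();

        GC.Collect();

        Console.WriteLine("Unrooted object alive: " + unrooted.IsAlive);
        Console.WriteLine("Rooted object alive:   " + stillRooted.IsAlive);
    }
}
```

After the collection, the unrooted object is normally reported as no longer alive, while the object held by the static field survives.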

Object Creation
When an object is created, the GC attempts to allocate a block of contiguous memory on the heap; if no suitable block is available, the GC performs a collection. If there is still not enough memory after the collection, an OutOfMemoryException is thrown.

Object Generations
When the managed heap is initialised, it contains no objects; each new small object added is deemed to be in generation 0. Any large object is added to the large object heap, which is treated as part of generation 2. When generation 0 fills and a collection is required, the surviving generation 0 objects are promoted to generation 1. On a later collection, any surviving generation 1 objects are promoted to generation 2. This is the highest generation used by the GC.
This generational GC allows optimisation, by inspecting only the lowest generations required to free enough memory to allow new objects to be created.
The assumption behind this optimisation is that new objects are expected to be short-lived, and that objects created at the same time are likely to be used then destroyed together.
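Promotion through the generations can be watched by forcing collections and asking for an object's generation each time. This is a rough sketch with a hypothetical class name; it uses GC.GetGeneration, GC.Collect and GC.MaxGeneration from the System namespace, and the exact numbers may vary with the runtime and GC settings.

```csharp
using System;

public static class GenerationDemo
{
    // Records the generation of one object before and after each forced collection.
    public static int[] TrackPromotions()
    {
        object survivor = new object();
        int[] generations = new int[3];

        generations[0] = GC.GetGeneration(survivor); // newly allocated
        GC.Collect();
        generations[1] = GC.GetGeneration(survivor); // survived one collection
        GC.Collect();
        generations[2] = GC.GetGeneration(survivor); // survived two collections

        // Keep the object reachable until this point so it is not collected early.
        GC.KeepAlive(survivor);
        return generations;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", TrackPromotions()));
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);
    }
}
```

The object should move up one generation per collection until it reaches GC.MaxGeneration, after which further collections leave it where it is.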

Forcing a Collection
As a developer, if you need to force the GC to perform a garbage collection (for example, a complex form with lots of controls has closed and you want to reclaim the memory immediately), call the System.GC.Collect() method. Overloads of this method allow you to specify the maximum generation of objects to collect, so specifying 1 would force the GC to collect generations 0 and 1. Another overload allows the developer to specify the collection mode as a GCCollectionMode value: Default, Forced or Optimized. Forced performs the collection immediately, Optimized allows the GC to determine whether a collection is worthwhile, and Default is currently equivalent to Forced.
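The overloads described above might be used as follows. This is an illustrative sketch only: the class and method names are invented, and GC.CollectionCount is used purely to confirm that a collection actually took place.

```csharp
using System;

public static class CollectDemo
{
    // Forces a collection of generations 0 through maxGeneration and returns
    // how many generation 0 collections the runtime recorded as a result.
    public static int ForceCollection(int maxGeneration)
    {
        int before = GC.CollectionCount(0);
        GC.Collect(maxGeneration, GCCollectionMode.Forced);
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        // Collect generations 0 and 1 only.
        Console.WriteLine("Gen 0 collections triggered: " + ForceCollection(1));

        // Let the GC decide whether a full collection is worthwhile right now.
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Optimized);

        // Parameterless overload: collect all generations.
        GC.Collect();
    }
}
```

Forcing collections is rarely needed in practice, since the GC tunes itself; a forced collection also promotes surviving objects earlier than they would otherwise be promoted.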

Reference articles:
MSDN Library article on Garbage Collection

MSDN magazine articles:
Automatic Memory Management in the Microsoft .NET Framework, Part 1
Automatic Memory Management in the Microsoft .NET Framework, Part 2

Next time... Finalize() - when and why you should avoid it
