1 Title: Understanding the current GC
2 Tags: en, d, dgc, understanding the current gc, druntime, gc, mark-sweep, intro, pool, bin
4 Oh, yeah! A new year, a new air, and the same thesis =)
6 After a *little* break, I'm finally starting to analyze the current
7 D (druntime) GC (basic) implementation in depth.
9 First I want to say I found the code really, but **really**, hard to read and
10 follow. Things are split in several parts without apparent reason, which make
11 it really hard to understand and it's pretty much undocumented.
13 I hope I can fully understand it in some time to be able to make a full rewrite
14 of it (in a first pass, conserving the main design).
19 I'll start with a big picture overview, and then I'll try to describe each
20 component with more detail.
22 The implementation in split in several files:
25 I didn't took a look at this one yet, but I guess it's about stats =).
28 A custom bitset implementation for collector bit/flags (mark, scan, etc.).
31 A wrapper for memory allocation with several versions (malloc, win32, mmap
32 and valloc). 4 functions are provided: map, unmap, commit and decommit. The
33 (de)commit stuff if because (in Sean Kelly's words) Windows has a 2-phase
34 allocation process. You can reserve the address space via map and unmap,
35 but the virtual memory isn't actually created until you call commit. So
36 decommit gets rid of the virtual memory but retains ownership of the
40 The real GC implementation, split in 2 main classes/structs: GC and Gcx. GC
41 seems to be a thin wrapper over Gcx that only provides the allocation logic
42 (alloc/realloc/free) and Gcx seems to be the responsible for the real GC
43 work (and holding the memory).
46 This is just a thin wrapper over gcx.d to adapt it to the `druntime GC
49 __ http://www.dsource.org/projects/druntime/wiki/GarbageCollectorD2_0
52 The Gcx struct is where most magic happens. It holds the GC memory organized in
53 pools. It holds the information about roots, the stack and free list, but in
54 this post I'll focus in the memory pools:
60 A pool is a group of pages, each page has a bin size (``Bins``) and host
61 a fixed number of bins (``PAGESIZE / Bins``, for example, if ``Bins == 1024``
62 and ``PAGESIZE == 4096``, the page holds 4 bins).
64 Each bin has some bits of information:
67 Setted when the Bin is visited by the *mark phase*.
70 Setted when the Bin is has been visited by the *mark phase* (the ``mark``
71 bit is set) but it has pointers yet to be scanned.
74 Setted when the Bin is free (linked to a free list).
77 The object stored in this bin has a destructor that must be called when
81 This bin should be not scanned by the collector (it has no pointers).
85 +----------------------------------------+-----+-----------------+
86 | Page 0 (bin size: Bins) | ... | Page (npages-1) |
88 | +--------+-----+---------------------+ | | |
89 | | Bin 0 | ... | Bin (PAGESIZE/Bins) | | | |
90 | +--------+-----+---------------------+ | | |
91 | | mark | ... | | | | |
92 | | scan | ... | | | | ... |
93 | | free | ... | ... | | | |
94 | | final | ... | | | | |
95 | | noscan | ... | | | | |
96 | +--------+-----+---------------------+ | | |
97 +----------------------------------------+-----+-----------------+
103 A single chunk of memory is allocated for the whole pool, the ``baseAddr``
104 points to the start of the chunk, the ``topAddr``, to the end. A ``pagetable``
105 holds the bin size (``Bins``) of each page ::
107 . ,-- baseAddr topAddr --,
110 |--- committed pages ---,------ uncommitted pages -------|
112 +--------+--------+-----+--------+-----+-----------------+
113 memory | Page 0 | Page 1 | ... | Page i | ... | Page (npages-1) |
114 +--------+--------+-----+--------+-----+-----------------+
117 +--------+--------+-----+--------+-----+-----------------+
118 pagetable | Bins 0 | Bins 1 | ... | Bins i | ... | Bins (npages-1) |
119 (bin size) +--------+--------+-----+--------+-----+-----------------+
121 The bin size can be one of:
124 The ``XXX`` is a power of 2 from 16 to 4096. The special name ``B_PAGE`` is
125 used for the size 4096.
128 The whole page is a continuation of a large object (the first page of
129 a large object has size ``B_PAGE``).
132 The page is completely free.
135 The page is not committed yet.
138 Not really a value, used for iteration or allocation. Pages can't have this
141 The information bits are stored in a custom bit set (``GCBits``). ``npages
142 * PAGESIZE / 16`` bits are allocated (since the smallest bin is 16 bytes long)
143 and each bit is *addressed* using this formula::
145 bit(pointer) = (pointer - baseAddr) / 16
147 This means that a bit is reserved each 16 bytes. For large bin sizes, a lot of
150 The minimum pool size is 256 pages. With 4096 bytes pages, that is 1 MiB.
152 The ``GCBits`` implementation deserves another post, it's a little complex and
153 I still don't understand why.
156 .. vim: set et sw=4 sts=4 :