1 Title: Understanding the current GC, part II
2 Tags: en, d, dgc, understanding the current gc, druntime, gc, mark-sweep, gcx
4 Back to the `analysis of the current GC implementation`__, in this post I will
5 focus on the ``Gcx`` object structure and methods.
7 __ /blog/blog/post/250bf643
14 ``roots`` (``nroots``, ``rootdim``)
15 An array of root pointers.
17 ``ranges`` (``nranges``, ``rangedim``)
18 An array of root ranges (a range of memory that should be scanned for
21 Beginning of the stack (``stackBottom``)
22 A pointer to the stack bottom (assuming it grows up).
24 ``Pool`` table (``pooltable``, ``npools``)
25 An array of pointers to ``Pool`` objects (the heap itself).
27 Free list (``bucket``)
28 A free list for each ``Bins`` size.
32 Set if the marking of a range has actually marked anything (and then
33 using in the full collection.
36 Set if the GC has been initialized.
38 Behaviour changing attributes
40 Don't scan the stack if activated.
43 Turn on logging if activated.
46 Don't run the collector if activated.
48 Cache (for optimizations and such)
49 ``p_cache``, ``size_cache``
50 Querying the size of a heap object is an expensive task. This caches
51 the last query as an optimization.
53 ``minAddr``, ``maxAddr``
54 All the heap is in this range. It's used as an optimization when
55 looking if a pointer can be pointing into the heap (if the pointer is
56 not in this range it can be safely discarded, but if it's in the range,
57 a full search in the ``pooltable`` should be done).
64 Initialization, set the ``Gcx`` object attributes to 0, except for the
65 ``stackBottom`` (which is set to the address of a dummy local variable,
66 this works because this function is one of the first functions called by
67 the runtime) and the ``inited`` flag, that is set to 1. The log is
71 Destruction, free all the memory.
74 ``addRoot(p)``, ``removeRoot(p)``, ``rootIter(dg)``
75 Add, remove and iterate over single root pointers.
77 ``addRange(pbot, ptop)``, ``remove range(pbot)``, ``rangeIter(dg)``
78 Add, remove and iterate over root pointer ranges. This methods are
79 almost the same as the previous ones, so the code duplication here can
83 Each ``Bin`` has some flags associated (as explained before). With this
84 functions the user can manipulate some of them:
86 * ``FINALIZE``: this pool has destructors to be called (*final* flag)
87 * ``NO_SCAN``: this pool should not be scanned for pointers (*noscan* flag)
88 * ``NO_MOVE``: this pool shouldn't be moved (not implemented)
90 ``getBits(pool, biti)``
91 Get which of the flags specified by ``biti`` are set for the ``pool``
94 ``setBits(pool, mask)``
95 Set the flags specified by ``mask`` for the ``pool`` ``Pool``.
97 ``clrBits(pool, mask)``
98 Clear the flags specified by ``mask`` for the ``pool`` ``Pool``.
102 Find the ``Pool`` object that pointer ``p`` is in.
105 Find the base address of block containing pointer ``p``.
108 Find the size of the block pointed by ``p``.
111 Get information on the pointer ``p``. The information is composed of:
112 ``base`` (the base address of the block), ``size`` (the size of the
113 block) and ``attr`` (the flags associated to the block, as shown in
114 *Flag manipulation*). This information is returned as a structure
115 called the ``BlkInfo``.
118 Compute ``Bins`` (bin size) for an object of size ``size``.
120 Heap (``pagetable``) manipulation
121 The ``pooltable`` is kept sorted always.
124 Allocate a new ``Pool`` of at least size bytes.
127 Minimizes physical memory usage by returning free pools to the
131 Allocate a chunk of memory that is larger than a page.
134 Allocate a new ``Pool`` with at least ``npages`` pages in it.
137 Allocate a page of ``bin`` size.
141 This is the mark phase. It search a range of memory values and mark any
142 pointers into the GC heap. The ``mark`` bit is set, and if the
143 ``noscan`` bit is unset, the ``scan`` bit is activated (indicating that
144 the block should be scanned for pointers, equivalent to coloring the
145 cell grey in the `tri-colour abstraction`_).
147 The mark phase is not recursive (nor a mark stack is used). Only the
148 passed range is marked, pointers are not followed here.
150 That's why the ``anychanges`` flag is used, if anything has got marked,
151 ``anychanges`` is set to *true*. The marking phase is done iteratively
152 until no more blocks are marked, in which case we can safely assume
153 that we marked all the live blocks.
155 ``fullcollectshell()``
156 The purpose of the *shell* is to ensure all the registers get put on
157 the stack so they'll be scanned.
159 ``fullcollect(stackTop)``
160 Collect memory that is not referenced by the program. The algorithm is
163 1. Stop the world (all other threads)
164 2. Clear all the mark/scan bits in the pools
165 3. Manually mark each free list entry (``bucket``), so it doesn't get
167 4. ``mark()`` the static data
168 5. ``mark()`` stacks and registers for each paused thread
169 6. ``mark()`` the root set (both ``roots`` and ``ranges``)
170 7. ``mark()`` the heap iteratively until no more changes are detected
171 (``anychanges`` is *false*)
172 8. Start the world (all other threads)
173 9. Sweep (free up everything not marked)
174 10. Free complete pages, rebuild free list
177 This is a very summarized version of the algorithm, what I could
178 understand from a quick look into the code, which is pretty much
179 undocumented. A deeper analysis should be done in a following post.
181 .. _`tri-colour abstraction`: http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Tri-colour_marking
183 .. vim: set et sw=4 sts=4 :