Writing a mark-and-sweep tracing GC in Rust
Garbage collection is a well-researched area in compiler development and is particularly necessary in languages that require more than regular dynamic memory management. We'll look into a specific type of garbage collector (GC) referred to as a tracing GC and a related algorithm, mark-and-sweep.
A reachable (live) object is one that is referenced strongly by the GC, which could be used to manage static variables, symbol table entries that live throughout a program's lifetime, etc. An unreachable object, on the contrary, is not referenced in any way by the GC (linked list nodes are a great way to model this structural requirement).
Regular GCs usually deallocate immediately. The issue with this is that it is limited by cyclic references and a range of other issues.
struct Node {
next: Option<Rc<Node>>,
}
let a = Rc::new(Node { next: None });
let b = Rc::new(Node { next: Some(a.clone()) });
a.next = Some(b.clone());
drop(a);
drop(b);
A tracing GC resolves this by providing an abstraction that allows managing reachable and unreachable objects. Normally, a regular GC can still manage to use language-level weak references to handle cyclic references but tracing GCs offer more control.
We will start with a structure that holds all objects, roots and a next_id
which is a trivial index increment. Rc
does the actual work here by reference counting our objects.
use std::rc::Rc;
use std::cell::{Cell, RefCell};
use std::collections::{HashMap, HashSet};
pub trait GcObject {
fn trace(&self, tracer: &TracingGC);
}
pub struct TracingGC {
objects: RefCell<HashMap<usize, (Rc<dyn GcObject>, Cell<bool>)>>,
roots: RefCell<HashSet<usize>>,
next_id: Cell<usize>,
}
On a side note, real-world GCs will typically work with a single type, e.g. something like Gc<T>
which is scoped and a fine-grained way to model a GC, however, our one will manage heterogenous objects through trait objects, basically type erasure.
Each
Gc<T>
would ideally only manage objects of typeT
, there’s no need for type checks or casting (i.e.as_any
ordowncast_ref
), which add some runtime overhead. It also enables monomorphization—compiler-generated and type-specific implementations without indirections.
What about core::mem::ManuallyDrop<T>
?
ManuallyDrop<T>
is not relevant at all, Rc
closely follows RAII and just collects everything.
GCs are advanced reference counters.
GC references are clonable? #
From a pragmatic approach, it is great to work with a single immutable interface that implements Clone
for objects. For this, we will require a GcRef
and associated methods. The id
is the index of the object in the trace's hash set.
pub struct GcRef<T: GcObject> {
id: usize,
rc: Rc<T>,
}
impl<T: GcObject> Clone for GcRef<T> {
fn clone(&self) -> Self {
GcRef {
id: self.id,
rc: self.rc.clone(),
}
}
}
impl<T: GcObject> GcRef<T> {
pub fn id(&self) -> usize {
self.id
}
}
Allocating objects on the object map #
Emplacing objects onto the map is literally just inserting it in the map, but only as a GcRef
, not a GcObject
. We will revisit the address
part later (which holds the raw pointer address of the contained object, for debugging).
pub fn allocate<T: 'static + GcObject>(&self, object: T) -> GcRef<T> {
let id = self.next_id.get();
self.next_id.set(id + 1);
let rc = Rc::new(object);
let _address = Rc::as_ptr(&rc) as *const ();
self.objects
.borrow_mut()
.insert(id, (rc.clone(), Cell::new(false)));
GcRef { id, rc }
}
Tracing GC objects from the roots set #
GcObject
requires a trace
that finds any root with an object of the same index and marks the object.
The actual tracing is done in two phases:
-
Marking unreachable objects
- Early return or set
&Cell<bool>
(marked) to true. - Call
trace
on the marked object
- Early return or set
fn mark(&self, obj: &Rc<dyn GcObject>, marked: &Cell<bool>) {
if marked.get() { return; }
marked.set(true);
obj.trace(self);
}
-
Collecting (sweeping) the unreachable objects
- Force set
marked
to false. - For each root, find the relevant objects for marking
- Retain every indexed object where it's marked (according to the docs,
retain
runs in instead of because it internally visits empty buckets besides existing elements, which is bad)
- Force set
pub fn collect(&self) {
for (_, (_, marked)) in self.objects.borrow().iter() {
marked.set(false);
}
for root in self.roots.borrow().iter() {
if let Some((obj, marked)) = self.objects.borrow().get(root) {
self.mark(obj, marked);
}
}
self.objects.borrow_mut().retain(|&id, (obj, marked)| {
if marked.get() { true } else { false }
});
}
trace
can mark an object conditionally based on whether it's reachable.
impl GcObject for TracingGC {
fn trace(&self, tracer: &TracingGC) {
for root in self.roots.borrow().iter() {
if let Some((obj, marked)) = self.objects.borrow().get(root) {
tracer.mark(obj, marked);
}
}
}
}
I excluded the tests for brevity but the output is conclusive. For each iteration with roots pointing to another GcRef
of the same type (i.e. live objects that are initialized, 2), the remaining objects (i.e. unreachable objects) are 2 (marked), with 0 collected (none sweeped).
| allocated | idx: 0 | address: 0x55d4f3c25af0 | marked: false |
| allocated | idx: 1 | address: 0x55d4f3c25a20 | marked: false |
| root (add) | id: 1 |
| mark phase |
| marked | address: 0x55d4f3c25a20 |
| marked | address: 0x55d4f3c25af0 |
| sweep phase |
| live object | id: 0 | address: 0x55d4f3c25af0 |
| live object | id: 1 | address: 0x55d4f3c25a20 |
| collection complete | total collected: 0 | remaining: 2 |
With iterations where the root object does not point to another object, 1 object is remaining, 1 is collected from 2 live objects and only 1 unreachable (non-root) marked for collecting (0x55fb85d83a20
is the unreachable allocated object).
| allocated | idx: 0 | address: 0x55fb85d83af0 | marked: false |
| allocated | idx: 1 | address: 0x55fb85d83a20 | marked: false |
| root (add) | id: 1 |
| mark phase |
| marked | address: 0x55fb85d83a20 |
| sweep phase |
| live object | id: 1 | address: 0x55fb85d83a20 |
| collected | id: 0 | address: 0x55fb85d83af0 |
| collection complete | total collected: 1 | remaining: 1 |
Both cases are tested on 2 objects. The above steps are not entirely representative of existing, real-world, implementations. You'd have to read actual source code to reproduce any remotely-good GC or related obscure compiler thingy, which is unfortunate, but also why I decided to post this.
Cf. final source.