Skip to main content
elric neumann

Writing a mark-and-sweep tracing GC in Rust

Garbage collection is a well-researched area in compiler development and is particularly necessary in languages that require more than regular dynamic memory management. We'll look into a specific type of garbage collector (GC) referred to as a tracing GC and a related algorithm, mark-and-sweep.

A reachable (live) object is one that is referenced strongly by the GC, which could be used to manage static variables, symbol table entries that live throughout a program's lifetime, etc. An unreachable object, on the contrary, is not referenced in any way by the GC (linked list nodes are a great way to model this structural requirement).

figure 1

Regular GCs usually deallocate immediately. The issue with this is that it is limited by cyclic references and a range of other issues.

struct Node {
    next: Option<Rc<Node>>,
}

let a = Rc::new(Node { next: None });
let b = Rc::new(Node { next: Some(a.clone()) });
a.next = Some(b.clone());

drop(a);
drop(b);

A tracing GC resolves this by providing an abstraction that allows managing reachable and unreachable objects. Normally, a regular GC can still manage to use language-level weak references to handle cyclic references but tracing GCs offer more control.


We will start with a structure that holds all objects, roots and a next_id which is a trivial index increment. Rc does the actual work here by reference counting our objects.

use std::rc::Rc;
use std::cell::{Cell, RefCell};
use std::collections::{HashMap, HashSet};

pub trait GcObject {
    fn trace(&self, tracer: &TracingGC);
}

pub struct TracingGC {
    objects: RefCell<HashMap<usize, (Rc<dyn GcObject>, Cell<bool>)>>,
    roots: RefCell<HashSet<usize>>,
    next_id: Cell<usize>,
}

On a side note, real-world GCs will typically work with a single type, e.g. something like Gc<T> which is scoped and a fine-grained way to model a GC, however, our one will manage heterogenous objects through trait objects, basically type erasure.

Each Gc<T> would ideally only manage objects of type T, there’s no need for type checks or casting (i.e. as_any or downcast_ref), which add some runtime overhead. It also enables monomorphization—compiler-generated and type-specific implementations without indirections.

What about core::mem::ManuallyDrop<T>?

ManuallyDrop<T> is not relevant at all, Rc closely follows RAII and just collects everything.

GCs are advanced reference counters.

GC references are clonable? #

From a pragmatic approach, it is great to work with a single immutable interface that implements Clone for objects. For this, we will require a GcRef and associated methods. The id is the index of the object in the trace's hash set.

pub struct GcRef<T: GcObject> {
    id: usize,
    rc: Rc<T>,
}

impl<T: GcObject> Clone for GcRef<T> {
    fn clone(&self) -> Self {
        GcRef {
            id: self.id,
            rc: self.rc.clone(),
        }
    }
}

impl<T: GcObject> GcRef<T> {
    pub fn id(&self) -> usize {
        self.id
    }
}

Allocating objects on the object map #

Emplacing objects onto the map is literally just inserting it in the map, but only as a GcRef, not a GcObject. We will revisit the address part later (which holds the raw pointer address of the contained object, for debugging).

pub fn allocate<T: 'static + GcObject>(&self, object: T) -> GcRef<T> {
    let id = self.next_id.get();
    self.next_id.set(id + 1);
    let rc = Rc::new(object);
    let _address = Rc::as_ptr(&rc) as *const ();

    self.objects
        .borrow_mut()
        .insert(id, (rc.clone(), Cell::new(false)));

    GcRef { id, rc }
}

Tracing GC objects from the roots set #

GcObject requires a trace that finds any root with an object of the same index and marks the object.

The actual tracing is done in two phases:

fn mark(&self, obj: &Rc<dyn GcObject>, marked: &Cell<bool>) {
    if marked.get() { return; }
    marked.set(true);
    obj.trace(self);
}
pub fn collect(&self) {
    for (_, (_, marked)) in self.objects.borrow().iter() {
        marked.set(false);
    }

    for root in self.roots.borrow().iter() {
        if let Some((obj, marked)) = self.objects.borrow().get(root) {
            self.mark(obj, marked);
        }
    }

    self.objects.borrow_mut().retain(|&id, (obj, marked)| {
        if marked.get() { true } else { false }
    });
}

trace can mark an object conditionally based on whether it's reachable.

impl GcObject for TracingGC {
    fn trace(&self, tracer: &TracingGC) {
        for root in self.roots.borrow().iter() {
            if let Some((obj, marked)) = self.objects.borrow().get(root) {
                tracer.mark(obj, marked);
            }
        }
    }
}

I excluded the tests for brevity but the output is conclusive. For each iteration with roots pointing to another GcRef of the same type (i.e. live objects that are initialized, 2), the remaining objects (i.e. unreachable objects) are 2 (marked), with 0 collected (none sweeped).

| allocated | idx: 0 | address: 0x55d4f3c25af0 | marked: false |
| allocated | idx: 1 | address: 0x55d4f3c25a20 | marked: false |
| root (add) | id: 1 |

| mark phase |
| marked      | address: 0x55d4f3c25a20 |
| marked      | address: 0x55d4f3c25af0 |

| sweep phase |
| live object | id: 0 | address: 0x55d4f3c25af0 |
| live object | id: 1 | address: 0x55d4f3c25a20 |
| collection complete | total collected: 0 | remaining: 2 |

With iterations where the root object does not point to another object, 1 object is remaining, 1 is collected from 2 live objects and only 1 unreachable (non-root) marked for collecting (0x55fb85d83a20 is the unreachable allocated object).

| allocated | idx: 0 | address: 0x55fb85d83af0 | marked: false |
| allocated | idx: 1 | address: 0x55fb85d83a20 | marked: false |
| root (add) | id: 1 |

| mark phase |
| marked      | address: 0x55fb85d83a20 |

| sweep phase |
| live object | id: 1 | address: 0x55fb85d83a20 |
| collected   | id: 0 | address: 0x55fb85d83af0 |
| collection complete | total collected: 1 | remaining: 1 |

Both cases are tested on 2 objects. The above steps are not entirely representative of existing, real-world, implementations. You'd have to read actual source code to reproduce any remotely-good GC or related obscure compiler thingy, which is unfortunate, but also why I decided to post this.

Cf. final source.


References #