Object Oriented Programming in Rust—Yuck and Yet...

Introduction

I remember a coworker, gosh, maybe two decades ago, telling me you can write object oriented code in any language—even C. We were happily writing Java at the time (see, a long time ago), and I was confused because obviously the C language does not natively support object oriented programming but he was right. Many people have written object oriented code in C. For instance GTK uses the GObject library. Nethack is another example. So object oriented programming can be done in C but the language does not explicitly facilitate it. Rust is another language that has questionable support for conventional object oriented programming. Suppose we want conventional object oriented support, they can’t stop us from doing object oriented programming if that’s what we’re determined to do.

Object Oriented Programming

Object oriented mind graph showing it connect polymorphism, abstraction, encapsulation, inheritance, classes, and objects

What is object oriented programming? Gosh, it’s been with us so long it’s hard to think of what it actually means.

  • A class defines some data and provides methods.
struct Mouse { // a "class"
  age: u8 // an unsigned 8-bit, 
          // a byte, or a "datum"
}

impl Mouse {
  fn is_old(&self) -> bool { // a "method"
    self.age >= 2
  }
}
  • An object is an instance of a class, of which there can be many.
let mouse = Mouse { age: 1 }; // an "object"
  • Encapsulation: The data a class defines may be private and its public methods constitute its programmatic interface.
pub struct Rat {
  age: u8 // Field is not public so Rat 
          // can't be constructed and age 
          // can't be accessed externally.
}

impl Rat {
  pub fn new(years_old: u8) -> Rat { // a "static method"
    return Rat { age: years_old };
  }

  pub fn is_old(&self) -> bool { // an "instance method"
    self.age >= 3
  }
}

let rat = Rat::new(3);
  • Abstraction: A caller—that can interact with a class—can also interact with any of its subclasses. Likewise a caller—that can interact with an interface—can interact with any of its implementers.
pub trait Old { // an "interface"
  fn is_old(&self) -> bool;
}
  • Inheritance: One can extend a class adding new data and modifying its methods.

We’ll come back to this one. The short answer is that rust does not support implementation inheritance natively: structs cannot inherit from one another. But a trait can inherit from other traits.

Polymorphism

What is polymorphism? Polymorphism is the ability to access objects of different types through the same programmatic interface. There is subclass polymorphism: a child class can be substituted anywhere that accepts its parent. There is interface polymorphism: a class that implements an interface can be substituted anywhere that accepts that interface.

Furthermore there are two variants of polymorphism: static and dynamic. With dynamic polymorphism a caller does not need to know the class yet it can still interact via its interface. Dynamic polymorphism is traditionally implemented with a virtual method table that each class and subclass maintains. This introduces some overhead in space and time: there’s an lookup for a virtual method call. This is called dynamic dispatch.

Then there is static polymorphism, where code can be written to work some interface but the compiler knows what class is given and omits the dynamic dispatch and calls the class’s method directly. This is called static dispatch.

A Model Animal

Rust’s traits are analogous to C# or Java’s interfaces. Let’s create the canonical object oriented animal example.

trait Animal {
  fn cry();
  fn sleep();
  fn roam();
}

It crys, sleeps, and roams. What more could you ask for?

A Cat

Let’s make a Cat “class” to implement it. Rust calls it a struct.

struct Cat { }

impl Animal for Cat {
  fn cry() {
    println!("meow");
  }
  fn sleep() {
    println!("Zzzz");
  }
  fn roam() {
    println!("patter patter pounce");
  }
}

Run it

We can exercise it like so.

Cat::cry();
meow

Crying cat meme in pixel art form by
PixelPandaArts

However, this cry() is akin to a static method in Java or C# terminology. There is no instance or object it is called on. Static methods aren’t capable of any dynamic polymorphism. These self-less functions are called associated functions in rust terminology.

Static Polymorphism

However, rust can do static polymorphism. Say we had another “class” called Bird.

struct Bird {}

impl Animal for Bird {
  fn cry() {
    println!("chirp");
  }
  fn sleep() {
    println!("Zzzz");
  }
  fn roam() {
    println!("flit flit flurp");
  }
}

We can of couse make calls on the class like so.

Bird::roam();
flit flit flurp

But we can also use rust’s generics to write code that will handle any Animal struct without requiring an instance.

fn cry_to_sleep<T: Animal>() {
  T::cry();
  T::cry();
  T::sleep();
}

Now we can exercise it. You might think we’d do call_me_sleepy<Cat>() but rust instead uses cry_to_sleep::<Cat>(). They call the ::<Type> a turbo fish. I don’t know why. Does it ::<> look like a fish? Maybe I can see it now.

cry_to_sleep::<Cat>();
meow
meow
Zzzz
cry_to_sleep::<Bird>();
chirp
chirp
Zzzz

I think this is pretty cool. It was only recently that C# got interface static methods, and you can use them to do all kinds of cool things like make generics work seamlessly with math operators in C#. And you can use it to good effect.

Factories. Factories everywhere.

Remember how everyone needed a factory class in addition to their class? Why did they need a factory? Because new is not dynamically dispatched; new only operates on a concrete class. You can’t new an interface. So to workaround that we added an indirection, a whole other class we call a factory, which is really just a glorified constructor, a dynamically dispatched constructor.

You’d think one could have added a static method to their class like this:

class Gorilla {
  public static Gorilla New() { ... }
}

var gorilla = Gorilla.New();

But one could not substitute a Gorilla class with some other Gorilla class. You can substitute an instance of the Gorilla class with any of its descendant instances but not the class itself. There’s only one Gorilla class.

A New Way

With static polymorphism you could instead write this:

trait Factory {
  fn new() -> Self; // "new" is not a keyword in rust.
                    // It's not special.
}

struct Gorilla { }

impl Factory for Gorilla {
  fn new() -> Self { // "Self" is an alias for the implementing type.
    println!("Made a gorilla");
    return Gorilla { }; // This is a "struct constructor."
  }
}

fn gorilla_keeper<F: Factory>() {
  let gorilla1 = F::new();
  let gorilla2 = F::new();
}

gorilla_keeper::<Gorilla>();
Made a gorilla
Made a gorilla

Then you could make a class responsible for making instances of itself without a factory class, and you could substitute other classes that were their own factory too.

struct MockGorilla {}

impl Factory for MockGorilla {
  fn new() -> MockGorilla {
    println!("Made a mock gorilla");
    MockGorilla { } // Can omit 'return' and ';' to 
                    // return last expression.
  }
}

gorilla_keeper::<MockGorilla>();
Made a mock gorilla
Made a mock gorilla

This static polymorphism is limited however: the type must be known at compile time so the compiler can emit code for a concrete function that uses that type. So we are trading space for time when we use generics in this way. Our binary size is larger but our performance is better. But we can’t use this on types that weren’t present at compile-time. Here’s a fun new word: this is called monomorphization.

Take Two

What we’d like instead is to call these functions on an instance of an object. To do that in rust, we add the &self parameter to our trait.

trait Animal {
  fn cry(&self);
  fn sleep(&self);
  fn roam(&self);
}

Now they’re instance methods.

struct Cat { }

impl Animal for Cat {
  fn cry(&self) {
    println!("meow");
  }
  fn sleep(&self) {
    println!("Zzzz");
  }
  fn roam(&self) {
    println!("patter patter pounce");
  }
}

Now let’s try and run it.

let cat = Cat {};
cat.cry();
meow

It does what we expect. But it does not offer any subclass polymorphism. There’s no means to inherit Cat’s behavior.

No Subtype Polymorphism Unless…

Object oriented mind graph showing it connects polymorphism, abstraction, encapsulation, classes, and objects with inheritance crossed out, which rust doesn’t support.

If we want subtype polymorphism, we’re going to have to implement it ourselves. Our trait probably needs to know the base type. We can make our trait Animal generic with respect to a Base type that is some other kind of Animal like so:

trait Animal {
  type Base : Animal;
  fn cry(&self);
  // ...
}

And we can also define default implementations of trait functions. That seems like a good place to put our dynamic dispatcher functionality, doesn’t it? But we need to be able to get an instance of the base “class” or struct, so we’ll add a self.base() function.

trait Animal {
  type Base : Animal;
  fn base(&self) -> &Self::Base;
  fn cry(&self) {
    self.base().cry();
  }
  fn sleep(&self) {
    self.base().sleep();
  }
  fn roam(&self) {
    self.base().roam();
  }
}

Now we can implement only part of Animal for Cat.

struct Cat {}

impl Animal for Cat {
  type Base = Cat;
  fn base(&self) -> &Cat {
    &Cat {} // My base type is... myself?
  }
  fn cry(&self) {
    println!("meow");
  }
}

If we call cry(), it works.

let cat = Cat {};
cat.cry();
meow

But calling the unimplemented roam() will cause a stack overflow.

let cat = Cat {};
cat.roam();

thread 'main' has overflowed its stack
fatal runtime error: stack overflow

The reason that happens is because we infinitely recurse down the base type. We call self.base().base().base()... when there’s no default implementation. We can define a base type that implements all the functions to prevent infinite recursion.

Base Animal type

We can rectify that problem by defining a base animal type that implements all the functions. This is the base case for our recursive dynamic dispatcher.

struct BaseAnimal {}

impl Animal for BaseAnimal {
  type Base = BaseAnimal;
  fn base(&self) -> &BaseAnimal {
    panic!("Never call me here.");
  }
  fn cry(&self) { 
    println!("base cry");
  }
  fn sleep(&self) { 
    println!("base sleep");
  }
  fn roam(&self) { 
    println!("base roam");
  }
}
struct Cat {}

impl Animal for Cat {
  type Base = BaseAnimal;
  fn base(&self) -> &BaseAnimal {
    &BaseAnimal {}
  }
  fn cry(&self) {
    println!("meow");
  }
}

Now we can call roam() and we won’t get an infinite loop.

let cat = Cat {};
cat.roam();
base roam

An Optional Base Type

Instead of requiring a base struct for every object oriented trait, we could change our base() signature to use an Option<>.

trait Animal {
  type Base : Animal;
  fn base(&self) -> Option<&Self::Base> {
    None
  }
  fn cry(&self) {
    if let Some(b) = self.base() {
      b.cry();
    } else {
      todo!("Not yet implemented");
    }
  }
  // ...
}

How to Extend the Class’s API?

One further issue is what happens when a trait is extended? This happens all the time with object oriented code. A subclass adds fields and new public methods. Adding fields is easy in this setup:

Bat {
  speed: f32,
  base_obj: Rat
}

Adding public “methods” that also do subtype polymorphism is less clear.

trait EndangeredAnimal : Animal {
  fn is_endangered(&self) -> bool;
}

I’m going to leave this as an exercise for the reader partly because I think it’s going to be a mess but it is doable.

How do Crustaceans Actually do This?

Let’s go back to our second version of this trait.

trait Animal {
  fn cry(&self);
  fn sleep(&self);
  fn roam(&self);
}
struct Cat {} 

impl Animal for Cat {
  fn cry(&self) {
    println!("meow");
  }
  fn sleep(&self) {
    println!("Zzzz");
  }
  fn roam(&self) {
    println!("patter patter pounce");
  }
}

struct Bird {} 

impl Animal for Bird {
  fn cry(&self) {
    println!("chirp");
  }
  fn sleep(&self) {
    println!("Zzzz");
  }
  fn roam(&self) {
    println!("flit flit flurp");
  }
}

We’ve got two struct and we can use generics to call whichever implementation is present.

fn cry_to_sleep<T: Animal>(animal: T) {
  animal.cry();
  animal.cry();
  animal.sleep();
}
let cat = Cat {};
cry_to_sleep(cat);
meow
meow
Zzzz

But what if I don’t know the type exactly?

Suppose you have a heterogenous list of Animals and you don’t know their exact type any more. How do you deal with that? You can’t use the generic solution.

Ok, here’s the really cool thing that rust does.

fn cry_to_sleep_dynamic(animal: &dyn Animal) {
  animal.cry();
  animal.cry();
  animal.sleep();
}

Rather than having a structural cost baked into every object that is capable of polymorphism with a virtual function look up, rust has virtual function look up on demand via its dyn keyword. When you use dyn you’re passing and receiving a “fat” pointer, which is two pointers: one to the regular struct, and another to a virtual function look up table.

So you can call this method on bird

let bird = Bird {};
cry_to_sleep_dynamic(&bird);
chirp
chirp
Zzzz

or on cat.

let cat = Cat {};
cry_to_sleep_dynamic(&cat);
meow
meow
Zzzz

So you can have dynamic polymorphism in rust whenever and with whatever trait you like, but the funny thing is if one has a choice between direct static dispatch or an indirect dynamic dispatch, one often prefers static dispatch.

Static dispatch is known at compile time. It’s faster. It’s simpler to reason about. Use it if you can. I have found that when dynamic dispatch is available on a per call instead of a per class basis, it’s amazing how little I actually use it. But dynamic dispatch is there for whatever trait1 you need it on and you don’t have to rely on the original author knowing which methods you really needed to be marked “virtual”.

Acknowledgments

Thanks to CandyCorvid and kohugaly at r/rust for their input on this article’s thread.

1

Turns out only certain traits can use dyn; only traits that are object safe: roughly no generics, no constants, and accepts a &self, &mut self, Box\<Self\>, or other smart pointer to self.


2330 Words

2023-05-07