Expose a Rust Library to Other Languages

Posted on January 13, 2021 by Olivier Goffart and Simon Hausmann

With SixtyFPS, we are creating a GUI toolkit. We chose Rust as the implementation language for our runtime library, and we want to make the same library usable from different programming languages. We believe programmers in all languages need to build GUIs, powered by the same runtime library. Rust, with its Foreign Function Interface (FFI), is an excellent choice.
In this article we look at how to expose an idiomatic C++ API from our Rust library.

The Challenge

Initially we chose to start with support for three languages:

  • Rust: Because it's our implementation language.
  • C++: It's a low-level language that we're familiar with, and it is still one of the most established languages in the embedded device space.
  • JavaScript / TypeScript: Because it's a very popular dynamic language.

The Rust library (also known as a crate) is split into two parts, the shared implementation crate and a thin idiomatic API crate.

For JavaScript we use Neon to expose an API. Neon enables us to conveniently write JavaScript APIs and create an NPM package.

The C++ part is a bit more challenging.

Expose an Idiomatic C++ API through Rust FFI

We decided to keep the C++ API only in the header files. This is because, unlike with Rust, there's no widely adopted C++ equivalent of Cargo to help with downloading and building dependencies. If we wanted to ship C++ binaries, we would have to maintain ABI compatibility, which is difficult in C++.
This way, we can also keep the C++ binding as lightweight as possible: for performance and memory footprint.

Rust cannot expose a C++ API: structures can only be exported using a C representation (#[repr(C)]) and extern "C" functions. This means that we cannot expose Rust features like traits, generics, or destructors even if they have a C++ equivalent.
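As a minimal sketch of what this restriction leaves us with, the snippet below exports one plain-data structure and one function through the C ABI. The names Point and point_translate are hypothetical and exist only for this illustration; they are not part of SixtyFPS.

```rust
/// A plain-data struct with a C-compatible layout.
/// `Point` is a hypothetical type used only for this illustration.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Point {
    pub x: f32,
    pub y: f32,
}

/// Exported with an unmangled name and the C calling convention,
/// so a C or C++ caller can link against it directly.
#[no_mangle]
pub extern "C" fn point_translate(p: Point, dx: f32, dy: f32) -> Point {
    Point { x: p.x + dx, y: p.y + dy }
}
```

Point can be passed by value here because it has no destructor and no generics. Anything richer than that needs the techniques described in the rest of this article.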

The Rust ecosystem provides a few helper crates to make the job easier:

  • cbindgen: This helper crate automatically generates C/C++ header files based on the repr(C) structure and the extern "C" functions. We use cbindgen to generate internal header files only. It's very helpful to avoid manually writing some unsafe error-prone boilerplate.
  • The cpp crate: This helper crate is useful when calling C++ libraries from Rust, and we make use of that. However it is not suitable for exposing a C++ API from Rust. (Note: Olivier Goffart happens to be the maintainer.)
  • The cxx crate: This would be a safer way than cbindgen to ensure that the interface between C++ and Rust is correct. While it could be useful, cbindgen already gets us a long way, so we don't use it for now.

To build the correct shared library we use Cargo. The resulting library exports C mangled symbols. We ship a set of C++ header files that provide the C++ API and use the C functions behind the scenes. For convenience, we provide a CMake integration that ties together the library linkage and the include path setup.

Slices, Vectors, and Strings

In FFI, passing a basic integer works out of the box. But what about more complex data types, like a Rust slice or a string? Well, most classes like Rust's String, Vec, or slices are not #[repr(C)], so we can't use them directly. While we could use these classes with an indirection, every simple call may need to go through a non-inline function boundary. So we would need to convert types, which means re-allocating memory.

So instead of sharing code, we implemented data structures using #[repr(C)] and a stable ABI, so that they can be accessed directly from C++ and Rust, or any low-level language.

For the slice we create a structure that holds a pointer and a size:

#[repr(C)]
pub struct Slice<'a, T> {
    ptr: NonNull<T>,
    len: usize,
    phantom: PhantomData<&'a [T]>,
}

Slice<'a, T> can be dereferenced to &'a [T]. In C++, cbindgen generates the following snippet:

template<typename T>
struct Slice {
    T *ptr;
    uintptr_t len;
};

We tell cbindgen to generate that code in a cbindgen_private namespace, and we wrap it in an interface similar to std::span.
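On the Rust side, the dereferencing mentioned above could look roughly like the following sketch. The From conversion and the as_slice helper are assumptions made for this illustration and may differ from the actual SixtyFPS code.

```rust
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;

#[repr(C)]
pub struct Slice<'a, T> {
    ptr: NonNull<T>,
    len: usize,
    phantom: PhantomData<&'a [T]>,
}

impl<'a, T> From<&'a [T]> for Slice<'a, T> {
    fn from(slice: &'a [T]) -> Self {
        Slice {
            // A slice pointer is never null, even for an empty slice.
            ptr: NonNull::from(slice).cast::<T>(),
            len: slice.len(),
            phantom: PhantomData,
        }
    }
}

impl<'a, T> Slice<'a, T> {
    /// Hypothetical helper: returns the slice with its original lifetime.
    pub fn as_slice(&self) -> &'a [T] {
        // Safety: `ptr` and `len` describe a valid `&'a [T]`. When the value
        // is built on the C++ side, that side has to uphold this invariant.
        unsafe { std::slice::from_raw_parts(self.ptr.as_ptr(), self.len) }
    }
}

impl<'a, T> Deref for Slice<'a, T> {
    type Target = [T];
    fn deref(&self) -> &[T] {
        self.as_slice()
    }
}
```

With Deref in place, all the usual slice methods such as len() and iter() are available on Slice without any copying.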

We use strings and vectors to pass data between the engine and the user's code. This results in shared ownership where we want to avoid unnecessary copying of data. Our API is property based with setters and getters, therefore we implement shared ownership through Implicit sharing / Copy-on-write.

#[repr(C)]
struct SharedVectorHeader {
    refcount: atomic::AtomicIsize,
    size: usize,
    capacity: usize,
}

#[repr(C)]
pub struct SharedVector<T> {
    inner: NonNull<SharedVectorInner<T>>,
}


/// These functions are called from the C++ constructor
/// and destructor
#[no_mangle]
pub unsafe extern "C" fn sixtyfps_shared_vector_allocate(
    size: usize, align: usize) -> *mut u8 { /*...*/ }
#[no_mangle]
pub unsafe extern "C" fn sixtyfps_shared_vector_free(
    ptr: *mut u8, size: usize, align: usize) { /*...*/ }

In Rust, the impl Clone and impl Drop make sure to increment and decrement the atomic reference count and call the destructors. Similarly, in C++, we implement a copy constructor and a destructor for the same purpose. Note that we still need to call the Rust allocator function via the exposed C interface.
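To make the reference counting on the Rust side more concrete, here is a simplified, read-only sketch. The Shared type, the layout_for helper and the from_slice constructor are hypothetical stand-ins and not the actual SharedVector implementation. Copy-on-write and growth are left out.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::marker::PhantomData;
use std::ptr::{self, NonNull};
use std::sync::atomic::{fence, AtomicIsize, Ordering};

#[repr(C)]
struct Header {
    refcount: AtomicIsize,
    size: usize,
    capacity: usize,
}

/// Hypothetical, read-only stand-in for `SharedVector<T>`.
pub struct Shared<T> {
    inner: NonNull<Header>,
    phantom: PhantomData<T>,
}

/// Layout of a header followed by `capacity` elements, and the data offset.
fn layout_for<T>(capacity: usize) -> (Layout, usize) {
    let elements = Layout::array::<T>(capacity).expect("capacity overflow");
    let (layout, offset) = Layout::new::<Header>().extend(elements).expect("capacity overflow");
    (layout.pad_to_align(), offset)
}

impl<T> Shared<T> {
    fn header(&self) -> &Header {
        // Safety: `inner` points to a live header while any handle exists.
        unsafe { self.inner.as_ref() }
    }

    fn data(&self) -> *mut T {
        let (_, offset) = layout_for::<T>(self.header().capacity);
        // Safety: the offset stays inside the allocation made in `from_slice`.
        unsafe { (self.inner.as_ptr() as *mut u8).add(offset) as *mut T }
    }

    pub fn refcount(&self) -> isize {
        self.header().refcount.load(Ordering::Relaxed)
    }

    pub fn as_slice(&self) -> &[T] {
        // Safety: the first `size` elements were initialized in `from_slice`.
        unsafe { std::slice::from_raw_parts(self.data(), self.header().size) }
    }
}

impl<T: Clone> Shared<T> {
    pub fn from_slice(items: &[T]) -> Self {
        let (layout, offset) = layout_for::<T>(items.len());
        // Safety: the layout has a non-zero size because of the header.
        unsafe {
            let raw = alloc(layout);
            if raw.is_null() {
                handle_alloc_error(layout);
            }
            let header = raw as *mut Header;
            header.write(Header { refcount: AtomicIsize::new(1), size: 0, capacity: items.len() });
            let data = raw.add(offset) as *mut T;
            for (i, item) in items.iter().enumerate() {
                data.add(i).write(item.clone());
                (*header).size = i + 1;
            }
            Shared { inner: NonNull::new_unchecked(header), phantom: PhantomData }
        }
    }
}

impl<T> Clone for Shared<T> {
    fn clone(&self) -> Self {
        // Sharing is cheap: only the counter changes, no element is copied.
        self.header().refcount.fetch_add(1, Ordering::Relaxed);
        Shared { inner: self.inner, phantom: PhantomData }
    }
}

impl<T> Drop for Shared<T> {
    fn drop(&mut self) {
        if self.header().refcount.fetch_sub(1, Ordering::Release) != 1 {
            return;
        }
        fence(Ordering::Acquire);
        let (layout, _) = layout_for::<T>(self.header().capacity);
        // Safety: this was the last handle, so nobody else can observe the data.
        unsafe {
            ptr::drop_in_place(ptr::slice_from_raw_parts_mut(self.data(), self.header().size));
            dealloc(self.inner.as_ptr() as *mut u8, layout);
        }
    }
}
```

The last handle to go away runs the element destructors and releases the memory with the same allocator that created it, which is the same order of operations as in the C++ destructor.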

Now we can write a wrapper in C++: (full file)

template<typename T> struct SharedVector {
  SharedVector() : inner(nullptr) {}

  SharedVector(const SharedVector &other)
    : inner(other.inner)
  { if (inner) ++inner->refcount; }

  ~SharedVector() {
     if (inner && (--inner->refcount) == 0) {
        for (auto it = begin(); it < end(); ++it)
            it->~T();
        cbindgen_private::sixtyfps_shared_vector_free(
            reinterpret_cast<uint8_t *>(inner),
            sizeof(SharedVectorHeader)
                + inner->capacity * sizeof(T),
            alignof(SharedVectorHeader));
     }
  }
  SharedVector &operator=(const SharedVector &other)
  { /*...*/ }

  const T *begin() const { /* ... */ }
  const T *end() const { /* ... */ }
  void push_back(const T &value) { /* ... */ }
  // ... more vector-like API

private:
  // (SharedVectorHeader is generated by cbindgen)
  cbindgen_private::SharedVectorHeader *inner;
};

Right now these types, such as SharedVector and SharedString, are within the internal sixtyfps-corelib crate, and re-exported for Rust users through the public sixtyfps crate. If there is demand for it, we may consider moving them into a smaller public crate with its own release schedule.

Destructors

It's important to note that SharedVector and SharedString have destructors in C++. We can't pass instances by value in extern "C" functions: the C++ calling convention differs for arguments and return types that have destructors, and C has no equivalent. Therefore we can only pass them by pointer or reference.

If we want to add a C++ destructor, constructor, or any member functions to types directly exported by cbindgen to our public API, we use cbindgen::ExportConfig::body:

cbindgen_config.export.body.insert(
    "MyStruct".to_owned(),
    "    inline MyStruct(); inline ~MyStruct();".to_owned()
  );

Then we implement MyStruct::MyStruct and MyStruct::~MyStruct in a manually written header file, by either doing the memory management directly or calling C helper functions implemented in Rust.

It's important to keep in mind that anything allocated from Rust needs to be freed by Rust. The same applies to allocations in C++: they might not share the same allocator.
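A common way to respect this rule is to export a matching pair of functions, so that the side that allocates is also the side that frees. The following is a minimal sketch; Counter and the counter_* functions are hypothetical names for this illustration, not part of our API.

```rust
/// Hypothetical opaque type that is owned by the Rust side.
pub struct Counter {
    value: i32,
}

/// Allocates a `Counter` with the Rust allocator and hands out a raw pointer.
#[no_mangle]
pub extern "C" fn counter_new(initial: i32) -> *mut Counter {
    Box::into_raw(Box::new(Counter { value: initial }))
}

/// Increments the counter and returns the new value.
///
/// # Safety
/// `counter` must come from `counter_new` and must not have been freed.
#[no_mangle]
pub unsafe extern "C" fn counter_increment(counter: *mut Counter) -> i32 {
    let counter = unsafe { &mut *counter };
    counter.value += 1;
    counter.value
}

/// Gives the allocation back to Rust, which frees it with the same allocator.
///
/// # Safety
/// `counter` must be null or come from `counter_new`, and must not be used again.
#[no_mangle]
pub unsafe extern "C" fn counter_free(counter: *mut Counter) {
    if !counter.is_null() {
        drop(unsafe { Box::from_raw(counter) });
    }
}
```

A C++ wrapper class would then call counter_new in its constructor and counter_free in its destructor, instead of using new and delete on memory it does not own.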

Dynamic Dispatch (virtual table) Across the Language Barrier

Let's start with the classic example of dynamic dispatch in Rust:

pub trait Animal {
  fn speak(&self, loudness: i32) -> String;
}
struct Dog { name: String }
impl Animal for Dog {
  fn speak(&self, loudness: i32) -> String
  { "Waf!".into() }
}
#[no_mangle]
pub extern "C" fn do_something_with(
  animal: &dyn Animal
) {
  println!("{}", animal.speak(1));
}

Unfortunately the above code does not work. How could we implement a class Cat in C++ and call the do_something_with function? What if we wanted to implement do_something_with in C++? The problem is that trait objects (&dyn) are not valid in FFI - their binary representation is not guaranteed to be stable. If we try to compile the above code, we get this warning:

warning: `extern` fn uses type `dyn Animal`, which is not FFI-safe
  | extern "C" fn do_something_with(animal: &dyn Animal)
  |                                         ^^^^^^^^^^^ not FFI-safe
  = note: `#[warn(improper_ctypes_definitions)]` on by default
  = note: trait objects have no C equivalent

Internally, we know that a trait object is composed of a pointer to the instance, and a pointer to a virtual table containing pointers to functions. The layout of this trait object (which pointer comes first) and the layout of the virtual table is an implementation detail of Rust. So we decided to re-implement them to work across FFI. Instead of writing a trait Animal, we write a virtual table by hand:

#[repr(C)]
pub struct AnimalVTable {
    speak: extern "C" fn(
        VRef<AnimalVTable>, i32, &mut SharedString),
}

In this case, our virtual table has only one function. It is #[repr(C)] so that cbindgen can generate a structure that the C++ code can access. Since we can't use String, we changed the return type to SharedString. We also pass the result through a mutable reference instead of returning it, because an extern "C" function cannot return by value a type that has a destructor.
Instead of passing a trait object, our functions receive a pointer to the virtual table and a pointer to the instance, which we will wrap in a structure called VRef:

#[repr(C)]
pub struct VRef<'a, VTable> {
    vtable: *const VTable,
    ptr: *const c_void,
    phantom: PhantomData<&'a ()>,
}

With a bit of boilerplate, we can implement Deref on VRef so that it adds the function speak to it. We can also generate an Animal trait, and, with a macro, create an AnimalVTable for any structure that implements that trait.
We provide a vtable crate to annotate our AnimalVTable with the #[vtable] macro that generates the boilerplate. For a complete list of features and examples refer to the crate documentation.
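To show what that boilerplate amounts to, here is a hand-written sketch that does not use the vtable crate. To stay self-contained it writes a plain i32 through the out-parameter instead of a SharedString. Dog's volume field, dog_speak, DOG_VTABLE and dog_as_animal are hypothetical names for this illustration; the code generated by the macro looks different.

```rust
use std::ffi::c_void;
use std::marker::PhantomData;

#[repr(C)]
pub struct AnimalVTable {
    /// Writes the result through the out-parameter, as described above.
    speak: extern "C" fn(VRef<AnimalVTable>, i32, &mut i32),
}

#[repr(C)]
pub struct VRef<'a, VTable> {
    vtable: *const VTable,
    ptr: *const c_void,
    phantom: PhantomData<&'a ()>,
}

pub struct Dog {
    pub volume: i32,
}

/// Thunk stored in the vtable: recovers the concrete type from the raw pointer.
extern "C" fn dog_speak(this: VRef<AnimalVTable>, loudness: i32, out: &mut i32) {
    // Safety: this thunk is only reachable through DOG_VTABLE,
    // which `dog_as_animal` only ever pairs with a `Dog`.
    let dog = unsafe { &*(this.ptr as *const Dog) };
    *out = dog.volume * loudness;
}

static DOG_VTABLE: AnimalVTable = AnimalVTable { speak: dog_speak };

/// Pairs an instance pointer with the matching vtable.
pub fn dog_as_animal(dog: &Dog) -> VRef<'_, AnimalVTable> {
    VRef {
        vtable: &DOG_VTABLE,
        ptr: dog as *const Dog as *const c_void,
        phantom: PhantomData,
    }
}

/// Works with any animal, regardless of the language it was implemented in.
pub extern "C" fn do_something_with(animal: VRef<AnimalVTable>, loudness: i32) -> i32 {
    let mut out = 0;
    // Safety: a `VRef` always carries a valid vtable pointer.
    let speak = unsafe { (*animal.vtable).speak };
    speak(animal, loudness, &mut out);
    out
}
```

A Cat implemented in C++ would fill in its own AnimalVTable with a function that has the same signature, and do_something_with would not be able to tell the difference.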

Rc/Arc

When writing an API that plays with objects, we may want to hold these objects in reference counted structures. It would be nice to be able to pass Rc<Dog> or Rc<dyn Animal> around. Once again we're facing the issue that Rc is not #[repr(C)]. Similar to what we did with String, Vec, and the others, we re-implemented Rc with a stable binary representation: VRc.

Since we needed dynamic dispatch as well, we did that in the vtable crate:

#[repr(C)]
struct VRcInner<VTable, X> {
    vtable: *const VTable,
    strong_ref: Cell<u32>,
    weak_ref: Cell<u32>,
    /// offset to the data from the beginning of VRcInner.
    data_offset: u16,
    data: X,
}
#[repr(C)]
pub struct VRc<VTable, X = Dyn> {
    inner: NonNull<VRcInner<VTable, X>>,
}
/// This is a marker type to be used in `VRc` and `VWeak`
/// to mean that the actual type is not known.
pub struct Dyn(());

CMake

Cargo builds the cdylib crate for us and produces the correct shared library. C++ users should not be required to invoke cargo themselves. There are many different C++ build systems. We decided to support CMake first -- it is the most popular one.

Corrosion is a nice project that helps us to invoke Cargo automatically.
Behind the scenes, Corrosion runs cargo metadata to find out what is generated as a library or an executable. Corrosion then provides CMake targets that invoke cargo build.

Below is an excerpt of our CMakeLists.txt, edited for context. For the complete version, see the original file.

# Expose the crates as cmake target
corrosion_import_crate(
  MANIFEST_PATH "${CMAKE_CURRENT_SOURCE_DIR}/Cargo.toml"
  CRATES sixtyfps-core xtask
)

# Create a target which we will export
add_library(SixtyFPS INTERFACE)
target_link_libraries(SixtyFPS INTERFACE sixtyfps-core)

# Generate headers with cbindgen
set(generated_headers
  ${CMAKE_CURRENT_BINARY_DIR}/gen/sixtyfps_internal.h
)
file(GLOB generated_headers_dependencies
   "${CMAKE_CURRENT_SOURCE_DIR}/src/*.rs")
add_custom_target(
  generated_headers_target
  COMMAND xtask cbindgen --out
        "${CMAKE_CURRENT_BINARY_DIR}/gen/"
  BYPRODUCTS ${generated_headers}
  DEPENDS ${generated_headers_dependencies}
)
add_dependencies(SixtyFPS generated_headers_target)

# This will install the headers
set_property(TARGET SixtyFPS PROPERTY
  PUBLIC_HEADER include/sixtyfps.h ${generated_headers}
)

# export and install the SixtyFPS target so it can be used
export(TARGETS SixtyFPS sixtyfps-core
  #...
)
install( #... )

sixtyfps-core is the name of the cdylib Rust crate. We also use the "xtask" pattern to invoke cbindgen; a concept borrowed from rust-analyzer.

This turns out to be really easy to use! With Rust and Cargo installed, SixtyFPS can be used from C++ with only a few lines of CMake code. Our previous tutorial blog post demonstrates this:

include(FetchContent)
FetchContent_Declare(SixtyFPS
    GIT_REPOSITORY https://github.com/sixtyfpsui/sixtyfps
    SOURCE_SUBDIR api/sixtyfps-cpp
)
FetchContent_MakeAvailable(SixtyFPS)
add_executable(my_app main.cpp)
target_link_libraries(my_app PRIVATE SixtyFPS::SixtyFPS)

Here, we use the FetchContent CMake module, which downloads and builds SixtyFPS from our git repository. In future, we plan to offer binary packages for C++ users as well, which can be integrated using find_package(), eliminating the need to install Cargo and Rust.

Possible Improvements

Although we are very satisfied with how the cross-language support turned out, there is always room for improvement:

  • How to publish C++ headers for crates on crates.io?:
    If a crate that exposes a C++ interface is fetched from crates.io, its C++ headers - possibly including the cbindgen generated ones - should be included in a way that CMake can find them. We don't have this problem right now because all such crates are included in our git repository. In general it would be nice if Cargo supported the distribution of extra crate assets.
    The cxx crate, for example, embeds its cxx.h file in its cxxbridge binary utility with include_str!, but not every library has a utility like that.
    If the header files are placed within the crate's source, they are installed in a path like ~/.cargo/registry/src/github.com-1ecc6299db9ec823/vtable-0.1.1/include/vtable.h, which is hard to locate from CMake and other tools.
  • It would be nice if trait objects and virtual tables could be configured to have a stable layout. This would simplify the vtable macro a lot. For example RFC2955 would have helped.
  • It would help if more of the trivial data structures such as slices could have a stable #[repr(C)] representation. Similarly a feature like #[repr(C++)] or extern "C++" fn, to have the correct calling convention when passing or returning objects that have non-trivial special member functions, would simplify the language bindings.
  • More controversial would be a stable binary layout of some types from the Rust std library, such as String, Vec, etc. This was even previously considered for Rc.

Conclusion

We're glad to have found a solution to the cross-language library and build system task that works for us. However we recognize that this comes with a considerable amount of complexity that would be nice to encapsulate in a way that other projects can benefit with less effort. We hope that you'll find this article and our helper vtable crate useful.