Some thoughts on Rust

June 28, 2023

This is a bunch of my ramblings I just wanted to put out into the world because 1. it’s my blog and what else am I going to use it for (apart from stories of pain that is) and 2. I can link it to people rather than constantly explaining what I mean when I say “I think rust is great but”.

I’m not going to be constantly contrasting against C++ or lamenting how language X is so much better, I love Rust and as someone with a bit of experience in both C++ and Rust I can appreciate where Rust learned its lessons and where C++ just has decades more ecosystem and language maturity (for good or ill).

Essentially, this is me putting down a marker. This is how I felt about Rust halfway through 2023 (I started writing this ~~a month~~ ~~two~~ three months ago ok, 99% of the content is from May-June, I’ve just been too busy to edit it for publication).

Language Pain Points

Alright, let’s go through some of the pain points I’ve personally encountered. This list is not exhaustive; generally the issues that made the cut are ones that have been open for multiple years (in some cases pre-1.0, i.e. close to a decade), and are fundamental issues with the language that are difficult or impossible to work around with libraries.

Derive bounds are too strict (2015)

https://github.com/rust-lang/rust/issues/26925

This is, straight up, just a bug in the standard library. #[derive] works for types which are covariant in all their generic arguments with respect to implementing the given trait, e.g. types like Vec and Box. That is:

Co<T>: PartialEq <=> T: PartialEq

However, if you have a type that is invariant on its generics, i.e. a type which implements PartialEq no matter the constraints on T, then clearly you should still be able to #[derive(PartialEq)]… but you can’t. This is because #[derive(Z)] struct Y<X> adds the requirement that X: Z on the trait impl, even if it isn’t needed.

Anyway with that hefty explanation out of the way, surely this is a simple fix? Well… no. Because it would both break backwards compatibility, and because nobody can agree how it should be done better 🤦‍♀️. Use something like impl_tools for now I guess.
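To make the problem concrete, here is a minimal sketch using Clone instead of PartialEq (the mechanism is the same). The names NotClone, Derived and Manual are made up for illustration. Rc<T> is Clone for every T, so the T: Clone bound that derive adds is simply not needed:

```rust
use std::rc::Rc;

// Hypothetical payload type that deliberately does not implement Clone.
struct NotClone(u32);

// derive generates `impl<T: Clone> Clone for Derived<T>`, so
// `Derived<NotClone>` is not Clone even though `Rc<NotClone>` is.
#[allow(dead_code)]
#[derive(Clone)]
struct Derived<T>(Rc<T>);

// Hand-written impl with no bound on T: cloning an Rc never clones the T.
struct Manual<T>(Rc<T>);

impl<T> Clone for Manual<T> {
    fn clone(&self) -> Self {
        Manual(Rc::clone(&self.0))
    }
}

// Clones a `Manual<NotClone>` and returns (payload, number of live Rc handles).
fn clone_manual() -> (u32, usize) {
    let a = Manual(Rc::new(NotClone(7)));
    let b = a.clone();
    // Derived(Rc::new(NotClone(7))).clone(); // rejected: `NotClone: Clone` is not satisfied
    ((b.0).0, Rc::strong_count(&a.0))
}
```

Writing the impl by hand is exactly the boilerplate derive is supposed to save you from, which is why crates like impl_tools exist.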

Variadic Generics (2013)

rfc: https://github.com/rust-lang/rfcs/issues/376 (2014)

original issue: https://github.com/rust-lang/rust/issues/10124 (2013)

This is the oldest issue I’m going to talk about, unfortunately it is also one of the sharpest corners in Rust. Bold claim I know, but seriously, this issue being unsolved infects a huge amount of code and yet only becomes a problem when you are trying to write a few specific APIs. There are 2 main problems caused by this:

1. Closures and Fn* in general are implemented as a pile of hacks. I mean, it uses a special extern directive, an unstable Tuple trait and compiler magic to turn a signature that takes 1 argument (a tuple) into one that takes the elements of the tuple instead, for god’s sake.
2. Tuple manipulation.

Look at the generic closures point on this list for more information on why (1) is an issue.

For (2), Rust is not alone in tuples being a pain to manipulate. Haskell also has this problem, and it’s generally “solved” by using template haskell to generate every possible definition of the tuple code (GHC caps tuple lengths somewhere between 60 and 70). In contrast however, in C++, you can do such crazy things as map (with a generic function) over a tuple. You can also define N-ary curry and uncurry in C++, which would be another nice feature for Rust to support and might help a little with the mess of the Fn* traits, even if they didn’t want to support variadics directly.

The tl;dr of this issue is that it doesn’t do much on its own, but its absence causes a lot of issues when implementing other features.
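For a taste of what the tuple problem looks like in practice, here is a rough sketch of the usual workaround: a macro that stamps out one impl per tuple arity. The DebugAll trait and the impl_debug_all macro are hypothetical names invented for this example, and only arities 1 to 3 are covered.

```rust
use std::fmt::Debug;

// Hypothetical trait: format every element of a tuple with Debug.
trait DebugAll {
    fn debug_all(&self) -> Vec<String>;
}

// Without variadic generics there is no way to say "for any tuple",
// so each arity gets its own generated impl.
macro_rules! impl_debug_all {
    ($($name:ident : $idx:tt),+) => {
        impl<$($name: Debug),+> DebugAll for ($($name,)+) {
            fn debug_all(&self) -> Vec<String> {
                vec![$(format!("{:?}", self.$idx)),+]
            }
        }
    };
}

impl_debug_all!(A: 0);
impl_debug_all!(A: 0, B: 1);
impl_debug_all!(A: 0, B: 1, C: 2);
```

Calling `(1, "a", 2.5).debug_all()` works, but a 4-tuple has no impl until somebody adds another macro invocation, which is the same "generate every definition" approach described above for Haskell.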

Generic closures (2016)

https://github.com/rust-lang/rfcs/pull/1650

This one is closed until the “new trait system” is implemented. That was back in 2017, the last comment is from 2018. I actually tried to implement this as a crate, and it is technically possible within Rust’s current system - you do need unstable to impl Fn though. The issue I ran into was type inference for the captures. If you are OK with syntax such as:

let val = 2;
let func = lambs::once!{ [val: u32]|| -> u32 { val + 1 } };

Then it’s doable as a library (but obviously not ideal).
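If you don’t need the real Fn* traits, the stable-Rust fallback is to write the "closure" out by hand as a struct with a generic method. This is only a sketch of that idea; Labeller and make_labeller are hypothetical names and have nothing to do with the crate mentioned above.

```rust
use std::fmt::Debug;

// Plays the role of the closure's captured environment.
struct Labeller {
    prefix: String,
}

impl Labeller {
    // The "call operator": generic over T, which a real closure cannot be.
    fn call<T: Debug>(&self, value: T) -> String {
        format!("{}{:?}", self.prefix, value)
    }
}

// Captures are spelled out explicitly instead of being inferred.
fn make_labeller(prefix: &str) -> Labeller {
    Labeller { prefix: prefix.to_string() }
}
```

The same value can then be called with an i32 and a &str, e.g. `l.call(1)` and `l.call("x")`, but you lose capture inference and the ability to pass it anywhere an Fn is expected.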

Impl trait in type aliases (2019)

https://github.com/rust-lang/rust/issues/63063

This one is a real pain point for me. In my case at least this is actually a performance issue as I am faced with the choice of either threading impl X through every single function my type is used in, or using Box. This is an even bigger issue when I have some P<impl X> where adding another generic arg requires updating potentially dozens of functions. Realistically what this means is that I simply use Box<dyn X> rather than making it generic. Even worse, since being object safe means no associated type aliases or generic functions, the Boxness propagates.

There are trade-offs to be considered when using generics vs runtime polymorphism, but this isn’t a case of making these trade-offs. This is a case of “I cannot write the more performant thing because it is too brittle and too verbose”, which is tolerable on a small scale, like inside a function (e.g. for inline asm), but here we are talking about the APIs of entire libraries or building-block types.
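A small sketch of the choice described above, using a made-up Parser type that holds a closure. In the generic version the closure type cannot be named, so F ends up in every signature; in the boxed version there is one nameable type, paid for with an allocation and dynamic dispatch.

```rust
// Generic version: F is an unnameable closure type, so it has to be
// threaded through every signature that mentions Parser.
struct Parser<F: Fn(&str) -> Option<u32>> {
    rule: F,
}

fn digits() -> Parser<impl Fn(&str) -> Option<u32>> {
    Parser { rule: |s: &str| -> Option<u32> { s.parse().ok() } }
}

fn run<F: Fn(&str) -> Option<u32>>(p: &Parser<F>, input: &str) -> Option<u32> {
    (p.rule)(input)
}

// Boxed version: a single concrete type that can appear in a plain type alias.
struct DynParser {
    rule: Box<dyn Fn(&str) -> Option<u32>>,
}

fn dyn_digits() -> DynParser {
    DynParser { rule: Box::new(|s: &str| -> Option<u32> { s.parse().ok() }) }
}

fn run_dyn(p: &DynParser, input: &str) -> Option<u32> {
    (p.rule)(input)
}
```

As far as I understand it, the feature tracked in the issue above (type_alias_impl_trait on nightly) would let you give the generic version a name with something like `type Digits = impl Fn(&str) -> Option<u32>;`, which removes the need to choose.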

Direct heap allocation (2018)

https://github.com/rust-lang/rust/issues/53827

Oh yeah this one’s fun. Currently, if you want to allocate something in Rust, even if it is placed directly into a Box (i.e. it’s on the heap), you still have to allocate it on the stack and then copy it onto the heap. This obviously is a big problem if the entire reason your type needs to be on the heap is because it will blow whatever stack frame you try and allocate it in. There are optimisations to elide the stack allocation when passing directly to Box::new, e.g. Box::new([0u8; 10000000000000]) may be allocated directly on the heap, emphasis on *may* - optimisations are not guaranteed and thus cannot be relied upon for correctness of code.

On the surface, this is somewhat solvable in current Rust:

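use std::alloc::{alloc, Layout};

// Allocates uninitialised heap memory for a `T` without going via the stack.
// The caller must initialise it before use and free it afterwards.
// Note: `alloc` must not be called with a zero-sized layout.
pub fn alloc_heap<T>() -> *mut T {
    let layout = Layout::new::<T>();
    unsafe { alloc(layout) as *mut T }
}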

The problem here is that it is wildly unsafe both from a memory leak, and UB perspective. The caller of this function has to initialise the returned pointer before use, and they better do it quick because if they don’t free the memory then it gets leaked. We also can’t fix this by having the user pass a lambda to initialise the memory or something, because that would require returning the value and we are back to blowing the stack.

We could of course write a macro, maybe something like:

macro_rules! alloc_heap {
    ($tp:ty, $init:expr) => {{
        let layout = std::alloc::Layout::new::<$tp>();
        // Assumes `$tp` is not zero-sized and that the allocation succeeds.
        let mem = unsafe { std::alloc::alloc(layout) } as *mut $tp;
        unsafe { mem.write($init); }
        unsafe { Box::from_raw(mem) }
    }};
}

Even with this, however, we can still run into issues with simple things like assignment (issue comment). The real issue is Rust’s lack of guaranteed copy elision à la C++. To go even further, Rust currently cannot support the kind of interface that C++ has for in-place construction (e.g. std::make_unique) because it lacks… variadic generics (damn, these keep coming up huh).
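For the narrow case of a big zero-filled byte array there is a commonly used stable workaround: go through Vec, which asks the allocator for the memory directly, and convert the result into a boxed array. This is a sketch of that trick, not a general fix; zeroed_on_heap is a name I made up, and it does nothing for types that need a real constructor.

```rust
use std::convert::TryFrom;

// Builds a zero-filled array on the heap via Vec, so no N-byte temporary
// has to exist in this function's stack frame.
fn zeroed_on_heap<const N: usize>() -> Box<[u8; N]> {
    let slice: Box<[u8]> = vec![0u8; N].into_boxed_slice();
    // The length is N by construction, so the conversion cannot fail.
    match Box::<[u8; N]>::try_from(slice) {
        Ok(array) => array,
        Err(_) => unreachable!(),
    }
}
```

Usage looks like `let buf = zeroed_on_heap::<4194304>();`. It sidesteps the problem rather than solving it: anything that isn’t "N copies of a cheap value" is back to the unsafe code above.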

Higher Kinded Types (HKTs) (2014)

https://github.com/rust-lang/rfcs/issues/324

This one is a little bit controversial, there is a well known twitter thread by withoutboats (link, but beware the wall, alternatively I took screenshots) where they talk about why Monads won’t work in Rust:

The problem is that without currying at the type level, higher kinded polymorphism makes type inference trivially undecidable. We have no currying.

This is, strictly speaking, true. If I am allowed to define a kind such as (pseudocode):

Poly<i32> = Vec<i32>;
Poly<f32> = List<f32>;

Then a constraint like C<i32> = Vec<i32> would be unsolvable as it would not imply C = Vec, trivially provable by C = Poly also being a valid solution. This problem can in fact be demonstrated even in current Rust:

use std::collections::LinkedList as List;

trait TerribleThing<A> {
    type Collect;
}
struct Terrible {}

impl TerribleThing<i32> for Terrible {
    type Collect = Vec<i32>;
}
impl TerribleThing<f32> for Terrible {
    type Collect = List<f32>;
}

struct PrettyBad {}
impl<A> TerribleThing<A> for PrettyBad {
    type Collect = Vec<A>;
}


fn map<C: TerribleThing<A> + TerribleThing<B>, A, B>(f: impl FnMut(A) -> B, values: <C as TerribleThing<A>>::Collect) -> <C as TerribleThing<B>>::Collect { /* ... */ }

fn do_map() -> Vec<f32> {
    let x = vec![5];
    let v = map(|x: i32| x as f32, x);
    v
}

godbolt

This will require explicitly specifying C. This is because Rust associated types are not 1-1; PrettyBad’s and Terrible’s implementations overlap in all cases, in fact. How is the compiler to know whether I want to use PrettyBad’s Vec<A> or Terrible’s Vec<f32>? It might seem obvious that PrettyBad cannot be correct in this instance, but consider if it also had an overload for i32 -> f32: the compiler isn’t allowed to just assume that overload doesn’t exist just because it can’t see it. For the intuition on this, it’s like saying that

f(1) = 1 => f(x) = x

It’s a case but it is not every case — there are others that satisfy it e.g. f(x) = x^2

Where does this leave us with general HKTs though? Well, it’s actually not a big issue, consider:

fn map<C<type>, A, B>(f: impl FnMut(A) -> B, values: C<A>) -> C<B> { /* ... */ }

This should be perfectly possible for the compiler to do type inference on (as C++ does), because in this case

C<u32> == Vec<u32> => C == Vec

This is true by virtue of the fact that it is impossible for C to be polymorphic; it is implicitly constrained by the language. If it were possible to specialise a type alias this would be untrue, but the only way to get polymorphism in this way currently is with generic traits + associated types, which do cause issues as we saw above, but which you have to be using explicitly for the problem to come up.

If you want even more depth on this issue, I would advise reading this blog post on the subject. It is a little old (2016) and predates GATs in Rust (they call them “Associated Type Constructors”) but it has the theory in it.
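For reference, this is roughly what the GAT-based emulation looks like on stable Rust today. It is a sketch under my own naming (Family, VecFamily, OptionFamily and halve_all are all hypothetical), and it shows the cost discussed above: the "type constructor" has to be named explicitly at the call site because Fam::Of<i32> alone does not determine Fam.

```rust
// A "family" stands in for a type constructor such as Vec or Option.
trait Family {
    type Of<T>;
    fn map<A, B, F: FnMut(A) -> B>(container: Self::Of<A>, f: F) -> Self::Of<B>;
}

struct VecFamily;
struct OptionFamily;

impl Family for VecFamily {
    type Of<T> = Vec<T>;
    fn map<A, B, F: FnMut(A) -> B>(container: Vec<A>, f: F) -> Vec<B> {
        container.into_iter().map(f).collect()
    }
}

impl Family for OptionFamily {
    type Of<T> = Option<T>;
    fn map<A, B, F: FnMut(A) -> B>(container: Option<A>, f: F) -> Option<B> {
        container.map(f)
    }
}

// Generic over the family; the caller has to pick Fam with a turbofish.
fn halve_all<Fam: Family>(values: Fam::Of<i32>) -> Fam::Of<f32> {
    Fam::map::<i32, f32, _>(values, |x| x as f32 / 2.0)
}
```

So `halve_all::<VecFamily>(vec![1, 2, 3])` works, but a bare `halve_all(vec![1, 2, 3])` does not, whereas the hypothetical C<A> syntax above would let the compiler work out C = Vec on its own.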

The library situation

I want to preface this by saying that Rust’s ecosystem is very impressive given its age. It is barely 8 years old and yet it has multiple async runtimes with highly competitive performance (source is mostly trust me bro but there is this memory benchmark).

It’s got good support for web stuffs with warp and actix being ones I’ve used personally.

clap is a great argument parser library up there with optparse-applicative in my personal favorites on this front.

sled is more or less solely responsible for showing me the light of simple key-value stores, and freeing me from SQLite.

thiserror and anyhow both make Rust’s error handling far nicer than exceptions for me.

Let’s of course not forget serde, which brought a level of ease to marshalling I’d only ever experienced with runtime reflection capable languages before (and integrates very nicely with the rest of the ecosystem).

Cargo

Oh my god cargo is so good. Rust got in early with a single, standard, powerful, dependency manager. Simply look at C++ or Haskell to see what happens when the dependency manager either isn’t good enough or you don’t have one for decades. Haskell has the advantage of a single dominant ecosystem younger than the internet (GHC), but stack and cabal still fragment the ecosystem to a degree. stack came about because (as far as I’m aware/can remember) cabal v1 was lacking in key features (it’s now got a lot better with v2 but stack is still around). C++’s ecosystem predates the ability for tools like cargo with centralized package repositories to exist, and the fragmentation in build systems and conventions makes life hell for anybody trying to write a dependency manager for it now.

UI libraries

Where things get a little less good is with UI libraries. Both the TUI and GUI libraries (that I have used) feel quite immature. I would like to stress that my knowledge here is limited and possibly out of date, I have not tried libraries such as iced or druid personally, I have just heard experiences of using them that reinforce the general vibe of “usable, but a little sharp”.

Programmers are also human sums this up pretty well as usual: https://youtube.com/clip/UgkxAx-URT2CWQZycm6gosHwxfjW8XvBoZEI

Take for example cursive — it’s an impressive piece of work (writing a good TUI lib is hard, trust me), and the API is innovative - escaping the low level of ncurses and bringing a welcome concept of widgets to terminal UIs. There are however a few sharp corners on this API, want to pass some state to your widgets? yeah I’m afraid that’ll have to live for 'static and be wrapped in an Any that you have to downcast from. This is… not great, we are basically throwing around void pointers at this point, although with the safety net of your program simply not working if you cast wrong instead of making toast.
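To show the shape of the pattern without leaning on cursive’s actual API, here is a std-only sketch. Ui and Counter are hypothetical stand-ins, not cursive types; the point is just what "state wrapped in an Any that you downcast from" looks like.

```rust
use std::any::Any;

// Hypothetical stand-in for a UI root that stores one piece of user state.
struct Ui {
    user_data: Box<dyn Any>,
}

impl Ui {
    // `T: Any` implies `T: 'static`, which is where the lifetime restriction comes from.
    fn new<T: Any>(data: T) -> Self {
        Ui { user_data: Box::new(data) }
    }

    // Returns None when T is not the type that was stored.
    fn user_data<T: Any>(&mut self) -> Option<&mut T> {
        self.user_data.downcast_mut::<T>()
    }
}

// Example application state.
struct Counter {
    clicks: u32,
}
```

Asking for the wrong type, e.g. `ui.user_data::<String>()` when a Counter was stored, compiles fine and just hands back None at runtime.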

I’m going to claim that the issue is down to fundamental design. The traditional way of designing these libraries is with OOP and large amounts of shared mutability. This obviously does not translate very well to Rust. I’m supported in this claim by the author of druid, who wrote a blog post on the issue.

Interestingly React started out with the OOP architecture + a weird state system thing, and then transitioned to the much nicer hooks API. I really really like React as a framework and I’m quite excited to see if we could get something similar in Rust for native apps (yes I know about React Native, no I do not think it’s a real alternative to actually native code). For those interested Dioxus is a React inspired Rust library aiming to (as far as I can tell) do exactly this, I haven’t had time to look into it yet but it appears promising.

SmallVec

github

This library is generally pretty good. There is however, a pretty major issue that makes it unusable for some use-cases (use-cases that I have): #146 “SmallVec<[T; N]> is invariant over T”. If you don’t understand subtyping and variance, I feel you, you can read this if you really want to know the nitty gritty but all you need to understand in this case is the following:

There are 3 types of variance: covariant, contravariant and invariant
If there is a mixture of variance types in the fields or variants of a struct or enum then the whole thing is invariant
*mut T is invariant over T

SmallVec uses *mut T internally to store its heap version. This makes it invariant over the lifetime of T - which is a problem. The reason it’s a problem is twofold, the first and most important is that Vec<T> is covariant over T, meaning that SmallVec is not actually a drop-in replacement for Vec. The second is that because invariance propagates upwards, and clobbers every other type of variance, if your type is generic over a lifetime bound and currently covariant over it (e.g. because it’s using Vec), changing to SmallVec means your type is now invariant over that lifetime bound - essentially meaning it has to be 'static and the lifetime bound doesn’t matter. That, unsurprisingly, breaks pretty much everything that depends on that lifetime bound actually mattering - which is probably most things since you bothered to add the bound in the first place.

tl;dr SmallVec<T> implicitly requires that T live for 'static, Vec<T> does not (where T includes a lifetime bound)
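Here is a minimal sketch of what the variance difference means to the compiler. CovariantPtr and InvariantPtr are made-up wrapper types for illustration, not SmallVec’s real internals; the only facts relied on are that Vec<T> and NonNull<T> are covariant over T while *mut T is invariant.

```rust
use std::ptr::NonNull;

// NonNull<T> is covariant over T, so this wrapper is too.
struct CovariantPtr<T> {
    ptr: NonNull<T>,
}

// *mut T is invariant over T, which makes the whole struct invariant.
#[allow(dead_code)]
struct InvariantPtr<T> {
    ptr: *mut T,
}

// Covariance lets a longer lifetime be shortened implicitly.
fn shorten_vec<'a>(v: Vec<&'static str>) -> Vec<&'a str> {
    v
}

fn shorten_covariant<'a>(p: CovariantPtr<&'static str>) -> CovariantPtr<&'a str> {
    p
}

// The invariant equivalent is rejected by the compiler:
// fn shorten_invariant<'a>(p: InvariantPtr<&'static str>) -> InvariantPtr<&'a str> { p }
```

Any struct that stores an InvariantPtr-like field inherits that restriction, which is the propagation problem described above.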

ArrayVec

Luckily there is an alternative to SmallVec if you don’t need the dynamic upper bound part. Good thing it doesn’t have a similar iss-oh for god sake

Language and library evolution

This is a point where I think Rust really shines. There is this really great talk by Titus Winters @ CppCon 2017 (youtube) where he talks about how semantic versioning for C++ is a practical impossibility. It’s a great talk and he goes over a sample of the many reasons why C++ is an absolute nightmare to try and apply any sort of versioning to; you basically cannot make any change to a C++ codebase and know whether it’s an API break or not (and that’s not even getting into ABI breakage).

As far as I can tell Rust basically looked at this mess and went “your problem here is that your language sucks, watch this” and then proceeded to make predictability of breakage a core part of the language’s design goals. For example, you cannot implement a foreign trait on a foreign type, because otherwise adding an implementation to a type would be a major breaking change: it would break dependents that had done it themselves. The layout of structs in memory is also undefined by default and explicitly UB to depend on; this lets library authors opt in to making the exact layout of their datatypes part of their API (yes, in C++ adding a private field to a type can be a breaking change).
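The foreign-trait-on-foreign-type restriction (the orphan rule) is easy to see in a few lines. This is a sketch with a made-up CommaList wrapper: implementing Display directly for Vec<i32> is rejected because both live in std, so the usual workaround is a local newtype.

```rust
use std::fmt;

// `impl fmt::Display for Vec<i32>` would be rejected: both the trait and the
// type are defined in another crate (std). Wrapping the Vec in a local type
// makes the impl legal.
struct CommaList(Vec<i32>);

impl fmt::Display for CommaList {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let parts: Vec<String> = self.0.iter().map(|n| n.to_string()).collect();
        write!(f, "{}", parts.join(", "))
    }
}
```

The upside is that if std ever did add Display for Vec<i32>, nothing downstream could already have a conflicting impl.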

This may seem like an incredibly small thing, and you might question whether it’s worth restricting the language for, but it makes a world of difference for dependency management, both for humans and the tools.

Editions (originally proposed as “epochs”) are another great feature. Adding features on a cycle isn’t exactly novel, but breaking things is (unless you’re Haskell). The ability for Rust to make breaking changes to the syntax and behaviour of the language in a way that still allows interop with old code is incredibly valuable.

Final thoughts

Hey you made it to the end (or skipped, can’t blame you :p). I hope you enjoyed my Rust ramblings and maybe even learned something.

Those who are familiar with many of these issues might notice a conspicuous absence from my language pain points list: generic async & const. The reason for this is twofold:

1. I haven’t really bumped up against it much, so I can’t give a proper personal perspective
2. This is something that doesn’t really exist in any other language I’ve used, and it doesn’t feel right to fault Rust for not yet implementing a feature nobody else has either

That’s all folks, have a good one