Rust Best Practices

Idiomatic Rust code guidance based on Apollo GraphQL's best practices handbook for ownership, errors, and performance.
# Chapter 3 - Performance Mindset

The **golden rule** of performance work:

> Don't guess, measure.

Rust code is often already pretty fast, so don't "optimize" without evidence. Optimize only after finding bottlenecks.

### Good first steps
* Use the `--release` flag on your builds. It might sound silly, but it is quite common to hear people complain that their Rust code is slower than their X language code, and most of the time it is because they didn't build with `--release`.
* `$ cargo clippy -- -D clippy::perf` gives you important tips on best practices for performance.
* [`cargo bench`](https://doc.rust-lang.org/cargo/commands/cargo-bench.html) is the Cargo command that runs a package's benchmarks, which you can use to micro-benchmark and compare different solutions. Write a test scenario and bench your solution against the original code; if the improvement is larger than 5%, it might be a good performance improvement.
* [`cargo flamegraph`](https://github.com/flamegraph-rs/flamegraph) is a powerful profiler for Rust code. On macOS, [samply](https://github.com/mstange/samply) might offer a better developer experience.

> #### Further reading on Benchmarking:
> - [How to build a Custom Benchmarking Harness in Rust](https://bencher.dev/learn/benchmarking/rust/custom-harness/)

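As a rough illustration of "bench your solution against the original code", here is a minimal sketch that times two interchangeable implementations using only the standard library. The function names (`sum_squares_loop`, `sum_squares_iter`, `time_it`) are hypothetical, and this is not a substitute for a proper benchmarking harness: it only shows the shape of the comparison, and it is only meaningful when built with `--release`.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical "original" implementation: sums squares with an indexed loop.
fn sum_squares_loop(values: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i] * values[i];
    }
    total
}

/// Hypothetical "candidate" implementation: the same sum with an iterator chain.
fn sum_squares_iter(values: &[u64]) -> u64 {
    values.iter().map(|v| v * v).sum()
}

/// Runs `f` `rounds` times and returns the total elapsed wall-clock time.
/// `black_box` discourages the optimizer from removing the work under test.
fn time_it(rounds: u32, mut f: impl FnMut() -> u64) -> Duration {
    let start = Instant::now();
    for _ in 0..rounds {
        black_box(f());
    }
    start.elapsed()
}

fn main() {
    let values: Vec<u64> = (0..10_000).collect();

    // Both versions must agree before their timings are worth comparing.
    assert_eq!(sum_squares_loop(&values), sum_squares_iter(&values));

    let loop_time = time_it(1_000, || sum_squares_loop(black_box(&values)));
    let iter_time = time_it(1_000, || sum_squares_iter(black_box(&values)));
    println!("loop: {loop_time:?}, iter: {iter_time:?}");
}
```

Timings from a sketch like this are noisy, so treat small differences with suspicion and repeat the run several times before drawing conclusions.
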

## 3.1 Flamegraph

A flamegraph helps you visualize how much CPU time is spent in each function.

```shell
# Installing flamegraph
cargo install flamegraph

# cargo support provided through the cargo-flamegraph binary!
# defaults to profiling cargo run --release
cargo flamegraph

# by default, `--release` profile is used,
# but you can override this:
cargo flamegraph --dev

# if you'd like to profile a specific binary:
cargo flamegraph --bin=stress2

# Profile unit tests.
# Note that a separating `--` is necessary if `--unit-test` is the last flag.
cargo flamegraph --unit-test -- test::in::package::with::single::crate
cargo flamegraph --unit-test crate_name -- test::in::package::with::multiple::crate

# Profile integration tests.
cargo flamegraph --test test_name

# Run criterion benchmark
# Note that the last --bench is required for `criterion 0.3` to run in benchmark mode, instead of test mode.
cargo flamegraph --bench some_benchmark --features some_features -- --bench

# Run workspace example
cargo flamegraph --example some_example --features some_features
```

> ❗ Always profile with `--release` enabled. A `--dev` profile isn't realistic because it is built without optimizations.

The result will look like a flame graph where:

* The `y-axis` shows the **stack depth**. The main function of your program sits near the bottom, the functions it calls are stacked on top of it, and the functions those call are stacked on top of them.

* The `width of each box` shows the **total time that function** is on the CPU or is part of the call stack. If a function's box is wider than others, it either consumes more CPU per execution than other functions or is called more often.

> ❗ The **color of each box** isn't significant, and **is chosen at random**.

### 🚨 Remember
* Wide stacks: heavy CPU usage
* Narrow stacks: low intensity (cheap)

## 3.2 Avoid Redundant Cloning

> Cloning is cheap... **until it isn't**

In [Borrowing over Cloning](./chapter_01.md#11-borrowing-over-cloning) and [Important Clippy lints to respect](./chapter_02.md#23-important-clippy-lints-to-respect) we covered the impact of cloning and the relevant clippy lint [`redundant_clone`](https://rust-lang.github.io/rust-clippy/master/#redundant_clone). This section looks at when to pass ownership.

* 🚨 If you really need to clone, leave it until the last possible moment.

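To make "leave it until the last possible moment" concrete, here is a minimal sketch with a hypothetical `Cache` type. The function borrows its input and only creates an owned copy on the one path that actually has to store it.

```rust
/// Hypothetical cache of user names, used only for illustration.
#[derive(Default)]
struct Cache {
    names: Vec<String>,
}

impl Cache {
    /// Stores `name` only if it is new. Returns `true` when it was inserted.
    /// The owned copy is deferred until we know it is actually required.
    fn remember(&mut self, name: &str) -> bool {
        if self.names.iter().any(|known| known == name) {
            return false; // already cached: no allocation at all
        }
        self.names.push(name.to_owned()); // the only place an owned copy is made
        true
    }
}

fn main() {
    let mut cache = Cache::default();
    assert!(cache.remember("julia"));
    assert!(!cache.remember("julia")); // the second call only borrows
    assert_eq!(cache.names.len(), 1);
}
```

Had `remember` taken a `String`, every caller would have paid for an allocation even when the name was already cached.
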
### When to pass ownership?

* Only `.clone()` if you truly need a new owned copy. A few examples:
    * A crate's API design requires owned data.
    * You have overloaded `std::ops` operators that take their operands by value, but you still need the original data afterwards:
    ```rust
    use std::ops::Add;

    #[derive(Debug, Copy, Clone, PartialEq)]
    struct Point {
        x: i32,
        y: i32,
    }

    impl Add for Point {
        type Output = Self;

        // `add` takes both operands by value; a non-`Copy` type would be moved here.
        fn add(self, other: Self) -> Self {
            Self {
                x: self.x + other.x,
                y: self.y + other.y,
            }
        }
    }

    assert_eq!(Point { x: 1, y: 0 } + Point { x: 2, y: 3 },
               Point { x: 3, y: 3 });
    ```
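    * You need comparison snapshots, or an API requires multiple owned instances of the data.
    ```rust
    use std::ops::Sub;

    // `MyValue` and `MyValueDiff` are illustrative placeholder types.
    #[derive(Debug, Clone, Default)]
    struct MyValue(i64);

    #[derive(Debug)]
    struct MyValueDiff(i64);

    impl MyValue {
        fn magical_update(&mut self) {
            self.0 += 1;
        }
    }

    // Implemented for references so that taking a diff does not consume either value.
    impl Sub for &MyValue {
        type Output = MyValueDiff;

        fn sub(self, other: Self) -> MyValueDiff {
            MyValueDiff(self.0 - other.0)
        }
    }

    fn snapshot(a: &MyValue, b: &MyValue) -> MyValueDiff {
        a - b
    }

    fn main() {
        let mut a = MyValue::default();
        let b = a.clone(); // owned snapshot of the state before the update

        a.magical_update();
        println!("{:?}", snapshot(&a, &b));
    }
    ```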
* You have reference counted pointers (`Arc`, `Rc`).
* You have small structs that are too big to be `Copy` but not as costly to clone as `std::collections` types. An example is an HTTP client such as `hyper_util::client::legacy::Client`, where cloning lets you share the connection pool.
* You have a chained struct modifier that needs owned mutation. Some **builders** require owned mutation, but most custom builders can be written with `pub fn with_xyz(&mut self, value: Xyz) -> &mut Self`.
```rust
// Inline `HashMap` insertion extension:
// a method taking `self` by value so that calls can be chained.

fn insert_owned(mut self, key: K, value: V) -> Self {
    self.insert(key, value);
    self
}
```
* Ownership can also be a good way to model business logic / state. For example:
```rust
let not_validated: String = ...; // some user source
let validated = Validate::try_from(not_validated)?;
// Technically that `try_from` maybe didn't need ownership, but taking it lets us model intent
```

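The two chaining styles from the list above can be sketched side by side. This is an illustration under assumptions, not the handbook's own implementation: the `InsertOwned` trait and the `RequestBuilder` type are hypothetical names introduced here.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical extension trait: chained insertion that takes the map by value.
trait InsertOwned<K, V>: Sized {
    fn insert_owned(self, key: K, value: V) -> Self;
}

impl<K: Eq + Hash, V> InsertOwned<K, V> for HashMap<K, V> {
    fn insert_owned(mut self, key: K, value: V) -> Self {
        self.insert(key, value);
        self
    }
}

/// Hypothetical builder using the `&mut self -> &mut Self` style instead.
#[derive(Debug, Default, Clone, PartialEq)]
struct RequestBuilder {
    url: String,
    retries: u32,
}

impl RequestBuilder {
    fn with_url(&mut self, url: &str) -> &mut Self {
        self.url = url.to_owned();
        self
    }

    fn with_retries(&mut self, retries: u32) -> &mut Self {
        self.retries = retries;
        self
    }
}

fn main() {
    // Owned chaining: each call consumes the map and hands it back.
    let limits = HashMap::<&str, u32>::new()
        .insert_owned("cpu", 4)
        .insert_owned("memory_gb", 16);
    assert_eq!(limits.get("cpu"), Some(&4));

    // Borrowed chaining: the builder stays in place and is mutated through `&mut`.
    let mut builder = RequestBuilder::default();
    builder.with_url("https://example.com").with_retries(3);
    assert_eq!(builder.retries, 3);
}
```

The owned style reads well in a single expression, while the `&mut` style lets the caller keep configuring the same value across several statements or branches.
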
### When **NOT** to pass ownership?

* Prefer API designs that take a reference (`fn process(values: &[T])`) instead of ownership (`fn process(values: Vec<T>)`).
* If you only need read access to elements, prefer `.iter()` or slices:
```rust
for item in &some_vec {
    ...
}
```
* If you need to mutate data that is owned elsewhere (for example, by the caller), take `&mut MyStruct` instead of ownership.

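A minimal sketch of the borrowing APIs described above, using hypothetical helpers `total` and `double_all`. Taking a slice means the same function accepts a `Vec`, an array, or part of either, and the caller keeps ownership throughout.

```rust
/// Hypothetical read-only helper: borrows a slice, so callers keep their data.
fn total(values: &[u32]) -> u32 {
    values.iter().sum()
}

/// Hypothetical in-place update: `&mut` lets us mutate without taking ownership.
fn double_all(values: &mut [u32]) {
    for value in values.iter_mut() {
        *value *= 2;
    }
}

fn main() {
    let mut readings = vec![1, 2, 3];
    let fixed = [10, 20];

    // A `&Vec<u32>` and a `&[u32; 2]` both coerce to `&[u32]`.
    assert_eq!(total(&readings), 6);
    assert_eq!(total(&fixed), 30);

    double_all(&mut readings);
    // `readings` is still owned here and can be used again.
    assert_eq!(readings, [2, 4, 6]);
}
```
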
### Use `Cow` for `Maybe Owned` data

Sometimes you don't actually need owned data, but that is not clear from the API's perspective. [`std::borrow::Cow`](https://doc.rust-lang.org/std/borrow/enum.Cow.html) is a way to handle this case efficiently:

```rust
use std::borrow::Cow;

fn hello_greet(name: Cow<'_, str>) {
    println!("Hello {name}");
}

hello_greet(Cow::Borrowed("Julia"));
hello_greet(Cow::Owned("Naomi".to_string()));
```

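`Cow` is also useful as a return type when a function usually hands back its input unchanged. The sketch below assumes a hypothetical `to_identifier` helper that allocates only when it has to rewrite the string.

```rust
use std::borrow::Cow;

/// Hypothetical helper: replaces spaces with underscores, allocating only
/// when the input actually contains a space.
fn to_identifier(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', "_"))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    let unchanged = to_identifier("user_id");
    let rewritten = to_identifier("user id");

    assert!(matches!(unchanged, Cow::Borrowed(_)));
    assert!(matches!(rewritten, Cow::Owned(_)));
    assert_eq!(rewritten, "user_id");

    // Callers that need a `String` can still get one. This allocates only for
    // the borrowed case, since the owned case already holds a `String`.
    let owned: String = unchanged.into_owned();
    assert_eq!(owned, "user_id");
}
```
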
## 3.3 Stack vs Heap: Be size-smart!

### ✅ Good Practices

* Keep small types (`impl Copy`, `usize`, `bool`, etc.) **on the stack**.
* Avoid passing huge types (`> 512 bytes`) by value or transferring ownership. Prefer passing by reference (e.g. `&T` and `&mut T`).
* Heap allocate recursive data structures:
```rust
enum OctreeNode<T> {
    Node(T),
    Children(Box<[OctreeNode<T>; 8]>),
}
```
* Return small types by value. Types that implement `Copy` or are cheap to clone are efficient to return by value (e.g. `struct Vector2 {x: f32, y: f32}`).

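One way to apply these guidelines is to check how large a type really is with `std::mem::size_of`. The sketch below uses the `Vector2` shape from the list and a hypothetical `Frame` type that is deliberately larger than the 512-byte guideline.

```rust
use std::mem::size_of;

/// Small value type: two `f32` fields, cheap to copy and return by value.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vector2 {
    x: f32,
    y: f32,
}

/// Hypothetical large type: one KiB of payload stored inline.
struct Frame {
    pixels: [u8; 1024],
}

/// Borrowing the large type passes a pointer instead of the 1024-byte payload.
fn brightest(frame: &Frame) -> u8 {
    frame.pixels.iter().copied().max().unwrap_or(0)
}

/// Returning the small type by value is fine.
fn midpoint(a: Vector2, b: Vector2) -> Vector2 {
    Vector2 { x: (a.x + b.x) / 2.0, y: (a.y + b.y) / 2.0 }
}

fn main() {
    assert_eq!(size_of::<Vector2>(), 8);
    assert_eq!(size_of::<Frame>(), 1024);
    // A reference or a `Box` to a sized type is pointer-sized, whatever it points to.
    assert_eq!(size_of::<&Frame>(), size_of::<usize>());
    assert_eq!(size_of::<Box<Frame>>(), size_of::<usize>());

    let frame = Frame { pixels: [7; 1024] };
    assert_eq!(brightest(&frame), 7);

    let mid = midpoint(Vector2 { x: 0.0, y: 0.0 }, Vector2 { x: 2.0, y: 4.0 });
    assert_eq!(mid, Vector2 { x: 1.0, y: 2.0 });
}
```
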
### ❗ Be Mindful

* Only use `#[inline]` when a benchmark proves it beneficial. Rust is already pretty good at inlining **without** hints.
* Avoid massive stack allocations; box them. For example, `let buffer: Box<[u8; 65536]> = Box::new(..)` may first build the `[u8; 65536]` on the stack and then move it into the box. A non-const alternative is `let buffer: Box<[u8]> = vec![0; 65536].into_boxed_slice()`.
* For arrays that are usually small but occasionally large, consider the [smallvec crate](https://docs.rs/smallvec/latest/smallvec/): it stores a fixed number of elements inline, like an array, and moves to the heap once it grows past that capacity.

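The boxed-buffer advice above can be sketched as follows. `BUFFER_LEN`, `zeroed_buffer` and `zeroed_array` are hypothetical names; the second function shows one possible way to still end up with a fixed-size `Box<[u8; N]>` when the array type matters, using the standard `TryFrom<Box<[T]>>` conversion.

```rust
use std::convert::TryFrom;

/// Hypothetical buffer size used for illustration (64 KiB).
const BUFFER_LEN: usize = 65_536;

/// Allocates a zeroed buffer on the heap via `Vec`, then freezes its length.
fn zeroed_buffer() -> Box<[u8]> {
    vec![0u8; BUFFER_LEN].into_boxed_slice()
}

/// Same allocation, converted to a fixed-size boxed array. The conversion
/// only checks the length, so it cannot fail here.
fn zeroed_array() -> Box<[u8; BUFFER_LEN]> {
    Box::<[u8; BUFFER_LEN]>::try_from(zeroed_buffer())
        .expect("buffer was created with exactly BUFFER_LEN bytes")
}

fn main() {
    let buffer = zeroed_buffer();
    assert_eq!(buffer.len(), BUFFER_LEN);

    let mut array = zeroed_array();
    array[0] = 0xFF; // indexing works through the `Box`
    assert_eq!(array[0], 0xFF);
    assert!(array[1..].iter().all(|&byte| byte == 0));
}
```
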
## 3.4 Iterators and Zero-Cost Abstractions

Rust iterators are lazy: adapters do no work until the chain is consumed, and the whole chain is usually compiled down to a very efficient tight loop. Chaining adapters such as `.filter()`, `.map()`, `.rev()`, `.skip()` and `.take()` before a final `.collect()` usually doesn't cost extra, and the compiler can reason well enough about how to optimize them.
* Prefer iterators over manual, index-based `for` loops when working with collections; the compiler can often optimize them better than the hand-written equivalent.
* Calling `.iter()` only **borrows** the original collection, which allows you to hold multiple iterators over the same collection.

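Laziness is easy to observe directly. In this sketch the hypothetical `even_squares` helper counts how often its `map` closure runs: nothing happens while the chain is being built, and `take` stops the work early once the chain is consumed.

```rust
use std::cell::Cell;

/// Hypothetical helper: squares of the first `limit` even numbers in `items`.
/// `calls` counts how many times the `map` closure actually runs.
fn even_squares<'a>(
    items: &'a [i32],
    limit: usize,
    calls: &'a Cell<u32>,
) -> impl Iterator<Item = i32> + 'a {
    items
        .iter()
        .filter(|&&x| x % 2 == 0)
        .map(move |&x| {
            calls.set(calls.get() + 1);
            x * x
        })
        .take(limit)
}

fn main() {
    let items = vec![1, 2, 3, 4, 5, 6];
    let calls = Cell::new(0);

    // Building the chain does no work yet.
    let chain = even_squares(&items, 2, &calls);
    assert_eq!(calls.get(), 0);

    // Consuming the chain drives it, and `take(2)` stops it before reaching 6.
    let result: Vec<i32> = chain.collect();
    assert_eq!(result, [4, 16]);
    assert_eq!(calls.get(), 2);

    // `.iter()` only borrows, so several iterators over `items` can coexist.
    let pairs: Vec<(i32, i32)> = items
        .iter()
        .zip(items.iter().rev())
        .map(|(&a, &b)| (a, b))
        .take(2)
        .collect();
    assert_eq!(pairs, [(1, 6), (2, 5)]);
}
```
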
#### ❗ Avoid creating intermediate collections unless they are really needed:

* Assume that `process` accepts an iterator.
* ❌ BAD - useless intermediate collection:
```rust
let doubled: Vec<_> = items.iter().map(|x| x * 2).collect();
process(doubled.into_iter());
```
* ✅ GOOD - pass the iterator (`fn process(arg: impl Iterator<Item = T>)`):
```rust
let doubled_iter = items.iter().map(|x| x * 2);
process(doubled_iter);
```

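For completeness, here is a self-contained sketch of both variants. The body of `process` is an assumption (it simply sums its input), and `doubled_via_vec` and `doubled_streaming` are hypothetical names. Both produce the same result, but only the first allocates a temporary `Vec`.

```rust
/// Hypothetical consumer: accepts any iterator of `i32` and sums it.
fn process(values: impl Iterator<Item = i32>) -> i32 {
    values.sum()
}

/// Allocates a temporary `Vec` only to iterate over it once.
fn doubled_via_vec(items: &[i32]) -> i32 {
    let doubled: Vec<i32> = items.iter().map(|x| x * 2).collect();
    process(doubled.into_iter())
}

/// Streams the doubled values straight into `process`, with no intermediate collection.
fn doubled_streaming(items: &[i32]) -> i32 {
    process(items.iter().map(|x| x * 2))
}

fn main() {
    let items = [1, 2, 3];
    assert_eq!(doubled_via_vec(&items), 12);
    assert_eq!(doubled_streaming(&items), 12);
}
```
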