Skip to content

Commit 790b8d4

Browse files
committed
New notes on the layout of State
I gave myself a few minutes with a quiet EC2 spot instance to run the benchmarks, and my experiments with this struct still haven't borne fruit.
1 parent 31ac09e commit 790b8d4

1 file changed

Lines changed: 15 additions & 5 deletions

File tree

src/transcode/stream.rs

Lines changed: 15 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -143,11 +143,21 @@ enum ErrorSource {
143143
/// This type is designed to facilitate this pattern, generally as part of a newtype struct that
144144
/// implements a specific Serde trait.
145145
struct State<P, E> {
146-
// Only Forwarder actually needs interior mutability for its state (due to `serialize` taking
147-
// `&self`). However, putting Forwarder's full state in a `Cell` has a 15% - 25% impact on
148-
// translation performance for formats that aren't expensive to encode or decode (like
149-
// MessagePack). I can only guess at the possible optimizations and/or CPU microarchitectural
150-
// effects that could be at play, since I haven't found a good way to understand this in depth.
146+
// Only Forwarder actually needs interior mutability, due to `serialize` taking `&self`.
147+
// However, two obvious ways of moving the interior mutability into Forwarder end up hurting
148+
// transcoding performance:
149+
//
150+
// 1. When removing Cell from these fields but otherwise leaving the layout as-is, performance
151+
// slows by as much as 15% - 25%, especially on formats like MessagePack that aren't
152+
// expensive to encode or decode on their own.
153+
//
154+
// 2. When turning this into an enum (with empty, parent, and error states), the impact is
155+
// closer to 5%. Not as bad, but still measurable and consistent.
156+
//
157+
// These numbers are based on Forwarder using Cell for interior mutability, but RefCell doesn't
158+
// improve them either. I can only guess at the optimizations or CPU microarchitectural effects
159+
// that might be at play. My suspicion is that this version somehow minimizes data copying, but
160+
// I don't know how to confirm or deny that hypothesis.
151161
//
152162
// Serde traits are implemented on `&mut` references where possible to help mark where mutation
153163
// is happening, even though `&` would technically work.

0 commit comments

Comments
 (0)