Top comments (1)
The trampoline pattern is one of those things that's obvious in hindsight but painful to discover. Nice writeup.
One thing that caught my eye: the 48% hot-path cost from `Box<dyn Fn>` vtable indirection is exactly the kind of overhead that shows up in embedded robotics work too. When you're dispatching sensor events at 1kHz+ on a Cortex-A (like a Raspberry Pi Compute Module), every nanosecond of vtable lookup compounds across thousands of subscribers per frame.
I'm curious about the multi-event-type scenario where entt pulls ahead (9ns vs 24ns). Since `TypeId::of::<E>()` compiles to a `mov` instruction loading a compile-time constant, the bottleneck must be the HashMap probe itself: probe count, branch misprediction on the hash, or cache line misses in the bucket chain. Have you tried replacing the HashMap with a `Vec<(TypeId, Vec<Subscriber>)>` and doing a linear scan? For the typical case of 5-15 event types, a linear scan over contiguous memory might actually beat the hash lookup due to prefetcher friendliness.
Also, the 6-instruction hot loop is beautiful. But I wonder: since you're storing `call` as a function pointer in the Subscriber struct, that `call *(%r15)` is still one level of indirection. If all subscribers for the same event type share the same trampoline (same `E` but different `F`), you could batch them: sort by `call` pointer, then invoke each trampoline once for its contiguous run of subscribers. That would amortize the indirect call cost. The sort itself is cheap since you'd only need to do it once per `emit`.
As for the nerd-snipe challenge: the 1,289ns for 1000 subscribers is already impressively tight. My gut says the next win would come from SIMDeez-style packing if the event payloads are small enough, but that would break the zero-dependency constraint. Looking forward to seeing if someone cracks it.
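To make the linear-scan suggestion concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption for illustration, not the article's actual code: the `Bus` type, the `data` / `call` / `drop_fn` field layout of `Subscriber`, and the `trampoline` and `drop_box` helpers are hypothetical names. It only shows the shape of the lookup, and I have not benchmarked it.

```rust
use std::any::TypeId;

// Hypothetical type-erased subscriber: a boxed closure behind a raw pointer,
// plus monomorphized function pointers that know the concrete types.
struct Subscriber {
    data: *mut (),
    call: unsafe fn(*mut (), *const ()),
    drop_fn: unsafe fn(*mut ()),
}

// Trampoline: recovers the concrete closure type F and event type E, then calls it.
unsafe fn trampoline<E, F: Fn(&E)>(data: *mut (), event: *const ()) {
    unsafe {
        let f = &*(data as *const F);
        f(&*(event as *const E));
    }
}

// Frees the boxed closure with its original type.
unsafe fn drop_box<F>(data: *mut ()) {
    unsafe { drop(Box::from_raw(data as *mut F)) }
}

impl Drop for Subscriber {
    fn drop(&mut self) {
        // Safety: `data` came from Box::into_raw of the F that `drop_fn` was built for.
        unsafe { (self.drop_fn)(self.data) }
    }
}

// Registry as a flat Vec instead of a HashMap: one entry per event type.
#[derive(Default)]
struct Bus {
    slots: Vec<(TypeId, Vec<Subscriber>)>,
}

impl Bus {
    fn subscribe<E: 'static, F: Fn(&E) + 'static>(&mut self, f: F) {
        let sub = Subscriber {
            data: Box::into_raw(Box::new(f)) as *mut (),
            call: trampoline::<E, F>,
            drop_fn: drop_box::<F>,
        };
        let id = TypeId::of::<E>();
        match self.slots.iter_mut().find(|(t, _)| *t == id) {
            Some((_, subs)) => subs.push(sub),
            None => self.slots.push((id, vec![sub])),
        }
    }

    fn emit<E: 'static>(&self, event: &E) {
        let id = TypeId::of::<E>();
        // Linear scan over contiguous (TypeId, Vec) pairs; no hashing involved.
        if let Some((_, subs)) = self.slots.iter().find(|(t, _)| *t == id) {
            for s in subs {
                // Safety: every subscriber in this slot was registered for event type E.
                unsafe { (s.call)(s.data, event as *const E as *const ()) };
            }
        }
    }
}
```

With a handful of event types the scan touches only a few cache lines before reaching the subscriber list, which is the effect I'd expect to help. Whether it actually beats the hash lookup would need measuring on the target hardware.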