Writing a fast lexer: many parts of a compiler toolchain (other than optimisation) are O(N), and the lexer has the largest values of N.
A walkthrough showing different designs and performance considerations.
https://nothings.org/computer/lexing.html
Related Posts
I've just squeezed another 5% of performance out of difftastic by finding a few HashSet values that weren't FxHashSet.
I do wonder whether hash DoS resistance is a good default. Sure, Rust programs are often pretty fast anyway, but it feels like a different threat model to the rest of Rust.
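For anyone unfamiliar with what that swap looks like: here is a minimal sketch, not difftastic's actual code. FxHashSet comes from the external rustc-hash crate, so to keep this std-only I've substituted a small hand-written FNV-1a hasher. The names `Fnv1a`, `FastHashSet` and `count_unique` are made up for illustration. The point is only that `HashSet` takes the hasher as a type parameter, so opting out of the DoS-resistant default is a one-line type change.

```rust
use std::collections::HashSet;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical stand-in for a fast, non-DoS-resistant hasher.
/// This is FNV-1a (64-bit), NOT the Fx algorithm from rustc-hash.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        // FNV-1a 64-bit offset basis.
        Fnv1a(0xcbf29ce484222325)
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            // FNV 64-bit prime.
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Same shape as rustc-hash's FxHashSet alias: a HashSet with a fixed,
/// unseeded hasher instead of std's randomly seeded default.
type FastHashSet<T> = HashSet<T, BuildHasherDefault<Fnv1a>>;

/// Count distinct strings using the fast set.
fn count_unique(words: &[&str]) -> usize {
    let mut seen: FastHashSet<&str> = FastHashSet::default();
    for &w in words {
        seen.insert(w);
    }
    seen.len()
}

fn main() {
    // prints 2
    println!("{}", count_unique(&["lhs", "rhs", "lhs"]));
}
```

The trade-off is the one in question above: with no random seed, an attacker who controls the keys could craft collisions, which is why std doesn't do this by default.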
I see that *up has become an increasingly common name for toolchain installers: rustup, ghcup, even juliaup.
I think Rust was the first to use this terminology? I'm curious how similar the different *up tools are.
It's so strange that we talk about languages being slow, and have done for years. Computer performance has increased so much in this time.
https://hbfs.wordpress.com/2009/11/10/is-python-slow/ (shared on HN in 2009) discusses Python being slow. My underpowered Thinkpad has 20x the single-threaded performance of a desktop CPU from that era! https://www.cpubenchmark.net/compare/73vs3766/AMD-Athlon-64-4000+-vs-AMD-Ryzen-5-PRO-4650U
Maybe *relative* performance of languages matters more?