In assembly, JMP may be encoded as various different opcodes, and compilers (LLVM here) carefully choose the best: http://t.co/fFSbavVxd6
miniblog.
Related Posts
Optimising GHC, implementing assembly pretty-printers, and the tradeoffs of implementing against an interface: https://www.tweag.io/blog/2022-12-22-making-ghc-faster-at-emitting-code/
Optimising fizzbuzz with hand-written x86-64 assembly and AVX2: https://codegolf.stackexchange.com/a/236630
I guess it was only a matter of time, but it's an impressive achievement: 31GB/second!
Finally got dynamic string allocation working in my toy compiler! https://github.com/Wilfred/proper-compiler-hat/commit/cd726f45eb0540eb54c2c3c7e0ab75a651c46a43
Implementing intrinsics made this way easier: writing a single large assembly function is a pain.