A C++ compiler beating hand-written assembly: https://stackoverflow.com/questions/40354978/why-is-this-c-code-faster-than-my-hand-written-assembly-for-testing-the-collat If you haven't measured how many cycles the different instructions actually cost, the optimiser can often do better than you!
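
As a rough illustration of the point about instruction cost, here is a minimal sketch of a Collatz-style step counter. The name `collatz_steps` and the code itself are hypothetical, written for this note and not taken from the linked question. The thing to notice is the division by two: with optimisation enabled, mainstream compilers will typically turn it into a cheap shift, whereas an assembly programmer who reaches for a general divide instruction pays far more cycles per iteration.

```cpp
#include <cstdint>

// Number of Collatz steps needed to reach 1 from n.
// Hypothetical helper for illustration only. Assumes n >= 1 and that
// 3 * n + 1 never overflows 64 bits for the inputs used.
std::uint64_t collatz_steps(std::uint64_t n) {
    std::uint64_t steps = 0;
    while (n != 1) {
        // For an unsigned operand, optimising compilers typically lower
        // n / 2 to a right shift and n % 2 to a test of the low bit,
        // so no hardware divide is needed in this loop.
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}
```

For example, `collatz_steps(27)` returns 111. To see what your own compiler does with the loop, compile with optimisation on (for instance `-O2`) and inspect the generated assembly; the exact output depends on compiler and target.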