I'm still experimenting with UIs for live (sandboxed) evaluation of tests. I've realised that you really want to highlight the failing assertion, not just the failing test.
Feedback welcome :)
It's odd how lazy evaluation is generally seen as a niche design choice, yet the vast majority of languages treat `foo() || bar()` as short-circuiting.
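A minimal sketch of the point, in Python (which spells `||` as `or`). The names `foo`, `bar`, `eager_or` and `lazy_or` are hypothetical, made up for illustration. It compares the built-in operator with an ordinary function, which evaluates both arguments up front, and with a version that takes thunks and so delays the second operand:

```python
calls = []  # records which operands actually ran


def foo():
    calls.append("foo")
    return True


def bar():
    calls.append("bar")
    return False


def eager_or(a, b):
    # Ordinary function: both arguments were already evaluated by the caller.
    return a or b


def lazy_or(a, b):
    # Arguments are thunks (zero-argument callables); b() only runs if a() is falsy.
    return a() or b()


calls.clear()
result_builtin = foo() or bar()  # built-in operator short-circuits
builtin_calls = list(calls)

calls.clear()
result_eager = eager_or(foo(), bar())  # both operands run before the call
eager_calls = list(calls)

calls.clear()
result_lazy = lazy_or(foo, bar)  # bar is passed unevaluated and never called
lazy_calls = list(calls)

print(builtin_calls)  # ['foo']
print(eager_calls)    # ['foo', 'bar']
print(lazy_calls)     # ['foo']
```

The built-in operator behaves like `lazy_or`, not `eager_or`: its right-hand operand is effectively passed by need, which is lazy evaluation in miniature.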
I'm fascinated to learn that people are discovering weaknesses in state-of-the-art bots for playing Go, such that a novice player can reliably win: https://goattack.far.ai/human-evaluation
This suggests that self-play doesn't always generalise: beating earlier versions of yourself isn't sufficient to be robust against unfamiliar strategies.