I love how the CommonMark Spec has a test suite that's just a JSON array, with each entry pairing a Markdown input with its expected HTML. That makes it really easy to test a library for compliance, and I've seen developers nerd-sniped into full compliance.
https://spec.commonmark.org/0.31.2/spec.json
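To show how little harness this needs, here is a minimal sketch of a compliance runner. It assumes each entry carries `markdown`, `html`, `example` and `section` fields, which is my understanding of the spec.json layout. `run_spec`, `toy_render` and the two sample entries are made up for illustration; they are not real spec examples.

```python
def run_spec(examples, render):
    """Return one record per example where render(markdown) != expected html."""
    failures = []
    for ex in examples:
        actual = render(ex["markdown"])
        if actual != ex["html"]:
            failures.append({
                "example": ex["example"],
                "section": ex["section"],
                "expected": ex["html"],
                "actual": actual,
            })
    return failures


# Hypothetical entries in the same shape as spec.json (not copied from the spec).
SAMPLE = [
    {"markdown": "hello\n", "html": "<p>hello</p>\n",
     "example": 1, "section": "Paragraphs"},
    {"markdown": "*hi*\n", "html": "<p><em>hi</em></p>\n",
     "example": 2, "section": "Emphasis"},
]


def toy_render(markdown):
    """Stand-in renderer that only wraps text in <p>; it knows no emphasis."""
    return "<p>" + markdown.strip() + "</p>\n"


failures = run_spec(SAMPLE, toy_render)
for f in failures:
    print(f"example {f['example']} ({f['section']}): "
          f"expected {f['expected']!r}, got {f['actual']!r}")
```

Against the real file, you would load the array with `json.load` and pass your library's render function in place of `toy_render`.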
Related Posts
LLMs seem to handle dependency upgrades really well.
The task is well-specified, there's usually a build/test suite to check correctness of the modifications, and there's often a changelog they can consume too.
I'm still experimenting with UIs for live (sandboxed) evaluation of tests. I've realised that you really want to highlight the failing assertion, not just the failing test.
Feedback welcome :)
"Example Driven Development" using Glamorous Toolkit and Pharo Smalltalk: https://medium.com/feenk/an-example-of-example-driven-development-4dea0d995920
Tests that return values and compose with each other are a really interesting model. The dependencies between tests give the suite structure and show which test failure is the most 'fundamental'.
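Here is a minimal Python sketch of that model, to make the idea concrete. It is not how Glamorous Toolkit implements examples; the `Examples` class, its decorator and the cart examples are all hypothetical. Each example returns a value, later examples take that value as input, and anything downstream of a failure is marked "blocked" rather than "failed".

```python
class Examples:
    """Registry of examples that return values and build on each other."""

    def __init__(self):
        self._fns = {}     # name -> (function, names of examples it builds on)
        self._values = {}  # name -> value returned by a passing example
        self.status = {}   # name -> "passed" | "failed" | "blocked"

    def example(self, *deps):
        def register(fn):
            self._fns[fn.__name__] = (fn, deps)
            return fn
        return register

    def run(self, name):
        if name in self.status:
            return self.status[name]
        fn, deps = self._fns[name]
        # An example whose prerequisite did not pass is blocked, not failed.
        if any(self.run(dep) != "passed" for dep in deps):
            self.status[name] = "blocked"
            return "blocked"
        try:
            self._values[name] = fn(*(self._values[dep] for dep in deps))
            self.status[name] = "passed"
        except AssertionError:
            self.status[name] = "failed"
        return self.status[name]

    def run_all(self):
        for name in self._fns:
            self.run(name)
        return self.status


suite = Examples()


@suite.example()
def empty_cart():
    cart = []
    assert len(cart) == 0
    return cart


@suite.example("empty_cart")
def cart_with_one_item(cart):
    cart = cart + ["apple"]
    assert len(cart) == 2  # deliberately wrong: this is the fundamental failure
    return cart


@suite.example("cart_with_one_item")
def cart_after_removal(cart):
    cart = cart[:-1]
    assert len(cart) == 0
    return cart


result = suite.run_all()
print(result)
```

Only `cart_with_one_item` is reported as failed; `cart_after_removal` shows up as blocked, so the report points straight at the root cause.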