I've worked on many projects where tests have discrete levels, usually something like unit test, integration test, and end-to-end test.
I've also seen elaborate arguments over what counts as a unit, especially in heavily OO codebases.
Related Posts
I love how the CommonMark Spec has a test suite that's just a JSON array. It's really easy to test a library for compliance, and I've seen developers nerd-sniped into full compliance.
https://spec.commonmark.org/0.31.2/spec.json
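Here's a rough sketch of what checking a library against that file could look like in Python. The `markdown`, `html`, `example` and `section` keys are the fields each entry in spec.json carries; `load_spec`, `check_compliance` and the `render` callable are names I've made up for illustration, and `render` stands in for whatever Markdown-to-HTML function your library exposes.

```python
import json
from typing import Callable
from urllib.request import urlopen

SPEC_URL = "https://spec.commonmark.org/0.31.2/spec.json"


def load_spec(url: str = SPEC_URL) -> list[dict]:
    """Download the spec examples (a JSON array of objects)."""
    with urlopen(url) as response:
        return json.load(response)


def check_compliance(examples: list[dict], render: Callable[[str], str]) -> list[dict]:
    """Run every example through `render` and collect the mismatches.

    `render` is a hypothetical stand-in for the library under test:
    it takes Markdown source and returns HTML.
    """
    failures = []
    for case in examples:
        actual = render(case["markdown"])
        if actual != case["html"]:
            failures.append(
                {
                    "example": case["example"],
                    "section": case["section"],
                    "expected": case["html"],
                    "actual": actual,
                }
            )
    return failures
```

With a real library this would be something like `check_compliance(load_spec(), my_library.render)`, where `my_library` is a placeholder. An empty list back means every example matched exactly; strict string equality is an assumption here, and some harnesses normalise the HTML before comparing.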
I'm still experimenting with UIs for live (sandboxed) evaluation of tests. I've realised that you really want to highlight the failing assertion, not just the failing test.
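To make "the failing assertion" concrete, here's a minimal sketch of one way to find it in Python, assuming tests are plain functions that use `assert`. It isn't what my UI does internally; `failing_assertion_offset` is a hypothetical helper that walks the traceback and reports which line of the test body raised.

```python
from typing import Callable, Optional


def failing_assertion_offset(test: Callable[[], None]) -> Optional[int]:
    """Run `test` and return the offset (in lines, counted from the
    `def` line) of the statement inside it that led to an
    AssertionError, or None if the test passed.
    """
    try:
        test()
    except AssertionError as exc:
        line = None
        tb = exc.__traceback__
        # Walk the traceback and keep the frame belonging to the test
        # itself, so a failure raised inside a helper still points at
        # the line in the test that called it.
        while tb is not None:
            if tb.tb_frame.f_code is test.__code__:
                line = tb.tb_lineno
            tb = tb.tb_next
        if line is None:
            return None
        return line - test.__code__.co_firstlineno
    return None
```

A UI can then map that offset back onto the displayed source and highlight just that line, rather than marking the whole test red.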
Feedback welcome :)
LLMs seem to handle dependency upgrades really well.
The task is well-specified, there's usually a build/test suite to check correctness of the modifications, and there's often a changelog they can consume too.