So far, everyone who has tried difftastic on huge files has been diffing C or C++ source code. Maybe huge files are more common in those communities?
(Difftastic will eventually fall back to fast, dumb, line-based diffing if you give it a multi-megabyte source file.)
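A rough sketch of what that kind of fallback can look like, in Python. This is not difftastic's actual code: the `BYTE_LIMIT` value and the function names `choose_strategy` and `line_diff` are made up for illustration, and the line-based diff here is just the standard library's `difflib`.

```python
import difflib

# Hypothetical threshold, chosen for illustration only.
BYTE_LIMIT = 1_000_000


def choose_strategy(lhs: str, rhs: str, byte_limit: int = BYTE_LIMIT) -> str:
    """Return "line" once either input exceeds byte_limit, else "structural"."""
    largest = max(len(lhs.encode("utf-8")), len(rhs.encode("utf-8")))
    return "line" if largest > byte_limit else "structural"


def line_diff(lhs: str, rhs: str) -> list[str]:
    """The fast, dumb path: a plain line-based unified diff."""
    return list(
        difflib.unified_diff(lhs.splitlines(), rhs.splitlines(), lineterm="")
    )
```

The point of checking size up front is that the cheap path is chosen before any parsing happens, so a multi-megabyte file never reaches the expensive structural diff.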
On the challenge of writing accurate source spans on Unicode source code: https://reedmullanix.com/posts/unicode-source-spans.html
Also (see footnotes) a fair number of LSP clients assume UTF-8 despite early versions of LSP mandating UTF-16!
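To make the UTF-8 versus UTF-16 mismatch concrete, here is a small Python sketch that computes the same column three ways. The helper name `column_offsets` is hypothetical; it only relies on Python's built-in string encoding.

```python
def column_offsets(line: str, char_index: int) -> dict[str, int]:
    """Measure the text before char_index in three different units."""
    prefix = line[:char_index]
    return {
        # Unicode code points (what Python's own indexing uses).
        "code_points": len(prefix),
        # UTF-8 bytes (what a UTF-8-assuming client counts).
        "utf8_bytes": len(prefix.encode("utf-8")),
        # UTF-16 code units: every unit is exactly two bytes.
        "utf16_units": len(prefix.encode("utf-16-le")) // 2,
    }


# U+00E9 (e with acute) followed by U+1F389 (party popper), then "x".
line = "\u00e9\U0001F389x"
print(column_offsets(line, 2))
# {'code_points': 2, 'utf8_bytes': 6, 'utf16_units': 3}
```

The "x" sits at column 2, 6, or 3 depending on the unit, so a server and client that disagree about the encoding will highlight the wrong span on any line with non-ASCII text before the error.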
I'm experimenting with diagnostics formatting.
* I've added a left margin, showing both the file name and line numbers.
* I'm showing one line of context above/below the offending line.
* I'm using grey for comments.
What do you think? Is there anything you'd change?
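For anyone who wants to play with a similar layout, here is a minimal Python sketch of the three ideas above. It is an approximation, not the real renderer: `render_diagnostic`, its parameters, the `>` marker, and the exact margin layout are all assumptions, and only whole-line comments are greyed.

```python
GREY = "\x1b[90m"   # ANSI "bright black", shown as grey in most terminals
RESET = "\x1b[0m"


def render_diagnostic(file_name, source, line_no, message,
                      comment_prefix="//", color=True):
    """Render the offending line with one line of context on each side."""
    lines = source.splitlines()
    first = max(1, line_no - 1)           # one line above, if it exists
    last = min(len(lines), line_no + 1)   # one line below, if it exists
    width = max(len(file_name), len(str(last)))

    # The left margin holds the file name first, then the line numbers.
    out = [f"{file_name.rjust(width)} | {message}"]
    for n in range(first, last + 1):
        text = lines[n - 1]
        if color and text.lstrip().startswith(comment_prefix):
            text = f"{GREY}{text}{RESET}"
        marker = ">" if n == line_no else " "
        out.append(f"{str(n).rjust(width)} |{marker} {text}")
    return "\n".join(out)


example = "// setup\nlet x = 1 +;\nprint(x)\nunrelated()"
print(render_diagnostic("demo.txt", example, 2, "unexpected token"))
```

Clamping `first` and `last` keeps the context from running off either end of the file when the error is on the first or last line.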
Difftastic has been cited in a paper!
"Modernizing SMT-Based Type Error Localization": https://arxiv.org/abs/2408.09034
The authors use difftastic to work out which parts of a buggy program have actually changed, a great use case :)