An excellent summary of the problems that tree-sitter solves, how it differs from LSP, and why it's such a great fit for editors: https://www.masteringemacs.org/article/tree-sitter-complications-of-parsing-languages
I'm regularly impressed by how many parsers are available and how accurate they are.
Related Posts
On the challenge of writing accurate source spans on Unicode source code: https://reedmullanix.com/posts/unicode-source-spans.html
Also (see footnotes) a fair number of LSP clients assume UTF-8 despite early versions of LSP mandating UTF-16!
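To make the encoding mismatch concrete, here's a rough sketch of how the same position ends up with different column numbers depending on the unit you count in. The `column_offsets` helper and the sample line are made up for illustration; they aren't taken from the linked post or from any LSP implementation.

```python
def column_offsets(line: str, char_index: int) -> dict[str, int]:
    """Column of line[char_index], measured in three common units."""
    prefix = line[:char_index]
    return {
        # Python strings are indexed by Unicode code point.
        "code_points": len(prefix),
        # What a client counting UTF-8 bytes would report.
        "utf8_bytes": len(prefix.encode("utf-8")),
        # UTF-16 code units: every 2 bytes of the UTF-16 encoding is one unit.
        "utf16_units": len(prefix.encode("utf-16-le")) // 2,
    }


# Hypothetical source line with one 2-byte and one 4-byte (in UTF-8) character.
line = 'let s = "héllo 🦀"; x'
idx = line.index("x")

print(column_offsets(line, idx))
# {'code_points': 19, 'utf8_bytes': 23, 'utf16_units': 20}
```

All three numbers agree on pure ASCII lines, which is presumably why the mismatch goes unnoticed for so long.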
Whilst LLMs don't always give an accurate answer, the UI is really compelling. I keep finding users whose favourite way of doing research is an LLM.
Difftastic does syntax highlighting based on tree-sitter's parse of the *whole file*. That makes its highlighting more accurate than most diff tools can manage.
In this hunk, the opening " of the string literal isn't included, but difftastic still knows that the first lines are part of a string.
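A small sketch of the idea, using Python's standard `tokenize` module as a stand-in for tree-sitter. This is not how difftastic is implemented, and `SOURCE` and `lines_inside_strings` are invented for the example. The point is only that lexing the whole file first tells you which lines sit inside a string, even when a hunk starts after the opening quote.

```python
import io
import tokenize

# Hypothetical file: a string literal spanning lines 1-3.
SOURCE = '''\
text = """first line
second line
third line"""
print(text)
'''


def lines_inside_strings(source: str) -> set[int]:
    """Line numbers (1-based) covered by string tokens in the whole file."""
    covered: set[int] = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING:
            # tok.start and tok.end are (row, column) pairs.
            covered.update(range(tok.start[0], tok.end[0] + 1))
    return covered


string_lines = lines_inside_strings(SOURCE)

# A hunk covering lines 2-4 never sees the opening quotes on line 1,
# but the whole-file token stream still classifies each line correctly.
for lineno in (2, 3, 4):
    kind = "string" if lineno in string_lines else "code"
    print(lineno, kind)
```

A highlighter that only looked at the text of the hunk would have no way of knowing that line 2 is string content rather than two identifiers.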
