Several classes of security vulnerability come from misusing strings: command injection, SQL injection, and cross-site scripting.
How far could you go with a language that didn't have strings? You might need a Prose type that's a list of Unicode chars, but only use it for printing.
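A rough sketch of the idea, emulated in Python (which obviously does have strings). The `Prose` class and `find_user_ids` helper are hypothetical names invented for illustration: `Prose` can only be printed, and the SQL query keeps untrusted input as a separate parameter instead of splicing it into query text.

```python
from __future__ import annotations

import io
import sqlite3
from dataclasses import dataclass
from typing import TextIO


@dataclass(frozen=True)
class Prose:
    """Hypothetical print-only text: a tuple of Unicode code points."""
    chars: tuple[str, ...]

    @classmethod
    def of(cls, text: str) -> Prose:
        return cls(tuple(text))

    def print_to(self, out: TextIO) -> None:
        # Printing is the only operation offered: no concatenation, no formatting.
        out.write("".join(self.chars) + "\n")


def find_user_ids(db: sqlite3.Connection, name: str) -> list[int]:
    # The query shape is fixed; `name` travels separately as data,
    # so it cannot change the structure of the SQL.
    rows = db.execute("SELECT id FROM users WHERE name = ?", (name,))
    return [row[0] for row in rows]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

hostile = "alice' OR '1'='1"
matches = find_user_ids(db, hostile)  # treated as an (unknown) name, not as SQL
alice = find_user_ids(db, "alice")

out = io.StringIO()
Prose.of(f"{len(matches)} match(es)").print_to(out)
```

The hostile input matches nothing because it is only ever compared as a value. A language without strings would presumably make this the only way to build a query, rather than a convention you have to remember.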
miniblog.
Related Posts
On the challenge of writing accurate source spans on Unicode source code: https://reedmullanix.com/posts/unicode-source-spans.html
Also (see footnotes) a fair number of LSP clients assume UTF-8 despite early versions of LSP mandating UTF-16!
In LSP, a position is represented as a zero-based line number and a character offset within that line (counted in UTF-16 code units by default; since 3.17 the encoding can be negotiated): https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#position
This is pretty elegant. You'll get the correct line regardless of encoding bugs, and the editor already knows the line number so it's cheap to compute.
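A minimal sketch of computing such a position from an offset into a Python string, assuming the default UTF-16 encoding. The `lsp_position` name is made up for this example, and it only treats `\n` as a line break, whereas the spec also recognises a lone `\r`.

```python
def lsp_position(text: str, offset: int) -> tuple[int, int]:
    """Return (line, character) for a code point offset into `text`,
    counting the character column in UTF-16 code units."""
    before = text[:offset]
    line = before.count("\n")
    line_start = before.rfind("\n") + 1  # 0 when there is no earlier newline
    column_text = before[line_start:]
    # Code points above U+FFFF take two UTF-16 code units (a surrogate pair),
    # so encode and count 2-byte units instead of using len().
    character = len(column_text.encode("utf-16-le")) // 2
    return line, character


text = "let a = 1\nlet \U0001F600 = 2\n"
offset = text.index("=", 10)          # the "=" on the second line
position = lsp_position(text, offset)  # (1, 7)
```

The `=` is the seventh code point on its line (index 6), but the emoji before it occupies two UTF-16 code units, so the column is 7. A client counting UTF-8 bytes would say 9. All three agree that the line is 1.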
Today in aggravating edge cases: difftastic would crash when line-wrapping content where Unicode combining characters occurred on the boundary. Argh.
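To illustrate the kind of boundary involved (this is not difftastic's actual code, which is written in Rust), here is a small sketch that wraps text without separating a combining mark from its base character. The `wrap_clusters` name is hypothetical, and `unicodedata.combining` is only an approximation of real grapheme clusters: it misses things like zero-width joiners and variation selectors.

```python
import unicodedata


def wrap_clusters(text: str, width: int) -> list[str]:
    """Split `text` into chunks of at most `width` base characters,
    keeping combining marks attached to the character they follow.
    Assumes width >= 1."""
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch  # attach the mark to the previous base character
        else:
            clusters.append(ch)
    return ["".join(clusters[i:i + width]) for i in range(0, len(clusters), width)]


word = "cafe\u0301s"               # "e" followed by a combining acute accent
naive = [word[:4], word[4:]]       # splits between "e" and its accent
wrapped = wrap_clusters(word, 4)   # keeps the accent with the "e"
```

The naive slice leaves the accent stranded at the start of the second chunk, which is exactly the sort of boundary that is easy to get wrong when widths are counted in code points.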