Description
@davidanthoff asked on Zulip about incremental reparsing.
is there support for partial reparses, i.e. some sort of incremental parsing? Basic idea is that user presses one key in the editor, and we don't want to reparse the whole document on every key press, but only a subset, based on the precise range of the doc that was edited
To capture my thoughts on this somewhere more permanent: I think this should work fine, but there are a couple of tricky things to work out.
First, how are the changed bytes supplied to the parser system? I haven't looked into LanguageServer yet, but presumably it's "insert this byte here" or "change line 10 to 'such-and-such' string". Those might require a representation of the source which isn't a `String` (or `Vector{UInt8}` buffer). It might be a rope data structure or something? Should we extend the `SourceFile` abstraction to allow different `AbstractString` types? Or perhaps this state should be managed outside the parser completely? Internally, I feel the lexer and parser should always operate on `Vector{UInt8}` as a concrete, efficient data structure for UTF-8 encoded text, so the subrange of text which is being parsed should probably be copied into one of these for use by the tokenizer.
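To make the "edit in, flat byte buffer out" idea concrete, here is a minimal sketch using only Base Julia. `ByteEdit`, `apply_edit` and `bytes_for_reparse` are hypothetical names invented for illustration; they are not part of JuliaSyntax or LanguageServer, and the sketch assumes the editor side has already translated its edit into a byte range of the old buffer.

```julia
# Hypothetical edit description (not a JuliaSyntax or LanguageServer type):
# replace the bytes `range` of the old buffer with `new_bytes`.
# An empty range such as 7:6 means "insert before byte 7".
struct ByteEdit
    range::UnitRange{Int}
    new_bytes::Vector{UInt8}
end

# Apply the edit by building a fresh flat buffer. A rope would avoid the
# full copy; this is just the simplest thing that works.
function apply_edit(buf::Vector{UInt8}, edit::ByteEdit)
    lo, hi = first(edit.range), last(edit.range)
    return vcat(buf[1:lo-1], edit.new_bytes, buf[hi+1:end])
end

# Copy the subrange that is about to be reparsed into its own
# Vector{UInt8}, so a tokenizer can work on a concrete contiguous buffer.
bytes_for_reparse(buf::Vector{UInt8}, span::UnitRange{Int}) = buf[span]

old = Vector{UInt8}("x = 1\ny = 2\n")
new = apply_edit(old, ByteEdit(5:5, Vector{UInt8}("42")))   # replace the "1"
chunk = bytes_for_reparse(new, 1:7)                          # first line only
```

Whether the outer document state lives in a rope or a plain buffer, only `bytes_for_reparse` would need to change; the tokenizer would still see a `Vector{UInt8}`.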
Second, the new source text intersects with the existing parse tree node(s) which cover some range of bytes. There can be several such nodes nested together; which one do we choose? Equivalently, which production (`JuliaSyntax.parse_*` function) do we start reparsing from? Starting deeper in the tree is good because it implies a smaller span, but the parser may have nontrivial state which isn't explicit in the parse tree. For example, space-sensitive parsing within `[]` or macro calls. Or the special parsing of `in` as `=` within the iterator specification of a `for` loop. So we'd need a list of rules to specify which productions we can restart parsing from, and correctly reconstruct the `ParseState` for those cases. To start with, toplevel/module scope is probably fine and we could throw something together quickly for that, I think.
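As a rough illustration of the "list of rules" idea, the sketch below picks the deepest node that fully covers the edit and whose kind is on an allowlist of restartable productions. The `Node` type, the `RESTARTABLE` set and `reparse_root` are all hypothetical and deliberately independent of the real JuliaSyntax tree types. It also assumes that every kind on the allowlist starts from a fresh parser state, which is plausible for toplevel/module scope but would need checking for anything deeper.

```julia
# Toy tree node (hypothetical; stands in for a real syntax tree node).
struct Node
    kind::Symbol
    range::UnitRange{Int}      # byte range covered by this node
    children::Vector{Node}
end

# Productions we assume are safe to restart from with a default parser state.
const RESTARTABLE = Set([:toplevel, :module])

# True if `edit` lies entirely inside `node`. An empty edit range such as
# 5:4 (a pure insertion before byte 5) is handled by the same comparison.
covers(node::Node, edit::UnitRange{Int}) =
    first(node.range) <= first(edit) && last(edit) <= last(node.range)

# Walk down from `node`, remembering the deepest restartable node that
# still covers the edit. Returns `nothing` if no such node exists.
function reparse_root(node::Node, edit::UnitRange{Int}, best=nothing)
    covers(node, edit) || return best
    if node.kind in RESTARTABLE
        best = node
    end
    for child in node.children
        best = reparse_root(child, edit, best)
    end
    return best
end

# toplevel(1:40) ── module(1:30) ── function(5:20) ── for(8:15)
#               └─ call(32:40)
forloop = Node(:for, 8:15, Node[])
func    = Node(:function, 5:20, [forloop])
mod     = Node(:module, 1:30, [func])
call    = Node(:call, 32:40, Node[])
top     = Node(:toplevel, 1:40, [mod, call])

inside_for  = reparse_root(top, 10:11)   # edit inside the for loop
inside_call = reparse_root(top, 33:34)   # edit inside the trailing call
```

With only toplevel and module on the allowlist, an edit inside the `for` loop falls back to reparsing the enclosing module, which is wasteful but avoids having to reconstruct state such as the `in`-as-`=` rule. Adding more kinds to `RESTARTABLE` would shrink the reparsed span at the cost of one state-reconstruction rule per kind.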