Incremental reparsing #5

Open
@c42f

@davidanthoff asked on Zulip about incremental reparsing.

> is there support for partial reparses, i.e. some sort of incremental parsing? Basic idea is that user presses one key in the editor, and we don't want to reparse the whole document on every key press, but only a subset, based on the precise range of the doc that was edited

To capture my thoughts on this somewhere more permanent: I think this should work fine, but there are a couple of tricky things to work out.

First, how are the changed bytes supplied to the parser system? I haven't looked into LanguageServer yet, but presumably it's "insert this byte here" or "change line 10 to 'such-and-such' string". Those might require a representation of the source which isn't a `String` (or `Vector{UInt8}` buffer); it might be a rope data structure or something similar. Should we extend the `SourceFile` abstraction to allow different `AbstractString` types? Or perhaps this state should be managed outside the parser completely? Internally, I feel the lexer and parser should always operate on `Vector{UInt8}` as a concrete, efficient data structure for UTF-8 encoded text, so the subrange of text being parsed should probably be copied into one of these for use by the tokenizer.
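To make the "change line 10 to such-and-such" case concrete, here is a minimal sketch in Python (not JuliaSyntax or LanguageServer API). It assumes the source is held as a flat UTF-8 byte buffer and that an edit can be reduced to "remove `old_len` bytes at `start`, insert `new_text`". The names `ByteEdit`, `replace_line` and `apply_edit` are hypothetical and exist only for illustration; a rope would replace the flat-buffer slicing in `apply_edit`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteEdit:
    """Hypothetical edit record: all positions are UTF-8 byte offsets."""
    start: int        # byte offset where the edit begins
    old_len: int      # number of bytes removed from the old buffer
    new_text: bytes   # UTF-8 bytes inserted at `start`


def line_start_offsets(buf: bytes) -> list:
    """Byte offset of the first byte of every line in `buf`."""
    offsets = [0]
    for i, b in enumerate(buf):
        if b == 0x0A:  # '\n'
            offsets.append(i + 1)
    return offsets


def replace_line(buf: bytes, line_no: int, new_line: str) -> ByteEdit:
    """Turn "change line `line_no` (1-based) to `new_line`" into a byte edit.

    The trailing newline of the old line is left in place.
    """
    start = line_start_offsets(buf)[line_no - 1]
    end = buf.find(b"\n", start)
    if end == -1:
        end = len(buf)  # last line has no trailing newline
    return ByteEdit(start, end - start, new_line.encode("utf-8"))


def apply_edit(buf: bytes, edit: ByteEdit) -> bytes:
    """Return a new buffer with the edit applied (flat copy, no rope)."""
    return buf[:edit.start] + edit.new_text + buf[edit.start + edit.old_len:]


old = "x = 1\ny = α + 2\nz = 3\n".encode("utf-8")
edit = replace_line(old, 2, "y = β")
new = apply_edit(old, edit)
```

Keeping the edit in byte offsets means the changed range can be compared directly against the byte spans stored in the parse tree, whatever representation the editor side uses.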

Second, the new source text intersects with the existing parse tree node(s) which cover some range of bytes. There can be several such nodes nested together; which one do we choose? Equivalently, which production (`JuliaSyntax.parse_*` function) do we start reparsing from? Starting deeper in the tree is good because it implies a smaller span, but the parser may have nontrivial state which isn't explicit in the parse tree. Examples are whitespace-sensitive parsing within `[]` or macro calls, and the special parsing of `in` as `=` within the iterator specification of a `for` loop. So we'd need a list of rules to specify which productions we can restart parsing from, and a way to correctly reconstruct the `ParseState` for those cases. To start with, toplevel/module scope is probably fine, and I think we could throw something together quickly for that.
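The node-selection question can be sketched as follows, again in Python rather than against the real JuliaSyntax tree types. The `Node` class, the `RESTARTABLE` set and `find_reparse_node` are all hypothetical; the only assumption taken from the discussion above is that toplevel and module scope are safe places to restart. The function walks down from the root and remembers the deepest node that both covers the edited byte range and has a kind on the restartable list.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical parse tree node covering bytes [start, end)."""
    kind: str
    start: int
    end: int
    children: list = field(default_factory=list)


# Assumed list of kinds whose production can be restarted without
# reconstructing hidden parser state. Only these two are suggested above.
RESTARTABLE = {"toplevel", "module"}


def covers(node: Node, lo: int, hi: int) -> bool:
    """True if the edited byte range [lo, hi) lies inside the node's span."""
    return node.start <= lo and hi <= node.end


def find_reparse_node(root: Node, lo: int, hi: int):
    """Deepest restartable node covering [lo, hi), or None if there is none."""
    if not covers(root, lo, hi):
        return None
    best = root if root.kind in RESTARTABLE else None
    node = root
    while True:
        # Children do not overlap, so at most one can contain a non-empty range.
        child = next((c for c in node.children if covers(c, lo, hi)), None)
        if child is None:
            return best
        if child.kind in RESTARTABLE:
            best = child
        node = child


tree = Node("toplevel", 0, 40, [
    Node("module", 0, 30, [
        Node("for", 5, 25, [Node("call", 10, 20)]),
    ]),
    Node("call", 31, 40),
])

inside_loop = find_reparse_node(tree, 12, 14)     # edit inside the for loop
after_module = find_reparse_node(tree, 32, 33)    # edit in the trailing call
across_nodes = find_reparse_node(tree, 28, 35)    # edit spanning two children
```

In this sketch the edit inside the `for` loop falls back to the enclosing `module` node, because `for` is not on the restartable list; adding more kinds to that list would shrink the reparsed span, at the cost of having to rebuild the matching parser state for each one.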
