Incremental parsing is ineffective when a new tag is opened

Because a change in the external scanner state prevents reuse of nodes that were created with a different state, opening a new tag (a common editing operation) somewhere in a big document causes everything after that tag to be re-parsed. (The same goes for removing or renaming an opening tag.) This is somewhat similar to what happens when you open a block comment marker, but there the structure of the content actually changes. Here you will, in most situations, just end up creating the exact same nodes.

But not in all situations, I guess, due to the affront to context-freedom that is implicitly closed elements. So I guess this behavior is defensible. But it seems unfortunate. Is it a conscious design decision? Have there been any attempts at workarounds? (Like maybe a mechanism that makes the dependence on scanner state more fine-grained and disables the state comparison for nodes that have no scanner-state-dependent tokens in them. I _think_ that could be made to work for the HTML case, but it may not be worthwhile in any other situation. I noticed the scanner state approach fits the Python scanner very well: there it encodes exactly the thing you need, without wasting any useful opportunities for reuse.)
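To make the above concrete, here is a toy model in plain JavaScript. It does not use tree-sitter and is not how the real scanner is implemented. It assumes the scanner state is simply the stack of open tag names, and that a token can be reused only when the state in front of it is unchanged. The names `scannerStates` and `reusableAfterEdit`, and the `dependsOnState` predicate standing in for the fine-grained mechanism, are all made up for this sketch.

```javascript
// Toy model (assumption): the scanner state is the stack of open tag names.
// Returns, for each token, the serialized state just before that token.
function scannerStates(tokens) {
  let stack = []
  return tokens.map(tok => {
    let state = stack.join("/")
    let open = /^<([a-z]+)>$/.exec(tok)
    let close = /^<\/([a-z]+)>$/.exec(tok)
    if (open) stack.push(open[1])
    else if (close && stack[stack.length - 1] === close[1]) stack.pop()
    return state
  })
}

// Counts the old tokens after the insertion point that could be reused.
// By default every token depends on the state (the strict compare).
// Passing `dependsOnState` models the hypothetical fine-grained check.
function reusableAfterEdit(oldTokens, insertAt, inserted, dependsOnState = () => true) {
  let newTokens = [...oldTokens.slice(0, insertAt), ...inserted, ...oldTokens.slice(insertAt)]
  let before = scannerStates(oldTokens), after = scannerStates(newTokens)
  let reusable = 0
  for (let i = insertAt; i < oldTokens.length; i++) {
    let sameState = before[i] === after[i + inserted.length]
    if (sameState || !dependsOnState(oldTokens[i])) reusable++
  }
  return reusable
}

let sampleTokens = ["<html>", "<body>", "<p>", "text", "</p>", "<p>", "text", "</p>"]
// Assumption for the sketch: only tags consult the scanner state, text does not.
let onlyTags = tok => tok.startsWith("<")

console.log(reusableAfterEdit(sampleTokens, 2, ["<div>"]))           // 0
console.log(reusableAfterEdit(sampleTokens, 2, ["<div>", "</div>"])) // 6
console.log(reusableAfterEdit(sampleTokens, 2, ["<div>"], onlyTags)) // 2
```

In this model an unclosed `<div>` invalidates every following token, a balanced `<div></div>` invalidates none, and the fine-grained check keeps the tokens that are assumed not to consult the scanner state.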

As always, feel free to close as 'out of scope'.

<details><summary>Crude benchmark</summary>

```js
let Parser = require("tree-sitter")
let p = new Parser
p.setLanguage(require("tree-sitter-html"))

function time(name, f) {
  f() // warmup
  for (let t0 = Date.now(), count = 1, t;; count++) {
    f()
    if ((t = Date.now() - t0) > 1000) {
      console.log(name, (count / (t / 1000)).toFixed(2) + "/s")
      break
    }
  }
}

let doc = "<html>\n  <body>\n" + "    <p>Lots of <span>content</span> here</p>\n".repeat(10000)

time("Parse from scratch", () => p.parse(doc))

let ast2 = p.parse(doc)
let doc2 = doc.slice(0, 15) + "<div>" + doc.slice(15)
ast2.edit({
  startIndex: 15,
  oldEndIndex: 15,
  newEndIndex: 20,
  startPosition: {row: 1, column: 8},
  oldEndPosition: {row: 1, column: 8},
  newEndPosition: {row: 1, column: 13},
})

time("Opened new tag", () => p.parse(doc2, ast2))

let ast3 = p.parse(doc)
let doc3 = doc.slice(0, 15) + "<div></div>" + doc.slice(15)
ast3.edit({
  startIndex: 15,
  oldEndIndex: 15,
  newEndIndex: 26,
  startPosition: {row: 1, column: 8},
  oldEndPosition: {row: 1, column: 8},
  newEndPosition: {row: 1, column: 19},
})

time("Inserted new tag", () => p.parse(doc3, ast3))

// On my machine:
// Parse from scratch 9.36/s
// Opened new tag 9.31/s
// Inserted new tag 49.65/s
```

</details>

Incremental parsing is ineffective when a new tag is opened #23
