I was doing some testing to bring down the parser size. I tried deleting some of the operators and noticed a significant difference when changing the arrow operators.
Case 1 (master, f1baa5f)
arrow: `
<-- --> <-->
← → ↔ ↚ ↛ ↞ ↠ ↢ ↣ ↦ ↤ ↮ ⇎ ⇍ ⇏ ⇐ ⇒ ⇔ ⇴ ⇶ ⇷ ⇸ ⇹ ⇺ ⇻ ⇼ ⇽ ⇾ ⇿ ⟵ ⟶ ⟷ ⟹ ⟺ ⟻ ⟼ ⟽ ⟾ ⟿
⤀ ⤁ ⤂ ⤃ ⤄ ⤅ ⤆ ⤇ ⤌ ⤍ ⤎ ⤏ ⤐ ⤑ ⤔ ⤕ ⤖ ⤗ ⤘ ⤝ ⤞ ⤟ ⤠ ⥄ ⥅ ⥆ ⥇ ⥈ ⥊ ⥋ ⥎ ⥐ ⥒ ⥓ ⥖ ⥗ ⥚ ⥛ ⥞
⥟ ⥢ ⥤ ⥦ ⥧ ⥨ ⥩ ⥪ ⥫ ⥬ ⥭ ⥰ ⧴ ⬱ ⬰ ⬲ ⬳ ⬴ ⬵ ⬶ ⬷ ⬸ ⬹ ⬺ ⬻ ⬼ ⬽ ⬾ ⬿ ⭀ ⭁ ⭂ ⭃ ⥷ ⭄ ⥺ ⭇ ⭈ ⭉
⭊ ⭋ ⭌ ← → ⇜ ⇝ ↜ ↝ ↩ ↪ ↫ ↬ ↼ ↽ ⇀ ⇁ ⇄ ⇆ ⇇ ⇉ ⇋ ⇌ ⇚ ⇛ ⇠ ⇢ ↷ ↶ ↺ ↻
`,
❯ du -sh src/parser.c
49M src/parser.c
❯ cat src/parser.c | rg "#define.*STATE"
#define STATE_COUNT 19881
#define LARGE_STATE_COUNT 9618
Case 2 (https://github.com/ChrHorn/tree-sitter-julia/commit/e66d1bf1a73e4e42e86a70830e0d02c2016cc92d)
Deleted most of the arrow operators.
arrow: `
<-- --> <-->
← → ↔
`,
No visible change in state counts or parser size.
❯ du -sh src/parser.c
49M src/parser.c
❯ cat src/parser.c | rg "#define.*STATE"
#define STATE_COUNT 19881
#define LARGE_STATE_COUNT 9618
Case 3 (https://github.com/ChrHorn/tree-sitter-julia/commit/e64b8fcfd7fcc78fdfeacd54b145ab265367799f)
Note that the only difference from Case 2 is the one deleted arrow operator, ↔.
arrow: `
<-- --> <-->
← →
`,
This leads to a significant reduction in state counts and parser size.
❯ du -sh src/parser.c
32M src/parser.c
❯ cat src/parser.c | rg "#define.*STATE"
#define STATE_COUNT 12760
#define LARGE_STATE_COUNT 5869
I'm not really sure what's going on. I don't think it's Unicode; for example,
arrow: `
⥟ ⥢ ⥤ ⥦ ⥧ ⥨ ⥩ ⥪ ⥫ ⥬ ⥭ ⥰ ⧴ ⬱ ⬰ ⬲ ⬳ ⬴ ⬵ ⬶ ⬷ ⬸ ⬹ ⬺ ⬻ ⬼ ⬽ ⬾ ⬿ ⭀ ⭁ ⭂ ⭃ ⥷ ⭄ ⥺ ⭇ ⭈ ⭉
`,
also results in a smaller parser. I have only noticed this behavior when changing the arrow operators. The change is always binary (either the smaller size or the current, larger one), with nothing in between.
@savq are you able to reproduce this on your end? Any idea what causes it?
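To make the comparison between cases easier to repeat, here is a minimal sketch in Python that reads the same two `#define` values the `rg` command above prints, plus the file size. The helper names `state_counts` and `report` are hypothetical, and the default path `src/parser.c` is simply the one used in the commands above.

```python
import re
from pathlib import Path

# Matches lines such as "#define STATE_COUNT 19881" in the generated parser.
DEFINE_RE = re.compile(r"^#define[ \t]+(\w*STATE\w*)[ \t]+(\d+)[ \t]*$", re.MULTILINE)


def state_counts(source: str) -> dict[str, int]:
    """Return every '#define ...STATE... <number>' found in the source text."""
    return {name: int(value) for name, value in DEFINE_RE.findall(source)}


def report(path: str = "src/parser.c") -> None:
    """Print the file size and state counts for one generated parser."""
    parser = Path(path)
    counts = state_counts(parser.read_text(encoding="utf-8", errors="replace"))
    size_mib = parser.stat().st_size / (1024 * 1024)
    print(f"{path}: {size_mib:.0f} MiB, {counts}")
```

Calling `report()` after regenerating the parser for each variant of the arrow list gives one line per case that can be pasted side by side.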