Thanks for releasing this model.
Have you run any passkey retrieval tests?
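For context, by passkey retrieval I mean the usual setup of burying a random key inside long filler text and asking the model to recall it. A rough, hypothetical sketch of how such a prompt is typically built (the helper name and filler text are my own, not from any released eval):

```python
# Hypothetical sketch of a passkey retrieval prompt: bury a random key inside
# long filler text at a chosen depth, then ask the model to recall it.
import random

def build_passkey_prompt(n_filler: int = 300, depth: float = 0.5) -> tuple[str, str]:
    passkey = str(random.randint(10000, 99999))
    filler = "The grass is green. The sky is blue. The sun is yellow. "
    chunks = [filler] * n_filler
    chunks.insert(int(n_filler * depth), f"The pass key is {passkey}. Remember it. ")
    prompt = "".join(chunks) + "\nWhat is the pass key?"
    return prompt, passkey

prompt, expected = build_passkey_prompt()
# Feed `prompt` to the model and check whether `expected` appears in its output.
```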
I note the use of a sliding window for attention. Although this gives an effective attention width of roughly n_layers * window_len, work such as LM-Infinite suggests that may not be enough for good passkey retrieval. Granted, they are extending context without fine-tuning, which is a different task.
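To make the n_layers * window_len point concrete, here is a minimal sketch (my own illustration, not the actual Mistral implementation): with a causal sliding-window mask of width W, information can hop back at most W tokens per layer, so after L layers the indirect receptive field is roughly L * W tokens.

```python
# Minimal sketch of how the receptive field compounds across layers under a
# causal sliding-window attention mask (illustrative only).
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)   # causal + banded

seq_len, window, n_layers = 64, 8, 4
mask = sliding_window_mask(seq_len, window)

# Compose the mask across layers: reach[i, j] == True means input token j can
# still influence position i after that many layers of attention.
reach = mask.copy()
for _ in range(n_layers - 1):
    reach = (reach.astype(int) @ mask.astype(int)) > 0

print(reach[-1].sum())   # ~ n_layers * window tokens reachable from the last position
```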
The launch post says that using a sliding window does not affect quality. How did you measure that?
Also, does Mistral 7B use only the sliding window, or does it also attend to historical chunks?