Passkey retrieval results #22


Open
RonanKMcGovern opened this issue Sep 30, 2023 · 0 comments

Thanks for releasing this model.

Have you run any passkey retrieval tests?
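For reference, a passkey retrieval test of the kind asked about here could be set up roughly as in the sketch below: a random passkey is buried at a chosen depth in repetitive filler text, and the model is asked to repeat it. The function names, filler sentence, and prompt wording are illustrative assumptions, not anything taken from the Mistral codebase or a specific benchmark.

```python
import random

# Assumed filler text; any repetitive, irrelevant sentence would do.
FILLER = "The grass is green. The sky is blue. The sun is yellow. "


def build_passkey_prompt(n_filler: int, depth: float, seed: int = 0) -> tuple[str, str]:
    """Return (prompt, passkey) with the passkey placed at a relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    rng = random.Random(seed)
    passkey = str(rng.randint(10000, 99999))
    needle = f"The pass key is {passkey}. Remember it. "
    before = int(n_filler * depth)  # filler repetitions placed before the needle
    body = FILLER * before + needle + FILLER * (n_filler - before)
    prompt = (
        "There is a pass key hidden in the text below. Find it and memorize it.\n\n"
        + body
        + "\n\nWhat is the pass key? The pass key is"
    )
    return prompt, passkey


def is_correct(model_output: str, passkey: str) -> bool:
    """Count a retrieval as correct if the passkey appears in the model's output."""
    return passkey in model_output
```

Sweeping `n_filler` and `depth` and recording the fraction of correct retrievals would show whether accuracy drops once the passkey sits further back than the window length.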

I note the use of a sliding window for attention. Although stacking layers gives a theoretical attention span of roughly n_layers * window_len tokens, some work (e.g. LM-Infinite) seems to suggest that this isn't enough to get good passkey retrieval. Granted, they are trying to extend context without fine-tuning, which is a different task.
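To make the n_layers * window_len arithmetic concrete, here is a rough sketch. The values 32 layers and a 4096-token window are what I understand the released Mistral 7B configuration to be, so treat them as assumptions; the helper names are hypothetical.

```python
def theoretical_span(n_layers: int, window_len: int) -> int:
    """Upper bound on how far back information can propagate through stacked windows."""
    return n_layers * window_len


def hops_needed(distance: int, window_len: int) -> int:
    """Approximate number of layers a token's information must pass through
    to reach a position `distance` tokens later (ceiling division)."""
    return -(-distance // window_len)


# Assumed Mistral 7B values: 32 layers, sliding window of 4096 tokens.
span = theoretical_span(32, 4096)
hops = hops_needed(20_000, 4096)
print(span, hops)
```

A passkey 20,000 tokens back is well inside the theoretical span, but it can only be reached indirectly through several layers of hidden states, which is why a direct retrieval measurement would be informative.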

The launch post says that the use of a sliding window does not affect quality. How did you measure that?

Also, is Mistral 7B using only the sliding window, or is it also attending to chunks of earlier context?
