Feat/optimize model gateway #398

Merged
merged 3 commits into from
Oct 31, 2024

mnvsk97
Contributor

@mnvsk97 mnvsk97 commented Oct 30, 2024

  1. Avoid creating a model instance for every request; instead, cache instances keyed by model name, config, and other metadata.
  2. Add a simple local dict-based cache for embedding, LLM, reranker, and audio models.
  3. Always check the cache before creating a model instance, so that item 1 holds.
  4. Add the cachetools library to support simple caching now and more complex cases in the future.
  5. Add documentation for each method in the file.
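
The caching approach in items 1 to 3 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ModelCache`, `make_key`, `get_or_create`, and the `factory` argument are hypothetical names, and the key layout (model type, model name, serialized config) is an assumption based on the description above.

```python
import json
import threading
from typing import Any, Callable, Dict, Tuple

CacheKey = Tuple[str, str, str]


class ModelCache:
    """Hypothetical dict-based cache of model instances (illustrative only)."""

    def __init__(self) -> None:
        self._instances: Dict[CacheKey, Any] = {}
        self._lock = threading.Lock()

    @staticmethod
    def make_key(model_type: str, model_name: str, config: Dict[str, Any]) -> CacheKey:
        # Serialize the config with sorted keys so that equal configs
        # produce the same key regardless of dict insertion order.
        return (model_type, model_name, json.dumps(config, sort_keys=True, default=str))

    def get_or_create(
        self,
        model_type: str,
        model_name: str,
        config: Dict[str, Any],
        factory: Callable[[], Any],
    ) -> Any:
        key = self.make_key(model_type, model_name, config)
        with self._lock:
            # Check the cache first; build the instance only on a miss.
            if key not in self._instances:
                self._instances[key] = factory()
            return self._instances[key]


# Usage: the factory stands in for whatever constructs the real model client.
cache = ModelCache()
first = cache.get_or_create("embedding", "example-model", {"dim": 8}, factory=object)
second = cache.get_or_create("embedding", "example-model", {"dim": 8}, factory=object)
assert first is second  # the second request reuses the cached instance
```

If bounded size or eviction is needed later, the plain dict could plausibly be swapped for a mapping such as `cachetools.LRUCache(maxsize=...)`, which exposes the same mutable-mapping interface; whether the PR does this is not shown here.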

@mnvsk97 mnvsk97 enabled auto-merge (squash) October 30, 2024 08:46
@mnvsk97 mnvsk97 merged commit dc520ae into truefoundry:main Oct 31, 2024
1 check passed
S1LV3RJ1NX pushed a commit that referenced this pull request Nov 29, 2024
* feat: model gateway optimization

---------

Co-authored-by: Sai krishna <[email protected]>
