[Fix] stream for llm completion #345
Conversation
Walkthrough

The pull request modifies the `completion` method of the `LLMEngine` to support streaming: when `stream=True`, content parts are yielded through a generator; otherwise the complete response is validated and returned. A `StreamChat` class is also added to `edenai_apis/features/llm/chat/chat_dataclass.py`.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant C as Client
    participant L as LLMEngine
    participant R as Response
    participant V as Validator
    C->>L: call completion(request, stream)
    alt stream is True
        L->>R: Receive streaming response
        R->>L: Send response part
        L-->>C: Yield content part via generator
    else stream is False
        L->>R: Receive complete response
        L->>V: Validate response
        V-->>L: Validated response
        L-->>C: Return validated response
    end
```
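For readers following the diagram, the branching it describes corresponds roughly to the Python sketch below, reconstructed from the diff quoted later in this review. The helper name `_handle_completion_response` is hypothetical; only the body mirrors the PR.

```python
from edenai_apis.llmengine.types.response_types import ResponseModel


def _handle_completion_response(response, stream: bool):
    """Hypothetical helper sketching the post-call handling in LLMEngine.completion."""
    if stream:
        # stream=True: lazily yield only the text delta of each streamed chunk.
        return (part.choices[0].delta.content or "" for part in response)
    # stream=False: validate the complete response before returning it.
    return ResponseModel.model_validate(response)
```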
Actionable comments posted: 0
🧹 Nitpick comments (2)
edenai_apis/llmengine/llm_engine.py (2)

11-11: Remove unused import

The `stream_chunk_builder` from `litellm` is imported but not used in the code.

```diff
-from litellm import stream_chunk_builder
```
🧰 Tools
🪛 Ruff (0.8.2)
11-11: `litellm.stream_chunk_builder` imported but unused

Remove unused import: `litellm.stream_chunk_builder`

(F401)
847-854: Consider consistent return types across stream handling methods

While this implementation works, it's worth noting that other streaming methods in this class (like `chat` and `multimodal_chat`) wrap their stream responses in a `ResponseType` object, while this method returns the raw generator. Consider standardizing this pattern for consistency.

```diff
-        if stream:
-            streaming_response = (
-                part.choices[0].delta.content or "" for part in response
-            )
-            return streaming_response
-        else:
-            response = ResponseModel.model_validate(response)
-            return response
+        if stream:
+            streaming_response = (
+                part.choices[0].delta.content or "" for part in response
+            )
+            return ResponseType[str](
+                original_response=None,
+                standardized_response=streaming_response,
+            )
+        else:
+            response = ResponseModel.model_validate(response)
+            return response
```
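To illustrate the caller-side effect of the suggested wrapping, here is a minimal sketch of how the stream would be consumed if `completion` returned `ResponseType[str]`; `engine` and `request` are placeholders, not names from the PR.

```python
def print_streamed_completion(engine, request) -> None:
    # Hypothetical sketch: with the suggested ResponseType[str] wrapper, the stream
    # is read from `standardized_response`, matching the other streaming methods.
    result = engine.completion(request, stream=True)
    for chunk in result.standardized_response:
        print(chunk, end="", flush=True)
```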
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
edenai_apis/llmengine/llm_engine.py (2 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
edenai_apis/llmengine/llm_engine.py (1)
edenai_apis/llmengine/types/response_types.py (1)
`ResponseModel` (56-119)
🪛 Ruff (0.8.2)
edenai_apis/llmengine/llm_engine.py
11-11: `litellm.stream_chunk_builder` imported but unused

Remove unused import: `litellm.stream_chunk_builder`

(F401)
🔇 Additional comments (1)
edenai_apis/llmengine/llm_engine.py (1)
847-854: Looks good - streaming implementation works correctly

The implementation correctly handles streaming by returning a generator that yields content from each part of the response when `stream=True`, and returning the validated response otherwise.
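Concretely, with the current implementation a caller receives a plain generator of text deltas when `stream=True`, as in this minimal sketch; `engine` and `request` are placeholders, not names from the PR.

```python
def collect_completion_text(engine, request) -> str:
    # Sketch of consuming the raw generator returned by completion(stream=True);
    # each yielded item is already a text delta (empty string when no content).
    parts = []
    for delta in engine.completion(request, stream=True):
        parts.append(delta)
    return "".join(parts)
```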
Actionable comments posted: 0
🧹 Nitpick comments (1)
edenai_apis/features/llm/chat/chat_dataclass.py (1)
217-218: LGTM! Consider adding documentation

The new `StreamChat` class is well-structured and correctly implements the generator pattern for streaming responses. The proper type annotation `Generator[ModelResponseStream, None, None]` ensures type safety.

For better maintainability, consider adding a class docstring explaining its purpose and relationship to the streaming functionality in the `LLMEngine`.

```diff
 class StreamChat(BaseModel):
+    """
+    A model for streaming chat completions.
+
+    Attributes:
+        stream: Generator yielding ModelResponseStream objects for incremental responses.
+    """
     stream: Generator[ModelResponseStream, None, None]
```
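For context, consuming a `StreamChat` instance follows the same delta pattern used in `llm_engine.py`; the function below is a hypothetical sketch, assuming `stream_chat` is an instance produced by the calling code.

```python
def print_stream_chat(stream_chat) -> None:
    # Hypothetical consumption sketch: iterate the StreamChat.stream generator.
    # Each item is a litellm ModelResponseStream whose delta carries the new text,
    # mirroring the part.choices[0].delta.content access used in llm_engine.py.
    for part in stream_chat.stream:
        print(part.choices[0].delta.content or "", end="", flush=True)
```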
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Lite
📒 Files selected for processing (2)
edenai_apis/features/llm/chat/chat_dataclass.py (2 hunks)
edenai_apis/llmengine/llm_engine.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- edenai_apis/llmengine/llm_engine.py
🔇 Additional comments (1)
edenai_apis/features/llm/chat/chat_dataclass.py (1)
1-1: LGTM! Necessary imports added for stream functionality

The added imports of `Generator` from `typing` and `ModelResponseStream` from `litellm` directly support the new streaming functionality being introduced.

Also applies to: 4-4
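Putting the two hunks together, the dataclass module ends up with roughly the shape below. This is a sketch reconstructed from the review comments, not the full file, and the `pydantic.BaseModel` import is an assumption based on the class signature shown above.

```python
from typing import Generator              # hunk at line 1

from litellm import ModelResponseStream   # hunk at line 4
from pydantic import BaseModel            # assumed: StreamChat extends pydantic's BaseModel


class StreamChat(BaseModel):
    """A model for streaming chat completions (docstring suggested in the review above)."""

    stream: Generator[ModelResponseStream, None, None]
```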
Summary by CodeRabbit
- Added a `StreamChat` class to facilitate streaming responses.