Add kimi-k2 tool parser #20789
Conversation
Signed-off-by: wangzhengtao <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a small subset of checks runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. 🚀
Summary of Changes
Hello @MoyanZitto, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request integrates support for the Kimi-K2 model's tool parsing capabilities into the system. It provides a dedicated parser to interpret the model's specific tool call syntax, enabling both synchronous and streaming extraction of function calls, and updates the model configuration to properly handle Kimi-K2 models.
Highlights
- New Tool Parser: A `KimiK2ToolParser` class has been introduced to handle the Kimi-K2 model's specific tool call format. The parser supports both static (non-streaming) and streaming extraction of function calls.
- Tool Parser Integration: The new `KimiK2ToolParser` is registered in `vllm`'s OpenAI entrypoint, making it discoverable and usable for processing Kimi-K2 model outputs.
- Model Configuration Update: The `vllm` configuration has been updated to recognize `kimi_k2` models. Specifically, they are included in the `is_deepseek_mla` check, suggesting that Kimi-K2 models share Multi-head Latent Attention (MLA) related properties or configuration with DeepSeek models.
- Comprehensive Testing: A new test file (`test_kimi_k2_tool_parser.py`) has been added to validate the `KimiK2ToolParser`. The tests cover various scenarios, including no tools, single and multi-tool calls, invalid JSON in arguments, invalid function call formats, and basic streaming functionality. Note that the entire test module is currently skipped.
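For orientation, the raw model output such a parser has to handle looks roughly like the sketch below. The `functions.<name>:<index>` ID format is taken from the diff comments later in this review; the section/call marker tokens are assumptions based on Kimi-K2's published chat format and may not match the exact strings used in this PR.

```python
# Illustrative only: the marker token strings are assumptions, not verbatim from the PR.
model_output = (
    '<|tool_calls_section_begin|>'
    '<|tool_call_begin|>functions.get_weather:0'
    '<|tool_call_argument_begin|>{"city": "Beijing"}'
    '<|tool_call_end|>'
    '<|tool_calls_section_end|>'
)
```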
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution. ↩
Code Review
This pull request introduces a tool parser for the `kimi-k2` model. The implementation is based on an existing parser and includes a new set of tests.
I've identified a few issues, including a critical bug in the streaming parsing logic and some areas where the code could be made more robust. Additionally, the tests for the new parser are currently skipped and should be enabled. I've provided specific suggestions to address these points.
```python
tool_id = (current_tool_call_name_matches.groups())
tool_name = tool_id.split('.')[1].split
current_tool_call['id'] = tool_id
current_tool_call["name"] = tool_name
current_tool_call["arguments"] = ""
```
This block contains multiple issues that will lead to a runtime error:

- `current_tool_call_name_matches.groups()` returns a tuple, so `tool_id` will be a tuple, and calling `.split()` on it will raise an `AttributeError`.
- `tool_id.split('.')[1].split` is a reference to the `split` method, not a call to it. It is also missing the separator argument.
- `current_tool_call['id'] = tool_id` would assign a tuple to the ID, which is incorrect.

Correct this logic to properly extract the tool ID and name:
```python
tool_id_str, = current_tool_call_name_matches.groups()
tool_name = tool_id_str.split('.')[1].split(':')[0]
current_tool_call['id'] = tool_id_str
current_tool_call["name"] = tool_name
current_tool_call["arguments"] = ""
```
@MoyanZitto this issue seems real
Fixed, thank you and Gemini!
```python
if result is not None and hasattr(result,
                                  'tool_calls') and result.tool_calls:
    assert len(result.tool_calls) >= 0
```
The assertions in this streaming test are too generic. The test only checks that `result.tool_calls` is a list with a non-negative length, but it doesn't validate the content of the tool calls. Add specific assertions to verify that the correct tool call information (e.g., function name, arguments) is extracted and streamed as expected.
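A minimal sketch of the kind of check that could be added instead, assuming the streamed deltas follow vLLM's usual `DeltaToolCall` shape with `.function.name` / `.function.arguments` payloads; the accumulator variable and expected values are hypothetical:

```python
import json

# Accumulate the streamed fragments across all non-None results (illustrative names).
streamed_name = ""
streamed_args = ""
for delta in all_results:  # all_results: hypothetical list of streaming DeltaMessages
    for tool_call in delta.tool_calls:
        if tool_call.function.name:
            streamed_name += tool_call.function.name
        if tool_call.function.arguments:
            streamed_args += tool_call.function.arguments

# Hypothetical expected values for a single get_weather call.
assert streamed_name == "get_weather"
assert json.loads(streamed_args) == {"city": "Beijing"}
```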
```python
if (self.tool_calls_start_token_id is None
        or self.tool_calls_end_token_id is None):
    raise RuntimeError(
        "Kimi-K2 Tool parser could not locate tool call start/end "
        "tokens in the tokenizer!")
```
The check for special tokens is incomplete. It verifies `tool_calls_start_token_id` and `tool_calls_end_token_id`, but it misses `tool_call_start_token_id` and `tool_call_end_token_id`, which are also retrieved from the vocabulary and used in the streaming logic. Update the condition to check for all required special token IDs to prevent potential runtime errors:
```python
if (self.tool_calls_start_token_id is None
        or self.tool_calls_end_token_id is None
        or self.tool_call_start_token_id is None
        or self.tool_call_end_token_id is None):
    raise RuntimeError(
        "Kimi-K2 Tool parser could not locate one or more tool call "
        "start/end tokens in the tokenizer!")
```
```python
for match in function_call_tuples:
    function_id, function_args = match
    # function_id: functions.get_weather:0
    function_name = function_id.split('.')[1].split(':')[0]
```
The logic for extracting the function name is brittle. Using `split('.')[1]` assumes that the function name does not contain any dots and that there is always a prefix like `functions.`. For example, this would fail for a `function_id` like `functions.weather.get:0`. A more robust approach is to split from the right by the colon first, to separate the name from the index, and then extract the function name part. Apply the same logic in `extract_tool_calls_streaming` as well:
```python
function_name = function_id.rsplit(":", 1)[0].split(".", 1)[1]
```
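For illustration, the difference between the two approaches on a dotted function name (the `functions.<name>:<index>` format comes from the diff comment above; the dotted example is hypothetical):

```python
function_id = "functions.weather.get:0"

# Original: assumes the name contains no dots, so it silently truncates it.
print(function_id.split('.')[1].split(':')[0])         # -> "weather"

# Suggested: strip the ":<index>" suffix first, then drop the "functions." prefix.
print(function_id.rsplit(":", 1)[0].split(".", 1)[1])  # -> "weather.get"
```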
```python
diff = (diff.encode("utf-8").decode("unicode_escape")
        if diff is str else diff)
```
```python
if current_tool_call is None:
    return None
```
Verified locally.
Head branch was pushed to by a user without write access
Thank you. We can also add a section to the Tool Calling documentation here https://github.com/vllm-project/vllm/blob/main/docs/features/tool_calling.md#hermes-models-hermes
Merging to trigger nightly wheel generation first. We can enable the test later, and add the tool parser to the docs as suggested by @mgoin.
Essential Elements of an Effective PR Description Checklist
- Update `supported_models.md` and `examples` for a new model.

Purpose
This PR adds kimi_k2 support, mainly the tool parser.
Test Plan
Corresponding tests were added under `tests/`.
Test Result
Tool-use tests are temporarily skipped.
(Optional) Documentation Update
No updates.