I cannot share the full thing, but I've replicated the issue with the example below. This is the intended structure of my response (I know it's not ideal, but I cannot change it at this point):
Preamble with free text (anything except the sequence of characters "<|start|>")
<|start|>
generate some structured response
<|end|>
Issue
Take a look at the definition of preamble. Is there a better way to avoid a sequence of characters?
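For reference, here is a minimal sketch of two ways to write "free text that does not contain `<|start|>`" as a regex. It only uses Python's standard `re` module; whether a given structured-generation backend accepts lookaheads is an assumption that would need checking, which is why a lookahead-free variant is included. The names `PREAMBLE_LOOKAHEAD`, `PREAMBLE_NO_LOOKAHEAD` and `build_pattern` are hypothetical and not taken from any library.

```python
import re

START, END = "<|start|>", "<|end|>"

# Option A: tempered lookahead. A character is consumed only if START
# does not begin at that position, so the preamble can never contain START.
PREAMBLE_LOOKAHEAD = r"(?:(?!" + re.escape(START) + r").)*"

# Option B: no lookahead, but stricter than required. The preamble may not
# contain the two-character prefix "<|" anywhere and may not end with "<".
PREAMBLE_NO_LOOKAHEAD = r"(?:[^<]|<+[^<|])*"


def build_pattern(preamble: str) -> re.Pattern:
    """Preamble, then the START marker, a body, and the END marker."""
    return re.compile(
        preamble + re.escape(START) + r"(?P<body>.*?)" + re.escape(END),
        re.DOTALL,  # let "." match newlines inside the preamble and body
    )


response = 'Some free text\n<|start|>\n{"a": 1}\n<|end|>'
for preamble in (PREAMBLE_LOOKAHEAD, PREAMBLE_NO_LOOKAHEAD):
    match = build_pattern(preamble).fullmatch(response)
    print(repr(match.group("body")))
```

Option B rejects some harmless preambles (for example one containing `<|other|>`), so it trades expressiveness for a pattern that avoids lookaround entirely.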
Also, do you think having too much free text penalises performance? Do you have any benchmarks on this?
Intuitively, I think that the more structured the response, the better, because generation can skip a few forward passes (decoding several tokens at once).
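To make that intuition concrete, here is a toy sketch, not the implementation of any particular library: a character-level automaton where every state with exactly one outgoing transition is "forced", so its symbol could be emitted without asking the model. Real backends work on tokens rather than characters, and whether they apply this kind of fast-forwarding is an assumption here. `literal_dfa` and `fast_forward` are hypothetical helper names.

```python
def literal_dfa(text: str) -> dict:
    """Chain automaton for a fixed string: state i --text[i]--> state i + 1."""
    return {i: {ch: i + 1} for i, ch in enumerate(text)}


def fast_forward(dfa: dict, state: int) -> tuple:
    """Follow transitions for as long as the current state allows only one symbol.

    Returns the forced symbols and the state where a real choice (or the end
    of the automaton) is reached.
    """
    forced = []
    while state in dfa and len(dfa[state]) == 1:
        symbol, next_state = next(iter(dfa[state].items()))
        forced.append(symbol)
        state = next_state
    return "".join(forced), state


dfa = literal_dfa("<|start|>")
dfa[9] = {"a": 10, "b": 10}  # after the marker, two symbols are allowed

forced, state = fast_forward(dfa, 0)
print(repr(forced), state)
```

In this toy the nine characters of the marker are all forced, while the state after it has two options and would need a model call. A free-text preamble behaves like that last state at every step, which is consistent with the intuition above but is not a benchmark.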
See issue I opened in vLLM