
fast-forward tokens with llguidance #965


Open
mmoskal opened this issue Dec 2, 2024 · 0 comments
Labels
new feature New feature or request


mmoskal commented Dec 2, 2024

"Fast-forward" or "forced" tokens are tokens that are fully determined by the current state of a constraint.

One example where fast-forward tokens are useful is generating data that adheres to a JSON schema. The constraint first forces `{"name":"` to be generated, then the model generates `John"`, the controller forces `,\n"age":`, the model generates `42`, and so on. Another example is chain-of-thought reasoning: after the model has generated a sentence, the controller forces further instructions for the model, the model generates more text, and so on. When used, forced tokens significantly speed up generation, because a whole run of them can be processed in a single forward pass, just like the prompt.

One can think of them as 100% accurate speculation.
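To make the speed-up concrete, here is a minimal toy sketch of the JSON example above. It does not use the llguidance or mistral.rs APIs: `Segment`, `forward_passes`, and `json_example` are hypothetical names, the token boundaries are made up, and the cost model (one forward pass per sampled token, one pass for a whole forced run when fast-forwarding) is a simplifying assumption.

```rust
/// One step of a toy constraint (hypothetical type, not an llguidance type).
enum Segment {
    /// Tokens fully determined by the constraint (fast-forward tokens).
    Forced(Vec<&'static str>),
    /// Number of tokens the model has to sample itself.
    Sampled(usize),
}

/// Counts forward passes under an assumed cost model:
/// - every sampled token costs one pass;
/// - a forced run costs one pass per token without fast-forwarding,
///   and a single batched pass for the whole run with it.
fn forward_passes(segments: &[Segment], fast_forward: bool) -> usize {
    segments
        .iter()
        .map(|segment| match segment {
            Segment::Sampled(n) => *n,
            // An empty forced run costs nothing; a non-empty one costs one pass.
            Segment::Forced(tokens) if fast_forward => usize::from(!tokens.is_empty()),
            Segment::Forced(tokens) => tokens.len(),
        })
        .sum()
}

/// The JSON example from the issue, with an invented tokenization.
fn json_example() -> Vec<Segment> {
    vec![
        Segment::Forced(vec!["{\"", "name", "\":\""]), // {"name":"
        Segment::Sampled(2),                           // John"
        Segment::Forced(vec![",\n\"", "age", "\":"]),  // ,\n"age":
        Segment::Sampled(1),                           // 42
    ]
}
```

Under these assumptions, `forward_passes(&json_example(), false)` counts every token as its own pass, while `forward_passes(&json_example(), true)` collapses each forced run into one pass. In a real engine the forced run could likely be folded into the pass that produces the next sampled token, so the actual saving may be larger.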

The llguidance library supports computing them, so it would be nice to allow them in mistral.rs.
