fast-forward tokens with llguidance #965

mmoskal · 2024-12-02T17:30:14Z

"Fast-forward" (or "forced") tokens are tokens that are fully determined by the current state of a constraint.

One example where fast-forward tokens are useful is generating data that adheres to a JSON schema. The constraint first forces {"name":" to be generated, then the model generates John", the controller forces ,\n"age":, the model generates 42, and so on. Another example is chain-of-thought reasoning: after the model generates a sentence, the controller forces more instructions for the model, the model generates more text, and so on. When used, fast-forward tokens significantly speed up generation, because a whole forced run can be processed in one forward pass, similar to the prompt.
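To make the saving concrete, here is a toy cost model of the JSON example. It does not use the llguidance or mistral.rs APIs; `Segment` and `count_forward_passes` are hypothetical names, each character stands in for one token, and a forced run is charged one batched pass (in a real engine it could be folded into the next sampling pass, so this is an upper bound).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str     # characters in this segment (one character = one toy token)
    forced: bool  # True if the constraint fully determines this segment


def count_forward_passes(segments, fast_forward):
    """Count forward passes under a simplified cost model (an assumption)."""
    passes = 0
    for seg in segments:
        if seg.forced and fast_forward:
            # The whole forced run is appended at once and processed in a
            # single batched pass, the same way a prompt is processed.
            passes += 1
        else:
            # Otherwise every token costs one sampling step.
            passes += len(seg.text)
    return passes


# The segments from the JSON example: forced text alternates with model output.
segments = [
    Segment('{"name":"', forced=True),
    Segment('John"', forced=False),
    Segment(',\n"age":', forced=True),
    Segment('42', forced=False),
]

baseline = count_forward_passes(segments, fast_forward=False)
with_ff = count_forward_passes(segments, fast_forward=True)
print(baseline, with_ff)
```

In this toy setting the forced runs account for most of the output, so fast-forwarding removes most of the passes. With a real tokenizer the ratio would differ, since forced text usually spans fewer tokens than characters.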

One can think of them as 100% accurate speculation.

The llguidance library supports computing them, so it would be nice to allow them in mistral.rs.


mmoskal added the "new feature" label Dec 2, 2024

mmoskal mentioned this issue Dec 2, 2024

use llguidance library for constraints (including json schemas) #899

Merged

4 tasks
