[BUG] Custom embedding model input context length is limited to 512 tokens #3949

Open
@reuschling

Description

I am trying to deploy https://huggingface.co/Alibaba-NLP/gte-multilingual-base as an ONNX model. The model deployed successfully, but I got this log message:

[WARN ][a.d.h.t.HuggingFaceTokenizer] [pc-4156] maxLength is not explicitly specified, use modelMaxLength: 512

gte-multilingual-base supports a context length of 8192 tokens, which is one of the reasons I chose this model.
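For reference, a register request for a custom local ONNX model of this kind looks roughly as sketched below (the values are placeholders and my own assumptions about sensible settings, not the exact request I sent):

POST /_plugins/_ml/models/_register
{
  "name": "gte-multilingual-base",
  "version": "1.0.0",
  "model_format": "ONNX",
  "function_name": "TEXT_EMBEDDING",
  "model_content_hash_value": "<sha256 of the model zip>",
  "url": "<url to a zip containing model.onnx and tokenizer.json>",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 768,
    "framework_type": "sentence_transformers"
  }
}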
After some investigation, I added the corresponding truncation entry to tokenizer.json, which is null in the Hugging Face version:

"truncation": {
    "max_length": 8192,
    "stride": 0,
    "strategy": "LongestFirst"
  },

Unfortunately, I now get a new log message:

[WARN ][a.d.h.t.HuggingFaceTokenizer] [pc-4156] maxLength is greater then modelMaxLength, change to: 512

The model max length could perhaps be configured via max_position_embeddings inside config.json, but this file is not recognized by OpenSearch. So it seems the 512-token limit is hardcoded somewhere in OpenSearch, with no way to change it.
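For context, the model's config.json on Hugging Face does declare the longer context; abridged, the relevant entry looks like this (quoted from memory, so treat it as approximate):

{
  ...
  "max_position_embeddings": 8192,
  ...
}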

OpenSearch version: 3.1, Linux tar.gz

Metadata

Labels: bug (Something isn't working)
Status: On-deck
