[BUG] Custom embedding model input context length is limited to 512 tokens #3949

Open
@reuschling

Description

I am trying to deploy https://huggingface.co/Alibaba-NLP/gte-multilingual-base as an ONNX model. The model deployed successfully, but I got this log message:

[WARN ][a.d.h.t.HuggingFaceTokenizer] [pc-4156] maxLength is not explicitly specified, use modelMaxLength: 512

gte-multilingual-base supports a context length of 8192 tokens, which is one of the reasons I chose this model.
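For reference, a register request for a custom local ONNX model of this kind looks roughly as sketched below (the values are placeholders and my own assumptions about sensible settings, not the exact request I sent):

POST /_plugins/_ml/models/_register
{
  "name": "gte-multilingual-base",
  "version": "1.0.0",
  "model_format": "ONNX",
  "function_name": "TEXT_EMBEDDING",
  "model_content_hash_value": "<sha256 of the model zip>",
  "url": "<url to a zip containing model.onnx and tokenizer.json>",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 768,
    "framework_type": "sentence_transformers"
  }
}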
After some investigation, I added the corresponding truncation entry to tokenizer.json, which is null in the Hugging Face version:

"truncation": {
    "max_length": 8192,
    "stride": 0,
    "strategy": "LongestFirst"
  },

Unfortunately, I now get a new log message:

[WARN ][a.d.h.t.HuggingFaceTokenizer] [pc-4156] maxLength is greater then modelMaxLength, change to: 512

The model max length could perhaps be configured via max_position_embeddings inside config.json, but this file is not recognized by OpenSearch. So it seems the 512-token limit is hardcoded somewhere in OpenSearch, with no way to change it.
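For context, the model's config.json on Hugging Face does declare the longer context; abridged, the relevant entry looks like this (quoted from memory, so treat it as approximate):

{
  ...
  "max_position_embeddings": 8192,
  ...
}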

OpenSearch version: 3.1, Linux tar.gz

Metadata

Labels: bug (Something isn't working)
Status: On-deck
