openai/text-embedding-3-small does not work on WebFAQRetrieval #2699

@nqbao

I'm using the latest mteb package and trying to run an evaluation on a Vietnamese dataset, but it does not work:

import mteb

tasks = mteb.get_tasks(tasks=["WebFAQRetrieval"], languages=['vie'])
model = mteb.get_model("openai/text-embedding-3-small")
evaluation = mteb.MTEB(tasks=tasks)

evaluation.run(model, encode_kwargs={"batch_size": 32})
/usr/local/lib/python3.11/dist-packages/mteb/evaluation/evaluators/RetrievalEvaluator.py in search(self, corpus, queries, top_k, task_name, instructions, request_qid, return_sorted, **kwargs)
    121             )
    122         else:
--> 123             query_embeddings = self.model.encode(
    124                 queries,  # type: ignore
    125                 task_name=task_name,

/usr/local/lib/python3.11/dist-packages/mteb/evaluation/evaluators/RetrievalEvaluator.py in encode(self, sentences, task_name, prompt_type, **kwargs)
    421                 sentences, task_name, prompt_type=prompt_type, **kwargs
    422             )
--> 423         return self.model.encode(
    424             sentences, task_name=task_name, prompt_type=prompt_type, **kwargs
    425         )

/usr/local/lib/python3.11/dist-packages/mteb/models/openai_models.py in encode(self, sentences, **kwargs)
    142         all_embeddings = np.zeros((len(sentences), self._embed_dim), dtype=np.float32)
    143         if mask:
--> 144             all_embeddings[mask] = no_empty_embeddings
    145         return all_embeddings
    146 

IndexError: too many indices for array
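
The failing line scatters embeddings of non-empty sentences back into a zero-filled array using a mask. The root cause of the IndexError is not established in this report. For reference, here is a minimal NumPy sketch of that masked-assignment pattern with the mask and embeddings kept explicitly 1-D and 2-D. It is not mteb's code: `scatter_non_empty` and `fake_embed` are hypothetical names, and `fake_embed` stands in for the real OpenAI call.

```python
import numpy as np


def scatter_non_empty(sentences, embed_fn, embed_dim):
    """Embed only non-empty sentences; empty ones keep an all-zero row.

    Hypothetical helper for illustration, not part of mteb.
    """
    # 1-D boolean mask with exactly one entry per input sentence.
    mask = np.array([bool(s.strip()) for s in sentences], dtype=bool)
    out = np.zeros((len(sentences), embed_dim), dtype=np.float32)

    # Use .any() rather than the truthiness of the mask object itself.
    if mask.any():
        non_empty = [s for s, keep in zip(sentences, mask) if keep]
        vectors = np.asarray(embed_fn(non_empty), dtype=np.float32)
        # Force a (n_non_empty, embed_dim) shape so the assignment
        # fails loudly here if the embedder returned something else.
        out[mask] = vectors.reshape(int(mask.sum()), embed_dim)
    return out


def fake_embed(texts):
    # Stand-in embedder (assumption): one 4-dim vector per text.
    return [[float(len(t))] * 4 for t in texts]


result = scatter_non_empty(["xin chào", "", "   "], fake_embed, embed_dim=4)
print(result.shape)
```

Building the mask as a 1-D boolean array and reshaping the embeddings before assignment makes any shape mismatch surface at a single, identifiable line.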

Labels: bug (Something isn't working)