Question about adapter choice on MTEB Retrieval (retrieval vs text-matching)

#82

by liyongkang - opened Jan 31

Hi Jina team, thanks a lot for releasing Jina Embeddings v4. It's a really impressive model.

I have a question about the adapter / task setting you used for the MTEB Retrieval evaluation.

In your technical report, Table A11: “Evaluation Results for Various Models on MTEB Retrieval Task” indicates that results on MTEB Retrieval are obtained using the text-matching adapter. This confused me a bit, because intuitively I would expect the retrieval adapter to be used for retrieval benchmarks.

To sanity-check this, I ran a small reproduction on a couple of MTEB Retrieval datasets and observed behavior that suggests the optimal adapter may differ by dataset:

FiQA: using the retrieval adapter I can reach the reported score (~47.678), while the text-matching adapter gives only ~34.3.

ArguAna: using the text-matching adapter I can reach the reported score (~67.07), while the retrieval adapter is noticeably lower.
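
For reference, here is a minimal sketch of how the per-dataset comparison above can be tabulated. It is not the actual evaluation script: `lookup_score` is a hypothetical stand-in that only returns the scores quoted in this post (with `None` for the ArguAna retrieval score, which is not quoted above), and in a real run it would be replaced by an MTEB evaluation using the chosen adapter.

```python
from typing import Callable, Dict, Iterable, Optional

# Scores quoted in this post. None marks a value that is not quoted above
# (ArguAna with the retrieval adapter was only described as "noticeably lower").
REPORTED: Dict[tuple, Optional[float]] = {
    ("FiQA", "retrieval"): 47.678,
    ("FiQA", "text-matching"): 34.3,
    ("ArguAna", "text-matching"): 67.07,
    ("ArguAna", "retrieval"): None,
}


def lookup_score(dataset: str, adapter: str) -> Optional[float]:
    """Hypothetical stand-in for a real evaluation run with one adapter."""
    return REPORTED.get((dataset, adapter))


def compare_adapters(
    datasets: Iterable[str],
    adapters: Iterable[str],
    evaluate: Callable[[str, str], Optional[float]],
) -> Dict[str, Dict[str, Optional[float]]]:
    """Evaluate every (dataset, adapter) pair and return a nested score table."""
    adapters = list(adapters)
    return {d: {a: evaluate(d, a) for a in adapters} for d in datasets}


def best_known_adapter(row: Dict[str, Optional[float]]) -> Optional[str]:
    """Return the adapter with the highest known score, ignoring missing values."""
    known = {a: s for a, s in row.items() if s is not None}
    return max(known, key=known.get) if known else None


table = compare_adapters(
    ["FiQA", "ArguAna"], ["retrieval", "text-matching"], lookup_score
)
for dataset, row in table.items():
    print(dataset, row, "-> best known:", best_known_adapter(row))
```

Passing the evaluation function in as a parameter keeps the tabulation separate from how each score is produced, so the same loop works for any set of datasets and adapters.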

So I’m wondering:

Did you actually run different adapters for different Retrieval datasets (even though Table A11 labels them under text-matching)?

If so, is there a recommended mapping (which Retrieval datasets should use retrieval vs text-matching)?

Or is Table A11 possibly mislabeled, and MTEB Retrieval was evaluated with the retrieval adapter (or a mixed strategy)?

Any clarification would be greatly appreciated. I'd like to make sure I'm using the intended setup correctly.

Thanks again!
