
[Feature] Generative Score API #5973

@chanh

Description


Motivation

Similar to the cross-encoder Score API proposed here: #5577

The goal is to score items "generatively" using decoder-only models.

E.g. "Given a user liked A, B, and C, will the user like this item? Please answer 'yes' or 'no.' The item is: D"

API

{
  "text_1": [
    "Given a user liked A, B, and C, will the user like this item? Please answer \"yes\" or \"no.\" The item is:"
  ],
  "text_2": [
    "D",
    "E"
  ],
  "positiveToken": "yes",
  "negativeToken": "no"
}

Returns:

{
  "scores": [
    0.874,
    0.231
  ]
}
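
For reference, one way such a score could be computed is to take the model's next-token log-probabilities for the positive and negative tokens and normalize over that pair. Below is a minimal sketch using Hugging Face transformers; the model name, function name, and pair-normalization choice are illustrative assumptions, not the proposed implementation:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # assumption: any decoder-only model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)


def generative_score(text_1: str, text_2: str,
                     positive_token: str = "yes",
                     negative_token: str = "no") -> float:
    """Return P(positive) normalized over the {positive, negative} next tokens."""
    prompt = f"{text_1} {text_2}"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits for the next token

    # NOTE: whether "yes" or " yes" is the right token depends on the tokenizer.
    pos_id = tokenizer.encode(positive_token, add_special_tokens=False)[0]
    neg_id = tokenizer.encode(negative_token, add_special_tokens=False)[0]

    # Softmax over just the two candidate tokens; return the positive probability.
    pair = torch.softmax(logits[[pos_id, neg_id]], dim=-1)
    return pair[0].item()


text_1 = ('Given a user liked A, B, and C, will the user like this item? '
          'Please answer "yes" or "no." The item is:')
print([generative_score(text_1, item) for item in ["D", "E"]])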

Related resources

The original idea comes from the paper Holistic Evaluation of Language Models, which states the following:

We address the re-ranking task in a pointwise fashion: we formulate the information retrieval problem using prompting as a binary log-probability problem, similar to Nogueira & Cho (2019): Given a passage c_i and a query q, we ask the model whether the passage contains an answer to the query. If the model's answer is Yes with a high probability, we rank the corresponding c_i higher, while the No answer with high probability achieves the opposite. Figure 12 depicts an example instance. The rankings produced are then evaluated using standard information retrieval metrics.

A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE https://arxiv.org/html/2403.10407v1

Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations https://proceedings.mlr.press/v235/zhai24a.html

More docs to be added
