[Bm25] Execution implementation #6939
Conversation
```rust
    .collect();

// Create an HTTP mock
let mock = server
```
Mocking an inference server here to fully test the logic of this PR end-to-end.
src/common/inference/service.rs
Outdated
```rust
    return Ok(None);
};

if options.get("use_bm25") != Some(&Value::Bool(true))
```
Why is this here, and why does it use hard-coded strings?
You're right, it should be a constant instead of a string hard-coded in place.
I think it's better to explicitly state that we want to use BM25, instead of "guessing" and always trying to parse the options into `Bm25Config`. That's why there is this additional check when reading the BM25 configuration from the input's options.
Alternatively, we could rename it to something like "custom_model" or "local_model" and check the value against "bm25". This makes it more extensible for other 'local' model implementations in the future.
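The constant-based check discussed above could be sketched roughly like this; note this is a std-only sketch, where `LOCAL_MODEL_KEY`, the minimal `Value` stand-in, and the helper name are illustrative, not the PR's exact code:

```rust
use std::collections::HashMap;

// Hypothetical key name, following the "local_model" suggestion above.
const LOCAL_MODEL_KEY: &str = "local_model";

// Minimal stand-in for serde_json::Value, just enough for this sketch.
#[allow(dead_code)]
#[derive(PartialEq)]
enum Value {
    Bool(bool),
    String(String),
}

/// Returns true only when the options explicitly select the bm25 local model,
/// instead of "guessing" by always trying to parse the options into Bm25Config.
fn is_bm25_selected(options: &HashMap<String, Value>) -> bool {
    matches!(
        options.get(LOCAL_MODEL_KEY),
        Some(Value::String(s)) if s == "bm25"
    )
}
```

An explicit opt-in like this keeps the remote-inference path as the default and makes adding further local models a matter of matching more values.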
```rust
    return VectorPersisted::empty_sparse();
}

let indices: Vec<u32> = tokens
```
`DimId`?
We don't have access to `DimId` and `DimWeight` in `src/common`. I don't think we should add a dependency just for this type alias.
If you prefer having some sort of alias here, I can add a new type alias, but I think the semantics are already clear from the variable names and the context of the function. The same applies to the other place, in line 103.
```rust
    .unique()
    .collect();

let values: Vec<f32> = vec![1.0; indices.len()];
```
`DimWeight`?
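Putting the two snippets together, the sparse-vector construction under discussion could be sketched with local aliases, assuming hashed token ids with uniform weights; the hashing scheme and the aliases here are illustrative, not the PR's actual implementation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical local aliases; the PR keeps plain u32/f32 in src/common
// to avoid a dependency just for the DimId/DimWeight names.
type DimId = u32;
type DimWeight = f32;

/// Maps tokens to unique dimension ids with uniform weights. The diff uses
/// itertools' `.unique()`; sort + dedup is the std-only equivalent here.
fn tokens_to_sparse(tokens: &[&str]) -> (Vec<DimId>, Vec<DimWeight>) {
    let mut indices: Vec<DimId> = tokens
        .iter()
        .map(|token| {
            let mut hasher = DefaultHasher::new();
            token.hash(&mut hasher);
            hasher.finish() as DimId
        })
        .collect();
    indices.sort_unstable();
    indices.dedup();
    let values: Vec<DimWeight> = vec![1.0; indices.len()];
    (indices, values)
}
```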
```rust
    return Ok(None);
};

let Some(local_model_field) = options.get(LOCAL_MODEL_KEY) else {
```
I still don't understand why this is needed
If we add a different local 'model' next to BM25 that also has tokenizer configs, and a user only specifies those (so no BM25-specific parameters), we can't distinguish which model should be used. This requires the user to specify which local model to use.
We could use a dedicated model name, e.g. 'local/bm25', to specify which local model to use. However, I thought hard-coding a specific model name could be confusing, especially if something like 'local/bm25' works but 'bm25' throws an error about some 'address' not being found. If you prefer that approach though, I'm totally fine with changing it.
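The disambiguation discussed here lines up with the later "use enum instead of multiple constants" commit; a rough std-only sketch, where the variant set and error text are assumptions:

```rust
use std::str::FromStr;

/// Hypothetical dispatch enum for local models; today only bm25 exists,
/// but new variants keep the scheme extensible.
#[derive(Debug, PartialEq)]
enum LocalModel {
    Bm25,
}

impl FromStr for LocalModel {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "bm25" => Ok(LocalModel::Bm25),
            // An explicit error beats a confusing "address not found" failure
            // from falling through to the remote-inference path.
            other => Err(format!("unknown local model: {other}")),
        }
    }
}
```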
Force-pushed from 0689e42 to 1bcb324
Force-pushed from 0c38fa9 to 82a78e5
Merged c9f3626 into uncouple_tokenizer_and_textindex_params
* Add Bm25
* Execute BM25 if config available
* cargo format
* Add mocked tests for inference and bm25
* Properly apply InferenceType
* Fix tests
* Review remarks
* Review remarks
* Add overwritten model name fix again
* use enum instead of multiple constants
* ensure handling all fields of InferenceInput in infer_local

Co-authored-by: Luis Cossío <luis.cossio@outlook.com>
* Uncouple tokenizer and TextIndexParams
* Refactor token processing in `TokenizerConfig` (#6912)
* Refactor token processing in TokenizerConfig
* fix max length checking
* [Bm25] Execution implementation (#6939)
* Add Bm25
* Execute BM25 if config available
* cargo format
* Add mocked tests for inference and bm25
* Properly apply InferenceType
* Fix tests
* Review remarks
* Review remarks
* Add overwritten model name fix again
* use enum instead of multiple constants
* ensure handling all fields of InferenceInput in infer_local
* review fixes
* fmt
* spell-check
* deduplicate code

Co-authored-by: Luis Cossío <luis.cossio@qdrant.com>
Co-authored-by: Luis Cossío <luis.cossio@outlook.com>
Co-authored-by: Andrey Vasnetsov <andrey@vasnetsov.com>
Supersedes #6893 and #6904
Depends on #6891
Implements Bm25 by parsing the inference object's options.
Request example
A possible request could look like this:
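The original request body did not survive this excerpt; based on the options discussed in the review, it might look roughly like this (field names and parameter values are assumptions, not the PR's exact schema):

```json
{
  "query": {
    "text": "how does bm25 scoring work",
    "options": {
      "use_bm25": true,
      "k": 1.2,
      "b": 0.75
    }
  }
}
```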
Error handling
In case the option "use_bm25" is enabled but the configuration is invalid, a detailed error is returned:
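A validation step producing such an error could be sketched like this; the parameter names, bounds, and messages are illustrative, and the PR's actual `Bm25Config` and error type will differ:

```rust
/// Illustrative subset of a BM25 configuration.
#[derive(Debug)]
struct Bm25Config {
    k: f32,
    b: f32,
}

/// Validates the parameters and returns a detailed error instead of
/// silently falling back when the configuration is invalid.
fn parse_bm25_config(k: f32, b: f32) -> Result<Bm25Config, String> {
    if !(0.0..=1.0).contains(&b) {
        return Err(format!("invalid bm25 config: `b` must be in [0, 1], got {b}"));
    }
    if k < 0.0 {
        return Err(format!("invalid bm25 config: `k` must be non-negative, got {k}"));
    }
    Ok(Bm25Config { k, b })
}
```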
Test coverage
There are also tests for the new BM25 and remote-inference logic that cover almost all parts of `infer()` and >97% of `bm25.rs`. "Almost" because the error-returning paths are not covered. These tests 'mock' an inference server and ensure that we correctly merge BM25 and remote items together.