[v2] Refactor Retrieval tasks to use dataset directly #2692

@Samoed

Currently, the corpus and queries are converted from the dataset into a dictionary. This conversion can be extremely slow for large datasets. For example, MiracleRetrieval took more than an hour to download, with most of that time spent on the type conversion to a dictionary.
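To make the contrast concrete, here is a minimal, standard-library-only sketch of the two patterns. It is not the mteb code: `make_rows`, `corpus_to_dict`, and `iter_batches` are hypothetical names, the generator merely stands in for a loaded corpus split, and the `id`/`title`/`text` fields are an assumed schema.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, str]


def make_rows(n: int) -> Iterator[Row]:
    """Hypothetical stand-in for a loaded corpus split (assumed schema)."""
    for i in range(n):
        yield {"id": f"doc{i}", "title": f"Title {i}", "text": f"Body {i}"}


def corpus_to_dict(rows: Iterable[Row]) -> Dict[str, Row]:
    """Eager pattern: copy every row into one in-memory dict keyed by id."""
    return {
        row["id"]: {"title": row["title"], "text": row["text"]}
        for row in rows
    }


def iter_batches(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Direct pattern: walk the rows in fixed-size batches, no full copy."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Eager: the whole corpus is materialised before any work can start.
corpus = corpus_to_dict(make_rows(5))

# Direct: only one batch of rows is held at a time.
batch_sizes = [len(batch) for batch in iter_batches(make_rows(5), batch_size=2)]
```

The eager version has to touch and copy every row up front, which is the cost described above. The batched version hands rows to the consumer as they are read, so nothing proportional to the corpus size is built beforehand.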


Labels: v2 (issues and PRs related to the `v2` branch)
