Add /indexes/{indexUID}/fields route to get all field names #5718

Status: Open (8 commits to merge into base: main)

qdequele
Member

@qdequele qdequele commented Jun 27, 2025

Add rich /fields endpoint for index metadata

🚀 What’s new?

  1. New endpoint GET /indexes/{indexUid}/fields
    Returns every field (including deep-nested ones) together with its full configuration:

    {
      "name": "cuisine.type",
      "displayed":  { "enabled": true },
      "searchable": { "enabled": true },
      "distinct":   { "enabled": false },
      "filterable": {
        "enabled": true,
        "facetSearch": { "sortBy": "alpha" },
        "filter":      { "equality": true, "comparison": true }
      },
      "localized":  { "locales": ["eng", "fra"] }
    }
    
  2. Filtering & paging

    Query-param   Type     Default   Description
    offset        usize    0         Skip N fields
    limit         usize    20        Max fields to return
    search        string   -         Wild-card pattern (user.*, *price, tags)
    filter        string   -         Boolean or contains expressions, combined with &&

    Filter grammar (v1):

    field.path = true|false
    field.path : value
    expr && expr …
    

    Examples
    filter=displayed.enabled = true
    filter=localized.locales : fra && filterable.enabled = true

  3. Field helper flags
    Each field object now exposes an overall enabled flag under filterable (filterable.enabled), making it trivial to test filterable.enabled = true in the filter expressions above.

  4. Code organization

    • All field-related structs, helpers and the handler have moved to crates/meilisearch/src/routes/indexes/fields.rs.
    • indexes/mod.rs just re-exports the module and wires the route.

  5. Exhaustive test-suite

    • tests/index/fields.rs – basic behaviour (empty index, 404, search filter)
    • tests/index/fields_recipe.rs – uploads a complex “Spaghetti Carbonara” recipe + full settings and validates:
      • correct flags on title, cuisine.type, nutrition.calories
      • 20 fields returned showing deep-nested extraction

    • Existing index::stats tests have been adapted to the new paginated format.
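
The wild-card patterns accepted by the search parameter (user.*, *price, tags) can be illustrated with a minimal sketch. It assumes `*` matches any run of characters, dots included, and that every other character is literal. The helper name `wildcard_match` is hypothetical; the PR itself reuses milli's pattern matching, which may behave differently.

```rust
/// Hypothetical helper (not from the PR): returns true when `name` matches
/// `pattern`, where `*` stands for any (possibly empty) run of characters.
fn wildcard_match(pattern: &str, name: &str) -> bool {
    let parts: Vec<&str> = pattern.split('*').collect();
    // No `*` at all: the pattern must equal the name exactly.
    if parts.len() == 1 {
        return pattern == name;
    }
    let first = parts[0];
    let last = parts[parts.len() - 1];
    // The literal before the first `*` anchors the start,
    // the literal after the last `*` anchors the end.
    if name.len() < first.len() + last.len()
        || !name.starts_with(first)
        || !name.ends_with(last)
    {
        return false;
    }
    // Literals between two `*` must appear, in order, between the anchors.
    let mut rest = &name[first.len()..name.len() - last.len()];
    for part in &parts[1..parts.len() - 1] {
        match rest.find(*part) {
            Some(idx) => rest = &rest[idx + part.len()..],
            None => return false,
        }
    }
    true
}

fn main() {
    for (pattern, name) in [("user.*", "user.name"), ("*price", "unit_price"), ("tags", "tags.name")] {
        println!("{pattern} vs {name}: {}", wildcard_match(pattern, name));
    }
}
```

Under these assumptions a pattern without `*` only matches the exact field name, so `tags` would not match `tags.name`.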

⚙️ How to use

Basic request

GET /indexes/movies/fields

Response:

{
  "results": [ { ...field objects… } ],
  "total": 42,
  "limit": 20,
  "offset": 0
}

Pagination

GET /indexes/movies/fields?offset=40&limit=20
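
As a rough illustration of how offset and limit map onto the results/total/limit/offset wrapper, here is a minimal sketch. The `Page` struct and `paginate` function are hypothetical stand-ins, not the PR's types, and the sketch assumes `total` counts the items before pagination is applied.

```rust
/// Hypothetical mirror of the paginated response wrapper.
#[derive(Debug, PartialEq)]
struct Page<T> {
    results: Vec<T>,
    total: usize,
    limit: usize,
    offset: usize,
}

/// Applies `offset` and `limit` to an already computed list of items.
/// Assumption: `total` is the item count before pagination.
fn paginate<T>(items: Vec<T>, offset: usize, limit: usize) -> Page<T> {
    let total = items.len();
    let results = items.into_iter().skip(offset).take(limit).collect();
    Page { results, total, limit, offset }
}

fn main() {
    let fields: Vec<String> = (0..42).map(|i| format!("field_{i}")).collect();
    // Same parameters as the request above: offset=40, limit=20.
    let page = paginate(fields, 40, 20);
    println!("{} of {} fields returned", page.results.len(), page.total);
}
```

With 42 fields, a request at offset 40 returns only the last two entries even though the limit is 20.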

Name wildcard search

GET /indexes/movies/fields?search=*time*

Advanced filtering

Only displayed and French-localized fields:

GET /indexes/movies/fields?filter=displayed.enabled=true&&localized.locales:fra

Combining with search

GET /indexes/movies/fields?search=nutrition.*&filter=filterable.enabled=true
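
The filter expressions above contain characters such as `&`, `=` and `:` that carry meaning inside a query string, so a client would normally percent-encode the filter value before sending it. The sketch below uses a hand-rolled encoder to stay within the standard library; `percent_encode` and `fields_url` are hypothetical helpers, and the index name is assumed to be URL-safe already.

```rust
/// Percent-encodes every byte outside the RFC 3986 "unreserved" set.
fn percent_encode(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for byte in value.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Hypothetical helper: builds the request path with an encoded filter.
/// Assumption: `index_uid` needs no encoding.
fn fields_url(index_uid: &str, filter: &str) -> String {
    format!("/indexes/{index_uid}/fields?filter={}", percent_encode(filter))
}

fn main() {
    println!("{}", fields_url("movies", "displayed.enabled=true&&localized.locales:fra"));
}
```

Without encoding, a standard query-string parser would split the value at each `&`, so the second condition would never reach the filter parameter.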

✅ Compatibility

The new module is additive; existing endpoints are untouched.
Stats tests that expected a raw array have been updated to the new JSON wrapper (results/total/limit/offset).

🔬 Implementation notes

  • Field extraction reuses milli’s pattern matching against settings and now honours the latest faceting sort rules.
  • filter parsing is deliberately simple (no OR / parentheses yet); easy to extend if needed.
  • The handler is fully read-only, works inside an LMDB read txn, and does not cache handles, keeping existing index-listing guarantees intact.

Feel free to ping me if you need more examples or want to support additional filter operators.

qdequele added 2 commits June 27, 2025 21:28
This commit introduces a new API endpoint `/indexes/{indexUid}/fields` that allows users to fetch all field names in an index, including nested fields. It also includes corresponding tests to validate the functionality, ensuring that the endpoint returns the correct fields for both empty and populated indexes, as well as handling errors for non-existent indexes.
@qdequele qdequele closed this Jun 28, 2025
@qdequele qdequele reopened this Jun 28, 2025
@Kerollmops Kerollmops added this to the v1.16.0 milestone Jun 29, 2025
@ManyTheFish
Member

Hey @qdequele,

After discussing with the team, we could accept a minimal implementation that doesn't contain the field metadata.
However, for future-proofing, we'd like the changes below before merging the PR:

  1. The simple array of strings must be transformed into an array of objects containing a name field, like [{"name": "title"}, ..]; this will allow us to add metadata related to the field in the future.
  2. Instead of a GET query, we'd prefer a POST query, allowing us to extend the request body easily.
  3. The response should be paginated: we can have up to 16535 fields in a single index. Fortunately it's a rare case, but it remains possible ☺️

The keys API uses an auto-pagination helper, see below:

let keys = auth_controller.list_keys()?;
let page_view = paginate
    .auto_paginate_sized(keys.into_iter().map(|k| KeyView::from_key(k, &auth_controller)));
Ok(page_view)

@ManyTheFish ManyTheFish added the no db change The database didn't change label Jul 1, 2025
@ManyTheFish ManyTheFish marked this pull request as draft July 2, 2025 08:11
@ManyTheFish ManyTheFish removed this from the v1.16.0 milestone Jul 3, 2025
@qdequele qdequele force-pushed the new-field-endpoint branch 3 times, most recently from 33ae0b0 to dfa6ae7 Compare August 1, 2025 20:02
@qdequele qdequele marked this pull request as ready for review August 2, 2025 10:39
Contributor

@Mubelotix Mubelotix left a comment

Hello! I have a few suggestions that might help improve performance, enhance documentation, and ensure consistency with the rest of the codebase.

use super::{Pagination, PAGINATION_DEFAULT_LIMIT};

/// Field configuration for a specific field in the index
#[derive(Debug, Serialize, Clone, ToSchema)]
Contributor

All the structures you added with ToSchema need to be added to the big list of structures in the main OpenApi structure.


/// Field configuration for a specific field in the index
#[derive(Debug, Serialize, Clone, ToSchema)]
#[serde(rename_all = "camelCase")]
Contributor

Also to be added on all other schema-deriving structures.

Suggested change
- #[serde(rename_all = "camelCase")]
+ #[serde(rename_all = "camelCase")]
+ #[schema(rename_all = "camelCase")]

Comment on lines +102 to +104
fn into_pagination(self) -> Pagination {
    Pagination { offset: self.offset.0, limit: self.limit.0 }
}
Contributor

No reason to take ownership here; it forces you to clone the fields later on.

Suggested change
- fn into_pagination(self) -> Pagination {
-     Pagination { offset: self.offset.0, limit: self.limit.0 }
- }
+ fn to_pagination(&self) -> Pagination {
+     Pagination { offset: self.offset.0, limit: self.limit.0 }
+ }
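
To make the ownership point concrete, here is a minimal, self-contained sketch. `Param`, `Pagination` and `ListFields` are hypothetical stand-ins for the types in the diff, reduced to what the example needs. Because `usize` is `Copy`, a `&self` method can read both values without consuming the parameters.

```rust
/// Hypothetical stand-in for the query-parameter wrapper used in the diff.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Param<T>(T);

#[derive(Debug, PartialEq)]
struct Pagination {
    offset: usize,
    limit: usize,
}

/// Hypothetical stand-in for the query-parameter struct of the new route.
struct ListFields {
    offset: Param<usize>,
    limit: Param<usize>,
    search: Option<String>,
}

impl ListFields {
    /// Borrows `self`; the `usize` values are copied out, nothing is moved.
    fn to_pagination(&self) -> Pagination {
        Pagination { offset: self.offset.0, limit: self.limit.0 }
    }
}

fn main() {
    let params = ListFields {
        offset: Param(40),
        limit: Param(20),
        search: Some("*time*".to_string()),
    };
    let pagination = params.to_pagination();
    // `params` is still usable here because it was only borrowed.
    println!("{:?} {:?}", pagination, params.search);
}
```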

Comment on lines +113 to +130
fn parse_filter(expr: &str) -> Vec<FilterCond> {
    let mut conditions = Vec::new();
    for cond_str in expr.split("&&") {
        let cond_str = cond_str.trim();
        if cond_str.is_empty() {
            continue;
        }
        if let Some((lhs, rhs)) = cond_str.split_once('=') {
            let path: Vec<String> = lhs.trim().split('.').map(|s| s.trim().to_string()).collect();
            let value = matches!(rhs.trim(), "true" | "True" | "TRUE");
            conditions.push(FilterCond::BoolEq { path, value });
        } else if let Some((lhs, rhs)) = cond_str.split_once(':') {
            let path: Vec<String> = lhs.trim().split('.').map(|s| s.trim().to_string()).collect();
            conditions.push(FilterCond::Contains { path, value: rhs.trim().to_string() });
        }
    }
    conditions
}
Contributor

Can't fields contain ampersands, dots or colons? If they can, we need to provide a way to escape the path. I had this exact issue with embedder names last week, and switching from a split iterator to a closure yielding the next potentially-escaped part was very easy:

let mut next_part = || -> Result<Option<Token<'_>>, RenderError<'a>> {
    if input.is_empty() {
        return Ok(None);
    }
    let (mut remaining, value) = milli::filter_parser::parse_dotted_value_part(input)
        .map_err(|_| ExpectedValue(input))?;
    if !remaining.is_empty() {
        if !remaining.starts_with('.') {
            return Err(ExpectedDotAfterValue(remaining));
        }
        remaining = milli::filter_parser::Slice::slice(&remaining, 1..);
    }
    input = remaining;
    Ok(Some(value))
};
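
For readers without the milli parser at hand, the idea of an escape-aware path splitter can be sketched with the standard library alone. This is not what `parse_dotted_value_part` does internally; the backslash escape syntax and the `split_escaped_path` name are assumptions made purely for illustration.

```rust
/// Hypothetical helper: splits a dotted path into parts.
/// Assumed escape syntax: a backslash makes the next character literal,
/// so `\.` is a dot inside a part rather than a separator.
fn split_escaped_path(input: &str) -> Result<Vec<String>, String> {
    let mut parts = Vec::new();
    let mut current = String::new();
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        match c {
            // Escaped character: copy it verbatim into the current part.
            '\\' => match chars.next() {
                Some(escaped) => current.push(escaped),
                None => return Err("dangling escape at end of path".to_string()),
            },
            // Unescaped dot: close the current part and start a new one.
            '.' => parts.push(std::mem::take(&mut current)),
            other => current.push(other),
        }
    }
    parts.push(current);
    Ok(parts)
}

fn main() {
    println!("{:?}", split_escaped_path(r"user.first\.name"));
}
```

The same approach would extend to `&` and `:` if the condition separator and operators were tokenized with escapes in mind rather than with a plain `split`.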

}

fn field_satisfies(field: &Field, conds: &[FilterCond]) -> bool {
    let field_value = serde_json::to_value(field).unwrap_or_default();
Contributor

Not fond of having to serialize every field's metadata. Something like this would be more efficient:

fn field_satisfies(field: &Field, conds: &[FilterCond]) -> bool {
    conds.iter().all(|cond| -> bool {
        match cond {
            FilterCond::BoolEq { path, value } => {
                match path.join(".").as_str() {
                    "displayed.enabled" => field.displayed.enabled == *value,
                    "searchable.enabled" => field.searchable.enabled == *value,
                    "distinct.enabled" => field.distinct.enabled == *value,
                    "filterable.enabled" => field.filterable.enabled == *value,
                    "filterable.filter.equality" => field.filterable.filter.equality == *value,
                    "filterable.filter.comparison" => field.filterable.filter.comparison == *value,
                    _ => false,
                }
            }
            FilterCond::Contains { path, value } => {
                match path.join(".").as_str() {
                    "localized.locales" => field.localized.locales.iter().any(|l| l.contains(value)),
                    "filterable.facet_search.sort_by" => field.filterable.facet_search.sort_by.contains(value),
                    "name" => field.name.contains(value),
                    locale if locale.starts_with("localized.locales.") => {
                        let locale = locale.strip_prefix("localized.locales.").unwrap();
                        field.localized.locales.iter().find(|l| l == locale).is_some_and(|l| l.contains(value))
                    }
                    _ => false,
                }
            }
        }
    })
}

We could add a test to ensure that we don't forget to add new fields here.

Member Author

What do you think in terms of readability vs. performance? I don't have the impression that performance is needed in this case.

Contributor

Not sure what's best; both ideas make sense.

Comment on lines +312 to +317
    .collect();

if let Some(expr) = &params.filter {
    let conds = parse_filter(expr);
    enriched_fields.retain(|f| field_satisfies(f, &conds));
}
Contributor

Turning the iterator into a Vec is a missed opportunity. Use filter on the iterator instead of retain on a Vec; you will benefit from the pagination's skip and limit. The filter will be evaluated lazily, so not all fields will be processed. Since filter isn't always set, you may need to use itertools' Either.
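
The lazy-evaluation point can be sketched with the standard library alone. The example below sidesteps `Either` by folding the optional filter into a single closure, which is one possible alternative and not necessarily what the reviewer had in mind. `Field` and `first_page` are hypothetical, simplified stand-ins for the PR's types.

```rust
/// Hypothetical, simplified field description.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    displayed: bool,
}

/// Returns one page of fields plus the number of predicate evaluations,
/// to show that the filter stops running once the page is full.
fn first_page(
    fields: &[Field],
    only_displayed: Option<bool>,
    offset: usize,
    limit: usize,
) -> (Vec<Field>, usize) {
    let mut evaluated = 0;
    let page = fields
        .iter()
        .filter(|f| {
            evaluated += 1;
            // No filter set: every field passes.
            only_displayed.map_or(true, |wanted| f.displayed == wanted)
        })
        .skip(offset)
        .take(limit)
        .cloned()
        .collect();
    (page, evaluated)
}

fn main() {
    let fields: Vec<Field> = (0..100)
        .map(|i| Field { name: format!("f{i}"), displayed: i % 2 == 0 })
        .collect();
    let (page, evaluated) = first_page(&fields, Some(true), 0, 2);
    println!("{} fields returned after {} evaluations", page.len(), evaluated);
}
```

One trade-off to keep in mind: if the response must report the total number of matching fields, every field still has to be evaluated, so the laziness only helps when that total is not needed or is computed separately.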

@@ -125,7 +129,7 @@ pub struct ListIndexes {
 }

 impl ListIndexes {
-    fn as_pagination(self) -> Pagination {
+    fn into_pagination(self) -> Pagination {
Contributor

There is no need to take ownership. According to my interpretation of Rust's naming conventions, it should be to_pagination(&self).

@@ -580,3 +584,5 @@ pub async fn get_index_stats(
    debug!(returns = ?stats, "Get index stats");
    Ok(HttpResponse::Ok().json(stats))
}

// Field-related structs, helpers and the `get_index_fields` handler have been moved to `fields.rs`.
Contributor

Not sure what this is about.

Contributor

Every test in this file should use snapshots instead of asserts. It will greatly simplify them.

Contributor

Every test in this file should use snapshots instead of asserts. It will greatly simplify them.
