XML protocol: numreturned and numresults #97

@ato

We don't currently implement these fields of the xml query protocol:

  • numresults - number of total matching results (?)
  • numreturned - number of results returned (may differ from numresults due to limits)

Both appear to be displayed in various places in the OpenWayback default templates. We never needed them at NLA: when we used OpenWayback we had custom templates that didn't display this information, and we don't have many archived URLs that have been captured often enough to need pagination. Pywb's implementation of the XML protocol is based on OutbackCDX and so does not use either value.

Unfortunately, implementing either of them comes at a cost.

To implement numreturned we'd need to do one of:

  • buffer the results in memory, which opens the door to out-of-memory errors on large result sets
  • move the <request> element after the <results> element in the XML, which may break compatibility with some clients
  • perform the query twice, once to count matches and a second time to stream the results
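To make the second option concrete, here is a minimal sketch of streaming the results first and writing the count afterwards. It is not OutbackCDX code: the function name, the `<wayback>` and `<result>` element names, and the plain-string records are assumptions for illustration only.

```python
from typing import Iterable, Iterator
from xml.sax.saxutils import escape


def stream_results_then_request(records: Iterable[str], limit: int) -> Iterator[str]:
    """Yield XML chunks with <request> written after <results>.

    Hypothetical sketch: because the count is emitted last, nothing has to
    be buffered and the query only runs once.
    """
    yield "<wayback>"
    yield "<results>"
    returned = 0
    for record in records:
        if returned >= limit:
            break  # stop at the limit, as the server does today
        yield f"<result>{escape(record)}</result>"
        returned += 1
    yield "</results>"
    # numreturned is only known here, after the last result has been sent.
    yield f"<request><numreturned>{returned}</numreturned></request>"
    yield "</wayback>"
```

A client that expects <request> to come first would have to tolerate the new ordering, which is the compatibility risk mentioned above.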

To implement numresults we'd need to count all matching records instead of stopping at the limit. This adds a performance penalty to any query that matches more results than the limit. Prefix queries that match very large numbers of URLs would start to have unpredictable, and likely sometimes unacceptable, performance.

On a positive note, there has been a feature request to just return counts instead of results, and I guess implementing numresults would achieve that when combined with a result limit of zero.
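A rough sketch of what counting past the limit would look like, including the zero-limit case. The function name and the list return value are assumptions for illustration; a real server would stream the rows rather than collect them.

```python
from typing import Iterable, List, Tuple


def query_with_total(matches: Iterable[str], limit: int) -> Tuple[List[str], int]:
    """Return up to `limit` records plus the total number of matches.

    Hypothetical sketch: the loop cannot stop at the limit, so its cost
    grows with the number of matches rather than with the limit.
    """
    returned: List[str] = []
    numresults = 0
    for record in matches:
        numresults += 1  # every match is visited, even past the limit
        if len(returned) < limit:
            returned.append(record)
    return returned, numresults


# With a limit of zero this degenerates into a count-only query.
rows, total = query_with_total(iter(["a", "b", "c"]), 0)
```

Here len(rows) plays the role of numreturned and total the role of numresults.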

CC @kris-sigur
