Skip to content

Internal state of the page (e.g. queried posts or queried object) should factor into whether a stored URL metric is fresh #1466

@westonruter

Description

@westonruter

This was originally brought up in #878 (comment).

There is a TODO currently in od_get_normalized_query_vars():

* TODO: For non-singular requests, consider adding the post IDs from The Loop to ensure publishing a new post will invalidate the cache.

Consider a homepage that shows 10 posts. None of the posts have featured images. URL metrics have been fully populated for the page in this state. Then someone publishes a new post that includes a featured image. All of a sudden there is a new image on the page which may be LCP, and yet the URL metrics have no record of this element. Nevertheless, by default the URL metrics are considered fresh by default for 1 day, which means it could take a day for that new image to start getting optimized.

One way to deal with this would be to include the IDs of the posts in the query among the normalized query vars. This could similarly handle the case where the static front page is changed. For example, in od_get_normalized_query_vars():

if ( is_singular() ) {
    $normalized_query_vars['queried_object_id'] = get_queried_object_id();
} else {
    $normalized_query_vars['queried_post_ids'] = join( ',', wp_list_pick( $GLOBALS['wp_query']->posts, 'ID' ) );
}

When the query vars change, this results in an entirely separate od_url_metrics post being used, as the normalized query vars are passed into od_get_url_metrics_slug() to obtain a MD5 hash which is used as the post_name.

Nevertheless, this seems somewhat of an abuse of the normalized query vars here, since these new "query vars" aren't actually part of the query at all. In reality, a new URL metrics post shouldn't be created, as it should instead update the existing post. If it creates a new post, then the old one will stick around until it is garbage-collected after a month.

Instead of creating a new URL metrics post entirely, perhaps there should be some kind of "ETag" which is stored with each URL metric. This also is proposed in #1424 where the list of registered tag visitors should factor into the composition of the ETag for a URL metric so that when a new tag visitor is registered (e.g. after a plugin is activated), then all URL metrics would immediately be considered stale so that new ones can be collected.

This is also related to another TODO:

// TODO: When optimizing above, if we find that there is a stored LCP element but it fails to match, it should perhaps set $needs_detection to true and send the request with an override nonce. However, this would require backtracking and adding the data-od-xpath attributes.

Instead of having to manually collect the various parts of page state that may affect the elements stored in URL metrics, we could instead know by time we finish iterating over the document whether the elements we saw correspond to the elements which were stored in the most recent URL metric. If not, then the URL metric should be considered stale and new URL metrics should be collected. But there is a bit of a chicken-and-egg problem here because the REST API endpoint will reject any requests when the URL metrics are fresh. Since the REST API endpoint doesn't know the result of iterating over the page during optimization, it doesn't know that new URL metrics should in fact be allowed. Also, if the URL metrics were all fresh then the requisite data-od-xpath attributes aren't added to the output for detect.js to use when collecting the URL metrics. So once it gets to the end of iterating over all the tags, it would have to go back over them from the beginning and add these attributes. Or it could add them unconditionally on the first pass, and then strip them all out when not needed during a second pass (but this seems not ideal to have to do with every response). This is opposed to the current approach to conditionally add them:

if ( $did_visit && $needs_detection ) {
$processor->set_meta_attribute( 'xpath', $processor->get_xpath() );
}

Several things to ruminate on here.

Sub-issues

Metadata

Metadata

Assignees

Labels

Projects

Status

Done 😃

Relationships

None yet

Development

No branches or pull requests

Issue actions