Description
This proposal seeks to build upon the existing block validation system so that it is more proactive in reducing the cases where the end user is faced with an invalid block type dialog. Because the proposal involves manipulating data on behalf of the user, it needs a solid, stepwise definition system to ground the level of inference the engine is willing to make to restore content integrity.
Block Validation Flow
Block validation, as defined here, comes after the initial parsing step, which yields block types and checks whether a corresponding block type handler exists (that is, whether a corresponding block type is registered in the system). That phase is still part of the validation process, but it is not the focus of this issue since there are no further feasible optimizations to make there regarding content reconstruction. This issue focuses on what happens when the block source is run through its `save` function, producing a new source.
// A block source is run through the `save` function of its `blockType`
blockType.save( source ) => newSource;
// The resulting operation is classified for every block given the following outcomes
block.isValid: Number;
The proposal here focuses on classifying the outcomes of the above operation by logically grouping the possible scenarios in a decreasing order of certainty over content integrity.
ValidBlock: 0 // idempotent operation of `save(source) => source`.
MigratedBlock: 1 // source is matched sequentially with defined deprecations
// until it produces a match.
PreservedSource: 2 // `newSource` produces equivalent `innerHtml` even if
// comment attributes differ and becomes idempotent
// after first reconciliation.
ReconstructedSource: 3 // `newSource` contains the same attributes as `source`
// (attribute integrity), including non-empty sourced
// attributes, while `innerHtml` is allowed to be rebuilt.
RawTransformedSource: 4 // source is passed to raw handling functions and it yields
// the same block type.
InvalidBlock: 5 // the block could not be safely restored, need user input
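To make the ordering concrete, here is a minimal sketch of how the levels could be expressed and evaluated in decreasing order of certainty. `ValidationLevel` and `classify` are hypothetical names for illustration only, and the individual checks are passed in as plain predicates rather than being real validation functions.

```javascript
// Hypothetical constants mirroring the levels listed above.
const ValidationLevel = Object.freeze({
  ValidBlock: 0,
  MigratedBlock: 1,
  PreservedSource: 2,
  ReconstructedSource: 3,
  RawTransformedSource: 4,
  InvalidBlock: 5,
});

// `checks` is an ordered array of [level, predicate] pairs (an assumption of
// this sketch). The first predicate that accepts the source decides the level.
function classify(source, checks) {
  for (const [level, passes] of checks) {
    if (passes(source)) {
      return level;
    }
  }
  // Nothing could safely restore the block: user input is needed.
  return ValidationLevel.InvalidBlock;
}
```

Because the checks run in order, a block is always assigned the lowest level (highest certainty) that succeeds.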
- 0: ValidBlock
This operation is run for all blocks and is the cheapest mechanism to ensure integrity. It has to be optimized for speed. This has already served us super well as a quick heuristic.
- 1: MigratedBlock
When the basic match operation in level `0` fails, the first path is to check whether deprecated shapes exist for the block. If they do, we run the source against each of those sequentially until we find a match. This is also the step in which the source can be migrated to a newer shape. It has been in place for a long time and has allowed transparently upgrading the shape of blocks numerous times.
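The sequential matching described for level `1` could be sketched roughly as follows. This is a simplification under stated assumptions: each deprecated entry is assumed to expose a `save` function that turns attributes into a markup string plus an optional `migrate` function, and `findMigration` is a hypothetical helper name, not an existing API.

```javascript
// Tries each deprecated shape in order and stops at the first one whose
// `save` output reproduces the stored source exactly.
function findMigration(source, attributes, deprecations) {
  for (const deprecated of deprecations) {
    if (deprecated.save(attributes) === source) {
      // An optional `migrate` upgrades the attributes to the newer shape.
      const migrated = deprecated.migrate
        ? deprecated.migrate(attributes)
        : attributes;
      return { matched: deprecated, attributes: migrated };
    }
  }
  // No deprecated shape matched; validation falls through to level 2.
  return null;
}
```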
- 2: PreservedSource
Starting at this level, the block validation mechanism has been unable to reconcile the source with the output, which means there is a problem it has no explicit instructions for handling. From here on there is potential for some data loss, and this is the main point of discussion.
For level `2` we'd suspend the integrity of comment attributes in favor of the integrity of the source, provided the `innerHtml` of the source and the `innerHtml` of the output match. It's also important that we achieve idempotency immediately after reconciling the comment attributes.
// Example
// - source
<!-- wp:heading {"level":3} -->
<h2>Testing Header</h2>
<!-- /wp:heading -->
// - output
<!-- wp:heading -->
<h2>Testing Header</h2>
<!-- /wp:heading -->
// inner html matches both instances
This level can be seen as "clean up spurious comment attribute malformations". It is worth pointing out that this relies on the block author having made a choice for us about what matters when sourcing the content attributes: here the `h2` tag is prioritized over the comment attribute `level`.
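A minimal sketch of the level `2` comparison, assuming a serialized block has the single, non-nested shape used in the example above. `parseBlock` and `isPreservedSource` are hypothetical names, and the regular expression is only a stand-in for the real block parser.

```javascript
// Matches one serialized block: opening comment (with optional JSON
// attributes on one line), inner HTML, and the matching closing comment.
const BLOCK_PATTERN =
  /^<!--\s+wp:([a-z][a-z0-9\/-]*)\s+(\{.*?\}\s+)?-->([\s\S]*?)<!--\s+\/wp:\1\s+-->$/;

function parseBlock(serialized) {
  const match = serialized.trim().match(BLOCK_PATTERN);
  if (!match) {
    return null;
  }
  const [, name, json, inner] = match;
  return {
    name,
    attrs: json ? JSON.parse(json) : {},
    innerHtml: inner.trim(),
  };
}

// Level 2 passes when block name and inner HTML agree, regardless of
// whether the comment attributes differ.
function isPreservedSource(source, output) {
  const a = parseBlock(source);
  const b = parseBlock(output);
  return Boolean(a && b) && a.name === b.name && a.innerHtml === b.innerHtml;
}
```

Applied to the example above, the differing `level` comment attribute is ignored because both sides carry the same `<h2>` markup.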
- 3: ReconstructedSource
In this variant, the HTML comments coincide but the inner HTML doesn't. If there is attribute integrity between `oldSource` and `newSource` (and `sourced` attributes are not empty), we let the block output be overwritten with the result of computing `save` again. This is an indication that the block author leans on the comment attributes as the source of truth rather than the HTML source, and we honor that decision.
// Example
// - source
<!-- wp:heading {"level":6,"textColor":"pale-pink"} -->
<h6>Testing Header</h6>
<!-- /wp:heading -->
// - output
<!-- wp:heading {"level":6,"textColor":"pale-pink"} -->
<h6 class="has-pale-pink-color has-text-color">Testing Header</h6>
<!-- /wp:heading -->
// restores generated class names
This step can be seen as an "implicit migration" for indefinite forms, in contrast with those defined in deprecations.
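The attribute integrity condition for level `3` could look something like the sketch below. It assumes both sides have already been parsed into objects whose `attrs` merge comment attributes and attributes sourced from the HTML, and that the caller knows which keys are sourced; `sameAttributes`, `isReconstructedSource` and `sourcedKeys` are hypothetical names.

```javascript
// Key-order-insensitive comparison of two attribute objects.
function sameAttributes(a, b) {
  const keys = Object.keys(a);
  return (
    keys.length === Object.keys(b).length &&
    keys.every((key) => JSON.stringify(a[key]) === JSON.stringify(b[key]))
  );
}

// Level 3 passes when the block name and every attribute survive the
// round trip, and no sourced attribute came back empty. The inner HTML
// is deliberately not compared: it is allowed to be rebuilt.
function isReconstructedSource(source, output, sourcedKeys) {
  if (source.name !== output.name) {
    return false;
  }
  const sourcedPresent = sourcedKeys.every(
    (key) => source.attrs[key] !== undefined && source.attrs[key] !== ''
  );
  return sourcedPresent && sameAttributes(source.attrs, output.attrs);
}
```

The non-empty check on sourced attributes guards against "integrity" that only holds because the content failed to parse on both sides.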
- 4: RawTransformedSource
At this level we have lost confidence in the block functions being able to reconstruct a source so we hand over the operation to the raw handling mechanism (this is currently exposed to users in the "convert to blocks" action on invalid block types).
What we seek here is to ensure the resulting `blockType.name` is the same for source and output. This is the most aggressive manipulation since it doesn't rely directly on specific instructions from a block's `save` method, but it can employ the block's raw transformations (as if the content were being pasted or were part of a freeform block "convert to blocks" operation).
// Example : where it succeeds restoring block type
// - source
<!-- wp:heading -->
<h2>Testing Header</p>
<!-- /wp:heading -->
// - output
<!-- wp:heading -->
<h2>Testing Header<p></p></h2>
<!-- /wp:heading -->
And here's an example where it currently fails to restore the block type.
// Example : where it fails restoring block type
// - source
<!-- wp:heading -->
<span>Testing Header</h2>
<!-- /wp:heading -->
// - output
<!-- wp:paragraph -->
<p><span>Testing Header</span></p>
<!-- /wp:paragraph -->
As noted, in the last example `wp:heading` is not preserved, so we would not consider it to pass level `4`.
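The level `4` acceptance rule can be sketched as a check on the block name that raw handling returns. In this sketch `rawHandle` is an injected stand-in for the raw handling mechanism (assumed to take HTML and return an array of block objects with a `name`), `isRawTransformedSource` is a hypothetical name, and requiring exactly one resulting block is an assumption of the sketch rather than part of the proposal.

```javascript
// Accepts the raw-handled result only if it yields a single block of the
// same type the source declared.
function isRawTransformedSource(sourceName, innerHtml, rawHandle) {
  const blocks = rawHandle(innerHtml);
  return blocks.length === 1 && blocks[0].name === sourceName;
}

// Stubs reproducing the two outcomes from the examples above.
const keepsHeading = () => [{ name: 'core/heading' }];
const becomesParagraph = () => [{ name: 'core/paragraph' }];

const restored = isRawTransformedSource(
  'core/heading',
  '<h2>Testing Header</p>',
  keepsHeading
);
const rejected = isRawTransformedSource(
  'core/heading',
  '<span>Testing Header</h2>',
  becomesParagraph
);
```

With these stubs, `restored` is `true` and `rejected` is `false`, so the second block would fall through to `InvalidBlock`.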
User Experience
If we keep the type of invalidation that occurred stored in the block object, we can build UI and behaviour around it. For example, we might choose to show somewhere in the interface (block inspector or block toolbar) that a transformation above level `1` has taken place, so the user can review it if they see something amiss and make a different choice from the one the system made.
There's obviously also room to work on the `save( source ) => newSource` operation itself, discarding or omitting certain elements we might treat less strictly (like class names or other HTML attributes in the wrapper elements). That would cascade through all the levels and likely prevent some cases from moving from one level to the next.
Interested in hearing your thoughts on this. Do the classifications make sense? Is the order what you'd expect? Is there another vector we might choose to classify as well?
There might be room for another classification after level `3`, for example, where we attempt the reconstruction of both the comment and the inner HTML, provided the block type name remains the same, before passing on to pure raw handling mechanisms.