Merging cells: Rowspans [More Flexible Tables Pt.3b] #3501

PgBiel · 2024-02-27T05:33:54Z

Allows specifying a rowspan on grid.cell and table.cell so that a cell can span more than one row.

Second part of the third task in #3001. Closes #131.

This PR does not contain major changes to the docs. It is expected that those will be done later.

DRAFT: Doing some final touches still.

User-facing API

You can now specify grid.cell(rowspan: amount, ...) and table.cell(rowspan: amount, ...) to have a cell span amount rows, as seen below.

#grid(
  columns: 4,
  fill: (x, y) => if calc.odd(x + y) { blue.lighten(50%) } else { blue.lighten(10%) },
  inset: 5pt,
  align: center,
  grid.cell(rowspan: 2, fill: orange)[*Left*],
  [Right A], [Right A], [Right A],
  [Right B], grid.cell(colspan: 2, rowspan: 2, fill: orange.darken(10%))[B Wide],
  [Left A], [Left A],
  [Left B], [Left B], grid.cell(colspan: 2, rowspan: 3, fill: orange)[Wide and Long]
)

#table(
  columns: 4,
  fill: (x, y) => if calc.odd(x + y) { blue.lighten(50%) } else { blue.lighten(10%) },
  inset: 5pt,
  align: center,
  table.cell(rowspan: 2, fill: orange)[*Left*],
  [Right A], [Right A], [Right A],
  [Right B], table.cell(colspan: 2, rowspan: 2, fill: orange.darken(10%))[B Wide],
  [Left A], [Left A],
  [Left B], [Left B], table.cell(colspan: 2, rowspan: 3, fill: orange)[Wide and Long]
)

You can now specify breakable: true/false/auto on cells, as in #table.cell(rowspan: ..., breakable: ...)[a], to either enforce that all rows spanned by that cell stay on the same page (false) or to allow the cell (and, thus, its rows) to span multiple pages (true). The default is auto: the cell will default to being unbreakable if it only spans fixed-size rows, or breakable if it spans at least one auto row.
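
As a rough illustration of the auto default described above (not the code in this PR), the resolution can be sketched as a single function. Here `Option<bool>` is only a stand-in for the real setting type, with `None` playing the role of auto, and `resolve_breakable` is a hypothetical name.

```rust
/// Resolves whether a cell may be broken across pages.
///
/// `breakable` is a hypothetical stand-in for the cell's setting:
/// `None` models `auto`, `Some(b)` models an explicit `true`/`false`.
pub fn resolve_breakable(breakable: Option<bool>, spans_auto_row: bool) -> bool {
    // An explicit setting always wins. With `auto`, the cell is breakable
    // only if it spans at least one auto row.
    breakable.unwrap_or(spans_auto_row)
}
```

With this sketch, a cell spanning only fixed-size rows resolves to unbreakable unless the user passes breakable: true.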

Implementation details

The changes in this PR are divided as follows.

API and CellGrid changes

Cells now have rowspan and breakable fields. The former indicates how many rows it spans (default 1), the latter indicates whether or not the contents of the cell can be broken across pages (in particular, when it's not breakable, all rows spanned by the cell are forced to be displayed in the same page).
CellGrid was changed to properly mark rowspans' spanned positions as merged positions.
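
To make "merged positions" concrete, here is a minimal sketch of how a flat, row-major grid could mark every position covered by a spanning cell. The `Entry` enum and `place_cell` function are hypothetical names for illustration; the actual CellGrid code may be organized differently. It assumes `columns > 0` and spans of at least 1.

```rust
/// One slot of a flat, row-major grid. All names here are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub enum Entry {
    /// Nothing placed here yet.
    Empty,
    /// The parent (top-left) position of a cell.
    Cell { colspan: usize, rowspan: usize },
    /// A position covered by a cell whose parent lives at index `parent`.
    Merged { parent: usize },
}

/// Places a cell at column `x`, row `y` and marks every other position it
/// covers as merged. Assumes `columns > 0`.
pub fn place_cell(
    entries: &mut [Entry],
    columns: usize,
    x: usize,
    y: usize,
    colspan: usize,
    rowspan: usize,
) -> Result<(), String> {
    let rows = entries.len() / columns;
    if x + colspan > columns || y + rowspan > rows {
        return Err("cell does not fit in the grid".to_string());
    }
    let parent = y * columns + x;
    // Collect the flat indices of all positions covered by the cell.
    let covered: Vec<usize> = (y..y + rowspan)
        .flat_map(|row| (x..x + colspan).map(move |col| row * columns + col))
        .collect();
    if covered.iter().any(|&i| entries[i] != Entry::Empty) {
        return Err("cell overlaps an occupied position".to_string());
    }
    for i in covered {
        entries[i] = if i == parent {
            Entry::Cell { colspan, rowspan }
        } else {
            Entry::Merged { parent }
        };
    }
    Ok(())
}
```

Placing a cell with rowspan 2 at the top-left of a two-column grid leaves a `Merged { parent: 0 }` entry directly below it, so a later cell cannot be placed there.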

Basic rowspan layout

Created a rowspans.rs file for functions and types exclusive to rowspans.
The Rowspan struct has all data needed to lay out a rowspan. It includes at which row it starts, how many rows it spans, and so on. One remarkable field in that struct is heights, which indicates the height the rowspan will have available in each page. The heights are increased by each spanned row's final height during finish_region. Thus, we will only know the heights a rowspan will have after all of its rows have been laid out.
There is a global vector of "not yet laid out" rowspans in the GridLayouter.
- Before each row is laid out, check_for_rowspans runs, which checks if any cells within it are rowspans; if so, creates a Rowspan and appends it to the vector. That's how we're aware of rowspans during finish_region.
  - We use a separate function to check for rowspans rather than integrating it to layout_single_row or layout_multi_row because auto rows might not call either of those functions (if they have 0 measured frames, which is rather common for rowspans - this will always happen when the auto row is 100% composed of rowspans which don't end at it).
The layout_rowspan function takes a single Rowspan and lays out its cell's contents over the spanned pages, with the appropriate heights as part of the backlog. To do this, it modifies the frames for each region in self.finished, inserting the cell's contents.
- However, it optionally takes an additional current_region frame, because the last region spanned by the rowspan might not yet be finished when it is laid out. This is because layout_rowspan is triggered as soon as the last row - or any row after the last row - of the rowspan is finished and laid out within finish_region. However, it may also be triggered for remaining rowspans after all regions were finished (if their last rows weren't laid out for some reason, which can happen in some edge cases).
- Of note, a single row might be laid out in finish_region multiple times if it spans multiple pages. Therefore, I added a new parameter to Row::Frame, a boolean, which is true when this is the last region which this row occupies, triggering the layout of all rowspans ending in it.
Therefore, this is the full cycle of a rowspan, in principle:
1. check_for_rowspans appends Rowspans to the self.rowspans vector, before each row is laid out in fn layout.
2. In finish_region, we keep expanding the rowspan's height for each spanned row.
3. Once we reach the last row spanned by the rowspan (or any row after it), we take it from the vector and call layout_rowspan.
4. The rowspan has now been laid out.
As a consequence, any adjustment to the shape of the rowspan - including breakable vs unbreakable - is exclusively dependent on the rows it spans. That is, a rowspan will only occupy the space between its first spanned row and its last spanned row, whichever height that might represent.
- Fixed-size rows don't really care about that. However, auto rows - specifically the last auto row spanned by a rowspan - might expand so that rowspans have enough height to lay out their contents. This will be explained below.
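
The cycle above can be sketched with a toy layouter. This is a simplified model under stated assumptions, not the real GridLayouter: `Layouter`, its fields, the tuple-based row description, and the `laid_out` vector (which stands in for calling layout_rowspan) are all hypothetical.

```rust
/// Minimal model of a pending rowspan (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct Rowspan {
    /// First spanned row.
    pub y: usize,
    /// Number of spanned rows.
    pub rowspan: usize,
    /// Height available to the rowspan in each region it occupies.
    pub heights: Vec<f64>,
}

#[derive(Default)]
pub struct Layouter {
    /// Rowspans that were detected but not laid out yet.
    pub rowspans: Vec<Rowspan>,
    /// Stand-in for `layout_rowspan`: finished rowspans end up here.
    pub laid_out: Vec<Rowspan>,
}

impl Layouter {
    /// Step 1: before row `y` is laid out, register the rowspans starting in it.
    pub fn check_for_rowspans(&mut self, y: usize, rowspans_in_row: &[usize]) {
        for &rowspan in rowspans_in_row {
            if rowspan > 1 {
                self.rowspans.push(Rowspan { y, rowspan, heights: Vec::new() });
            }
        }
    }

    /// Steps 2 to 4: `rows` lists `(row index, height, is last region of row)`
    /// for every row in the region being finished.
    pub fn finish_region(&mut self, rows: &[(usize, f64, bool)]) {
        for mut span in std::mem::take(&mut self.rowspans) {
            let last_y = span.y + span.rowspan - 1;
            // Step 2: grow the rowspan by the final heights of its rows here.
            let spanned: Vec<f64> = rows
                .iter()
                .filter(|(y, _, _)| (span.y..=last_y).contains(y))
                .map(|(_, height, _)| *height)
                .collect();
            if !spanned.is_empty() {
                let total: f64 = spanned.iter().sum();
                span.heights.push(total);
            }
            // Step 3: the last spanned row (or a later one) finished here.
            let done = rows.iter().any(|&(y, _, is_last)| is_last && y >= last_y);
            if done {
                self.laid_out.push(span); // Step 4.
            } else {
                self.rowspans.push(span);
            }
        }
    }
}
```

For a rowspan of 3 starting at row 0, finishing a region with rows 0 and 1 leaves it pending with one height entry; finishing the next region with row 2 adds a second entry and moves it to `laid_out`.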

Regarding fills and strokes

Fills were adapted such that a cell's fill is placed once for each spanned region. This means that a gradient or pattern would be repeated for each spanned region, not "stretched" at once across regions. This is done by checking the "local parent Y" of each cell, that is, the first row spanned by the cell in the current region (which will be its parent Y if the cell hasn't been broken across pages so far). If we're at the local parent Y at this moment, we draw the fill again.
Not sure if we should perhaps switch to drawing a single rectangle across multiple frames or something. Frankly, I don't know if that's even possible.
Hlines were adapted so that they aren't drawn when they are inside a rowspan. Cool.
But they were also adapted in another way. Usually, hlines fold the stroke of the bottom cell with the stroke of the top cell (usually giving priority to the bottom cell) together with the user-provided line's stroke to determine its final stroke. However, especially with rowspans thrown into the mix, the "top cell" might not actually have been laid out at all. This can happen if the hline is at the top of the region (which we already used to detect by checking if its index is 0, since hlines of index 0 are considered part of the top border and repeated across all regions), but also if the row immediately above the hline is an auto row which was fully empty and thus removed (can happen if the only things in the auto row are rowspans crossing through it). Now, hlines check the "top cell" in the "local top Y", that is, the last row above the hline in the current region. To achieve this, it receives a local_top_y parameter which is calculated once. It also receives the vector of rows in the current region as a parameter as it uses similar logic to determine whether or not it is inside a rowspan and thus shouldn't be drawn (in principle, it's fine to be "inside" a rowspan if none of the rowspan's rows above the hline were laid out in the current region).
As a consequence of that change, and looking to produce better results when using table borders in general, I made it so cells' strokes now fold with the table top and bottom borders. This avoids some visual problems when the table borders are produced only by cell strokes, e.g. when there's gutter, you'd see some small holes (column gutters) at the bottom and top of each page, even when there's a colspan at the bottom of the page which would request a continuous line under it.
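
The "local parent Y" and "local top Y" lookups described above can be sketched as two small helpers. The function names and the `rows_in_region` slice (indices of the rows actually laid out in the current region, in order) are assumptions for illustration; the real code receives richer row data.

```rust
/// First row spanned by the cell that was laid out in the current region,
/// if any. The fill is drawn again when the current row equals this value.
pub fn local_parent_y(
    parent_y: usize,
    rowspan: usize,
    rows_in_region: &[usize],
) -> Option<usize> {
    rows_in_region
        .iter()
        .copied()
        .find(|y| (parent_y..parent_y + rowspan).contains(y))
}

/// Last row above the hline at index `hline_y` that was actually laid out in
/// the current region. `None` means the hline is at the top of the region.
pub fn local_top_y(hline_y: usize, rows_in_region: &[usize]) -> Option<usize> {
    rows_in_region.iter().copied().filter(|&y| y < hline_y).max()
}
```

For example, if row 2 was an empty auto row that got removed, the hline above row 3 sees row 1 as its local top row rather than row 2.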

Unbreakable rowspans

They are pretty simple, in theory. Here's how they work:

1. GridLayouter has a field called unbreakable_rows_left. While that field's value is positive, finish_region should not be called.
2. Before each non-gutter row is laid out, we run check_for_unbreakable_rows.
3. That function doesn't do anything while unbreakable_rows_left is positive, but if it's zero, it will run simulate_unbreakable_row_group.
4. That function will use check_for_unbreakable_cells to check each cell of the current row to see if it is the parent of an unbreakable cell/rowspan. If so, it begins a simulation to find out how many more rows will have to be laid out together, in the same region. This is done by checking each upcoming row and checking if they themselves have unbreakable cells; if so, the simulation will run for at least max(cell.rowspan for cell in row) more rows. The simulation also calculates the total height of the row group, and returns its collected data in an UnbreakableRowGroup struct instance.
5. Then, back to check_for_unbreakable_rows, we skip to the first region with enough height to fit the new unbreakable row group and update self.unbreakable_rows_left. That way, all upcoming rows within the unbreakable row group are laid out together in the same region, as finish_region won't be called at all while those rows are being laid out.

As a consequence, the rowspan spanning those rows will be in a single region, because all that matters for a rowspan is the rows it spans.
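
A simplified sketch of the simulation described above, assuming every row has a known height and ignoring gutter rows and auto row measurement. `RowInfo`, the fields of `UnbreakableRowGroup`, the function signature, and `first_fitting_region` are hypothetical; only the names simulate_unbreakable_row_group and UnbreakableRowGroup come from the PR.

```rust
/// Per-row input for the simulation (hypothetical).
pub struct RowInfo {
    pub height: f64,
    /// Largest rowspan among unbreakable cells starting in this row (0 if none).
    pub max_unbreakable_rowspan: usize,
}

#[derive(Debug, PartialEq)]
pub struct UnbreakableRowGroup {
    /// Indices of the rows that must share a region.
    pub rows: Vec<usize>,
    /// Total height of those rows.
    pub height: f64,
}

/// Walks forward from `first_row`, extending the group whenever a row
/// contains an unbreakable cell that reaches further down.
pub fn simulate_unbreakable_row_group(
    rows: &[RowInfo],
    first_row: usize,
) -> UnbreakableRowGroup {
    let mut group = UnbreakableRowGroup { rows: Vec::new(), height: 0.0 };
    let mut rows_left = 0usize;
    for (y, row) in rows.iter().enumerate().skip(first_row) {
        rows_left = rows_left.max(row.max_unbreakable_rowspan);
        if rows_left == 0 {
            break;
        }
        group.rows.push(y);
        group.height += row.height;
        rows_left -= 1;
    }
    group
}

/// First region, starting at `current`, tall enough for the whole group.
pub fn first_fitting_region(
    region_heights: &[f64],
    current: usize,
    needed: f64,
) -> Option<usize> {
    (current..region_heights.len()).find(|&i| region_heights[i] >= needed)
}
```

If row 0 starts an unbreakable rowspan of 2 and row 1 starts one of 3, the group grows to cover rows 0 through 3, and layout would skip ahead to the first region that can hold their combined height.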

Of note, the unbreakable row group simulation will also run measure_auto_row when it finds an auto row. This can only happen, by default, when the user explicitly writes grid.cell(breakable: false)[this cell spans an auto row]; otherwise, we define that, without an explicit override, a rowspan cell is unbreakable iff it does not span any auto rows.
- measure_auto_row was adapted to work with unbreakable row groups - it will use Regions::one to measure cells when unbreakable_rows_left is positive (or we're simulating an unbreakable row group).
- When called during an unbreakable row group simulation, it also takes the current row group data to determine how much height would still be available considering the rows in the simulation.

Breakable rowspans

In principle, rowspans are breakable by default, as rowspans are defined by their rows, so the mere process of laying out rows and skipping pages when they don't fit, as is currently done, will generate breakable rowspans, as long as they span those rows.

The main problem is that rowspan cells might have content which requires more vertical space than what's made available by their fixed-size rows. Therefore, auto rows should expand just enough so that the rowspans can (in theory) fully display their content. As such, the problem being described only applies to rowspans spanning auto rows - by default, all such rowspans are breakable (although that can be overridden).

For unbreakable rowspans or rowspans within a single page, that problem is pretty simple to solve (ignoring fractional rows) - just sum the heights of all spanned fixed-size rows and subtract that sum from the height demanded by the content inside the rowspan. If the result, let's say h, is positive, this means that we need to expand an auto row - which will always be the last spanned auto row - by at least h units, such that the content will be fully laid out.
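
The single-page case boils down to one subtraction. A minimal sketch, with a hypothetical function name and plain `f64` heights instead of the real length type:

```rust
/// Amount `h` by which the last spanned auto row must grow so that the
/// rowspan's content fits, given the heights of the spanned fixed-size rows.
/// Returns 0 when the fixed-size rows already provide enough space.
pub fn auto_row_expansion(content_height: f64, spanned_fixed_rows: &[f64]) -> f64 {
    let fixed: f64 = spanned_fixed_rows.iter().sum();
    (content_height - fixed).max(0.0)
}
```

For instance, content needing 100 units across fixed rows of 30 and 20 units requires the auto row to grow by 50 units.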

However, for breakable rowspans spanning at least two pages, that isn't enough, for a few reasons:

1. Rowspan content height becomes an array of numbers (one height per page), instead of a single number. In my calculations, I just sum the heights in each page to be able to convert it back to a single number, but I acknowledge this doesn't produce the most precise calculations. Still, it's a pretty good estimation at least, and should be fully accurate for simple and contiguous content, such as fixed-height blocks and rects.
2. Fixed-size rows might have different sizes in different pages if they have a ratio (e.g. 50%), as different pages might have different heights. I do not tackle this specific case in my code, in principle, unless the grid has gutters (which leads us to the more complex case below).
3. Spanned gutter spacing between rows - henceforth "gutter rows" - do matter, as content inside a rowspan can appear above some spanned gutter row. However, gutter rows can be removed when they appear at the bottom or top of a page; they are effectively "replaced" by page breaks. Therefore, the simple calculation of spanned height - content height becomes inaccurate, as spanned height might change depending on which gutter rows are removed or not.

To tackle problem 3 specifically (and problem 2 along with it, but only when 3 is a problem, to simplify things for now), I had to adapt measure_auto_row as follows.

When a rowspan has the current auto row as its last spanned auto row, we measure it considering the height it had in previous regions, in rows already laid out in the current region, and potentially considering the height it will have in future rows.
After getting the vector of heights in each region for the current rowspan, we first subtract, from the left of the array, all height it has already spanned in the current region. That height is final, so it'll be fully accurate to do this.
Next, we check if the rowspan is breakable, the grid gutter is active, and it spans further rows (this isn't the last row it spans). If all conditions are satisfied, we push it to pending_rowspans instead of doing anything with its remaining sizes. Rowspans in that vector will be simulated, together, later, to try to figure out which gutters will be removed later on.
Otherwise (rowspan is unbreakable or there's no gutter), we subtract, from the end of the vector of heights, the total height spanned by future rows.
- This is mostly accurate, except that it doesn't solve 2 (relative heights depend on region height). I could force this case to use the simulation process as well, but it's not really needed for now, I think. We can always change this later once we have more real use case data.
After we checked all cells in the auto row, we run the simulation (simulate_and_measure_rowspans_in_auto_row in rowspans.rs). Here, we first join all vectors of sizes of pending rowspans into a single vector of sizes (taking max(rowspan.heights.at(region) for rowspan in pending_rowspans) for each entry in the vector), as if they all became a single rowspan, so we can work with a single "height needed" number (by just adding them up). We also figure out the max spanned row by a rowspan such that our "unified" rowspan will go up to that row.
Then, in our simulation, we start at the last region which the auto row has grown to so far (based on data from cells which didn't require simulation); the idea is to see if we will need to expand it even more.
We do this by going through all rows from the row after the auto row until the max spanned row and adding up their heights. When we reach the max height of a region, we properly subtract the height of the latest gutter from the total (it'd be removed), same if it would be the first row in the (new) region.
After going through all such rows, we calculate that the auto row will have to expand by total_rowspan_height - total_spanned_height units. But, just to confirm, we run the simulation again, caching that value.
If, by running the simulation again, we happened to need to expand the auto row even more, because some large gutter row was removed or something, then we do so, and try again, performing the same comparison. Otherwise (if it wouldn't have to expand more, or would otherwise expand less), we consider the value as final, and the auto row is expanded by that much.
Since the auto row only increases in each simulation attempt, it's certain that the simulation will converge at some point (at worst, it will have the auto row expand the full max height of rowspans, wasting as much space as possible). Still, I added a cap of up to 5 attempts at simulation for now. We can expand this in the future.
- If the cap of 5 attempts is reached, we give up and just expand the auto row by total_rowspan_height - predicted_future_height where predicted_future_height is the sum of all upcoming fixed-size rows (ignoring gutter rows). This necessarily means the auto row will expand more than it should, but it's the best we can do in such a situation, at least for now.
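
Two pieces of the process above can be sketched in isolation: merging the pending rowspans' heights into one "unified" rowspan, and the capped retry loop. This is a sketch under assumptions, not the code in rowspans.rs: both function names are hypothetical, and the closure `spanned_height_given` stands in for the actual simulation, returning the total height of the upcoming spanned rows (after gutter removal) when the auto row is expanded by the given amount.

```rust
/// Joins the per-region heights of all pending rowspans by taking the
/// maximum in each region, as if they were a single rowspan.
pub fn merge_heights(pending: &[Vec<f64>]) -> Vec<f64> {
    let regions = pending.iter().map(Vec::len).max().unwrap_or(0);
    (0..regions)
        .map(|region| {
            pending
                .iter()
                .filter_map(|heights| heights.get(region).copied())
                .fold(0.0, f64::max)
        })
        .collect()
}

/// Repeats the simulation until the required expansion stops growing,
/// giving up after 5 attempts and returning `fallback` instead.
pub fn converge_expansion(
    total_rowspan_height: f64,
    fallback: f64,
    mut spanned_height_given: impl FnMut(f64) -> f64,
) -> f64 {
    const MAX_ATTEMPTS: usize = 5;
    let mut amount = 0.0_f64;
    for _ in 0..MAX_ATTEMPTS {
        let spanned = spanned_height_given(amount);
        let needed = (total_rowspan_height - spanned).max(0.0);
        if needed <= amount {
            // No further growth is required: the cached value is final.
            return amount;
        }
        // Some spanned height disappeared (e.g. a removed gutter); retry.
        amount = needed;
    }
    fallback
}
```

Here `fallback` would be total_rowspan_height - predicted_future_height from the last item above. Because the candidate amount only ever increases between attempts, the loop either settles or hits the cap.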

And, with that, we have the full process run by measure_auto_row. Of note, the simulation described above does consider unbreakable row groups and whatnot: it tries to replicate the real layout algorithm as much as possible, while attempting to repeat the least code possible from it. Also, for now, the simulation ignores fractional-sized rows, so these will cause the auto row to expand more than it should.

TODO

~~TODO:~~ Open draft PR
~~TODO:~~ Perform any further necessary rebases
~~TODO:~~ Cleanup, review again, check for mistakes
~~TODO:~~ Perhaps add a few more tests?
~~TODO:~~ Undraft PR

laurmaedje · 2024-03-03T19:32:32Z

Yay! :)

PgBiel changed the title ~~Rowspans~~ Merging cells: Rowspans [More Flexible Tables Pt.3b] Feb 27, 2024

PgBiel mentioned this pull request Feb 27, 2024

Tracking issue: More flexible tables #3001

Closed

18 tasks

PgBiel added 26 commits February 27, 2024 23:18

create rowspan cell fields and methods

44b70f8

implement rowspan support in CellGrid

1c3355d

initial hline splitting + vline thru rowspans

1ab763e

add rowspan field to Cell type

92176a5

add hline splitting unit tests

32aa02e

adapt 'measure_auto_columns' to consider rowspans

ba6d3f3

rowspans affect the height of last auto row

f2a617b

add missing gutter checks with rowspan

fc244de

create unbreakable row groups

74b8919

- For now, they are only formed when an unbreakable rowspan is detected - Groups rows together so they are laid out in a single region

fix measure_auto_row with rowspans

aa0597d

initial rowspan layout

686c072

We keep track of all rowspans in the table, and later backtrack to draw them over the final frames once all of their rows' heights have been resolved.

initial support for rowspan fills

95d652e

Even across regions.

measure_auto_row: consider height at this region

7c78a51

Consider already covered height of rows which weren't resolved yet.

fix fill for rowspans with absent rows

e1541d8

improve 'hline_stroke_at_column' parent y guards

db4fe32

an attempt at improving 'measure_auto_row'

8f7a7e3

There's still some work to do

yet another attempt at fixing rowspan measuring

6b95b0f

No idea why subtracting `height_in_this_region` on multi-region rowspans breaks some examples and fixes others...

change measure_auto_row strategy to skip frames

d475079

equivalent to previous approach, but simpler

well, this seems to have fixed it

84f20ab

seems like it was related to first frame expansion in 'layout_auto_row' after measurement, and now that we're removing entire frames from previous regions, there's no risk of some "stray frame" remaining at the start of 'sizes', but not sure here.

skip gutter rows at top or bottom of regions

c83ffdf

initial rowspan tests

b75c3de

update tests after rebasing on line customization

2d3d639

measure_auto_row: use 'frames_in_previous_regions'

dbeb0be

initial work on new rowspan algorithm

deee087

1st version of rowspan auto row algorithm

8fb4226

very buggy right now

fix simulated last size not being pushed

13d86d4

When the simulation is cancelled, we need to push it back. Or delay popping it (might explore that approach)

PgBiel added 19 commits February 29, 2024 02:47

add unbreakable auto rowspan test

c5820c7

add some rowspan rtl tests

eabdf2d

Merge branch 'typst:main' into rowspans

49b36ad

properly resolve breakability of absent cells

8dd6c3e

add cell breakability tests

5afe99c

don't check rowspans and breakability for gutter

b9a2a63

use Vec::as_slice for lines

6cb00d9

remove unnecessary deref

ac9aab8

use pattern matching for skip check

e9f282d

remove redundant checks in lines.rs

8425815

fix doc comment in rowspans.rs

fbe0fe2

measure auto rows with infinite height

be5e350

Should fix problems with content overflow, and properly allow auto rows to bypass the height of an unbreakable container when needed (they don't "glitch out" at least, which wouldn't be very useful)

update tests after infinite height measuring

19d5351

update missing test

9a71d91

improve rowspan simulation algorithm

7b55ef7

- A bit more readable perhaps - Plus better comments

add more rowspan split tests

7771e5e

move RTL check out of rowspan draw loop

2f885fa

remove inconsistent '.max(Abs::zero())' calls

df7db35

also skip empty rowspan sizes

fea59f5

PgBiel marked this pull request as ready for review March 1, 2024 08:06

fix bottom border stroke not getting priority

fec7585

laurmaedje mentioned this pull request Mar 1, 2024

Fix out of flow check #3533

Merged

PgBiel added 3 commits March 1, 2024 12:05

Merge branch 'main' into rowspans

9ce6028

update tests after inset bug fix

8baf81b

Fix extraneous spacing when things are sent to the next page

pass the regions to simulate_unbreakable_row_group

51b1a84

Makes a difference when the region base is different.

PgBiel mentioned this pull request Mar 3, 2024

Repeatable Table Headers [More Flexible Tables Pt.5a] #3545

Merged

laurmaedje added this pull request to the merge queue Mar 3, 2024

Merged via the queue into typst:main with commit decb4fd Mar 3, 2024
