@PgBiel PgBiel commented Feb 27, 2024

Allows specifying a rowspan to grid.cell and table.cell to have a cell span more than one row.

Second part of the third task in #3001. Closes #131.

This PR does not contain major changes to the docs. It is expected that those will be done later.

DRAFT: Doing some final touches still.

User-facing API

  1. You can now specify grid.cell(rowspan: amount, ...) and table.cell(rowspan: amount, ...) to have a cell span amount rows, as seen below.
#grid(
  columns: 4,
  fill: (x, y) => if calc.odd(x + y) { blue.lighten(50%) } else { blue.lighten(10%) },
  inset: 5pt,
  align: center,
  grid.cell(rowspan: 2, fill: orange)[*Left*],
  [Right A], [Right A], [Right A],
  [Right B], grid.cell(colspan: 2, rowspan: 2, fill: orange.darken(10%))[B Wide],
  [Left A], [Left A],
  [Left B], [Left B], grid.cell(colspan: 2, rowspan: 3, fill: orange)[Wide and Long]
)

#table(
  columns: 4,
  fill: (x, y) => if calc.odd(x + y) { blue.lighten(50%) } else { blue.lighten(10%) },
  inset: 5pt,
  align: center,
  table.cell(rowspan: 2, fill: orange)[*Left*],
  [Right A], [Right A], [Right A],
  [Right B], table.cell(colspan: 2, rowspan: 2, fill: orange.darken(10%))[B Wide],
  [Left A], [Left A],
  [Left B], [Left B], table.cell(colspan: 2, rowspan: 3, fill: orange)[Wide and Long]
)

[Image: some rowspans and stuff]

  2. You can now specify breakable: true/false/auto on cells, as in #table.cell(rowspan: ..., breakable: ...)[a], to either enforce that all rows spanned by that cell stay in the same page (false) or allow the cell (and, thus, its rows) to span multiple pages (true). The default is auto: the cell is unbreakable if it only spans fixed-size rows, and breakable if it spans at least one auto row.
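The auto default could be resolved roughly as in the sketch below. This is a minimal illustration rather than the PR's actual code: RowSizing and resolve_breakable are hypothetical names, and fractional rows are left out.

```rust
/// Hypothetical stand-in for a row's sizing; not the actual Typst type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum RowSizing {
    /// The row grows to fit its content.
    Auto,
    /// The row has a fixed height (plain points here, for simplicity).
    Fixed(f64),
}

/// Resolves a cell's `breakable` setting.
/// `None` models `auto`; `Some(b)` models an explicit `true` or `false`.
fn resolve_breakable(explicit: Option<bool>, spanned_rows: &[RowSizing]) -> bool {
    // With `auto`, the cell is breakable only if it spans an auto row.
    explicit.unwrap_or_else(|| {
        spanned_rows.iter().any(|row| matches!(row, RowSizing::Auto))
    })
}
```

An explicit value always wins; only the auto case looks at the spanned rows.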

Implementation details

The changes in this PR are divided as follows.

API and CellGrid changes

  • Cells now have rowspan and breakable fields. The former indicates how many rows it spans (default 1), the latter indicates whether or not the contents of the cell can be broken across pages (in particular, when it's not breakable, all rows spanned by the cell are forced to be displayed in the same page).
  • CellGrid was changed to properly mark rowspans' spanned positions as merged positions.
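As a rough illustration of marking spanned positions as merged, here is a simplified sketch. Entry and place_cell are hypothetical names; the real CellGrid stores more data and resolves positions differently.

```rust
/// Hypothetical, simplified grid entry.
#[derive(Clone, Debug, PartialEq)]
enum Entry {
    /// No cell has been placed here yet.
    Empty,
    /// The top-left ("parent") position of a cell.
    Cell { colspan: usize, rowspan: usize },
    /// A position covered by a cell; stores the parent's flat index.
    Merged { parent: usize },
}

/// Places a cell at (x, y) in a row-major grid with `cols` columns and marks
/// every other spanned position as merged. Returns `None` if the cell would
/// leave the grid or overlap an occupied position.
fn place_cell(
    entries: &mut [Entry],
    cols: usize,
    x: usize,
    y: usize,
    colspan: usize,
    rowspan: usize,
) -> Option<()> {
    if cols == 0 || colspan == 0 || rowspan == 0 {
        return None;
    }
    let rows = entries.len() / cols;
    if x + colspan > cols || y + rowspan > rows {
        return None;
    }
    // Check first, so that a failed placement leaves the grid untouched.
    for dy in 0..rowspan {
        for dx in 0..colspan {
            if entries[(y + dy) * cols + (x + dx)] != Entry::Empty {
                return None;
            }
        }
    }
    let parent = y * cols + x;
    for dy in 0..rowspan {
        for dx in 0..colspan {
            entries[(y + dy) * cols + (x + dx)] = Entry::Merged { parent };
        }
    }
    entries[parent] = Entry::Cell { colspan, rowspan };
    Some(())
}
```

With a rowspan of 2 at (0, 0) in a 4-column grid, index 4 (the position directly below) ends up as Merged { parent: 0 }, so later cells cannot be placed there.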

Basic rowspan layout

  • Created a rowspans.rs file for functions and types exclusive to rowspans.
  • The Rowspan struct has all data needed to lay out a rowspan. It includes at which row it starts, how many rows it spans, and so on. One notable field in that struct is heights, which indicates the height the rowspan will have available in each page. The heights are increased by each spanned row's final height during finish_region. Thus, we will only know the heights a rowspan will have after all of its rows have been laid out.
  • There is a global vector of "not yet laid out" rowspans in the GridLayouter.
    • Before each row is laid out, check_for_rowspans runs, which checks if any cells within it are rowspans; if so, creates a Rowspan and appends it to the vector. That's how we're aware of rowspans during finish_region.
      • We use a separate function to check for rowspans rather than integrating it into layout_single_row or layout_multi_row, because auto rows might not call either of those functions (if they have 0 measured frames, which is rather common for rowspans - this will always happen when the auto row is 100% composed of rowspans which don't end at it).
  • The layout_rowspan function takes a single Rowspan and lays out its cell's contents over the spanned pages, with the appropriate heights as part of the backlog. To do this, it modifies the frames for each region in self.finished, inserting the cell's contents.
    • However, it optionally takes an additional current_region frame, because the last region spanned by the row might not yet be finished when it is laid out. This is because layout_rowspan is triggered as soon as the last row - or any row after the last row - of the rowspan is finished and laid out within finish_region. However, it may also be triggered for remaining rowspans after all regions were finished (if their last rows weren't laid out for some reason, which can happen in some edge cases).
    • Of note, a single row might be laid out in finish_region multiple times if it spans multiple pages. Therefore, I added a new parameter to Row::Frame, a boolean, which is true when this is the last region which this row occupies, triggering the layout of all rowspans ending in it.
  • Therefore, this is the full cycle of a rowspan, in principle:
    1. check_for_rowspans appends Rowspans to the self.rowspans vector, before each row is laid out in fn layout.
    2. In finish_region, we keep expanding the rowspan's height for each spanned row.
    3. Once we reach the last row spanned by the rowspan (or any row after it), we take it from the vector and call layout_rowspan.
    4. The rowspan has now been laid out.
  • As a consequence, any adjustment to the shape of the rowspan - including breakable vs unbreakable - is exclusively dependent on the rows it spans. That is, a rowspan will only occupy the space between its first spanned row and its last spanned row, whichever height that might represent.
    • Fixed-size rows don't really care about that. However, auto rows - specifically the last auto row spanned by a rowspan - might expand so that rowspans have enough height to lay out their contents. This will be explained below.
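The cycle above can be modelled in a few lines. The sketch below is a reduced illustration, not the PR's code: the field names, FinishedRow, and this standalone finish_region are assumptions, and it assumes every pending rowspan has at least one row in the region being finished.

```rust
/// Hypothetical, reduced version of the `Rowspan` data described above.
#[derive(Debug)]
struct Rowspan {
    /// First row spanned by the cell.
    y: usize,
    /// Number of rows spanned.
    rowspan: usize,
    /// Height available to the rowspan in each region (page) it spans.
    heights: Vec<f64>,
}

/// A row finished in a region: its index, its final height there, and whether
/// this is the last region the row occupies (like the new `Row::Frame` flag).
struct FinishedRow {
    y: usize,
    height: f64,
    is_last: bool,
}

/// Grows each pending rowspan's height for the region being finished, then
/// returns the rowspans whose last row (or a later row) is fully laid out.
fn finish_region(pending: &mut Vec<Rowspan>, rows: &[FinishedRow]) -> Vec<Rowspan> {
    let mut ready = Vec::new();
    // Each pending rowspan gets a fresh height slot for this region.
    for rowspan in pending.iter_mut() {
        rowspan.heights.push(0.0);
    }
    for row in rows {
        for rowspan in pending.iter_mut() {
            let spanned = rowspan.y..rowspan.y + rowspan.rowspan;
            if spanned.contains(&row.y) {
                *rowspan.heights.last_mut().unwrap() += row.height;
            }
        }
        if row.is_last {
            // Take out every rowspan that ends at or before this row.
            let mut i = 0;
            while i < pending.len() {
                if pending[i].y + pending[i].rowspan <= row.y + 1 {
                    ready.push(pending.remove(i));
                } else {
                    i += 1;
                }
            }
        }
    }
    ready
}
```

The returned rowspans are the ones that would then be passed to layout_rowspan, with heights holding one entry per spanned region.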

Regarding fills and strokes

  1. Fills were adapted such that a cell's fill is placed once for each spanned region. This means that a gradient or pattern would be repeated for each spanned region, not "stretched" at once across regions. This is done by checking the "local parent Y" of each cell, that is, the first row spanned by the cell in the current region (which will be its parent Y if the cell hasn't been broken across pages so far). If we're at the local parent Y at this moment, we draw the fill again.
    Not sure if we should perhaps switch to drawing a single rectangle across multiple frames or something. Frankly, I don't know if that's even possible.

  2. Hlines were adapted so that they aren't drawn when they are inside a rowspan. Cool.

  3. But they were also adapted in another way. Usually, hlines fold the stroke of the bottom cell with the stroke of the top cell (usually giving priority to the bottom cell) together with the user-provided line's stroke to determine its final stroke. However, especially with rowspans thrown into the mix, the "top cell" might not actually have been laid out at all. This can happen if the hline is at the top of the region (which we already used to detect by checking if its index is 0, since hlines of index 0 are considered part of the top border and repeated across all regions), but also if the row immediately above the hline is an auto row which was fully empty and thus removed (can happen if the only things in the auto row are rowspans crossing through it). Now, hlines check the "top cell" in the "local top Y", that is, the last row above the hline in the current region. To achieve this, it receives a local_top_y parameter which is calculated once. It also receives the vector of rows in the current region as a parameter as it uses similar logic to determine whether or not it is inside a rowspan and thus shouldn't be drawn (in principle, it's fine to be "inside" a rowspan if none of the rowspan's rows above the hline were laid out in the current region).

  4. As a consequence of that change, and looking to produce better results when using table borders in general, I made it so cells' strokes now fold with the table top and bottom borders. This avoids some visual problems when the table borders are produced only by cell strokes, e.g. when there's gutter, you'd see some small holes (column gutters) at the bottom and top of each page, even when there's a colspan at the bottom of the page which would request a continuous line under it.
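The "local parent Y" and "local top Y" lookups described above might look like the following sketch. Both function signatures are assumptions for illustration: rows_in_region is taken to be the ascending list of row indices actually laid out in the current region.

```rust
/// Returns the "local parent Y" of a cell: the first row it spans that was
/// actually laid out in the current region, if any.
fn local_parent_y(rows_in_region: &[usize], parent_y: usize, rowspan: usize) -> Option<usize> {
    rows_in_region
        .iter()
        .copied()
        .find(|&y| y >= parent_y && y < parent_y + rowspan)
}

/// Returns the "local top Y" of the hline with index `hline_y` (the line
/// above row `hline_y`): the last row laid out above it in this region.
fn local_top_y(rows_in_region: &[usize], hline_y: usize) -> Option<usize> {
    rows_in_region.iter().copied().filter(|&y| y < hline_y).last()
}
```

For a region containing rows 3, 5 and 6 (row 4 being an empty auto row that was removed), the hline above row 5 gets row 3 as its local top, and the hline above row 3 has no top cell at all in that region.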

Unbreakable rowspans

They are pretty simple, in theory. Here's how they work:

  1. GridLayouter has a field called unbreakable_rows_left. While that field's value is positive, finish_region should not be called.
  2. Before each non-gutter row is laid out, we run check_for_unbreakable_rows.
  3. That function doesn't do anything while unbreakable_rows_left is positive, but if it's zero, it will run simulate_unbreakable_row_group.
  4. That function will use check_for_unbreakable_cells to check each cell of the current row to see if it is the parent of an unbreakable cell/rowspan. If so, it begins a simulation to find out how many more rows will have to be laid out together, in the same region. This is done by checking each upcoming row and checking if they themselves have unbreakable cells; if so, the simulation will run for at least max(cell.rowspan for cell in row) more rows. The simulation also calculates the total height of the row group, and returns its collected data in an UnbreakableRowGroup struct instance.
  5. Then, back to check_for_unbreakable_rows, we skip to the first region with enough height to fit the new unbreakable row group and update self.unbreakable_rows_left. That way, all upcoming rows within the unbreakable row group are laid out together in the same region, as finish_region won't be called at all while those rows are being laid out.

As a consequence, the rowspan spanning those rows will be in a single region, because all that matters for a rowspan is the rows it spans.

  • Of note, the unbreakable row group simulation will also run measure_auto_row when it finds an auto row. This can only happen when the user explicitly writes grid.cell(breakable: false)[this cell spans an auto row]; without an explicit override, a rowspan cell is unbreakable iff it does not span any auto rows.
    • measure_auto_row was adapted to work with unbreakable row groups - it will use Regions::one to measure cells when unbreakable_rows_left is positive (or we're simulating an unbreakable row group).
    • When called during an unbreakable row group simulation, it also takes the current row group data to determine how much height would still be available considering the rows in the simulation.
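A simplified model of the unbreakable row group simulation is sketched below. It assumes row heights are already known (so the auto row measurement is skipped), and RowInfo, first_fitting_region, and the fields of UnbreakableRowGroup are illustrative guesses rather than the PR's definitions.

```rust
/// Hypothetical summary of a row: its known height and the largest rowspan
/// among unbreakable cells starting in it (0 if it starts none).
struct RowInfo {
    height: f64,
    max_unbreakable_rowspan: usize,
}

/// Result of the simulation, loosely modelled on `UnbreakableRowGroup`.
#[derive(Debug, PartialEq)]
struct UnbreakableRowGroup {
    /// How many rows must stay together, starting at the current row.
    rows: usize,
    /// Total height of those rows.
    height: f64,
}

/// Walks rows from `start`, extending the group while unbreakable cells
/// still cover upcoming rows.
fn simulate_unbreakable_row_group(rows: &[RowInfo], start: usize) -> UnbreakableRowGroup {
    let mut group = UnbreakableRowGroup { rows: 0, height: 0.0 };
    let mut rows_left: usize = 0;
    for row in &rows[start.min(rows.len())..] {
        // An unbreakable cell starting in this row may extend the group.
        rows_left = rows_left.max(row.max_unbreakable_rowspan);
        if rows_left == 0 {
            break;
        }
        group.rows += 1;
        group.height += row.height;
        rows_left -= 1;
    }
    group
}

/// Finds the first region, starting at `current`, tall enough for the group.
fn first_fitting_region(
    region_heights: &[f64],
    current: usize,
    group: &UnbreakableRowGroup,
) -> Option<usize> {
    (current..region_heights.len()).find(|&i| region_heights[i] >= group.height)
}
```

Note how a second unbreakable cell starting inside the group can push its end further down, which is why the simulation has to look at every upcoming row rather than only the first one.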

Breakable rowspans

In principle, rowspans are breakable by default, as rowspans are defined by their rows, so the mere process of laying out rows and skipping pages when they don't fit, as is currently done, will generate breakable rowspans, as long as they span those rows.

The main problem is that rowspan cells might have content which requires more vertical space than what's made available by their fixed-size rows. Therefore, auto rows should expand just enough so that the rowspans can (in theory) fully display their content. As such, the problem being described only applies to rowspans spanning auto rows - by default, all such rowspans are breakable (although that can be overridden).

For unbreakable rowspans or rowspans within a single page, that problem is pretty simple to solve (ignoring fractional rows) - just sum all of the fixed-size rows and subtract that from the height demanded by the content inside the rowspan. If the result, let's say h, is positive, this means that we need to expand an auto row - which will always be the last spanned auto row - by at least h units, such that the content will be fully laid out.
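In code, the single-page case amounts to the small calculation below. auto_row_expansion is a hypothetical helper name, and heights are plain numbers (e.g. points) for illustration.

```rust
/// Extra height `h` the last spanned auto row needs so that a rowspan's
/// content fits, for a rowspan that stays within a single region.
/// `other_spanned_rows` holds the heights of the other rows it spans.
fn auto_row_expansion(content_height: f64, other_spanned_rows: &[f64]) -> f64 {
    let spanned: f64 = other_spanned_rows.iter().sum();
    // A negative result means the content already fits; no expansion needed.
    (content_height - spanned).max(0.0)
}
```

For example, content needing 100pt across rows of 30pt and 20pt requires the auto row to grow by 50pt, while content needing only 40pt requires no growth.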

However, for breakable rowspans spanning at least two pages, that isn't enough, for a few reasons:

  1. Rowspan content height becomes an array of numbers (one height per page), instead of a single number. In my calculations, I just sum the heights in each page to be able to convert it back to a single number, but I acknowledge this doesn't produce the most precise calculations. Still, it's a pretty good estimation at least, and should be fully accurate for simple and contiguous content, such as fixed-height blocks and rects.
  2. Fixed-size rows might have different sizes in different pages if they have a ratio (e.g. 50%), as different pages might have different heights. I do not tackle this specific case in my code, in principle, unless the grid has gutters (which leads us to the more complex case below).
  3. Spanned gutter spacing between rows - henceforth "gutter rows" - does matter, as content inside a rowspan can appear above some spanned gutter row. However, gutter rows can be removed when they appear at the bottom or top of a page; they are effectively "replaced" by page breaks. Therefore, the simple calculation of content height - spanned height becomes inaccurate, as spanned height might change depending on which gutter rows are removed or not.

To tackle specifically 3 (and 2 along with it, but only when 3 is a problem, to simplify things for now), I had to adapt measure_auto_row as follows.

  1. When a rowspan has the current auto row as its last spanned auto row, we measure it considering the height it had in previous regions, in rows already laid out in the current region, and potentially considering the height it will have in future rows.
  2. After getting the vector of heights in each region for the current rowspan, we first subtract, from the left of the array, all height it has already spanned in the current region. That height is final, so it'll be fully accurate to do this.
  3. Next, we check if the rowspan is breakable, the grid gutter is active, and it spans further rows (this isn't the last row it spans). If all conditions are satisfied, we push it to pending_rowspans instead of doing anything with its remaining sizes. Rowspans in that vector will be simulated, together, later, to try to figure out which gutters will be removed later on.
  4. Otherwise (rowspan is unbreakable or there's no gutter), we subtract, from the end of the vector of heights, the total height spanned by future rows.
    • This is mostly accurate, except that it doesn't solve 2 (relative heights depend on region height). I could force this case to use the simulation process as well, but it's not really needed for now, I think. We can always change this later once we have more real use case data.
  5. After we checked all cells in the auto row, we run the simulation (simulate_and_measure_rowspans_in_auto_row in rowspans.rs). Here, we first join all vectors of sizes of pending rowspans into a single vector of sizes (taking max(rowspan.heights.at(region) for rowspan in pending_rowspans) for each entry in the vector), as if they all became a single rowspan, so we can work with a single "height needed" number (by just adding them up). We also figure out the max spanned row by a rowspan such that our "unified" rowspan will go up to that row.
  6. Then, in our simulation, we start at the last region which the auto row has grown to so far (based on data from cells which didn't require simulation); the idea is to see if we will need to expand it even more.
  7. We do this by going through all rows from the row after the auto row until the max spanned row and adding up their heights. When we reach the max height of a region, we properly subtract the height of the latest gutter from the total (it'd be removed), same if it would be the first row in the (new) region.
  8. After going through all such rows, we calculate that the auto row will have to expand by total_rowspan_height - total_spanned_height units. But, just to confirm, we run the simulation again, caching that value.
  9. If, by running the simulation again, we happened to need to expand the auto row even more, because some large gutter row was removed or something, then we do so, and try again, performing the same comparison. Otherwise (if it wouldn't have to expand more, or would otherwise expand less), we consider the value as final, and the auto row is expanded by that much.
  10. Since the auto row only increases in each simulation attempt, it's certain that the simulation will converge at some point (at worst, it will have the auto row expand the full max height of rowspans, wasting as much space as possible). Still, I added a cap of up to 5 attempts at simulation for now. We can expand this in the future.
    • If the cap of 5 attempts is reached, we give up and just expand the auto row by total_rowspan_height - predicted_future_height where predicted_future_height is the sum of all upcoming fixed-size rows (ignoring gutter rows). This necessarily means the auto row will expand more than it should, but it's the best we can do in such a situation, at least for now.
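The merging in step 5 can be illustrated with a short sketch: per-region heights of all pending rowspans are combined by taking the maximum in each region, and the "height needed" number is the sum of the result. merge_heights and total_needed are hypothetical helper names.

```rust
/// Joins the per-region heights of several pending rowspans into one vector,
/// taking the largest height in each region, as if they were one rowspan.
fn merge_heights(pending: &[Vec<f64>]) -> Vec<f64> {
    let regions = pending.iter().map(Vec::len).max().unwrap_or(0);
    (0..regions)
        .map(|i| {
            pending
                .iter()
                .filter_map(|heights| heights.get(i).copied())
                .fold(0.0, f64::max)
        })
        .collect()
}

/// The single "height needed" number: the sum over all regions.
fn total_needed(merged: &[f64]) -> f64 {
    merged.iter().sum()
}
```

Rowspans that span fewer regions simply do not contribute to the later entries.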

And, with that, we have the full process run by measure_auto_row. Of note, the simulation described above does consider unbreakable row groups and whatnot; it tries to replicate the real layout algorithm as much as possible, while attempting to repeat the least code possible from it. Also, for now, the simulation ignores fractional-sized rows, so these will cause the auto row to expand more than it should.
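To make the repeated simulation more concrete, here is a heavily simplified model of it. It is only a sketch under several assumptions that do not hold in the real layouter: all regions share one positive height, upcoming rows have known fixed heights, unbreakable row groups are ignored, and every name below is hypothetical.

```rust
const MAX_ATTEMPTS: usize = 5;

/// An upcoming row spanned by the (unified) rowspan.
struct UpcomingRow {
    height: f64,
    is_gutter: bool,
}

/// Space left in the region where an auto row of `auto_height` ends, given
/// `remaining_before` free space where it starts.
fn remaining_after_auto(auto_height: f64, remaining_before: f64, region_height: f64) -> f64 {
    if auto_height <= remaining_before {
        return remaining_before - auto_height;
    }
    let in_last = (auto_height - remaining_before) % region_height;
    if in_last == 0.0 { 0.0 } else { region_height - in_last }
}

/// Height spanned by upcoming rows, dropping gutters next to page breaks.
fn simulate_spanned_height(rows: &[UpcomingRow], mut remaining: f64, region_height: f64) -> f64 {
    let mut total = 0.0;
    let mut prev_gutter = 0.0; // gutter counted right before this position
    for row in rows {
        if row.is_gutter {
            if row.height > remaining {
                // The page break replaces this gutter.
                remaining = region_height;
                prev_gutter = 0.0;
                continue;
            }
            prev_gutter = row.height;
        } else {
            if row.height > remaining && remaining < region_height {
                // The row moves to a new region; the gutter left at the
                // bottom of the old region is removed.
                total -= prev_gutter;
                remaining = region_height;
            }
            prev_gutter = 0.0;
        }
        total += row.height;
        remaining = (remaining - row.height).max(0.0);
    }
    total
}

/// Repeats the simulation until the auto row height stops growing.
fn measure_auto_row_height(
    needed: f64,
    base_auto: f64,
    remaining_before: f64,
    region_height: f64,
    rows: &[UpcomingRow],
) -> f64 {
    let mut auto_height = base_auto;
    for _ in 0..MAX_ATTEMPTS {
        let remaining = remaining_after_auto(auto_height, remaining_before, region_height);
        let spanned = simulate_spanned_height(rows, remaining, region_height);
        let required = (needed - spanned).max(base_auto);
        if required <= auto_height {
            return auto_height; // confirmed: no further expansion needed
        }
        auto_height = required;
    }
    // Cap reached: give up and ignore gutter rows entirely.
    let fixed: f64 = rows.iter().filter(|r| !r.is_gutter).map(|r| r.height).sum();
    (needed - fixed).max(auto_height)
}
```

In this model, growing the auto row can push a later row onto the next page, which removes a gutter and in turn requires growing the auto row a little more; that feedback is the reason a single pass is not enough.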

TODO

  • TODO: Open draft PR
  • TODO: Perform any further necessary rebases
  • TODO: Cleanup, review again, check for mistakes
  • TODO: Perhaps add a few more tests?
  • TODO: Undraft PR

@PgBiel PgBiel changed the title from "Rowspans" to "Merging cells: Rowspans [More Flexible Tables Pt.3b]" Feb 27, 2024
- For now, they are only formed when an unbreakable rowspan is detected
- Groups rows together so they are laid out in a single region
We keep track of all rowspans in the table, and later backtrack to draw
them over the final frames once all of their rows' heights have been
resolved.
Consider already covered height of rows which weren't resolved yet.
A rowspan's first few rows could be absent because they were fully empty
auto rows. Hline splitting now detects that - hlines between rows in a
rowspan will now be displayed if the previous rows don't actually exist
(weren't rendered), thus avoiding a problem when using something like
table.cell(colspan: all columns, rowspan: 10)[aaaa] with auto rows,
since then the first 9 rows would be empty, and thus the hline above it would
only appear above the last row, meaning it wouldn't appear at all.
There's still some work to do
No idea why subtracting `height_in_this_region` on multi-region rowspans breaks some examples and fixes others...
equivalent to previous approach, but simpler
seems like it was related to first frame expansion in 'layout_auto_row'
after measurement,
and now that we're removing entire frames from previous regions, there's
no risk of some "stray frame" remaining at the start of 'sizes', but not sure here.
- Make variables a bit less messy (still missing
  'started_in_this_region' -> 'frames_in_previous_regions')
- Make breakable rowspan across unbreakable auto row make a bit more
  sense (keep the backlog)
- Join the rowspan's current partial backlog with the current region's
  backlog to form its predicted backlog for measuring
When the simulation is cancelled, we need to push it back. Or delay popping it (might explore that approach)
@PgBiel PgBiel marked this pull request as ready for review March 1, 2024 08:06
PgBiel added 3 commits March 1, 2024 12:05
Fix extraneous spacing when things are sent to the next page
Makes a difference when the region base is different.
@laurmaedje

Yay! :)

Merged via the queue into typst:main with commit decb4fd Mar 3, 2024
Successfully merging this pull request may close these issues.

Combining grid/table rows and columns