-
-
Notifications
You must be signed in to change notification settings - Fork 6.3k
[RFC] Represent Screen state as UTF-8 #7992
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hmm, not all these assignments have coverage. For instance, we have no test that uses folded lines inside cmdline window. |
4e08af6
to
8e658cc
Compare
Question: currently this PR freezes the option |
'maxcombine' likely had the same motivation as 'encoding'. +1 for freezing & deprecating it. |
Away it goes then. Though unlike In the future we could consider implementing a NFG-like solution to support arbitrary graphemes without O(screensize) overhead (not necessarily real normalization, could just keep a mapping between UTF-8 grapheme strings and invented out-of-range codepoints) |
367ee40
to
d2c80ff
Compare
a0399b4
to
90174c0
Compare
src/nvim/globals.h
Outdated
typedef char_u schar_T; | ||
typedef unsigned short sattr_T; | ||
typedef char_u schar_T[MAX_MCO * 4 + 1]; | ||
typedef int16_t sattr_T; | ||
|
||
/* | ||
* The characters that are currently on the screen are kept in ScreenLines[]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment block still contains references to ScreenLinesUC
.
src/nvim/globals.h
Outdated
@@ -155,14 +155,7 @@ EXTERN char_u *LineWraps INIT(= NULL); /* line wraps to next line */ | |||
* ScreenLinesC[0][off] is only to be used when ScreenLinesUC[off] != 0. | |||
* Note: These three are only allocated when enc_utf8 is set! | |||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment block contains old references.
src/nvim/screen.c
Outdated
// ScreenLines[off] Contains a copy of the whole screen, as it is currently | ||
// displayed | ||
// ScreenAttrs[off] Contains the associated attributes. | ||
// LineOffset[row] Contains the offset into ScreenLines*[] and ScreenAttrs[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ScreenLines*[]
-> ScreenLines[]
src/nvim/screen.c
Outdated
// displayed | ||
// ScreenAttrs[off] Contains the associated attributes. | ||
// LineOffset[row] Contains the offset into ScreenLines*[] and ScreenAttrs[] | ||
// for each line. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation is too small, probably because TAB
s were removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These could probably be removed anyway, and moved to globals.h
if needed.
if (clear_width > 0 && !rlflag) { | ||
// blank out the rest of the line | ||
while (col < clear_width && ScreenLines[off_to][0] == ' ' | ||
&& ScreenLines[off_to][1] == NUL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the test for Some composing characters could follow.' '
not enough to know that ascii is encoded here?
feec318
to
2441d41
Compare
@justinmk Why 0.3.2 and not 0.3.1? The PR is done basically, and if merged soon, it will receive enough testing on master for 0.3.1. |
All tests pass, except for unrelated timer failure in QB. Internal docs are not perfect yet, but I aim to gradually improve them in following PRs. Running the new tests on master to confirm that behavior is consistent: #8534 |
2263158
to
7dcd1b6
Compare
if the failure |
src/nvim/eval.c
Outdated
c = ScreenLinesUC[off]; | ||
else | ||
c = ScreenLines[off]; | ||
c = mb_ptr2char(ScreenLines[off]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it wrapped in mb_ptr2char() now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because for back compat this function must return the char as a Number. (And thus ignore composing chars) Should be utf_ptr2char
though.
* Returns the produced number of bytes. | ||
*/ | ||
int utfc_char2bytes(int off, char_u *buf) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC now utfc_char2bytes is not needed because the screen already contains what would have been the result of that. So ui_puts can be called directly on the screen bytes.
Would appreciate calling this out in the commit message as a concrete example.
src/nvim/screen.c
Outdated
// ScreenLines[]. | ||
// | ||
// update_screen() is the function that updates all windows and status lines. | ||
// It is called form the main loop when must_redraw is non-zero. It may be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo
b0d0eec
to
050f397
Compare
Store text in ScreenLines as UTF-8, so it can be sent as-is to the UI layer. `utfc_char2bytes(off,buf)` is removed, as `ScreenLines[off]` now already contains this representation. To recover the codepoints that the screen arrays previously contained, use utfc_ptr2char (or utf_ptr2char to ignore composing chars). NB: This commit does NOT change how screen.c processes incoming UTF-8 data from buffers, cmdline, messages etc. Any algorithm that operates on UCS-4 (like arabic shaping, treatment of non-printable chars) is left unchanged for now.
@justinmk I added detailed commit message |
Failure in ui/incommand_spec
As it is retry failure it is probably unrelated indeterministic failure (also I think I have seen it in other PRs) |
Yeah that't the "terminal activity plus inccomand" test, see #8311. Seems like that did not fully fix it? |
Yes that test is a known issue. |
FEATURES: 07499a8 #8709 man.vim: C highlighting for EXAMPLES section 07f82ad #8699 TUI: urxvt: also send xterm focus-reporting seqs 40911e4 #8616 API: emit nvim_buf_lines_event from :terminal c46997a #8546 fillchars: Add "eob" flag FIXES: 74d19f6 #8576 startup: avoid blank stdin buffer if other files were opened 4874214 #8737 Only waitpid() for processes that we care about cd6e7e8 #8743 Check all child processes for exit in SIGCHLD handler c230ef2 #8746 channel.c: Prevent channel_destroy_early() from freeing uninitialized rpc stuff 0ed8b12 #8681 transstr_buf: fix length comparison d241f27 #8708 TUI: Fix standout mode 9afed40 #8698 man.vim: fix for mandoc e889640 #8682 provider/node: npm --loglevel silent 1cbc830 #8613 API: nvim_win_set_cursor: set curswant bf6048e #8628 checkhealth: Python: fix VIRTUAL_ENV check 3cc3506 #8528 checkhealth: node.js: also search yarn CHANGES: b751449 #8619 defaults: shortmess+=F 1248178 #8578 highlight: high-priority CursorLine if fg is set. 01570f1 #8726 terminal: handle &confirm and :confirm on unloading 56065bb #8721 screen: truncate showmode messages bf2460e #7551 buffer: fix copying :setlocal options c1c14fa #8520 Ex mode: always "improved" (gQ) 050f397 #7992 options: remove 'maxcombine` option (always 6) INTERNAL: 463da84 #7992 screen: use UTF-8 representation
FEATURES: 07499a8 neovim#8709 man.vim: C highlighting for EXAMPLES section 07f82ad neovim#8699 TUI: urxvt: also send xterm focus-reporting seqs 40911e4 neovim#8616 API: emit nvim_buf_lines_event from :terminal c46997a neovim#8546 fillchars: Add "eob" flag FIXES: 74d19f6 neovim#8576 startup: avoid blank stdin buffer if other files were opened 4874214 neovim#8737 Only waitpid() for processes that we care about cd6e7e8 neovim#8743 Check all child processes for exit in SIGCHLD handler c230ef2 neovim#8746 channel.c: Prevent channel_destroy_early() from freeing uninitialized rpc stuff 0ed8b12 neovim#8681 transstr_buf: fix length comparison d241f27 neovim#8708 TUI: Fix standout mode 9afed40 neovim#8698 man.vim: fix for mandoc e889640 neovim#8682 provider/node: npm --loglevel silent 1cbc830 neovim#8613 API: nvim_win_set_cursor: set curswant bf6048e neovim#8628 checkhealth: Python: fix VIRTUAL_ENV check 3cc3506 neovim#8528 checkhealth: node.js: also search yarn CHANGES: b751449 neovim#8619 defaults: shortmess+=F 1248178 neovim#8578 highlight: high-priority CursorLine if fg is set. 01570f1 neovim#8726 terminal: handle &confirm and :confirm on unloading 56065bb neovim#8721 screen: truncate showmode messages bf2460e neovim#7551 buffer: fix copying :setlocal options c1c14fa neovim#8520 Ex mode: always "improved" (gQ) 050f397 neovim#7992 options: remove 'maxcombine` option (always 6) INTERNAL: 463da84 neovim#7992 screen: use UTF-8 representation
This simplifies how
screen.c
stores rendered chars in ScreenLines, by storing them as the UTF-8 which will anyway be encoded and sent to the UI layer. Note this PR does NOT change in any way howscreen.c
processes incoming UTF-8 data from buffers, cmdline, messages etc, any algorithm that operates on UCS-4 (like arabic shaping, treatment of non-printable chars) will still do so. Simplifications can surely be done later on, but has higher risk than this PR that only changes how the result of these algorithms are stored on the grid. The immediate purpose is rather to make it simpler to read from and process the screen state after it has been rendered to the grid:screen_char
for instance could just maintain the first and last invalid column in each row, and instead the UI layer will transmit these chunks onui_flush
by directly translating from the grid to some (opt-in) more compact representation.