Skip to content

Conversation

MichaHoffmann
Copy link

Started working on #8466

@MichaHoffmann MichaHoffmann changed the title [WIP] add (failing) test for multibyte statusline issue [WIP] Last char in statusline does not show Jul 4, 2018
@marvim marvim added the WIP label Jul 4, 2018
Copy link
Author

@MichaHoffmann MichaHoffmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If buf_p + l = buf_e we are at the last byte before the NUL byte in our buffer. The break there was to early i think.

@@ -370,7 +370,8 @@ size_t transstr_buf(const char *const s, char *const buf, const size_t len)
while (*p != NUL && buf_p < buf_e) {
const size_t l = (size_t)utfc_ptr2len((const char_u *)p);
if (l > 1) {
if (buf_p + l >= buf_e) {
//ignore lengths that do not fit the buffer
if (buf_p + l > buf_e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar case in line 387 for non printable character.

The loop in the else case further down could also be fixed. It iterates over all fields of array pcc even if it is not filled up (try unicode 0x9f as statusline character).

diff --git a/src/nvim/charset.c b/src/nvim/charset.c
index ab20996df..193d192ac 100644
--- a/src/nvim/charset.c
+++ b/src/nvim/charset.c
@@ -380,10 +380,10 @@ size_t transstr_buf(const char *const s, char *const buf, const size_t len)
         memmove(buf_p, p, l);
         buf_p += l;
       } else {
-        for (size_t i = 0; i < ARRAY_SIZE(pcc); i++) {
+        for (size_t i = 0; i < ARRAY_SIZE(pcc) && pcc[i]; i++) {
           char hexbuf[11];
           const size_t hexlen = transchar_hex(hexbuf, pcc[i]);
-          if (buf_p + hexlen >= buf_e) {
+          if (buf_p + hexlen > buf_e) {
             break;
           }
           memmove(buf_p, hexbuf, hexlen);

Copy link
Author

@MichaHoffmann MichaHoffmann Jul 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What unicode character is 0x9f? In any case, i will correct line 387 and add a testcase for some non-printable unicode!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can input unicode with <CTRL-v>u9f<space> in insert mode or on the command line.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I incorporated your comments! u9f was indeed broken; adding the &pcc[i] check resolved that.
I added a test; please have a look.

]])
end)
end)
describe('statusline', function()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not group the following tests with the previous describe() block?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, i was not sure where to put them! Do you have an idea where the most fitting place would be?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have grouped them with the previous describe() block, because of the same name statusline.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that was just a copy paste error, i will correct it.

|
]])
end)
end)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a test with up to MAX_MCO composing characters.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this test work? it only renders one <9f> in the bottom at the moment

      local screen = Screen.new(24, 4)                                               
      screen:attach()                                                                
      command('set laststatus=2')                                                    
      feed([[                                                                        
      :set statusline=<C-v>u9f                                                       
      <C-v>u9f                                                                       
      <C-v>u9f                                                                       
      <C-v>u9f                                                                       
      <C-v>u9f                                                                       
      <C-v>u9f<SPACE><CR>                                                            
      ]])                                                                            
      screen:expect([[                                                               
      ^                        |                                                     
      ~                       |                                                      
      <9f><9f><9f><9f><9f><9f>|                                                      
                              |                                                      
      ]])                                                                            
    end)                                                                             

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try putting all the character on one line: :set statusline=<C-v>u9f<C-v>u9f<C-v>u9f<C-v>u9f<C-v>u9f<C-v>u9f<CR>

Also U+009f is not a composing character. To input unicode with more than two hex digits you can use <C-v>U...:
a + U+fe20: a︠
a + U+fe20 + U+fe21: a︠︡

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behaves weird:

    it('MAX_MCO (6) unicode control points', function()                              
      local screen = Screen.new(24, 4)                                               
      screen:attach()                                                                
      command('set laststatus=2')                                                    
      feed(string.format(":set statusline=a%s%s%s%s%s%s<CR>",                          
      "<C-v>ufe20", "<C-v>ufe21", "<C-v>u20df", "<C-v>ufe26",                          
      "<C-v>ufe2e", "<C-v>u20de"))                                                     
      screen:expect([[                                                                 
      ^                        |                                                       
      ~                       |                                                        
      a︠︡⃟︦︮⃞                       |                                                        
                              |                                                        
      ]])                                                                              
    end)

results in

Expected:
  |^                        |
  |~                       |
  |a︠︡⃟︦︮⃞                       |
  |*                        |
Actual:
  |^                        |
  |~                       |
  |a︠︡⃟︦︮⃞                       |
  |*:set statusline=a︠︡⃟︦︮⃞       |

although the same command works without composing characters. I'm not sure why that does not work ( the 6 composing chars are rendered correctly )

@@ -370,7 +370,8 @@ size_t transstr_buf(const char *const s, char *const buf, const size_t len)
while (*p != NUL && buf_p < buf_e) {
const size_t l = (size_t)utfc_ptr2len((const char_u *)p);
if (l > 1) {
if (buf_p + l >= buf_e) {
//ignore lengths that do not fit the buffer
if (buf_p + l > buf_e) {
break;
}
int pcc[MAX_MCO + 2];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last element in pcc is never initialized, but the loop in line 384 would read this element if six composing characters where found. That is because utfc_ptr2char(p,&pcc[1]) did not write a NUL byte when MAX_MCO composing characters were in p.

A fix could be to use int pcc[MAX_MCO+1]; instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats very true, i'll do that.

Copy link
Author

@MichaHoffmann MichaHoffmann Jul 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utfc_ptr2char says it wants a buffer with space for at least MAX_MCO+1 chars, and since we put in &pcc[1] pcc[MAX_MCO+2] seems to be ok. Propably the fix would be

if (i <= MAX_MCO)      /* last composing char must be 0 */                         
  pcc[i] = 0;  

in utfc_ptr2char so the 0 is set even if we found MAX_MCO composing chars.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAX_MCO+1 is not necessary for the current version of utfc_ptr2char(), the parameter comment is more general as it needs to be (see).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep you are right, i re-read the function more carefully this time. ( should i change the docstring in this PR? )

@MichaHoffmann
Copy link
Author

Is this expected behaviour ?
Test:

      local screen = Screen.new(24, 4)                                           
      screen:attach()                                                            
      command('set laststatus=2')                                                
      feed(string.format(":set statusline=a%s%sb<CR>",                           
      "<C-v>ufe20", "<C-v>ufe21"))                                                   
      screen:expect([[                                                               
      ^                        |                                                     
      ~                       |                                                      
      a︠︡b                      |                                                      
                              |                                                      
      ]])   

Result:

Expected:
  |^                        |
  |~                       |
  |a︠︡b                      |
  |*                        |
Actual:
  |^                        |
  |~                       |
  |a︠︡b                      |
  |*:set statusline=a︠︡b      |

@oni-link
Copy link
Contributor

oni-link commented Jul 7, 2018

Is this expected behaviour ?

Does it depend on the use of command(...) or feed(...)?

@MichaHoffmann
Copy link
Author

MichaHoffmann commented Jul 7, 2018

using command results in

Expected:
  |^                        |
  |~                       |
  |*a︠︡b                      |
  |                        |
Actual:
  |^                        |
  |~                       |
  |*a<C-v>ufe20<C-v>ufe21b  |
  |                        |

@oni-link
Copy link
Contributor

oni-link commented Jul 7, 2018

If I input the characters directly it works:
command('set statusline=a︠︡b')

To see what code points form a character, move the cursor on the character and press ga or for the byte representation use g8.

@MichaHoffmann
Copy link
Author

Thanks, that worked ( and viewing the code points helps too, these continuation characters are getting crowded .. )

@MichaHoffmann MichaHoffmann changed the title [WIP] Last char in statusline does not show [RFC] Last char in statusline does not show Jul 7, 2018
@marvim marvim added RFC and removed WIP labels Jul 7, 2018
screen:attach()
command('set laststatus=2')
command('set statusline=o̸⃯ᷰ⃐⃧⃝')
--o + U+1DF0 + U+20EF + U+0338 + U+20D0 + U+20E7 + UU+20DD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UU+20DD -> U+20DD

Also same test with leading non-printable character followed by six combining characters?

Tests could be written a bit shorter if we use something like this

    before_each(function()
      --clear()
      screen = Screen.new(40, 4)
      screen:attach()
      command('set laststatus=2')
    end)

    after_each(function()
        screen:detach()
    end)

    ...

    it('last char shows', function()
      command('set statusline=abc')
      screen:expect([[
      ^                                        |
      ~                                       |
      abc                                     |
                                              |
      ]])
    end)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok i've done that.

end)

it('last char shows (multibyte)', function()
command('set laststatus=2')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line could be removed from all tests because it is set by before_each().

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats very true, removed it in the testcases.

@@ -1267,5 +1268,68 @@ describe('API', function()
eq(expected, nvim("list_uis"))
end)
end)
describe('statusline', function()
before_each(function()
screen = Screen.new(40, 4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What screen is used here? Should it be local to the describe() block?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, made it local to the describe scope.

screen = Screen.new(40, 4)
screen:attach()
command('set laststatus=2')
end)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me this has not the right indentation because of tabs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, dont know how that happend! Fixed it.

@oni-link
Copy link
Contributor

oni-link commented Jul 7, 2018

make lint complains about

src/nvim/charset.c:364:  Function attribute line should have 2-space indent  [whitespace/indent] [5]
src/nvim/charset.c:373:  Should have a space between // and comment  [whitespace/comments] [4]
src/nvim/charset.c:385:  At least two spaces is best between code and comments  [whitespace/comments] [2]
src/nvim/charset.c:385:  Should have a space between // and comment  [whitespace/comments] [4]

@@ -370,20 +370,21 @@ size_t transstr_buf(const char *const s, char *const buf, const size_t len)
while (*p != NUL && buf_p < buf_e) {
const size_t l = (size_t)utfc_ptr2len((const char_u *)p);
if (l > 1) {
if (buf_p + l >= buf_e) {
// ignore lengths that do not fit the buffer
if (buf_p + l > buf_e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A check like this needs to be added to the l==1 branch. transchar_byte() could return a string of length greater 1.

@oni-link
Copy link
Contributor

oni-link commented Jul 9, 2018

LGTM after adding a length check to the l==1 branch in transstr_buf().

@@ -1267,5 +1268,65 @@ describe('API', function()
eq(expected, nvim("list_uis"))
end)
end)
describe('statusline', function()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these tests should be in ui/, maybe ui/multibyte_spec.lua

@MichaHoffmann
Copy link
Author

MichaHoffmann commented Jul 10, 2018

Addressed your comments, moved the tests and added a check for length in the third branch.
The statusline in the tests is arbitrary; should it go?
There is some redundancy between transstr_buf and transstr_len and inside the conditionals inside those functions. Now that there are the tests one could think about refactorings. Should that be the scope of this PR?

@justinmk
Copy link
Member

Merged, thank you @MichaHoffmann and @oni-link !

@justinmk justinmk closed this Jul 10, 2018
@justinmk justinmk removed the RFC label Jul 10, 2018
justinmk pushed a commit that referenced this pull request Jul 10, 2018
closes #8466
closes #8664
Regression by 0d7daaa.

- Fix length comparison.
- Fix loop(s) which iterated over all fields of array `pcc` even if it
  was not filled up (try unicode 0x9f as statusline character).

Note about the tests:
- To input unicode with more than two hex digits you can use <C-v>U...:
  a + U+fe20: a︠
  a + U+fe20 + U+fe21: a︠︡
@MichaHoffmann MichaHoffmann deleted the bugfix_last_character_not_displayed_in_statusline branch July 14, 2018 12:36
justinmk added a commit that referenced this pull request Jul 18, 2018
FEATURES:
07499a8 #8709 man.vim: C highlighting for EXAMPLES section
07f82ad #8699 TUI: urxvt: also send xterm focus-reporting seqs
40911e4 #8616 API: emit nvim_buf_lines_event from :terminal
c46997a #8546 fillchars: Add "eob" flag

FIXES:
74d19f6 #8576 startup: avoid blank stdin buffer if other files were opened
4874214 #8737 Only waitpid() for processes that we care about
cd6e7e8 #8743 Check all child processes for exit in SIGCHLD handler
c230ef2 #8746 channel.c: Prevent channel_destroy_early() from freeing uninitialized rpc stuff
0ed8b12 #8681 transstr_buf: fix length comparison
d241f27 #8708 TUI: Fix standout mode
9afed40 #8698 man.vim: fix for mandoc
e889640 #8682 provider/node: npm --loglevel silent
1cbc830 #8613 API: nvim_win_set_cursor: set curswant
bf6048e #8628 checkhealth: Python: fix VIRTUAL_ENV check
3cc3506 #8528 checkhealth: node.js: also search yarn

CHANGES:
b751449 #8619 defaults: shortmess+=F
1248178 #8578 highlight: high-priority CursorLine if fg is set.
01570f1 #8726 terminal: handle &confirm and :confirm on unloading
56065bb #8721 screen: truncate showmode messages
bf2460e #7551 buffer: fix copying :setlocal options
c1c14fa #8520 Ex mode: always "improved" (gQ)
050f397 #7992 options: remove 'maxcombine` option (always 6)

INTERNAL:
463da84 #7992 screen: use UTF-8 representation
coditva pushed a commit to coditva/neovim that referenced this pull request Jul 28, 2018
closes neovim#8466
closes neovim#8664
Regression by 0d7daaa.

- Fix length comparison.
- Fix loop(s) which iterated over all fields of array `pcc` even if it
  was not filled up (try unicode 0x9f as statusline character).

Note about the tests:
- To input unicode with more than two hex digits you can use <C-v>U...:
  a + U+fe20: a︠
  a + U+fe20 + U+fe21: a︠︡
coditva pushed a commit to coditva/neovim that referenced this pull request Jul 28, 2018
FEATURES:
07499a8 neovim#8709 man.vim: C highlighting for EXAMPLES section
07f82ad neovim#8699 TUI: urxvt: also send xterm focus-reporting seqs
40911e4 neovim#8616 API: emit nvim_buf_lines_event from :terminal
c46997a neovim#8546 fillchars: Add "eob" flag

FIXES:
74d19f6 neovim#8576 startup: avoid blank stdin buffer if other files were opened
4874214 neovim#8737 Only waitpid() for processes that we care about
cd6e7e8 neovim#8743 Check all child processes for exit in SIGCHLD handler
c230ef2 neovim#8746 channel.c: Prevent channel_destroy_early() from freeing uninitialized rpc stuff
0ed8b12 neovim#8681 transstr_buf: fix length comparison
d241f27 neovim#8708 TUI: Fix standout mode
9afed40 neovim#8698 man.vim: fix for mandoc
e889640 neovim#8682 provider/node: npm --loglevel silent
1cbc830 neovim#8613 API: nvim_win_set_cursor: set curswant
bf6048e neovim#8628 checkhealth: Python: fix VIRTUAL_ENV check
3cc3506 neovim#8528 checkhealth: node.js: also search yarn

CHANGES:
b751449 neovim#8619 defaults: shortmess+=F
1248178 neovim#8578 highlight: high-priority CursorLine if fg is set.
01570f1 neovim#8726 terminal: handle &confirm and :confirm on unloading
56065bb neovim#8721 screen: truncate showmode messages
bf2460e neovim#7551 buffer: fix copying :setlocal options
c1c14fa neovim#8520 Ex mode: always "improved" (gQ)
050f397 neovim#7992 options: remove 'maxcombine` option (always 6)

INTERNAL:
463da84 neovim#7992 screen: use UTF-8 representation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants