-
-
Notifications
You must be signed in to change notification settings - Fork 6.3k
[RFC] Add proper expression parser #7234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In the current state it is not going to pass any tests due to not handled values in |
c5b5cc9
to
6a98fab
Compare
Some minimal language is finished: supporting unary/binary plus and calling/nesting parenthesis alongside with a single actual value (register) makes me think I can proceed with extending current parser code to accept the rest of the language and not inventing something else. On the way is, in any order
Rest of parsing should be handled only after implementing the above. (BTW, I think it may be a good and relatively easy to test lexer with KLEE. The only thing I completely do not like is that they do not support llvm-4.0 yet and llvm-3.x is not slotted and thus may not be installed alongside with 4.0, so KLEE could only either run in docker on my machine or I need to compile LLVM myself.) |
Managed to get KLEE run some tests, though it did not find anything useful yet. At least, I could be made sure that lexer always skips something (i.e. will not make parser fall into the infinite loop) and does not crash. Should be more useful for the parser (when it is finished), also turning code which compiles for KLEE into an executable suitable for fuzzing is much easier. |
@@ -81,6 +81,8 @@ foreach(subdir | |||
event | |||
eval | |||
lua | |||
viml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the logic behind eval
/ viml
split? Is viml
for "abstract" parsing/manipulation etc while eval
is for actual evaluation/execution? Or is it just "old" vs "new" code? Is it important enough to warrant the split?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eval
was intended for expression evaluation only. viml
is for the whole VimL, including Ex commands. The “expression evaluation” parts like parsing and execution are going to move to viml/ for sure, the core problem is that after the whole VimL is ported to a new parser, the difference between expression evaluation and Ex commands evaluation should be gone (both will use a new VM, whatever it will actually be).
What is left of eval is
typval_T
and related values manipulations. It is more logical to move that near the VM, to viml/ as well (just eval/typval* to viml/typval* when viml/ will actually contain some executor, not much hassle).- A big bunch of
f_…
functions. My idea is that most logical choice for that is treating them just likeex_…
functions used for Ex commands: scattered around the codebase, near the domain these functions belong to. But this is too heavy refactoring so probably somebody will move them to a separate file (as a part of porting 7.4.{2055,2057,2058} #5081) where that will live from now on. File may be moved later. - Utilities for things mentioned above: currently existing executing parser, functions, etc.
So it is just “old” vs “new” code, with “new” not using “eval” because it is not about only the expressions. I do not want to have expressions parsing in eval/
because it is going to be fairly isolated from what already is there, but sharing some things with Ex commands parser (for which there is no preexisting directory in any case).
987bfa6
to
a92378e
Compare
The KLEE finally finished, no errors in lexer found. For some reason I did not have time automatically reported (zsh has a feature which makes it automatically act as if command was run like |
Total generated 53 873 “test cases”, 221 MiB according to |
@ZyX-I so you plan to expose the AST somehow, e.g. as a dictionary , to plugins? Or is it possible? |
I plan to expose it, but only AST (not lexer). In the future - via two functions, one for parsing expressions exclusively, like And only via API, relying on the VimL function generator for VimL. Plus parser may be exposed via ASYNC function. |
src/nvim/viml/parser/expressions.c
Outdated
@@ -452,6 +464,8 @@ LexExprToken viml_pexpr_next_token(ParserState *const pstate, const bool peek) | |||
// | |||
// Used highlighting groups and assumed linkage: | |||
// | |||
// NVimInternalError -> highlight as fg:red/bg:red |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct, fg and bg having the same color?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an internal error it should be something which will make user report it. Given that I do not want to spawn disturbing errors “invisible”, outstanding character is a good alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is also the only group which I did not bother to (plan to) link to some preexisting group.
32a84f9
to
9ccafa8
Compare
Got working figure braces (dictionary, curly-braces-names in value position, lambdas, but not complex curly-braces-names setups like And no completion code yet, though I have an idea how to code it which needs only minor modifications to what is already there. |
8e677d4
to
549be54
Compare
Error while building with
Message contains
|
First intended to provide %r functionality like in Python (and also support for %*.*s, but this was not checked), second adds nice table formatting for use in cases similar to screen:snapshot_util().
8c06e1d
to
c0f3304
Compare
Got working lambdas, dictionaries, curly braces names, function calls and nesting parenthesis, ternary operator, comparisons, unary/binary plus, operator associativity in general. What is left of specifically the parser
|
Well, it appears that &isident … is responsible for determining identifier length inside |
Note: there are three changes to ascii_isident. Reverting first two (in find_special_key and first in get_special_key_code) normally fails the new test with empty &isident, but reverting the third does not. Hence adding `>` to &isident. Ref vim/vim#2389.
277ecef
to
0b4054e
Compare
Got that after building libnvim-test on travis:
What is that? In any case, I am restarting travis under assumption that problem is temporary. Other CIs succeeded. |
With verboser logging:
|
Hoping that could fix the LSAN issue: no idea what it is talking about.
This reverts commit 6299332.
The problem somehow solved itself. |
New build failure looks unrelated:
|
CI succeeded. |
FYI the #7234 (comment) failure is my fault, I will make another change to the TUI startup and see if it continues; if it continues I will change the test to avoid spurious failure. |
@@ -10573,5 +10573,124 @@ This is not allowed when the textlock is active: | |||
- closing a window or quitting Vim | |||
- etc. | |||
|
|||
============================================================================== | |||
13. Command-line expressions coloring *expr-coloring* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ZyX-I Is there a reason to call it "coloring" instead of "highlighting"?
Edit: I changed this in the merge commit.
@ZyX-I what is your plan for command parsing or general VimL parsing? Will this be a new API function that accepts entire sources? Is there a need for |
ptr++; \ | ||
} \ | ||
} while (0) | ||
switch (pre) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The switch
could be replaced with
vim_str2nr_dec:
PARSE_NUMBER(...);
goto xxx;
vim_str2nr_bin:
PARSE_NUMBER(...);
goto xxx;
vim_str2nr_oct:
PARSE_NUMBER(...);
goto xxx;
vim_str2nr_hex:
PARSE_NUMBER(...);
xxx:
Edit: reordered labels in case what
is invalid.
@justinmk There are use-cases for separate expression parser in the API other then making it easier to test some things (I mean, test as a plugin developer, Neovim developers are always going to have unit tests available), you may see a list in the documentation of the new function. For VimL parser I plan on exactly the API function accepting entire source, though internally it should be possible to save state and use it later to proceed (needed for alternative syntax highlighting: if we have a parser and highlighter in a C code, why limit highligter to command-line only?). |
FEATURES: 3cc7ebf #7234 built-in VimL expression parser 6a7c904 #4419 implement <Cmd> key to invoke command in any mode b836328 #7679 'startup: treat stdin as text instead of commands' 58b210e :digraphs : highlight with hl-SpecialKey #2690 7a13611 #8276 'startup: Let `-s -` read from stdin' 1e71978 events: VimSuspend, VimResume #8280 1e7d5e8 #6272 'stdpath()' f96d99a #8247 server: introduce --listen e8c39f7 #8226 insert-mode: interpret unmapped META as ESC 98e7112 msg: do not scroll entire screen (#8088) f72630b #8055 let negative 'writedelay' show all redraws 5d2dd2e win: has("wsl") on Windows Subsystem for Linux #7330 a4f6cec cmdline: CmdlineEnter and CmdlineLeave autocommands (#7422) 207b7ca #6844 channels: support buffered output and bytes sockets/stdio API: f85cbea #7917 API: buffer updates 418abfc #6743 API: list information about all channels/jobs. 36b2e3f #8375 API: nvim_get_commands 273d2cd #8329 API: Make nvim_set_option() update `:verbose set …` 8d40b36 #8371 API: more reliable/descriptive VimL errors ebb1acb #8353 API: nvim_call_dict_function 9f994bb #8004 API: nvim_list_uis 3405704 #7520 API/UI: forward option updates to UIs 911b1e4 #7821 API: improve nvim_command_output WINDOWS OS: 9cefd83 #8084, #8516 build/win: support MSVC ee4e1fd win: Fix reading content from stdin (#8267) TUI: ffb8904 #8309 TUI: add support for mouse release events in urxvt 8d5a46e #8081 TUI: implement "standout" attribute 6071637 TUI: support TERM=konsole-256color 67848c0 #7653 TUI: report TUI info with -V3 ('verbose' >= 3) 3d0ee17 TUI/rxvt: enable focus-reporting d109f56 #7640 TUI: 'term' option: reflect effective terminal behavior FIXES: ed6a113 #8273 'job-control: avoid kill-timer race' 4e02f1a #8107 'jobs: separate process-group' 451c48a terminal: flush vterm output buffer on pty output #8486 5d6732f :checkhealth fixes #8335 53f11dc #8218 'Fix errors reported by PVS' d05712f inccommand: pause :terminal redraws (#8307) 51af911 inccommand: do not execute trailing commands #8256 84359a4 terminal: resize to the max dimensions (#8249) d49c1dd #8228 Make vim_fgets() return the same values as in Vim 60e96a4 screen: winhl=Normal:Background should not override syntax (#8093) 0c59ac1 #5908 'shada: Also save numbered marks' ba87a2c cscope: ignore EINTR while reading the prompt (#8079) b1412dc #7971 ':terminal Enter/Leave should not increment jumplist' 3a5721e TUI: libtermkey: force CSI driver for mouse input #7948 6ff13d7 #7720 TUI: faster startup 1c6e956 #7862 TUI: fix resize-related segfaults a58c909 #7676 TUI: always hide cursor when flushing, never flush buffers during unibilium output 303e1df #7624 TUI: disable BCE almost always 249bdb0 #7761 mark: Make sure that jumplist item will not have zero lnum 6f41ce0 #7704 macOS: Set $LANG based on the system locale a043899 #7633 'Retry fgets on EINTR' CHANGES: ad60927 #8304 default to 'nofsync' f3f1970 #8035 defaults: 'fillchars' a6052c7 #7984 defaults: sidescroll=1 b69fa86 #7888 defaults: enable cscopeverbose 7c4bb23 defaults: do :filetype stuff unless explicitly "off" 2aa308c #5658 'Apply :lmap in macros' 8ce6393 terminal: Leave 'relativenumber' alone (#8360) e46534b #4486 refactor: Remove maxmem, maxmemtot options 131aad9 win: defaults: 'shellcmdflag', 'shellxquote' #7343 c57d315 #8031 jobwait(): return -2 on interrupt also with timeout 6452831 clipboard: macOS: fallback to tmux if pbcopy is broken #7940 300d365 #7919 Make 'langnoremap' apply directly after a map ada1956 #7880 'lua/executor: Remove lightuserdata' INTERNAL: de0a954 #7806 internal statistics for list impl dee78a4 #7708 rewrite internal list impl
Tasks for the parser:
<C-r>=
, etc.return {expr}
string.Minimal tasks before completing which PR must not be merged: highlighting and API.
Constraints:
<C-r>=(|)
with|
being a cursor position: this is not a valid expression, but it what parser will once receive for highlighting), AST for invalid input should be just sensible enough to allow sensible highlighting and completion.