Optimise ALE for larger files with wild numbers of errors #4208

@w0rp

Problem

Introducing a syntax error early in a large file causes many language servers and linters to report potentially hundreds of errors, which are expensive for any editor to process. The cost is visible when profiling ALE:

FUNCTIONS SORTED ON SELF TIME
count  total (s)   self (s)  function
133921   1.974324   1.859639  ale#util#LocItemCompare()
   39   0.637155   0.527805  ale#engine#FixLocList()
 8303              0.328480  ale#GetLocItemMessage()
51057   1.132433   0.324146  ale#util#LocItemCompareWithText()
16849              0.204441  ale#util#GetItemPriority()
 7508              0.203799  <SNR>144_matchaddpos()
   22              0.188455  ale#lsp#response#ReadDiagnostics()
   39   3.264910   0.184220  ale#engine#HandleLoclist()
   44   0.553220   0.178273  ale#highlight#UpdateHighlights()
   36   1.289228   0.156795  <SNR>143_Deduplicate()
   36   0.349241   0.148448  <SNR>142_BuildSignMap()
   36   0.448409   0.121095  <SNR>143_FixList()
   17   0.193857   0.120347  ale_linters#python#flake8#Handle()
   20   0.118775   0.117452  ale#job#Start()
   36   0.120452   0.114881  ale#sign#GetSignCommands()
 3925   0.200420   0.110664  ale#sign#GetSignName()
   36   0.289443   0.108381  <SNR>142_UpdateLineNumbers()
 3154              0.098257  ale#util#Col()
 7508   0.322200   0.085194  <SNR>144_highlight_range()
   36              0.073866  <SNR>142_GroupLoclistItems()

This points to the following code in engine.vim:

    " We don't need to add items or sort the list when this list is empty.
    if !empty(l:linter_loclist)
        " Add the new items.
        call extend(l:info.loclist, l:linter_loclist)

        " Sort the loclist again.
        " We need a sorted list so we can run a binary search against it
        " for efficient lookup of the messages in the cursor handler.
        call sort(l:info.loclist, 'ale#util#LocItemCompare')
    endif
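
To illustrate why the comparator dominates the profile, here is a minimal sketch that counts comparator calls when a growing list is re-sorted after every batch of results, as the code above does. The names `CountingCompare`, `g:compare_calls`, and the fake items are assumptions made for this example; it does not use ALE's real comparator, which also compares more fields than the line number.

```vim
" Illustrative only: count how often sort() invokes its comparator when
" the whole list is re-sorted after each batch of new items.
let g:compare_calls = 0

" Hypothetical stand-in for ale#util#LocItemCompare: compares lnum only.
function! CountingCompare(left, right) abort
    let g:compare_calls += 1

    return a:left.lnum - a:right.lnum
endfunction

let g:loclist = []

" Five batches of 200 fake problems; line numbers interleave across batches.
for g:batch in range(5)
    call extend(g:loclist, map(range(200), '{"lnum": v:val * 5 + g:batch}'))
    call sort(g:loclist, function('CountingCompare'))
endfor

echo printf('%d items, %d comparator calls', len(g:loclist), g:compare_calls)
```

Every comparator call is a Vim script function call, so the total grows quickly once several linters each contribute hundreds of items.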

Solution

There is no single solution. Instead, there are several complementary changes, and we should make all of them to greatly improve ALE's efficiency.

  1. Where possible, mark which linters will report sorted output anyway, so we can avoid calling sort for problems from those linters.
  2. Avoid sorting the same list multiple times if possible.
  3. Avoid sorting when only one linter is reporting issues at a time.
  4. Introduce a g: and b: scoped ALE setting, defaulting to a low number N, that ignores all but N problems from any source. (There is no point in reporting 100 errors at a time.)
  5. Introduce special-case code for particular tools like flake8 that reports only the first syntax error and throws all other errors away.
  6. Consider using an alternative sort algorithm, which could mean breaking the rule of "only Vim script" in this one circumstance, with a Vim script fallback. Perhaps Lua or Python sorting is a lot faster. (We would add an ale#util#Sort internal API function.)
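
Ideas 1, 3, and 4 above could be combined roughly as follows. This is a sketch under stated assumptions, not ALE's implementation: `MergeLoclist`, `CompareItems`, and the `max_problems` and `is_sorted` parameters are hypothetical names, and ALE currently has no such setting or linter flag.

```vim
" Hypothetical comparator: order items by line, then by column.
function! CompareItems(left, right) abort
    if a:left.lnum != a:right.lnum
        return a:left.lnum - a:right.lnum
    endif

    return a:left.col - a:right.col
endfunction

" Hypothetical helper, not part of ALE. Adds new_items to existing and
" returns existing, sorting only when it is actually necessary.
"   max_problems: assumed setting; 0 means no limit.
"   is_sorted:    assumed per-linter flag; 1 if the linter output is sorted.
function! MergeLoclist(existing, new_items, max_problems, is_sorted) abort
    " Idea 4: keep at most max_problems items from this source.
    if a:max_problems > 0
        let l:items = a:new_items[0 : a:max_problems - 1]
    else
        let l:items = copy(a:new_items)
    endif

    " Nothing to add, so nothing to sort.
    if empty(l:items)
        return a:existing
    endif

    let l:was_empty = empty(a:existing)
    call extend(a:existing, l:items)

    " Ideas 1 and 3: a single source with sorted output needs no sort.
    if !(l:was_empty && a:is_sorted)
        call sort(a:existing, function('CompareItems'))
    endif

    return a:existing
endfunction

" Usage: one pre-sorted source, capped at 50 problems, skips the sort.
let g:example = MergeLoclist([], [{'lnum': 3, 'col': 1}, {'lnum': 7, 'col': 2}], 50, 1)
```

The sketch still falls back to a full sort whenever the existing list is non-empty; merging two already sorted lists would be a further refinement.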
