Line terminators in programming languages #96142

@alexdima

Extracted from microsoft/TypeScript#38078

VS Code currently recognizes CRLF, CR and LF as end-of-line sequences.

But programming languages have different definitions for what constitutes a line terminator. Here is a summary from some of the specifications I found:

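| char/seq | Unicode | JS | C# | HTML | Python | PHP | Java | YAML |
|---|---|---|---|---|---|---|---|---|
| CRLF (U+000D U+000A) | | | | | | | | |
| CR (U+000D) | | | | | | | | |
| LF (U+000A) | | | | | | | | |
| LS (U+2028) | | | | | | | | |
| PS (U+2029) | | | | | | | | |
| NEL (U+0085) | | | | | | | | |
| FF (U+000C) | | | | | | | | |
| VT (U+000B) | | | | | | | | |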

Specifications:

Some language servers may follow their language's specification when deciding where a line ends, so the (line, char) coordinates used for document change events, reference positions, diagnostics, etc. may be off for files that contain LS, PS, or NEL.
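To make the mismatch concrete, here is a minimal sketch (not VS Code's or any language server's actual code) that converts the same offset to a (line, char) pair under two terminator definitions. The helper `positionAt`, the `Position` interface, and both regular expressions are hypothetical names introduced for illustration. The editor set is the CRLF/CR/LF trio mentioned above, and the second set adds LS and PS, which ECMAScript treats as line terminators.

```typescript
// Hypothetical illustration: names and structure are assumptions, not VS Code APIs.
interface Position {
  line: number;
  character: number;
}

// What the editor recognizes according to this issue: CRLF, CR, LF.
const EDITOR_EOL = /\r\n|\r|\n/;
// ECMAScript additionally treats LS (U+2028) and PS (U+2029) as line terminators.
const JS_EOL = /\r\n|\r|\n|\u2028|\u2029/;

// Convert a zero-based offset into a zero-based (line, character) position,
// counting only the terminators matched by `eol`.
function positionAt(text: string, offset: number, eol: RegExp): Position {
  const re = new RegExp(eol.source, "g");
  let line = 0;
  let lineStart = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null && m.index + m[0].length <= offset) {
    line++;
    lineStart = m.index + m[0].length;
  }
  return { line, character: offset - lineStart };
}

// A file where an LS slipped in between two statements.
const source = "let a = 1;\u2028let b = 2;\nlet c = a + b;";
const offsetOfB = source.indexOf("let b");

const editorPos = positionAt(source, offsetOfB, EDITOR_EOL);
const serverPos = positionAt(source, offsetOfB, JS_EOL);

console.log(editorPos); // { line: 0, character: 11 }
console.log(serverPos); // { line: 1, character: 0 }
```

The two sides disagree on both the line and the character for the same offset, so a diagnostic reported by the server at (1, 0) would land on a different statement in the editor.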

Because LS, PS, and NEL are very rarely used in practice and most likely end up in source code unintentionally (for example, through copy-pasting), I suggest we prompt users when such a file is opened and ask them to "fix" its line terminators. Fixing would mean replacing each LS, PS, or NEL with the currently configured EOL sequence.
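The proposed "fix" could look roughly like the sketch below. It is only an illustration of the idea, not the implementation. The function names `hasUnusualLineTerminators` and `fixUnusualLineTerminators` are made up for this example, and the configured EOL is assumed to be passed in as either `"\n"` or `"\r\n"`.

```typescript
// Hypothetical sketch of the proposed fix; names are assumptions for illustration.
type Eol = "\n" | "\r\n";

// LS (U+2028), PS (U+2029), and NEL (U+0085).
const UNUSUAL_TERMINATORS = /[\u2028\u2029\u0085]/g;

// Decide whether the user should be prompted when the file is opened.
function hasUnusualLineTerminators(text: string): boolean {
  // A fresh non-global regex avoids the lastIndex state of the global one.
  return /[\u2028\u2029\u0085]/.test(text);
}

// Replace every LS, PS, or NEL with the configured EOL sequence.
function fixUnusualLineTerminators(text: string, eol: Eol): string {
  return text.replace(UNUSUAL_TERMINATORS, eol);
}

const sample = "a\u2028b\u0085c\u2029d\ne";
const needsPrompt = hasUnusualLineTerminators(sample);
const fixedSample = fixUnusualLineTerminators(sample, "\r\n");

console.log(needsPrompt); // true
console.log(JSON.stringify(fixedSample)); // "a\r\nb\r\nc\r\nd\ne"
```

Note that this sketch leaves existing CR, LF, and CRLF sequences untouched; whether the fix should also normalize those is a separate decision.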
