String processing: wrap concatenated strings in parens in all cases #3292

@yilei

Extending the discussions in #3260 and #2553, I'd like to once more bring up the topic of how Black should format concatenated strings.

After a lot of reading and experimenting, I'm now in favor of always wrapping concatenated strings in parens, i.e. extending the cases from #3159 to all. Examples:

# Unformatted:
function_call(
    " lorem ipsum dolor sit amet consectetur adipiscing elit sed do eiusmod tempor incididunt ut labore et dolore magna aliqua Ut enim ad minim",
    " veniam quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo",
)
some_list = [
    " lorem ipsum dolor sit amet consectetur adipiscing elit sed do eiusmod tempor incididunt ut labore et dolore magna aliqua Ut enim ad minim",
    " veniam quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo",
]

# v22.8.0 --preview:
function_call(
    " lorem ipsum dolor sit amet consectetur adipiscing elit sed do eiusmod tempor"
    " incididunt ut labore et dolore magna aliqua Ut enim ad minim",
    " veniam quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo",
)
some_list = [
    (
        " lorem ipsum dolor sit amet consectetur adipiscing elit sed do eiusmod tempor"
        " incididunt ut labore et dolore magna aliqua Ut enim ad minim"
    ),
    " veniam quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo",
]

# Proposed:
function_call(
    (
        " lorem ipsum dolor sit amet consectetur adipiscing elit sed do eiusmod tempor"
        " incididunt ut labore et dolore magna aliqua Ut enim ad minim"
    ),
    " veniam quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo",
)
some_list = [
    (
        " lorem ipsum dolor sit amet consectetur adipiscing elit sed do eiusmod tempor"
        " incididunt ut labore et dolore magna aliqua Ut enim ad minim"
    ),
    " veniam quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo",
]

This in turn makes #3260 and #2553 redundant/obsolete.
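
To make the proposal concrete, here is a rough sketch of the rule for a single string element of a call or list. It is a toy model, not Black's implementation: the function name `wrap_long_string`, its parameters, the greedy split at spaces, and the assumption that the text contains no quotes or backslashes are all mine.

```python
import ast
import re


def wrap_long_string(text: str, indent: int = 4, line_length: int = 88) -> str:
    """Toy model of "wrap in parens, then split" (hypothetical helper).

    Assumes `text` contains no quotes or backslashes. A single word longer
    than the budget is left on an overlong line.
    """
    pad = " " * indent
    one_line = f'{pad}"{text}",'
    if len(one_line) <= line_length:
        return one_line  # fits: no parens, no split

    inner_pad = " " * (indent + 4)
    # Columns left for the text between the two quotes on an inner line.
    budget = line_length - len(inner_pad) - 2

    # Split before each run of spaces, so every piece after the first
    # starts with a space, as in the examples above.
    tokens = re.split(r"(?<=\S)(?= )", text)

    pieces, current = [], ""
    for token in tokens:
        if current and len(current) + len(token) > budget:
            pieces.append(current)
            current = token
        else:
            current += token
    pieces.append(current)

    body = "\n".join(f'{inner_pad}"{piece}"' for piece in pieces)
    return f"{pad}(\n{body}\n{pad}),"


long_text = (
    " lorem ipsum dolor sit amet consectetur adipiscing elit sed do eiusmod tempor"
    " incididunt ut labore et dolore magna aliqua Ut enim ad minim"
)
formatted = wrap_long_string(long_text)
print(formatted)

# The wrapped form still evaluates to the original string.
assert ast.literal_eval(formatted.strip().rstrip(",")) == long_text
```

The point of the sketch is that the rule does not need to know whether the string sits in a call, a list, a dict value, or a return statement.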

Pros

  1. Code is more readable since the extra indentation makes the scope clearer.

  2. Avoids common coding errors where commas are accidentally left out or missed in function calls and, especially, in list/tuple/set literals.

  3. It's consistent no matter where the string appears. --preview (even without #3162, "Add parens around implicit string concatenations where increases readability") already does this in a few places:

    def function():
        value = "a very long string a very long string a very long string a very long string a very long string a very long string"
        dct = {"key": "a very long string a very long string a very long string a very long string a very long string a very long string"}
        return (
            "a very long string a very long string a very long string a very long string a"
            " very long string a very long string"
        )
    
    # output:
    def function():
        value = (
            "a very long string a very long string a very long string a very long string a"
            " very long string a very long string"
        )
        dct = {
            "key": (
                "a very long string a very long string a very long string a very long"
                " string a very long string a very long string"
            )
        }
        return (
            "a very long string a very long string a very long string a very long string a"
            " very long string a very long string"
        )
  4. It's much easier to explain Black's behavior: if a string (after merging implicitly or explicitly concatenated parts) exceeds the line length, it's wrapped in parens and then split. (FWIW, this is the point that finally convinced me we should adopt this approach.)
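
As a small illustration of point 2: Python joins adjacent string literals at compile time, so a forgotten comma silently merges two list items. The names below are made up for the example.

```python
# Hypothetical data: the comma after "beta" was forgotten.
names = [
    "alpha",
    "beta"
    "gamma",
]
assert names == ["alpha", "betagamma"]  # two items, not three

# When the concatenation is intentional, parens make that visible.
names_explicit = [
    "alpha",
    (
        "beta"
        "gamma"
    ),
]
assert names_explicit == names
```

If intentional concatenations are always parenthesised, an unparenthesised pair of adjacent literals stands out in review as a likely missing comma.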

Cons

  1. It can introduce a decent amount of diffs.

How about explicit concatenation without parens?

It introduces somewhat fewer diffs, but the formatting is less readable in some cases:

some_list = [
    " lorem ipsum dolor sit amet consectetur adipiscing elit sed do eiusmod tempor"
    + " incididunt ut labore et dolore magna aliqua Ut enim ad minim",
    " veniam quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo",
]

## Modified example from https://github.com/psf/black/blob/main/tests/data/preview/long_strings__regression.py
class A:
    def append(self):
        if True:
            xxxx.xxxxxxx.xxxxx(
                "xxxxxxxxxx xxxx xx xxxxxx xxxxxxxxxx xxxx xx xxxxxx xxxxxxxxxx xxxx xx"
                + " xxxxxx (%x) xx %x xxxx xx xxx %x.xx"
                % (len(self) + 1, xxxx.xxxxxxxxxx, xxxx.xxxxxxxxxx)
                + " %.3f (%s) to %.3f (%s).\n"
                % (
                    xxxx.xxxxxxxxx,
                    xxxx.xxxxxxxxxxxxxx(xxxx.xxxxxxxxx),
                    x,
                    xxxx.xxxxxxxxxxxxxx(xx),
                )
            )
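
One thing that arguably makes the explicit form harder to read is operator precedence: `%` binds tighter than `+`, so each `%` applies only to the literal right before it, whereas with implicit concatenation the literals are merged first. A minimal sketch with made-up values:

```python
# Implicit concatenation: the literals are merged first, then % is applied.
merged = "%s and " "%s" % (1, 2)
assert merged == "1 and 2"

# Explicit concatenation: % binds tighter than +, so it only sees "%s"
# and fails because the tuple has two items for one placeholder.
try:
    broken = "%s and " + "%s" % (1, 2)
except TypeError as exc:
    error = str(exc)
else:
    error = None
assert error is not None
```

So the reader of the explicit form has to work out which literal each `%` belongs to.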

How about explicit concatenation with parens?

Since it's already wrapped in parens, the explicit concatenation is redundant and doesn't
increase readability (and also occupies two extra columns).
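
A quick way to see the redundancy, using only the standard `ast` module: inside parens, the implicit form is already parsed as a single string constant, while the explicit form adds a `+` node (and a leading `+ ` on each continuation line) for the same value. The literals are placeholders.

```python
import ast

implicit_src = '("lorem " "ipsum")'
explicit_src = '("lorem " + "ipsum")'

implicit = ast.parse(implicit_src, mode="eval").body
explicit = ast.parse(explicit_src, mode="eval").body

# Implicit concatenation is merged by the parser into one constant.
assert isinstance(implicit, ast.Constant)
assert implicit.value == "lorem ipsum"

# Explicit concatenation stays a binary addition in the syntax tree.
assert isinstance(explicit, ast.BinOp)
assert isinstance(explicit.op, ast.Add)

# Both evaluate to the same string.
assert eval(explicit_src) == implicit.value
```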

How about keeping organising parens added by users?

Since Black removes parens if the string fits on a single line, keeping organising parens
has a similar issue to the magic trailing comma: #2237 (comment).

cc @felix-hilden @JelleZijlstra @ichard26

Labels: T: style (What do we want Blackened code to look like?)