ByteString.ToString is sometimes broken for Unicode encoding #3147

Hi.

Currently, `ByteString.ToString(Encoding)` calls `Encoding.GetString` [for each underlying segment](https://github.com/akkadotnet/akka.net/blob/c12d08582bbeda6bb4d024df4f3eb992f4a00e0f/src/core/Akka/Util/ByteString.cs#L581-L590) that comprises the `ByteString`. Unfortunately, this breaks for a multi-byte encoding such as `System.Text.Encoding.Unicode` (UTF-16) when a segment boundary falls in the middle of a character, because each segment is decoded in isolation and the split character cannot be reassembled.

Here's a quick repro that fails on Akka.NET 1.3.1 (tested on Windows, but it's probably the same everywhere):

```csharp
[Fact]
public void UnicodeByteString_Failure()
{
    const string expected = "ABC";
    Encoding encoding = Encoding.Unicode;

    byte[] rawData = encoding.GetBytes(expected);

    ByteString data = ByteString.Empty;
    data += ByteString.CopyFrom(rawData, 0, 3); // 1 and a half characters
    data += ByteString.CopyFrom(rawData, 3, 3); // 1 and a half characters
    Assert.Equal(rawData.Length, data.Count);

    string actual = data.ToString(encoding);
    Assert.Equal(expected, actual);
}
```

Note that if you compact the `ByteString` before calling `ToString`, it works fine:

```csharp
[Fact]
public void UnicodeByteString_Success()
{
    const string expected = "ABC";
    Encoding encoding = Encoding.Unicode;

    byte[] rawData = encoding.GetBytes(expected);

    ByteString data = ByteString.Empty;
    data += ByteString.CopyFrom(rawData, 0, 3);
    data += ByteString.CopyFrom(rawData, 3, 3);
    Assert.Equal(rawData.Length, data.Count);

    data = data.Compact();

    string actual = data.ToString(encoding);
    Assert.Equal(expected, actual);
}
```

I'd suggest that perhaps `ToString` should either call `Compact` or manually assemble a byte array before calling `Encoding.GetString`.
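
To make the two options concrete, here is a minimal sketch that uses only the BCL. It is not Akka.NET's implementation: plain `ArraySegment<byte>` values stand in for the `ByteString`'s internal buffers, and the class and method names (`SegmentDecoding`, `DecodeByCopying`, `DecodeWithDecoder`) are hypothetical. The first helper assembles one contiguous array and decodes it once. The second is an alternative that avoids the copy by using a stateful `System.Text.Decoder`, which carries incomplete characters over from one segment to the next.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helpers for illustration only; not part of Akka.NET.
public static class SegmentDecoding
{
    // Option 1: copy every segment into one contiguous array, then decode once.
    public static string DecodeByCopying(
        IReadOnlyList<ArraySegment<byte>> segments, Encoding encoding)
    {
        var buffer = new byte[segments.Sum(s => s.Count)];
        var offset = 0;
        foreach (var s in segments)
        {
            Buffer.BlockCopy(s.Array, s.Offset, buffer, offset, s.Count);
            offset += s.Count;
        }
        return encoding.GetString(buffer);
    }

    // Option 2: decode segment by segment with a stateful Decoder, which
    // remembers trailing bytes of a character split across two segments.
    public static string DecodeWithDecoder(
        IReadOnlyList<ArraySegment<byte>> segments, Encoding encoding)
    {
        var decoder = encoding.GetDecoder();
        var result = new StringBuilder();
        for (var i = 0; i < segments.Count; i++)
        {
            var s = segments[i];
            // Flush only on the last segment so leftover state is resolved there.
            var flush = i == segments.Count - 1;
            var chars = new char[decoder.GetCharCount(s.Array, s.Offset, s.Count, flush)];
            var written = decoder.GetChars(s.Array, s.Offset, s.Count, chars, 0, flush);
            result.Append(chars, 0, written);
        }
        return result.ToString();
    }
}
```

With the same split as in the repro (two 3-byte segments over `Encoding.Unicode.GetBytes("ABC")`), both helpers should return `"ABC"`, whereas calling `encoding.GetString` on each segment separately and concatenating the results does not.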

I'm happy to open a PR for this, but wanted to discuss it with you first since it technically represents a change in behaviour.
