Ensure ByteStrings are compact when calling ToString. #3148

tintoy · 2017-10-12T00:26:20Z

Aaronontheweb · 2017-10-12T03:55:56Z

@tintoy looks like some of the JSON framing specs are failing for Akka.Streams, but the Akka.Persistence issues might be related to this: #3149 - should find out once CI has a chance to run, but you should take a look at the failing specs for streams.

tintoy · 2017-10-12T04:21:55Z

Yep, if I revert my change, then that test passes (but of course the other tests that I added fail). As far as I can tell, it's the UnfoldResource test that seems to be wrong (although I'm not sure why, since I find the code a little hard to follow). It specifies a chunk size of 50, but then expects a string that's only 10 characters long, which doesn't sound right.

tintoy · 2017-10-12T05:13:41Z

Ok, found the bug - I didn't realise ByteBuffer is an alias for ArraySegment<byte>, so I was ignoring the other properties of the ArraySegment in the case that IsCompact == true. I'm just running the remaining streams tests then I'll push a fix.
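To illustrate the kind of mistake described above, here is a minimal sketch (not the actual Akka.NET code) of what can go wrong when an ArraySegment<byte> is decoded without honouring its Offset and Count. The class and method names (SegmentDecodeDemo, DecodeWholeArray, DecodeSegment) are hypothetical and exist only for this example; the exact shape of the original bug is an assumption.

```csharp
using System;
using System.Text;

public static class SegmentDecodeDemo
{
    // Assumed shape of the bug: decodes the entire backing array and
    // ignores the window (Offset/Count) that the segment describes.
    public static string DecodeWholeArray(ArraySegment<byte> segment, Encoding encoding)
        => encoding.GetString(segment.Array);

    // Decodes only the bytes the segment actually covers.
    public static string DecodeSegment(ArraySegment<byte> segment, Encoding encoding)
        => encoding.GetString(segment.Array, segment.Offset, segment.Count);

    public static void Main()
    {
        byte[] backing = Encoding.UTF8.GetBytes("0123456789");

        // A slice covering 4 bytes of the backing array, starting at index 2.
        var slice = new ArraySegment<byte>(backing, 2, 4);

        Console.WriteLine(DecodeWholeArray(slice, Encoding.UTF8)); // 0123456789
        Console.WriteLine(DecodeSegment(slice, Encoding.UTF8));    // 2345
    }
}
```

A sliced ByteString whose single buffer shares a larger backing array would hit the first behaviour, which would be consistent with a spec asking for a chunk size of 50 but expecting only 10 characters.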


Horusiath · 2017-10-18T05:30:16Z

src/core/Akka/Util/ByteString.cs

+            if (IsCompact)
+                return encoding.GetString(_buffers[0].Array, _buffers[0].Offset, _buffers[0].Count);
+
+            byte[] buffer = ToArray();


What advantage does that give over using StringBuilder? I'm just curious.

tintoy · 2017-10-18

Hi! The problem shows up when a ByteString is composed of several ByteStrings and some of them hold incomplete Unicode sequences. For example, "AB" in Unicode (UTF-16) is 4 bytes long; if you split the ByteString after 3 bytes, the first part represents one and a half characters. Decoding each part separately then produces an incorrect string, because some of the decoded parts are themselves incorrect.

Most of the existing code and tests used UTF-8, which encodes characters from the ASCII character set using only a single byte (so the issue may not have come up before). As soon as I switched to Unicode it gave wrong results unless I called .Compact() before calling .ToString().
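The effect can be reproduced without ByteString at all. The sketch below uses plain byte arrays as stand-ins for the individual buffers; SplitDecodeDemo, DecodePerChunk and DecodeCompacted are hypothetical names for this example, and DecodeCompacted only approximates what calling .Compact() before .ToString() achieves.

```csharp
using System;
using System.Linq;
using System.Text;

public static class SplitDecodeDemo
{
    // Decodes each chunk on its own and appends the results,
    // which is what a StringBuilder-based approach would do.
    public static string DecodePerChunk(byte[][] chunks, Encoding encoding)
    {
        var sb = new StringBuilder();
        foreach (var chunk in chunks)
            sb.Append(encoding.GetString(chunk));
        return sb.ToString();
    }

    // Copies all chunks into one contiguous buffer, then decodes once.
    public static string DecodeCompacted(byte[][] chunks, Encoding encoding)
    {
        byte[] all = chunks.SelectMany(c => c).ToArray();
        return encoding.GetString(all);
    }

    public static void Main()
    {
        // "AB" in UTF-16 LE is 4 bytes: 41 00 42 00.
        byte[] bytes = Encoding.Unicode.GetBytes("AB");

        // Split after 3 bytes, so the second character is cut in half.
        byte[][] chunks = { bytes.Take(3).ToArray(), bytes.Skip(3).ToArray() };

        Console.WriteLine(DecodePerChunk(chunks, Encoding.Unicode) == "AB");  // False
        Console.WriteLine(DecodeCompacted(chunks, Encoding.Unicode) == "AB"); // True
    }
}
```

With UTF-8 and ASCII-only text every character is a single byte, so no split can land inside a character and both methods agree, which would explain why the problem went unnoticed.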

Aaronontheweb · 2017-10-18T15:08:02Z

I like that there are multiple test cases for UTF8, Unicode, et al here. Looks like a good change to me. We'll get this into 1.3.2

Ensure ByteStrings are compact when calling ToString.

22e4a6c

akkadotnet#3147

Aaronontheweb added the needs review label Oct 12, 2017

tintoy added 2 commits October 12, 2017 16:20

Constrain buffer region used by Encoding.GetString for compacted ByteString.ToString (akkadotnet#3147).

c31b3cf

Add spec for ToString on a sliced ByteString (akkadotnet#3147).

4c7c588

Horusiath reviewed Oct 18, 2017


Merge branch 'dev' into tintoy/bytestring-fix

ed6c35d

Aaronontheweb merged commit 48204e2 into akkadotnet:dev Oct 18, 2017

Aaronontheweb removed the needs review label Oct 18, 2017
