Skip to content

Conversation

trishume
Copy link
Contributor

This patch switches from slow byte-serial CRC hashing to the much faster
wyhash, with a two-level hashing scheme that allows fast scrolling
detection and acceleration by finding a rectangle of scrolled content
and submitting that as a large screen-to-screen copy rather than updated
tiles.

There's three components that together make the experience with screen updates that contain a ton of pixels, common with modern apps on a 4k screen, much better:

  • Switch from CRC32 to wyhash. CRC is a fast hash function in hardware but the byte-serial table is slow in software since every byte requires a serial wait for the L1 cache. wyhash takes much better advantage of larger operations and out of order execution. This accelerates a full 4k update on my computer from 123ms in rdpCapture to 38ms
  • Lazily convert to YUV only the tiles that actually changed. Some modern apps like Sublime Text don't take care to submit minimal damage rectangles, and thus huge areas are often unchanged, this avoids the color conversion on those areas which is now the slowest part of capture. This cuts capture down to around 11ms with a small amount of change.
  • Detect scrolled regions and make them a screen-to-screen copy: Scrolling is probably the most common cause of large updates during normal desktop use. Other remote desktop solutions like Windows RDP and Citrix detect them and optimize them to screen to screen copies. This PR adds that functionality to xrdp by using a two-level hashing scheme that makes detecting a scrolled rectangle on a 4k screen take under 2ms.

Known issues

  • The scrolling detection makes scrolling much nicer but adds tearing around the edges as the screen copy happens before the new tile updates. I prefer the faster scrolling to the minor tearing but opinions may differ.
  • Scroll copies can propogate minor tile compression boundary artifacts into more noticeable slight color fringes along solid backgrounds. They're not very noticeable but it makes some compression artifacts more noticeable than before.

Both of these would require the new Progressive RemoteFX codec to fix, for its ability to mark the beginning and end of a frame involving both tile and screen copy updates, and its tweaked RFX compression to reduce boundary artifacts. This seems like a lot of work and I don't plan on doing it. Might be worth adding a build flag to disable scroll detection if during testing this seems like it might annoy some people.

TODO list (will add items and tick off as suggestions are made):

There's still some things to do, I'm submitting this now to solicit comments on any other changes that would be preferred before merging this patch.

  • I test using it day to day for a while to see if I notice any major issues, I've only briefly done focused testing so far
  • rename "crc" variables to "hash"
  • squash this down to one commit it's just a bajillion commits right now for my personal git convenience.

@metalefty
Copy link
Member

Thank you so much for taking care of xorgxrdp performance. It's a big change so it will take some time to review, test, and merge. We might request some changes but we're open. We accept any kind of performance improvements with keeping compatibility, no license violations. I really appreciate your contribution!

@metalefty metalefty marked this pull request as draft June 24, 2020 04:41
@bolkedebruin
Copy link

@trishume awesome work! We tried deploying this for testing, but we havent been able to see a difference. This might be due to the fact that we are using guacamole on top of xrdp and are thus dependent on its screen updates as well. We’ll test it without and see if that changes anything.

@metalefty
Copy link
Member

@bolkedebruin Thank you so much too for active testing!

@trishume
Copy link
Contributor Author

@bolkedebruin Something I just realized that I may need to improve: I have some hand-tuned constants in the scroll detection code that I forgot to make adaptive to screen size, so it might be that scroll detection is only reliable on 4k displays or at least with large windows on more normal displays. Basically if you're not seeing tearing at the edges of your window when scrolling, then scroll detection isn't activating.

The other improvement also probably will still be better on a 1080p monitor but instead of a ~100ms worst case improvement it'll be more like 25ms or less which is the kind of difference it's hard to notice without a side-by-side comparison, and only helps some situations.

@bolkedebruin
Copy link

@trishume I think I got it working. The tearing is quite significant. Content becomes unreadable when scrolling. See screenshots from a Windows 2019 client.

Screenshot 2020-06-25 at 11 51 46

![Uploading Screenshot 2020-06-25 at 11.51.38.png…]()

@trishume
Copy link
Contributor Author

@bolkedebruin woah that's really weird. I have no idea where the blue comes from, it should just be one part of the screen moving a little before other parts. Definitely something going wrong there.

Also you mention Windows 2019 client but what's that macOS notification in the upper right? If you're double-hopping through some macOS remote desktop thing it's plausible that would exaggerate tearing, although it still doesn't explain the blue.

@trishume
Copy link
Contributor Author

@bolkedebruin Okay I think I can reproduce that now that I've finally tested it on a Windows client (took me a bit to figure an issue out first), don't know why it happens though, could actually be that screen to screen copies aren't supported in the windows client in RemoteFX mode which would be real sad because I'd need to either give up on scrolling acceleration or do a larger patch that includes changes to xrdp.

@bolkedebruin
Copy link

bolkedebruin commented Jun 25, 2020

(I was running w2019 in a VM)

Actually, the client needs to support Remote FX which the official Mac client doesn't do. I can only get fast scrolling / low Latency when remote FX works. Freerdp also supports it when /rfx is used. It makes a huge difference if it is turned on.

I think screen copies are supported but it copies something wrongly. The blue / green color is from the background that XRDP sends at login. I think that login screen is not rfx accelerated. So could it be a buffer issue maybe?

@jsorg71
Copy link
Contributor

jsorg71 commented Jun 25, 2020

Wow, looks like some interesting bits in here. Thanks for this. I also been looking at Adler32 instead of Crc32 mainly because I want to run the calculation in the fragment shader.
I like the idea of doing the crc(hash) first and then you know what further work needs to be done.
I'm interested in the scroll detection too. We have CopyArea and WindowCopy but they don't help for things like a web browser scrolling.
Unfortunately, I don't think we can mix orders and codecs(surface commands). You can mix orders with bitmap updates but when in codec mode like RemoteFX, everything has to be RemoteFX.
Same with GFX, you can't mix. When you start GFX, everything has to be GFX. GFX has scroll functions so that's good. I did start a GFX branch in xrdp so lets work together on that.

@bolkedebruin
Copy link

@jsorg71 can you share where that GFX branch is?

@trishume won’t it be possible to use rfx tiles (not sure but maybe here https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-rdprfx/89e669ed-b6dd-4591-a267-73a72bc6d84e) search for “tile”. Isn’t RFX progressive “just” a encoding format?

@trishume
Copy link
Contributor Author

@bolkedebruin I found it here neutrinolabs/xrdp@devel...jsorg71:egfx

@jsorg71 Cool, I'm glad there already looks to be good progress towards implementing EGFX and I hope the scroll detection possibilities provide additional reason behind completing it. I'm still unsure if I'll have the motivation to help out with that but we'll see, maybe I will. This has mostly been a fun side project for me, and it turns out that when testing with a Windows client and my good internet, scrolling already feels pretty good to me without any acceleration so I don't have as strong a demand for that anymore.

@bolkedebruin
Copy link

@jsorg71 @trishume @metalefty I'm willing to sponsor some of this. In short: GFX (RemoteFX is deprecated I guess and is not properly supported by the latest Mac client) and online resizing. You can reach me at bolke at apache dot org to work something out.

@trishume
Copy link
Contributor Author

trishume commented Jul 8, 2020

So an update on this is that I've been using it regularly with a 0 && added to the scroll detection if statement, so just the wyhash and lazy color-convert, and it makes a big difference. Chrome and scrolling are noticeably more responsive, and Sublime Text (which does large update regions where little changes) has much lower latency.

It turns out that in the Windows RDP client at least with my 200Mbps internet scrolling is now good enough for my tastes even on a 4k display, so I don't really feel the need for the scroll detection that much.

I'm bad at reliably getting around to open source stuff, so I'm not sure if I'll test and submit a trimmed down version of this patch without the scrolling stuff in any reasonable time. However if anyone else wants to take over getting it merged then all you need to do is take the code from before the scroll detection (trishume@b9e8dbf) instead of this latest commit, fix up any formatting and C89 compliance, and combine it with the port of wyhash to C89 from this PR.

scroll_rect->y2 = 0;

/* only try to detect a scroll rectangle if area is big enough */
if(extents_rect.y2 - extents_rect.y1 >= SCROLL_DET_MIN_H &&
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively if anyone just wants to quickly reap the benefits of this patch without caring about dead code, you can do what I did and edit the patch to put 0 && in this if statement.

@trishume
Copy link
Contributor Author

trishume commented Jul 8, 2020

@bolkedebruin I'm not personally available for sponsoring, but someone else might be.

Also if you're using a mac client you should know that FreeRDP-based mac clients like Royal TSX currently suffer from FreeRDP/FreeRDP#6066 which makes RFX super slow on macOS under FreeRDP for a dumb reason. Eventually they'll update with the fix. I don't even know if FreeRDP supports Progressive RFX/EGFX.

Also worth noting for anyone who wants to test this patch with the Windows RDP client, that it only affects the RemoteFX path and thus you need to connect with color depth set to "Highest Quality (32 bit)" since the RemoteFX path won't activate if you set that to say 16 bit.

@bolkedebruin
Copy link

@trishume thanks for the clarification. In case you change your mind about sponsoring/contracting let me know.

@matt335672 What do you think? Not necessarily the sponsoring (but can if you are interested).

@matt335672
Copy link
Member

This is an interesting thread, but I don't know enough about these bits of the protocol yet to contribute to it I'm afraid. Thanks for the thought though.

@bolkedebruin
Copy link

ping @jsorg71 @metalefty wdyt? do I need to reach out to your company?

@trishume
Copy link
Contributor Author

@bolkedebruin just checking: what improvement are you hoping for with EGFX and have you tried out this patch (with scroll detection disabled)? I and many coworkers have seen dramatic improvements in feel of interacting with and scrolling Chrome and a variety of other apps with this patch, especially on 4k monitors.

Collective apologies to everyone else who still suffers from much slower xrdp on high res displays because I haven't bothered to do the minor cleanup work for upstreaming this patch. If anyone else wants night and day improvements I'd be happy for anyone else to do the tiny bit of cleanup and testing left (see my earlier comment).

@bolkedebruin
Copy link

@trishume we are running with your patch which is awesome. However, the latest Mac client does not support RemoteFX (at least the version that is in xrdp) and as we are about 50/50 the benefit is limited for our user base.

What we (=company) would hope for EGFX integration is future compatibility. It is clear that MS will not improve upon RemoteFX anymore and support in its clients will be limited or even maybe deprecated. In addition as more and more people will get HiDPI monitors improvements in support might not be possible with RemoteFX anymore. As we rely on xrdp to provide our services to our internal userbase of over 500 people now I wanted to do the "right thing" by funding/sponsoring a development which benefits the community as whole. The alternative is to hire a person that then can hopefully upstream the patches.

Of course, we can investigate other alternatives as well, but my preference is to work with the community.

@metalefty
Copy link
Member

@bolkedebruin
Hi, sorry for there's no update for a long time. I know EGFX is really wanted and I'm also the one but there's no ETA of EGFX. You look like trying this PR already. Does this PR improve performance significantly even without EGFX?

FYI: Each of the active xrdp developers is employed by different companies in different countries. I'm just contributing to xrdp as a community developer.

@trishume
Copy link
Contributor Author

trishume commented Dec 18, 2020

This PR does make a very noticeable difference to performance, especially in Chrome on 4k monitors, and in apps like Sublime Text which submit the entire pane as a damage rectangle even when only adding one character. Some operations go from 350ms of input to display latency to 100ms.

Tons of people at my company have been using this patch for months (with the broken scroll detection disabled) and really liking the difference it makes.

Although in order to use this patch on a 4k monitor without occasional crashes we also run a patch I emailed in instead of PR-ing a while ago because the bug it fixes could potentially have security implications.

@Nexarian
Copy link
Contributor

Nexarian commented Dec 18, 2020 via email

@metalefty metalefty added this to the v0.2.16 milestone Dec 21, 2020
@metalefty
Copy link
Member

@trishume I've released v0.2.15. I would like to merge this soon then we can have 4 months to test before the next v0.2.16 release. Squash the commits if you want.

@trishume
Copy link
Contributor Author

trishume commented Dec 30, 2020

Sorry for not replying for a while, at this point it seems like I may just end up procrastinating on this forever. People at my company use this patch, but I did it as a hobby project on my own time. It requires a bunch of setup for me with my Linux VM to get set up to build and test changes again, which means I keep doing other things with my free time instead of cleaning this up. If you're also not interested in cleaning up the patch that's totally fine, if you or anyone else is I'd appreciate it. Maybe someone who wants the improved performance for their company without using a patched version can do the cleanup.

The main thing needed to land it right now is to remove the scroll detection code, I was lazy and just disabled the if statement to trigger it but you may not want a bunch of dead code merged in. Although maybe you do want to keep the scroll detection code in there and disabled since once EGFX is available it should be a noticeable improvement, and I worked hard on making the algorithm efficient. There may also be some places it doesn't conform to the style of the rest of the code.

In order to not crash on 4k displays occasionally it also needs the patch to the main xrdp repo which I sent into the xrdp-core mailing list months ago. The crash is unrelated to my patch, it just happens sometimes when you try to use RFX encoding instead of legacy encoding on a 4k display, and this patch only helps the RFX encoding. If you read the email and aren't concerned by the security implications I mention and don't want to clean it up yourself I can post the patch as a PR in case someone else wants to clean it up or use it to fix their crashes.

@jsorg71
Copy link
Contributor

jsorg71 commented Dec 30, 2020

@trishume I would like to integrate this with the EGFX branch. As you stated you can't mix screen to screen copy with codecs in RemoteFX 7.1 mode surface commands but you can with EGFX. I've made some great progress with EGFX. The xserver does have a call for screen to screen copies but some apps like Chrome draw the whole app as an image so some detection is nice.
I think the other big improvement is doing to hash before the color conversion to see if the tile changed. That I can pull into EGFX too. Oh and switching from winzip crc32 to some hash algorithm.
If you are using EGFX capable clients at your company, maybe you can try when I'm done to make sure you get the expected performance.

@Nexarian
Copy link
Contributor

Nexarian commented Mar 6, 2021

Now that my work on neutrinolabs/xrdp#448 is nearing an end (I suspect I'll need to do more work to clean it up, but it is functional and more a formality to complete with the innovative bits ready to rock-and-roll), I'd like to think about how to get the spirit of this patch merged into xorgxrdp.

As scroll detection is broken, I won't try to solve that as that relies on the GFX changes. I'll first target moving your wyhash and color conversion fixes and get a cleaned up PR for that.

@trishume You mention a 4k patch that you sent to xrdp-core -- Do you still have that? I'd like to clean that up as well.

metalefty added a commit to metalefty/xorgxrdp that referenced this pull request Apr 2, 2024
metalefty added a commit to metalefty/xorgxrdp that referenced this pull request Apr 10, 2024
Originally suggested by @trishume at neutrinolabs#167.

(cherry picked from commit 813e613)
@metalefty metalefty closed this Apr 10, 2024
nedix pushed a commit to nedix/xorgxrdp-fork that referenced this pull request Oct 4, 2024
Originally suggested by @trishume at neutrinolabs#167.

(cherry picked from commit 813e613)
nedix pushed a commit to nedix/xorgxrdp-fork that referenced this pull request Oct 4, 2024
Originally suggested by @trishume at neutrinolabs#167.

(cherry picked from commit 813e613)
nedix pushed a commit to nedix/xorgxrdp-fork that referenced this pull request Oct 4, 2024
Originally suggested by @trishume at neutrinolabs#167.

(cherry picked from commit 813e613)
nedix pushed a commit to nedix/xorgxrdp-fork that referenced this pull request Oct 4, 2024
Originally suggested by @trishume at neutrinolabs#167.

(cherry picked from commit 813e613)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants