Skip to content

Conversation

Kludex
Copy link
Collaborator

@Kludex Kludex commented Dec 14, 2024

No description provided.

@Kludex
Copy link
Collaborator Author

Kludex commented Dec 14, 2024

@aaugustin Do you have availability to review this in the next days?

@Kludex
Copy link
Collaborator Author

Kludex commented Dec 14, 2024

Seems like I need to drop Python 3.8 to make my life easier. 👀

@aaugustin
Copy link
Contributor

I'm making a note to look into it tomorrow. If I don't find time tomorrow, then I'll do over the Christmas break.

Comment on lines 144 to 145
if event.opcode == Opcode.CONT:
self.handle_cont(event)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommendation on how to test this? The other implementations don't have the continuation frame 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens when a message is fragmented. See https://datatracker.ietf.org/doc/html/rfc6455#section-5.4 for context. In short:

  • the first frame is TEXT or BINARY
  • every subsequent frame is CONT
  • the last frame has the FIN bit set

This is correctly implemented in handle_cont. As far as I can tell, ASGI doesn't support fragmented WebSocket messages so you have to buffer and reassemble.


For comparison, websockets provides both options — buffer and reassemble the message (if you don't care) or receive it frame by frame (if you need streaming). If you're curious, it's implemented here.

Copy link
Contributor

@aaugustin aaugustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very familiar with ASGI & uvicorn so I focused my review on the WebSocket protocol.

Your usage of websockets is straightforward.

I noticed that:

  • You aren't detecting the case when data_to_send() returns the empty bytestring to tell you to half-close the connection; that could be OK if you're confident that you're closing the transport in every scenario where the connection is terminating.
  • You aren't using the close_expected API to check whether you should close the TCP connection; that may be harmless on the server side because the server is expected to initiate the TCP closing handshake.

Comment on lines 144 to 145
if event.opcode == Opcode.CONT:
self.handle_cont(event)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens when a message is fragmented. See https://datatracker.ietf.org/doc/html/rfc6455#section-5.4 for context. In short:

  • the first frame is TEXT or BINARY
  • every subsequent frame is CONT
  • the last frame has the FIN bit set

This is correctly implemented in handle_cont. As far as I can tell, ASGI doesn't support fragmented WebSocket messages so you have to buffer and reassemble.


For comparison, websockets provides both options — buffer and reassemble the message (if you don't care) or receive it frame by frame (if you need streaming). If you're curious, it's implemented here.


extensions = []
if self.config.ws_per_message_deflate:
extensions = [ServerPerMessageDeflateFactory()]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default settings lead to high-memory usage with marginal performance benefits e.g. 316KiB RAM per connection instead of 64KiB for an additional 10% compression in a basic benchmark.

FYI websockets defaults to:

            ServerPerMessageDeflateFactory(
                server_max_window_bits=12,
                client_max_window_bits=12,
                compress_settings={"memLevel": 5},
            )

See here for much more than you ever wanted to know.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is the ws_per_message_deflate setting useless?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compression is useful. However, I recommend to adjust the settings like I showed, because it's likely to lead to better performance in typical use cases (lots of connections receiving frequent, small messages).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear here, you are not suggesting a specific window/memory level - you're just saying that it would be smart to change it, right? Since the normal websocket server would receive a lot of connections with small messages.

I see this paragraph on the docs:

websockets defaults to Window Bits = 12 and Memory Level = 5 to stay away from Window Bits = 10 or Memory Level = 3 where performance craters, raising doubts on what could happen at Window Bits = 11 and Memory Level = 4 on a different corpus.

Is your point that I should move the default to window bits = 11, and memory level = 4?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that you should not default to window bits = 15 and memory level = 9 — which would be the default if you do nothing special.

I don't have a strong opinion on 12/5 vs. 11/4 (or any other combo in the 11–13 and 4–6 ranges). Whatever default you choose is going to work better for some users and worse from others depending on the contents of their messages, on the latency and throughput of their network, and on their CPU and RAM costs.

If you want a reco, I stand by the default in websockets (12/5) for servers. No one ever suggested something different. While that doesn't prove that it's optimal, at least it proves that it doesn't cause major trouble.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. 👍

extensions=extensions,
max_size=self.config.ws_max_size,
logger=logging.getLogger("uvicorn.error"),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you checking the Origin header elsewhere in uvicorn? If not, supporting it in the config would be good, as it helps defend against Cross-Site WebSocket Hijacking attacks. (I have forgotten about the details of this attack.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not check "Origin" in uvicorn. I'll read about it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I don't think that's a release blocker. For everyone running behind a reverse proxy (Cloudflare, Cloudfront, etc.), the reverse proxy does it.


def connection_lost(self, exc: Exception | None) -> None:
code = 1005 if self.handshake_complete else 1006
self.queue.put_nowait({"type": "websocket.disconnect", "code": code})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a normal closing handshake scenario where you generate a "websocket.disconnect" event when you receive a close frame, it looks like this will generete a second "websocket.disconnect" event when the TCP connection terminates, which sounds incorrect. Ideally, you'd generate that event only if you haven't generated one already.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ASGI allows sends multiple "websocket.disconnect". I'll confirm.

self.transport.close()

def eof_received(self) -> None:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect self.conn.receive_eof() here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make a difference? None of the other protocols have this implemented.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you are only implementing a server, you can get away without it, as the server is expected to close the TCP connection first.

Including it could cause websockets to send a close frame or log a message when losing connectivity, which could make troubleshooting easier vs. the connection gets garbage collected and you don't know what happened.

data_type = self.curr_msg_data_type
msg: WebSocketReceiveEvent
if data_type == "text":
msg = {"type": "websocket.receive", data_type: self.bytes.decode()}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to handle the case when decode() raises an exception. In that case, websockets closes the connection with code 1007 (invalid data).

Invalid UTF-8 is relatively uncommon so you could choose to ignore it. I forgot this case in the new asyncio implementation and it took some time to detect the bug.


def handle_ping(self, event: Frame) -> None:
output = self.conn.data_to_send()
self.transport.write(b"".join(output))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest that you always do this in handle_events() rather than only in handle_xxx() methods where you believe that websockets sent an automatic response. I'm not sure how much of a different it will make in practice but it feels cleaner.

Copy link
Collaborator Author

@Kludex Kludex Jun 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean those 2 lines, right?

Which other method besides this one we would be able to do it tho? The others seem to close the transport conditionally.

@Kludex Kludex mentioned this pull request Dec 26, 2024
11 tasks
@ilyakamens
Copy link

Hi! I'm just leaving a comment to say that I'm running into the issue this fixes, and I'm greatly anticipating this being merged. Let me know if there's anything I can do to help.

@FeodorFitsner
Copy link

Is there any hope this PR could be finished/merged? 😅

@Kludex
Copy link
Collaborator Author

Kludex commented Jun 18, 2025

Is there any hope this PR could be finished/merged? 😅

It's going to happen at some point. No ETA. Not much needed to finish this PR tho.


Funny. I am at PyCon SG and I heard a talk about flet yesterday. Nice stuff! 😁

I don't understand why your project has both hypercorn and Uvicorn as dependencies tho.

@FeodorFitsner
Copy link

Ah, great! :)

I don't understand why your project has both hypercorn and Uvicorn as dependencies tho.

Good catch, it was probably added during the testing of some options. Flet uses Uvicorn as a built-in web server and it's awesome.

@Kludex Kludex requested a review from aaugustin June 21, 2025 05:25
@Kludex
Copy link
Collaborator Author

Kludex commented Jun 21, 2025

@aaugustin Pipeline is passing. Do you have time to give a last look? (also, sorry for the delay)

Things that I skip in this PR:

  1. We are not testing CONT frames.
  2. We are not testing when there's a parser issue.

I'd like to have those 2 points tested, but I don't want to spend more time in this to have this merged.


I also think that WebSockets merits more documentation in Uvicorn.


Besides that, I checked all the comments above, and none of them seem to block this PR to get merged (correct me if wrong @aaugustin).

@aaugustin
Copy link
Contributor

I will have a look this week-end.

Copy link
Contributor

@aaugustin aaugustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any blockers for merging this PR.

@@ -105,9 +105,9 @@ async def websocket_app(scope: Scope, receive: ASGIReceiveCallable, send: ASGISe
elif message["type"] == "websocket.disconnect":
break

async def open_connection(url):
async def open_connection(url: str):
async with websockets.client.connect(url) as websocket:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use websockets.asyncio.client.connect and adjust imports accordingly to avoid the deprecation warning.

Then you don't have to add "ignore: websockets.client.connect is deprecated.*:DeprecationWarning" in pyproject.toml.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried this. I think this will need to be a next step... It seems is not just replacing it with the new structures. The behavior itself seems to change in some tests.


def send_receive_event_to_app(self) -> None:
if self.curr_msg_data_type == "text":
self.queue.put_nowait({"type": "websocket.receive", "text": self.bytes.decode()})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you receive invalid UTF-8, self.bytes.decode() will raise UnicodeDecodeError; should it be handled?

Perhaps it's fine to just let the exception bubble up; I have no idea how error handling works in uvicorn.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Fatal error on transport TCPTransport"

I guess I do need to handle. 😆

@aaugustin
Copy link
Contributor

We are not testing CONT frames.

They're uncommon in real life and your implementation is simple enough that it's probably right :-)

If you want to do a quick manual test, you can run this:

import asyncio
from websockets.asyncio.client import connect

async def fragmented_hello():
    async with connect("<address of echo server built with uvicorn>") as websocket:
        await websocket.send(["Hell", "o!"])
        await websocket.recv()

if __name__ == "__main__":
    asyncio.run(hello())

We are not testing when there's a parser issue.

The parser is hardened and quite hard to crash. oss-fuzz has been trying to crash if for several years now (https://github.com/python-websockets/websockets/blob/main/fuzzing/fuzz_websocket_parser.py) and hasn't crashed it yet.

@Kludex Kludex enabled auto-merge (squash) June 24, 2025 10:16
@Kludex Kludex disabled auto-merge June 24, 2025 10:16
@Kludex Kludex merged commit b960626 into master Jun 24, 2025
16 checks passed
@Kludex Kludex deleted the ws-sansio branch June 24, 2025 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants