urllib3 version 2.0.* treats control characters differently #3053

@XiangFrank

Description

Subject

Starting with version 2.0.0, urllib3 counts control characters in a request body differently from earlier versions.

For control characters such as '\x08' and '\x01', versions before and after 2.0.0 produce a different number of bytes, which can cause a Content-Length mismatch in some cases.
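One possible explanation, offered as an assumption rather than a confirmed diagnosis: the two versions may encode a str body with different codecs. Python's http.client documents that a str body is encoded as ISO-8859-1 (Latin-1); if urllib3 2.0.* encodes it as UTF-8 instead, any character at or above '\x80' grows from one byte to two, while ASCII control characters stay at one byte. The sketch below only uses Python's built-in str.encode to show the difference per character; encoded_lengths is a hypothetical helper name, not part of urllib3.

```python
def encoded_lengths(ch: str) -> tuple[int, int]:
    """Return (latin-1 byte count, utf-8 byte count) for a string.

    Hypothetical helper for illustration; it does not call urllib3.
    """
    return len(ch.encode("latin-1")), len(ch.encode("utf-8"))


# ASCII control characters vs. characters in the \x80-\xff range
for ch in ["\x01", "\x08", "\x80", "\x81"]:
    latin1_len, utf8_len = encoded_lengths(ch)
    print(f"{ch!r}: latin-1={latin1_len} byte(s), utf-8={utf8_len} byte(s)")
```

If that assumption holds, '\x01' and '\x08' alone would not change the length; only the characters in the '\x80' to '\xff' range would.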

Environment

My environment is Python 3.10, and I found that this bug exists in all 2.0.* versions.

Steps to Reproduce

This bug is easy to reproduce. Here is the sample code I used:

import urllib3

poolmanager = urllib3.PoolManager(num_pools=10, maxsize=3, block=False)
conn = poolmanager.connection_from_url("https://httpbin.org")
headers = {"User-Agent": "python-api/2", "Accept-Encoding": "gzip, deflate",
           "Accept": "*/*", "Connection": "keep-alive"}
# The str body mixes plain ASCII text with control characters.
resp = conn.urlopen(method="POST", url="/post", body="test\x80\x80\x01\x01\x81", headers=headers)
# httpbin echoes the request back, including the Content-Length header it received.
print(resp.data)

Run this code under different versions of urllib3 (I tested with 1.26.6 and 2.0.2) and you will find that the 'Content-Length' reported in the response is completely different.

Expected Behavior

The response body should be the same under both versions of urllib3.

Actual Behavior

The 'Content-Length' is 9 bytes with version 1.26.6, but 12 bytes with version 2.0.2.
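The 9 and 12 byte figures match what Python itself produces when the reproduction string is encoded as Latin-1 and as UTF-8, which is consistent with (but does not prove) a change of default encoding between the two versions. A possible workaround, sketched under that assumption, is to encode the body to bytes yourself so the library never has to pick a codec. to_bytes is a hypothetical helper, and the choice of Latin-1 as its default is only there to mirror the 1.26.6 result.

```python
BODY = "test\x80\x80\x01\x01\x81"  # same string as in the reproduction


def to_bytes(body: str, encoding: str = "latin-1") -> bytes:
    """Hypothetical helper: encode explicitly so the caller picks the codec."""
    return body.encode(encoding)


print(len(to_bytes(BODY, "latin-1")))  # 9, the length reported for 1.26.6
print(len(to_bytes(BODY, "utf-8")))    # 12, the length reported for 2.0.2

# Usage with the reproduction (not executed here because it needs network access):
# resp = conn.urlopen(method="POST", url="/post", body=to_bytes(BODY), headers=headers)
```

Passing bytes should give the same Content-Length on both versions, since no implicit str-to-bytes conversion is involved. Whether Latin-1 or UTF-8 is the right choice depends on what the receiving server expects.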
