
Urequests updates #500


Closed · wants to merge 11 commits

Conversation

@andrewleech (Contributor) commented Jun 19, 2022:

In the spirit of #488 I've started to pull together a number of updates to the urequests library. So far I haven't contributed any new features myself, simply rebased (past the restructuring), merged together and applied black to all commits.

Inspired by a similar change in pycopy, but updated to be more compatible. Also closes #394

  • requests: Fix raising unsupported Transfer-Encoding exception.

From #398 (note I manually split this into the two separate urequests commits):

From #311:

From #469:

  • urequests: Always open sockets in SOCK_STREAM mode.

Inspired by #276:

  • urequests: Provide error message when server doesn't respond with valid http.

From #263:

  • urequests: Add timeout, passed to underlying socket if supported.

From pycopy:

  • urequests: Explicitly add "Connection: close" to request headers.
  • urequests: Add ability to parse response headers.

For reference, the rebase & black on each branch (pre-merge) has been done in a single command:

git rebase -Xtheirs -i --exec 'black --fast --line-length=99 */urequests python-stdlib/binascii;git add */urequests python-stdlib/binascii;git commit --amend --no-edit' $(git merge-base HEAD master)

Testing TBD - none of the above commits include any unit tests.

resp_d = None
if parse_headers is not False:
    resp_d = {}

s = usocket.socket(ai[0], ai[1], ai[2])
@mattytrentini (Contributor) commented Jun 19, 2022:

Is the socket always closed correctly? Should we use a context manager to control the lifetime of the socket?

Contributor Author:

I'm not sure how this should be handled, to be honest. The socket needs to be left open when request() returns, as the caller will generally read the content later.

The Response() object has a close() function; it should probably also gain __enter__ and __exit__ so it can be used as a context manager, the same as in CPython requests.
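A minimal sketch of what that could look like, assuming a Response-like object that owns the socket (names here are illustrative, not the PR's actual code):

```python
class Response:
    """Sketch: context-manager support for a Response-like object."""

    def __init__(self, sock):
        self._sock = sock

    def close(self):
        # Idempotent: safe to call more than once.
        if self._sock is not None:
            self._sock.close()
            self._sock = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always close the socket, even if the body raised mid-read.
        self.close()
```

This would allow `with urequests.get(url) as r: ...`, closing the socket deterministically like CPython requests does.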

Contributor Author:

Ref: #278

@@ -1,4 +1,4 @@
 srctype = micropython-lib
 type = module
-version = 0.6
+version = 0.6.1
Contributor:

This seems more significant than a patch release.

Contributor Author:

Ah yep, this came in with one of the branches; I'll filter it back out into a new commit that updates the overall version.

chunked = data and is_chunked_data(data)

if auth is not None:
    headers.update(encode_basic_auth(auth[0], auth[1]))
Contributor:

This is neat, but it does make me think that we should have at least some documentation (since it's not that obvious). I'll pull together a README, or at least suggestions for what should be in it.

@mattytrentini (Contributor):

Basic auth appears to work correctly 👍 :

>>> import urequests
>>> r = urequests.get('http://httpbin.org/basic-auth/user/pass', auth=('user', 'pass'))
>>> r.status_code
200
>>> r.text
'{\n  "authenticated": true, \n  "user": "user"\n}\n'
>>> r = urequests.get('http://httpbin.org/basic-auth/user/pass', auth=('user', 'fail'))
>>> r.status_code
401

@mattytrentini (Contributor):

We should probably also make accessing headers case insensitive to match requests. It's inexpensive and will make documenting easier.

@andrewleech andrewleech force-pushed the urequests branch 3 times, most recently from ff5190f to 22e4f23 Compare June 20, 2022 03:13
@andrewleech (Contributor Author) commented Jun 20, 2022:

> We should probably also make accessing headers case insensitive to match requests. It's inexpensive and will make documenting easier.

I'm not too sure just how inexpensive it is, to be honest... it depends on how far you want to take it: https://stackoverflow.com/questions/2082152/case-insensitive-dictionary

One of the more basic versions from here would likely be fine though.
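One of the simpler approaches from that discussion could be sketched as a small dict subclass (illustrative only; this deliberately skips the constructor, update(), get(), etc., which a complete version would also need to override):

```python
class CIDict(dict):
    """Minimal case-insensitive dict sketch: keys are lowercased on
    the way in and on lookup. Not a complete implementation."""

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.lower())

    def __contains__(self, key):
        return super().__contains__(key.lower())
```

The memory cost is just the subclass itself; the main question is how many dict entry points need overriding to behave consistently.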

@andrewleech andrewleech force-pushed the urequests branch 2 times, most recently from d5dc912 to e33d845 Compare June 20, 2022 21:59
@mattytrentini (Contributor):

Tested redirects (GET google.com generates a 301) and basic auth on the unix port (in the published v1.19 container). Tested with and without SSL. All good! ✔️

> # pwd is a clone of micropython-lib with this PR active (gh pr checkout 500)
> docker run -ti --rm -v $(pwd):/code -w /code micropython/unix bash -c 'MICROPYPATH="python-ecosys/urequests" micropython-dev'
MicroPython v1.19-dirty on 2022-06-16; linux [GCC 8.3.0] version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> import urequests
>>> urequests.get("http://google.com").status_code
200
>>> urequests.get("https://google.com").status_code
200
>>> urequests.get('http://httpbin.org/basic-auth/user/pass', auth=('user', 'pass')).status_code
200
>>> urequests.get('https://httpbin.org/basic-auth/user/pass', auth=('user', 'pass')).status_code
200
>>> urequests.get('https://httpbin.org/basic-auth/user/pass', auth=('user', 'fail')).status_code
401

@mattytrentini (Contributor):

Timeouts look good too, though a different exception is raised than in requests. It could be a good idea to raise the same requests.exceptions.ReadTimeout so that code can be ported more easily? Certainly not urgent.

>>> # httpstat.us can be configured to delay for a sleep period in milliseconds. requests timeout is in seconds.
>>> r = urequests.get('http://httpstat.us/200?sleep=5000', timeout=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "python-ecosys/urequests/urequests.py", line 176, in get
  File "python-ecosys/urequests/urequests.py", line 121, in request
OSError: [Errno 110] ETIMEDOUT
> python
Python 3.8.10 (default, Mar 15 2022, 12:22:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> r = requests.get('http://httpstat.us/200?sleep=5000', timeout=1)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 421, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 416, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.8/http/client.py", line 1348, in getresponse
    response.begin()
  File "/usr/lib/python3.8/http/client.py", line 316, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.8/http/client.py", line 277, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 719, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 400, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 423, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 330, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='httpstat.us', port=80): Read timed out. (read timeout=1)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 529, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='httpstat.us', port=80): Read timed out. (read timeout=1)

@dpgeorge (Member):

I think OSError(ETIMEDOUT) is good enough for now (and maybe good enough forever!).

@@ -130,3 +190,15 @@ def patch(url, **kw):

def delete(url, **kw):
    return request("DELETE", url, **kw)


def encode_basic_auth(username, password):
Member:

Does this need to be a separate function? It would be more efficient and smaller bytecode if it were inlined at its point of use.

Contributor Author:

Minimising the overhead of extra functions is a good goal. I've done this here as well as the is_chunked_data() one.
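An inlined form of the basic-auth header construction might look like the following. This is a hypothetical sketch using CPython's binascii (MicroPython's ubinascii has the same b2a_base64 call), not the PR's exact code:

```python
import binascii

# Build the Authorization header inline, without a helper function.
# The [:-1] strips the trailing newline b2a_base64 appends.
username, password = "user", "pass"
token = binascii.b2a_base64(b"%s:%s" % (username.encode(), password.encode()))[:-1]
headers = {"Authorization": b"Basic " + token}
```

Inlining avoids a function object and a call frame, which matters for bytecode size on constrained targets.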

import ubinascii

formatted = b"{}:{}".format(username, password)
formatted = ubinascii.b2a_base64(formatted)[:-1].decode("ascii")
Member:

Please use str(..., "ascii") instead of decode.

Contributor Author:

Thanks, fixed here as well as one other usage of decode in a different commit.
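The two spellings produce identical results; the str() form is preferred in micropython-lib because it avoids a method lookup and compiles to smaller bytecode:

```python
raw = b"dXNlcjpwYXNz"

# Equivalent conversions from bytes to str:
via_decode = raw.decode("ascii")
via_str = str(raw, "ascii")

assert via_decode == via_str == "dXNlcjpwYXNz"
```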

@@ -331,7 +331,7 @@ def a2b_base64(ascii):
table_b2a_base64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


-def b2a_base64(bin):
+def b2a_base64(bin, newline=True):
Member:

This change doesn't seem to have anything to do with urequests, but otherwise it's OK to have here (it's a separate commit, which is good).

Contributor Author:

Yeah, I thought it was related; it was in the same original PR as the redirect/chunked change. I can confirm it wasn't actually used there, though it looks like a clean change anyway.
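For reference, CPython's binascii.b2a_base64 accepts the same keyword-only newline parameter, so the change brings this python-stdlib port in line with the CPython API:

```python
import binascii

# With the default, a trailing b"\n" is appended; newline=False omits it.
with_nl = binascii.b2a_base64(b"user:pass")
without = binascii.b2a_base64(b"user:pass", newline=False)
```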



def is_chunked_data(data):
    return getattr(data, "__iter__", None) and not getattr(data, "__len__", None)
Member:

Does this need to be a separate function? Smaller code if it's not.

Contributor Author:

inlined now
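The inlined check works the same way as the helper did: data counts as "chunked" when it is iterable but has no length, e.g. a generator yielding body chunks, while bytes/str bodies (which have __len__) do not. A self-contained illustration:

```python
def _chunks():
    # Example chunked body: a generator (iterable, no __len__).
    yield b"part1"
    yield b"part2"

# The inlined expression from the helper function:
gen_is_chunked = bool(
    getattr(_chunks(), "__iter__", None) and not getattr(_chunks(), "__len__", None)
)
bytes_is_chunked = bool(
    getattr(b"body", "__iter__", None) and not getattr(b"body", "__len__", None)
)
```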

elif l.startswith(b"Location:") and not 200 <= status <= 299:
raise NotImplementedError("Redirects not yet supported")
if status in [301, 302, 303, 307, 308]:
redirect = l[10:-2].decode()
Member:

Please use str(..., "utf-8").

Contributor Author:

done thanks

if parse_headers is False:
    pass
elif parse_headers is True:
    l = l.decode()
Member:

Please use str(..., "utf-8")

try:
    s.settimeout(timeout)
except AttributeError:
    raise AttributeError("Socket does not support timeout on this platform")
Member:

I don't think it makes sense to convert the error string. It'll already raise a sensible message.

Paul Sokolovsky and others added 11 commits June 28, 2022 16:55
This is controlled by the parse_headers param to request(), which defaults to
True for compatibility with upstream requests. In this case, headers are
available as .headers of Response objects. They are, however, a normal (not
case-insensitive) dict.

If parse_headers=False, the old behavior of ignoring response headers is used,
which saves the memory of the dict.

Finally, parse_headers can be a custom function which can e.g. parse only a
subset of headers (again, to save memory).
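The exact callback signature for a custom parse_headers function isn't shown in this thread, so the following is purely illustrative: a hypothetical callback that receives one decoded header line plus the dict being built, and keeps only Content-* headers to save memory:

```python
def parse_content_headers(line, headers):
    """Hypothetical parse_headers callback: retain only Content-* headers."""
    k, _, v = line.partition(":")
    if k.lower().startswith("content-"):
        headers[k] = v.strip()
```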
Even though we use HTTP 1.0, where closing the connection after sending the
response should be the default, some servers ignore this requirement and keep
the connection open. So, explicitly send the corresponding header to get the
expected behavior. This follows a similar change done previously to the
uaiohttpclient module (8c1e077).
Would lead to a recursive TypeError because of str + bytes.
On the ESP32, socket.getaddrinfo() might return SOCK_DGRAM instead of SOCK_STREAM, e.g. with ".local" addresses.
As an HTTP request is always a TCP stream, we don't need to rely on the values returned by getaddrinfo.
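The getaddrinfo fix can be sketched as follows, using CPython's socket module (MicroPython's usocket behaves equivalently for this call): take the address family from the lookup result but always force SOCK_STREAM.

```python
import socket

# Take the family from getaddrinfo, but force a TCP stream socket:
# HTTP always runs over TCP, so the type/proto hints from getaddrinfo
# (which the ESP32 may report as SOCK_DGRAM) can be safely ignored.
ai = socket.getaddrinfo("127.0.0.1", 80)[0]
s = socket.socket(ai[0], socket.SOCK_STREAM)
s.close()
```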
@mattytrentini (Contributor):

> I think OSError(ETIMEDOUT) is good enough for now (and maybe good enough forever!).

For now, for sure. Forever? Not so convinced... 😛 Two reasons: a) it makes it harder to port libraries whose error handling expects that exception, and b) that's quite a general exception for a very specific error condition.

Anyway, I agree it's unimportant...for now!
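Until/unless urequests grows a ReadTimeout of its own, ported code can detect the condition explicitly. A hypothetical portability shim (is_read_timeout is my name, not the library's):

```python
import errno

def is_read_timeout(exc):
    """Return True for the OSError that urequests raises on a read
    timeout, so callers can treat it like requests' ReadTimeout."""
    return isinstance(exc, OSError) and exc.errno == errno.ETIMEDOUT
```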

@jimmo (Member) commented Jun 28, 2022:

Thanks @andrewleech! LGTM

@dpgeorge (Member):

Tested on PYBD-SF2W (new features work as shown above), and merged in 5854ae1 through 70e422d.

Great work, thanks to everyone involved!

9 participants