The curious case of slow/fast grequests code

saurabh.hirani@gmail.com / @sphirani

What's in it for you?

  • An interesting demoable debugging story
  • Surprising results in profiling and tracing
  • grequests + gevent internals
  • Demo code, slides
  • How many monkeys fit in a Clint Eastwood movie title?

Use case

  • Python program - thousands of HTTPS GETs
  • Ran slow on Python 2.7
  • Ran fast on Python 3.7

The End

Use case

  • Reporting tool: thousands of HTTP GETs
  • Ran on AWS EC2
  • Once per day
  • 20-25 mins
  • requests module - 1 URL at a time

Migrate from

  • AWS EC2 to AWS Lambda
  • HTTP to HTTPS
  • Python2.7 to Python3.7

Problem

  • AWS Lambda 15 mins max runtime
  • Existing code took 20-25 mins to run

Solution

  • Convert serial GETs to concurrent
  • requests -> grequests (gevent + requests)
  • requests - blocking socket
  • requests = send, block, recv, send, block, recv
  • grequests - non-blocking socket
  • grequests = send, send, block, recv, recv
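
The two orderings above can be sketched with the standard library alone. This is a minimal illustration, not grequests itself: asyncio stands in for gevent's cooperative scheduling, and `fake_get` with its delay is a hypothetical stand-in for a real HTTP GET.

```python
import asyncio


async def fake_get(i, events, delay=0.01):
    """Hypothetical stand-in for one GET: send, wait on the network, recv."""
    events.append(f"send {i}")
    await asyncio.sleep(delay)  # stands in for blocking on the socket
    events.append(f"recv {i}")


async def serial(n):
    # One request at a time: send, block, recv, send, block, recv
    events = []
    for i in range(n):
        await fake_get(i, events)
    return events


async def concurrent(n):
    # All requests started before any wait completes: send, send, block, recv, recv
    events = []
    await asyncio.gather(*(fake_get(i, events) for i in range(n)))
    return events


serial_events = asyncio.run(serial(3))
concurrent_events = asyncio.run(concurrent(3))
print(serial_events)
print(concurrent_events)
```

In the serial run every recv lands before the next send. In the concurrent run all sends are issued first, so the waits overlap.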

Local setup

Stage-0 - Trigger

Add debug statements

Visualize - requests

grequests + HTTPS

Sr. No.  n   requests
1.       3   3.30
2.       5   5.50
3.       10  11

Visualize - Python3.7 grequests flow

Python 2.7 + HTTPS

Sr. No.  n   requests  Python3.7 grequests
1.       3   3.30      1.20
2.       5   5.50      1.30
3.       10  11        1.55

Visualize - Python2.7 grequests flow

grequests + HTTPS

Sr. No.  n   requests  Python2.7 grequests  Python3.7 grequests
1.       3   3.30      3.30                 1.20
2.       5   5.50      5.50                 1.30
3.       10  11        11                   1.55

What's going on?

  • requests
    • send, block, recv, send, block, recv
  • Python3.7 grequests
    • send, send, block, recv, recv
  • Python2.7 grequests
    • open, open, send, block, recv, send, block, recv

Observations

  • Same code - Variable speed
  • Depends on runtime environment
  • Isolate using docker

Local setup

  • Stage-0 - laptop
  • Stage-1 - docker
  • Minimal set of modules
  • Each container represents a stage in our problem
  • Container used in place of virtualenv - easy to demo

grequests https status

Stage  Setup    Python2.7  Python3.7
0      laptop   slow       fast
1      minimal  ???        ???

Stage-1 - setup

  • Modules: grequests, requests, urllib3, gevent
  • Python2.7 container: test_grequests_python27_1
  • Python3.7 container: test_grequests_python37_1

Stage-1 - observations

  • HTTPS demo
  • Python2.7 fast
  • Python3.7 fast
  • Stage-0 Python2.7 was slow

Stage-0 v/s Stage-1

Sr. No.  Python2.7 Stage-0  Python2.7 Stage-1
1.       slow               fast
2.       Profiling          Profiling
3.       Tracing            Tracing

Verify pyopenssl involvement

  • Demo
  • Laptop Python2.7 had pyopenssl
  • Laptop Python3.7 did not have pyopenssl
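
A quick check for whether pyopenssl is present in a given environment, sketched for Python 3 only. It assumes just that pyopenssl's import name is `OpenSSL`; the helper name `has_module` is hypothetical.

```python
import importlib.util


def has_module(name):
    """Return True if the top-level module `name` is importable here."""
    return importlib.util.find_spec(name) is not None


# pyopenssl installs under the import name "OpenSSL"
pyopenssl_present = has_module("OpenSSL")
print("pyopenssl installed:", pyopenssl_present)
```

Running this in each interpreter or container shows which ones have pyopenssl available.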

grequests https status

Stage  Setup                Python2.7  Python3.7
0      laptop               slow       fast
1      minimal              fast       fast
2      Stage-1 + pyopenssl  ???        ???

Stage-2 - setup

  • Modules: grequests, requests, urllib3, gevent, pyopenssl
  • Stage-1 + pyopenssl
  • Python2.7 container: test_grequests_python27_2
  • Python3.7 container: test_grequests_python37_2

Stage-2 - observations

  • HTTPS Demo
  • Python2.7 slow
  • Python3.7 slow
  • Stage-1 Python2.7/Python3.7 were fast
  • Reason: pyopenssl monkey patching SSLSocket

grequests https status

Stage  Setup                Python2.7  Python3.7
0      laptop               slow       fast
1      minimal              fast       fast
2      Stage-1 + pyopenssl  slow       slow

gevent monkey patching

  • gevent + monkey-patching = non-blocking socket

Without monkey patching


$ python

import inspect
import socket

print(socket.socket)
<class 'socket._socketobject'>

inspect.getsourcefile(socket.ssl)
'/usr/local/Cellar/python@2/2.7.15_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py'

exit()

With monkey patching


$ python

from gevent import monkey
monkey.patch_all()

import inspect
import socket

print(socket.socket)
<class 'gevent._socket2.socket'>

inspect.getsourcefile(socket.ssl)
'/usr/local/lib/python2.7/site-packages/gevent/_socket2.py'

exit()

Without monkey patching - blocking socket blocks


$ cd monkey_patching_demo

$ cat blocking_socket.py

def socket():
    print("I will block")

$ python

import blocking_socket
blocking_socket.socket()

I will block

exit()

Without monkey patching - non-blocking socket doesn't


$ cd monkey_patching_demo

$ cat non_blocking_socket.py

def socket():
    print("I will not block")

$ python

import non_blocking_socket
non_blocking_socket.socket()

I will not block

exit()

With monkey patching - blocking socket replaced


$ python

import patcher
patcher.patch_all()

import blocking_socket
blocking_socket.socket()

I will not block

exit()

That which monkey patches



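$ cat patcher.py

def patch_all():
    # Import both modules, then rebind the blocking module's socket
    # attribute so later lookups get the non-blocking version.
    import blocking_socket
    import non_blocking_socket
    blocking_socket.socket = non_blocking_socket.socket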

gevent In-depth explanation

  • Notes + Kavya Joshi's excellent in-depth talk
  • How is pyopenssl making the code slow?

    Double monkey patching code flow

    Solution

    • Uninstall pyopenssl if you don't need it OR
    • Check pyopenssl alternatives e.g. cryptography OR
    • Use gevent-openssl
    • gevent-openssl = pyopenssl compatible with gevent

    gevent-openssl patching flow

    grequests https status

    Stage  Setup                     Python2.7  Python3.7
    0      laptop                    slow       fast
    1      minimal                   fast       fast
    2      Stage-1 + pyopenssl       slow       slow
    3      Stage-2 + gevent-openssl  ???        ???

    Stage-3 - setup

    • Modules: grequests, requests, urllib3, gevent, pyopenssl, gevent-openssl
    • Stage-2 + gevent-openssl
    • Python2.7 container: test_grequests_python27_3
    • Python3.7 container: test_grequests_python37_3

    Stage-3 - observations

    • HTTPS Demo
    • Python 2.7 fast
    • Python 3.7 slow!!
    • Stage-1 Python 3.7 was fast

    grequests https status

    Stage  Setup                     Python2.7  Python3.7
    0      laptop                    slow       fast
    1      minimal                   fast       fast
    2      Stage-1 + pyopenssl       slow       slow
    3      Stage-2 + gevent-openssl  fast       slow

    gevent-openssl making Python3.7 HTTPS slow?

    Partial monkey patching details

    grequests https status

    Stage  Setup                     Python2.7  Python3.7
    0      laptop                    slow       fast
    1      minimal                   fast       fast
    2      Stage-1 + pyopenssl       slow       slow
    3      Stage-2 + gevent-openssl  fast       slow
    4      Stage-3 + patch           ???        ???

    Stage-4 - setup

    • Modules: grequests, requests, urllib3, gevent, pyopenssl, gevent-openssl
    • Stage-3 + patch
    • Python2.7 container: test_grequests_python27_4
    • Python3.7 container: test_grequests_python37_4

    Stage-4 - observations

    grequests https status

    Stage  Setup                     Python2.7  Python3.7
    0      laptop                    slow       fast
    1      minimal                   fast       fast
    2      Stage-1 + pyopenssl       slow       slow
    3      Stage-2 + gevent-openssl  fast       slow
    4      Stage-3 + patch           fast       fast

    Takeaways

    • One migration at a time - Python3.7, HTTPS, Lambda
    • Monkey patching order matters
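
One general way patching order can matter, sketched with a hypothetical in-memory module. This illustrates name binding in Python and is not necessarily the exact mechanism behind the stages above: code that grabbed a reference before the patch keeps the old object, while code that looks the name up afterwards sees the replacement.

```python
import types

# Hypothetical stand-in for a stdlib module that gets patched
net = types.ModuleType("net")
net.socket = lambda: "blocking"


def non_blocking_socket():
    return "non-blocking"


# "Early importer": binds the name before patching, like `from net import socket`
early_socket = net.socket

# The patch rebinds the attribute on the module object
net.socket = non_blocking_socket

# "Late importer": binds the name after patching
late_socket = net.socket

print(early_socket())  # still the original function
print(late_socket())   # the patched replacement
```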

    The answer

    • Q: How many monkeys fit in a Clint Eastwood movie title?
    • A: 5.x

    Q & A

    saurabh.hirani@gmail.com / @sphirani

    Why pyopenssl?

    • requests optionally includes pyopenssl for SNI support
    • stdlib ssl did not provide SNI in Python 2.7.8 and older
    • pyopenssl provides SNI support
    • stdlib ssl in Python 2.7.9+ supports SNI - ssl.HAS_SNI flag is True
    • Other pyopenssl use cases - generating root certs, reading SSL certs
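
The SNI flag mentioned above can be read directly from the stdlib `ssl` module. A minimal check, assuming nothing beyond the standard library:

```python
import ssl

# True when the stdlib ssl module was built with SNI support
has_sni = ssl.HAS_SNI
print("HAS_SNI:", has_sni)

# The OpenSSL build the interpreter is linked against
print("OpenSSL:", ssl.OPENSSL_VERSION)
```

If `HAS_SNI` is True, pyopenssl is not needed just to get SNI.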

    grequests =~ gevent + requests?

    • No
    • grequests calls monkey.patch_all(thread=False, select=False), so select stays unpatched
    • gevent.monkey.patch_all() with default arguments patches select as well
    • pyopenssl calls urllib3 wait - which calls select
    • Unpatched select is blocking
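
A minimal stdlib-only sketch of what "blocking" means here: `select.select` holds the calling thread until a socket is ready or the timeout expires. Under gevent all greenlets share that thread, so (as an assumption about the mechanism described above) no other request can make progress while an unpatched select waits. The socket pair and the 0.05 s timeout are illustrative.

```python
import select
import socket
import time

# A connected pair of sockets; nothing has been written yet
a, b = socket.socketpair()

start = time.monotonic()
readable, _, _ = select.select([a], [], [], 0.05)  # waits out the full timeout
waited = time.monotonic() - start
print("ready sockets:", len(readable), "waited: %.3fs" % waited)

# Once data is available, select reports the socket as readable
b.sendall(b"x")
ready, _, _ = select.select([a], [], [], 0.05)
ready_count = len(ready)
print("ready sockets after send:", ready_count)

a.close()
b.close()
```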

    Why are connections opened concurrently even though Python2.7 + HTTPS is slow?

    Connections open concurrently, but in Stage-2 the send/recv steps run serially