THE JOURNEY OF ASYNCIO ADOPTION
IN INSTAGRAM
Jimmy Lai
at PyCon TW 2018
OUTLINE
2
1 What's asyncio?
2 Asyncio Adoption in Instagram
3 Q&A
ABOUT ME - JIMMY LAI
• Software Engineer in Instagram Infrastructure
• I like Python
• Recent interests: Python efficiency
• profiling
• Cython
• asyncio
3
INSTAGRAM BACKEND
• Python + Django
• Serving with uwsgi
• Data fetching from backends
• No. of processes > No. of CPUs
4
[Diagram: a server runs uwsgi with many Django processes per CPU, sharing memory and fetching data from memcached, cassandra, and thrift services (https://instagram-engineering.com/)]
BLOCKING I/O PROBLEMS
• Slow API: APIs take longer to finish. Bad user experience.
• CPU idle: context switches between processes come with overhead.
• Harakiri: uwsgi terminates long-running request processes (harakiri), and restarting a process has high overhead.
5
WHAT'S ASYNCIO
• Asynchronous I/O
• Running I/O concurrently
• Blocking IO mode
• Async IO mode
6
[Photo: a simultaneous chess exhibition (https://rarehistoricalphotos.com/samuel-reshevsky-age-8-france-1920/)]
[Diagram: timeline comparing blocking I/O, where CPU work and I/O alternate serially, with async I/O, where CPU work interleaves while several I/O operations overlap]
ASYNCIO AS SOLUTION
• Slow API: APIs run faster and users get a better experience.
• CPU idle: in-thread context switches instead of process context switches.
• Harakiri: just cancel the pending async call (see the sketch below). No need to kill the process.
7
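A minimal sketch of the Harakiri point above (async_thrift_call here is just a stand-in for a slow backend call, not Instagram's API): a slow await can be bounded with a timeout and cancelled, so the worker process keeps serving.

import asyncio

async def async_thrift_call():
    await asyncio.sleep(10)  # stand-in for a slow backend call

async def handler():
    try:
        # cancel the pending call after 1s instead of letting uwsgi kill the process
        return await asyncio.wait_for(async_thrift_call(), timeout=1.0)
    except asyncio.TimeoutError:
        return None  # degrade gracefully; the process keeps serving other requests

asyncio.get_event_loop().run_until_complete(handler())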
MYTHS ABOUT ASYNCIO
1. asyncio is multi-process or parallel computing. It's actually single-threaded (illustrated in the sketch below).
• Only one function executes at a time.
• Only I/O can run concurrently.
2. asyncio is always faster in CPU and latency.
• The overhead of the event loop and context switches can be significant.
8
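A small sketch of myth 1 (not from the talk; the numbers are illustrative): sleeps overlap because they yield to the event loop, while CPU-bound coroutines never yield, so gather() still runs them one after another.

import asyncio
import time

async def io_bound():
    await asyncio.sleep(1)        # yields to the event loop, so sleeps overlap

async def cpu_bound():
    sum(range(20000000))          # never awaits, so it blocks the single thread

async def main():
    start = time.perf_counter()
    await asyncio.gather(io_bound(), io_bound())
    print('I/O-bound gather: %.1fs' % (time.perf_counter() - start))   # ~1s total

    start = time.perf_counter()
    await asyncio.gather(cpu_bound(), cpu_bound())
    print('CPU-bound gather: %.1fs' % (time.perf_counter() - start))   # sum of both runs

asyncio.get_event_loop().run_until_complete(main())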
CPYTHON ASYNCIO
• The asyncio module became available in CPython 3.4
• Instagram used Python 2.7 for a long time and migrated to 3.5 in 2017
9
ASYNC SYNTAX
• async def, await, coroutine
• run an async function in an event loop
• gather async functions to run I/O concurrently

In [1]: async def sleep_and_return(sec):
   ...:     await asyncio.sleep(sec)
   ...:     return sec
   ...:

In [2]: sleep_and_return()
Out[2]: <coroutine object sleep_and_return at 0x10556ae60>

In [3]: asyncio.get_event_loop().run_until_complete(sleep_and_return(1))
Out[3]: 1

In [4]: async def run():
   ...:     results = await asyncio.gather(
   ...:         sleep_and_return(1),
   ...:         sleep_and_return(1),
   ...:         sleep_and_return(2),
   ...:     )
   ...:     print(results)
   ...:

In [5]: %timeit -r 1 asyncio.get_event_loop().run_until_complete(run())
[1, 1, 2]
[1, 1, 2]
2 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
gather() is the key to the latency win!
HOW DOES ASYNCIO WORK?
• non-blocking I/O mode: socket.setblocking(False)
• register the I/O with an EpollSelector and wait until the I/O is ready via select() (a standalone sketch of the same idea follows the simplified source below)
14
Source code is simplified for explanation purposes.
class BaseSelectorEventLoop:
    async def sock_recv(self, sock, n):
        """Receive data from the socket."""
        fut = self.create_future()
        fd = sock.fileno()
        handle = events.Handle(
            self._sock_recv, args, self, None
        )
        self._selector.register(
            fd, selectors.EVENT_READ, (handle, None)
        )
        return await fut

    def _sock_recv(self, fut, registered_fd, sock, n):
        try:
            data = sock.recv(n)
        except (BlockingIOError, InterruptedError):
            ...

    def run_until_complete(self, future):
        """Run until the Future is done."""
        self.run_forever()

    def run_forever(self):
        """Run until stop() is called."""
        while True:
            self._run_once()
            if self._stopping:
                break

    def _run_once(self):
        """Run one full iteration of the event loop."""
        event_list = self._selector.select(None)
        self._process_events(event_list)
        ntodo = len(self._ready)
        for i in range(ntodo):
            handle = self._ready.popleft()
            handle._run()
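The same mechanism can be tried outside asyncio with the standard selectors module. A self-contained sketch (not from the talk; the socketpair is only there to have something readable): put the socket in non-blocking mode, register it with a selector, and call recv() only after select() reports the fd ready.

import selectors
import socket

def echo_once():
    server, client = socket.socketpair()        # connected pair, just for the demo
    client.setblocking(False)                   # non-blocking I/O mode
    sel = selectors.DefaultSelector()           # EpollSelector on Linux
    sel.register(client, selectors.EVENT_READ)

    server.send(b'hello')                       # make data available
    for key, _ in sel.select(timeout=1):        # wait until the fd is ready
        print(key.fileobj.recv(1024))           # recv() will not block now

    sel.unregister(client)
    server.close()
    client.close()

echo_once()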
ASYNCIO ADOPTION IN INSTAGRAM
ASYNCIO ADOPTION IN INSTAGRAM IS JUST LIKE
decorating some trees in a forest
16
Instagram started using Django and launched in 2010.
Large repo and many developers.
ASYNCIO ADOPTION CHALLENGES
• scale: collaboration in a large code repo with a lot of developers
• usability: asyncio utilities and bug fixes
• prioritization: too many blocking calls to migrate
• automation: reduce repeated manual effort
• efficiency: asyncio CPU overhead is very high
17
BACKEND CLIENT LIBRARIES ASYNCIO SUPPORT
• Thrift
• fbthrift py3 and py.asyncio namespaces
• HTTP
• aiohttp replaces requests (a sketch follows)
• Other backends
• https://github.com/aio-libs
18
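A hedged sketch of the aiohttp replacement (the example.com URLs are placeholders): with a blocking requests.get() the process waits for each response in turn, while the aiohttp version lets both HTTP round-trips overlap under gather().

import asyncio
import aiohttp

async def fetch(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            return await resp.text()

async def main():
    # both requests are in flight at the same time
    pages = await asyncio.gather(
        fetch('https://example.com/a'),
        fetch('https://example.com/b'),
    )
    print([len(page) for page in pages])

asyncio.get_event_loop().run_until_complete(main())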
MAKE ASYNCIO EASIER
• wait_for
• async_test
19
import asyncio

def wait_for(coro):
    loop = asyncio.get_event_loop()
    return loop.run_until_complete(coro)

result = wait_for(async_func())

import unittest

def async_test(func):
    def inner(*args, **kwargs):
        return wait_for(
            func(*args, **kwargs)
        )
    return inner

class TestAsyncMethods(unittest.TestCase):
    @async_test
    async def test_async_method(self):
        obj = Cls()
        self.assertTrue(await obj.async_func())
ASYNC STACK MIGRATION
20
def func():
    blocking_thrift_call()

## after migrating to async

async def func():
    await async_thrift_call()
IDENTIFY BLOCKING CALLS
Blocking Call Finder
• Figure out blocking call stacks and prioritize among tons of stacks
• Prioritize stacks by latency/call count
• Implementation:
• use a profiler to collect runtime stack traces
• use pygraphviz to render a graph view (a sketch follows the graph below)
21
def f():
    blocking_thrift_call()

def g():
    h()

def h():
    blocking_http_call()

def api():
    f()
    g()
[Graph view: api → f → blocking_thrift_call (20ms, 50k calls); api → g → h → blocking_http_call (9-10ms, 9-10k calls)]
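A minimal sketch of the Blocking Call Finder idea (not Instagram's tool; the edge data below is made up from the graph above): aggregate profiled caller→callee edges with latency and call counts, sort by total cost, and render them with pygraphviz so the most expensive blocking stacks stand out.

import pygraphviz

# (caller, callee) -> (avg latency in ms, call count), collected from profiling
edges = {
    ('api', 'f'): (20, 50000),
    ('f', 'blocking_thrift_call'): (20, 50000),
    ('api', 'g'): (10, 10000),
    ('g', 'h'): (9, 9000),
    ('h', 'blocking_http_call'): (9, 9000),
}

graph = pygraphviz.AGraph(directed=True)
# prioritize edges by latency * call count so the worst offenders come first
for (caller, callee), (ms, calls) in sorted(
        edges.items(), key=lambda kv: kv[1][0] * kv[1][1], reverse=True):
    graph.add_edge(caller, callee, label='%dms\n%d calls' % (ms, calls))
graph.draw('blocking_calls.png', prog='dot')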
WHEN THERE ARE TOO MANY DEPENDENCIES IN THE STACK
• Use a sync wrapper
22
SYNC
func = sync(async_func)
• Provides async and non-async versions of a given function.
• Supports classmethod, staticmethod, etc.
• Clean up the sync wrapper line after migrating all call sites to async.
23
import asyncio
import functools

def sync(async_func):
    is_classmethod = False
    if isinstance(async_func, classmethod):
        async_func = async_func.__func__
        is_classmethod = True
    elif isinstance(async_func, staticmethod):
        async_func = async_func.__func__
    if not asyncio.iscoroutinefunction(async_func):
        async_func = asyncio.coroutine(async_func)

    @functools.wraps(async_func)
    def _no_profile_sync(*args, **kwargs):
        return wait_for(async_func(*args, **kwargs))

    if is_classmethod:
        return classmethod(_no_profile_sync)
    else:
        return _no_profile_sync

func = sync(async_func)
NESTED EVENT LOOP
RuntimeError: This event loop is already running
24
[Call stack: run_until_complete() → async def f() → def g() → def h() → run_until_complete() → async def i()]
• Use a new event loop when the current loop is already running.
• Loop pool for reusing event loops (sketched below).
• Set the current event loop and running loop when a loop is already running.
• Restore the event loop after run_until_complete finishes.
import asyncio
from contextlib import contextmanager

def wait_for(coro):
    with get_event_loop() as loop:
        return loop.run_until_complete(coro)

@contextmanager
def get_event_loop():
    loop = asyncio.get_event_loop()
    if not loop.is_running():
        yield loop
    else:
        new_loop = loop_pool.borrow_loop()
        asyncio.set_event_loop(new_loop)
        running_loop = asyncio.events._get_running_loop()
        asyncio.events._set_running_loop(None)
        try:
            yield new_loop
        finally:
            loop_pool.return_loop(new_loop)
            asyncio.set_event_loop(loop)
            asyncio.events._set_running_loop(running_loop)
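loop_pool is not shown on the slide; a minimal sketch of what such a helper could look like, assuming a simple LIFO pool of spare event loops:

import asyncio

class LoopPool:
    def __init__(self):
        self._loops = []

    def borrow_loop(self):
        # reuse a spare loop if one is available, otherwise create a fresh one
        return self._loops.pop() if self._loops else asyncio.new_event_loop()

    def return_loop(self, loop):
        self._loops.append(loop)

loop_pool = LoopPool()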
RUNTIME ERROR: EVENT LOOP STOPPED BEFORE FUTURE
COMPLETED.
25
def test_run_until_complete_loop_orphan_future_close_loop(self):
    class ShowStopper(BaseException):
        pass

    async def foo(delay):
        await asyncio.sleep(delay, loop=self.loop)

    def throw():
        raise ShowStopper

    self.loop._process_events = mock.Mock()
    self.loop.call_soon(throw)
    try:
        self.loop.run_until_complete(foo(0.1))
    except ShowStopper:
        pass

    # This call fails if run_until_complete does not clean up
    # done-callback for the previous future.
    self.loop.run_until_complete(foo(0.2))
Fix in run_until_complete( )
https://github.com/python/cpython/pull/1688
GLOBAL VARIABLE ISSUE
• Execution order is not guaranteed. A shared mutable global variable may cause unexpected results.
26
# problematic: shared mutable global, tasks overwrite each other's value
var = Container()

async def f():
    var.val = await read_from_db1()
    await write_to_db1(var)

async def g():
    var.val = await read_from_db2()
    await write_to_db2(var)

async def run():
    await asyncio.gather(f(), g())
# fixed: a context variable gives each task its own value
import contextvars
var = contextvars.ContextVar('var')

async def f():
    var.set(await read_from_db1())
    await write_to_db1(var.get())

async def g():
    var.set(await read_from_db2())
    await write_to_db2(var.get())

async def run():
    await asyncio.gather(f(), g())

• ContextVar was added in Python 3.7 (the contextvars module)
GATHER DESIGN PATTERN
• To achieve the maximum concurrency
27
# maximum concurrency: gather everything; identity() fills skipped branches
async def identity(value):
    return value

async def run():
    awaitables = [
        f(),
        g() if a is True else identity(None),
        h() if b is True else identity(None),
    ]
    _, var1, var2 = await asyncio.gather(*awaitables)

# sequential version: each await waits for the previous call to finish
async def run():
    await f()
    var1 = None
    if a is True:
        var1 = await g()

    var2 = None
    if b is True:
        var2 = await h()
LINT
Provides guidance for writing better asyncio code
• Rules:
1. async functions should be named with an async_ prefix
• e.g. async_func( ) vs func( )
2. gather awaits in loops
3. warn when adding new blocking calls
• implemented with ast + flake8 (a sketch follows the example below)
28
for data in data_list:
    await async_func(data)

# use gather to run faster
await asyncio.gather(*[async_func(data) for data in data_list])
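A hedged sketch of rule 2 as an ast-based flake8 check (not Instagram's linter; the IG100 code and class name are made up, and the flake8 entry-point registration is omitted): walk the tree and flag any await inside a for loop.

import ast

class AwaitInLoopChecker:
    name = 'await-in-loop'
    version = '0.1'

    def __init__(self, tree):
        self._tree = tree

    def run(self):
        # flake8 calls run() and expects (line, col, message, type) tuples
        for node in ast.walk(self._tree):
            if isinstance(node, (ast.For, ast.AsyncFor)):
                for child in ast.walk(node):
                    if isinstance(child, ast.Await):
                        yield (
                            child.lineno,
                            child.col_offset,
                            'IG100 await inside a loop; consider asyncio.gather()',
                            type(self),
                        )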
AUTOMATION
• Many asyncio changes are simple and repetitive
• smart code modifier for asyncio adoption (a sketch follows the pipeline below):
• collect caller-callee pairs from runtime profiling and offline pyan static analysis
• modify the source code's AST
• change blocking calls to async calls
• add await
• auto-format code using isort and black
29
source code → ast → code modifier → change set → pull request
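A simplified sketch of the code modifier step (the real tool also consumes runtime profiles and pyan output; the blocking→async mapping below is made up): rewrite a known blocking call to its async counterpart and wrap it in await via an ast.NodeTransformer.

import ast

BLOCKING_TO_ASYNC = {'blocking_thrift_call': 'async_thrift_call'}

class AsyncCallRewriter(ast.NodeTransformer):
    def visit_Call(self, node):
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id in BLOCKING_TO_ASYNC:
            node.func.id = BLOCKING_TO_ASYNC[node.func.id]
            return ast.Await(value=node)   # add the missing await
        return node

source = 'async def func():\n    blocking_thrift_call()\n'
tree = AsyncCallRewriter().visit(ast.parse(source))
ast.fix_missing_locations(tree)
print(ast.unparse(tree))   # ast.unparse needs Python 3.9+; older tools used other unparsers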
CPU OVERHEAD
• Adopting asyncio could cost ~20% in CPU instructions on Instagram servers.
• CPython asyncio was slow due to the pure-Python implementation of the event loop and helpers.
• Optimization strategies:
• simplify the code and remove redundant computation
• Cython
• C API
• Available optimizations:
• uvloop: libuv + Cython bindings for the event loop (see the sketch below)
• CPython 3.6 implements Future and Task in C
• CPython 3.7 implements get_event_loop( ) in C; Future and gather( ) also became faster.
30
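A short sketch of dropping in uvloop: installing its event loop policy makes asyncio.get_event_loop() return a libuv-based loop, with no other code changes.

import asyncio
import uvloop

asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
loop = asyncio.get_event_loop()   # now a uvloop loop instead of the pure-Python one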
CUSTOM OPTIMIZATION
• Example: gather( ) -> ensure_future( ) -> isfuture/iscoroutine/isawaitable
• Reorder: check iscoroutine first (a sketch of the idea follows)
• gather( ) deduplicates awaitables using a dict; drop that assumption to skip the extra work
• Implement all helper functions with the C API
• Optimization result: reduced the overall asyncio CPU overhead by 2x (to ~10%)
31
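An illustration of the reordering idea only (not the actual CPython or Instagram patch; ensure_future_fast is a made-up name): when most gather() arguments are coroutines, checking iscoroutine() first avoids the other checks in the common case.

import asyncio

def ensure_future_fast(obj, loop=None):
    if asyncio.iscoroutine(obj):                   # common case, checked first
        loop = loop or asyncio.get_event_loop()
        return loop.create_task(obj)
    if asyncio.isfuture(obj):
        return obj
    return asyncio.ensure_future(obj, loop=loop)   # anything else: fall back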
CURRENT RESULTS
• API latency became 30% faster on the server side
• Better user engagement
• more media views
• more time spent
• Next Steps
• 100% asyncio
• concurrent request handling
32
Q&A
jimmylai@instagram.com
