
casperdcl
Member

@casperdcl casperdcl commented Jul 11, 2020

Would love to call this tqdm.async but unfortunately async is a reserved keyword.

  • choose a name: asyncio coroutine, asynchronous, ...
  • support iterables
  • support async iterables
  • support coroutine.send
  • add as_completed wrapper
  • add async context manager: why? I don't see any use case (async with tqdm() as t?)
    • support async update: why? I don't see any use case (await update()? lol...)
  • confirm that there's no better/more correct way to wrap coroutines/async generators
  • test properly
    • basic tests
    • fix tests
    • mock awaitables, etc...
  • add/update documentation
    • inline
    • readme
    • examples/ folder

Examples:

import asyncio
from tqdm.asyncio import tqdm, trange
def count(start=0, step=1):
    i = start
    while True:
        new_start = yield i
        if new_start is None:
            i += step
        else:
            i = new_start
async def main():
    N = int(1e6)
    async for row in tqdm(trange(N, desc="inner"), desc="outer"):
        if row >= N:
            break
    with tqdm(count(), desc="coroutine", total=N + 2) as pbar:
        async for row in pbar:
            if row == N:
                pbar.send(-10)
            elif row < 0:
                assert row == -9
                break
    # should be under 10 seconds
    for i in tqdm.as_completed(list(map(asyncio.sleep, [1] * 10)),
                               desc="as_completed"):
        await i
asyncio.run(main())  # Python 3.7+

inner: 100%|███████████████| 1000000/1000000 [00:01<00:00, 928957.82it/s]
outer: 100%|███████████████| 1000000/1000000 [00:01<00:00, 928975.72it/s]
coroutine: 100%|██████████| 1000002/1000002 [00:00<00:00, 1483744.37it/s]
as_completed: 100%|██████████████████████| 10/10 [00:01<00:00,  9.97it/s]
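
For readers unfamiliar with the as_completed pattern used in the example above, here is a minimal sketch using only the standard library. It is not tqdm's implementation: the `done` counter stands in for a progress bar's `update()`, and `DELAY` is a hypothetical shortened sleep so the sketch finishes quickly. It shows why ten concurrent sleeps take roughly one delay rather than ten.

```python
import asyncio

DELAY = 0.2  # hypothetical: shorter than the 1 s sleeps in the example above


async def main():
    # Ten sleeps scheduled concurrently; each returns its index when done.
    sleeps = [asyncio.sleep(DELAY, result=i) for i in range(10)]
    loop = asyncio.get_running_loop()
    start = loop.time()
    done = 0
    for fut in asyncio.as_completed(sleeps):
        await fut   # resolves in completion order, not submission order
        done += 1   # a progress bar would call update() here
    return done, loop.time() - start


done, elapsed = asyncio.run(main())
print(f"{done} awaitables finished in {elapsed:.2f}s")
```

Because the sleeps overlap, the elapsed time should stay close to `DELAY`, which matches the "should be under 10 seconds" comment in the example.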

CC other participants: @Midnighter @sametmax @malarinv @Ekevoo
CC 👍 reactors: @AdrienPensart @supermodo @lukaville @mikulas-mrva @WestXu @SCoder12 @malarinv

Disclaimer: I wrap C++ and/or CUDA anytime I need real concurrent speed, but don't use async Python much so don't know if I'm missing something. Feedback would be appreciated ❤️

@casperdcl casperdcl self-assigned this Jul 11, 2020
@casperdcl casperdcl added c8-hard 🕗 Complexity high, help wanted 🙏 We need you (discussion or implementation), p3-enhancement 🔥 Much new such feature, question/docs ‽ Documentation clarification candidate, submodule ⊂ Periphery/subclasses, synchronisation ⇶ Multi-thread/processing, to-fix ⌛ In progress labels Jul 11, 2020
@casperdcl casperdcl added this to the Non-breaking milestone Jul 11, 2020
@codecov-commenter

codecov-commenter commented Jul 11, 2020

Codecov Report

Merging #1004 into devel will increase coverage by 0.25%.
The diff coverage is 95.23%.

@@            Coverage Diff             @@
##            devel    #1004      +/-   ##
==========================================
+ Coverage   87.16%   87.42%   +0.25%     
==========================================
  Files          21       22       +1     
  Lines        1278     1320      +42     
  Branches      217      223       +6     
==========================================
+ Hits         1114     1154      +40     
  Misses        143      143              
- Partials       21       23       +2     

@casperdcl casperdcl changed the title add tqdm.coroutine add tqdm.asyncio Jul 11, 2020
@Midnighter

I have one question about your design. If I understand your code correctly, tqdm_asyncio presents an asynchronous interface even for synchronous iterables. You went through some extra work to make that possible but is it desirable? I would be happy to use a different async progress class but it certainly is convenient to not have to care whether my iterable is async or sync.
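
To make the design question concrete, here is a rough sketch of how a single wrapper can expose an asynchronous interface over both synchronous and asynchronous iterables. `AsyncCounter` and `agen` are hypothetical names for illustration and this is not the code in this PR; the assumption is simply that the wrapper checks for `__aiter__` and otherwise falls back to the plain iterator protocol.

```python
import asyncio


class AsyncCounter:
    """Hypothetical wrapper (not tqdm's class): counts items from any iterable."""

    def __init__(self, iterable):
        self.n = 0
        if hasattr(iterable, "__aiter__"):
            self._ait = iterable.__aiter__()  # genuinely asynchronous source
            self._it = None
        else:
            self._ait = None
            self._it = iter(iterable)         # synchronous source

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self._ait is not None:
            # Propagates StopAsyncIteration when the source is exhausted.
            item = await self._ait.__anext__()
        else:
            try:
                item = next(self._it)
            except StopIteration:
                # Translate the sync end-of-iteration signal to the async one.
                raise StopAsyncIteration from None
        self.n += 1  # a progress bar would update its display here
        return item


async def agen(n):
    for i in range(n):
        await asyncio.sleep(0)
        yield i


async def main():
    sync_items = [x async for x in AsyncCounter(range(3))]
    async_items = [x async for x in AsyncCounter(agen(3))]
    return sync_items, async_items


sync_items, async_items = asyncio.run(main())
print(sync_items, async_items)
```

The caller writes `async for` in both cases, which is the convenience discussed here; the cost is that a synchronous source is still consumed without yielding control to the event loop.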

@casperdcl
Member Author

casperdcl commented Jul 12, 2020

> it certainly is convenient to not have to care whether my iterable is async or sync

yes exactly. This PR aims to address multiple issues for ease of use.

Also as I understand async and coroutine are actually two different things - there are differently-implemented "coroutines" in python2, for example. Would this be of interest? Maybe just in documentation?

from tqdm.auto import tqdm as tqdm_auto


# based on http://www.dabeaz.com/coroutines/copipe.py
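def autonext(func):
    """Prime a generator-based coroutine so it is ready to receive send()."""
    def inner(*args, **kwargs):
        res = func(*args, **kwargs)
        next(res)  # works on Python 2.6+ and 3, unlike res.next()
        return res
    return inner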

class tqdm_coro(tqdm_auto):
    @classmethod
    @autonext
    def pipe(cls, target, **tkwargs):
        """
        Python2 coroutine-style pipe.

        This:
            r = receiver()
            p = producer(r)
            r.next()
            p.next()
        Becomes:
            r = receiver()
            t = tqdm.pipe(r)
            p = producer(t)
            r.next()
            p.next()
        """
        with cls(**tkwargs) as pbar:
            while True:
                obj = (yield)
                target.send(obj)
                pbar.update()

def source(target):
    for i in ["foo", "bar", "baz", "pythonista", "python", "py"]:
        target.send(i)

@autonext
def grep(pattern, target):
    while True:
        line = (yield)
        if pattern in line:
            target.send(line)

@autonext
def sink():
    while True:
        line = (yield)
        tqdm_coro.write(line)

source(
    tqdm_coro.pipe(
        grep('python',
            sink())))

@Midnighter

> Also as I understand async and coroutine are actually two different things - there are differently-implemented "coroutines" in python2, for example. Would this be of interest? Maybe just in documentation?

I don't have the bandwidth in my own projects so I started only supporting Python 3.6+. Looking at the classifiers on PyPI, you go as far back as supporting 2.6 which is pretty tough I think.

As far as I understand the topic so far, all asynchronous functions are coroutines. The syntax simply changed from the @asyncio.coroutine decorator to the (in my opinion) cleaner async keyword. It's possible that there are multiple implementations in Python 2 since asyncio only became widely usable in Python 3.5.

@casperdcl
Member Author

I'm talking about something which predates even @asyncio.coroutine.

In python2, the word coroutine referred to a type of generator object driven by send() rather than next(), and had nothing to do with asynchronicity.
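
As a small illustration of that distinction (a sketch with hypothetical names `running_total` and `native`, written for Python 3), a generator-based coroutine is primed once and then fed values with send(), with no event loop involved. The `inspect` checks show that it is a different kind of object from what `async def` produces.

```python
import inspect


def running_total():
    """Generator-based coroutine: receives values via send(), yields the sum so far."""
    total = 0
    while True:
        value = yield total
        total += value


acc = running_total()
next(acc)                                   # prime: advance to the first yield
totals = [acc.send(v) for v in (1, 2, 3)]   # driven by send(), not by iteration
acc.close()


async def native():
    return 42


coro = native()
kinds = (
    inspect.isgenerator(acc), inspect.iscoroutine(acc),
    inspect.isgenerator(coro), inspect.iscoroutine(coro),
)
coro.close()  # never awaited in this sketch; close it to avoid a warning
print(totals, kinds)  # [1, 3, 6] (True, False, False, True)
```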

@casperdcl casperdcl changed the base branch from master to devel July 16, 2020 20:04
@casperdcl casperdcl merged commit cb60ea9 into devel Jul 16, 2020
@casperdcl casperdcl added to-merge ↰ Imminent and removed to-fix ⌛ In progress labels Jul 16, 2020
@casperdcl casperdcl mentioned this pull request Jul 16, 2020
@Midnighter

Midnighter commented Jul 16, 2020

Thank you for making this happen @casperdcl 🙂

@casperdcl casperdcl deleted the coroutines branch July 16, 2020 21:56
@casperdcl
Member Author

casperdcl commented Jul 16, 2020

whew all done in tqdm>=4.48.0 https://tqdm.github.io/releases/#v4480-2020-07-16

Development

Successfully merging this pull request may close these issues.

Asynchronous tqdm
5 participants