@lrq3000 lrq3000 commented Sep 3, 2016

Implement what was asked in #255.

This PR implements a bare-bones tqdm with only the basic features. The goal is twofold: maximum speed, and a short, easy-to-follow implementation that other languages can use as a base for a minimalist tqdm.

I chose to keep it as a function with closures. I think this is the best way to maximize performance, but it imposes severe implementation constraints. /EDIT: reimplemented with a class-based approach in branch tqdm_bare_class, so we now have both approaches and must choose one!

/EDIT: I realized it's stupid to implement a tqdm_bare purely for performance reasons: if we strip features down to the bare minimum, the opcode count of course drops significantly, but the overhead rises. For example, dynamic_miniters wouldn't be implemented, which means that by default miniters=1. In that case, even though tqdm_bare has fewer opcodes than core tqdm, it would be far slower because it would check the time and print more often than core tqdm does! And that's to be expected: core tqdm is already heavily optimized, and most features are optional with virtually no cost.
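
To make the overhead argument concrete, here is a minimal sketch of miniters gating with a dynamic adjustment. It is not the code from this PR or from core tqdm: `progress`, `stats` and the injectable `clock` are hypothetical names, and the adjustment rule is an assumption chosen for illustration. Printing is replaced by a counter so the effect can be measured.

```python
import time


def progress(iterable, mininterval=0.1, miniters=None, clock=time.time,
             stats=None):
    """Yield items while counting clock reads and (simulated) refreshes.

    Hypothetical helper for illustration; not the tqdm_bare code.
    """
    if stats is None:
        stats = {}
    stats["clock_calls"] = 0
    stats["refreshes"] = 0
    dynamic = miniters is None   # no explicit value given: adapt on the fly
    if dynamic:
        miniters = 1
    n = last_n = 0
    last_t = clock()
    for item in iterable:
        yield item
        n += 1
        # Cheap integer comparison first: most iterations end here.
        if n - last_n >= miniters:
            now = clock()        # comparatively expensive, so done rarely
            stats["clock_calls"] += 1
            dt = now - last_t
            if dt >= mininterval:
                stats["refreshes"] += 1   # a real bar would print here
                if dynamic and dt > 0:
                    # Assumed rule: aim for about one clock read per mininterval.
                    miniters = max(1, int((n - last_n) * mininterval / dt))
                last_n, last_t = n, now
```

With miniters fixed at 1 the clock is read on every iteration; with the dynamic rule it is read on every iteration only until the first refresh, and afterwards roughly once per mininterval.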

So, since minimizing opcodes is useless, I tried to minimize code statements instead. The new goals of this module are:

  1. Implement major features of tqdm in the fastest way possible.
  2. Minimize number of code statements.
  3. Standalone (should avoid external calls, even to ._utils, except to the standard library): the file should contain all the logic necessary to reimplement it elsewhere. (As an exception, I call ._utils to load colorama, because I think it's not part of our logic but rather a workaround for Windows console limitations.)

The final objective is to have a standalone shortened tqdm containing major features as a minimal example for developers working on ports to other languages.

So here we are: tqdm_bare is 387 lines long, including docstrings, comments and blank lines, and it supports the major tqdm features that we love it for, such as automagic nested loop support!

Examples:

from tqdm import tqdm_bare, tbrange
from tqdm import tqdm_bare_class, tbcrange
from time import sleep

#######################
# Function based tqdm_bare

for i in tbrange(int(1E6), leave=True, unit='b', unit_scale=True):
    pass

for i in tbrange(10, desc='PAR', leave=True):
    for j in tbrange(100, desc='NESTED', leave=True):
        sleep(0.01)

for i in tbrange(100, desc='ITER', leave=True):
    sleep(0.1)

t = tqdm_bare(total=100, desc='MANU', leave=True)
for i in xrange(100):
    t()
    sleep(0.1)

t = tqdm_bare(desc='NOTOTAL', leave=True)
for i in xrange(100):
    t()
    sleep(0.1)

t = tbrange(10, desc='INIT', leave=True)

##########################
# Class based tqdm_bare_class

for i in tbcrange(int(1E6), leave=True, unit='b', unit_scale=True):
    pass

for i in tbcrange(10, desc='PAR', leave=True):
    for j in tbcrange(100, desc='NESTED', leave=True):
        sleep(0.01)

for i in tbcrange(100, desc='ITER', leave=True):
    sleep(0.1)

t = tqdm_bare_class(total=100, desc='MANU', leave=True)
for i in xrange(100):
    t()
    sleep(0.1)

t = tqdm_bare_class(desc='NOTOTAL', leave=True)
for i in xrange(100):
    t.update()
    sleep(0.1)

t = tbcrange(10, desc='INIT', leave=True)

Supported arguments/features (including dynamic_miniters):

def tqdm_bare(iterable=None, desc=None, total=None, leave=True,
              file=sys.stdout, width=78, mininterval=0.1,
              maxinterval=10, miniters=None, disable=False,
              unit='it', unit_scale=False,
              smoothing=0.3, initial=0, position=None,
              **kwargs):

Same for tqdm_bare_class (and same performance, etc.).

TODO:

  • Choose between tqdm_bare (function-based with closures, 387 lines, 334 SLOC) and tqdm_bare_class (object-oriented, 429 lines, 368 SLOC). Or keep both? For reference, core tqdm is 954 lines and 829 SLOC. /EDIT: both are merged in the same branch but in two different scripts, _tqdm_bare.py and _tqdm_bare_class.py. This shows two different ways of implementing tqdm in other languages, depending on whether the language is functional or object-oriented.
  • Unit tests (use the ones for core _tests_tqdm.py? Or use old ones from v1.0-2.0?). Goal is to have 100% coverage like core tqdm.
  • Flake8
  • Add some missing features that would not impact performance, like smoothing (Exponential Moving Average)? (But remember that we don't want to reinvent the wheel, we already have core tqdm).
  • Add automated nested loops support
  • Add new benchmarks in examples/simple_examples.py (or create a new benchmark.py script?)
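
For the smoothing item in the list above, an exponential moving average can be sketched in a few lines. The function name `ema` is hypothetical, and whether tqdm_bare computes its rate exactly this way is an assumption; the default of 0.3 simply mirrors the smoothing=0.3 default in the signature above.

```python
def ema(new_value, previous=None, alpha=0.3):
    """Blend a new sample into a running average.

    alpha=1 keeps only the newest sample, alpha=0 keeps only the history.
    """
    if previous is None:
        # First sample: nothing to blend with yet.
        return new_value
    return alpha * new_value + (1 - alpha) * previous


# Example: smooth a series of instantaneous rates (iterations per second).
rate = None
for sample in [10.0, 20.0, 20.0, 20.0]:
    rate = ema(sample, rate)
```

Each new sample moves the estimate 30% of the way towards itself, so a single slow or fast iteration does not make the displayed rate jump.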

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@lrq3000 lrq3000 added p4-enhancement-future 🧨 On the back burner need-feedback 📢 We need your response (question) submodule ⊂ Periphery/subclasses labels Sep 3, 2016
@coveralls

Coverage decreased (-9.2%) to 81.536% when pulling 8067fd0 on tqdm_bare into 02cbd9c on master.

@codecov-io
codecov-io commented Sep 3, 2016

Codecov Report

❗ No coverage uploaded for pull request base (master@1104d07).

@@            Coverage Diff            @@
##             master     #258   +/-   ##
=========================================
  Coverage          ?   57.76%           
=========================================
  Files             ?        9           
  Lines             ?     1018           
  Branches          ?      227           
=========================================
  Hits              ?      588           
  Misses            ?      428           
  Partials          ?        2

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1104d07...b172028.

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@coveralls

Coverage decreased (-16.6%) to 74.22% when pulling 3e224a5 on tqdm_bare into 02cbd9c on master.

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@lrq3000 lrq3000 changed the title Add tqdm_bare and tbrange #255 Add tqdm_bare and tbrange as minimal example for ports developers Sep 4, 2016
@coveralls
coveralls commented Sep 4, 2016

Coverage decreased (-16.7%) to 74.098% when pulling bc34ca3 on tqdm_bare into 02cbd9c on master.

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@lrq3000
Member Author

lrq3000 commented Sep 4, 2016

I'll stop here; I'll add unit tests if enough people are interested.

(Adding unit tests shouldn't be too hard because we can reuse the tests of core tqdm. Some specs changed, though, either because I stripped out some features and checks, or because something wasn't possible without a class, such as moving and refreshing the display of nested bars when a bar closes. The spec here is that new bars fill up empty slots or, if that's not possible, are appended below.)
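
A minimal sketch of that placement rule, assuming the occupied rows are tracked in a plain set. The names `take_position` and `release_position` are hypothetical and not taken from the PR:

```python
def take_position(used):
    """Return the lowest free row index and mark it as occupied."""
    pos = 0
    while pos in used:
        pos += 1          # skip rows that still hold a live bar
    used.add(pos)
    return pos


def release_position(used, pos):
    """Free a row when its bar closes; other bars are not moved."""
    used.discard(pos)


rows = set()
outer = take_position(rows)       # first bar gets row 0
inner = take_position(rows)       # nested bar gets row 1
release_position(rows, outer)     # row 0 becomes an empty slot
reused = take_position(rows)      # a new bar fills the empty slot
appended = take_position(rows)    # no gap left, so it goes below
```

Because existing bars never move, a closing bar only leaves a gap that the next new bar can take.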

@coveralls

Coverage decreased (-17.04%) to 73.736% when pulling ba23d59 on tqdm_bare into 02cbd9c on master.

…ine update code in iterable mode (optimization)

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@coveralls
coveralls commented Sep 5, 2016

Coverage decreased (-20.7%) to 70.078% when pulling e09ecc5 on tqdm_bare into 02cbd9c on master.

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@coveralls
coveralls commented Sep 5, 2016

Coverage decreased (-22.0%) to 68.798% when pulling 1a66525 on tqdm_bare into 02cbd9c on master.

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@coveralls

Coverage decreased (-22.8%) to 67.97% when pulling c68638b on tqdm_bare into 02cbd9c on master.

@lrq3000
Member Author

lrq3000 commented Sep 5, 2016

I ran some benchmarks against different versions of tqdm:

bare: 100%|#################|100000000/100000000 [00:20<00:00, 5079080.71it/s]
v1.0: |##########| 100000000/100000000 100% [elapsed: 00:19 left: 00:00, 5175179.85 iters/sec]
noclass: 100%|##########| 100000000/100000000 [00:19<00:00, 5159426.28 it/s]
class: 100%|##########| 100000000/100000000 [00:20<00:00, 4893086.06it/s]
core: 100%|#################| 100000000/100000000 [00:19<00:00, 5140595.26it/s]
  • Bare is this PR.
  • v1.0 is the original tqdm.
  • noclass is tqdm v1.0 with optimizations (status_printer is not a class anymore but a function with closures), just before the switch to a class-based approach in v2.0.
  • class (tqdm v2.0) is when we transitioned the original tqdm to a class-based approach instead of a function-based one with closures. I took a commit where the class-based approach was fixed to avoid the slowdown (by inlining the same code in __iter__() and update()).
  • core is current core tqdm.

All bars were benchmarked with trange(int(1e8), miniters=4500000, mininterval=0.1, desc='bare', leave='True'). Note that without miniters=4500000, only core tqdm and bare tqdm still achieve reasonable performance, at about 4500000 it/s, whereas all the others drop significantly.
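
For anyone who wants to repeat this kind of measurement, a minimal timing harness could look like the sketch below. It is not the script used for the figures above; `iterations_per_second` and its `wrap` parameter are hypothetical names, and `time.perf_counter` requires Python 3.3 or later.

```python
import time


def iterations_per_second(wrap, n=10**6):
    """Time an empty loop over wrap(range(n)) and return the rate.

    `wrap` is any callable that takes an iterable and returns an iterable,
    for example `iter` as a no-bar baseline or a progress-bar constructor.
    """
    start = time.perf_counter()
    for _ in wrap(range(n)):
        pass
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return n / max(elapsed, 1e-12)


# Baseline without any progress bar.
baseline = iterations_per_second(iter, n=100000)
```

A bar can then be passed as, say, `lambda it: tqdm(it, miniters=4500000, mininterval=0.1)` and compared against the baseline; repeating each run a few times gives an idea of the spread.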

So as we can see, tqdm_bare is roughly as fast as the other bars, except class (tqdm v2.0), which is significantly slower than tqdm_bare.

Note that core tqdm is still slightly faster than tqdm_bare, which shows once again that we did a pretty good job at optimizing core tqdm. I don't know why this is so, since there are technically fewer opcodes; my guess is that the function-based approach with closures is slower than a class-based approach when we implement some of the major features of tqdm.

I'm going to try to reimplement tqdm_bare with a class based approach and see if it speeds up a bit (because if that's the case, a class-based approach is anyway easier to read so that would be better all around).

@CrazyPython
Contributor

CrazyPython commented Sep 5, 2016

@lrq3000 so noclass is slower than v1.0?

@casperdcl
Member

casperdcl commented Sep 5, 2016

v1.0 was slooooooow. easy to beat

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@CrazyPython
Contributor

@casperdcl I mean slower

@CrazyPython
Contributor

CrazyPython commented Sep 5, 2016

For comparison:

bare    5079080.71
v1.0    5175179.85
noclass 5159426.28
class   4893086.06
core    5140595.26

@casperdcl v1.0 was the fastest in all of them. And @lrq3000 optimizations apparently made it slower. Slowest is class.
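
To put those gaps in proportion, here is a quick sketch that expresses each figure relative to the fastest one. The rates are copied from the table above; the variable names are only illustrative.

```python
# Iterations per second, as reported in the benchmark above.
rates = {
    "bare": 5079080.71,
    "v1.0": 5175179.85,
    "noclass": 5159426.28,
    "class": 4893086.06,
    "core": 5140595.26,
}

fastest = max(rates.values())
# Percentage difference from the fastest bar (0 for the fastest itself).
relative = {name: 100.0 * (rate / fastest - 1.0)
            for name, rate in rates.items()}

for name in sorted(rates, key=rates.get, reverse=True):
    print("%-8s %12.2f it/s  %+.2f%%" % (name, rates[name], relative[name]))
```

On these figures class comes out roughly 5% behind v1.0, while bare, noclass and core are all within about 2% of it.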

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@coveralls

Coverage decreased (-23.2%) to 67.564% when pulling 48b4423 on tqdm_bare into 02cbd9c on master.

@lrq3000
Member Author

lrq3000 commented Sep 18, 2016

I also now think we should keep both, because the new goal of this submodule is to provide a simplified version for porting to other languages, and non-object-oriented languages will benefit from tqdm_bare whereas object oriented languages can implement tqdm_bare_class.
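
To illustrate the difference for port authors, here is a minimal sketch of the same counter written both ways. It is far smaller than either script in this PR, and `make_counter`, `Counter` and `format_status` are hypothetical names, not the PR's API.

```python
def format_status(n, total):
    """Shared formatting so both flavours produce identical text."""
    return "%d/%d" % (n, total) if total else "%d" % n


def make_counter(total=None):
    """Closure flavour: state lives in variables captured by `update`."""
    state = {"n": 0}  # a dict, so this also works without `nonlocal`

    def update(step=1):
        state["n"] += step
        return format_status(state["n"], total)

    return update


class Counter(object):
    """Class flavour: the same state lives on the instance."""

    def __init__(self, total=None):
        self.n = 0
        self.total = total

    def update(self, step=1):
        self.n += step
        return format_status(self.n, self.total)


tick = make_counter(total=4)   # called as tick()
bar = Counter(total=4)         # called as bar.update()
```

The closure version maps naturally onto languages with first-class functions, while the class version maps onto languages built around objects.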

What do you think about it @casperdcl ? Should we keep both?

Signed-off-by: Stephen L. <lrq3000@gmail.com>
# Conflicts:
#	tqdm/__init__.py

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@lrq3000
Member Author

lrq3000 commented Oct 17, 2016

I merged both into the same branch; it will be easier to review, compare and test.

@casperdcl casperdcl force-pushed the master branch 4 times, most recently from 8cade97 to a65e347 on October 31, 2016 02:34
lrq3000 added a commit that referenced this pull request Oct 31, 2016
… imports interaction fix #176 #245 #188 + speed boost fix #258)

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@casperdcl
Member

oh yes, nice. I'd agree that we should have both a class and a functional approach for future ref. Probably would merge this in after the next major release since it doesn't actually add features though.

@casperdcl casperdcl added this to the >5 milestone Nov 12, 2016
@lrq3000
Member Author

lrq3000 commented Nov 12, 2016

OK great :) Yes, every new submodule should be merged after the new submodule architecture (which is the next major release).
