
Conversation

lrq3000
Member

@lrq3000 lrq3000 commented Nov 29, 2015

Should fix #59, #100 and #103 by using relative (CPU) timing (time.process_time() for Py3.3+ or time.clock() for Py2) instead of absolute timing (time.time()). This should make it possible to run the time-dependent unit tests reliably, even if the Travis server suddenly becomes overloaded in the middle of a test.

Signed-off-by: Stephen L. lrq3000@gmail.com

@codecov-io

Current coverage is 100.00%

Merging #60 into master will not affect coverage as of e72a08f

@@            master     #60   diff @@
======================================
  Files            5       5       
  Stmts          242     244     +2
  Branches        49      49       
  Methods          0       0       
======================================
+ Hit            242     244     +2
  Partial          0       0       
  Missed           0       0       

Review entire Coverage Diff as of e72a08f

Powered by Codecov. Updated on successful CI builds.

@lrq3000
Member Author

lrq3000 commented Dec 16, 2015

This patch is complete for me. It's not perfect, but it's certainly more reliable than what we have now, so I think we can merge this until we get a better idea.

@casperdcl casperdcl self-assigned this Dec 16, 2015
@casperdcl
Member

Can you check the failing tests @lrq3000? I've removed cputime from the API as it looks bad and unclean.

@casperdcl casperdcl assigned lrq3000 and unassigned casperdcl Dec 17, 2015
@lrq3000
Member Author

lrq3000 commented Dec 18, 2015

I will take a look, but why did you remove cputime? It was supposed to ensure that tqdm uses relative timing when the unit tests provide it (I know, that's not very elegant, but I have yet to find a better way...).

@casperdcl
Member

I removed it from the api and manually set it when unit testing :)

@lrq3000 lrq3000 force-pushed the relative-timing branch 2 times, most recently from e0f9ba7 to 096b678 Compare December 25, 2015 17:26
@lrq3000
Member Author

lrq3000 commented Dec 25, 2015

I have been investigating this issue for several hours now, and I cannot find the culprit. I cannot reproduce it locally; it seems to happen only on Travis, and debug printing suggests it is caused by a low time resolution. As for why, I really don't know. The interesting thing is that this resolution issue does not seem to happen with the old way (with cputime), so somehow the time resolution can be degraded by the monkey patching.

So I suggest we stay with cputime. And even if I see it more as an internal parameter, it can also be useful for some users if they don't want their ETA to be dependent on other processes load.

@lrq3000
Member Author

lrq3000 commented Dec 25, 2015

Proof that it's a time resolution issue, but only when using clock():

from time import process_time
import time

def cpu_timify(t):
    process_time()  # call once to initialize the clock
    t._time = process_time
    t.start_t = t.last_print_t = t._time()
    print("%s | %s | %s | %s" % (repr(t.start_t), repr(process_time()),
                                 repr(time.clock()), repr(time.time())))

Result:

0.19 | 0.19 | 0.19 | 1451069514.610997

The low time resolution completely throws off all the tests based on rate variations.
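A clock's effective resolution can be probed directly with a small helper like this (the helper name and approach are mine, not from the PR; a coarse result here would explain the identical 0.19 values above):

```python
import time

def effective_resolution(clock, samples=3):
    """Estimate a clock's effective resolution as the smallest positive
    increment observed between successive calls (busy-waits for a tick)."""
    smallest = float("inf")
    for _ in range(samples):
        t1 = clock()
        t2 = clock()
        while t2 <= t1:  # spin until the clock actually ticks
            t2 = clock()
        smallest = min(smallest, t2 - t1)
    return smallest
```

A clock that only ticks every 10 ms or more cannot distinguish fast iterations, which would defeat any rate-based assertion in the tests.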

@lrq3000
Member Author

lrq3000 commented Dec 25, 2015

I'm trying another possible way to fix this issue. Please wait until then.

@lrq3000
Member Author

lrq3000 commented Dec 26, 2015

I hereby declare: NO MORE OF THESE FREAKING TIMING ISSUES! (sorry about the caps, but these bugs were getting on my nerves ;) ).

So, thanks to your great tqdm._time monkey patching idea, I developed this idea further and made a virtual discrete time manager, so we no longer have to care about timing issues in tests_tqdm.py, as long as we use DiscreteTimer.sleep() and cpu_timify(tqdm).

However, this idea is not implementable in tests_perf.py unless we find a way to monkey patch the Python interpreter to enumerate the instructions of every Python statement in the tests and count them (which amounts to saying that each instruction costs 1 CPU cycle: optimistic, but at least it would be stable). This might be possible by scanning the AST, but it's a bit too much for my current py-fu :)

Maybe todo:

  • We could simplify some tests: where we used to set tricky values for sleep and mininterval, we can now use simple, theoretically defined values, and we no longer need to care about the time resolution (nor the CPU cost!) since it is managed virtually by DiscreteTimer.
  • cputime should maybe be reinstated in tqdm, not for unit testing but for users: some very demanding users may find it useful to run tqdm without any interference from external software. I don't have a particular scenario in mind, so we can of course wait for someone to ask about it first :) (but right now we have the code ready, so why not use it!).
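A minimal sketch of what such a discrete virtual timer can look like (a hypothetical reconstruction; the actual helpers live in tests_tqdm.py and may differ):

```python
class DiscreteTimer(object):
    """Virtual discrete clock: time only advances when sleep() is called,
    so time-dependent tests become deterministic and instantaneous."""

    def __init__(self):
        self.t = 0.0

    def sleep(self, dt):
        """Advance virtual time by dt seconds without really sleeping."""
        self.t += dt

    def time(self):
        """Return the current virtual time."""
        return self.t


def cpu_timify(t, timer):
    """Monkey patch a tqdm instance so it reads time from `timer`."""
    t._time = timer.time
    t.start_t = t.last_print_t = t._time()
    return timer
```

Because the clock only moves when the test says so, rate and mininterval assertions see exactly the intervals the test specifies, regardless of machine load.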

@lrq3000
Member Author

lrq3000 commented Dec 26, 2015

@casperdcl What do you think about cputime? About simplifying some tests, we can maybe do that later; the tests run fine right now, so this is not a necessity, just cosmetic :)

Ah and I forgot to tell you that I have a weird issue since you added tqdm/*.py instead of tqdm/ in the tox.ini file:

Running command: flake8 --max-line-length=80 --count --statistics --exit-zero tqdm/*.py
tqdm/*.py:1:1: E902 IOError: [Errno 22] invalid mode ('rU') or filename: 'tqdm/*.py'
1       E902 IOError: [Errno 22] invalid mode ('rU') or filename: 'tqdm/*.py'
1
Running command: flake8 --max-line-length=80 --count --statistics --exit-zero examples/
Running command: flake8 --max-line-length=80 --count --statistics --exit-zero *.py
*.py:1:1: E902 IOError: [Errno 22] invalid mode ('rU') or filename: '*.py'
1       E902 IOError: [Errno 22] invalid mode ('rU') or filename: '*.py'
1
Running command: flake8 --max-line-length=80 --count --statistics --exit-zero --ignore=E731 tqdm/tests/

Should we revert back?

@lrq3000
Member Author

lrq3000 commented Dec 26, 2015

Ok, I'm crazy today: I'm gonna try to disassemble code on the fly to count the number of instructions and make a virtual timing environment for the performance tests!

@lrq3000
Member Author

lrq3000 commented Dec 26, 2015

The opcode comparison is not ready yet, do not merge! The call is not recursive; I'm gonna try to fix that.

@casperdcl
Member

whoa nice work Stephen. I didn't quite understand why this doesn't work for perf tests...

@casperdcl
Member

Re: flake8 on windows, maybe open a new issue? I had changed to wildcards to stop recursive addition of directories. The new method requires less specification of individual file names (the old version would also fail by e.g. including non-timing tests in timed ones or vice-versa I think).

@lrq3000
Member Author

lrq3000 commented Dec 26, 2015

@casperdcl Ok about opening an issue for flake8 on Windows, I'll do that when this PR is merged.

About DiscreteTimer not working for tests_perf.py: this timer only creates a virtual timing environment. That works for tests_tqdm.py, where we can trick tqdm with fake time information, since we only care about what tqdm does with that information to compute the rate, adjust internal parameters, etc. But tests_perf.py is the other way around: we want to measure the time consumed by each step inside tqdm, so we don't just want to feed in fake time information, we also need to receive real timing info. So this is quite a different issue.

For the moment, I have three main leads, all involving analysis of the bytecode opcodes:

  • Using a recursive dis (see here, here and here, and my own enhanced version here). This works, but it analyses the static bytecode, so the whole tqdm source code gets counted, including every branch of every if statement. This wouldn't really help: the result would be an unrealistically high opcode count, since our many conditional branches are of course not all executed.
  • Using a bytecode tracer. This one is tough: the idea is basically to do statement coverage, running a function and then keeping only the bytecode that was actually executed. I have some leads here and here, and the best lead here.
  • Another possibility if we want to go really crazy: use a virtual Python bytecode runner. We would then have full control over what happens inside.
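For reference, the "bytecode tracer" lead can nowadays be sketched with sys.settrace and per-frame opcode tracing. Note this uses frame.f_trace_opcodes, a Python 3.7+ feature that did not exist at the time of this PR, and the helper name is mine:

```python
import sys

def count_executed_opcodes(func, *args, **kwargs):
    """Count opcodes actually executed while func runs: dynamic rather
    than static bytecode analysis, so untaken branches are not counted."""
    counter = {"n": 0}

    def tracer(frame, event, arg):
        frame.f_trace_opcodes = True  # request per-opcode trace events
        if event == "opcode":
            counter["n"] += 1
        return tracer  # keep tracing inside this frame

    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(None)
    return counter["n"], result
```

This is exactly the "statement coverage" idea above, applied at opcode granularity: only the code paths that actually run contribute to the count.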

@lrq3000
Member Author

lrq3000 commented Dec 26, 2015

For later: try to use a with-block style to inline the code under test (avoiding having to put the code to trace inside functions), see here.
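A hedged sketch of what such a with-block could look like (my own reconstruction, not necessarily what ended up in tests_perf.py):

```python
import time
from contextlib import contextmanager

@contextmanager
def relative_timer():
    """Inline timing block: the code under test stays at the call site
    instead of being wrapped in a function just to be timed."""
    start = time.process_time()
    elapser = lambda: time.process_time() - start
    yield lambda: elapser()  # caller gets a live elapsed-time reader
    spent = time.process_time() - start
    elapser = lambda: spent  # freeze the result once the block exits
```

Usage would look like:

```python
with relative_timer() as elapsed:
    sum(range(100000))  # code under test, inline
elapsed()  # CPU time spent inside the block, frozen after exit
```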

@lrq3000
Member Author

lrq3000 commented Dec 27, 2015

[Redacted]

@lrq3000 lrq3000 added the help wanted 🙏 We need you (discussion or implementation) label Dec 27, 2015
@casperdcl casperdcl force-pushed the master branch 2 times, most recently from 7c9fe37 to 5af1135 Compare December 27, 2015 17:11
@lrq3000
Member Author

lrq3000 commented Dec 29, 2015

I think it's time to merge this PR into the main branch: we now have a much better framework for time testing and perf testing.

I removed the bytecode opcode dynamic analysis and moved it into a new branch and PR #90, we'll see what we can do with that in the future.

To summarize what has been done so far with this PR:

In tests_tqdm.py:

  • Virtual discrete timer for all tests requiring time. Two benefits: it always returns the same result regardless of the system's load, and it's very fast since we don't really sleep.

In tests_perf.py:

  • Relative timing (cputime), so these tests should not be influenced by the system's load.
  • Automatic detection of cputime availability: checks whether process_time() or clock() really returns a process-wide (CPU) time and not a system-wide time (the check uses a sleep(): if cputime is available, the time slept should not increase the cputime).
  • Harder tests: we test tqdm not only with default arguments but also with maximum resource consumption (i.e., mininterval=0, miniters=1, etc., so that tqdm has to refresh the progress bar at every iteration).
  • A MockFileNoWrite class to avoid biasing the perf tests with I/O.
  • Tests against a simple progress bar, which lets us objectively measure the overhead against the most optimized bar (which of course has fewer features than tqdm).
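The cputime availability check in the second bullet can be sketched like this (the function name is hypothetical; only the sleep-based idea comes from the PR):

```python
import time

def check_cpu_time(sleeptime=0.2):
    """Return True if time.process_time() behaves like a per-process CPU
    clock: sleeping should not (noticeably) advance it."""
    start = time.process_time()
    time.sleep(sleeptime)
    cpu_elapsed = time.process_time() - start
    # A wall clock would have advanced by roughly sleeptime here.
    return cpu_elapsed < sleeptime / 2.0
```

If the check fails, the platform's clock is effectively wall time, and the perf tests should be skipped rather than produce load-dependent results.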

@lrq3000
Member Author

lrq3000 commented Dec 29, 2015

@casperdcl And sorry about the messy history; please rebase as you want :) (i.e., you can remove all commits related to bytecode analysis and squash a few others together).

@casperdcl
Member

@lrq3000 np messy histories are fine if they're just in branches :) will look at this later

@lrq3000
Member Author

lrq3000 commented Dec 29, 2015

@casperdcl yes lol I noticed how great it is to use branches alongside a protected master. I think I'll do that with all my projects from now on!

@lrq3000 lrq3000 removed the help wanted 🙏 We need you (discussion or implementation) label Dec 29, 2015
Signed-off-by: Stephen L. <lrq3000@gmail.com>
…mock StringIO object + add hard perf tests

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@lrq3000 lrq3000 force-pushed the relative-timing branch 2 times, most recently from a424b1c to 20775cd Compare January 5, 2016 20:22
…tly works, else skip + Fix unit tests

Signed-off-by: Stephen L. <lrq3000@gmail.com>
@lrq3000
Member Author

lrq3000 commented Jan 5, 2016

I rebased it myself, since I know the code changes of this PR best. The PR should now be clean and ready to merge (I checked the changes commit by commit; useless code such as the bytecode tracer was removed).

@lrq3000 lrq3000 changed the title relative timing for unit tests relative timing for unit tests (rebased) Jan 5, 2016
@lrq3000 lrq3000 mentioned this pull request Jan 6, 2016
@lrq3000
Member Author

lrq3000 commented Jan 9, 2016

@casperdcl Should I proceed with merging this PR and the IPython PR? I think they are both OK.

@lrq3000
Member Author

lrq3000 commented Jan 9, 2016

#101

@casperdcl
Member

@lrq3000 thanks, I'll look at them now you've finished.

@casperdcl casperdcl assigned casperdcl and unassigned lrq3000 Jan 9, 2016
@lrq3000
Member Author

lrq3000 commented Jan 9, 2016

Ok, thanks :)

@casperdcl casperdcl added this to the 3.7.0 milestone Jan 10, 2016
casperdcl added a commit that referenced this pull request Jan 12, 2016
relative timing for unit tests (rebased)
@casperdcl casperdcl merged commit 2efcd5d into master Jan 12, 2016
@casperdcl casperdcl deleted the relative-timing branch January 12, 2016 23:29
Successfully merging this pull request may close these issues.

Consistent performance tests