tests: Provide more helpful assert_equal errors #29514

ajtowns · 2024-02-29T10:39:19Z

In the functional tests, we often compare dicts with assert_equal, but the output makes it very hard to tell exactly which entry in the dicts don't match when there are a lot of entries and only minor differences. Change the output to make it clearer.

DrahtBot · 2024-02-29T10:42:58Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	vasild, brunoerg, josibake, BrandonOdiwuor, achow101
Concept ACK	Empact, ismaelsadeeq

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#26573 (Wallet: don't underestimate the fees when spending a Taproot output by darosior)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

In the functional tests, we often compare dicts with assert_equal, but the output makes it very hard to tell exactly which entry in the dicts don't match when there are a lot of entries and only minor differences. Change the output to make it clearer.

ajtowns · 2024-02-29T10:47:54Z

As an example, if I add some bugs to rpc_blockchain.py:

--- a/test/functional/rpc_blockchain.py
+++ b/test/functional/rpc_blockchain.py
@@ -200,7 +200,7 @@ class BlockchainTest(BitcoinTestFramework):
           "deployments": {
             'bip34': {'type': 'buried', 'active': True, 'height': 2},
             'bip66': {'type': 'buried', 'active': True, 'height': 3},
-            'bip65': {'type': 'buried', 'active': True, 'height': 4},
+            'bip65': False,
             'csv': {'type': 'buried', 'active': True, 'height': 5},
             'segwit': {'type': 'buried', 'active': True, 'height': 6},
             'testdummy': {
@@ -221,6 +221,7 @@ class BlockchainTest(BitcoinTestFramework):
                         'possible': True,
                     },
                     'signalling': '#'*(height-143),
+                    'dummy': True,
                 },
                 'active': False
             },
@@ -231,11 +232,11 @@ class BlockchainTest(BitcoinTestFramework):
                     'timeout': 9223372036854775807,
                     'min_activation_height': 0,
                     'status': 'active',
-                    'status_next': 'active',
-                    'since': 0,
+                    'status_next': 'failed',
+                    'since': 7,
                 },
                 'height': 0,
-                'active': True
+                'active': False,
             }
           }
         })

Then with this patch, I get the additional error information:

in particular not({'deployments': {'bip65': {'type': 'buried', 'active': True, 'height': 4}, 'taproot': {'active': True, 'bip9': {'since': 0, 'status_next': 'active'}}, 'testdummy': {'bip9': {}}}} == {'deployments': {'bip65': False, 'taproot': {'active': False, 'bip9': {'since': 7, 'status_next': 'failed'}}, 'testdummy': {'bip9': {'dummy': True}}}})

which only tells me about each error.

vasild

ACK 89b586b

Tested this myself:

AssertionError: not({'id': 2, 'network': 'not_publicly_routable', 'services': '0000000000000000', 'servicesnames': [], 'relaytxes': False, 'lastsend': 1709203943, 'lastrecv': 1709203943, 'last_transaction': 0, 'last_block': 0, 'conntime': 1709203943, 'timeoffset': 0, 'version': 0, 'subver': '', 'inbound': True, 'bip152_hb_to': False, 'bip152_hb_from': False, 'startingheight': -1, 'presynced_headers': -1, 'synced_headers': -1, 'synced_blocks': -1, 'inflight': [], 'addr_relay_enabled': False, 'addr_processed': 0, 'addr_rate_limited': 0, 'permissions': [], 'minfeefilter': Decimal('0E-8'), 'bytessent_per_msg': {}, 'bytesrecv_per_msg': {}, 'connection_type': 'inbound', 'transport_protocol_type': 'v2', 'session_id': '04c4eba0893b099c923beb8b1a1e08a23f98fa6e825fcc0100bfe6dadd858272'} == {'addr_processed': 0, 'addr_rate_limited': 0, 'addr_relay_enabled': False, 'bip152_hb_from': False, 'bip152_hb_to': False, 'bytesrecv_per_msg': {}, 'bytessent_per_msg': {}, 'connection_type': 'inbound', 'conntime': 1709203943, 'id': 2, 'inbound': True, 'inflight': [], 'last_block': 0, 'last_transaction': 0, 'lastrecv': 1709203943, 'lastsend': 1709203943, 'minfeefilter': Decimal('0E-8'), 'network': 'not_publicly_routable', 'permissions': [], 'presynced_headers': -1, 'foo': 0, 'relaytxes': False, 'services': '0000000000000000', 'servicesnames': [], 'session_id': '04c4eba0893b099c923beb8b1a1e08a23f98fa6e825fcc0100bfe6dadd858272', 'startingheight': -1, 'subver': '', 'synced_blocks': -1, 'synced_headers': -1, 'timeoffset': 0, 'transport_protocol_type': 'v22', 'version': 0})
  in particular not({'transport_protocol_type': 'v2'} == {'transport_protocol_type': 'v22', 'foo': 0})

Thanks!

vasild

ACK a3badf7

maflcko

lgtm, thanks!

maflcko · 2024-02-29T10:59:43Z

test/functional/test_framework/util.py

 def assert_equal(thing1, thing2, *args):
+    if thing1 != thing2 and not args and isinstance(thing1, dict) and isinstance(thing2, dict):
+        d1,d2 = summarise_dict_differences(thing1, thing2)


Suggested change

d1,d2 = summarise_dict_differences(thing1, thing2)

diff_str = '\n'.join(difflib.unified_diff(*[json.dumps(obj, indent=4, sort_keys=True).splitlines() for obj in [thing1, thing2]], lineterm=''))

nit: Possibly, an alternative, needing two imports

What would the output look like?

The output will put each error on a separate line, which is helpful if there is more than one diff. However, it may also hide the parent key names (nesting) in the unified diff. I guess the context can be increased, if this is important.

raise AssertionError("not(%s == %s)\n in particular(%s)" % (thing1, thing2, diff_str)) AssertionError: not({'hash': '62f73cf9d4e2eaabe3fd04b426f0c05f236b0bd38a0015ac20d7060bf8080b25', 'height': 208, 'deployments': {'bip34': {'type': 'buried', 'active': True, 'height': 2}, 'bip66': {'type': 'buried', 'active': True, 'height': 3}, 'bip65': {'type': 'buried', 'active': True, 'height': 4}, 'csv': {'type': 'buried', 'active': True, 'height': 5}, 'segwit': {'type': 'buried', 'active': True, 'height': 6}, 'testdummy': {'type': 'bip9', 'active': False, 'bip9': {'bit': 28, 'start_time': 0, 'timeout': 9223372036854775807, 'min_activation_height': 0, 'status': 'started', 'since': 144, 'status_next': 'started', 'statistics': {'period': 144, 'elapsed': 65, 'count': 65, 'threshold': 108, 'possible': True}, 'signalling': '#################################################################'}}, 'taproot': {'type': 'bip9', 'height': 0, 'active': True, 'bip9': {'start_time': -1, 'timeout': 9223372036854775807, 'min_activation_height': 0, 'status': 'active', 'since': 0, 'status_next': 'active'}}}} == {'hash': '62f73cf9d4e2eaabe3fd04b426f0c05f236b0bd38a0015ac20d7060bf8080b25', 'height': 208, 'deployments': {'bip34': {'type': 'buried', 'active': True, 'height': 2}, 'bip66': {'type': 'buried', 'active': True, 'height': 3}, 'bip65': False, 'csv': {'type': 'buried', 'active': True, 'height': 5}, 'segwit': {'type': 'buried', 'active': True, 'height': 6}, 'testdummy': {'type': 'bip9', 'bip9': {'bit': 28, 'start_time': 0, 'timeout': 9223372036854775807, 'min_activation_height': 0, 'status': 'started', 'status_next': 'started', 'since': 144, 'statistics': {'period': 144, 'threshold': 108, 'elapsed': 65, 'count': 65, 'possible': True}, 'signalling': '#################################################################', 'dummy': True}, 'active': False}, 'taproot': {'type': 'bip9', 'bip9': {'start_time': -1, 'timeout': 9223372036854775807, 'min_activation_height': 0, 'status': 'active', 'status_next': 'failed', 'since': 7}, 'height': 0, 'active': False}}}) in particular(--- +++ @@ -5,11 +5,7 @@ "height": 2, "type": "buried" }, - "bip65": { - "active": true, - "height": 4, - "type": "buried" - }, + "bip65": false, "bip66": { "active": true, "height": 3, @@ -26,13 +22,13 @@ "type": "buried" }, "taproot": { - "active": true, + "active": false, "bip9": { "min_activation_height": 0, - "since": 0, + "since": 7, "start_time": -1, "status": "active", - "status_next": "active", + "status_next": "failed", "timeout": 9223372036854775807 }, "height": 0, @@ -42,6 +38,7 @@ "active": false, "bip9": { "bit": 28, + "dummy": true, "min_activation_height": 0, "signalling": "#################################################################", "since": 144,)

Forgot the serialization_fallback in my patch. Fixed:

diff --git a/test/functional/test_framework/util.py b/test/functional/test_framework/util.py index b4b05b1597..5e1ca4628e 100644 --- a/test/functional/test_framework/util.py +++ b/test/functional/test_framework/util.py @@ -7,6 +7,7 @@ from base64 import b64encode from decimal import Decimal, ROUND_DOWN from subprocess import CalledProcessError +import difflib import hashlib import inspect import json @@ -18,7 +19,7 @@ import re import time from . import coverage -from .authproxy import AuthServiceProxy, JSONRPCException +from .authproxy import AuthServiceProxy, JSONRPCException, serialization_fallback from collections.abc import Callable from typing import Optional @@ -53,6 +54,14 @@ def assert_fee_amount(fee, tx_size, feerate_BTC_kvB): def assert_equal(thing1, thing2, *args): + if thing1 != thing2 and not args and isinstance(thing1, dict) and isinstance(thing2, dict): + diff_str = "\n".join(difflib.unified_diff(*[json.dumps( + obj, + indent=4, + default=serialization_fallback, + sort_keys=True, + ).splitlines() for obj in [thing1, thing2]], lineterm="")) + raise AssertionError("not(%s == %s)\n in particular(%s)" % (thing1, thing2, diff_str)) if thing1 != thing2 or any(thing1 != arg for arg in args): raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))

(In any case, my comment was just a nit. In the common case of just a single difference, both solutions are equally good)

brunoerg

nice!

utACK a3badf7

josibake

ACK a3badf7

I've also written at least 2 versions of this function locally when faced with this.

Empact · 2024-02-29T21:11:40Z

test/functional/test_framework/util.py

 def assert_equal(thing1, thing2, *args):
+    if thing1 != thing2 and not args and isinstance(thing1, dict) and isinstance(thing2, dict):
+        d1,d2 = summarise_dict_differences(thing1, thing2)
+        raise AssertionError("not(%s == %s)\n  in particular not(%s == %s)" % (thing1, thing2, d1, d2))


How about moving this into the block below, as that would maintain a hierarchy of checks and avoid unintended divergence? I.e. such that the below checks: "are they equal" and the new code within checks "how should we best communicate the difference".

Empact · 2024-02-29T21:14:00Z

test/functional/test_framework/util.py

@@ -52,7 +52,24 @@ def assert_fee_amount(fee, tx_size, feerate_BTC_kvB):
        raise AssertionError("Fee of %s BTC too high! (Should be %s BTC)" % (str(fee), str(target_fee)))


+def summarise_dict_differences(thing1, thing2):
+    if not isinstance(thing1, dict) or not isinstance(thing2, dict):
+        return thing1, thing2


How about raising a TypeError error here, which will help detect unexpected calling patterns?

This would break the recursion, no?

Empact · 2024-02-29T21:14:37Z

Concept ACK

BrandonOdiwuor

Code Review ACK a3badf7

ismaelsadeeq · 2024-03-07T18:47:39Z

Concept ACK

achow101 · 2024-03-11T11:51:52Z

ACK a3badf7

a3badf7 tests: Provide more helpful assert_equal errors (Anthony Towns) Pull request description: In the functional tests, we often compare dicts with assert_equal, but the output makes it very hard to tell exactly which entry in the dicts don't match when there are a lot of entries and only minor differences. Change the output to make it clearer. ACKs for top commit: achow101: ACK a3badf7 vasild: ACK a3badf7 brunoerg: utACK a3badf7 josibake: ACK bitcoin@a3badf7 BrandonOdiwuor: Code Review ACK a3badf7 Tree-SHA512: 1d4b4a3b2e2e28ab09f10b41b04b52b37f64e0d8a54e2306f37de0c3eb3299a7ad4ba225b9efa67057a75e90d008a17385c810a32c9b212d240be280c2dcf2e5

b70e091 Merge bitcoin#29667: fuzz: actually test garbage >64b in p2p transport test (fanquake) 6d7aa3d Merge bitcoin#29497: test: simplify test_runner.py (fanquake) d0e15d5 Merge bitcoin#29606: refactor: Reserve memory for ToLower/ToUpper conversions (Ava Chow) 045fa5f Merge bitcoin#29514: tests: Provide more helpful assert_equal errors (Ava Chow) bd607f0 Merge bitcoin#29393: i2p: log connection was refused due to arbitrary port (Ava Chow) c961755 Merge bitcoin#29595: doc: Wrap flags with code in developer-notes.md (fanquake) 8d6e5e7 Merge bitcoin#29583: fuzz: Apply fuzz env (suppressions, etc.) when fetching harness list (fanquake) 4dce690 Merge bitcoin#29576: Update functional test runner to return error code when no tests are found to run (fanquake) 910a7d6 Merge bitcoin#29529: fuzz: restrict fopencookie usage to Linux & FreeBSD (fanquake) fdac2b3 Merge bitcoin#29493: subtree: update crc32c subtree (fanquake) a23b342 Merge bitcoin#29475: doc: Fix Broken Links (fanquake) 92bad90 Merge bitcoin#28178: fuzz: Generate with random libFuzzer settings (fanquake) 9b6a05d Merge bitcoin#29443: depends: fix BDB compilation on OpenBSD (fanquake) 9963e6b Merge bitcoin#29413: fuzz: increase length of string used for `NetWhitelist{bind}Permissions::TryParse` (fanquake) 3914745 Merge bitcoin#29425: test: fix intermittent failure in wallet_reorgrestore.py (fanquake) b719883 Merge bitcoin#29399: test: Fix utxo set hash serialisation signedness (fanquake) f096880 Merge bitcoin#29377: test: Add makefile target for running unit tests (Ava Chow) 03e0bd3 Merge bitcoin#27319: addrman, refactor: improve stochastic test in `AddSingle` (Ava Chow) Pull request description: ## Issue being fixed or feature implemented Batch of trivial backports ## What was done? See commits ## How Has This Been Tested? built locally; large combined merge passed tests locally ## Breaking Changes Should be none ## Checklist: _Go over all the following points, and put an `x` in all the boxes that apply._ - [ ] I have performed a self-review of my own code - [ ] I have commented my code, particularly in hard-to-understand areas - [ ] I have added or updated relevant unit/integration/functional/e2e tests - [ ] I have made corresponding changes to the documentation - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_ ACKs for top commit: UdjinM6: utACK b70e091 knst: utACK b70e091 Tree-SHA512: 659a931f812c8a92cf3854abb873d92885219a392d6aa8e49ac4b27fe41e8e163ef9a135050e7c2e1bd33cecd2f7dae215e05a9c29f62e837e0057d3c16746d6

DrahtBot added the Tests label Feb 29, 2024

ajtowns force-pushed the 202402-assertequal branch from 89b586b to a3badf7 Compare February 29, 2024 10:43

vasild approved these changes Feb 29, 2024

View reviewed changes

maflcko approved these changes Feb 29, 2024

View reviewed changes

brunoerg reviewed Feb 29, 2024

View reviewed changes

josibake reviewed Feb 29, 2024

View reviewed changes

Empact reviewed Feb 29, 2024

View reviewed changes

DrahtBot mentioned this pull request Mar 1, 2024

Wallet: don't underestimate the fees when spending a Taproot output #26573

Closed

BrandonOdiwuor approved these changes Mar 2, 2024

View reviewed changes

DrahtBot requested a review from Empact March 2, 2024 04:45

DrahtBot requested a review from ismaelsadeeq March 11, 2024 11:51

achow101 merged commit 10d7b6e into bitcoin:master Mar 11, 2024

ajtowns mentioned this pull request Apr 9, 2024

Backports for 27.x bitcoin-inquisition/bitcoin#52

Merged

bitcoin locked and limited conversation to collaborators Mar 11, 2025

	d1,d2 = summarise_dict_differences(thing1, thing2)
	diff_str = '\n'.join(difflib.unified_diff(*[json.dumps(obj, indent=4, sort_keys=True).splitlines() for obj in [thing1, thing2]], lineterm=''))

tests: Provide more helpful assert_equal errors #29514

tests: Provide more helpful assert_equal errors #29514

Uh oh!

Conversation

ajtowns commented Feb 29, 2024

Uh oh!

DrahtBot commented Feb 29, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Coverage

Reviews

Conflicts

Uh oh!

ajtowns commented Feb 29, 2024

Uh oh!

vasild left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

vasild left a comment

Choose a reason for hiding this comment

Uh oh!

maflcko left a comment

Choose a reason for hiding this comment

Uh oh!

maflcko Feb 29, 2024

Choose a reason for hiding this comment

Uh oh!

vasild Feb 29, 2024

Choose a reason for hiding this comment

Uh oh!

maflcko Mar 1, 2024

Choose a reason for hiding this comment

Uh oh!

maflcko Mar 1, 2024

Choose a reason for hiding this comment

Uh oh!

maflcko Mar 1, 2024

Choose a reason for hiding this comment

Uh oh!

brunoerg left a comment

Choose a reason for hiding this comment

Uh oh!

josibake left a comment

Choose a reason for hiding this comment

Uh oh!

Empact Feb 29, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Empact Feb 29, 2024

Choose a reason for hiding this comment

Uh oh!

maflcko Mar 1, 2024

Choose a reason for hiding this comment

Uh oh!

Empact commented Feb 29, 2024

Uh oh!

BrandonOdiwuor left a comment

Choose a reason for hiding this comment

Uh oh!

ismaelsadeeq commented Mar 7, 2024

Uh oh!

achow101 commented Mar 11, 2024

Uh oh!

Uh oh!

DrahtBot commented Feb 29, 2024 •

edited

Loading

vasild left a comment •

edited

Loading

Empact Feb 29, 2024 •

edited

Loading