Skip to content

Conversation

kfessel
Copy link
Contributor

@kfessel kfessel commented Feb 25, 2020

Reverts #13179 which introduced debugging issues due to misinformation of openocd and thereby gdb

Contribution description

Fixes the issue that #13179 tried to address by making the flash probing work on the nucleo-f767zi.
Might generaly improve probing Results due to switching from system reset to system halt after reset.

Testing procedure

The issue #13179 tried fixed:

make -C tests/riotboot/ BOARD=nucleo-f767zi flash

would fail due to openocd not being able to probe flash while srst is asserted

the issue #13179 generated

 make  -C tests/riotboot/ BOARD=nucleo-f767zi debug 

lead to gdb telling me:

warning: Overlapping regions in memory map: ignoring

Issues/PRs references

Fixes #13179 this is the same as #13457 but with the patch for openocd.sh added

@kfessel kfessel changed the title Fix13179 boards/nucleo-f767zi: Fix13179 make nucleo-f767zi debugable again Feb 25, 2020
@kfessel kfessel requested review from benpicco and maribu February 25, 2020 14:20
@kfessel
Copy link
Contributor Author

kfessel commented Feb 25, 2020

Travis says:

Pull request needs squashing:
833d2b9f9 Fix Flash probing for nucleo-f747zi

so i will do that this will hide the reverting of 13179

@kfessel kfessel force-pushed the fix13179 branch 2 times, most recently from cfeed18 to eed174f Compare February 25, 2020 17:07
@benpicco benpicco added Area: boards Area: Board ports Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) labels Feb 25, 2020
@PeterKietzmann PeterKietzmann added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Feb 26, 2020
@PeterKietzmann
Copy link
Member

No idea about the change and discussions behind it, just a blind test. Maybe it helps:

Current master: -> flashing tests/riotboot doesn't work (/tests/shell does)

    TargetName         Type       Endian TapName            State       
--  ------------------ ---------- ------ ------------------ ------------
 0* stm32f7x.cpu       hla_target little stm32f7x.cpu       reset
target halted due to debug-request, current mode: Thread 
xPSR: 0x01000000 pc: 0x0800056c msp: 0x20000200
auto erase enabled
Error: Invalid command argument
image.base_address option value ('bin') is not valid

This PR: flashing works

    TargetName         Type       Endian TapName            State       
--  ------------------ ---------- ------ ------------------ ------------
 0* stm32f7x.cpu       hla_target little stm32f7x.cpu       reset
target halted due to debug-request, current mode: Thread 
xPSR: 0x01000000 pc: 0x0800056c msp: 0x20000200
auto erase enabled
Info : device id = 0x10016451
Info : flash size = 2048kbytes
Info : Single Bank 2048 kiB STM32F76x/77x found
target halted due to breakpoint, current mode: Thread 
xPSR: 0x61000000 pc: 0x20000046 msp: 0x20000200
wrote 1310720 bytes from file /tests_riotboot-slot0-extended.bin in 24.713152s (51.794 KiB/s)
target halted due to breakpoint, current mode: Thread 
xPSR: 0x61000000 pc: 0x2000002e msp: 0x20000200
verified 1049088 bytes in 4.267101s (240.093 KiB/s)
shutdown command invoked
Done flashing

I couldn't stop at a breakpoints in both versions. What I tried:

BOARD=nucleo-f767zi make -C tests/riotboot debug
(gdb) monitor reset run
(gdb) monitor reset halt
target halted due to debug-request, current mode: Thread 
xPSR: 0x01000000 pc: 0x08000504 msp: 0x20000200
(gdb) b main
Breakpoint 1 at 0x8000f88: file /riotboot/main.c, line 77.
(gdb) c
Continuing.
Note: automatically using hardware breakpoints for read-only addresses.

-> simply runs the program and doesn't actually break in main

@kfessel
Copy link
Contributor Author

kfessel commented Feb 26, 2020

seems like 'tests/riotboot' is not the best example for debugging and breakpoints
(i would have choosen another one but this was what the original PR said to Fix)

the debugging issue i have is independant of riotboot.

riotboot is about moving firmware to other place on flash, to keep multiple of them around (Versioning and CRCing, then help to decide which is actually will run))
-> adresses in the elf file may not match (offsets need to be calculated).

or you load

(gdb) file bin/nucleo-f767zi/tests_riotboot-slot0.elf
A program is being debugged already.
Are you sure you want to change the file? (y or n) y
Load new symbol table from "bin/nucleo-f767zi/tests_riotboot-slot0.elf"? (y or n) y
Reading symbols from bin/nucleo-f767zi/tests_riotboot-slot0.elf...

which has the right addresses for slot 0
then

(gdb) b main
Breakpoint 15 at 0x800925c: file ..../RIOT/tests/riotboot/main.c, line 77.
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: ..../tests/riotboot/bin/nucleo-f767zi/tests_riotboot-slot0.elf 
[New Thread 2]
[Switching to Thread 2]

Thread 2 hit Breakpoint 15, main () at ..../RIOT/tests/riotboot/main.c:77
77      {

and it will break

@kfessel
Copy link
Contributor Author

kfessel commented Feb 27, 2020

Current master: -> flashing tests/riotboot doesn't work (/tests/shell does)

interresting this is a problem with #13179 (and current master)
Explanation: in current Ubuntu/Debian openocd (0.10.0-6) there is only one memory region defined in the stm32f7x.cfg -> the extra region defined is not the 4th but the second one -> selecting the 4th line (using FLASH_BANK) to extract the address from will have no result -> flash fail

@fjmolinas
Copy link
Contributor

@kfessel sorry for not addressing this earlier, I was AFK this last two weeks. I'm sorry I didn't test setting breakpoints with the change in #13179 and that this had led to the mis-configuration you state. I'll take a closer look into this now.

I agree that fixing probing is the best fix, the issue is that using reset halt means that if there is a hardfault and you want to attach the debugger it will reset the device instead of attaching at the point of fault (unless I'm missing something).

I think it would have been easier if you just provide the addrI think it would have been easier if you just provide the address in Makefile.import, and modified openocd.sh to use such a address if it exists (may also be possible that such a mechanism allready exists)ess in Makefile.import, and modified openocd.sh to use such a address if it exists (may also be possible that such a mechanism already exists)

This could be a better alternative you are right. Would you agree on a solution where #13179 is reverted and your suggestion used?

@kaspar030 kaspar030 changed the title boards/nucleo-f767zi: Fix13179 make nucleo-f767zi debugable again boards/nucleo-f767zi: Fix13179 make nucleo-f767zi debuggable again Feb 28, 2020
@fjmolinas
Copy link
Contributor

This could be a better alternative you are right. Would you agree on a solution where #13179 is reverted and your suggestion used?

I think this should also be split into two PR;s, one with the revert, that we can backport to 2020.01. And the one fixing the issue, depending on the impact of the change, e.g.: changing openocd.sh or just come configuration for nucleo-f767zi.

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

Hi just reading,
reverting (already created a PR for that #13457 )and making an extra request to fix the flash probing problem is ok for me. i did join both since i wanted to provide a complete solution

i will also create a revert PR for #13219 (I did at #13503)

had this this PR be two commits before but Travis said they should be squashed before this can continue so I did.

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

I agree that fixing probing is the best fix, the issue is that using reset halt means that if there is a hardfault and you want to attach the debugger it will reset the device instead of attaching at the point of fault (unless I'm missing something).

the reset halt might be a problem when u want to debug but this flash probing procedure is only used for flashing bin file (calculating where to flash them) so this is no problem (btw the procedure start with a srst assert there is no program state left at that point in time)

@fjmolinas
Copy link
Contributor

the reset halt might be a problem when u want to debug but this flash probing procedure is only used for flashing bin file (calculating where to flash them) so this is no problem (btw the procedure start with a srst assert there is no program state left at that point in time)

True, sorry I misread where your proposed change was. This is indeed the best fix for the issue.

BIULD_IN_DOCKER=1 RIOT_CI_BUILD=1 BOARD=nucleo-f767zi make -C tests/riotboot flash test -j3 still passes.

> getslotaddr 0
 getslotaddr 0
Slot 0 address=0x08008200
> dumpaddrs
 dumpaddrs
slot 0: metadata: 0x8008000 image: 0x08008200
slot 1: metadata: 0x8100000 image: 0x08100200
> TEST PASSED

I can properly apply breakpoints:

DISABLE_MODULE=test_utils_interactive_sync BOARD=nucleo-f767zi make -C tests/xtimer_usleep flash -j3

→ DISABLE_MODULE=test_utils_interactive_sync BOARD=nucleo-f767zi make -C tests/xtimer_usleep debug
make: Entering directory '/home/francisco/workspace/RIOT2/tests/xtimer_usleep'
/home/francisco/workspace/RIOT2/dist/tools/openocd/openocd.sh debug /home/francisco/workspace/RIOT2/tests/xtimer_usleep/bin/nucleo-f767zi/tests_xtimer_usleep.elf
### Starting Debugging ###
Open On-Chip Debugger 0.10.0+dev-01047-g09ac9ab1 (2020-02-05-11:31)
Licensed under GNU GPL v2
For bug reports, read
	http://openocd.org/doc/doxygen/bugs.html
Reading symbols from /home/francisco/workspace/RIOT2/tests/xtimer_usleep/bin/nucleo-f767zi/tests_xtimer_usleep.elf...
Remote debugging using :3333
cortexm_sleep (deep=<optimized out>) at /home/francisco/workspace/RIOT2/cpu/cortexm_common/include/cpu.h:177
177	    irq_restore(state);
(gdb) b main
Breakpoint 1 at 0x8001284: file /home/francisco/workspace/RIOT2/tests/xtimer_usleep/main.c, line 62.
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/francisco/workspace/RIOT2/tests/xtimer_usleep/bin/nucleo-f767zi/tests_xtimer_usleep.elf 
Note: automatically using hardware breakpoints for read-only addresses.
[Switching to Thread 1]

Breakpoint 1, main () at /home/francisco/workspace/RIOT2/tests/xtimer_usleep/main.c:62
62	    printf("Running test %u times with %u distinct sleep times\n", RUNS,
(gdb) 

Reproduced failure in master

→ DISABLE_MODULE=test_utils_interactive_sync BOARD=nucleo-f767zi make -C tests/xtimer_usleep debug
make: Entering directory '/home/francisco/workspace/RIOT2/tests/xtimer_usleep'
/home/francisco/workspace/RIOT2/dist/tools/openocd/openocd.sh debug /home/francisco/workspace/RIOT2/tests/xtimer_usleep/bin/nucleo-f767zi/tests_xtimer_usleep.elf
### Starting Debugging ###
Open On-Chip Debugger 0.10.0+dev-01047-g09ac9ab1 (2020-02-05-11:31)
Licensed under GNU GPL v2
For bug reports, read
	http://openocd.org/doc/doxygen/bugs.html
Reading symbols from /home/francisco/workspace/RIOT2/tests/xtimer_usleep/bin/nucleo-f767zi/tests_xtimer_usleep.elf...
Remote debugging using :3333
cortexm_sleep (deep=<optimized out>) at /home/francisco/workspace/RIOT2/cpu/cortexm_common/include/cpu.h:177
177	    irq_restore(state);
warning: Overlapping regions in memory map: ignoring
(gdb) b main
Breakpoint 1 at 0x8001284: file /home/francisco/workspace/RIOT2/tests/xtimer_usleep/main.c, line 62.
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/francisco/workspace/RIOT2/tests/xtimer_usleep/bin/nucleo-f767zi/tests_xtimer_usleep.elf 
warning: Overlapping regions in memory map: ignoring

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

so will we go for the 4 PR (two for master two for Backport ) or the two PR solution

@fjmolinas
Copy link
Contributor

Hi just reading,
reverting ( #13457 )and making an extra request to fix the flash probing problem is ok for me. i did join both since i wanted to provide a complete solution

I re-thought how this should be handled. If I just backport the revert, then I would re-introduce the inability to flash binaries. I think having both the revert and the fix in the same PR is actually better, but having them as two separate commits, the revert and the fix.

I think I'll probably make a point release including this and maybe other bug fixes that have showed up since 2020.01. But I need to discuss this a little bit with other maintainers. For now lets get this fix in to master with two distinct commits.

Sorry for all the noise, regarding the commits/pr's and thanks again for the proper fix here.

@fjmolinas
Copy link
Contributor

so will we go for the 4 PR (two for master two for Backport ) or the two PR solution

I posted the last comment before seeing this. Lets first go with this single PR with two commits for now.

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

Ok but will i t go through travis with two commits it told me to squash them before.
I also like the two commit one PR version the most since it is clean but travis put me of

@fjmolinas
Copy link
Contributor

Ok but will i t go through travis with two commits it told me to squash them before.
I also like the two commit one PR version the most since it is clean

What was the message?

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

Travis says:

Pull request needs squashing:
833d2b9f9 Fix Flash probing for nucleo-f747zi

so i will do that this will hide the reverting of 13179

its my second comment in this PR

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

I can repush my old branch ( keeped it on my pc for Reference) so we can see the message. but this will trigger the whole ci

@fjmolinas
Copy link
Contributor

Ok but will i t go through travis with two commits it told me to squash them before.
I also like the two commit one PR version the most since it is clean

Oh this is just because travis doesn't accept commits prefixed with fix, remove *me, squash:

keyword_filter() {
    grep -i \
        -e "^    [0-9a-f]\+ .\{0,2\}SQUASH" \
        -e "^    [0-9a-f]\+ .\{0,2\}FIX" \
        -e "^    [0-9a-f]\+ .\{0,2\}REMOVE *ME"
}

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

wat? < there should be a gif but i do not know where to link

wat

i will rework my that and push

@fjmolinas
Copy link
Contributor

fjmolinas commented Feb 28, 2020

Sorry, maybe its clearer when looking at the rest of the code, here is the permalink:

keyword_filter() {
grep -i \
-e "^ [0-9a-f]\+ .\{0,2\}SQUASH" \
-e "^ [0-9a-f]\+ .\{0,2\}FIX" \
-e "^ [0-9a-f]\+ .\{0,2\}REMOVE *ME"
}

To avoid fixups and squash commits to get in the ci checks the commit looking gotfor the mentioned patterns, if any of those are present then it fails. That way no fixup! or squash! commit get in the history by mistake. This includes any commit prefixed with fix, so your initial commit fix Flash probing for nucleo-f747zi matched the pattern and that is why the ci failed, if it had been "something: fix Flash probing for nucleo-f747zi" the the ci would have passed.

@fjmolinas fjmolinas removed the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Feb 28, 2020
@fjmolinas
Copy link
Contributor

I think the best would be:

git revert --no-commit 53f60db66f1a4691e3fff24f71116da5bcef0539
git revert --no-commit 1dec5ba61b0ae6e4a9df3d9780eeca5de2687015
git commit

and a commit message with the PR title

    Revert "boards/nucleo-f767zi: add correct flash bank openocd config"
    
    This reverts commits:
    - 1dec5ba61b0ae6e4a9df3d9780eeca5de2687015
    - 53f60db66f1a4691e3fff24f71116da5bcef0539

@fjmolinas
Copy link
Contributor

I think the best would be:

git revert --no-commit 53f60db66f1a4691e3fff24f71116da5bcef0539
git revert --no-commit 1dec5ba61b0ae6e4a9df3d9780eeca5de2687015
git commit

and a commit message with the PR title

    Revert "boards/nucleo-f767zi: add correct flash bank openocd config"
    
    This reverts commits:
    - 1dec5ba61b0ae6e4a9df3d9780eeca5de2687015
    - 53f60db66f1a4691e3fff24f71116da5bcef0539

And the the commit with the fix.

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

did that
thank you, for the git revert introduction it is clearer than the PR Revert by github

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

now for the backport

@fjmolinas fjmolinas added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Feb 28, 2020
@fjmolinas
Copy link
Contributor

now for the backport

For the backport we should use our standard tools, so when this PR is merged we can use our backport script:

./dist/tools/backport_pr/backport_pr.py --comment 13470

But this PR needs to be merged before.

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

ok then lets wait

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

Since i saw you use the Docker build, which openocd does it use? or is it the systems openocd?

@fjmolinas
Copy link
Contributor

Since i saw you use the Docker build, which openocd does it use? or is it the systems openocd?

DOCKER is only used for building, not for flashing here, so the system version.

@fjmolinas
Copy link
Contributor

DOCKER is only used for building, not for flashing here, so the system

BTW I tested with openocd latest release as well as my local more update openocd.

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

seems like this can be merged

Copy link
Contributor

@fjmolinas fjmolinas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK!

@kfessel
Copy link
Contributor Author

kfessel commented Feb 28, 2020

Backport available at #13513

@fjmolinas
Copy link
Contributor

For backports its better to use the available scripts ./dist/tools/backport_pr/backport_pr.py --comment 13470

@kfessel
Copy link
Contributor Author

kfessel commented Mar 2, 2020

Backport provided in #13531

@leandrolanzieri leandrolanzieri added the Process: needs backport Integration Process: The PR is required to be backported to a release or feature branch label Mar 13, 2020
@leandrolanzieri leandrolanzieri added this to the Release 2020.01 milestone Mar 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: boards Area: Board ports CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Process: needs backport Integration Process: The PR is required to be backported to a release or feature branch Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants