-
Notifications
You must be signed in to change notification settings - Fork 2.1k
cortexm_common: check for possible stack overflow in hardfault handler #4015
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cortexm_common: check for possible stack overflow in hardfault handler #4015
Conversation
uint32_t* sp; | ||
asm volatile ("mov %[sp], sp" : [sp] "=r" (sp) : : ); | ||
/* If printf may overflow the handler stack */ | ||
if( (sp - &_sstack) < required) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you change this to
if (((uint32_t)sp) < (((uint32_t)&_sstack) + required)) {
}
you will also catch if sp has already passed _sstack
@gebart Another possibility would be to check for the stack canary after the handler has finished, got that idea just now. What do you think? |
@daniel-k How much does this currently add to the binary? Did you run a make info-buildsizes-diff yet for the cortex platforms? It is a useful debugging tool to be able to check if the stack has been corrupted, but I think the hardfault handler is already becoming quite large and there are very few cases where it would make any difference rather than just comparing the stack pointer with the limits of the stack. You can also use the ps command to check the stacks. |
No, I haven't yet.
That's true though.
You mean "checking for a canary" by that? The problem, at least on samr21-xpro, at the moment is, that an overflowing ISR stack corrupts scheduler/threading related system variables and (maybe therefore) |
9156668
to
c16e430
Compare
@gebart Diff against master introduces an overhead of 136 bytes in |
ACK from me, but I would like another opinion on the increased complexity of the hardfault handler. @OlegHahm, @kaspar030, @haukepetersen maybe? Note: This PR only affects the |
I am ok with this change as it is not affecting anything with disabled |
@daniel-k what happened to the contents of daniel-k@c31929d ? |
I refactored this into |
Agree it looks cleaner, though you only use it once as far as I can see in the files changed tab? |
@@ -246,6 +261,10 @@ __attribute__((used)) void hard_fault_handler(uint32_t* sp, uint32_t corrupted, | |||
printf("EXC_RET: 0x%08" PRIx32 "\n", exc_return); | |||
puts("Attempting to reconstruct state for debugging..."); | |||
printf("In GDB:\n set $pc=0x%lx\n frame 0\n bt\n", pc); | |||
int stack_left = _stack_size_left(HARDFAULT_HANDLER_REQUIRED_STACK_SPACE); | |||
if(stack_left < 0) { | |||
printf("\nISR stack overflowed by %d bytes max.\n", (-1 * stack_left)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that it has overflowed by at least %d
bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct. Now that I measured the stack usage, this should be changed to at least
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, isn't it the other way around? Now that the stack usage refering to the entry pointer is known and given that the stack pointer has probably advanced (at least the stack usage didn't decline), I can tell that max. n bytes overflowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the stack has overflowed in the past we can not know for sure how far it has overflowed before shrinking to the current sp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point. I was implicitly assuming that a stack overflow would cause hardfault and that it would happen at the point with the highest stack usage. But that obviously doesn't hold. I this case _stack_size_left()
should be called at the entry of hard_fault_handler()
I guess.
ACK, please squash |
8c52a7a
to
c5e220c
Compare
I'm gonna hit the button for Joakim. And go |
cortexm_common: check for possible stack overflow in hardfault handler
When the hardfault handler executes it may case an overflow into the preceeding section in memory (bss) caused by printf. The already existent measure only make sure not leave valid ram, not leaving the own stack.
This PR will output a warning if the hardfault handler may have corrupted memory and display the corrupt range. If you happen to see this on your board it should be an indication that the handler stack is too small and raise the awareness that debugging the current state may be pointless.