nanocoap: add coap_next_msg_id() #18991

benpicco · 2022-11-29T01:02:11Z

Contribution description

Both nanoCoAP and GCoAP have the need for getting (somewhat) monotonic message IDs.
Currently both implement this as an internal function that they then end up having to re-expose after all.

Clean this up by just adding a single public coap_next_msg_id() function that can be used by anyone needing CoAP message IDs.
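As an illustration of the idea, a single shared counter could look roughly like the sketch below. This is not the code from this PR: the name `example_next_msg_id()` and the use of C11 atomics are assumptions made so that the snippet builds on its own.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one shared counter used by all CoAP users. */
static _Atomic uint16_t example_msg_id;

/* Return the current message ID and advance the counter.
 * Unsigned 16-bit arithmetic wraps from 0xffff back to 0. */
uint16_t example_next_msg_id(void)
{
    return atomic_fetch_add(&example_msg_id, 1);
}
```

Every caller draws from the same sequence, so two stacks sharing this helper never hand out the same ID back to back.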

Testing procedure

CoAP should function as before.
(tests/gcoap_fileserver will run integration test on CI)

Issues/PRs references

riot-ci · 2022-11-29T05:50:27Z

Murdock results

✔️ PASSED

d45ae80 nanocoap_sock: make use of coap_next_msg_id()

Success	Failures	Total	Runtime
117858	0	117858	01h:56m:58s


benpicco · 2022-11-29T10:14:48Z

@kfessel 074b2e2 will now randomize the ID on init if it happens to be 0.

kfessel

Looks good to me. (Randomizing if the non-initialized memory is 0 (power loss) should match the CoAP standard's strong suggestion to randomize the initial ID.)

maribu

see inline

maribu · 2022-11-29T10:40:14Z

sys/include/net/nanocoap.h

+ *
+ * @return  A new message ID that can be used for a request or response.
+ */
+uint16_t coap_next_msg_id(void);


Could you change this so that the src and dst endpoints are available to the implementation?

The standard suggests three levels of granularity:

Having one counter for each distinct src <--> dst pair

Having one counter per network interface / subnet /

Having one single global counter

The reason for having more counters is that message IDs must not be reused within a certain time (EXCHANGE_LIFETIME in RFC 7252). This easily becomes the limiting factor on 802.3 or 802.11 networks, so having multiple counters can increase throughput.

I think it is fine to keep using a single counter. But having a way to extend would be nice.

Then we better keep the separate implementations.
With nanocoap_sock it's trivial to attach such a counter to the nanocoap_sock_t struct.

If we want a generalized function we would need to work with arrays to look up the message ID, which sounds like a disproportionate overhead to me.
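To make the trade-off concrete, here is a rough sketch of what a generalized, table-based lookup might involve. Everything in it is hypothetical (`example_ep_t`, `EXAMPLE_PEERS_MAX`, `example_next_msg_id_for()`), and it keys only on the remote endpoint for brevity, whereas the suggestion above was a counter per src/dst pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_PEERS_MAX 4   /* hypothetical table size */

/* Hypothetical, simplified endpoint key: 16-byte address plus UDP port. */
typedef struct {
    uint8_t addr[16];
    uint16_t port;
} example_ep_t;

typedef struct {
    example_ep_t remote;
    uint16_t next_id;
    uint8_t used;
} example_mid_slot_t;

static example_mid_slot_t example_table[EXAMPLE_PEERS_MAX];
static uint16_t example_global_id;   /* fallback: one global counter */

static int example_ep_equal(const example_ep_t *a, const example_ep_t *b)
{
    return (a->port == b->port) &&
           (memcmp(a->addr, b->addr, sizeof(a->addr)) == 0);
}

/* Return the next message ID for `remote`: a per-peer counter while a slot
 * is available, the global counter once the table is full. */
uint16_t example_next_msg_id_for(const example_ep_t *remote)
{
    example_mid_slot_t *free_slot = NULL;

    assert(remote != NULL);
    for (size_t i = 0; i < EXAMPLE_PEERS_MAX; i++) {
        example_mid_slot_t *slot = &example_table[i];
        if (slot->used && example_ep_equal(&slot->remote, remote)) {
            return slot->next_id++;
        }
        if (!slot->used && (free_slot == NULL)) {
            free_slot = slot;
        }
    }
    if (free_slot == NULL) {
        return example_global_id++;
    }
    free_slot->remote = *remote;
    free_slot->used = 1;
    free_slot->next_id = 0;
    return free_slot->next_id++;
}
```

Each call costs a linear scan plus one table entry of RAM per peer, which is the overhead being weighed here against simply storing the counter in the socket.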

maribu · 2022-11-29T10:47:02Z

sys/net/application_layer/nanocoap/nanocoap.c

+
+static void auto_init_coap_random(void)
+{
+    if (_msg_id == 0) {


Doesn't this assume that SRAM has a value of 0 on the first boot? To my understanding this is not necessarily the case, and this is being relied upon by the PUF modules.

The PUF random seeding utility does track whether it is a cold boot. Maybe we should split this out into its own (trivial) module and use that here?

Yes, that's why I initially only wanted to use the .noinit section as it was done in nanocoap_sock.

@kfessel was worried that some MCUs would then always start with 0 on power-on-reset, so I added the random.

I wanted to keep the .noinit to keep sequence numbers sequential across reboots as I had issues with CoAP servers discarding old message IDs in the past (when they were reset after the reboot).

With always randomizing there is a small chance of this happening.

I think the standard actually wants the first msg IDs to be randomized. I do agree with .noinit, but I doubt that checking it to be zero is a good way to determine it is a cold boot.

With always randomizing there is a small chance of this happening.

Especially since the random seed might not be random but RANDOM_SEED_DEFAULT or the cpuid. In these cases .noinit seems a better bet to avoid reusing the same msg_id.

Checking for 0, we catch the CPUs that have their memory cells drift to 0 when powered off.

checking for 0 we catch the cpus that have their memory cells drift to 0 if powered off.

Note: Production variance has a significant impact on whether a given SRAM cell tends to drift to zero or one. A few cells end up being randomly zero or one. With large enough SRAM this is a viable source of entropy.

So, there really is no sense in checking for it being zero. One could profile the memory at the address this gets linked to for a specific SRAM and end up with a classification that would yield a probability for it being a cold boot.

We do have reliable ways to tell the reset cause: some MCUs readily report it. For others, we can store a magic cookie value in a .noinit variable. If it is still stored there on boot, it was a warm boot.

We may already have a module for that, I think this is ringing a bell.
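The magic-cookie approach could be sketched as follows. All names are hypothetical and this is not an existing RIOT module. `EXAMPLE_NOINIT` is left empty so that the snippet builds on a host; on a target it would expand to something like `__attribute__((section(".noinit")))`. The random value is assumed to come from the platform's RNG.

```c
#include <stdbool.h>
#include <stdint.h>

/* On a real target this would place the variables in a section that the
 * startup code does not clear. Empty here so the sketch builds anywhere. */
#ifndef EXAMPLE_NOINIT
#define EXAMPLE_NOINIT
#endif

#define EXAMPLE_BOOT_COOKIE UINT32_C(0x434f4150)   /* arbitrary magic value */

static EXAMPLE_NOINIT uint32_t example_cookie;
static EXAMPLE_NOINIT uint16_t example_retained_id;

/* Call once at start-up. Returns true on a warm boot (cookie survived),
 * false on a cold boot, in which case the ID restarts at `random_value`. */
bool example_msg_id_init(uint16_t random_value)
{
    if (example_cookie == EXAMPLE_BOOT_COOKIE) {
        return true;    /* warm boot: keep counting from the retained ID */
    }
    example_cookie = EXAMPLE_BOOT_COOKIE;
    example_retained_id = random_value;
    return false;
}

uint16_t example_retained_next_msg_id(void)
{
    return example_retained_id++;
}
```

Unlike a check for zero, this makes no assumption about what powered-off SRAM decays to. Random SRAM content matches a 32-bit cookie only with a probability of about 2^-32.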

chrysn · 2022-11-29T12:52:09Z

There has been a similar proposal in #16730.

My preference would be that either RIOT blesses some mechanism of persisting data across reboots (which comes in various, possibly conflicting, sets of requirements¹), or that it is up to the application to do that persistence. Either way, while message-ID is one thing to persist, I'd rather have an opaque thing that the CoAP stack (or even the full OS) takes at startup and produces at any time, which may easily be just the MID for now (but may become larger as the stack takes on responsibilities, e.g. when it enforces congestion avoidance, or when we want to enhance our RNG by keeping entropy across reboots).

¹ I think the main conflicting goals are "works reliably" vs "best effort is OK", which is connected to whether a clean shutdown is required, or whether it is cheap to do these updates often enough to be meaningfully usable even after a crash.

This reverts commit 074b2e2.

benpicco · 2022-11-29T12:57:56Z

I still don't fully understand why randomness of the initial message ID is so important. All message IDs after that are sequential (even across requests) so why is it important that the first ID is randomized?

chrysn · 2022-11-29T13:00:36Z

The initial random value is in the spec to make blind attackers' lives harder (if a device always comes up with 4 and you spam it with RSTs of message ID 4, it will never get far).

miri64 · 2022-11-29T13:05:52Z

I still don't fully understand why randomness of the initial message ID is so important. All message IDs after that are sequential (even across requests) so why is it important that the first ID is randomized?

The growth could also be randomized IMHO (as long as it does not contradict the standard).

chrysn · 2022-11-29T13:24:07Z

Growth better not be. There is even a rejected erratum (https://www.rfc-editor.org/errata/eid5429 from @maribu) proposing that it be a MUST to go by 1. Still, some people want to implement message dedup based on a window.

maribu · 2022-11-29T13:28:46Z

Note that using an upcounting counter is the suggested implementation in the standard and everybody uses this. Some CoAP implementations rely on a counter being used to optimize duplicate detection. The QoS would drop when interacting with such implementations.

The 32 bit entropy in the CoAP token required by the standard should help more than the 16 bit msg ID in annoying attackers anyway.

chrysn · 2022-11-29T16:11:28Z

A bit OT, but there's no guarantee of 32bit in the token. Message size sensitive applications will use the 0-length token as often as possible.

benpicco · 2023-01-19T18:00:47Z

So to salvage this:

add msg_id field to nanocoap_sock_t, leave GCoAP alone
randomize the msg_id on connect

did I get that right?
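A minimal sketch of that plan, under stated assumptions: `example_coap_sock_t` is a reduced, hypothetical stand-in (the real nanocoap_sock_t has other fields), and the random value is assumed to be supplied by the platform's RNG at connect time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced socket type holding only the per-socket counter. */
typedef struct {
    uint16_t msg_id;
} example_coap_sock_t;

/* Start each connection from a random message ID. */
void example_sock_connect(example_coap_sock_t *sock, uint16_t random_value)
{
    assert(sock != NULL);
    sock->msg_id = random_value;
}

/* Hand out sequential IDs per socket; wraps from 0xffff to 0. */
uint16_t example_sock_next_msg_id(example_coap_sock_t *sock)
{
    assert(sock != NULL);
    return sock->msg_id++;
}
```

Each socket then counts independently, which gives a separate sequence per remote without any lookup table.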

maribu · 2023-01-20T07:36:46Z

A bit OT, but there's no guarantee of 32bit in the token. Message size sensitive applications will use the 0-length token as often as possible.

Hmm, I read the standard as that the Token is where you should put in your entropy against spoofing, if you want this (section 5.3.1):

A client that is connected to the general Internet SHOULD use at least 32 bits of randomness, keeping in mind that not being directly connected to the Internet is not necessarily sufficient protection against spoofing.

But then, I do wonder what the use cases are where adding 32 bits of entropy to your message will make a lot of difference. I honestly see no use case where using CoAP over plain UDP is sensible except for networks with L2 security and a firewall that prevents plain UDP based CoAP communication with nodes outside of that network (such as nodes in the Internet). If you have a device capable enough to act as a border router, I would expect it to also be capable enough to act as a secure endpoint of CoAP communication (such as CoAP over OSCORE or DTLS). And that could expose any resource provided by the constrained devices incapable of DTLS or OSCORE in the L2 network with added security.

benpicco · 2023-01-26T01:18:12Z

closed in favor of #19178

benpicco requested review from miri64, haukepetersen, PeterKietzmann and cgundogan as code owners November 29, 2022 01:02

benpicco requested a review from fabian18 November 29, 2022 01:02

github-actions bot added Area: CoAP Area: Constrained Application Protocol implementations Area: network Area: Networking Area: sys Area: System labels Nov 29, 2022

benpicco added Type: cleanup The issue proposes a clean-up / The PR cleans-up parts of the codebase / documentation CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Nov 29, 2022

benpicco added 3 commits November 29, 2022 02:04

nanocoap: add coap_next_msg_id()

fbcb8cf

gcoap: make use of coap_next_msg_id()

33afa0a

nanocoap_sock: make use of coap_next_msg_id()

d45ae80

benpicco force-pushed the coap_next_msg_id branch from b75d365 to d45ae80 Compare November 29, 2022 01:04

fixup! nanocoap: add coap_next_msg_id()

074b2e2

benpicco removed the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Nov 29, 2022

kfessel reviewed Nov 29, 2022


benpicco requested review from chrysn and maribu November 29, 2022 10:27

maribu reviewed Nov 29, 2022


Revert "fixup! nanocoap: add coap_next_msg_id()"

75fe774

This reverts commit 074b2e2.

benpicco mentioned this pull request Jan 20, 2023

nanocoap_sock: store message ID in nanocoap_sock_t #19178

Merged

benpicco closed this Jan 26, 2023

benpicco deleted the coap_next_msg_id branch January 26, 2023 01:18
