Gcoap drops long packages instead of gracefully erring out #14167

@chrysn

Description

When gcoap receives a message that exceeds CONFIG_GCOAP_PDU_BUF_SIZE, it silently discards it.

Steps to reproduce the issue

  • Run the gcoap example
  • Run a server that serves a resource that gives >128 byte responses, for example the aiocoap demo server
  • coap get -c fe80::176b:fd74:a58f:ff97%6 5683 /.well-known/core
  • Watch how the server reports sending responses (also to duplicate requests), and how the client keeps retransmitting for lack of a response.
  • To see where it's breaking, turn on debug in sys/application_layer/gcoap/gcoap.c and watch the gcoap: udp recv failure: -105 messages pop up on each response.

Expected results

The suitable behavior (as I understand it) on the client side would be to indicate the failure to the client application, and (in CON cases) to stop retransmitting.

The suitable behavior on the server side would be to respond with "4.13 Request Entity Too Large" with a Size1 option indicating CONFIG_GCOAP_PDU_BUF_SIZE, or a Block1 option to that effect. The client would then know to abort if it can't make the request smaller, or to use Block1 fragmentation.
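To make the server-side suggestion concrete, here is a minimal sketch of how such a 4.13 response could be serialized, assuming the request was confirmable and is answered with a piggybacked ACK. The helper name `build_413_ack` and its signature are hypothetical and not part of gcoap or nanocoap; only the wire format (RFC 7252: code 4.13, Size1 as option number 60 with a minimal-length uint value) is taken from the specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COAP_TYPE_ACK   2U
#define COAP_CODE_413   ((4U << 5) | 13U)   /* 4.13 Request Entity Too Large */
#define COAP_OPT_SIZE1  60U                 /* Size1 option number (RFC 7252) */

/* Hypothetical helper, not a gcoap function: writes a 4.13 ACK that echoes
 * the request's message ID and token and carries Size1 = max_size.
 * Returns the number of bytes written, or 0 if it does not fit. */
size_t build_413_ack(uint8_t *out, size_t out_len, uint16_t msg_id,
                     const uint8_t *token, uint8_t tkl, uint32_t max_size)
{
    assert(out != NULL && (token != NULL || tkl == 0));

    /* Size1 is a uint option: drop leading zero bytes (0 encodes as empty). */
    uint8_t val[4];
    uint8_t val_len = 0;
    for (int shift = 24; shift >= 0; shift -= 8) {
        uint8_t b = (uint8_t)(max_size >> shift);
        if (val_len > 0 || b != 0) {
            val[val_len++] = b;
        }
    }

    /* 4-byte header, token, 2-byte option header, option value */
    size_t need = 4U + tkl + 2U + val_len;
    if (tkl > 8 || out_len < need) {
        return 0;
    }

    out[0] = (uint8_t)((1U << 6) | (COAP_TYPE_ACK << 4) | tkl); /* version 1 */
    out[1] = (uint8_t)COAP_CODE_413;
    out[2] = (uint8_t)(msg_id >> 8);
    out[3] = (uint8_t)(msg_id & 0xFF);
    if (tkl > 0) {
        memcpy(&out[4], token, tkl);
    }

    uint8_t *p = out + 4 + tkl;
    /* First option, so delta == option number 60: nibble 13 means
     * "one extended delta byte follows, holding delta - 13". */
    *p++ = (uint8_t)((13U << 4) | val_len);
    *p++ = (uint8_t)(COAP_OPT_SIZE1 - 13U);
    memcpy(p, val, val_len);
    return need;
}
```

In gcoap, `max_size` would presumably be CONFIG_GCOAP_PDU_BUF_SIZE. A response to a NON request would use a NON type and a fresh message ID instead, which this sketch does not cover.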

Actual results

Outgoing confirmable requests are retransmitted. Both confirmable and nonconfirmable requests run into timeouts.

Incoming larger requests are ignored, causing the client to retransmit or fail with a timeout. (Admittedly, I have not tested this, but from the code path it seems pretty clear that the packet is dropped.)

Versions

Current RIOT master (8a2b089)

How this could be addressed

This is pretty hard to fix with the current socket API, as it does not allow the application to receive at least the truncated packet (which would be needed for both cases).

If gcoap could migrate to sock_udp_recv_buf, this might be trivially fixed -- but that is currently unstable, and does not really work well with the underlying nanocoap assumption that the CoAP message is in contiguous memory. (And having worked with CoAP on lwIP, I distinctly remember that scatter-gathering CoAP messages is a mess).

All information required would be in the first 12 bytes, that is, the 4-byte header plus a token of up to 8 bytes (except when long tokens / stateless is used, but those clients are aware of how hard it is to get sane output back).
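As an illustration of why the first 12 bytes suffice, the following sketch extracts the message type, code, message ID and token from a possibly truncated datagram. The type `coap_prefix_t` and the function `coap_prefix_parse` are hypothetical names for this example, not gcoap or nanocoap API; the field layout follows RFC 7252.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COAP_HDR_LEN    4U  /* fixed CoAP header length */
#define COAP_TOKEN_MAX  8U  /* longest token allowed by RFC 7252 */

/* Hypothetical summary of the leading bytes of a CoAP message. */
typedef struct {
    uint8_t type;       /* 0 = CON, 1 = NON, 2 = ACK, 3 = RST */
    uint8_t code;       /* class in the upper 3 bits, detail in the lower 5 */
    uint16_t msg_id;
    uint8_t token_len;
    uint8_t token[COAP_TOKEN_MAX];
} coap_prefix_t;

/* Parses header and token from the first `len` bytes of a message; the rest
 * of the message may be missing. Returns 0 on success, -1 if the prefix is
 * too short or malformed. */
int coap_prefix_parse(const uint8_t *buf, size_t len, coap_prefix_t *out)
{
    assert(buf != NULL && out != NULL);

    if (len < COAP_HDR_LEN) {
        return -1;
    }
    if ((buf[0] >> 6) != 1) {           /* only CoAP version 1 is defined */
        return -1;
    }
    uint8_t tkl = buf[0] & 0x0F;
    if (tkl > COAP_TOKEN_MAX) {         /* lengths 9 to 15 are reserved */
        return -1;
    }
    if (len < COAP_HDR_LEN + tkl) {     /* token itself was cut off */
        return -1;
    }

    out->type = (buf[0] >> 4) & 0x03;
    out->code = buf[1];
    out->msg_id = (uint16_t)((buf[2] << 8) | buf[3]);
    out->token_len = tkl;
    memcpy(out->token, buf + COAP_HDR_LEN, tkl);
    return 0;
}
```

With the token and message ID in hand, a client could match the oversized response to its pending request and stop retransmitting, and a server could echo both in an error response.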

  • If gcoap sticks with the self-allocated buffer API, maybe that API could be extended to allow overflow. (Just writing up to max_len and returning a value > max_len would, while not strictly breaking the API, probably break applications in security-relevant ways.)

  • If the sock scatter-gather API stabilizes, it may be a sane assumption that the first 12 bytes of the message are in a contiguous buffer (otherwise falling back to the current behavior of ignoring the message for lack of options).

    After that, gcoap could still decide whether to reassemble the message from the stack-internal buffers into a full message, to ask the stack to assemble the fragments into a contiguous buffer, or to err out. Either way, the information needed to err out would be present.
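One way the overflow extension from the first option might look is sketched below, purely as an assumption about a possible design rather than an existing sock API. The function `recv_truncating` and the type `datagram_t` are hypothetical. The point is that the full datagram size is reported through a separate out-parameter, so the return value never exceeds `max_len` and existing callers that trust it as a byte count stay safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a datagram queued inside the network stack. */
typedef struct {
    const uint8_t *data;
    size_t len;
} datagram_t;

/* Copies at most max_len bytes of the datagram into buf and stores the
 * datagram's full size in *full_len. Returns the number of bytes copied,
 * which is never larger than max_len. */
size_t recv_truncating(const datagram_t *dgram, uint8_t *buf, size_t max_len,
                       size_t *full_len)
{
    assert(dgram != NULL && buf != NULL && full_len != NULL);

    size_t n = (dgram->len < max_len) ? dgram->len : max_len;
    memcpy(buf, dgram->data, n);
    *full_len = dgram->len;
    return n;
}
```

A caller would detect truncation by comparing `*full_len` against the returned count, and could then fall back to inspecting only the header and token.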

Metadata

Labels

Area: CoAP, Type: bug