Skip to content

Conversation

miri64
Copy link
Member

@miri64 miri64 commented May 26, 2020

Contribution description

This PR adds the support for netdev_tap to pkg/lwip the same way as PR #12895 and #12950 already did for other Ethernet devices.

Testing procedure

I only was able to compile test this, but there was no traffic visible in Wireshark.

Issues/PRs references

None, but came up during the review of #13701.

@miri64 miri64 added Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation Platform: native Platform: This PR/issue effects the native platform Area: network Area: Networking Area: pkg Area: External package ports labels May 26, 2020
@miri64 miri64 added this to the Release 2020.07 milestone May 26, 2020
@miri64 miri64 requested a review from MichelRottleuthner May 26, 2020 19:07
@benpicco benpicco added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label May 26, 2020
Copy link
Contributor

@MichelRottleuthner MichelRottleuthner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm that this PR solves the compile error on master when building tests/lwip with the LWIP_IPV4=1 option set. But I couldn't test with proper traffic because the RIOT instance node won't get an address and always stays at 0.0.0.0. Even with a proper DHCP server setup this wont work. Looking at the code tells me that the dhcp client of lwip is only started via a NETDEV_EVENT_LINK_UP event. But netdev_tap never actually triggers that. @miri64 where would be the most appropriate place to issue that event?

@miri64
Copy link
Member Author

miri64 commented May 27, 2020

As I wrote in the testing procedures, I wasn't able to figure this out myself yesterday :-/. Maybe @gschorcht has some hints?

@MichelRottleuthner
Copy link
Contributor

"injecting" a NETDEV_EVENT_LINK_UP gets it going for me just fine. I'm just not sure where would be the right place to do that properly.

@miri64
Copy link
Member Author

miri64 commented May 27, 2020

Maybe just ifconfig <if_id> up?

@MichelRottleuthner
Copy link
Contributor

Maybe just ifconfig <if_id> up?

Hmm that would be more like telling the network device to go up. But then the indication from the tap device to the lwip netdev instance (i.e. NETDEV_EVENT_LINK_UP) would still be missing, or am I misunderstanding something here?
Actually I would have expected that either

  1. the tap device guarantees the event is fired automatically somewhere close to its init because other modules need that indication for proper operation (e.g. lwip_dhcp_auto)
  2. or that modules like lwip_dhcp_auto don't strictly require the event to eventually start operation (e.g. actively check if the device is already up).

@miri64
Copy link
Member Author

miri64 commented Jun 8, 2020

Hmm that would be more like telling the network device to go up. But then the indication from the tap device to the lwip netdev instance (i.e. NETDEV_EVENT_LINK_UP) would still be missing, or am I misunderstanding something here?

Ah, I mixed it up with the NETOPT option, I believe.

So I guess, we need to add that event then.

@miri64
Copy link
Member Author

miri64 commented Jun 8, 2020

@MichelRottleuthner do you want use your injection hack as a template for a PR?

@MichelRottleuthner
Copy link
Contributor

@MichelRottleuthner do you want use your injection hack as a template for a PR?

Not really. Its was just a really dirty hack to confirm if that event really is the missing part.
Anyway, just for reference:

snippet
diff --git a/cpu/native/netdev_tap/netdev_tap.c b/cpu/native/netdev_tap/netdev_tap.c
index 1afdc1e62..2a34475fb 100644
--- a/cpu/native/netdev_tap/netdev_tap.c
+++ b/cpu/native/netdev_tap/netdev_tap.c
@@ -204,11 +204,22 @@ static void _continue_reading(netdev_tap_t *dev)
     _native_in_syscall--;
 }
 
+bool first_recv = true;
+
 static int _recv(netdev_t *netdev, void *buf, size_t len, void *info)
 {
     netdev_tap_t *dev = (netdev_tap_t*)netdev;
     (void)info;
 
+    if (first_recv) {
+        if (netdev->event_callback) {
+            netdev->event_callback(netdev, NETDEV_EVENT_LINK_UP);
+        } else {
+            DEBUG("no event callback!!\n");
+        }
+        first_recv = false;
+    }
+
     if (!buf) {
         if (len > 0) {
             /* no memory available in pktbuf, discarding the frame */

I think a proper solution would do more something like the two steps I described above.
A problem I see with 1 (i.e. firing that event on init): AIUI, it is not guaranteed that the association between netdev_tap and some module sitting on top of that (in our case lwip_netdev) is already set up at netdev_tap's _init. So the lwip part would still need some refactoring to eventually start dhcp if it wasn't already set up when netdev_tap issued the event.

@jia200x to me it feels like this pretty much touches what you worked on with MLME/MCPS (#11324) . Do you have some wise thoughts to share here? Is the ongoing MAC layer rework maybe already providing a generic solution to this?

@miri64
Copy link
Member Author

miri64 commented Jun 9, 2020

AIUI, it is not guaranteed that the association between netdev_tap and some module sitting on top of that (in our case lwip_netdev) is already set up at netdev_tap's _init. So the lwip part would still need some refactoring to eventually start dhcp if it wasn't already set up when netdev_tap issued the event.

Since lwip_netdev is the one calling the netdev's init(), I think we can guarantee, that the lwip_netdev is listening when the event fires.

I rather would not like to touch the lwIP code proper, if necessary.

@miri64
Copy link
Member Author

miri64 commented Jun 9, 2020

But maybe you are right :-/.

make: Entering directory '/home/mlenders/Repositories/RIOT-OS/RIOT/tests/lwip'
/home/mlenders/Repositories/RIOT-OS/RIOT/tests/lwip/bin/native/tests_lwip.elf  tap0
RIOT native interrupts/signals initialized.
LED_RED_OFF
LED_GREEN_ON
RIOT native board initialized.
RIOT native hardware initialization complete.

Assertion "netif != NULL" failed at /home/mlenders/Repositories/RIOT-OS/RIOT/tests/lwip/bin/pkg/native/lwip/src/core/ipv4/dhcp.c:743
make: *** [/home/mlenders/Repositories/RIOT-OS/RIOT/tests/lwip/../../Makefile.include:718: term] Aborted (core dumped)
make: Leaving directory '/home/mlenders/Repositories/RIOT-OS/RIOT/tests/lwip'

@miri64
Copy link
Member Author

miri64 commented Jun 9, 2020

Ah, if the call back is initialized earlier, the context should be as well ^^"

@miri64
Copy link
Member Author

miri64 commented Jun 9, 2020

I should test before I push >.<

@miri64
Copy link
Member Author

miri64 commented Jun 9, 2020

Ok, we have a bootstrapping problem: In my opinion, it does not make sense to initialize the context and callback after we call init() (especially initializing the context after initializing the callback seems dangerous to me, as the callback might access the context). However, we need the network device to have the netif initialized and the netif needs to be initialized for DHCP not to run into an assertion. Hmmmmm.... tricky one to solve.

@miri64
Copy link
Member Author

miri64 commented Jun 9, 2020

For now, I added a hack for netdev_tap to start_dhcp() after the initialization of the tcpip thread (which is required, otherwise DHCP will crash since the mempool is not initializied yet).

@MichelRottleuthner
Copy link
Contributor

I think this special case for native tap is ok for now. Though I question how stable the current approach is for other netdevs. On the ESP for example it probably only works because connecting to WiFi takes some time. Beware of any super fast connecting WiFis otherwise assertions may haunt you? :P

@miri64
Copy link
Member Author

miri64 commented Jun 9, 2020

I think this special case for native tap is ok for now. Though I question how stable the current approach is for other netdevs. On the ESP for example it probably only works because connecting to WiFi takes some time. Beware of any super fast connecting WiFis otherwise assertions may haunt you? :P

Yepp. My thought exactly (edit: but not the focus of this PR).

Copy link
Contributor

@MichelRottleuthner MichelRottleuthner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with BOARD=native LWIP_IPV4=1 make -C tests/lwip all term and a DHCP server serving the tap interface on the linux side. RIOT now gets a valid IPv4 address and I can communicate with it via the tcp and udp commands. Feel free to squash!

@MichelRottleuthner MichelRottleuthner added Reviewed: 1-fundamentals The fundamentals of the PR were reviewed according to the maintainer guidelines Reviewed: 2-code-design The code design of the PR was reviewed according to the maintainer guidelines Reviewed: 3-testing The PR was tested according to the maintainer guidelines Reviewed: 4-code-style The adherence to coding conventions by the PR were reviewed according to the maintainer guidelines Reviewed: 5-documentation The documentation details of the PR were reviewed according to the maintainer guidelines labels Jun 9, 2020
@miri64
Copy link
Member Author

miri64 commented Jun 9, 2020

Squashed.

@miri64 miri64 force-pushed the lwip/enh/netdev_tap_ipv4 branch from f28879e to f5c4a05 Compare June 9, 2020 10:13
@jia200x
Copy link
Member

jia200x commented Jun 9, 2020

@jia200x to me it feels like this pretty much touches what you worked on with MLME/MCPS (#11324) . Do you have some wise thoughts to share here? Is the ongoing MAC layer rework maybe already providing a generic solution to this?

Somehow :). The idea behind this is to provide an asynchronous API to control the MAC layer from the network interface.

Not sure if I misunderstood the problem, but what's described here can be solved with this Request/Confirm pattern of the SAP API described in #11324. The generic solution could look like this:

  • The device init procedure is separated in init and on (or start) (see RDM: The 802.15.4 Radio HAL #13943 ). The init function simply initializes the device but in the lowest consumption state and without any source of interrupts. The on function is called by the network interface (e.g on ifconfig 4 up) to put the device in a "ready to use" state (interrupts on, etc). The latter is intended to be used from the MAC layer's init function.
  • The network interface is the one who reports a link up event when the MAC layer is ready. For instance, in gnrc_lorawan I'm using an "MLME Join Request" on ifconfig <if_id> up. When the process finishes (MLME Join Confirm callback), the interface is set up. We don't have an "interface up" event yet though.

For instance:

netif_init()
{

    /* It's assumed the `init` function of the device was called on startup. The device IRQs are disabled. */
    if (!mac_init()) {
        return -EINVAL;
    }
    dev->event_cb = _cb;
    return dev->driver->on(dev); /* Turn on the device. It's safe to receive interrupts from here. */
}


netif_up() /* called on `ifconfig <if_id> up` */
{
    return mlme_join(mac, join_ctx); /* This will generate a Confirm primitive */
}



mlme_join_cb(status) /* called when the mlme_join request finishes */
{
    if (status == OK) {
        netif_generate_link_up_event();
    }
}

@miri64
Copy link
Member Author

miri64 commented Jun 9, 2020

So the hack as is can be merged and we move this over to the new API, once its there :-).

@MichelRottleuthner
Copy link
Contributor

Thanks @jia200x for providing the details!

So the hack as is can be merged and we move this over to the new API, once its there :-).

Agree

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: network Area: Networking Area: pkg Area: External package ports CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Platform: native Platform: This PR/issue effects the native platform Reviewed: 1-fundamentals The fundamentals of the PR were reviewed according to the maintainer guidelines Reviewed: 2-code-design The code design of the PR was reviewed according to the maintainer guidelines Reviewed: 3-testing The PR was tested according to the maintainer guidelines Reviewed: 4-code-style The adherence to coding conventions by the PR were reviewed according to the maintainer guidelines Reviewed: 5-documentation The documentation details of the PR were reviewed according to the maintainer guidelines Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants