Skip to content

Conversation

jnohlgard
Copy link
Member

@jnohlgard jnohlgard commented Sep 29, 2015

Depends on #4183

Set XTIMER_HZ in periph_conf.h to specify the xtimer hardware timer
frequency.

The xtimer system uses the hardware timer's own tick representation
internally but a thin wrapper layer is added as a shim between xtimer
core and the public xtimer API which is based around microseconds as
the time unit.

Initially supports only 32768 Hz, and 1000000 Hz (and other power-of-two multiples of 15625), but can easily be expanded by adding a new case for XTIMER_HZ inside xtimer.h

Note: Using XTIMER_HZ = 1000000 will not add any overhead as the conversion shim will be simply an inlined straight passthrough.

TODO:

  • Solve how to handle time t > 16383999997 for XTIMER_HZ == 32768
  • Add settings for cc2538 (XTIMER_HZ 16000000)
  • Test on multiple platforms and with non-standard XTIMER_HZ settings

Test subjects of particular interest:

@jnohlgard jnohlgard added Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation Discussion: RFC The issue/PR is used as a discussion starting point about the item of the issue/PR Impact: major The PR changes a significant part of the code base. It should be reviewed carefully Area: timers Area: timer subsystems labels Sep 29, 2015
@jnohlgard
Copy link
Member Author

(Renamed the include guard because it was so close to XTIMER_HZ in name)

@jnohlgard
Copy link
Member Author

I will create a separate PR which makes use of this on Kinetis to run xtimer on the LPTMR hardware instead of PIT.

* slow, and uses an additional 200 bytes of ROM compared to the helper
* function required for multiplying two 32 bit integers into a 64 bit result.
*/
return ((uint64_t)(us << 9) * 0x431bde83ul) >> (12 + 32);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

((uint64_t)(us) * 0x431bde83ul) >> (12 + 32 - 9) should be equivalent, but allow the division for all values of us.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? This kind of optimization is new to me, so I don't know.

I am trying to reduce the rounding/truncation in the least significant bits by performing the multiplication first and then down-shifting after the division.
Compare a = (x * y) / z and a = (x / z) * y (using integer division)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, x * (whatever) / x == x/x * (whatever) == whatever. So (us * 2^9 * X) / (2^44) == (2^9 * (us * X)) / (2^35)*(2^9) == (2^9)/2^9) * ((us * X) / 2^35) == (us * X) / 2^35.

Left-shifting first might cut some bits off if us > 2^(32-9).

(I need more coffee).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worrying about the big scary modular arithmetic magic constant would mess up the division if didn't shift first to get the proper fixed-point number.
I will update the PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait a minute, I might have something simpler...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sitting on pins and needles. What's your simpler approach?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see #4002.

It is simpler, but it will execute the conversion macro whenever _xtimer_now() is called, e.g., a lot when spinning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Executing the conversion as sparsely as possible was the main reason I chose to put it inside the public functions and make xtimer use ticks internally.

Also, I wanted to keep the API design around microseconds to avoid conversion macros inside the calls similar to what the hwtimer API used to require before. In my opinion it is good to have a standard unit defined for the timer delays, especially when writing portable code between platforms.

Both the macro approach and the static inline function approach lets the compiler optimize away unnecessary conversions by precomputing the result when using constants for offsets (useful in device drivers which require delays which are specified in the data sheet of the device).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Executing the conversion as sparsely as possible was the main reason I chose to put it inside the public functions and make xtimer use ticks internally.

The old vtimer used ticks internally and converted in the public functions. I specifically designed xtimer to avoid that, as it IMHO made the whole thing a lot more complex.

Not sure what is the way to go here.

As I usually like my own solutions better, I probably can't be objective here, but I prefer doing the conversion at the latest point for several reasons:

  • the xtimer code only deals with the conversions at two (very low-level) points
  • all assumptions on when a timer goes into the overflow list, the long-timer list etc. stay the same across all timers
  • the values for target and long_target in xtimer_t can be used without conversion
  • it has slightly less code size

The downside is that the *_spin() functions are less efficient and less accurate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, at some point we'll probably use a low-power timer as fallback, then we'd have to deal with three different ticks. Having microseconds as common base seems easier.

inline static uint32_t _xtimer_ticks_to_us(uint32_t ticks) {
/* Using 64 bit multiplication to avoid truncating the top 9 bits */
uint64_t us = (uint64_t)ticks * 15625ul;
return us >> 9;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For _xtimer_us_to_ticks() you commented, what the actual meaning of the calculation is. Can you please do so here, too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right-o, will update

@jnohlgard
Copy link
Member Author

Rebased after #3998 was merged

while (_xtimer_now() > value);
while (_xtimer_now() < value);
while (_xtimer_now_ticks() > value);
while (_xtimer_now_ticks() < value);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to self: this should be changed to _lltimer_now_ticks

@kaspar030
Copy link
Contributor

If I define XTIMER_HZ to 32768 in cpu/native/include/periph_conf.h, examples/timer_wakeup_periodic breaks (works correctly for some a second and then loops). But that is probably a matter of correct tuning values.

@jnohlgard
Copy link
Member Author

@kaspar030 did it run really fast?
I did not yet update the timer_init call in xtimer, so that might be one problem. I have a PR in the works for changing the meaning of the middle argument to timer_init from ticks_per_us to ticks_per_sec to handle non-"integer MHz" timers

@kaspar030
Copy link
Contributor

did it run really fast?
(well, I just used the 1mhz timer as 32khz timer, so I was expecting it to be ~30 times as fast)

It went as expected for maybe 10 too fast seconds, then looped. That happens if one xtimer_usleep_until() call returns too fast.

I have a PR in the works for changing the meaning of the middle argument to timer_init from ticks_per_us to ticks_per_sec to handle non-"integer MHz" timers

nice!

@jnohlgard
Copy link
Member Author

@kaspar030 could it be the printf's inside the test taking too much time when trying to run the main test loop at 30x speed?

@kaspar030
Copy link
Contributor

@gebart I don't think so, as this is on native, so prints should be very fast. Also #4002 works fine at the same speed.

@jnohlgard
Copy link
Member Author

I'll look into it.

@jnohlgard
Copy link
Member Author

@kaspar030

Executing the conversion as sparsely as possible was the main reason I chose to put it inside the public functions and make xtimer use ticks internally.

The old vtimer used ticks internally and converted in the public functions. I specifically designed xtimer to avoid that, as it IMHO made the whole thing a lot more complex.

Not sure what is the way to go here.

As I usually like my own solutions better, I probably can't be objective here, but I prefer doing the conversion at the latest point for several reasons:

  • the xtimer code only deals with the conversions at two (very low-level) points
  • all assumptions on when a timer goes into the overflow list, the long-timer list etc. stay the same across all timers
  • the values for target and long_target in xtimer_t can be used without conversion
  • it has slightly less code size

The downside is that the *_spin() functions are less efficient and less accurate.

You make valid points, I agree it is better to have conversions happen at the fewest possible places (fewer points of failure). The alternative method looks smaller in the example PR, but it will still need a bunch of conversions distributed across all users of the xtimer API, instead of encapsulated inside the xtimer code.

While the assumptions on which point in time (at x microseconds) the timer goes into overflow remains the same, that won't happen on the same place in the hardware counter of the different hardware timers, I think makes more sense to refer to timer tick 0xffff (or 0xffffffff or whatever depending on resolution) as the last tick of the short period rather than trying to shoehorn a timer into overflowing at 65.535 ms or 16.777216 seconds or 4300-ish seconds. Debugging timers is always hard because you always affect them when trying to observe them because of the timing involved, but I think using the native timer overflow as the basis of the "long term" is the way to go here.

Regarding the values in target and long_target, I don't know, it is not a very common use case (from looking through the RIOT tree) but maybe we should provide translation functions (some kind of getter) for reading the target timer values from a struct xtimer.

The code size can probably be reduced in this PR too, I wanted something that works and is stable before messing around with optimizations.

A big difference between xtimer and vtimer is that xtimer has a better separation between xtimer public API and xtimer core. In my opinion it makes more sense to let xtimer core work with timer values which are native to the timer hardware. xtimer is the public API which should be consistent between platforms, it is quite easy to forget to use XTIMER_USEC_TO_TICKS() on some call (and easy to miss when reviewing).

@jnohlgard
Copy link
Member Author

something is seriously messed up with xtimer_usleep_until in the current state of this PR. I will update the PR later.

@jnohlgard jnohlgard added the State: WIP State: The PR is still work-in-progress and its code is not in its final presentable form yet label Oct 3, 2015
@jnohlgard jnohlgard added the State: waiting for other PR State: The PR requires another PR to be merged first label Oct 3, 2015
@jnohlgard
Copy link
Member Author

Waiting for #4040

@jnohlgard
Copy link
Member Author

  • Rebased
  • Updated to get xtimer_usleep_until working properly on native again
  • Eliminate XTIMER_SHIFT
  • Added warning for XTIMER_BACKOFF < XTIMER_OVERHEAD which causes timer underrun for small offsets
  • Added some statistics in tests/xtimer_usleep_until

@jnohlgard
Copy link
Member Author

This PR is getting out of hand again. I will try to fork off the general improvements

@jnohlgard jnohlgard removed the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Mar 19, 2016
@OlegHahm
Copy link
Member

Won't make it until the release, right?

@jnohlgard
Copy link
Member Author

Won't make it. Will likely result in a replacement PR later.
On Mar 23, 2016 11:59 PM, "Oleg Hahm" notifications@github.com wrote:

Won't make it until the release, right?


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#3990 (comment)

@OlegHahm OlegHahm modified the milestones: Release 2016.07, Release 2016.04 Mar 24, 2016
@jnohlgard jnohlgard removed this from the Release 2016.07 milestone Jul 12, 2016
@jnohlgard jnohlgard removed the State: waiting for other PR State: The PR requires another PR to be merged first label Jul 12, 2016
@jnohlgard
Copy link
Member Author

obsoleted in favour of #5608.

@jnohlgard jnohlgard closed this Jul 12, 2016
@jnohlgard jnohlgard deleted the pr/xtimer-32768hz branch March 30, 2018 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: timers Area: timer subsystems Discussion: RFC The issue/PR is used as a discussion starting point about the item of the issue/PR Impact: major The PR changes a significant part of the code base. It should be reviewed carefully State: WIP State: The PR is still work-in-progress and its code is not in its final presentable form yet Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants