aubuf adaptive jitter buffer #33
@@ -0,0 +1,53 @@
#audio_path /usr/local/share/baresip
I think we should keep the baresip-related config examples in the baresip repo and refer to the ajb docs if necessary.
This is not meant as documentation. It is the baresip config for generate_plots.sh. Maybe it is misplaced.
Please restart the actions!
The timestamp of the given audio frame is stored to avoid a jump of the jitter computation in ajb.c.
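To illustrate why a stored reference timestamp matters, here is a minimal sketch of a smoothed jitter estimator. It is not the code in ajb.c: the struct, the function name `jitter_update` and the 1/16 smoothing gain (borrowed from RFC 3550) are assumptions made for this example. The point is only that the first frame stores a reference and contributes nothing, so the estimate does not jump.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical estimator state; not the layout used in ajb.c */
struct jitter_est {
	bool     started;  /* true once a reference frame was stored  */
	uint64_t ts_prev;  /* media timestamp of previous frame [us]  */
	uint64_t tr_prev;  /* arrival time of previous frame [us]     */
	int64_t  jitter;   /* smoothed jitter estimate [us]           */
};

/* Feed one frame: ts = media timestamp, tr = arrival time, both in us */
static void jitter_update(struct jitter_est *je, uint64_t ts, uint64_t tr)
{
	if (je->started) {
		/* deviation of the arrival spacing from the media spacing */
		int64_t d = (int64_t)(tr - je->tr_prev)
			  - (int64_t)(ts - je->ts_prev);

		if (d < 0)
			d = -d;

		/* exponential smoothing with an assumed gain of 1/16 */
		je->jitter += (d - je->jitter) / 16;
	}

	/* Store the reference for the next call. Without a stored
	 * reference, the first difference would be computed against
	 * zero and the estimate would jump. */
	je->ts_prev = ts;
	je->tr_prev = tr;
	je->started = true;
}
```

With 20 ms frames that arrive on time the estimate stays at zero; a frame arriving 16 ms late raises it by 1 ms.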
src/auframe/auframe.c
Outdated
 * Note: faster than auframe_level for s16le
 * - no sqrt
 * - no logarithm
 * - early loop exit for non-silence
I wonder if auframe_level performance is really a problem? In my quick measurements it's nearly the same (and we can reuse the level for vumeter etc.). Silence tracks are more expensive since comparison costs much more than addition.
We do not need the vumeter on our products and the silence detection works already. We don't want to have 640 sqrt and logarithm operations for each frame.
The sqrt is only done once per frame, on the sum, not per sample.
Okay, silence detection is only needed for AJB_LOW and AJB_HIGH situations. But it is not the same as computing a volume in dB. This might justify a separate function auframe_silence(). The note can be removed; it is meant only for the review.
Ah, yes, you're right! Logarithm also. Thus there are only the O(n)
- double multiplications instead of int
- double additions instead of int
- and the early exit with return false, which is not possible for auframe_level().
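The early exit mentioned above can be sketched as follows. This is only an illustration: `frame_is_silent` and its `thresh` parameter are hypothetical names, and the PR's actual silence detection may differ. The loop uses integer comparisons only and returns at the first sample above the threshold, which a level computation over the whole frame cannot do.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper; thresh is an assumed, caller-chosen amplitude */
static bool frame_is_silent(const int16_t *sampv, size_t sampc,
			    int16_t thresh)
{
	for (size_t i = 0; i < sampc; i++) {
		/* widen to 32 bit so that negating the bound cannot overflow */
		const int32_t s = sampv[i];

		if (s > thresh || s < -(int32_t)thresh)
			return false;  /* early exit on the first loud sample */
	}

	return true;
}
```

Passing the threshold as a parameter keeps the boundary adjustable by the caller.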
On one product we have very old hardware (Blackfin CPU). I will check how much the impact would be to replace the silence detection with the auframe_level() computation.
Yes, the double multiplications have much more impact. I will prepare a patch (I don't think we need this). Here are some results:
auframe_silence 3468 usecs (2000)
auframe_silence2 665 usecs (2000)
auframe_silence 3572 usecs (2000)
auframe_silence2 645 usecs (2000)
https://gist.github.com/sreimers/67440f237cae34537c9c1667900a7c57
> On one product we have very old hardware (Blackfin CPU).
Maybe we can use a square root alternative:
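    #include <stdint.h>
    #include <string.h>

    static float square_root(float x)
    {
    	uint32_t i;

    	/* copy the float bits; memcpy avoids strict-aliasing UB */
    	memcpy(&i, &x, sizeof(i));
    	/* adjust bias */
    	i += 127u << 23;
    	/* approximation of square root: halve the exponent */
    	i >>= 1;
    	memcpy(&x, &i, sizeof(x));
    	return x;
    }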
On modern hardware I see no difference with sqrt.
Thanks, improvement here would be welcome! If the performance loss is small, then I would remove the extra silence detection.
Edit:
- ARM would be interesting
- and maybe one of my colleagues will try this on our Blackfin platform
@sreimers Do you have a sample program for these benchmarks?
https://gist.github.com/sreimers/67440f237cae34537c9c1667900a7c57
And this patch: #38
Compiled with RELEASE=1
Thanks!
TODO: do some tests to find a good silence boundary. Edit: this was meant for me to do. ;-)
The level of
Additionally the @sreimers: Sorry for the extra work yesterday! But at least the
A fixed boundary for silence is not possible. It depends on the noise floor of the remote microphone.
In my use case customers have high-quality mics/headsets, and computation of auframe_level is needed anyway for each call (so the customer can see the remote level). Can we make this configurable? I would like to avoid dropping frames with audio, which leads to clicking and audible artifacts. I can do some research on bad and good audio material to find a better value.
Thanks, looks good now. Should we merge all PRs?
This is a rework of the adaptive jitter buffer algorithm in re/src/jbuf.

Discussion: baresip/re#184
Relates to baresip/baresip#1784.

For G.711 the jbuf can be dropped (jitter_buffer_type off), which means that each RTP packet is immediately passed to the decoder. The aubuf handles re-ordered audio frames.

For stateful codecs like G.722 and Opus, the jbuf has a new option (jitter_buffer_type minimize). In this case the jbuf still has to handle re-ordered packets, but the minimize mode ensures that the jbuf latency is adjusted to the frequency of "packet too late" occurrences.

The jitter_buffer_type adaptive becomes obsolete. This simplifies re/src/jbuf/jbuf.c and baresip/src/audio.c.

This image shows how, starting with an aubuf min size of 60 ms, the algorithm reduces the delay because of low jitter.

This image shows how the aubuf is adjusted to the computed jitter. The network jitter simulation is activated at second 8 and disabled at second 27.
