This repository was archived by the owner on Feb 16, 2023. It is now read-only.

Conversation

@cspiel1 (Collaborator) commented Mar 30, 2022

This is a rework of the adaptive jitter buffer algorithm in re/src/jbuf.
Discussion: baresip/re#184
Relates to baresip/baresip#1784.

For G.711 the jbuf can be dropped (jitter_buffer_type off), which means that each RTP packet is passed immediately to the decoder. The aubuf handles re-ordered audio frames.

For stateful codecs like G.722 and Opus the jbuf has a new option (jitter_buffer_type minimize). In this case the jbuf still has to handle re-ordered packets, but the minimize mode ensures that the jbuf latency is adjusted according to the frequency of "packet too late" occurrences.

The jitter_buffer_type adaptive becomes obsolete, which simplifies re/src/jbuf/jbuf.c and baresip/src/audio.c.
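For illustration, the choice might look like this in the baresip config (a hedged sketch: the jitter_buffer_type option already exists in baresip, but the values shown here follow this description and the final syntax may differ):

# off:      no jbuf at all, e.g. for G.711 (the aubuf handles re-ordering)
# minimize: jbuf still re-orders packets, but keeps its latency minimal
jitter_buffer_type      minimize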

  • aubuf: compute input jitter and increment/decrease buffered samples
  • aubuf: add aubuf mode selection
  • aubuf: add documentation for adaptive jitter buffer
  • docs: add .gitignore to docs/aubuf
  • aubuf: fixes
  • docs/aubuf: add a script for plot generation
  • docs/aubuf: activate new jbuf type "minimize" for plots
  • aubuf: reset debug level

This image shows how, starting with an aubuf min size of 60 ms, the algorithm reduces the delay because of the low jitter.
[figure: aubuf1]

This image shows how the aubuf is adjusted to the computed jitter. The network jitter simulation is activated at second 8 and disabled at second 27.
[figure: aubuf2]
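For reference, the interarrival jitter that drives such an adjustment is typically estimated along the lines of RFC 3550; a minimal sketch (illustrative only, not necessarily the exact aubuf code in this PR):

#include <stdbool.h>
#include <stdint.h>

/* Interarrival jitter estimate in the spirit of RFC 3550 (section 6.4.1):
 * d is the change in relative transit time between consecutive packets;
 * the estimate is smoothed with gain 1/16 and kept scaled by 16 so that
 * integer arithmetic retains precision. */
struct jitter_est {
	int64_t transit;   /* last relative transit time (arrival - RTP ts) */
	int64_t jitter16;  /* jitter estimate, scaled by 16 */
	bool init;
};

static void jitter_update(struct jitter_est *je,
                          int64_t arrival_ts, int64_t rtp_ts)
{
	int64_t transit = arrival_ts - rtp_ts;
	int64_t d;

	if (!je->init) {
		je->transit = transit;
		je->init = true;
		return;
	}

	d = transit - je->transit;
	je->transit = transit;
	if (d < 0)
		d = -d;

	/* J += (|d| - J)/16, carried out on the 16x-scaled estimate */
	je->jitter16 += d - ((je->jitter16 + 8) >> 4);
}

/* current jitter in timestamp units: je->jitter16 >> 4 */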

@@ -0,0 +1,53 @@
#audio_path /usr/local/share/baresip
Member

I think we should keep the baresip-related config examples in the baresip repo and reference the ajb docs if necessary.

Collaborator Author

This is not meant as documentation. It is the baresip config for generate_plots.sh. Maybe it is misplaced.

@cspiel1 (Collaborator Author) commented Mar 31, 2022

please restart the actions!

* Note: faster than auframe_level for s16le
* - no sqrt
* - no logarithm
* - early loop exit for non-silence
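For context, an early-exit silence check for s16le samples could look roughly like the sketch below (function name and threshold handling are illustrative, not the PR's actual code):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: integer comparisons, no sqrt/log,
 * and an early exit as soon as a non-silent sample is seen. */
static bool frame_is_silence(const int16_t *sampv, size_t sampc,
                             int16_t thresh)
{
	for (size_t i = 0; i < sampc; i++) {
		if (sampv[i] > thresh || sampv[i] < -thresh)
			return false;  /* non-silence found, stop early */
	}

	return true;
}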
@sreimers (Member) Apr 4, 2022
I wonder if auframe_level performance is really a problem? In my quick measurements it's nearly the same (and we can reuse the level for the vumeter etc.). Silent tracks are more expensive, since the comparisons cost much more than additions.

Collaborator Author

We do not need the vumeter on our products, and the silence detection already works. We don't want 640 sqrt and logarithm operations for each frame.

Member

The sqrt is only done once per frame sum, not per sample.
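To make the cost model concrete, here is a simplified sketch of an RMS level computation of this shape (an illustration, not the actual auframe_level()/aulevel implementation): the per-sample work is one multiply-add, while sqrt() and log10() run once per frame.

#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified sketch only. The loop does one multiply-add per sample;
 * sqrt() and log10() are executed exactly once per frame. */
static double frame_level_dbov(const int16_t *sampv, size_t sampc)
{
	double sum = 0.0;

	for (size_t i = 0; i < sampc; i++)
		sum += (double)sampv[i] * (double)sampv[i];

	if (!sampc || sum <= 0.0)
		return -96.0;  /* silence floor for empty/zero frames */

	/* RMS relative to full scale (dBov): one sqrt + one log10 */
	return 20.0 * log10(sqrt(sum / (double)sampc) / 32768.0);
}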

@cspiel1 (Collaborator Author) Apr 4, 2022

Okay, silence detection is only needed in AJB_LOW and AJB_HIGH situations. But it is not the same as computing a volume in dB. This might justify a separate function auframe_silence(). The note can be removed; it is meant only for the review.

@cspiel1 (Collaborator Author) Apr 4, 2022

Ah, yes, you're right! The logarithm as well. So the only remaining O(n) differences are:

  • double multiplications instead of int
  • double additions instead of int
  • and the early exit with return false, which is not possible for auframe_level().

On one product we have very old hardware (a Blackfin CPU). I will check how big the impact would be if we replace the silence detection with the auframe_level() computation.

@sreimers (Member) Apr 4, 2022

Yes, the double multiplications have much more impact. I will prepare a patch (I don't think we need this). Here are some results:

auframe_silence 3468 usecs (2000)
auframe_silence2 665 usecs (2000)
auframe_silence 3572 usecs (2000)
auframe_silence2 645 usecs (2000)

https://gist.github.com/sreimers/67440f237cae34537c9c1667900a7c57

On one product we have very old hardware (a Blackfin CPU).

Maybe we can use a square root alternative:

#include <string.h>

static float square_root(float x)
{
	unsigned int i;

	/* reinterpret the float's bits (memcpy avoids strict-aliasing UB) */
	memcpy(&i, &x, sizeof(i));

	/* add the exponent bias, then halve: approximates sqrt() via the
	   logarithmic layout of IEEE-754 floats */
	i += 127u << 23;
	i >>= 1;

	memcpy(&x, &i, sizeof(x));

	return x;
}

On modern hardware I see no difference with sqrt.

@cspiel1 (Collaborator Author) Apr 4, 2022

Thanks, an improvement here would be welcome! If the performance loss is only small, then I would remove the extra silence detection.

Edit:

  • ARM would be interesting
  • and maybe one of my colleagues will try this on our Blackfin platform

Collaborator Author

@sreimers Do you have a sample program for these benchmarks?

@sreimers (Member) Apr 4, 2022

Collaborator Author

thanks!

@cspiel1 (Collaborator Author) commented Apr 4, 2022

TODO: do some tests to find a good silence boundary

edit: this was meant for me to do. ;-)

@cspiel1 cspiel1 marked this pull request as ready for review April 5, 2022 05:41
@cspiel1 cspiel1 marked this pull request as draft April 5, 2022 05:58
@cspiel1 (Collaborator Author) commented Apr 5, 2022

The level of -70 dB is much too low. The boundary for "silence" depends very much on the quality of the remote microphone. What we actually want is speech detection. At the very least, a silence detection would need to find local/global min and max values. This would have the drawback that we

  • need to compute the aulevel for each frame (not only if ajb->as != AJB_GOOD),
  • and this adds more variables and code complexity to ajb.c.

Additionally, the AJB_LOW and AJB_HIGH switches should not be postponed for too long. If there is no silence for a while, the switch of the aubuf latency has to be forced anyway. Thus the benefit of silence detection in ajb.c is too low.
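For illustration of the extra complexity, a rough sketch of the kind of min/max level tracking an adaptive silence threshold would need (hypothetical, deliberately not part of this PR):

#include <stdbool.h>

/* Hypothetical sketch only: track a slowly adapting noise floor and
 * speech peak per stream and call a frame "silent" when its level sits
 * close to the floor. Note that this already requires a level value for
 * every frame, which is exactly the first drawback listed above. */
struct level_track {
	double min_lvl;  /* noise-floor estimate (dB), rises slowly */
	double max_lvl;  /* speech-peak estimate (dB), falls slowly */
	bool init;
};

static bool is_silence_adaptive(struct level_track *lt, double lvl_db)
{
	if (!lt->init) {
		lt->min_lvl = lt->max_lvl = lvl_db;
		lt->init = true;
		return true;
	}

	/* follow new extremes immediately, decay slowly otherwise */
	lt->min_lvl = (lvl_db < lt->min_lvl) ? lvl_db : lt->min_lvl + 0.05;
	lt->max_lvl = (lvl_db > lt->max_lvl) ? lvl_db : lt->max_lvl - 0.05;

	/* silent if closer to the noise floor than to the speech peak */
	return lvl_db < lt->min_lvl + 0.25 * (lt->max_lvl - lt->min_lvl);
}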

@sreimers: Sorry for the extra work yesterday! But at least auframe_level is useful for the vumeter.

A fixed boundary for silence is not possible. It depends on the noise floor of the remote microphone.
@cspiel1 cspiel1 marked this pull request as ready for review April 5, 2022 06:25
@sreimers (Member) commented Apr 5, 2022

In my use case customers have high-quality mics/headsets, and the computation of auframe_level is needed for each call anyway (so the customer can see the remote level). Can we make this configurable? I would like to avoid dropping frames that contain audio, which leads to clicking/audible artifacts. I can do some research with good and bad audio material to find a better value.

@sreimers (Member) commented Apr 5, 2022

Thanks, looks good now. Should we merge all PRs?

@sreimers sreimers merged commit 5e1104d into baresip:main Apr 5, 2022
@cspiel1 cspiel1 deleted the aubuf_adaptive_jitter_buffer branch April 5, 2022 12:11