Low signal level distorts MFCC/GFCC values #543

@dbogdanov

Ideally, we would expect identical MFCC coefficients, except for the 0th coefficient, at different input levels for the same input signal frame. However, when the input signal level is very low, the MFCC values get distorted.
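To spell out why only the 0th coefficient should change, here is a short derivation. It assumes an unnormalized DCT-II over $K$ mel band energies $E_k$ computed from the power spectrum, and an input gain $g > 0$; the symbols are introduced here for illustration and are not taken from the implementation.

$$
E_k' = g^2 E_k
\quad\Rightarrow\quad
10\log_{10} E_k' = 10\log_{10} E_k + 20\log_{10} g
$$

$$
c_n = \sum_{k=0}^{K-1} 10\log_{10}(E_k)\,\cos\!\left[\frac{\pi n}{K}\left(k + \tfrac{1}{2}\right)\right],
\qquad
\sum_{k=0}^{K-1} \cos\!\left[\frac{\pi n}{K}\left(k + \tfrac{1}{2}\right)\right] = 0 \ \text{ for } 1 \le n \le K-1
$$

$$
c_0' = c_0 + 20K\log_{10} g,
\qquad
c_n' = c_n \ \text{ for } 1 \le n \le K-1
$$

The constant offset added to every log-energy only reaches $c_0$, as long as no band is modified by anything other than the gain.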

A low signal level leads to small spectrum values. Using the power spectrum to compute mel bands reduces these values further. When taking the log to compute log-energies, we apply thresholding to truncate very silent bands (currently we truncate at -90dB).

For some signal frames, some bands are truncated because they fall below the threshold, while others are not. This leads to MFCC values that differ from the expected ones.

When all band values are truncated, the resulting MFCC vector contains zeros except for the 0th coefficient, which receives its minimum (most negative) value. Avoiding the distortion by lowering the silence threshold comes at the cost of more frames containing non-zero MFCC vectors. The appropriate threshold may depend on the application.
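The effect can be reproduced with a toy example. This is a minimal sketch, not the library's code: the band energies in `BANDS`, the helper names, and the unnormalized DCT-II are all assumptions made for illustration, and only the -90dB floor comes from the description above. It attenuates the same frame by three gains and compares coefficients 1..N with the unattenuated reference.

```python
import math

FLOOR_DB = -90.0  # truncation level mentioned in this issue


def log_energies_db(band_energies, floor_db=FLOOR_DB):
    """Convert band energies to dB and truncate everything below floor_db."""
    result = []
    for energy in band_energies:
        level = 10.0 * math.log10(energy) if energy > 0.0 else float("-inf")
        result.append(max(level, floor_db))
    return result


def dct_ii(values, n_coeffs):
    """Unnormalized DCT-II, returning the first n_coeffs coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * i * (k + 0.5) / n) for k, v in enumerate(values))
        for i in range(n_coeffs)
    ]


def toy_mfcc(band_energies, floor_db=FLOOR_DB, n_coeffs=4):
    """Toy MFCC: truncated log-energies followed by a DCT (no mel filterbank)."""
    return dct_ii(log_energies_db(band_energies, floor_db), n_coeffs)


def scale(band_energies, gain_db):
    """Apply a gain (in dB) to band energies, i.e. to power values."""
    factor = 10.0 ** (gain_db / 10.0)
    return [e * factor for e in band_energies]


def max_deviation(a, b):
    """Largest absolute difference over coefficients 1..N (0th excluded)."""
    return max(abs(x - y) for x, y in zip(a[1:], b[1:]))


# Hypothetical band energies spanning -20dB to -70dB.
BANDS = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7]
REFERENCE = toy_mfcc(BANDS)

for gain_db in (-10.0, -30.0, -80.0):
    coeffs = toy_mfcc(scale(BANDS, gain_db))
    print(gain_db, round(max_deviation(coeffs, REFERENCE), 3),
          [round(c, 3) for c in coeffs])
```

With these made-up values, a -10dB gain keeps every band above the floor, -30dB pushes only the quietest band below it (the partial-truncation case), and -80dB truncates all bands.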

Possible solutions:

  • disable truncation when computing the dbamp/dbpow log in MFCC/GFCC
  • lower the silence threshold (to -180dB) for MFCC/GFCC
  • truncate bands at -90dB if the magnitude spectrum was used, and at -180dB if the power spectrum was used
  • implement a silenceThreshold parameter
  • print a warning on every truncated frame
  • leave as is, but note this issue in the documentation (it will affect some tasks)
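One possible reading of the third and fourth options, sketched under stated assumptions: the floor is applied to the linear band value before the log, and the same `10 * log10` conversion is used for both spectrum types. The function `band_log_energy` and its parameters `spectrum_type` and `silence_threshold_db` are hypothetical names, not an existing API. The point is that squaring a magnitude doubles its distance from 0dB, so a floor that is reasonable for magnitudes is reached much earlier by power values.

```python
import math


def band_log_energy(band_value, spectrum_type="power", silence_threshold_db=None):
    """Return 10*log10(band_value), floored at a configurable silence threshold.

    Hypothetical helper: when no threshold is given, the default follows the
    spectrum type (-90dB for magnitude, -180dB for power).
    """
    if silence_threshold_db is None:
        silence_threshold_db = -180.0 if spectrum_type == "power" else -90.0
    # Convert the dB threshold into a linear floor and clamp before the log.
    cutoff = 10.0 ** (silence_threshold_db / 10.0)
    return 10.0 * math.log10(max(band_value, cutoff))


magnitude = 1e-6          # hypothetical band magnitude
power = magnitude ** 2    # the same band measured on the power spectrum

print(band_log_energy(magnitude, "magnitude"))                  # above the -90dB floor
print(band_log_energy(power, "power", silence_threshold_db=-90.0))  # clamped by a fixed -90dB floor
print(band_log_energy(power, "power"))                          # preserved with the -180dB default
```

Making the threshold an explicit parameter would let each application choose its own trade-off between distortion on quiet frames and the number of frames reported as silent.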
