Voice
We have already discussed the adaptive multirate vocoder and wideband vocoder specified by 3GPP1. Because this is a speech synthesis codec, it conveniently gives us the ability to support speech recognition, which is also specified by 3GPP1. The better the accuracy of the speech recognition (the distance from user to user), the higher the value. Similarly, the better the voice quality (measured as a mean opinion score), the more user value we deliver, but the more it costs to deliver, because better quality requires a higher coding rate.
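The link between coding rate and delivery cost can be made concrete by counting the bits each speech frame consumes. The sketch below is illustrative only. The 20-ms frame length and the 4.75 and 12.2 kbps rates (the lowest and highest narrowband AMR modes) are assumptions brought in for the example, and the function name is hypothetical.

```python
def bits_per_frame(rate_bps: int, frame_ms: int = 20) -> int:
    """Return the number of coded bits carried by one speech frame.

    rate_bps -- codec bit rate in bits per second
    frame_ms -- frame duration in milliseconds (20 ms assumed here)
    """
    # Integer arithmetic avoids floating-point rounding on rates like 4750.
    return rate_bps * frame_ms // 1000


# Assumed example rates: lowest and highest narrowband AMR modes.
low_rate_bits = bits_per_frame(4750)    # 95 bits per frame
high_rate_bits = bits_per_frame(12200)  # 244 bits per frame

print(low_rate_bits, high_rate_bits)
```

Under these assumptions, the highest mode sends roughly two and a half times as many bits per frame as the lowest, which is the cost side of the quality trade-off.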
These audio codecs use a time domain to frequency domain transform (discrete cosine transform) to expose redundancy in the input signal (see Figure 7.1). We send filter coefficients that describe the spectral/harmonic (frequency domain) content of each 20-ms speech sample. MPEG-4 also has an audio coding standard, which includes a very low bit rate harmonic codec (2 to 4 kbps) and a codebook codec (4 to 24 kbps). The codebook codec stores waveform samples in the decoder. When the digital filter coefficients are received, the decoder fetches the closest-match waveform from its codebook, hence the need for good memory fetch management in these devices. The intention is that the MPEG-4 CELP (code-excited linear prediction) codec will be compatible with the AMR-W codec, which has a similar codec rate range.
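The closest-match lookup in a codebook codec can be pictured as a nearest-neighbour search over stored waveforms. The sketch below is a toy illustration, not the MPEG-4 or AMR procedure. The four-entry codebook, the four-sample waveforms, the squared-error distance, and the function names are all assumptions made for the example. In practice, CELP codecs typically run this search at the encoder and transmit the winning index, so the decoder only has to fetch the entry.

```python
def squared_error(a, b):
    """Sum of squared differences between two equal-length waveforms."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_match(codebook, target):
    """Return (index, waveform) of the codebook entry nearest to target."""
    best_index = min(
        range(len(codebook)),
        key=lambda i: squared_error(codebook[i], target),
    )
    return best_index, codebook[best_index]


# Hypothetical codebook: each entry is a short stored waveform segment.
CODEBOOK = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.5, -0.5, -1.0],
    [0.5, 1.0, 0.5, 0.0],
    [-1.0, -0.5, 0.5, 1.0],
]

# Hypothetical target segment to be approximated.
index, waveform = closest_match(CODEBOOK, [0.9, 0.6, -0.4, -0.8])
print(index, waveform)
```

A real codebook holds far more entries than this, which is why the number of memory fetches per frame becomes a design concern.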