CELP Speech Codecs
Most modern speech codecs are based on the Codebook Excited Linear Predictive (CELP) codec model [49], in which speech is synthesized according to a model of human speech production. Synthesis in the decoder proceeds in three steps:
1. A fixed code book generates an innovation. This simulates the air flow created by the lungs.
2. The innovation is passed through a pitch predictor, often also called the adaptive code book. This creates pitch pulses, much as the vocal cords create pulses in the air flow.
3. The excitation after the pitch predictor is passed through an LPC predictor. The LPC predictor shapes the signal, much as the vocal tract shapes the air flow.
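The three synthesis steps can be illustrated with a minimal sketch. It is not taken from any standardized codec: the function name `synthesize`, the single-tap pitch predictor, the sign convention of the LPC filter and the absence of state carried between frames are all simplifying assumptions made for illustration.

```python
def synthesize(innovation, pitch_lag, pitch_gain, lpc_coeffs):
    """Toy CELP decoder: innovation -> pitch predictor -> LPC synthesis.

    All parameter names are hypothetical; real codecs also carry filter
    state across frames and apply separate code book gains.
    """
    # Step 1 is the caller's job: `innovation` is the fixed code book vector.

    # Step 2: single-tap pitch predictor, e[n] = c[n] + g * e[n - lag]
    excitation = []
    for n, c in enumerate(innovation):
        past = excitation[n - pitch_lag] if n >= pitch_lag else 0.0
        excitation.append(c + pitch_gain * past)

    # Step 3: LPC synthesis filter, s[n] = e[n] + sum_k a[k] * s[n - k]
    # (assumed sign convention; some texts use a minus sign instead)
    speech = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc_coeffs, start=1):
            if n >= k:
                acc += a * speech[n - k]
        speech.append(acc)
    return speech
```

Feeding a single pulse through the sketch with a pitch lag of 2 samples shows the pulse repeating every 2 samples with decaying amplitude, which is the pitch pulse behaviour described in step 2.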
CELP codecs typically use block-based encoding, usually with frames of 20 ms. Most CELP codecs also use the analysis-by-synthesis method in the encoder to determine the innovation and excitation parameters [174]. In its purest form, analysis-by-synthesis is very complex, since the encoder essentially runs the decoder for every possible parameter combination and selects the combination that minimizes the error between the original signal and the synthesized signal. Preselection and structured search methods are therefore often used to reduce the complexity to a more manageable level.
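The exhaustive form of analysis-by-synthesis can be sketched as follows. This is only an illustration under stated assumptions: the names `analysis_by_synthesis` and `toy_decode`, the tiny code book and the plain squared-error measure are hypothetical and not part of any codec specification.

```python
from itertools import product

def analysis_by_synthesis(target, codebook, gains, decode):
    """Run the decoder for every (vector, gain) pair and keep the pair
    with the smallest squared error against the target signal."""
    best = None
    for (index, vector), gain in product(enumerate(codebook), gains):
        synthesized = decode(vector, gain)
        error = sum((t - s) ** 2 for t, s in zip(target, synthesized))
        if best is None or error < best[0]:
            best = (error, index, gain)
    return best

# Hypothetical stand-in for a decoder: scales the code book vector.
def toy_decode(vector, gain):
    return [gain * v for v in vector]

codebook = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 0, -1, 0]]
gains = [0.5, 1.0, 2.0]
error, index, gain = analysis_by_synthesis(
    [2, 0, -2, 0], codebook, gains, toy_decode
)
```

The number of decoder runs is the product of the sizes of all parameter sets (here 3 × 3 = 9), which is why the exhaustive search becomes impractical for realistic code book sizes.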
A multitude of CELP speech codecs have been developed in recent decades. Several have been standardized by ITU-T, for example the G.729 Conjugate Structure Algebraic CELP (CS-ACELP) codec [99]. Several variants of the G.729 codec are also available in annexes. Other speech codecs have been designed specifically for cellular systems and standardized by ETSI, 3GPP and TIA/EIA, for example GSM-EFR [10], AMR [4], AMR-WB [14] and TDMA-EFR [181].
Of these codecs, AMR and AMR-WB are of particular interest for mobile IMS voice services, since they were originally designed for cellular systems and were also designed to work well under a wide range of varying channel conditions. These codecs have the following advantages over the alternatives:
• The quality is very good. The highest codec modes of AMR and AMR-WB give a quality for narrowband and wideband services respectively that is not exceeded by any other codec under the same operating conditions and given the same bit rate requirements.
• Both AMR and AMR-WB have several codec modes. The AMR codec includes eight codec modes ranging from 4.75 kbps to 12.2 kbps (see Table 5.1) and the AMR-WB codec has nine codec modes ranging from 6.60 kbps to 23.85 kbps (see Table 5.2). This allows for adapting the codec rate to different network loads and channel conditions. It also allows for developing several service variants that can be differentiated by quality; see Chapter 2.
• The complexity is reasonable. One of the fundamental requirements in the selection of these codecs was that the complexity must be manageable and should not increase the processing requirements in the mobile phones too much.
• These codecs, especially AMR, are today available in most GSM phones. Since there are over 2 billion GSM customers in the world in more than 210 countries [83], this gives excellent opportunities to maximize the quality by using Tandem-Free Operation (TFO) between IMS services and traditional circuit switched services; see also Section 5.7.1.
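To make the bit rates concrete, the number of speech bits in one frame follows directly from the codec mode rate and the 20 ms frame length mentioned earlier. The sketch below is a simple calculation; the constant and function names are chosen here for illustration only.

```python
FRAME_MS = 20  # frame length in milliseconds, as stated in the text

# Codec mode bit rates in kbps, as listed in Tables 5.1 and 5.2
AMR_MODES_KBPS = [4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, 12.2]
AMR_WB_MODES_KBPS = [6.60, 8.85, 12.65, 14.25, 15.85,
                     18.25, 19.85, 23.05, 23.85]

def bits_per_frame(rate_kbps, frame_ms=FRAME_MS):
    """Speech bits in one frame: kbit/s multiplied by ms gives bits."""
    return round(rate_kbps * frame_ms)

amr_frame_bits = {rate: bits_per_frame(rate) for rate in AMR_MODES_KBPS}
amr_wb_frame_bits = {rate: bits_per_frame(rate) for rate in AMR_WB_MODES_KBPS}
```

For example, AMR 12.2 carries 244 speech bits per frame while AMR 4.75 carries 95, so the lowest mode needs well under half the payload of the highest.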
As described above, the AMR and AMR-WB codecs can operate at a number of codec modes. The possible codec modes and bit rates are shown in Tables 5.1 and 5.2.
Table 5.1: Codec modes for AMR.
| Codec mode [kbps] | Comment |
| AMR 12.2 | Same as GSM-EFR |
| AMR 10.2 | |
| AMR 7.95 | |
| AMR 7.4 | Same as TDMA-EFR |
| AMR 6.7 | Same as PDC-EFR |
| AMR 5.9 | |
| AMR 5.15 | Default codec mode for PoC |
| AMR 4.75 | |
Table 5.2: Codec modes for AMR-WB.
| Codec mode [kbps] |
| AMR-WB 23.85 |
| AMR-WB 23.05 |
| AMR-WB 19.85 |
| AMR-WB 18.25 |
| AMR-WB 15.85 |
| AMR-WB 14.25 |
| AMR-WB 12.65 |
| AMR-WB 8.85 |
| AMR-WB 6.60 |
For the circuit switched GERAN and UTRAN systems, the adaptive feature of the AMR and AMR-WB codecs allows for adapting the source coding and channel coding bit rates so that the service quality can be optimized for a variety of operating conditions. For good channel conditions, the best possible quality can be delivered by using the AMR 12.2 and a high rate AMR-WB mode respectively. For degraded channels, the bit rate used for source coding can be reduced, giving room for more channel coding.
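The trade-off between source coding and channel coding can be illustrated with a small calculation. The sketch assumes a fixed gross channel rate of 22.8 kbps, the gross rate of a GSM full-rate traffic channel; this figure and the function name are assumptions introduced here and do not come from the text above.

```python
GROSS_RATE_KBPS = 22.8  # assumed fixed gross channel rate (GSM full rate)

def channel_coding_share(codec_rate_kbps, gross_rate_kbps=GROSS_RATE_KBPS):
    """Fraction of the gross rate left over for channel coding
    once the source coding bit rate has been subtracted."""
    if codec_rate_kbps > gross_rate_kbps:
        raise ValueError("codec rate exceeds the gross channel rate")
    return (gross_rate_kbps - codec_rate_kbps) / gross_rate_kbps

share_high_mode = channel_coding_share(12.2)  # good channel conditions
share_low_mode = channel_coding_share(4.75)   # degraded channel conditions
```

Under this assumption, switching from AMR 12.2 to AMR 4.75 raises the share of the channel available for error protection from roughly 46% to roughly 79%.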
For HSPA, the adaptive feature can be used in a similar way. Lower codec modes allow for using smaller transport blocks, which gives room for using more channel coding for the transport blocks. For high system loads, it is also advantageous to reduce the bit rate. A lower bit rate helps in two ways:
1. A lower codec mode bit rate gives smaller packet sizes that can be transmitted with smaller transport blocks. Since the required transmission power is proportional to the transport block size, this means less interference, which allows more users into the cell.
2. Since a lower codec mode bit rate gives smaller packets, this can be used to encapsulate more packets into one transport block. This reduces the packet rate, which means that more TTIs will be available for other users.
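The second option can be sketched numerically. The transport block size of 320 bits used below is purely hypothetical, and the sketch ignores all header overhead (RTP, UDP, IP and lower layers), so the numbers only illustrate the principle.

```python
def frames_per_block(rate_kbps, block_bits, frame_ms=20):
    """How many speech frames fit into one transport block.

    `block_bits` is a hypothetical payload size; header overhead
    is ignored in this sketch.
    """
    frame_bits = round(rate_kbps * frame_ms)
    return block_bits // frame_bits

def packets_per_second(frames_in_packet, frame_ms=20):
    """Packet rate when several frames are sent in one packet."""
    return 1000 / (frame_ms * frames_in_packet)

BLOCK_BITS = 320  # hypothetical transport block payload
for rate in (12.2, 5.9, 4.75):
    n = frames_per_block(rate, BLOCK_BITS)
    print(f"AMR {rate}: {n} frame(s) per block, "
          f"{packets_per_second(n):.1f} packets/s")
```

With these assumed numbers, AMR 12.2 fits one frame per block (50 packets per second), whereas AMR 5.9 fits two frames per block and halves the packet rate, leaving every second transmission opportunity free for other users.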
The adaptive feature of AMR and AMR-WB is therefore important also for packet switched systems since it allows for making different trade-offs between capacity and quality for different system load levels.