This web site explains the creation and operation of the MELP and MELPe vocoders, summarizes the most up-to-date information about them, and provides useful resources and solutions related to the MELP (MIL-STD-3005) and MELPe (STANAG-4591) vocoders.

Introduction to MELP and MELPe Vocoder

Mixed-excitation linear prediction (MELP) is a United States Department of Defense (US DoD) speech coding standard used mainly in military applications, satellite communications, secure voice, and secure radio devices. Its standardization and later development were led and supported by the NSA and NATO.

History of MELP and MELPe Vocoders

Early MELP Vocoder

The initial MELP was invented by Alan McCree around 1995 [1] and standardized in 1997 as MIL-STD-3005.[2] It surpassed the other candidate vocoders in the US DoD competition, including:
(a) Frequency Selective Harmonic Coder (FSHC),
(b) Advanced Multi-Band Excitation (AMBE),
(c) Enhanced Multiband Excitation (EMBE),
(d) Sinusoid Transform Coder (STC),
(e) Subband LPC Coder (SBC), and
(f) Waveform Interpolative (WI) Coder.

MELP achieved better quality than the first five candidates and, thanks to its lower complexity than the WI coder, won the DoD competition and was selected for MIL-STD-3005.

US MIL-STD-3005, from MELP to MELPe vocoder

Between 1998 and 2001, SignalCom (later acquired by Microsoft), AT&T Corporation, and Compandent created a new MELP-based vocoder at half the rate (i.e. 1200 bit/s) and added substantial enhancements to MIL-STD-3005, which included:

(a) additional new vocoder at half the rate (i.e. 1200 bit/s),
(b) substantially improved encoding (analysis),
(c) substantially improved decoding (synthesis),
(d) Noise-Preprocessing for removing background noise,
(e) transcoding between the 2400 bit/s and 1200 bit/s bitstreams, and
(f) new postfilter.

This fairly significant development aimed to create a new coder at half the rate that is interoperable with the old MELP standard. This enhanced MELP (also known as MELPe) was adopted as the new MIL-STD-3005 in 2001 in the form of annexes and supplements to the original MIL-STD-3005, enabling the same quality as the old 2400 bit/s MELP at half the rate. One of the greatest advantages of the new 2400 bit/s MELPe is that it shares the same bit format as MELP, and hence can interoperate with legacy MELP systems while delivering better quality at both ends. MELPe provides much better quality than all older military standards, especially in noisy environments such as the battlefield, vehicles, and aircraft.

NATO STANAG-4591 MELPe vocoder

In 2002, following extensive competition and testing, the 2400 and 1200 bit/s US DoD MELPe was also adopted as a NATO standard, known as STANAG-4591.[3] As part of NATO's testing for the new standard, MELPe was tested against other candidates such as France's HSX (Harmonic Stochastic eXcitation) and Turkey's SB-LPC (Split-Band Linear Predictive Coding), as well as the old secure voice standards FS1015 LPC-10e (2.4 kbit/s), FS1016 CELP (4.8 kbit/s), and CVSD (16 kbit/s). MELPe subsequently won the NATO competition as well, surpassing the quality of all other candidates and of all the old secure voice standards (CVSD, CELP, and LPC-10e). The NATO competition concluded that MELPe substantially improved performance (in terms of speech quality, intelligibility, and noise immunity) while reducing throughput requirements. The NATO testing also included interoperability tests, used over 200 hours of speech data, and was conducted by 3 test laboratories worldwide. As part of MELPe-based projects performed for the NSA and NATO, Compandent Inc provided both organizations with a special test-bed platform known as the MELCODER device, which served as the golden reference for real-time implementation of MELPe. The low-cost FLEXI-232 Data Terminal Equipment (DTE) units made by Compandent, which are based on the MELCODER golden reference, are very popular and widely used for evaluating and testing MELPe in real time over various channels and networks and under field conditions.

The NATO STANAG-4591 MELPe competition's combined performance index is illustrated in the figure below.

[Figure: combined performance index of MELPe versus the other vocoders in the NATO STANAG-4591 competition]

In 2005, a new 600 bit/s MELPe variation by Thales Group (France) was added to the NATO standard STANAG-4591, without the extensive competition and testing performed for the 2400/1200 bit/s MELPe.[4] The following features were added for the 600 bit/s STANAG-4591:

(a) additional new vocoder at quarter the rate (i.e. 600 bit/s),

(b) transcoding between the 2400 bit/s and 600 bit/s bitstreams, and

(c) adjusted postfilter.

300 bit/s MELP Vocoder

In 2010, Lincoln Labs, Compandent, BBN, and General Dynamics also developed a 300 bit/s MELP device for DARPA.[5] Its quality is better than that of the 600 bit/s MELPe, but its delay is longer.

MELPe Vocoder Implementations

The MELPe has been implemented in many applications, including secure radio devices, satellite communications, VoIP, and cellphone applications. Such applications require additional expertise in combating channel errors, packet loss, and synchronization loss, which in turn requires an understanding of how sensitive the individual MELPe bits are to errors. The 2400 bit/s and 1200 bit/s MELPe include a synchronization bit, which is useful in serial communications.

MELPe Intellectual property rights

Note that MELPe (and/or its derivatives) is subject to IPR licensing from the following companies: Texas Instruments (2400 bit/s MELP algorithm / source code), Microsoft (1200 bit/s transcoder), Thales Group (600 bit/s rate), AT&T (Noise Pre-Processor, NPP), and Compandent.

About the MELPe Vocoder Algorithm

MELPe, the Enhanced Mixed Excitation Linear Prediction vocoder, known as military standard MIL-STD-3005 and NATO STANAG-4591, is a triple-rate, low-bit-rate coder that operates at 2400, 1200, and 600 bit/s. It improves on previous military standards including the earlier MIL-STD-3005 (MELP), FS1016 (CELP), FS1015 (LPC-10e), and CVSD.

About MELP VOCODER (MIL-STD-3005)

General

The Mixed Excitation Linear Prediction coder is based on the traditional Linear Prediction Coder (LPC) parametric model, but also includes five additional features: mixed excitation, aperiodic pulses, adaptive spectral enhancement, pulse dispersion, and Fourier magnitude modeling. These features are illustrated in the MELP decoder block diagram shown in the figure below. A MELP frame interval is 22.5 ms ± 0.01 percent in duration and contains 180 voice samples (8,000 samples/second).
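
The frame figures quoted above can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the helper name split_into_frames is hypothetical, and dropping a trailing partial frame is an assumption of the sketch rather than anything MIL-STD-3005 prescribes. It also converts the ± 0.01 percent tolerance into microseconds.

```python
SAMPLE_RATE_HZ = 8000      # from the text: 8,000 samples/second
FRAME_SAMPLES = 180        # from the text: 180 voice samples per frame
TOLERANCE_PERCENT = 0.01   # from the text: +/- 0.01 percent

# Frame interval in milliseconds and its tolerance in microseconds.
frame_ms = 1000.0 * FRAME_SAMPLES / SAMPLE_RATE_HZ
tolerance_us = frame_ms * 1000.0 * TOLERANCE_PERCENT / 100.0


def split_into_frames(samples):
    """Yield consecutive 180-sample frames.

    A trailing partial frame is dropped (an assumption of this sketch).
    """
    for start in range(0, len(samples) - FRAME_SAMPLES + 1, FRAME_SAMPLES):
        yield samples[start:start + FRAME_SAMPLES]


if __name__ == "__main__":
    one_second = [0] * SAMPLE_RATE_HZ
    frames = list(split_into_frames(one_second))
    print(f"frame interval: {frame_ms} ms, tolerance: {tolerance_us:.2f} us")
    print(f"whole frames in one second of speech: {len(frames)}")
```

One second of speech does not divide evenly into 22.5 ms frames, so a real-time implementation has to carry leftover samples into the next call rather than discard them.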

The mixed excitation is implemented using a multi-band mixing model. This model can simulate frequency-dependent voicing strength using an adaptive filtering structure implemented with a fixed filter bank. The primary effect of this mixed excitation is to reduce the buzz usually associated with LPC vocoders, especially in broadband acoustic noise.
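
The idea of frequency-dependent voicing can be sketched in a few lines of Python. This is not the standard's filter bank: for brevity the sketch mixes a pulse train and white noise per band in the DFT domain, and the five band edges, the function names, and the unnormalized noise level are all assumptions made for illustration.

```python
import cmath
import math
import random

FS_HZ = 8000
FRAME_SAMPLES = 180
# Assumed band edges in Hz (five bands); not stated in the text above.
BAND_EDGES_HZ = [0, 500, 1000, 2000, 3000, 4000]


def dft(x):
    """Naive O(N^2) discrete Fourier transform (fine for one 180-sample frame)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse of dft(); returns the real part of each output sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def band_index(freq_hz):
    """Return the index of the band that contains freq_hz."""
    for b in range(len(BAND_EDGES_HZ) - 1):
        if freq_hz < BAND_EDGES_HZ[b + 1]:
            return b
    return len(BAND_EDGES_HZ) - 2  # the Nyquist bin belongs to the top band


def mixed_excitation(pitch_period, voicing, seed=0):
    """Mix a pulse train and noise per band; voicing[b] is in [0, 1]."""
    rng = random.Random(seed)
    pulses = [1.0 if t % pitch_period == 0 else 0.0
              for t in range(FRAME_SAMPLES)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(FRAME_SAMPLES)]
    pulse_spec, noise_spec = dft(pulses), dft(noise)
    mixed = []
    for k in range(FRAME_SAMPLES):
        # Fold negative-frequency bins so the weights stay symmetric
        # and the inverse transform is real.
        freq = min(k, FRAME_SAMPLES - k) * FS_HZ / FRAME_SAMPLES
        v = voicing[band_index(freq)]
        mixed.append(v * pulse_spec[k] + (1.0 - v) * noise_spec[k])
    return idft(mixed)
```

Calling mixed_excitation(40, [1, 1, 0, 0, 0]), for example, keeps the two lowest bands periodic and fills the upper bands with noise, which is the kind of excitation that avoids the buzzy quality of a purely pulsed source.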

When the input speech is voiced, the MELP coder can synthesize using either periodic or aperiodic pulses. Aperiodic pulses are used most often during transition regions between voiced and unvoiced segments of the speech signal. This feature enables the decoder to reproduce erratic glottal pulses without introducing tonal sounds.

The adaptive spectral enhancement filter is based on the poles of the linear prediction synthesis filter. Its use enhances the formant structure of the synthetic speech and improves the match between the synthetic and natural bandpass waveforms. It also gives the synthetic speech a more natural quality.
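
A minimal Python sketch of a pole-based enhancement filter follows. It assumes the common pole/zero form H(z) = A(z/γz) / A(z/γp) built by bandwidth expansion of the prediction polynomial A(z) = 1 + a1 z^-1 + ...; the expansion factors 0.5 and 0.8 are typical of adaptive postfilters and are assumptions here, as are all function names. Any signal-dependent adaptation and tilt compensation used by the standard are omitted.

```python
def bandwidth_expand(lpc, gamma):
    """Scale coefficient i by gamma**i, pulling the roots of A(z) toward the origin."""
    return [a * gamma ** i for i, a in enumerate(lpc)]


def pole_zero_filter(x, num, den):
    """Direct-form filter: y[n] = sum(num[i]*x[n-i]) - sum(den[j]*y[n-j]), den[0] == 1."""
    y = []
    for n in range(len(x)):
        acc = sum(num[i] * x[n - i] for i in range(len(num)) if n - i >= 0)
        acc -= sum(den[j] * y[n - j] for j in range(1, len(den)) if n - j >= 0)
        y.append(acc)
    return y


def spectral_enhance(x, lpc, zero_gamma=0.5, pole_gamma=0.8):
    """Apply A(z/zero_gamma) / A(z/pole_gamma); lpc[0] must be 1.0.

    The gamma defaults are assumed values, not taken from the text above.
    """
    num = bandwidth_expand(lpc, zero_gamma)
    den = bandwidth_expand(lpc, pole_gamma)
    return pole_zero_filter(x, num, den)


if __name__ == "__main__":
    # One-pole example: A(z) = 1 - 0.9 z^-1.
    impulse = [1.0, 0.0, 0.0, 0.0]
    print(spectral_enhance(impulse, [1.0, -0.9]))
```

Because the filter's poles sit at scaled copies of the synthesis filter's poles, and closer to the unit circle than its zeros, the response peaks at the formant frequencies and sharpens them.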

[Figure: MELP decoder block diagram (MIL-STD-3005)]

Analog Specification

The recommended analog requirements for the MELP coder are for a nominal bandwidth ranging from 100 Hz to 3800 Hz. Although the MELP coder will operate with a more band-limited signal, performance will degrade. To ensure proper operation of the MELP coder, the A-D conversion process should produce peak values of (or near) -32768 and 32767 (16-bit signed samples). Additionally, the coder should have unity gain, which means that the output speech level should match that of the input speech.
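
The level requirement can be checked in software before speech is handed to the coder. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the float-to-integer scaling by 32768 with clipping is one common convention rather than something the standard mandates.

```python
import math

FULL_SCALE_MIN = -32768   # from the text: 16-bit signed sample range
FULL_SCALE_MAX = 32767


def float_to_pcm16(samples):
    """Map floats in [-1.0, 1.0) to 16-bit signed integers, clipping the rest."""
    out = []
    for s in samples:
        v = int(round(s * 32768.0))
        out.append(max(FULL_SCALE_MIN, min(FULL_SCALE_MAX, v)))
    return out


def peak_level_dbfs(pcm):
    """Peak magnitude relative to full scale (32768), in dB.

    A value near 0 means the converter range is fully used.
    """
    peak = max(abs(v) for v in pcm)
    if peak == 0:
        return float("-inf")
    return 20.0 * math.log10(peak / 32768.0)


if __name__ == "__main__":
    pcm = float_to_pcm16([0.0, 0.5, -1.0, 1.0])
    print(pcm, peak_level_dbfs(pcm))
```

A recording whose peak level sits many dB below full scale wastes converter resolution, while one that clips repeatedly at the limits is distorted before the coder ever sees it.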

Parameter quantization and encoding

The MELP parameters which are quantized and transmitted are the final pitch; the bandpass voicing strengths; the two gain values; the linear prediction coefficients; the Fourier magnitudes; and the aperiodic flag. The use of the specified quantization procedures is required for interoperability among various implementations.
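
As a rough illustration of how these parameters fill a frame, the Python sketch below tabulates a per-parameter bit budget. The individual bit counts are the allocation commonly tabulated for 2400 bit/s MELP and do not appear in the text above, so treat them as an assumption to be verified against MIL-STD-3005. Only the 54-bit total, including one synchronization bit, is taken from this page.

```python
# Assumed bit allocation per 22.5 ms frame (verify against MIL-STD-3005).
BITS_VOICED = {
    "linear prediction coefficients (as line spectral frequencies)": 25,
    "Fourier magnitudes": 8,
    "gain (two values)": 8,
    "pitch and overall voicing": 7,
    "bandpass voicing strengths": 4,
    "aperiodic flag": 1,
    "error protection": 0,
    "synchronization bit": 1,
}

# In unvoiced frames the voiced-only parameters are assumed to be replaced
# by error-protection bits.
BITS_UNVOICED = {
    "linear prediction coefficients (as line spectral frequencies)": 25,
    "Fourier magnitudes": 0,
    "gain (two values)": 8,
    "pitch and overall voicing": 7,
    "bandpass voicing strengths": 0,
    "aperiodic flag": 0,
    "error protection": 13,
    "synchronization bit": 1,
}

FRAME_BITS = 54  # from this page: 54 bits per 2400 bit/s frame


def total_bits(allocation):
    """Sum the bits of one frame's allocation table."""
    return sum(allocation.values())


if __name__ == "__main__":
    for name, table in (("voiced", BITS_VOICED), ("unvoiced", BITS_UNVOICED)):
        print(f"{name}: {total_bits(table)} bits")
```

Whatever the exact split, every implementation must pack the same fields in the same order with the same quantizers, which is why the standard fixes the quantization procedures.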

About MELPe Vocoder (STANAG-4591)

General

The block diagram of the Enhanced Mixed Excitation Linear Prediction (MELPe) coder is shown below. Substantial enhancements were added to MIL-STD-3005 by SignalCom (later acquired by Microsoft), AT&T Corporation, and Compandent, which included:

(a) additional new vocoders at half and quarter the rate (i.e. 1200 and 600 bit/s),

(b) substantially improved encoding (analysis),

(c) substantially improved decoding (synthesis),

(d) Noise-Preprocessing for removing background noise,

(e) transcoding between the 2400 bit/s and 1200 bit/s bitstreams, and 2400 bit/s and 600 bit/s bitstreams, and

(f) new postfilter.

The STANAG-4591 MELPe vocoder encoding operation is performed as follows.

  • At 2400 bit/s, the MELPe encodes voice in 22.5 msec frames (180 samples per frame for speech sampled at 8000 samples/sec), using 54 bits per frame (including 1 synchronization bit).
  • At 1200 bit/s, the MELPe encodes 3 analyzed speech frames into a 67.5 msec super-frame (540 samples), using 81 bits per super-frame (including 1 synchronization bit).
  • At 600 bit/s, the MELPe encodes 4 analyzed speech frames into a 90 msec super-frame (720 samples), using 54 bits per super-frame.

This is illustrated in the figure below (from the STANAG-4591 standard) and summarized in the table below.

Rate         Frame Size (samples)   Frame Size (msec)   Bits / Frame
2400 bit/s   180                    22.5                54
1200 bit/s   540                    67.5                81
600 bit/s    720                    90.0                54

Table 1. MELPe vocoder's frame sizes and the number of bits per frame for the three rates

[Figure: STANAG-4591 MELPe]
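
The three rates follow directly from the frame sizes and bit counts listed above. The short Python sketch below recomputes them; the names MODES and mode_summary are illustrative and not taken from the standard.

```python
SAMPLE_RATE_HZ = 8000
BASE_FRAME_SAMPLES = 180  # one 22.5 msec analysis frame

# Nominal rate -> (analysis frames per (super-)frame, bits per (super-)frame),
# as listed in the text above.
MODES = {
    2400: (1, 54),
    1200: (3, 81),
    600: (4, 54),
}


def mode_summary(nominal_rate):
    """Return (samples, duration in msec, resulting bit rate) for one mode."""
    frames, bits = MODES[nominal_rate]
    samples = frames * BASE_FRAME_SAMPLES
    duration_ms = 1000.0 * samples / SAMPLE_RATE_HZ
    bit_rate = bits * SAMPLE_RATE_HZ / samples
    return samples, duration_ms, bit_rate


if __name__ == "__main__":
    for rate in MODES:
        samples, duration_ms, bit_rate = mode_summary(rate)
        print(f"{rate} bit/s: {samples} samples, {duration_ms} msec, "
              f"{bit_rate} bit/s")
```

The longer super-frames are what make the lower rates possible, and they also account for the added delay of the 1200 and 600 bit/s modes: the encoder has to collect 67.5 or 90 msec of speech before it can emit a packet.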

Secure Communications using MELPe Vocoder

SCIP-210

In 2006, the Secure Communication Interoperability Protocol (SCIP), a multinational communications standard for an application layer protocol, was developed by the National Security Agency (NSA) to enable interoperable secure communications among allies and partners around the globe.[6] The SCIP-210 Signaling Plan is the specification that defines the application layer signaling used to negotiate a secure end-to-end session between two communication devices, independent of network transport. The channels supported by SCIP include digital cellular systems such as GSM and CDMA, digital mobile satellite systems, and a variety of other narrowband digital systems. Its Secure and Clear MELPe Voice applications use the MELPe codec, G.729D, Voice Activity Detection based Discontinuous Voice (DTX Voice), and Comfort Noise; the latter two are implemented similarly to the GSM standard. Compandent's MELPe software for Android was also used and tested by NATO as part of the development of SCIP.

Tactical Secure Voice Cryptographic Interoperability Specification (TSVCIS)

In 2009, the Tactical Secure Voice Cryptographic Interoperability Specification (TSVCIS) was created by the National Security Agency (NSA) and the Naval Research Laboratory (NRL).[7] It includes the following two main voice categories:

  • TSVCIS Narrowband (NB) Waveform, which is based on the STANAG-4591 600 bit/s, 1200 bit/s, and 2400 bit/s NATO Interoperable Narrow Band Voice Coder, i.e. the MELPe codec,
  • TSVCIS Wideband (WB) Waveform, which operates at 8000, 10000, and 12000 bit/s and is based on the STANAG-4591 codec plus additional encoded wideband voice parameters that enable MELPe-based multi-rate speech coding.

Both categories may have different modes, including Forward Error Correction (FEC) using different BCH (Bose-Chaudhuri-Hocquenghem) block codes. The FEC protection gives a tremendous advantage in highly degraded channels, so that speech intelligibility is maintained even at very high channel bit error rates.

TSVCIS also includes a special WB Voice 16 Gateway mode for a 16000 bit/s channel, using band FEC, which is used when NB to WB crossbanding has occurred.

References

  1. "A Mixed Excitation LPC Vocoder Model for Low Bit Rate Speech Coding," Alan V. McCree, Thomas P. Barnwell, 1995 in IEEE Trans. Speech and Audio Processing (Original MELP).
  2. "Analog-to-Digital Conversion of Voice by 2,400 Bit/Second Mixed Excitation Linear Prediction (MELP)," US DoD (MIL-STD-3005, Original MELP).
  3. "The 1200 and 2400 bit/s NATO Interoperable Narrow Band Voice Coder," STANAG-4591, NATO.
  4. "MELPe Variation for 600 bit/s NATO Narrow Band Voice Coder, STANAG-4591," NATO.
  5. Alan McCree, “A scalable phonetic vocoder framework using joint predictive vector quantization of MELP parameters,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, 2006, pp. I 705–708, Toulouse, France.
  6. "SCIP Signaling Plan," Revision 3.6 January 8, 2013.
  7. “Tactical Secure Voice Cryptographic Interoperability Specification (TSVCIS) Version 2.1,” July 2, 2012.
