CELT 0.5.2 automated testing results

The automated testing routine for CELT involves running roughly 6 months of audio through the CELT encoder and decoder across a wide variety of modes and configurations. All user-accessible modes receive at least some level of coverage. 48 kHz mono receives automated quality testing at all frame sizes and most reasonable bit-rates. Common configurations receive extensive fuzz testing under valgrind. ARM (OpenMoko), x86_64 (Fedora 10), and x86 (Fedora 10) are used in testing.

Testing on this scale is practical because CELT runs many times faster than real time on modern computing hardware.

Keep in mind that as of 0.5.2 CELT is still a work in progress. Neither the API/ABI nor the bit-stream is stable. Also, while we do not expect it to set your house on fire, we cannot guarantee that it won't. Spontaneous combustion is specifically not covered by these tests.

Automated quality testing

Value	Meaning
0	Imperceptible
-1	Perceptible but not annoying
-2	Slightly annoying
-3	Annoying
-4	Very annoying
Definitions of PEAQ ODG scores

The quality of CELT 0.5.2 at 48 kHz mono was assessed for 51,848 combinations of bitrate, frame size, and complexity using PQEvalAudio, an implementation of PEAQ. The PEAQ objective difference grade (ODG) does not always accurately reflect human opinion, but its automated nature permits testing large numbers of configurations. These quality tests would require over 52 days of continuous listening if conducted with a single human reviewer.
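The 52-day figure can be turned into a rough per-configuration listening time. The arithmetic below is only a sketch: it assumes a single uninterrupted listen per configuration with no breaks or repeats, which is our reading rather than something the text states. Under that assumption each configuration corresponds to somewhat under a minute and a half of audio.

```python
# Figures taken from the text above.
configurations = 51_848
listening_days = 52

# Assumption: one uninterrupted listen per configuration, no breaks or repeats.
seconds_total = listening_days * 24 * 60 * 60
seconds_per_configuration = seconds_total / configurations

print(f"{seconds_per_configuration:.1f} s per configuration")
```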

Complexity 9 PQEvalAudio map

This illustration demonstrates the quality/bitrate/delay trade-offs available in CELT in full (default) complexity mode.

Equal-quality contours are drawn at -0.5, -1, -2, and -3.

Complexity 1 PQEvalAudio map

This illustration demonstrates the quality/bitrate/delay trade-offs available in CELT in low complexity mode.

Equal-quality contours are drawn at -0.5, -1, -2, and -3.

Comparison with CELT 0.5.1 (complexity 9)

For each test point the 0.5.1 PQEvalAudio score was subtracted from the CELT 0.5.2 score.

Positive (blue) values in the chart indicate improvement according to PQEvalAudio, while negative (red) values indicate quality loss.
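The subtraction behind these comparison charts can be sketched as follows. The function name, the (frame size, bitrate) keys, and the score values are hypothetical and chosen only for illustration; they are not measured results, and the actual tooling used to build the charts is not described here.

```python
def score_deltas(new_scores, old_scores):
    """Return new minus old for every test point present in both runs."""
    shared_points = new_scores.keys() & old_scores.keys()
    return {point: new_scores[point] - old_scores[point] for point in shared_points}

# Hypothetical ODG values keyed by (frame_size, bitrate_kbps); not measured data.
odg_0_5_1 = {(256, 64): -1.5, (512, 64): -1.0}
odg_0_5_2 = {(256, 64): -1.0, (512, 64): -1.25}

deltas = score_deltas(odg_0_5_2, odg_0_5_1)
for point, delta in sorted(deltas.items()):
    verdict = "improvement" if delta > 0 else "loss" if delta < 0 else "unchanged"
    print(point, delta, verdict)
```

Because ODG scores are zero or negative, a positive difference means the newer release scored closer to "imperceptible" at that test point.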

Comparison with CELT 0.5.1 (complexity 1)

For each test point the 0.5.1 PQEvalAudio score was subtracted from the CELT 0.5.2 score.

Positive (blue) values in the chart indicate improvement according to PQEvalAudio, while negative (red) values indicate quality loss.

"make check" tests

CELT includes a number of unit tests that exercise internal components of CELT.

Test	x86_64	x86	ARM
cwrs32-test	Pass	Pass	Pass
dft-test	Pass	Pass	Pass
ectest	Pass	Pass	Pass
laplace-test	Pass	Pass	Pass
mathops-test	Pass	Pass	Pass
mdct-test	Pass	Pass	Pass
real-fft-test	Pass	Pass	Pass
type-test	Pass	Pass	Pass

All modes test

A short audio file is run through 27,525,120 CELT configurations (all frame sizes, all bytes-per-frame values from 8 to 200, and sample rates from 32000 to 96000 Hz in 100 Hz increments). Because of CPU requirements this test is run only in low complexity mode. In order to pass, these cycles of "testcelt" must complete without error.

x86_64: Pass
x86_64 fixed point: Pass
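As a sanity check on the configuration count, the stated total can be factored against the two ranges given in the description. This is an inference, not something the text states: if both ranges are assumed to exclude their upper endpoint, the total divides evenly and leaves 224 frame sizes; with inclusive endpoints it does not divide evenly.

```python
STATED_TOTAL = 27_525_120  # figure from the text above

# Assumption: both ranges exclude their upper endpoint (half-open).
bytes_per_frame = range(8, 200)          # 192 values
sample_rates = range(32000, 96000, 100)  # 640 values

per_frame_size = len(bytes_per_frame) * len(sample_rates)
frame_sizes, remainder = divmod(STATED_TOTAL, per_frame_size)

print(per_frame_size, frame_sizes, remainder)
```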

Popular modes fuzz-test

Two hours of audio extracted from several dozen albums and live recordings are run through CELT at 32, 44.1, and 48 kHz, at frame sizes of 64, 96, 128, 192, 256, 384, and 512 samples, and at 48, 64, and 128 kbit/sec in mono and stereo mode. One tenth of a percent of the encoded bits are randomly flipped. In order to pass, these cycles of "testcelt" must complete without error. This test is run under valgrind and with assertions enabled for extra error sensitivity.

x86_64: Pass
x86_64 alloca (pseudo-stack mode): Pass
x86: Pass
x86 fixed point: Pass
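The mode grid and the bit-flipping step of the popular modes fuzz-test can be sketched as follows. This is not the code used by "testcelt": the function names are hypothetical, and flipping each bit independently with probability 0.001 is an assumption about how "one tenth of a percent" is applied. The grid values are the ones listed in the test description.

```python
import itertools
import random

# Mode grid as listed in the test description.
SAMPLE_RATES_HZ = (32000, 44100, 48000)
FRAME_SIZES = (64, 96, 128, 192, 256, 384, 512)
BITRATES_KBPS = (48, 64, 128)
CHANNELS = (1, 2)  # mono, stereo

def popular_modes():
    """Return every (rate, frame size, bitrate, channels) combination."""
    return list(itertools.product(SAMPLE_RATES_HZ, FRAME_SIZES,
                                  BITRATES_KBPS, CHANNELS))

def flip_random_bits(packet, flip_probability=0.001, rng=None):
    """Return a copy of packet with each bit flipped independently.

    Assumption: independent per-bit flips, so the corrupted fraction is
    0.1% on average rather than exactly 0.1% of every packet.
    """
    if rng is None:
        rng = random.Random()
    corrupted = bytearray(packet)
    for index in range(len(corrupted)):
        for bit in range(8):
            if rng.random() < flip_probability:
                corrupted[index] ^= 1 << bit
    return bytes(corrupted)

print(len(popular_modes()), "mode combinations")
```

In a harness built this way, each encoded packet would pass through flip_random_bits before being handed to the decoder, and the run would count as a pass if decoding finishes without a crash, assertion failure, or valgrind error.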