AAC (Advanced Audio Coding) is a modern audio encoding format that is quickly becoming the standard for capture and delivery in consumer markets.
For example, iTunes encodes songs in AAC format by default, and music sold on the iTunes Store comes encoded in AAC format.
AAC achieves better quality than MP3 at a similar bit rate.
- AAC is very difficult to repair because AAC frames have no headers.
- AAC can be found in audio files, usually in QuickTime and MP4 containers with the following extensions: .m4a, .m4b, .m4p, .mov (and rarely as raw streams in .aac, .adif and .adts files).
- AAC is also commonly found in movie files, where it is usually paired with MPEG-4 or H.264 video.
Detection
Treasured can detect AAC in damaged audio files.
AAC Repair Techniques
Repairing raw AAC data requires that each frame be identified, and then re-indexed or rewrapped into a playable container format. The hardest part is identifying the AAC frames, since they have no header and a variable length. Below we present 3 methods of increasing complexity, which produce, respectively, audible audio, audio with glitches, and good audio.
Pattern matching: This is an empirical method that validates AAC not by the coding rules but by patterns observed in the bitstream. It relies on bit matching, block length filters, and similar heuristics. With this simple technique, used by Aero Quartet until 2008, audio becomes audible, but the quality is not good.
Surface decoding: This method decodes the beginning of an alleged AAC frame to validate it. It checks that a few important parameters, such as gain, maxsfb, frame size and termination, have coherent values. It achieves a good but not perfect result, with roughly one fault every 500 frames. It was used by Aero Quartet until 2009.
Decode-Validate: This method, introduced in 2009, gives near-perfect results. It uses an AAC decoder to validate alleged AAC frames. Frames that pass the test are then wrapped in a QuickTime movie and validated again.
The output of AAC repair, even with Decode-Validate, usually needs to be re-processed to remove stutter and extra duration due to rounding errors.
Guidelines for parsing an AAC bitstream with "Surface decoding"
AAC uses variable-length blocks that are difficult to parse and identify.
In the most common case, stereo LC AAC, the bitstream starts like this:
[0x21  ][0x1b  ]
0010000100011011

offset  bits  field
0       001   CPE
3       0000  element instance tag
7       1     common window
8       0     always 0
9       00    window sequence
11      1     ics shape
- If common window is 0, we have Multiple Windows (see below).
- If window sequence (2 bits) is 2, we have an EIGHT_SHORT_SEQUENCE (see below).
Otherwise, the common case continues like this:
[0x21  ][0x1b  ][0xd4  ][0x4d  ]
00100001000110111101010001001101

offset  bits      field
12      101111    maxsfb (6 bits)
18      0         predictor (1 bit)
19      10        mask (2 bits)
21      10001001  gain (8 bits)
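The header fields described so far can be read with a small MSB-first bit reader. The following is a minimal sketch rather than the method actually used by the repair tool: `BitReader` and `parse_cpe_long_header` are hypothetical names, and the function only handles the common case (CPE, common window set, window sequence other than 2).

```python
class BitReader:
    """Reads MSB-first bit fields from a bytes object."""

    def __init__(self, data: bytes, pos: int = 0):
        self.data = data
        self.pos = pos  # current offset, in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]  # raises IndexError past the end
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value


def parse_cpe_long_header(data: bytes):
    """Return the header fields as a dict, or None if the layout differs."""
    r = BitReader(data)
    try:
        if r.read(3) != 1:            # 001 = CPE
            return None
        tag = r.read(4)               # element instance tag
        if r.read(1) != 1:            # common window must be set
            return None
        if r.read(1) != 0:            # bit documented as "always 0"
            return None
        window_sequence = r.read(2)
        if window_sequence == 2:      # EIGHT_SHORT_SEQUENCE: different layout
            return None
        shape = r.read(1)
        maxsfb = r.read(6)
        predictor = r.read(1)
        mask = r.read(2)
        if mask == 1:
            r.read(maxsfb)            # skip the maxsfb bits inserted before gain
        gain = r.read(8)
    except IndexError:
        return None                   # ran out of data
    return {
        "tag": tag, "window_sequence": window_sequence, "shape": shape,
        "maxsfb": maxsfb, "predictor": predictor, "mask": mask,
        "gain": gain, "next_bit": r.pos,
    }


header = parse_cpe_long_header(bytes([0x21, 0x1B, 0xD4, 0x4D]))
print(header)
```

For the four example bytes this reports maxsfb 47, predictor 0, mask 2 and gain 137, with the next field starting at bit 29.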
maxsfb must be between 0 and the value from this table:
EIGHT-SHORT-SEQUENCE   OTHERWISE   SAMPLING RATE
12                     41          >96 kHz
12                     47          >64
14                     49          >48
15                     51          >32
15                     47          >24
15                     43          >16
15                     40          >8
It is common for a given file to show only two maxsfb values: one for EIGHT-SHORT-SEQUENCE blocks and one for all other blocks. The maxsfb value is usually close to or equal to the maximum given in the table.
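The range check on maxsfb can be written as a table lookup. This is only a sketch: `MAXSFB_LIMITS` and `maxsfb_in_range` are hypothetical names, the dictionary keys simply reuse the rate labels of the table, and falling back to the largest limit when the sampling rate is unknown is an assumption made for this example.

```python
# (EIGHT-SHORT-SEQUENCE limit, limit otherwise), keyed by the table's rate label.
MAXSFB_LIMITS = {
    ">96 kHz": (12, 41),
    ">64": (12, 47),
    ">48": (14, 49),
    ">32": (15, 51),
    ">24": (15, 47),
    ">16": (15, 43),
    ">8": (15, 40),
}


def maxsfb_in_range(maxsfb: int, eight_short: bool, rate_label=None) -> bool:
    """Check maxsfb against one table row, or against the loosest row
    when the sampling rate of the damaged file is not known."""
    if rate_label is not None:
        rows = [MAXSFB_LIMITS[rate_label]]
    else:
        rows = list(MAXSFB_LIMITS.values())
    limit = max(row[0] if eight_short else row[1] for row in rows)
    return 0 <= maxsfb <= limit


print(maxsfb_in_range(47, eight_short=False, rate_label=">64"))
print(maxsfb_in_range(15, eight_short=True, rate_label=">48"))
```

Because a file usually shows only two maxsfb values, a stricter filter could also remember the values seen in frames that were already validated.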
- Predictor should always be 0.
- If mask (2 bits) is equal to 1, then maxsfb additional bits are inserted between mask and gain. In the EIGHT_SHORT_SEQUENCE case, this number of bits is multiplied by the number of window groups (see below).
- Gain is usually between 100 and 200, and usually does not vary much within a given file.
After gain, we have the scale factor data (sequence of 4 and 5 bits):
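[0x21  ][0x1b  ][0xd4  ][0x4d  ][0x05  ][0x20  ][0x25  ][0x01  ]
00100001000110111101010001001101000001010010000000100101000000010

The data starts at bit 29, right after gain:

section  4 bits  5 bits  increment  running total
0,0      1010    00001   +1         1
0,1      0100    10000   +16        17
0,2      0001    00101   +5         22
0,3      0000    00010   +2         24

[0x09  ][0x89  ][0x22  ][0x40  ][0x20  ]
0000100110001001001000100100000000100000

The first bit of this second string is the last bit of the increment of section 0,3:

section  4 bits  5 bits  increment  running total
0,4      0001    00110   +6         30
0,5      0010    01001   +9         39
0,6      0001    00100   +4         43
0,7      0000    00100   +4         47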
We verify that, after several sequences, the 5-bit increments sum to exactly maxsfb (47 in this case).
Note that a 5-bit increment should never be 0, and that the value 31 triggers the reading of another 5 bits (example: 31 followed by 5 gives 36).
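The check described above (increments must sum to exactly maxsfb, with 31 acting as an escape value) can be sketched as follows. `read_bits` and `walk_sections` are hypothetical helper names, and the example bytes are transcribed from the binary strings of the example. One assumption is made explicit in the code: a section is rejected when its total length is 0, so a 0 that directly follows the escape value 31 is tolerated.

```python
def read_bits(data: bytes, pos: int, nbits: int) -> int:
    """Return nbits, MSB first, starting at bit offset pos."""
    value = 0
    for i in range(pos, pos + nbits):
        value = (value << 1) | ((data[i >> 3] >> (7 - (i & 7))) & 1)
    return value


def walk_sections(data: bytes, pos: int, maxsfb: int, len_bits: int = 5):
    """Walk (4-bit value, length) pairs until the lengths reach maxsfb.

    Returns (sections, next_bit), or None if the data is not coherent.
    """
    escape = (1 << len_bits) - 1      # 31 for 5-bit increments
    sections, total = [], 0
    try:
        while total < maxsfb:
            code = read_bits(data, pos, 4)
            pos += 4
            length = 0
            while True:
                incr = read_bits(data, pos, len_bits)
                pos += len_bits
                length += incr
                if incr != escape:    # the escape value means "keep reading"
                    break
            # Assumption: reject empty sections and any overshoot of maxsfb.
            if length == 0 or total + length > maxsfb:
                return None
            total += length
            sections.append((code, length))
    except IndexError:
        return None                   # ran past the end of the data
    return sections, pos


frame = bytes.fromhex("211bd44d052025010989224020")
print(walk_sections(frame, 29, 47))
```

On the example frame the walk yields eight sections with lengths 1, 16, 5, 2, 6, 9, 4 and 4, and stops at bit 101. Calling it with a wrong maxsfb, such as 46, returns None, which is what makes this test useful as a frame validator.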
EIGHT-SHORT-SEQUENCE
The window sequence is always binary 10 (decimal 2) in EIGHT-SHORT-SEQUENCE frames.
[0x21  ][0x46  ][0xbd  ][0x65  ][0xAE  ][0x2c  ][0x26  ][0x6e  ]
0010000101000110101111010110010110101110001011000010011001101110

offset  bits      field
9       10        window sequence (always 10)
12      0110      maxsfb (4 bits)
16      1011110   scale factor grouping (7 bits)
23      10        mask
25      11001011  gain (8 bits)

section  4 bits  3 bits  increment  running total
0,0      0101    110     6          6
1,0      0010    110     6          6
2,0      0001    001     +1         1
2,1      1001    101     +5         6
Note that maxsfb is 6 here. A scale factor grouping containing 2 zeros means that we have 3 groups. Therefore, we have 3 sequences of alternating 4-bit and 3-bit fields, and the 3-bit increments of each sequence sum to exactly 6.
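The same verification can be sketched for the EIGHT-SHORT-SEQUENCE layout: count the groups from the 7-bit grouping field, then walk one sequence of 4-bit and 3-bit fields per group. The names `read_bits`, `count_groups` and `walk_short_sections` are hypothetical. Treating 7 as an escape value for the 3-bit increment, by analogy with 31 for 5-bit increments, is an assumption of this sketch.

```python
def read_bits(data: bytes, pos: int, nbits: int) -> int:
    """Return nbits, MSB first, starting at bit offset pos."""
    value = 0
    for i in range(pos, pos + nbits):
        value = (value << 1) | ((data[i >> 3] >> (7 - (i & 7))) & 1)
    return value


def count_groups(grouping: int) -> int:
    """One group, plus one more for every zero bit in the 7-bit field."""
    return 1 + sum(1 for i in range(7) if not (grouping >> i) & 1)


def walk_short_sections(data: bytes, pos: int, maxsfb: int, groups: int):
    """Return (sections per group, next_bit), or None if not coherent."""
    result = []
    try:
        for _ in range(groups):
            total, sections = 0, []
            while total < maxsfb:
                code = read_bits(data, pos, 4)
                pos += 4
                length = 0
                while True:
                    incr = read_bits(data, pos, 3)
                    pos += 3
                    length += incr
                    if incr != 7:     # assumed escape value for 3-bit fields
                        break
                if length == 0 or total + length > maxsfb:
                    return None
                total += length
                sections.append((code, length))
            result.append(sections)
    except IndexError:
        return None                   # ran past the end of the data
    return result, pos


frame = bytes.fromhex("2146bd65ae2c266e")
maxsfb = read_bits(frame, 12, 4)
groups = count_groups(read_bits(frame, 16, 7))
print(maxsfb, groups, walk_short_sections(frame, 33, maxsfb, groups))
```

For the example frame this gives maxsfb 6 and 3 groups, with increments 6, then 6, then 1 and 5.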
Multiple Windows
This block has a different layout with 2 consecutive sets of data.
[0x20  ][gain  ][0x    ][0x4d  ]
001000001001101101101010001001101

After the first byte and the 8-bit gain, the fields are, in order:

always 0 (1 bit)
window sequence (2 bits)
shape (1 bit)
maxsfb (6 bits)
predictor (1 bit)
Just after predictor, we have the scale factor data (sequence of 4 and 5 bits).
Multiple Windows and EIGHT-SHORT-SEQUENCE
This combination is possible, and has the following layout:
[0x20  ][gain  ][0x4d  ][0xe6  ][0x1f  ]
0010000010011011010011011110011000011111

offset  bits     field
16      0        always 0
17      10       window sequence=2
19      0        shape
20      1101     maxsfb (4 bits)
24      1110011  grouping
Just after grouping, scale factor data starts (sequence of 4 and 3 bits)
Degenerated block
You can find very short blocks with no data:
[0x20  ][0x99  ][0x00  ][0x02  ][0x64  ][0x00  ][0x0e  ]
00100000100110010000000000000010011001000000000000001110

offset  bits      field
0       00100000  CPE
8       10011001  gain
20      000000    maxsfb=0
30      10011001  gain
42      000000    maxsfb=0
52      111       END
Note the layout of the two windows (second window starts with gain)
Degenerated block with padding
You can find blocks that appear normal in size but that contain no data:
[0x20  ][0x68  ][0x20  ][0x01  ][0xa0  ][0x80  ][0x0d  ][0xED  ]...68 bytes....
0010000001101000001000000000000110100000100000000000110111101101000

offset  bits      field
0       00100000  CPE
8       01101000  gain
20      000000    maxsfb=0
26      0         predictor
27      0         no pulse data
28      0         no tns data
29      0         no gain control data
30      01101000  gain
42      000000    maxsfb=0
52      110       FIL
55      1111      type
59      01101000  length=0x68

END follows the padding.
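Both degenerate examples can be recognized with a short check: a CPE without common window, followed by two channels that each start with gain and carry maxsfb=0 and no optional data. This is a sketch under those assumptions only. The names `read_bits`, `parse_empty_channel` and `parse_degenerate_cpe` are hypothetical, and channels using window sequence 2 are deliberately not handled.

```python
def read_bits(data: bytes, pos: int, nbits: int) -> int:
    """Return nbits, MSB first, starting at bit offset pos."""
    value = 0
    for i in range(pos, pos + nbits):
        value = (value << 1) | ((data[i >> 3] >> (7 - (i & 7))) & 1)
    return value


def parse_empty_channel(data: bytes, pos: int):
    """Parse one channel holding no data; return (gain, next_bit) or None."""
    gain = read_bits(data, pos, 8)
    always_zero = read_bits(data, pos + 8, 1)
    window_sequence = read_bits(data, pos + 9, 2)
    # pos + 11 is the shape bit, which is not constrained here
    maxsfb = read_bits(data, pos + 12, 6)
    # predictor, pulse data, tns data, gain control data: all expected to be 0
    flags = read_bits(data, pos + 18, 4)
    if always_zero or window_sequence == 2 or maxsfb or flags:
        return None
    return gain, pos + 22


def parse_degenerate_cpe(data: bytes):
    """Return (gain 1, gain 2, id of the next element) or None."""
    try:
        if read_bits(data, 0, 3) != 1:    # 001 = CPE
            return None
        if read_bits(data, 7, 1) != 0:    # common window must be 0
            return None
        first = parse_empty_channel(data, 8)
        if first is None:
            return None
        second = parse_empty_channel(data, first[1])
        if second is None:
            return None
        next_id = read_bits(data, second[1], 3)
    except IndexError:
        return None                       # ran past the end of the data
    return first[0], second[0], next_id


print(parse_degenerate_cpe(bytes.fromhex("2099000264000e")))
print(parse_degenerate_cpe(bytes.fromhex("20682001a0800ded")))
```

In the first example the element that follows has id 7 (binary 111, END); in the padded example it has id 6 (binary 110, FIL).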
AAC mono
Starts with this:
[0x00  ]
00000000
SCE (single channel element)
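The first 3 bits of each example above identify the kind of element that follows: 000 for mono, 001 for CPE, 110 for FIL and 111 for END. A candidate frame start can therefore be pre-filtered with a simple lookup. This is a minimal sketch; `ELEMENT_NAMES` and `describe_element` are hypothetical names, and the ids not shown in this article are simply reported as "other".

```python
# Element ids that appear in the examples of this article.
ELEMENT_NAMES = {
    0b000: "mono",
    0b001: "CPE",
    0b110: "FIL",
    0b111: "END",
}


def element_id(first_byte: int) -> int:
    """Return the 3-bit element id stored in the top bits of a byte."""
    return (first_byte >> 5) & 0b111


def describe_element(first_byte: int) -> str:
    """Name the element type announced by the first byte of a block."""
    return ELEMENT_NAMES.get(element_id(first_byte), "other")


for byte in (0x21, 0x20, 0x00):
    print(hex(byte), describe_element(byte))
```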
External links
http://wiki.multimedia.cx/index.php?title=Understanding_AAC
http://getid3.sourceforge.net/source/module.audio.aac.phps