Treasured's Movie Repair Guide -- AAC

AAC (Advanced Audio Coding) is a modern audio encoding format that is quickly becoming the standard for capture and delivery in consumer markets.
For example, iTunes encodes songs in AAC format by default, and music sold on the iTunes Store comes encoded in AAC format.
AAC achieves better quality than MP3 at a similar bit rate.


Treasured can detect AAC in damaged audio files.

AAC Repair Techniques

Repairing raw AAC data requires that each frame be identified, and then re-indexed or rewrapped into a playable container format. The hardest part is identifying the AAC frames, since they have no header and their length is variable. Below we present 3 methods of increasing complexity, which produce, respectively, audible audio, audio with glitches, and good audio.

Pattern matching: This is an empirical method that doesn't try to validate AAC against the coding rules, but against patterns observed in the bitstream. Pattern matching uses bit matching, block length filters, ... With this simple technique, used by Aero Quartet until 2008, audio becomes audible, but the quality is not good.

Surface decoding: This method decodes the beginning of an alleged AAC frame to validate it. It checks that a few important parameters, like gain, maxsfb, frame size and termination, have coherent values. It achieves a good but not perfect result, with roughly one fault every 500 frames. It was used by Aero Quartet until 2009.

Decode-Validate: This method, introduced in 2009, gives near-perfect results. It uses an AAC decoder to validate alleged AAC frames. Frames that pass the test are then wrapped in a QuickTime movie and validated again.

The output of AAC repair, even with Decode-Validate, usually needs to be re-processed to remove stutter and extra duration due to rounding errors.

Guidelines for parsing an AAC bitstream with "Surface decoding"

AAC uses variable-length blocks that are difficult to parse and identify.

In the most common case, stereo LC AAC, we'll have the following bitstream:

[0x21  ][0x1b  ]
CPE |  ||| ics shape
    |  ||window sequence
    |  ||always 0
    |  common window
    element instance tag

The common case then continues like this:

[0x21  ][0x1b  ][0xd4  ][0x4d  ]
              |   ||  gain (8 bits)
              |   |mask (2 bits)
              |   predictor (1 bit)
              maxsfb (6 bits)
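
The two diagrams above can be read as a fixed list of bit fields. The following Python sketch is only an illustration of that reading: `read_bits` and `parse_cpe_header` are hypothetical helper names, not part of Treasured, and the field widths that the diagrams do not state (element type, instance tag, common window, always-0 bit, window sequence, shape) are taken from the AAC specification. It assumes the exact layout drawn above, where gain directly follows mask.

```python
def read_bits(data: bytes, pos: int, n: int) -> int:
    """Return n bits starting at bit offset pos, most significant bit first."""
    if pos + n > len(data) * 8:
        raise ValueError("not enough data")
    shift = len(data) * 8 - pos - n
    return (int.from_bytes(data, "big") >> shift) & ((1 << n) - 1)

# (field name, width in bits), in the order drawn in the diagrams above.
CPE_LONG_FIELDS = [
    ("element_id", 3),       # 1 = CPE
    ("instance_tag", 4),
    ("common_window", 1),
    ("reserved", 1),         # the "always 0" bit
    ("window_sequence", 2),
    ("shape", 1),
    ("maxsfb", 6),
    ("predictor", 1),
    ("mask", 2),
    ("gain", 8),
]

def parse_cpe_header(data: bytes) -> dict:
    """Split the start of a block into the fields listed above."""
    fields, pos = {}, 0
    for name, width in CPE_LONG_FIELDS:
        fields[name] = read_bits(data, pos, width)
        pos += width
    fields["bits_consumed"] = pos  # the scale factor data starts here
    return fields

header = parse_cpe_header(bytes([0x21, 0x1B, 0xD4, 0x4D]))
print(header["maxsfb"])  # 47, the value used in the rest of this example
```

A surface decoder would reject the candidate frame as soon as one of these fields is incoherent, for example when the always-0 bit is set.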

maxsfb must be between 0 and the value from this table, which depends on the window type and the sample rate:

    short windows       long windows      sample rate
    12                  41                >96 kHz
    12                  47                >64
    14                  49                >48
    15                  51                >32
    15                  47                >24
    15                  43                >16
    15                  40                >8

In a given file, it's common to see only two maxsfb values: one for EIGHT-SHORT-SEQUENCE blocks, and one for all other blocks. The maxsfb value is usually close or equal to the maximum from the table.

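After gain, we have the scale factor data (section_data in the AAC specification): a sequence of alternating 4-bit and 5-bit fields, where each 5-bit field is a length increment: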

[0x21  ][0x1b  ][0xd4  ][0x4d  ][0x05  ][0x20  ][0x25  ][0x01  ]
                             0,0  +1  0,1  +16 0,2  +5  0,3  +2
                                           17       22       24

[0x09  ][0x8a  ][0x22  ][0x40  ][0x20  ]
 0,4  +6  0,5  +9  0,6  +4  0,7  +4
      30       39       43       47

We verify that, after several sequences, the 5-bit increments sum to exactly maxsfb (47 in this case).
Note that a 5-bit increment should never be 0, and that the value 31 triggers the reading of another 5 bits, which are added to it (example: 31 followed by 5 gives 36).
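
The increment rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used by Treasured: `read_bits`, `read_section` and `lands_on_maxsfb` are hypothetical names, and the start offset of 29 bits is simply the sum of the field widths that precede the scale factor data in the example above. Only the first row of the example is decoded here, plus the first byte of the second row, which holds the last bit of the fourth increment.

```python
def read_bits(data: bytes, pos: int, n: int) -> int:
    """Return n bits starting at bit offset pos, most significant bit first."""
    if pos + n > len(data) * 8:
        raise ValueError("not enough data")
    shift = len(data) * 8 - pos - n
    return (int.from_bytes(data, "big") >> shift) & ((1 << n) - 1)

def read_section(data: bytes, pos: int, incr_bits: int = 5):
    """Read one 4-bit field and its length; return (code, length, next_pos)."""
    escape = (1 << incr_bits) - 1      # 31 for 5-bit increments
    code = read_bits(data, pos, 4)
    pos += 4
    length = 0
    while True:
        incr = read_bits(data, pos, incr_bits)
        pos += incr_bits
        length += incr
        if incr != escape:             # 31 means "read 5 more bits and add"
            break
    return code, length, pos

def lands_on_maxsfb(data: bytes, pos: int, maxsfb: int, incr_bits: int = 5) -> bool:
    """True if the lengths read from pos sum to exactly maxsfb."""
    total = 0
    try:
        while total < maxsfb:
            _, length, pos = read_section(data, pos, incr_bits)
            if length == 0:            # an increment of 0 is not plausible
                return False
            total += length
    except ValueError:                 # ran out of data before reaching maxsfb
        return False
    return total == maxsfb

# First row of the example, followed by the first byte of the second row.
row = bytes([0x21, 0x1B, 0xD4, 0x4D, 0x05, 0x20, 0x25, 0x01, 0x09])
pos, total, running = 29, 0, []
for _ in range(4):
    code, length, pos = read_section(row, pos)
    total += length
    running.append(total)
print(running)                         # [1, 17, 22, 24]

# Escape example from the text: 31 followed by 5 gives 36.
escaped = bytes([0x0F, 0x94])          # bits 0000 11111 00101, then 2 padding bits
print(read_section(escaped, 0))        # (0, 36, 14)
```

Overshooting maxsfb, running out of data, or reading a zero length are all treated as a failed validation, which is what makes this check useful for rejecting false frame starts.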


In EIGHT-SHORT-SEQUENCE frames, the window sequence is always 10 in binary (2):

[0x21  ][0x46  ][0xbd  ][0x65  ][0xAE  ][0x2c  ][0x26  ][0x6e  ]
         -/ ---/------/-/-------/---/--/---/--/---/--/---/--/
  always 10   |   |    | gain(8) 0,0  6 1,0 6  2,0 +1 2,1 +5
              |   |    mask                         1      6
              |   scale factor grouping (7 bits)
              maxsfb (4 bits)

Note that here maxsfb is 6. The 2 zeros in the scale factor grouping tell us that we have 3 groups. Therefore, we have 3 sequences of alternating 4-bit and 3-bit fields, and in each sequence the 3-bit increments sum to exactly 6.
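
As an illustration, the eight bytes above can be decoded with the following Python sketch. The names `read_bits` and `parse_short_cpe` are hypothetical. Two points are assumptions rather than statements from this guide: the escape value for 3-bit increments is taken to be 7, by analogy with 31 for 5-bit increments, and a mask value of 1 is rejected because it would insert extra bits before gain that the diagram does not show.

```python
def read_bits(data: bytes, pos: int, n: int) -> int:
    """Return n bits starting at bit offset pos, most significant bit first."""
    if pos + n > len(data) * 8:
        raise ValueError("not enough data")
    shift = len(data) * 8 - pos - n
    return (int.from_bytes(data, "big") >> shift) & ((1 << n) - 1)

def parse_short_cpe(data: bytes) -> dict:
    """Decode the start of an eight-short-sequence block as drawn above."""
    pos = 0

    def take(n: int) -> int:
        nonlocal pos
        value = read_bits(data, pos, n)
        pos += n
        return value

    info = {
        "element_id": take(3), "instance_tag": take(4), "common_window": take(1),
        "reserved": take(1), "window_sequence": take(2), "shape": take(1),
        "maxsfb": take(4), "grouping": take(7), "mask": take(2), "gain": take(8),
    }
    if info["reserved"] != 0 or info["window_sequence"] != 2:
        raise ValueError("not an eight-short-sequence block")
    if info["mask"] == 1:
        raise ValueError("mask = 1 is not handled in this sketch")

    # Each 0 bit in the 7-bit grouping field starts a new group.
    info["groups"] = 1 + format(info["grouping"], "07b").count("0")
    info["sections"] = []
    for _ in range(info["groups"]):
        group, total = [], 0
        while total < info["maxsfb"]:
            code, length = take(4), 0
            while True:
                incr = take(3)
                length += incr
                if incr != 7:          # assumed escape value for 3-bit fields
                    break
            if length == 0:
                raise ValueError("zero-length section")
            group.append((code, length))
            total += length
        if total != info["maxsfb"]:
            raise ValueError("lengths do not land on maxsfb")
        info["sections"].append(group)
    return info

short = parse_short_cpe(bytes([0x21, 0x46, 0xBD, 0x65, 0xAE, 0x2C, 0x26, 0x6E]))
print(short["maxsfb"], short["groups"])   # 6 3
print(short["sections"])                  # [[(5, 6)], [(2, 6)], [(1, 1), (9, 5)]]
```

The lengths in each group (6, then 6, then 1 + 5) match the annotations under the diagram.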

Multiple Windows

This block, whose first byte is 0x20 (common window bit set to 0), has a different layout, with 2 consecutive sets of data that each start with their own gain.

[0x20  ][gain  ][0x    ][0x4d  ]
                || |   |  predictor
                || |   maxsfb
                || shape
                |window sequence
         always 0

Just after predictor, we have the scale factor data (sequence of 4 and 5 bits).

Multiple Windows and EIGHT-SHORT-SEQUENCE

This combination is possible, and has the following layout:

[0x20  ][gain  ][0x4d  ][0xe6  ][0x1f  ]
                || | |  grouping
                || | maxsfb (4 bits)
                || shape
                |window sequence=2
         always 0

Just after grouping, scale factor data starts (sequence of 4 and 3 bits)

Degenerate block

You can find very short blocks with no data:

[0x20  ][0x99  ][0x00  ][0x02  ][0x64  ][0x00  ][0x0e  ]
CPE     -------/    -----/    -------/    -----/    END
           gain   maxsfb=0       gain   maxsfb=0

Note the layout of the two windows: the second window also starts with its own gain.
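
A short Python sketch shows how such an empty block can be recognised. It is an illustration only: `read_bits` and `parse_empty_windows` are hypothetical names. The sketch assumes that each window consists of gain (8 bits), the always-0 bit, window sequence (2 bits), shape (1 bit), maxsfb (6 bits) and predictor (1 bit), followed by three 1-bit flags for pulse, tns and gain control data, and that the 3-bit type of the next element follows immediately when maxsfb is 0.

```python
def read_bits(data: bytes, pos: int, n: int) -> int:
    """Return n bits starting at bit offset pos, most significant bit first."""
    if pos + n > len(data) * 8:
        raise ValueError("not enough data")
    shift = len(data) * 8 - pos - n
    return (int.from_bytes(data, "big") >> shift) & ((1 << n) - 1)

def parse_empty_windows(data: bytes):
    """Decode a CPE whose two windows carry no data; return (windows, next_id)."""
    pos = 0

    def take(n: int) -> int:
        nonlocal pos
        value = read_bits(data, pos, n)
        pos += n
        return value

    element_id, _tag, common_window = take(3), take(4), take(1)
    if element_id != 1 or common_window != 0:
        raise ValueError("not a CPE with separate windows")

    windows = []
    for _ in range(2):
        window = {"gain": take(8), "reserved": take(1),
                  "window_sequence": take(2), "shape": take(1),
                  "maxsfb": take(6), "predictor": take(1)}
        flags = (take(1), take(1), take(1))   # pulse, tns, gain control
        if window["reserved"] != 0:
            raise ValueError("always-0 bit is set")
        if window["maxsfb"] != 0 or window["predictor"] or any(flags):
            raise ValueError("window is not empty")
        windows.append(window)

    next_id = take(3)                         # 7 = END, 6 = FIL
    return windows, next_id

windows, next_id = parse_empty_windows(
    bytes([0x20, 0x99, 0x00, 0x02, 0x64, 0x00, 0x0E]))
print([w["gain"] for w in windows], next_id)  # [153, 153] 7
```

Both windows carry the same gain (0x99) and maxsfb = 0, and the element that follows is END, as in the diagram.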

Degenerate block with padding

You can find blocks that appear normal in size but that contain no data:

[0x20  ][0x68  ][0x20  ][0x01  ][0xa0  ][0x80  ][0x0d  ][0xED  ]...68 bytes....
CPE      -------/    -----/||||-------/    -----/    FIL---/-------/  END
         gain      maxsfb=0||||  gain    maxsfb=0       type  length=0x68
                           |||no gain control data
                           ||no tns data
                           |no pulse data

AAC mono

A mono AAC block starts with this:

[0x00  ]
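
The first byte is enough to tell a mono block from a stereo one, because the element type sits in its top 3 bits. The Python sketch below is a hedged illustration: `first_element` is a hypothetical name, and the element names other than CPE, FIL and END come from the AAC specification rather than from this guide.

```python
# Element types encoded in the top 3 bits of the first byte.
ELEMENT_NAMES = {0: "SCE", 1: "CPE", 2: "CCE", 3: "LFE",
                 4: "DSE", 5: "PCE", 6: "FIL", 7: "END"}

def first_element(first_byte: int):
    """Return (element name, instance tag) for the first byte of a block."""
    element_id = first_byte >> 5           # top 3 bits
    instance_tag = (first_byte >> 1) & 0x0F  # next 4 bits
    return ELEMENT_NAMES[element_id], instance_tag

print(first_element(0x00))  # ('SCE', 0): mono
print(first_element(0x21))  # ('CPE', 0): stereo, common window
print(first_element(0x20))  # ('CPE', 0): stereo, separate windows
```

SCE stands for single channel element, which is what a mono stream uses. In a CPE the lowest bit of the first byte is the common window flag; in an SCE that bit already belongs to the field that follows.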
