Corrupt Audio Files: How to Repair

Today we will provide a few tricks to recover corrupt audio files.

Whether you’re a professional using Apple’s Logic Studio or AVID Pro Tools, an extreme biker torturing the latest GoPro camera, or a video aficionado editing the latest family gathering with Windows Movie Maker, you can end up fighting with a corrupt WAV, M4A, AIF or QuickTime MOV audio file.

The fight is really unfair: even if you are skilled with computers, you will probably spend hours on it, to no avail. Professional video and audio repair services exist for a reason. We have spent the last 5 years developing recovery tools for Mac and a web video repair app to give you predictable results.

You’ve been warned.

Do It Yourself: Linear PCM Audio Repair

Note that the techniques below only work for Linear PCM format and for containers mentioned below. We won’t be able to repair an .m4a file containing AAC with those methods, for example.

Unfortunately, repairing compressed audio formats like AAC or ALAC (Apple Lossless) is much more complex and we’ll leave it for future posts.

Canon 5D Mk III video files playing only first second of audio and black image

We are seeing a pattern with damaged Canon 5D Mark III videos:
Corrupt files are playable with the usual tools (QuickTime Player, VLC), but the video is black or gray and the audio is OK during the first second, then becomes white noise.

Note: With Treasured, our diagnostics app, you can preview those corrupt files and then repair them through our Repair Service. This post is not about repairing videos, but about understanding why we see this pattern.

Let’s do a post-mortem of an unplayable video and figure out why.

Playing Detective

With the help of a hexadecimal editor, we look at the beginning of the file. Trained eyes will see the normal header of a QuickTime .MOV movie. It contains a moov atom, which is usually at the end of the file. Strange…

0000000: 0000 0018 6674 7970 7174 2020 2007 0900  ....ftypqt   ...
0000010: 7174 2020 4341 4550 0001 7fe8 6d6f 6f76  qt  CAEP....moov
0000020: 0001 004a 7564 7461 0000 0026 434e 4356  ...Judta...&CNCV
0000030: 4361 6e6f 6e41 5643 3030 3130 2f30 332e  CanonAVC0010/03.
0000040: 3030 2e30 302f 3030 2e30 302e 3030 0000  00.00/00.00.00..
0000050: 000c 434e 444d ffd8 ffd9 0001 0010 434e  ..CNDM........CN
0000060: 5448 0000 36e3 434e 4441 ffd8 ffe1 0e26  TH..6.CNDA.....&
0000070: 4578 6966 0000 4949 2a00 0800 0000 0900  Exif..II*.......



If we have a moov atom, let’s check if it is consistent and what is inside it. For this we use Dumpster or Atom Inspector, two free apps that you can download from developer.apple.com (free developer account required).

Surprise, our moov atom is valid! Complete with 3 tracks (video, audio, timecode) and metadata.
This explains why the corrupt video file opens: the valid moov atom at the beginning of the file contains all the information required to configure the movie and the tracks, so QuickTime Player initializes and can start decoding media.
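If you don't have Dumpster or Atom Inspector at hand, the top-level atoms can be listed with a few lines of Python. This is only a minimal sketch: the function name list_atoms is ours, and the header bytes are copied from the first two lines of the hex dump above. Each atom starts with a 32-bit big-endian size followed by a 4-character type.

```python
import struct

def list_atoms(data, start=0, end=None):
    """Return (offset, type, size) for each atom header found in data[start:end]."""
    end = len(data) if end is None else end
    atoms = []
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack(">I4s", data[pos:pos + 8])
        if size == 1:
            # A size of 1 means a 64-bit extended size follows the type field.
            if pos + 16 > end:
                break
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
        elif size == 0:
            # A size of 0 means the atom runs to the end of the file.
            size = end - pos
        if size < 8:
            break  # header is corrupt, stop walking
        atoms.append((pos, kind.decode("latin-1"), size))
        pos += size
    return atoms

# First 32 bytes of the damaged file, taken from the hex dump.
header = bytes.fromhex(
    "00000018667479707174202020070900"
    "717420204341455000017fe86d6f6f76"
)
for offset, kind, size in list_atoms(header):
    print(hex(offset), kind, size)
```

On these bytes the walk finds 'ftyp' (24 bytes) and then 'moov' (0x17fe8 bytes), so the moov atom ends at 0x18000, which is exactly where the 'mdat' header shows up later in the file.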

One Second of Audio

Now let’s look at our audio track. It corresponds to the second ‘trak’ atom. Tables ‘stco’ and ‘stsc’ tell us where audio media is fetched. The first entry of each table gives us: at address 98312, 48048 samples.
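Reading those two tables by hand is tedious, so here is a rough sketch of how they could be decoded. In the QuickTime file format, both payloads start with 1 byte of version, 3 bytes of flags and a 32-bit entry count; 'stco' entries are 32-bit chunk offsets, and 'stsc' entries are (first chunk, samples per chunk, sample description ID) triples. The payloads below are synthetic, built from the two values quoted above, and the 48 kHz sample rate is our assumption, not something read from the file.

```python
import struct

def parse_stco(payload):
    """Chunk offsets from an 'stco' payload (bytes after the 8-byte atom header)."""
    (count,) = struct.unpack(">I", payload[4:8])  # skip version + flags
    return list(struct.unpack(">%dI" % count, payload[8:8 + 4 * count]))

def parse_stsc(payload):
    """(first_chunk, samples_per_chunk, description_id) triples from 'stsc'."""
    (count,) = struct.unpack(">I", payload[4:8])  # skip version + flags
    fields = struct.unpack(">%dI" % (3 * count), payload[8:8 + 12 * count])
    return [fields[i:i + 3] for i in range(0, len(fields), 3)]

# Synthetic one-entry tables holding the values quoted in the post.
stco_payload = struct.pack(">4xII", 1, 98312)
stsc_payload = struct.pack(">4xIIII", 1, 1, 48048, 1)

first_offset = parse_stco(stco_payload)[0]
samples_in_first_chunk = parse_stsc(stsc_payload)[0][1]

ASSUMED_SAMPLE_RATE = 48000  # assumption: 48 kHz audio
duration = samples_in_first_chunk / ASSUMED_SAMPLE_RATE
print(first_offset, samples_in_first_chunk, duration)
```

If the 48 kHz assumption holds, 48048 samples last almost exactly one second, which would match the one second of clean audio we hear.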



Now let’s look at what we have at this address. Address 98312 (hexadecimal 0x18008) is just after the ‘mdat’ atom header, and Linear PCM audio data starts here as expected. Everything looks as it would in a valid movie file.


0018000: 0bd1 afd4 6d64 6174 0b0c 0b0c bc0b bc0b  ....mdat........
0018010: 780b 780b 9b0b 9b0b 390b 390b 190a 190a  x.x.....9.9.....
0018020: 0109 0109 0d08 0d08 3507 3507 e106 e106  ........5.5.....
0018030: 3507 3507 4d07 4d07 a506 a506 b006 b006  5.5.M.M.........
0018040: 8007 8007 9607 9607 f706 f706 9306 9306  ................
0018050: 4b06 4b06 8b05 8b05 cf04 cf04 3a04 3a04  K.K.........:.:.
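To convince yourself that these bytes really are audio, you can decode a few of them. The sketch below takes the first 16 bytes after 'mdat' from the dump above and reads them as 16-bit signed little-endian samples on 2 interleaved channels. Both the sample format and the channel count are assumptions on our part; in a real repair they would come from the track's sample description.

```python
import struct

# First 16 bytes of audio data, copied from the hex dump (address 0x18008).
raw = bytes.fromhex("0b0c0b0cbc0bbc0b780b780b9b0b9b0b")

# Assumption: 16-bit signed little-endian samples, 2 interleaved channels.
samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
frames = list(zip(samples[0::2], samples[1::2]))

for left, right in frames:
    print(left, right)
```

Under these assumptions the values move smoothly from one frame to the next and both channels carry the same signal, which is what real PCM audio tends to look like. Random or compressed data would jump all over the place.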

Black Video

Now let’s look at the first video entry: the address is correct, but the length is not. This explains the black image: H.264 cannot be correctly decoded. If you dig deeper, you see that the audio and video tables don’t correspond to the actual data inside the file. Maybe those tables refer to the last recorded video.

If you look at more corrupt videos, the same facts hold true. Everything can finally be explained:

  • A fixed-length ‘moov’ atom is written at the beginning of the file. This is why it can be opened.
  • Audio and video tables are wrong. This is why the video is corrupt.
  • But since the first audio block is always at the same address and has the same length, it is correctly reproduced.
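A wrong length in the video table can be spotted programmatically. The sketch below is one possible check, not the method used by our repair tools: it assumes the H.264 samples are stored as NAL units, each preceded by a 4-byte big-endian length (the usual layout in MOV files, but an assumption here), and verifies that those units fill the declared sample size exactly. The function name and the test bytes are made up for illustration.

```python
def nal_lengths_match(data, offset, declared_size, prefix_len=4):
    """True if length-prefixed NAL units at offset fill exactly declared_size bytes."""
    pos = offset
    end = offset + declared_size
    if end > len(data):
        return False  # table points past the end of the file
    while pos + prefix_len <= end:
        nal_len = int.from_bytes(data[pos:pos + prefix_len], "big")
        if nal_len == 0:
            return False  # an empty NAL unit is not plausible
        pos += prefix_len + nal_len
    return pos == end

# Hypothetical sample: two NAL units of 3 and 2 bytes, 13 bytes in total.
sample = b"\x00\x00\x00\x03abc" + b"\x00\x00\x00\x02de"
data = b"\x00" * 8 + sample + b"\x00" * 16

print(nal_lengths_match(data, 8, 13))  # size agrees with the data
print(nal_lengths_match(data, 8, 10))  # size from a stale table
```

When the table belongs to another recording, as we suspect here, the declared size almost never lines up with the NAL unit boundaries and the check fails.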

What it tells us about Canon 5D Mark III firmware

Canon Mk III firmware writes a placeholder ‘moov’ atom at the beginning of the file, then writes media data (video and audio) while the camera is recording. This placeholder is not empty; it could actually be the same ‘moov’ as in the last recorded video.

When recording ends, the camera overwrites the placeholder with the real ‘moov’ atom containing good media tables.

If this last operation doesn’t complete, the video is corrupt.

Cameras usually write the ‘moov’ atom at the end of the file: data is written in the same order as it is produced. But Canon 5D Mk3 firmware does it differently, as we have seen. This is possible because camera limitations dictate a maximum ‘moov’ atom size, so the corresponding space can be “booked” at the beginning of the file (placeholder) and then overwritten.

I can only see one advantage to this: since the ‘moov’ atom is at the beginning of the file, the file can be streamed. These are known as fast-start, streamable videos. This is important for Internet applications, but honestly I don’t see how this matters for Canon 5D workflows.

Apple’s new Fusion Drive and Video Repair

One of the most exciting announcements during Apple’s Oct. 23 event was the “Fusion Drive”: a mix of solid state drive (SSD) and mechanical hard disk that brings the best of both worlds: the high speed of SSDs and the high capacity of HDDs.

How the “magic” works is explained here.

Fusion Drive and Video

The Catch

But if you read Apple’s technical note, you will see that this doesn’t work great for high-speed video capture.

My understanding is that high-speed, high-volume writing operations have a serious bottleneck, because when the 4 GB SSD write cache is full, it has to be moved to the hard disk, and this causes some latency that probably drops frames during video capture.
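To get a feel for the numbers, here is a back-of-the-envelope sketch. The 4 GB cache size is the figure mentioned above; the capture and drain rates are purely hypothetical values chosen for illustration, not measurements of a Fusion Drive.

```python
def seconds_until_cache_full(cache_gb, capture_mb_s, drain_mb_s):
    """Time before the write cache fills, or None if it never fills."""
    net_fill_rate = capture_mb_s - drain_mb_s  # MB/s accumulating in the cache
    if net_fill_rate <= 0:
        return None  # the hard disk keeps up, the cache never fills
    return cache_gb * 1000 / net_fill_rate

# Hypothetical rates: capturing at 200 MB/s while the HDD absorbs 100 MB/s.
print(seconds_until_cache_full(4, 200, 100))
```

With those made-up rates the cache would be full in well under a minute, so any capture longer than that would run at hard disk speed at best.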

Video Repair?

How does it affect video repair?

Video capture limitations will probably keep video professionals away from those Fusion Drives, at least for video capture. High-throughput Thunderbolt drives are a better fit.

But Fusion Drives will become mainstream among casual users. I fear that this will make our video repair job harder, in particular for DeepMediaScan jobs. (DeepMediaScan is used when footage cannot be found inside files and has to be scanned directly on disk.)

With traditional disks, footage is written sequentially as the disk fills up, so it’s possible for DeepMediaScan to extract it consistently.
In a Fusion Drive, it’s more complex. Footage is written to an SSD cache and then maybe it’s transferred to the hard disk, or maybe not. In the end, there is more fragmentation, and footage recovery is more challenging.

As soon as I get my hands on a Mac with a Fusion Drive, I will carry out repair tests and provide advice about Fusion Drive usage.


Stop Whining and Start Measuring

Last month, I started to collect data about how visitors browse our website. Just standard analytics at work here.

First surprise: 40% of Treasured downloads are from Windows users, who will obviously never be able to run our Mac video recovery app.

Although the website clearly states that this is a Mac app, a lot of Windows people are reaching the big Download button. This is not cluelessness, just the standard practice in a “we don’t read, we scan” world.

Second surprise: Among Mac people with a fresh copy of Treasured just downloaded, only 60% run a diagnostic on a damaged video. In other words, 40% never get any value from the software.
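Putting the two numbers together shows how leaky the funnel is. The sketch below assumes that every download not coming from Windows comes from a Mac, which is a simplification, and uses a round batch of 1000 downloads for illustration.

```python
def funnel(downloads, mac_share_pct, diagnostic_pct):
    """Return (Mac downloads, downloads that lead to a diagnostic)."""
    mac = downloads * mac_share_pct // 100
    diagnosed = mac * diagnostic_pct // 100
    return mac, diagnosed

# 40% Windows downloads means 60% Mac; 60% of those run a diagnostic.
mac, diagnosed = funnel(1000, 60, 60)
print(mac, diagnosed)
```

Out of 1000 downloads, only 360 end with a diagnostic being run. Roughly two downloads out of three are wasted.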

I could give more examples. Every time I measured a new metric related to the use of our repair service, I discovered that the process was grossly inefficient.

The most amazing thing: Those problems are usually very easy to fix. You can just add a warning for Windows guys trying to download Treasured and redirect them to MP4repair.org, for example.

This is a real eye-opener. Don’t start working on hard problems until you get the basic stuff right.

The Secret: Measuring