Voice Memos “Epidemic Failure” and How to Avoid It

Since the fall of 2018, we at Aero Quartet have observed a surge in requests to fix audio recordings made with the iPhone Voice Memos app.

TL;DR

  • In iOS 12, the Voice Memos app has a feature called Replace for editing an original recording.
    Replace is not reliable and can corrupt the original recording.
  • Corrupted recordings can be repaired, but are significantly shorter in duration than the originals.
    The missing duration is lost forever.

What I Have Observed

I have pulled data from recent Voice Memos requests to see what they have in common.
A pattern quickly emerged:

  • All the failures happened on phones running iOS 12 and, as a consequence, the redesigned Voice Memos application.
    The iPhone model doesn’t matter.
  • Failed recordings are always truncated, i.e. shorter in duration than what the user reports having recorded, and also shorter than what the phone displays.
  • The recording cannot be played back on the phone or on other devices.

I have plotted a dozen requests here:
The green line is where the dots should be. Unfortunately, all the corrupted recordings are significantly shorter than the originals, which is why the dots are below the line.
On average, only one quarter to one third of the audio is still present in the damaged file.

Reproducing the Issue

I ran a few experiments to reproduce the problem. On my iPhone 8 running iOS 12, the Voice Memos app has two recording settings: Compressed and Uncompressed.

The samples recorded with each setting have the following audio profiles:
Compressed: AAC format (mono 48000Hz), bitrate 64 kbit/s
Uncompressed: Apple Lossless format (ALAC mono 48000Hz), bitrate 720 kbit/s

When I compare this to the damaged samples that I am receiving, the first thing that strikes me is that the format and bitrate are different.
All damaged files come in AAC format, but with a sample rate of 44100Hz and a bitrate of around 225 kbit/s.
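
The comparison above can be written down as a small check. This is only a minimal sketch: it assumes the codec, sample rate and bitrate have already been read with whatever media inspection tool you prefer, and the names `AudioProfile` and `looks_like_replace_output`, as well as the 40 kbit/s tolerance, are hypothetical choices of mine, not anything that Voice Memos or Treasured exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioProfile:
    codec: str            # e.g. "aac" or "alac"
    sample_rate_hz: int   # e.g. 48000 or 44100
    bitrate_kbps: float   # average bitrate in kbit/s


def looks_like_replace_output(profile: AudioProfile,
                              tolerance_kbps: float = 40.0) -> bool:
    """Return True when the profile matches the pattern seen in damaged files.

    Pattern from the observations above: AAC, 44100 Hz, around 225 kbit/s.
    The tolerance is an assumed value chosen for illustration only.
    """
    return (
        profile.codec.lower() == "aac"
        and profile.sample_rate_hz == 44100
        and abs(profile.bitrate_kbps - 225.0) <= tolerance_kbps
    )


# The two regular recording settings versus a typical damaged sample.
compressed = AudioProfile("aac", 48000, 64)
uncompressed = AudioProfile("alac", 48000, 720)
damaged = AudioProfile("aac", 44100, 230)
```

With these values, only `damaged` matches the pattern. A match does not prove that a file is corrupted, it only suggests that it went through an edit.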

After playing a bit with the app, I finally managed to produce samples with those settings when using the Edit Recording function.
Whenever you use the Replace button to alter the original recording, Voice Memos performs an audio transcoding operation in the background, which resamples the audio to 44100Hz.

This operation is necessary because replacing audio disrupts the organization of the encoded media. In other words, it cannot be done directly on encoded media: replacement involves a second audio track, and the two tracks need to be “flattened”, to use the technical term, into a new encoded audio track.

The first take-away is that the failure only occurs when the Replace feature is used.

Mechanism of Failure

Now, why does the failure happen? Although I don’t have access to the source code, I can guess how and why the problem occurs.

The first hint is the disconnect between the user interface and the reality:
Whereas the app gives the impression that Replace happens in real time, flattening the recording takes some time to complete.
Even if you replace only a few seconds in a two-hour recording, your iPhone will need several seconds, maybe even minutes for very long recordings, to produce the modified recording.
This operation happens in the background, but Voice Memos doesn’t reflect this. Instead of showing a progress bar and labelling the recording as “in process” or “unavailable”, it makes it look like the change is done and the recording is available when in reality it’s not.

From here I can guess at several possible flaws in the Voice Memos software:

  • If flattening fails or does not complete, the recording being flattened overwrites the original, as is. This explains why it’s truncated.
  • The user interface lets you initiate operations (like a new Edit) on a recording being flattened and this makes the app crash.
  • Flattening and replacing the recording occur as a single-step operation, without proper exception handling. The second step, replacement, should never happen if the first step hasn’t been completed and verified.
  • Instead, the software should do it in two steps. First, flattening would make the original clip unavailable, and a progress indicator would show on the screen that it’s being processed. Second, only in case of success, the modified clip would replace the original one. In case of failure, the software would show an error message and revert the state of the recording.
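
To make the two-step idea concrete, here is a minimal sketch in Python. It is not Apple’s code and not what Voice Memos does: `safe_replace`, `flatten` and `verify` are hypothetical names, and the flattening and verification themselves are passed in as plain functions. The only point is the ordering: write the result to a temporary file, verify it, and only then swap it in for the original.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def safe_replace(original: Path,
                 flatten: Callable[[Path, Path], None],
                 verify: Callable[[Path], bool]) -> bool:
    """Flatten into a temporary file, verify it, then swap it in.

    `flatten(src, dst)` and `verify(path)` are hypothetical callbacks
    supplied by the caller. The original file is only touched once the
    new file has been completely written and verified.
    """
    # Step 1: flatten next to the original, never on top of it.
    fd, tmp_name = tempfile.mkstemp(dir=original.parent, suffix=".flattening")
    os.close(fd)
    tmp = Path(tmp_name)
    try:
        flatten(original, tmp)
        if not verify(tmp):
            raise ValueError("flattened output failed verification")
        # Step 2: replace the original in a single rename operation.
        os.replace(tmp, original)
        return True
    except Exception:
        # Failure: discard the partial output and keep the original as is.
        tmp.unlink(missing_ok=True)
        return False
```

If `flatten` is interrupted or `verify` rejects the output, the function returns False and the original recording is left untouched, which is exactly the behaviour that seems to be missing.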

There’s never a good excuse for the sloppy implementation of a feature like Replace.
I suspect that Apple engineers wanted first and foremost to make Voice Memos look cool and to give the illusion of direct manipulation and instant effects. But technology doesn’t work that way: flattening takes time, and not reflecting it in the user interface resulted, I believe, in a flawed design.

Letting the user start a new edit while flattening is still in progress is where the user interface and the processing come into conflict. That is when the app crashes, and probably when corrupted recordings are generated.

How to Avoid It

Having understood the failure mechanism, I can now give some advice to avoid it:

  1. Avoid using “Edit Recording”, as it does not seem to be a reliable feature.
  2. If you have to use it, take some precautions:
     • Always use “Duplicate” to create a back-up of your recording before using “Edit Recording”.
     • After editing a Voice Memo, give your phone enough time to perform the “flattening”.
       In particular, don’t do a second “Edit Recording” or a “Share…” immediately after an edit.

What Our Service Can Do

I often tell my customers that we do magic, but no miracles!
In the Voice Memos case, the magic is to fix the corrupted recording, and the miracle would be to restore the full duration, when only part of it is present in the damaged file.

If you have a corrupted Voice Memo (or any unplayable audio file, from any device), you can use our Treasured service to determine the true duration of audio present in the file, and then to repair it. Unfortunately, for the reasons I have explained above, in Voice Memos the duration is usually significantly shorter than expected, but many customers still consider that for important material, it’s better to recover one third of it than to get nothing.
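
For a rough back-of-the-envelope idea of what to expect, the duration a file can hold follows from its size and average bitrate. The sketch below is only that arithmetic. It is not how Treasured analyses a file, it ignores container overhead, and the function names and the example file size are hypothetical.

```python
def estimate_duration_seconds(size_bytes: int, bitrate_kbps: float) -> float:
    """Estimate audio duration from file size and average bitrate.

    Ignores container overhead, so treat the result as an upper bound.
    """
    if bitrate_kbps <= 0:
        raise ValueError("bitrate must be positive")
    return size_bytes * 8 / (bitrate_kbps * 1000)


def surviving_fraction(recovered_seconds: float,
                       expected_seconds: float) -> float:
    """Fraction of the expected recording that is still present."""
    if expected_seconds <= 0:
        raise ValueError("expected duration must be positive")
    return recovered_seconds / expected_seconds


# Hypothetical example: a 28,125,000-byte file at about 225 kbit/s,
# for a recording that was supposed to last one hour.
recovered = estimate_duration_seconds(28_125_000, 225)  # 1000.0 seconds
fraction = surviving_fraction(recovered, 3600)          # about 0.28
```

In this made-up case, a bit more than a quarter of the hour would still be present, which is in line with the one quarter to one third that I typically observe.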

Interviewed

Last weekend, I had the opportunity to be interviewed by Adam Forgione, the guy behind Select Music Library, the website that helps filmmakers find original music for their productions and musicians get paid for their work.

Adam has been a customer of Aero Quartet in the past. He wanted to share the good experience he had with us with fellow filmmakers, so he asked me to answer a few questions.

All you ever wanted to know about video repair…

The result is quite good. Thanks, Adam, for pulling this off:

“My interview lasted almost an hour but it’s filled with gold.
Here are the chapters if you want to skip to what you want:”

1:36 – About AeroQuartet
3:31 – How It Works / The Process
10:29 – Common types of cameras involved
11:14 – Pricing
18:56 – Common types of corrupt files
23:04 – How video data is stored on cards or disks
25:36 – Types of media that can be recovered
28:18 – Audio only files
31:20 – Tips to avoid problems in the future
39:43 – Canon EOS issue / bit flipping
41:57 – Counterfeit cards
44:38 – How AeroQuartet got started


Transforming the Universe

Cachu Hwch *

* It’s all gone wrong

On the 10th of March 2011, Howard Stringer, then Sony’s chief executive, left Tokyo in a wheelchair. A slipped disk in his back required emergency surgery, and he was flying home to New York for the procedure.

What the Welshman didn’t know was that his situation, already painful, was about to become even worse. So bad indeed that he would have to step down as CEO the next year, and that his back surgery would have to wait!

When his jet landed the next day, he learned that the biggest earthquake in memory and a devastating tsunami had just struck the east coast of Japan.

Sony factories in the Fukushima and Miyagi prefectures, eight in total, immediately halted production. Those close to the shore were so badly damaged by flooding that it took over one year to resume production.

Tragedy, however, would also transform Sony in unexpected ways…

Connecting the Dots

Fast forward to the end of 2016: As I was putting the final touches on Treasured 4, I was a bit annoyed.

The rewritten “engine”, powered by libavcodec, was really shining. I could finally display those high-end Sony XAVC and Canon XF-AVC frames in all their 12-bit glory.

However, my redesign also had a few drawbacks: Some video formats would no longer “render”, due to the lack of a libavcodec equivalent for some proprietary QuickTime codecs.

REDCODE and Sony HDCAM-SR fell into this category, and that was really annoying.
I quickly compiled some statistics on the previous years’ requests using those formats, and decided the loss wouldn’t be too significant. HDCAM-SR numbers indicated that the format was about to disappear, after decreasing for several years in a row.

Still, we had Sony at both ends of the spectrum: At the head, Sony XAVC, strongly leading the 4K pack; at the tail, the dwindling Sony HDCAM-SR.

I couldn’t help but remember what had happened 5 years earlier, the flooded factories, the production lines stopped for months, and I made the connection:

Think for a moment about Shiva, the Hindu god of creation and destruction, who creates, protects and transforms the universe… Tragedies always have a flip side – and don’t get me wrong, with over 20,000 casualties we’re better off without tragedy – and for Sony the flooded factories may have been a catalyst that accelerated the transition to new technologies.

But before we analyze the consequences over the years, let me first tell you the inside story of that spring and summer of 2011.

The Twilight of the Tapes

When production of HDCAM-SR tapes suddenly stopped in March 2011, professional video tapes were the medium of choice for prime time, episodic TV production.

Although Sony communicated that they expected three months of downtime, the folks in Hollywood, with only two weeks of HDCAM-SR tape supply and a shooting season about to begin, immediately saw the Sendai factory flooding as an existential threat, and started to look for alternatives.

Some of them were already committed to an HDCAM tape workflow and had no choice but to frantically purchase all the available stock of HDCAM-SR tape, to secure production until the end of the season. Tape prices on the gray market surged and, despite efforts by Sony to alleviate distribution bottlenecks, stocks evaporated in a few weeks.

For many other productions, there was still a plan B: Accelerate the move to a tapeless workflow.

Others have explained in detail the immediate impact of the HDCAM-SR tape shortage on the industry. To summarize, the shock wave travelled from production down to post-production and delivery:

  • The Japan tsunami accelerated a trend that was already happening: the use of tapeless cameras.
  • Forced to embrace the new ARRI, RED, and other tapeless cameras, the production houses had their fears quickly dispelled: The result was flawless, the technology was ready!
  • Post-production houses were now receiving files instead of tapes. Despite huge investments in SR decks, they were forced to adapt.
  • For delivery of masters to broadcasters, HDCAM-SR tape, the standardized format, was suddenly questioned. Although most deliveries were still done on tape, master file transfers through fiber optic lines became common. The future was here.
  • Archiving, on the other hand, moves at a slower pace. The transition from film to HDCAM-SR was still recent in 2011, but the shortage raised the question.

And with all that, the summer passed quickly…

Gloomy Future

In October, seven months after the disaster, Sony finally managed to restart production and to resume shipments of HDCAM-SR tapes.
But so much effort would hardly be rewarded.

Productions that had made the switch during the summer would no longer need the tapes. Those who had purchased a supply of tapes in the aftermath of the disaster would not need them either, and had already planned to switch to tapeless the next season.

In years to come, Sony would continue to manufacture tapes, but the writing was now on the wall.
With the sudden decline of a very profitable business that otherwise would have enjoyed several more years of bonanza, and the inevitable loss of dominance in the professional video industry, Sony was facing a gloomy future.

Of course, any attempt to regain a dominant position had not only to address the 2011 events, but also to envision the future of professional video, and deliver the new end-to-end solution, a kind of HDCAM-SR for 2015-2020 if you will.

In response to that challenge, Sony correctly identified the opportunities born out of crisis:

  • What will replace HDCAM-SR as a standard for post and delivery? Getting rid of HDCAM has opened the eyes of the industry to the importance of a standard medium. Let’s design a solution that can become the new standard!
  • 4K is about to happen. Let’s embrace it!
  • Professional file-based workflows have specific needs (quality, bitrate, grading, metadata, interoperability, …). Let’s use the appropriate technology: AVC/H.264 and MXF containers.

2011 is reaching an end, and now Sony has a plan…

Resurrection

On October 30, 2012, Sony unveiled the XAVC recording format!

In the past 12 months the tape industry had been disrupted and chaos was looming.
Still, someone had to be the first to make a compelling offer, and start building a new order.
Sony had just done it!

Let’s speculate a bit:
In what position would Sony be today in the professional video markets if not for the 2011 flooding?

I believe that they wouldn’t be enjoying a two-year lead over Canon, healthy profits, and such a dominant position.
The profitable HDCAM line would have slowed them down, and the risk of cannibalizing their tape business would have discouraged innovation.

Sure, they would have moved to 4K AVC, but too bland, too late. They wouldn’t lead.

A corollary to the Innovator’s dilemma is that companies with the biggest investments in old and current technologies are the least inclined to invent the next generation of products.
In 2011, Sony’s tape business was probably in the middle of the “milking” phase. Sony was definitely not the natural innovator, but the tsunami changed everything.

Sony’s video division had to face the perfect storm, but with the proper timing, vision and resolve, against all odds, they managed an impressive turnaround. A good lesson for the competitors!

What’s new in 2017

First of all, Happy New Year to all the readers!

We are starting 2017 very strong at Aero Quartet:
A redesigned Treasured is ready to enter beta and will be released in the coming weeks.

Treasured is the cornerstone of our Movie Repair Service: it provides diagnostics and shows a preview of the contents of unplayable video files.
A lot of work has gone into Treasured since its first release in 2008, and we are about to release the most important redesign ever.

Our development roadmap follows a tik-tok schedule: in even version numbers we make changes to the bowels of the application, changes that are not always visible to the user, and in odd version numbers we make changes to the user interface.

Tik

In January, we have been putting the final touches on this “tik” release: We have rewritten almost everything under the hood, yet on the surface it looks very similar to version 3.4, released in 2016.

The “engine” was redesigned to get rid of dependencies on deprecated parts of macOS (like 32-bit QuickTime and Carbon libraries) that will no longer work in the coming years.

Treasured is now a 64-bit application, ready for the next 10 years. This redesign is the foundation upon which future developments will be built.

For the most technically inclined among you, we are now relying on libavcodec (of ffmpeg fame) to render the media found inside the damaged files, and this brings some advantages over previous versions:

  • H.265, the new high-performance codec, is now detected and previewed. (More about this below)
  • Treasured will no longer ask you to install some QuickTime codecs to render images. Now everything comes bundled!
  • Very fast selection of “Candidates” for H.264 format

And many more opportunities that we have identified, and that will progressively be deployed in future releases…
In summary, with this Treasured redesign we make a bold statement:
We want to be here for the next ten years.

H.265 aka HEVC (High Efficiency Video Coding)

And those next ten years will likely belong to H.265, the new high efficiency codec that promises to cut bitrates by 50% over H.264.

We successfully repaired a dozen H.265 videos last year, mostly from Samsung Gear cameras.
For 2017 and 2018, we are anticipating a surge, as H.265 encoding chips will become mainstream in high-end and DSLR equipment.

Treasured is not just capable of detecting and previewing H.265, there’s more…
We have figured out that repairing corrupt mov or mp4 files with H.265 encoding is not very different from what we have been doing with H.264 for over 8 years. In other words, all our experience in delivering high-quality, affordable repairs will be immediately available for new H.265 videos.

Tok

2017 will be a busy year, because we are also planning a huge “tok” release!

The Treasured user interface is still surprisingly similar to the first version, which I unveiled on this blog in 2008.
This design has served us well, but is not adequate for the next decade:

  • Mac desktop application user interfaces have progressed in recent years, influenced by smartphone apps and by the evolution of macOS. Customers deserve a high-quality, intuitive and beautiful user experience, and we are committed to making it happen.
  • Progress in Movie Repair technology enables a profound redesign of the interface. It’s not just about aesthetics, it will be about conceptual design, with new, intuitive and useful objects and features.

For the moment, we don’t have mock-ups of the “tok” Treasured to show you, but rest assured that this blog will be the first place where you’ll see them.
And you’ll love it!
