Thanks to the success of the Sony A7 and AX series, XAVC-S is today one of the most common formats that we are asked to repair, and so it deserves its number 3 ranking in my list of Tops and Flops of 2015.
In Sony's XAVC family, the “S” flavor is the mainstream one, used by a wide range of devices, from DSLR cameras to cheap consumer camcorders.
In the novel “The Leopard”, Tancredi has the famous line:
If we want things to stay as they are, things will have to change.
This illustrates perfectly what XAVC-S is in essence: a new name for our usual suspect, AVC/H.264.
Back in 2013, when XAVC was unveiled, there wasn’t a lot of choice as far as video compression was concerned:
Once the future technology (HEVC/H.265) was discarded because it wasn’t ready, Sony could only pick either the 20-year-old MPEG-2 or the 10-year-old AVC/H.264.
H.264 scales to 4K, has better compression and quality, so the technical decision was easy.
But the marketing guys were not happy with that, because
you know, everybody else is doing H.264, we are Sony and for our new 4K products we need differentiation. We will use H.264 but it must be with another fancy new name.
OK let’s call it XAVC.
And the marketing guy added: “and we need different brand names for cinematography-grade, professional and consumer H.264”, and so the names XAVC-I, XAVC-L and XAVC-S were invented.
Therefore, XAVC-S is nothing more than regular AVC/H.264 wrapped in a standard MP4 container.
Let me tell you about my first contact with XAVC-S, in spring 2014.
Ironically, this was a corrupt recording of a conference on “Change Management”…
I was expecting this to be business as usual, a boring video (change management, anybody?) recorded in standard H.264 in an MP4 container.
Apparently nothing new under the sun... but I was wrong.
What first grabbed my attention was Treasured's insistence on reporting “MXF”. In a video allegedly wrapped in MP4, the last thing you expect to detect is precisely MXF.
Treasured sometimes has “false positives”, so I fired up my hex editor to check what was going on.
Upon searching for the MXF header (060e 2b34), I quickly found lots of matches inside the file.
00000c0: 0008 0100 00ad 034b 060e 2b34 0253 0101 .......K..+4.S..
00000d0: 0c02 0101 0101 0000 8300 000c 8000 0002 ................
00000e0: b3e3 8001 0002 51ad 060e 2b34 0253 0101 ......Q...+4.S..
00000f0: 0c02 0101 0201 0000 8300 0036 8100 0010 ...........6....
0000100: 060e 2b34 0401 010b 0510 0101 0101 0000 ..+4............
0000110: 8101 0001 0181 0900 0800 0000 0100 0000 ................
0000120: 6481 0a00 0200 0081 0c00 0200 6481 0d00 d...........d...
0000130: 0101 060e 2b34 0253 0101 0c02 0101 7f01 ....+4.S........
0000140: 0000 8300 002f e000 0010 9669 0800 4678 ...../.....i..Fx
0000150: 031c 2051 0000 f0c0 1181 e300 0001 00e3 .. Q............
0000160: 0200 0100 e303 0001 ffe3 0400 0844 2015 .............D .
0000170: 0625 0321 5100 0000 0000 0000 0000 0000 .%.!Q...........
Obviously Treasured was right, the Sony engineers had inserted MXF stuff inside their XAVC-S MP4 file. WTF?
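For readers without a hex editor at hand, the same search can be sketched in a few lines of Python. This is only an illustration, not how Treasured detects MXF: the function name `find_all` is mine, and the sample bytes are the rows at 0xc0 to 0xef of the dump above, including the 060e 2b34 key prefix.

```python
# The 4 bytes that start every MXF (SMPTE) key.
MXF_KEY_PREFIX = bytes.fromhex("060e2b34")


def find_all(data, pattern):
    """Return the offsets of every occurrence of pattern in data."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets


# Rows 0xc0..0xef of the dump (48 bytes), whitespace removed.
sample = bytes.fromhex(
    "0008010000ad034b060e2b3402530101"
    "0c020101010100008300000c80000002"
    "b3e38001000251ad060e2b3402530101"
)

for offset in find_all(sample, MXF_KEY_PREFIX):
    # Offsets are relative to the sample; add 0xc0 for the file offset.
    print(f"match at 0x{0xc0 + offset:07x}")
```

To scan a real file, read it with `open(path, "rb").read()` and pass the result to `find_all`.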
I asked Mike (my “Change Management” customer) to send me a playable XAVC-S file, so I could investigate this in depth.
What were they smoking?
What I discovered was even more puzzling: the MXF data is in fact part of a bigger, more byzantine thing.
The bits of MXF are inside the samples of a real-time metadata track of the MP4.
Every video frame has one corresponding metadata sample of 1024 bytes.
Metadata samples look like the diagram below, and are padded with 00 bytes up to the 1024-byte size.
Other samples also include a mysterious “kkad” atom at the end (more on this below).
Header was easy to understand:
It reports the length in bytes of the components of the sample:
- 8 bytes for the header itself
- 0xAD bytes for the MXF KLV (Key Length Value)
- 0x34B bytes for padding or kkad data.
And of course, the sum is 1024.
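Here is a minimal sketch of that header check, using the 8 header bytes at offset 0xc0 of the dump (0008 0100 00ad 034b). I am assuming four big-endian 16-bit fields, which is what the dump suggests; the meaning of the second field (0x0100) is unknown to me, and the names `parse_header` and `SAMPLE_SIZE` are mine.

```python
import struct

SAMPLE_SIZE = 1024  # size of every metadata sample, as observed


def parse_header(sample):
    """Split the 8-byte sample header into its four 16-bit fields.

    Assumed layout (inferred from the dump, not from any Sony spec):
    header length, unknown field, MXF KLV length, padding/kkad length.
    """
    header_len, unknown, klv_len, tail_len = struct.unpack_from(">4H", sample, 0)
    return {
        "header_len": header_len,
        "unknown": unknown,
        "klv_len": klv_len,
        "tail_len": tail_len,
    }


# Header bytes from the dump, followed by zeros standing in for the rest.
header = bytes.fromhex("0008010000ad034b")
sample = header + bytes(SAMPLE_SIZE - len(header))

fields = parse_header(sample)
total = fields["header_len"] + fields["klv_len"] + fields["tail_len"]
print(fields, total)
```

The three lengths add up to `SAMPLE_SIZE`, which makes a handy sanity check when hunting for metadata samples in a damaged file.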
MXF KLV itself contains several items, but most are encoded in a proprietary format, and I haven’t managed to understand everything.
In any case, it is not very important and I can prove it:
This metadata is not even necessary.
For two years, I have been repairing XAVC-S without bothering to include the metadata inside the repaired videos, and no customer has reported any problem!
Therefore, Sony engineers have managed a real tour de force:
They have created a video format whose only “new” feature (besides the shiny XAVC-S name) is to stuff in some metadata that is not even being used!
To really appreciate the feat, let’s take a look at this mysterious kkad stuff:
“kkad” blocks include tables with addresses, lengths, and time code information of every video sample.
And with this we close the circle of stupidity:
This information would be useful if it wasn’t already included in the standard MP4 video track.
Yes, Sony engineers have added a metadata track with a proprietary and byzantine format that not only isn’t used, but contains information already present elsewhere in the MP4 container.
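To show what “already present” means, here is a sketch of how the address and length of every sample are derived from the standard MP4 sample tables: chunk offsets (stco), sample-to-chunk (stsc) and sample sizes (stsz). The function name and the toy numbers are mine, and the tables are passed in as plain Python lists rather than parsed from a file.

```python
def sample_locations(chunk_offsets, sample_to_chunk, sample_sizes):
    """Return (file_offset, size) for every sample of a track.

    chunk_offsets:   stco entries, one file offset per chunk
    sample_to_chunk: stsc entries as (first_chunk, samples_per_chunk),
                     first_chunk is 1-based and entries are sorted
    sample_sizes:    stsz entries, one size per sample
    """
    locations = []
    sample = 0
    for chunk_index, chunk_offset in enumerate(chunk_offsets, start=1):
        # The stsc entry that applies is the last one starting at or
        # before this chunk.
        per_chunk = 0
        for first_chunk, count in sample_to_chunk:
            if first_chunk <= chunk_index:
                per_chunk = count
        offset = chunk_offset
        for _ in range(per_chunk):
            if sample >= len(sample_sizes):
                return locations
            size = sample_sizes[sample]
            locations.append((offset, size))
            offset += size  # samples are contiguous inside a chunk
            sample += 1
    return locations


# Toy example: two chunks of two samples each.
toy = sample_locations([1000, 5000], [(1, 2)], [100, 200, 300, 400])
print(toy)
```

Time codes come from the stts table in the same way, so nothing in the kkad tables is needed to locate or time a frame.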
And there’s more strange stuff throughout our XAVC-S file…
For example, inside the H.264 stream we have NAL units of type 6 (SEI) with more metadata “smoke”.
Or the unnecessary file header below:
0000000: 0000 001c 6674 7970 5841 5643 0100 1fff ....ftypXAVC....
0000010: 5841 5643 6d70 3432 6973 6f32 0000 0094 XAVCmp42iso2....
0000020: 7575 6964 5052 4f46 21d2 4fce bb88 695c uuidPROF!.O...i\
0000030: fac9 c740 0000 0000 0000 0003 0000 0014 ...@............
0000040: 4650 5246 0000 0000 2000 0000 0000 0000 FPRF.... .......
0000050: 0000 002c 4150 5246 0000 0000 0000 0002 ...,APRF........
0000060: 7477 6f73 0000 0000 0000 0000 0000 0600 twos............
0000070: 0000 0600 0000 bb80 0000 0002 0000 0034 ...............4
0000080: 5650 5246 0000 0000 0000 0001 6176 6331 VPRF........avc1
0000090: 0164 002a 0003 0002 0000 c350 0000 c350 .d.*.......P...P
00000a0: 0032 0000 0032 0000 0780 0438 0001 0001 .2...2.....8....
Besides the funny “XAVC”, we have here a description of the tracks:
Audio “twos” at 48000 Hz (0xbb80)
Video “avc1” with resolution 1920 x 1080 (0x780 0x438)
Again, all this information was already present in the MP4 tracks, so why duplicate it in other places?
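For the curious, here is a rough sketch of how that header can be walked with plain MP4 box parsing. The bytes are the first 0xb0 bytes of the file as dumped above, with the sample-rate and resolution fields cited in the text. The field offsets inside APRF and VPRF are my own reading of this one dump, not a documented layout, and `walk_boxes` is a name I made up.

```python
import struct

# First 0xb0 bytes of the file, transcribed from the dump.
HEADER_HEX = """
0000 001c 6674 7970 5841 5643 0100 1fff
5841 5643 6d70 3432 6973 6f32 0000 0094
7575 6964 5052 4f46 21d2 4fce bb88 695c
fac9 c740 0000 0000 0000 0003 0000 0014
4650 5246 0000 0000 2000 0000 0000 0000
0000 002c 4150 5246 0000 0000 0000 0002
7477 6f73 0000 0000 0000 0000 0000 0600
0000 0600 0000 bb80 0000 0002 0000 0034
5650 5246 0000 0000 0000 0001 6176 6331
0164 002a 0003 0002 0000 c350 0000 c350
0032 0000 0032 0000 0780 0438 0001 0001
"""
data = bytes.fromhex("".join(HEADER_HEX.split()))


def walk_boxes(buf, start, end):
    """Yield (offset, size, fourcc) for each box in buf[start:end].

    Only plain 32-bit sizes are handled; the special sizes 0 and 1
    stop the walk.
    """
    pos = start
    while pos + 8 <= end:
        size, fourcc = struct.unpack_from(">I4s", buf, pos)
        if size < 8:
            break
        yield pos, size, fourcc.decode("ascii")
        pos += size


top_level = list(walk_boxes(data, 0, len(data)))
uuid_offset, uuid_size, _ = top_level[1]

# Skip the box header (8), the 16-byte "PROF..." user type, and
# 8 more bytes (version/flags and entry count, as far as I can tell).
children_start = uuid_offset + 8 + 16 + 8
children = {
    kind: offset
    for offset, size, kind in walk_boxes(data, children_start, uuid_offset + uuid_size)
}

aprf, vprf = children["APRF"], children["VPRF"]
audio_codec = data[aprf + 16:aprf + 20].decode("ascii")
(sample_rate,) = struct.unpack_from(">I", data, aprf + 36)  # assumed offset
video_codec = data[vprf + 16:vprf + 20].decode("ascii")
width, height = struct.unpack_from(">HH", data, vprf + 44)  # assumed offset

print(top_level)
print(audio_codec, sample_rate, video_codec, width, height)
```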
XAVC-S: Engineered with love by Sony
If you are in software or firmware development and you have been working in a large organization, the reasons behind all this oddity should now be clear to you.
This is what I call “Frankenstein Engineering”, and it works like this:
- Marketing decides that a new product with features A, B and C shall be developed
- Management green lights the project, and gives the R&D departments a short schedule and a low budget
- Every department has no choice but to pick existing, off-the-shelf components and shoehorn them into the product
- Extra points if the different departments are spread across continents, have incompatible road maps, and don’t talk to each other anyway
- When you put hardware, electronics, firmware and software together it’s a disaster, but it can be fixed
- Eventually the company manages to ship a decent product, but internally the scar tissue is still visible
Therefore, all the odd things I have discovered inside those XAVC-S files were just that:
Scar tissue of the rushed development.
But let’s give credit to Sony: XAVC-S is very successful, and it delivers what Sony promised despite being a Frankenstein monster made with parts that barely fit together and ignore each other.
And the last word about odd XAVC-S goes, again, to Tancredi: “A house of which one knew every room wasn’t worth living in.”