How can 16 bit X-OCN deliver smaller files than 10 bit XAVC-I?

Sony’s X-OCN (eXtended tonal range Original Camera Negative) is a new type of codec. Currently it is only available via the R7 recorder, which can be attached to a Sony PMW-F5, F55 or the new Venice cinema camera.

It is a truly remarkable codec that brings the kind of flexibility normally only available with 16 bit linear raw files, but with a file size that is smaller than many conventional high end video formats.

Currently there are two variations of X-OCN.

X-OCN ST is the standard version and X-OCN LT is the “light” version. Both are 16 bit and both contain 16 bit data based directly on what comes off the camera’s sensor. The LT version is barely distinguishable from a 16 bit linear raw recording and the ST version is “visually lossless”. Having that sensor data in post production allows you to manipulate the footage over a far greater range than is possible with traditional video files. Traditional video files will already have some form of gamma curve as well as a colour space and white balance baked in. This limits the scope of how far the material can be adjusted and reduces the amount of picture information you have (relative to what comes directly off the sensor).

Furthermore most traditional video files are 10 bit with a maximum of 1024 code values or levels within the recording. There are some 12 bit codecs but these are still quite rare in video cameras. X-OCN is 16 bit which means that you can have up to 65,536 code values or levels within the recording. That’s a colossal increase in tonal values over traditional recording codecs.
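The code value counts above follow directly from the bit depth: an n bit sample can hold 2 to the power n distinct levels. A minimal sketch of that arithmetic (the function name `code_values` is just an illustrative choice):

```python
def code_values(bit_depth: int) -> int:
    """Number of distinct levels an integer sample of this bit depth can hold."""
    return 2 ** bit_depth


for bits in (10, 12, 16):
    print(f"{bits} bit: {code_values(bits):,} code values")

# How many 16 bit levels fall inside each single 10 bit step
print(code_values(16) // code_values(10))  # 64
```

So each step in a 10 bit recording spans 64 possible values in a 16 bit one.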

But the thing is that X-OCN LT files are a similar size to Sony’s own XAVC-I (class 480) codec, which is already highly efficient. X-OCN LT is around half the size of the popular 10 bit Apple ProRes HQ codec but offers comparable quality. Even the high quality ST version of X-OCN is smaller than ProRes HQ. So you can have image quality and data levels comparable to Sony’s 16 bit linear raw but in a lightweight, easy to handle 16 bit file that’s smaller than the most commonly used 10 bit version of ProRes.

But how is this even possible? Surely such an amazing 16 bit file should be bigger!

The key to all of this is that the data contained within an X-OCN file is based on the sensor’s output rather than traditional video. The cameras that produce the X-OCN material all use bayer sensors. In a traditional video workflow the data from a bayer sensor is first converted from the luminance values that the sensor produces into a YCbCr or RGB signal.

So if the camera has a 4096×2160 bayer sensor, in a traditional workflow this pixel level data gets converted to 4096×2160 of Red, plus 4096×2160 of Green, plus 4096×2160 of Blue (or the same of Y, Cb and Cr). In total you end up with around 26.5 million data points, which then need to be compressed using a video codec.

However, if we bypass the conversion to a video signal and just store the data that comes directly from the sensor, we only need to record a single set of 4096×2160 data points – 8.8 million. This means we only need to store one third as much data as in a traditional video workflow, and it is this huge data saving that is the main reason why it is possible for X-OCN to be smaller than traditional video files while retaining amazing image quality. It’s simply a far more efficient way of recording the data from a bayer camera.
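The arithmetic can be checked with a short sketch. It assumes uncompressed frames and full 4:4:4 RGB for the video case. Real codecs compress, and many subsample chroma, so these figures only show the starting point before compression. The names used are illustrative.

```python
WIDTH, HEIGHT = 4096, 2160

# A bayer sensor delivers one sample per photosite.
photosites = WIDTH * HEIGHT
# After conversion to video there are three full planes (R, G, B or Y, Cb, Cr).
rgb_samples = photosites * 3


def uncompressed_bits(samples: int, bit_depth: int) -> int:
    """Bits needed to store one frame with no compression at all."""
    return samples * bit_depth


bayer_16 = uncompressed_bits(photosites, 16)  # 16 bit sensor data
rgb_10 = uncompressed_bits(rgb_samples, 10)   # 10 bit 4:4:4 video

print(f"Sensor data points: {photosites:,}")   # 8,847,360
print(f"Video data points:  {rgb_samples:,}")  # 26,542,080
print(f"16 bit sensor frame: {bayer_16 / 8 / 1e6:.1f} MB")
print(f"10 bit video frame:  {rgb_10 / 8 / 1e6:.1f} MB")
print(f"Ratio: {bayer_16 / rgb_10:.2f}")
```

Under these assumptions the 16 bit sensor frame is a little over half the size of the 10 bit video frame before either codec has done any compression.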

Of course this does mean that the edit or playback computer has to do some extra work, because as well as decoding the X-OCN file it has to convert the result into a video image. But Sony developed X-OCN to be easy to work with – which it is. Even a modest modern workstation will have no problem working with X-OCN. And the fact that you have that sensor data in the grading suite means you have an amazing degree of flexibility. You can even adjust the way the file is decoded to tailor whether you want more highlight or shadow information in the video file that will be created after the X-OCN is decoded.

Why isn’t 16 bit much bigger than 10 bit? Normally a 16 bit file will be bigger than a 10 bit file. But within a video image there are often areas of information that are very similar. Video compression algorithms take advantage of this and, instead of recording a value for every pixel, will record a single value that represents all of the similar pixels. When you go from 10 bit to 16 bit you do have more bits of data to record, but a greater percentage of the code values will be the same or similar, and as a result the codec becomes more efficient. So the file size does increase a bit, but not as much as you might expect.
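The idea of recording one value for a whole run of similar pixels can be shown with simple run-length encoding. This is only a toy illustration of the principle, not the algorithm X-OCN uses, which this article does not describe. The pixel values are made up for the example.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse each run of identical values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]


# Hypothetical row of 16 bit code values: a flat bright area,
# a short gradient, then a flat dark area.
row = [40000] * 12 + [40010, 40025, 40050] + [12000] * 9

encoded = run_length_encode(row)
print(encoded)
print(f"{len(row)} samples stored as {len(encoded)} (value, count) pairs")
```

The flat areas collapse to a single pair each regardless of how many bits each value needs, so only the gradient costs one entry per pixel.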


So X-OCN, out of the gate, only needs to store one third of the data points of a similar traditional RGB or YCbCr codec. Increasing the bit depth from the typical 10 bits of a regular codec to the 16 bits of X-OCN does then increase the amount of data needed to record it. But the use of a clever algorithm to minimise the data needed for those 16 bits means that the end result is a 16 bit file only a bit bigger than XAVC-I but still smaller than ProRes HQ, even at its highest quality level.

15 thoughts on “How can 16 bit X-OCN deliver smaller files than 10 bit XAVC-I?”

  1. So, just to understand: A “normal” codec calculates the missing pixels from red, green and blue (bayer sensor) before saving to a card in the camera itself, and X-OCN calculates them later at a workstation? I’m just wondering if the sharpness of the picture is less, because we have less information for each pixel.

    1. That’s correct. With normal video the sensor output is converted to a conventional image in camera. That conversion is limited to whatever the camera’s processing power is. Plus this then has a gamma curve, color matrix and white balance applied, all of which reduce the amount of data.

      With raw and X-OCN we take the very same sensor data and do all the processing on a computer. If you have better processing the image will be better. In fact this is one of the benefits of X-OCN and raw – as computers and the processing algorithms improve, the footage looks better. So footage shot in raw on an F55 a couple of years ago can be re-processed today to look better than it did 2 years ago.

      The source data is the same whether you do the processing in camera or via raw/X-OCN. Sharpness can actually be better if the data is processed on a computer as you don’t have to process in real-time. You can look at the previous and next frames and use them to sample more pixels and thus increase resolution beyond the sensor resolution.

      1. Why are you comparing X-OCN to a 4K 4:4:4 video file when the 4K sensor can only record 4K 4:2:0? You should compare it to a 4K 4:2:0 file, which is much smaller.

        1. A 4K 4:2:0 file won’t be that much smaller. Sure, you are right though. It isn’t fair to compare a 16 bit linear file with 65,000 shades with a 10 or 12 bit file with at most 4096 shades. In terms of a high end, high quality workflow though, these are the types of files that many will use.

          If you start with a bayer sensor, what is the highest possible quality recording you could make of the output? 4:2:0 or 4:4:4? Remembering that while there may be large gaps between the R and Blue pixels, the de-bayer algorithm will use the surrounding pixels to estimate what should be in the gaps. You can throw that away by only recording 4:2:0 if you wish, but I wouldn’t recommend that.

          1. And one of the BIG benefits of X-OCN is as you can do the de-bayer on a computer with a lot more processing power than an in camera encoder the algorithms can do a much, much better job of calculating what the missing color information might be. Sony have even demonstrated a process that uses microscopic movements of the camera and the comparison of multiple frames to increase the resolution beyond the sensor resolution.

  2. Hang on, this explanation doesn’t make sense.
    Each pixel on a Bayer sensor has 4 photo-sites, 2 green, one blue and one red.
    That’s 4 numbers to be extracted whereas debayering means there are only 3 numbers (the two greens are combined as one)
    Furthermore, RGB is not the same as YCbCr.
    Not happy Jan!

    1. No. Each pixel on a normal bayer sensor is a single photosite under a single colour filter. When a manufacturer states “4K Bayer” it will be 2K green photosites and 2K red photosites on one line and 2K Blue/2K Green photosites on the next. I am well aware that YCbCr is not the same as RGB, but YCbCr is still 3 data sets.

    1. The codecs in the cameras are hardware based. You can’t add new codecs with firmware. But also there has to be something to separate the high end cameras from the lower end cameras.

  3. How is Sony able to record X-OCN internally on the Venice 2 with Red’s patent in place? I know BRAW partially debayers in camera to get around the patent, but I’m a bit stuck on how Sony (and Canon) have managed internal compressed raw without being sued into the ground.

    1. An agreement of some sort appears to have been reached between Red and Sony, and for a long time Sony Raw remained in the R5 recorder. But now you will never hear Sony refer to X-OCN as raw. Instead it is “16 bit sensor data compressed using advanced algorithms” or something to that effect, even though Sony Raw and X-OCN are more or less the same codec.

  4. If we don’t form a video image in the camera but in post aren’t we just creating more new data in post?
    Will the processed/debayered image in post have larger file sizes?
    How does this work in post? @Alisterchapman

    1. You will only create more data in post if you convert the raw file from raw to another format. But in most cases you work directly from the raw file and the computer reconstructs the image on the fly, so it remains the same size. Only your final output renders will be larger.
