Volumetric Video Capture Gets The Whole Picture
With talk of an impending Metaverse and augmented reality graphics increasingly becoming part of mainstream television broadcasts and live eSports events, top graphics artists are looking past traditional 3-D animation and virtual environments and using fully rendered 360-degree 2K and 4K video to get a more captivating effect.
Because of this, the world of Volumetric Video Capture (VC) is taking off in a big way across the globe, allowing celebrities and sports stars to virtually appear on TV shows when they are not in the room. It’s also helping sports teams, fashion brands, creative agencies and feature film productions present characters in ways never seen before.
The first on-air example of VC in a live broadcast came during a Madonna performance at the 2019 Billboard Music Awards, although the technology had been demonstrated publicly at the 2016 CES Show. Madonna performed her track on stage alongside four volumetrically captured holograms of herself, integrated into the choreography.
Merging perfectly with apps and environments being created for the Metaverse, which will be populated with digital representations of ourselves, VC is a technique that converts a three-dimensional space, object, or environment into a moving video (or still frame) in real-time using an array of cameras mounted around a subject. Once digitized, this captured object can be transferred to the web, mobile, or virtual worlds and viewed in 3D.
This “free-viewpoint” video technology, as some call it, captures a specific moment in time and allows it to be rotated and viewed from any viewpoint. Most professionals refer to volumetric capture as a 3D video of a specific moment in time. It’s also been called an editor’s dream because of the sheer number of angles and images it provides to work with. Editors can choose from more than 100 camera angles of an image, either in postproduction or later down the delivery chain, and manipulate it in creative ways. And if you wanted to focus on a different point, or didn’t think you got the shot quite right, you don’t have to reshoot the talent. It’s all there in the original VC material.
And what makes volumetric video interesting for end users is that the final product does not have a set viewpoint, so they can watch and interact with it from all angles by rotating it. This significantly enhances the viewer experience, heightening their sense of immersion and engagement.
The difference between 360-degree video and volumetric video is the depth provided with volume. In a 360-degree video, users can only view the video from a single, constant depth. With volumetric video, the end-user can play director and control how far in or out they want to explore the scene. In the past, production teams have been forced to integrate 2D video into a virtual reality (VR) or augmented reality (AR) experience. Now that they can capture a 3D view of the object, the end-user can have a 1-on-1 experience right in their living room with an athlete, artist, or entertainer.
Targeting this new demand, Dimension Studio, in London, has partnered with Nikon/MRMC (also in the UK) and its Polymotion Stage system. It’s a mobile VC studio built into a three-fold articulated lorry and running Dimension’s proprietary stitching software. Inside is a studio housing 106 tightly synchronized cameras arrayed on the walls, ceiling and even the floor: 53 RGB DSLRs that read and record the color required for the .png texture map, and 53 infrared (IR) cameras that record depth and position in space for creating a VC mesh. In the mobile stage, an additional four cameras (two IR and two RGB) can be placed on the floor shooting upwards to ensure greater detail when capturing movements that require the head to be facing down.
The rig is also equipped with state-of-the-art motion capture and prop tracking, offering a high level of accuracy.
VC also involves audio, so there are four overhead microphones inside the truck for recording sound. Lavalier mics can also be incorporated to capture broadcast-quality sound. Directional sound recording is also available.
Show producers using the truck can now capture 360-degree renditions of subjects at a standard 30 fps (up to 60 fps and above for specific requirements), output as MP4 files that are uploaded to the Microsoft Azure cloud. There the various camera feeds are stitched together and delivered to the client within 48 hours.
The system captures RAW data in 2K or 4K, and there is typically a lot of it: at 10 GB/s, the footage quickly demands huge amounts of cloud-based storage. Up to an hour of footage can be processed in one day, after which the data needs to be transferred onto a local server farm. The final processed 360-degree VC images can then be incorporated into TV coverage (linear and mobile) to add sizzle to the telecast.
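To put those figures in perspective, a back-of-envelope calculation shows why local server farms come into play. This is only a hypothetical sketch: the 10 GB/s rate and the one-hour session come from the article, while the assumption that the rate is sustained across the whole capture is mine.

```python
# Hypothetical storage estimate for a volumetric capture session.
# Assumes the quoted 10 GB/s RAW data rate is sustained for the
# full session, which real rigs may not do.

def raw_storage_tb(rate_gb_per_s: float, minutes: float) -> float:
    """Total RAW data volume in terabytes for one capture session."""
    return rate_gb_per_s * minutes * 60 / 1000  # GB -> TB

# One hour of RAW capture at 10 GB/s:
print(raw_storage_tb(10, 60))  # 36.0 TB
```

Thirty-six terabytes for a single hour of raw capture makes it clear why the processed footage is offloaded from the cloud to a local server farm after each day of work.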
Currently, this VC workflow can’t be done live (as it takes 48 hours to render a scene) but there are reportedly people working on live playback. The issue is being able to render all of the camera feeds as a single 3D scene in the cloud, which takes time.
The service is not cheap, as clients are charged a day rate for using the capture studio and then a per-minute cloud processing fee. However, without this workflow, the production team might not have been able to get all of the players’ images on screen in such a new and creative way. It was so successful with viewers that the Polymotion Stage system was used during the 2021 Open as well.
When you talk about the Metaverse, VC will certainly play a large role in how people interact in real time in digital environments, bringing new opportunities to present life in ways we’ve never seen before.