Will LiDAR Break the 3D Asset Creation Bottleneck?
National Institute of Standards and Technology gltf, photogrammetry, lidar
The field of 3D computer graphics has grown from a niche technical curiosity in the mid-1970s to mass appeal and distribution via movies and games. We’ve seen applications grow from flying logos, to highly engaging real-time renderings in games, to synthetic humans and de-aged actors in movies finally crossing the “uncanny valley” to be nearly indistinguishable from reality. However, the creation of 3D assets - computer graphics objects and the worlds they inhabit - still requires highly skilled technicians and artists, presenting a bottleneck to more widespread applications, such as creating 3D graphics for websites and E-Commerce.
That bottleneck might just be about to break with the advent of mass market LiDAR (Light Detection And Ranging). New cell phones contain LiDAR, putting this technology in the average user’s pocket. The LiDAR capabilities in consumer devices may not seem capable of capture for professional applications now, but just wait. The first cell phone cameras were rudimentary compared to the high quality digital cameras available at the time; now, it can be difficult to distinguish between the photographs taken by a smartphone and a DSLR. We can expect the same for phone-based LiDAR. What was once a $75,000 technology will be available to the masses.
LiDAR used to be an obscure technology used by forensic examiners and highly specialized industrial applications to take measurements unobtrusively. A LiDAR device can measure the precise distance between the device and the object in front of it. Performing that measurement a few million or billion times produces a “point cloud” of measurements: lots of individual measurements which together specify the exact location of the surface of an object or room. The point cloud can be processed into a coherent mesh of triangles and overlaid with color and texture information to produce an honest-to-goodness useful 3D computer graphics model of an object or environment.
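To make the idea of a point cloud concrete, here is a minimal sketch of how individual distance readings could be turned into 3D points. It assumes each reading arrives as a distance plus the azimuth and elevation angles of the beam; the function names and sample values are hypothetical, and a real device reports its data through its own API.

```python
import math

def range_to_point(distance, azimuth_deg, elevation_deg):
    """Convert one range reading into an (x, y, z) point in the sensor's frame.

    Assumed convention: azimuth is measured in the horizontal plane from the
    x axis, elevation is measured upward from that plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    horizontal = distance * math.cos(el)  # projection onto the horizontal plane
    x = horizontal * math.cos(az)
    y = horizontal * math.sin(az)
    z = distance * math.sin(el)
    return (x, y, z)

def build_point_cloud(measurements):
    """Turn a list of (distance, azimuth, elevation) readings into 3D points."""
    return [range_to_point(d, az, el) for d, az, el in measurements]

# Three made-up readings: straight ahead, to the side, and straight up.
samples = [(2.0, 0.0, 0.0), (2.0, 90.0, 0.0), (1.0, 0.0, 90.0)]
cloud = build_point_cloud(samples)
for point in cloud:
    print(tuple(round(c, 3) for c in point))
```

A real scan repeats this conversion millions of times; meshing software then connects neighboring points into triangles.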
LiDAR, Meshing and Texture
On the surface, LiDAR sounds very similar to photogrammetry, a tricky little technique whereby an artist takes hundreds of photos of an object or environment from all possible angles. Photogrammetry software inputs those images, computes the camera positions, and then reconstructs a mesh of the object. The results can be very high quality. The downside is that photogrammetry requires careful photography and a fair bit of skill. Objects with high gloss like glass or metal surfaces are difficult to reconstruct and should be photographed through polarizing filters or using other techniques such as powdering the surface, an approach that would not thrill the curator of a valuable object. In addition, the software, such as RealityCapture (https://www.capturingreality.com/) or Metashape (https://www.agisoft.com/), is complex and may be intimidating to laypeople.
This model of a roasted chicken is a fantastic example of a 3D object created via photogrammetry. It was photographed thousands of times to capture not only the simple shape but also the subsurface structure and lighting (which is technically photometry rather than photogrammetry, but that’s a story for another blog). The full process behind this amazing example highlights the great skill needed to obtain such high quality results using traditional techniques.
This model is available on Sketchfab (https://sketchfab.com/), one of the leaders in 3D object dissemination. Browsing their offerings is time well spent: in particular, check out their collections from cultural heritage sites that make their objects available, usually for free. The CEO of Sketchfab, Alban Denoyel (@albn), has some great examples of LiDAR usage, with a good survey of many different scanning apps. Examples include the remains of his son's birthday party as a 3D table loaded with cakes and a 3D video of a skateboarder - yes, a set of 3D captures can be sequenced as a video. As these examples demonstrate, LiDAR has the potential to slice through the Gordian knot of photogrammetry-based capture, a critical step in simplifying the asset creation pipeline. Combined with open standards for rendering, viewing, and exporting, 3D asset creation could quickly become very accessible indeed.
The usefulness of Sketchfab and its assets rests on the interoperability standards that enable viewing and interactive manipulation of 3D models via the Web, accessible to anyone with a web browser. In particular, the Khronos Group’s WebGL API standard has become ubiquitous, allowing users to observe, manipulate and modify 3D objects without installing any browser plugins. VR and AR views are also now supported in the browser by WebXR, and the creation and exchange of 3D models is made easy by Khronos’s glTF 3D file format, which is designed for efficient downloading and rendering and is often called the “JPEG of 3D”. You can learn about some of the latest enhancements to glTF at “New Extensions Raise the Bar on 3D Asset Visual Realism”.
Open Standards for Display and Interaction
Why is it so important to make 3D asset creation more ubiquitous and accessible for applications like websites? The most obvious answer is E-Commerce. IKEA, Amazon, Wayfair and other large E-Commerce sites are working hard to enable their customers to visualize the items they wish to purchase in the context of their homes. Using a 3D model in an AR viewer, a shopper can see the chair they’re considering purchasing in their house before they buy.
The Khronos 3D Commerce Working Group has produced asset creation guidelines to help artists optimize real-time assets for E-Commerce; a Viewer Certification program to help standardize the performance of assets across platforms, and a provisional metadata extension to enable asset management and sharing. Khronos is further supporting E-Commerce through the development of Physically Based Rendering (PBR) Materials and Material Variant extensions to the glTF 2.0 3D file format. These enable vendors to create more photorealistic representations of the products they are trying to sell and to embed multiple material or color options into a single asset. It’s great to see the industry agree on a standards-based set of PBR capabilities. Ultimately, this standardization development effort will give content producers access to a wider collection of rendering technologies, avoiding locking them into a single vendor or platform.
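As a rough illustration of how multiple material options can live in one asset, the sketch below walks a glTF document, already parsed into a Python dictionary, that uses the KHR_materials_variants extension. The chair asset, its material names, and the helper functions are hypothetical; only the extension name and its "variants" and "mappings" fields are meant to follow the published extension.

```python
EXT = "KHR_materials_variants"

def list_variants(gltf):
    """Return the variant names declared at the root of the glTF document."""
    ext = gltf.get("extensions", {}).get(EXT, {})
    return [variant["name"] for variant in ext.get("variants", [])]

def material_for_variant(gltf, primitive, variant_name):
    """Return the material index a primitive uses for the named variant.

    Falls back to the primitive's default material when no mapping applies.
    """
    names = list_variants(gltf)
    if variant_name in names:
        index = names.index(variant_name)
        mappings = primitive.get("extensions", {}).get(EXT, {}).get("mappings", [])
        for mapping in mappings:
            if index in mapping["variants"]:
                return mapping["material"]
    return primitive.get("material")

# Hypothetical chair asset with two upholstery options (geometry omitted).
chair = {
    "asset": {"version": "2.0"},
    "extensionsUsed": [EXT],
    "extensions": {EXT: {"variants": [{"name": "Blue Fabric"}, {"name": "Red Leather"}]}},
    "materials": [{"name": "blue_fabric"}, {"name": "red_leather"}],
    "meshes": [{"primitives": [{
        "material": 0,
        "extensions": {EXT: {"mappings": [
            {"material": 0, "variants": [0]},
            {"material": 1, "variants": [1]},
        ]}},
    }]}],
}

seat = chair["meshes"][0]["primitives"][0]
chosen = material_for_variant(chair, seat, "Red Leather")
print(list_variants(chair), chair["materials"][chosen]["name"])
```

A viewer that supports the extension would do something similar when a shopper switches color options, without downloading a second model.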
Beyond its work for 3D Commerce, the Khronos Group develops standards relating to 3D graphics and XR more broadly. Alongside the new-generation Vulkan 3D API, the OpenXR standard is “a royalty-free, open standard that provides high-performance access to Augmented Reality (AR) and Virtual Reality (VR)—collectively known as XR—platforms and devices.” The other major standards work in this domain is taking place in the World Wide Web Consortium (W3C) via its Immersive Web Working Group for WebXR. Per the group’s charter, “The mission of the Immersive Web Working Group is to help bring high-performance Virtual Reality (VR) and Augmented Reality (AR) (collectively known as XR) to the open Web via APIs to interact with XR devices and sensors in browsers.” OpenXR enables applications and engines, including WebXR, to run on any system that exposes its XR runtime through the OpenXR API.
Most of the WebXR frameworks support AR, and the major platforms being developed by Apple (ARKit), Google (ARCore) and Facebook (AR Studio) all have significant AR support. Not to be left out of the pack, Amazon is actively experimenting with AR so you can see how items such as furniture look in your space. The Khronos 3D Commerce Working Group is working to enable and encourage these technologies to be used in E-Commerce at industrial scale.
In short, the widespread use of 3D models is ramping up fast, with sophisticated commercial applications already in use. The open standards infrastructure to enable content portability across platforms and devices for 3D and XR applications is here. Now, LiDAR is poised to open the floodgates of 3D content creation by making model capture very nearly point and shoot.
One of the non-trivial annoyances about assets created with LiDAR or photogrammetry is that they are large when converted to meshes, and take significant computing resources to store, transmit, process, and display. Fortunately, geometry mesh compression offers a solution. Compression that understands the semantics of geometric meshes can do a great job reducing the size of the files while maintaining visual fidelity. One of the best examples of how to integrate compression with your workflow comes from a small startup called DGG (Darmstadt Graphics Group) and their product “RapidCompact”.
RapidCompact provides an API to utilize its compression technology integrated with 3D model creation software such as Blender and Solidworks. This company is actually an offshoot from a research group at Fraunhofer IGD in Germany, well known for object capture expertise and as an all around high-quality computer graphics research group for many years. Full disclosure: I have known the DGG folks for many years via the 3D graphics standards world. In spite of that, their work is top notch ;-)
As an interesting aside, RapidCompact also supports Draco mesh compression. Based on technology developed by Google, the relatively new Draco compression extension to glTF offers users a royalty-free open standard to enable significantly smaller asset sizes, together with open source tools to produce high quality meshes, often with a ten-fold compression ratio. The ever-popular flight helmet sample model went from 46.1MB to 4.41MB in a Draco-compressed GLB. More detail on Draco can be found at “Khronos Announces glTF Geometry Compression Extension.” Draco, like PBR and material variants support, is an example of glTF’s extensibility, one of the asset format’s key strengths. New extensions are being worked on all the time by active working group members, ensuring that glTF keeps ahead of the evolving needs of the industry.
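The flight helmet figures quoted above can be checked with a few lines of arithmetic. This is only a worked calculation on the two file sizes given in the text; the function name is made up for illustration.

```python
def compression_ratio(original_mb, compressed_mb):
    """How many times smaller the compressed file is than the original."""
    return original_mb / compressed_mb

original_mb = 46.1    # flight helmet sample model, uncompressed
compressed_mb = 4.41  # same model as a Draco-compressed GLB

ratio = compression_ratio(original_mb, compressed_mb)
saving = 1 - compressed_mb / original_mb  # fraction of the original size removed

print(f"{ratio:.1f}x smaller, {saving:.0%} of the original size saved")
```

The result is a little over ten to one, consistent with the "ten-fold" description.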
File Format Creation
The last piece of the 3D puzzle is the ability to widely disseminate 3D models. There are a number of considerations impacting the selection of the final file format for a newly created 3D masterpiece. What is the intended use of the data? Is this the capture of a priceless artifact upon which scientists or curators will perform experiments? Is the asset going to be animated? Will it be 3D printed? Is it going to be displayed in very high quality, or simply appear in a thumbnail sketch on your website? Is it going to be part of an AR system and blended into the real world?
No one file format is appropriate for every use case, but content creators should abide by a few guidelines. A widely used standard is always a good choice. The current leader in widely supported file formats for web viewing is the aforementioned glTF file format, developed and maintained by the Khronos Group. Though relatively new, glTF is a strong choice since it is widely supported by web browser developers. Of course, not all 3D objects need to be displayed on web pages, but unification around one or a small number of 3D file formats will significantly simplify 3D object usage. Native applications work just fine with glTF and can take advantage of the resulting runtime improvements as well.
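Part of what makes glTF easy to support is that its binary container, GLB, is simple to read. The sketch below builds a tiny GLB in memory and reads its JSON back using only the Python standard library. The helper names are hypothetical, and the sample contains no geometry; it only illustrates the layout of a 12-byte header followed by a JSON chunk.

```python
import json
import struct

GLB_MAGIC = 0x46546C67        # the ASCII bytes "glTF" read as a little-endian uint32
JSON_CHUNK_TYPE = 0x4E4F534A  # the ASCII bytes "JSON"

def make_minimal_glb(gltf):
    """Pack a glTF JSON document into a GLB container with a single JSON chunk."""
    payload = json.dumps(gltf).encode("utf-8")
    payload += b" " * (-len(payload) % 4)  # chunks are padded to 4-byte boundaries
    header = struct.pack("<III", GLB_MAGIC, 2, 12 + 8 + len(payload))
    chunk_header = struct.pack("<II", len(payload), JSON_CHUNK_TYPE)
    return header + chunk_header + payload

def read_glb_json(data):
    """Validate the GLB header and return the parsed JSON chunk."""
    magic, version, total_length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("not a GLB file")
    if version != 2:
        raise ValueError(f"unsupported GLB version {version}")
    if total_length != len(data):
        raise ValueError("header length does not match data size")
    chunk_length, chunk_type = struct.unpack_from("<II", data, 12)
    if chunk_type != JSON_CHUNK_TYPE:
        raise ValueError("first chunk is not JSON")
    return json.loads(data[20:20 + chunk_length].decode("utf-8"))

glb = make_minimal_glb({"asset": {"version": "2.0"}})
document = read_glb_json(glb)
print(glb[:4], document["asset"]["version"])
```

A real scanned asset would also carry a binary chunk with the mesh and texture data, which this sketch leaves out.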
However, depending on the application, there are a number of other useful file formats to consider. A format like .obj has been around for long enough to gain very widespread support. However, it lacks some of the features available in more modern formats, such as PBR, animation, or compression. VRML, although somewhat out of use in modern systems, is still supported as an interchange format by many systems. Its successor, X3D, is also still important as an ISO standard for archival purposes. For example, X3D is the file format for NIST’s Digital Library of Mathematical Functions (DLMF). This important reference for the scientific community has a large number of interactive graphic surfaces of mathematical functions. It uses X3DOM, a web-oriented version of X3D that sits on top of WebGL, the Khronos standard for interactive web graphics. These functions, however, are synthetic surfaces rather than objects from the real world. For the purpose of creating photorealistic objects to be embedded into web pages or native applications, I’d pick glTF.
The Outlook for LiDAR Generated Assets
So, in conclusion, now that I have LiDAR technology in my pocket, the technology pieces are in place so I can quickly scan a room or an object. I can post the resulting file to an object repository and save it for posterity or use it in any number of imaginative applications.
E-Commerce is a compelling use case for today’s 3D content developers, and as the sophistication and ease of use of the tools mature alongside industry alignment on standards for file format, compression, and display, the use cases for 3D assets will both expand AND grow more personal. I fully expect this to lead to a day when the capture of 3D objects to enhance our memory is commonplace. It will be yet another technique to supplement our existing toolbox of memory recorders. 3D scans will supplement photographs, videos, and audio recordings. All are most valuable as a means to help us remember the people and places of our lives.
DISCLAIMER: Opinions expressed in this article are those of the author. Certain commercial products or company names are identified here to foster understanding. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the products or names identified are necessarily the best available for the purpose.