I am a Senior Researcher at Tencent America in Los Angeles. Before that, I earned my Ph.D. in Computer Science from the University of Maryland, College Park, advised by Dr. Amitabh Varshney.

My research focuses on the intersection of computer graphics, computer vision, and virtual reality. I have worked on projects spanning 3D reconstruction, differentiable rendering, foveated rendering for virtual reality, and denoising for real-time Monte Carlo path tracing. Within the academic community, I have served as a conference committee member for EGSR and HPG and as the Student Volunteer Chair for IEEEVR, and I have reviewed for several conferences and journals.



Publications

NeAT: Learning Neural Implicit Surfaces with Arbitrary Topologies from Multi-view Images
Xiaoxu Meng, Weikai Chen, Bo Yang
CVPR 2023
"a SDF-based differentiable rendering method that reconstructs arbitrary surfaces from multiview images"
  • paper
  • website
  • code
  • video
  • abstract
    Recent progress in neural implicit functions has set a new state of the art in reconstructing high-fidelity 3D shapes from a collection of images. However, these approaches are limited to closed surfaces as they require the surface to be represented by a signed distance field. In this paper, we propose NeAT, a new neural rendering framework that can learn implicit surfaces with arbitrary topologies from multi-view images. In particular, NeAT represents the 3D surface as a level set of a signed distance function (SDF) with a validity branch for estimating the surface existence probability at the query positions. We also develop a novel neural volume rendering method, which uses SDF and validity to calculate the volume opacity and avoids rendering points with low validity. NeAT supports easy field-to-mesh conversion using the classic Marching Cubes algorithm. Extensive experiments on DTU, MGN, and Deep Fashion 3D datasets indicate that our approach is able to faithfully reconstruct both watertight and non-watertight surfaces. In particular, NeAT significantly outperforms the state-of-the-art methods in the task of open surface reconstruction both quantitatively and qualitatively.
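
A minimal sketch of the validity-gated volume rendering idea described above. The exact opacity formulation is the paper's contribution, so the density mapping, the threshold, and the names (sdf, validity, beta, tau) are illustrative assumptions:

```python
import torch

def validity_gated_weights(sdf, validity, deltas, beta=0.1, tau=0.5):
    # sdf:      (N,) signed distances at samples along one ray
    # validity: (N,) predicted surface-existence probabilities in [0, 1]
    # deltas:   (N,) spacing between consecutive samples
    # Map SDF to density with a logistic bump (a common choice in SDF-based
    # volume rendering; NeAT's actual formulation differs in detail).
    sigma = torch.sigmoid(-sdf / beta) / beta
    # Gate by validity so points unlikely to lie on a real surface contribute
    # (near-)zero opacity and are effectively skipped.
    sigma = torch.where(validity > tau, sigma * validity, torch.zeros_like(sigma))
    alpha = 1.0 - torch.exp(-sigma * deltas)                         # per-sample opacity
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha]), dim=0)[:-1]
    return trans * alpha                                             # volume-rendering weights
```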
NeUDF: Learning Neural Unsigned Distance Fields with Volume Rendering
Yu-Tao Liu, Li Wang, Jie Yang, Weikai Chen, Xiaoxu Meng, Bo Yang, Lin Gao
CVPR 2023
"a UDF-based differentiable rendering method that reconstructs arbitrary surfaces from multiview images"
  • website
  • abstract
    Multi-view shape reconstruction has achieved impressive progress thanks to the latest advances in neural implicit surface rendering. However, existing methods based on the signed distance function (SDF) are limited to closed surfaces, failing to reconstruct a wide range of real-world objects that contain open-surface structures. In this work, we introduce a new neural rendering framework, coded NeUDF, that can reconstruct surfaces with arbitrary topologies solely from multi-view supervision. To gain the flexibility of representing arbitrary surfaces, NeUDF leverages the unsigned distance function (UDF) as the surface representation. While a naive extension of an SDF-based neural renderer cannot scale to UDF, we propose two new formulations of the weight function specially tailored for UDF-based volume rendering. Furthermore, to cope with open-surface rendering, where the in/out test is no longer valid, we present a dedicated normal regularization strategy to resolve the surface orientation ambiguity. We extensively evaluate our method on a number of challenging datasets, including DTU, MGN, and Deep Fashion 3D. Experimental results demonstrate that NeUDF significantly outperforms the state-of-the-art methods in the task of multi-view surface reconstruction, especially for complex shapes with open boundaries.
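
The two UDF-tailored weight formulations are the paper's core contribution and are not reproduced here. Below is only a naive placeholder that peaks density where the unsigned distance approaches zero, to convey why a signless field can still drive volume rendering; all names and constants are assumptions:

```python
import torch

def naive_udf_weights(udf, deltas, s=64.0):
    # udf:    (N,) unsigned distances at samples along one ray
    # deltas: (N,) spacing between consecutive samples
    # Density spikes as the UDF approaches zero. Unlike an SDF there is no
    # sign flip at the surface, which is exactly what the paper's tailored
    # weight functions are designed to handle properly.
    sigma = s * torch.exp(-s * udf)
    alpha = 1.0 - torch.exp(-sigma * deltas)
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha]), dim=0)[:-1]
    return trans * alpha
```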
IMPLICITPCA: Implicitly-Proxied Parametric Encoding for Collision-Aware Garment Reconstruction
Lan Chen, Jie Yang, Hongbo Fu, Xiaoxu Meng, Weikai Chen, Bo Yang, Lin Gao
CVM 2023
"a parametric SDF network that closely couples parametric encoding with implicit functions, to enjoy the fine details brought by implicit reconstruction while maintaining correct open surfaces"
  • paper
  • abstract
    The emerging remote collaboration in virtual environments calls for high-fidelity 3D human reconstruction from a single image. To deal with the challenges of cloth details and topologies, parametric models are widely used as explicit priors, but they often lack the fine details present in the image. Neural implicit approaches generate accurate details but are typically limited to closed surfaces. In addition, physically correct reconstructions, e.g., collision-free ones, are crucial but often ignored in prior work. We present ImplicitPCA, a parametric SDF network that closely couples parametric encoding with implicit functions, enjoying the fine details brought by implicit reconstruction while maintaining correct open surfaces. We introduce a fast collision-aware regression network to ensure physically correct estimation. During inference, an iterative routine aligns the garment to the 2D landmarks and fits it with the collision-aware cloth SDF. Experiments on a public dataset and in-the-wild images demonstrate that our method outperforms prior approaches.
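
One way to read the collision-aware fitting: penalize garment points that end up inside the body during the iterative refinement. A tiny hedged sketch; the body-SDF callable, sign convention, and margin are hypothetical, not the paper's exact loss:

```python
import torch

def collision_penalty(body_sdf, cloth_points, margin=0.0):
    # body_sdf:     callable mapping (M, 3) points to signed distances to the
    #               body surface (positive outside, negative inside, assumed here)
    # cloth_points: (M, 3) points sampled on the reconstructed garment
    d = body_sdf(cloth_points)
    # Zero when every garment point stays outside the body by at least `margin`.
    return torch.relu(margin - d).mean()
```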
HSDF: Hybrid Sign and Distance Field for Modeling Surfaces with Arbitrary Topologies
Li Wang, Jie Yang, Weikai Chen, Xiaoxu Meng, Bo Yang, Jintao Li, Lin Gao
NeurIPS 2022
"a learnable implicit representation for modeling both closed and open surfaces"
  • paper
  • abstract
    Neural implicit functions based on the signed distance field (SDF) have achieved impressive progress in reconstructing 3D models with high fidelity. However, such approaches can only represent closed shapes. Recent works based on the unsigned distance function (UDF) have been proposed to handle both watertight and open surfaces. Nonetheless, as the UDF is signless, its direct output is limited to point clouds, which imposes an additional challenge on extracting high-quality meshes from discrete points. To address this issue, we present a new learnable implicit representation, coded HSDF, that combines the advantages of SDF and UDF. In particular, HSDF is able to represent arbitrary topologies containing both closed and open surfaces while remaining compatible with existing iso-surface extraction techniques for easy field-to-mesh conversion. In addition to predicting a UDF, we propose to learn an additional sign field via a simple classifier. Unlike a traditional SDF, HSDF is able to locate the surface of interest before level-surface extraction by generating surface points following NDF. We are then able to obtain open surfaces via an adaptive meshing approach that only instantiates regions containing surfaces into a polygon mesh. We also propose HSDF-Net, a dedicated learning framework that factorizes the learning of HSDF into two easier problems. Experiments on multiple datasets show that HSDF outperforms state-of-the-art techniques both qualitatively and quantitatively.
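
At inference time the hybrid field reduces to a simple composition: an unsigned distance multiplied by a learned sign. A minimal sketch under assumed conventions (HSDF-Net, the losses, and the adaptive meshing are not shown):

```python
import torch

def hsdf_value(udf, sign_logits):
    # udf:         (N,) unsigned distances from the distance branch
    # sign_logits: (N,) raw scores from the sign-classifier branch
    # The signed product behaves like an SDF wherever the sign is reliable,
    # so classic iso-surface extraction (e.g., Marching Cubes) applies directly.
    sign = torch.where(torch.sigmoid(sign_logits) > 0.5,
                       torch.tensor(1.0), torch.tensor(-1.0))
    return sign * udf
```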
Rectangular Mapping-based Foveated Rendering
Jiannan Ye, Anqi Xie, Susmija Jabbireddy, Yunchuan Li, Xubo Yang, Xiaoxu Meng
IEEEVR 2022
"a simple yet effective implementation of foveated rendering framework"
  • paper
  • website
  • code
  • video
  • abstract
    With the rapid increase in display resolution and the demand for interactive frame rates, rendering acceleration is becoming more critical for a wide range of virtual reality applications. Foveated rendering addresses this challenge by rendering the display at a non-uniform resolution. Motivated by the non-linear optical lens equation, we present rectangular mapping-based foveated rendering (RMFR), a simple yet effective foveated rendering framework. RMFR supports varying levels of foveation according to the eccentricity and the scene complexity. Compared with traditional foveated rendering methods, RMFR provides a superior level of perceived visual quality at minimal rendering cost.
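
A hedged sketch of the "rectangular" intuition: compress each screen axis independently with eccentricity, rather than radially as in log-polar schemes. The power-law kernel and alpha value below are placeholders, not the paper's lens-derived mapping:

```python
import numpy as np

def screen_to_buffer(t, alpha=2.0):
    # t: per-axis eccentricity from the gaze point, normalized to [-1, 1].
    # With alpha > 1 the fovea (small |t|) gets proportionally more of the
    # reduced-resolution buffer, keeping more detail; the periphery is compressed.
    return np.sign(t) * np.abs(t) ** (1.0 / alpha)

def buffer_to_screen(b, alpha=2.0):
    # Inverse mapping, used when upsampling the buffer to the display.
    return np.sign(b) * np.abs(b) ** alpha
```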
Real-time Monte Carlo Denoising with the Neural Bilateral Grid
Xiaoxu Meng, Quan Zheng, Amitabh Varshney, Gurprit Singh, and Matthias Zwicker
EGSR 2020
"an efficient neural network architecture learns to denoise noisy renderings in a data-dependent, bilateral space"
  • paper
  • website
  • code
  • video
  • abstract
    Real-time denoising for Monte Carlo rendering remains a critical challenge with regard to the demanding requirements of both high fidelity and low computation time. In this paper, we propose a novel and practical deep learning approach to robustly denoise Monte Carlo images rendered at sampling rates as low as a single sample per pixel (1-spp). This causes severe noise, and previous techniques strongly compromise final quality to maintain real-time denoising speed. We develop an efficient convolutional neural network architecture to learn to denoise noisy inputs in a data-dependent, bilateral space. Our neural network learns to generate a guide image for first splatting noisy data into the grid, and then slicing it to read out the denoised data. To seamlessly integrate bilateral grids into our trainable denoising pipeline, we leverage a differentiable bilateral grid, called neural bilateral grid, which enables end-to-end training. In addition, we also show how we can further improve denoising quality using a hierarchy of multi-scale bilateral grids. Our experimental results demonstrate that this approach can robustly denoise 1-spp noisy input images at real-time frame rates (a few milliseconds per frame). At such low sampling rates, our approach outperforms state-of-the-art techniques based on kernel prediction networks both in terms of quality and speed, and it leads to significantly improved quality compared to the state-of-the-art feature regression technique.
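
To make the splat/slice vocabulary concrete, here is a toy, non-learned bilateral-grid round trip. In the paper the guide image is predicted by a CNN and both operations are differentiable (soft splatting); this sketch uses hard nearest cells and made-up grid sizes:

```python
import torch

def splat_and_slice(noisy, guide, gh=36, gw=64, gd=8):
    # noisy: (H, W, 3) 1-spp radiance; guide: (H, W) guide image in [0, 1].
    H, W, _ = noisy.shape
    ys = torch.arange(H).view(H, 1).expand(H, W) * gh // H   # grid row per pixel
    xs = torch.arange(W).view(1, W).expand(H, W) * gw // W   # grid column per pixel
    zs = (guide.clamp(0, 1) * (gd - 1)).round().long()       # guide picks the depth cell
    grid = torch.zeros(gh, gw, gd, 3)
    cnt = torch.zeros(gh, gw, gd, 1)
    grid.index_put_((ys, xs, zs), noisy, accumulate=True)    # splat radiance into cells
    cnt.index_put_((ys, xs, zs), torch.ones(H, W, 1), accumulate=True)
    grid = grid / cnt.clamp(min=1.0)                         # average per occupied cell
    return grid[ys, xs, zs]                                  # slice the smoothed result back
```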
3D-Kernel Foveated Rendering for Light Fields
Xiaoxu Meng, Ruofei Du, Joseph F. JaJa, Amitabh Varshney
TVCG
"a perceptual model for foveated light fields by extending the KFR for the rendering of 3D meshes"
  • paper
  • website
  • code
  • video
  • abstract
    Light fields capture both the spatial and angular rays, thus enabling free-viewpoint rendering and custom selection of the focal plane. Scientists can interactively explore pre-recorded microscopic light fields of organs, microbes, and neurons using virtual reality headsets. However, rendering high-resolution light fields at interactive frame rates requires a very high rate of texture sampling, which is challenging as the resolutions of light fields and displays continue to increase. In this article, we present an efficient algorithm to visualize 4D light fields with 3D-kernel foveated rendering (3D-KFR). The 3D-KFR scheme coupled with eye-tracking has the potential to accelerate the rendering of 4D depth-cued light fields dramatically. We have developed a perceptual model for foveated light fields by extending the KFR for the rendering of 3D meshes. On datasets of high-resolution microscopic light fields, we observe 3.47×-7.28× speedup in light field rendering with minimal perceptual loss of detail. We envision that 3D-KFR will reconcile the mutually conflicting goals of visual fidelity and rendering speed for interactive visualization of light fields.
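
Purely illustrative: one way eccentricity can drive the texture-sampling budget of a light-field lookup. The real 3D-KFR applies a kernel log-polar mapping (see the KFR sketch further down); the falloff and constants below are assumptions:

```python
import numpy as np

def angular_sample_budget(ecc, full=64, alpha=4.0, floor=8):
    # ecc: normalized angular distance from the gaze direction, in [0, 1].
    # Peripheral rays integrate far fewer angular samples of the light field,
    # which is where the texture-sampling savings come from.
    return max(floor, int(full * (1.0 - np.clip(ecc, 0.0, 1.0)) ** alpha))
```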
Eye-dominance-guided Foveated Rendering
Xiaoxu Meng, Ruofei Du, Amitabh Varshney
IEEEVR 2020 (Accepted as IEEE TVCG paper)
"renders the scene with higher detail for the dominant eye than the non-dominant eye"
  • paper
  • video
  • abstract
    Optimizing rendering performance is critical for a wide variety of virtual reality (VR) applications. Foveated rendering is emerging as an indispensable technique for reconciling interactive frame rates with ever-higher head-mounted display resolutions. Here, we present a simple yet effective technique for further reducing the cost of foveated rendering by leveraging ocular dominance - the tendency of the human visual system to prefer scene perception from one eye over the other. Our new approach, eye-dominance-guided foveated rendering (EFR), renders the scene at a lower foveation level (with higher detail) for the dominant eye than the non-dominant eye. Compared with traditional foveated rendering, EFR can be expected to provide superior rendering performance while preserving the same level of perceived visual quality.
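
The mechanism is deliberately small: keep one foveation level for the dominant eye and a coarser one for the other. A sketch with assumed KFR-style sigma parameters (the names and values are illustrative):

```python
def per_eye_foveation(dominant="left", sigma_dom=1.8, sigma_nondom=2.4):
    # Higher sigma = more aggressive foveation (coarser periphery).
    # The non-dominant eye tolerates a coarser rendering without a
    # noticeable drop in overall perceived quality.
    other = "right" if dominant == "left" else "left"
    return {dominant: sigma_dom, other: sigma_nondom}
```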
Kernel Foveated Rendering
Xiaoxu Meng, Ruofei Du, Matthias Zwicker, and Amitabh Varshney
I3D 2018 (Accepted as journal paper)
"renders the scene with higher detail for the dominant eye than the non-dominant eye"
  • paper
  • code
  • video
  • abstract
    Foveated rendering coupled with eye-tracking has the potential to dramatically accelerate interactive 3D graphics with minimal loss of perceptual detail. In this paper, we parameterize foveated rendering by embedding polynomial kernel functions in the classic log-polar mapping. Our GPU-driven technique uses closed-form, parameterized foveation that mimics the distribution of photoreceptors in the human retina. We present a simple two-pass kernel foveated rendering (KFR) pipeline that maps well onto modern GPUs. In the first pass, we compute the kernel log-polar transformation and render to a reduced-resolution buffer. In the second pass, we carry out the inverse-log-polar transformation with anti-aliasing to map the reduced-resolution rendering to the full-resolution screen. We have carried out pilot and formal user studies to empirically identify the KFR parameters. We observe a 2.8×-3.2× speedup in rendering on 4K UHD (2160p) displays with minimal perceptual loss of detail. The relevance of eye-tracking-guided kernel foveated rendering can only increase as the anticipated rise of display resolution makes it ever more difficult to resolve the mutually conflicting goals of interactive rendering and perceptual realism.
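
A sketch of the pass-1 transform under assumed conventions; the published pipeline is a GPU shader, and the normalization and kernel exponent here are illustrative:

```python
import numpy as np

def kfr_screen_to_buffer(x, y, gaze, buf_w, buf_h, max_r, alpha=4.0):
    # (x, y): screen pixel; gaze: (gx, gy); max_r: largest eccentricity radius.
    dx, dy = x - gaze[0], y - gaze[1]
    r = max(np.hypot(dx, dy), 1.0)
    L = np.log(r) / np.log(max_r)        # normalized log eccentricity in [0, 1]
    u = buf_w * L ** (1.0 / alpha)       # kernel K(x) = x**alpha applied inversely:
                                         # alpha > 1 spends more buffer on the fovea
    v = buf_h * ((np.arctan2(dy, dx) / (2.0 * np.pi)) % 1.0)
    return u, v
```

Pass 2 inverts this mapping (with anti-aliasing) to reconstruct the full-resolution frame from the reduced buffer.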