Novel view synthesis (NVS) has gained significant attention with the advancement of generative modeling techniques. While earlier methods, such as neural radiance fields (NeRF) and Gaussian splatting, have achieved impressive results, the integration of diffusion models has brought NVS closer to generating photorealistic images. Recently, there has been a shift toward few-shot NVS, where only a sparse set of input views is provided and novel views must be generated from unseen camera poses. However, existing methods typically assign equal importance to all input views relative to the target view, which can lead to suboptimal results, especially when the input views vary significantly in relevance to the target pose. In this work, we improve few-shot NVS by introducing a camera-weighting mechanism that adjusts the importance of each source view based on its relevance to the target view. We propose two types of approaches: deterministic weighting schemes, which account for geometric properties such as the Euclidean distance and angular difference between the source and target views, and a cross-attention-based learning scheme that enables the model to learn optimal view weightings. Applying these techniques to few-shot NVS scenarios, we demonstrate that our camera-weighting methods improve the quality of synthesized views compared to conventional equal-weighting approaches. Our results highlight the potential of view-weighting strategies to improve accuracy and realism in few-shot NVS.
[Code]
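As an illustration of the deterministic weighting idea described above, the sketch below combines two geometric cues, Euclidean distance between camera centers and angular difference between viewing directions, into softmax-normalized per-view weights. This is a minimal sketch under assumed conventions; the function name `camera_weights` and the mixing coefficients `alpha` and `beta` are hypothetical, not taken from the paper.

```python
import numpy as np

def camera_weights(src_positions, src_dirs, tgt_position, tgt_dir,
                   alpha=1.0, beta=1.0):
    """Hypothetical deterministic view weighting.

    src_positions: (N, 3) source camera centers
    src_dirs:      (N, 3) source viewing directions
    tgt_position:  (3,)   target camera center
    tgt_dir:       (3,)   target viewing direction
    Returns (N,) weights summing to 1; closer, better-aligned views score higher.
    """
    # Euclidean distance between each source camera and the target camera
    dist = np.linalg.norm(src_positions - tgt_position, axis=1)
    # Cosine of the angular difference between viewing directions
    cos_sim = (src_dirs @ tgt_dir) / (
        np.linalg.norm(src_dirs, axis=1) * np.linalg.norm(tgt_dir))
    # Higher score = more relevant: penalize distance, reward alignment
    score = -alpha * dist + beta * cos_sim
    # Softmax normalization (shifted by the max for numerical stability)
    w = np.exp(score - score.max())
    return w / w.sum()
```

A learned alternative, as in the cross-attention scheme, would replace this fixed scoring rule with attention scores computed from view embeddings, letting the model discover the weighting from data.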