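Pose
Artificial intelligence
Computer science
Computer vision
Ground truth
3D pose estimation
Computation
Gradient descent
Projection (relational algebra)
Artificial neural network
Algorithm
Authors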
Chaochao Zhou, Syed Hasib Akhter Faruqui, Dayeong An, Abhinav Patel, R Abdalla, Michael C. Hurley, Ali Shaibani, Matthew B. Potts, Babak S. Jahromi, Sameer A. Ansari, Donald R. Cantrell
Identifier
DOI:10.1007/s10278-024-01354-w
Abstract
Many tasks performed in image-guided procedures can be cast as pose estimation problems, where specific projections are chosen to reach a target in 3D space. In this study, we construct a framework for fluoroscopic pose estimation and compare alternative loss functions and volumetric scene representations. We first develop a differentiable projection (DiffProj) algorithm for the efficient computation of Digitally Reconstructed Radiographs (DRRs) from either Cone-Beam Computerized Tomography (CBCT) or neural scene representations. We introduce two innovative neural scene representations, Neural Tuned Tomography (NeTT) and masked Neural Radiance Fields (mNeRF). Pose estimation is then performed within the framework by iterative gradient descent using loss functions that quantify the image discrepancy of the synthesized DRR with respect to the ground-truth, target fluoroscopic X-ray image. We compared alternative loss functions and volumetric scene representations for pose estimation using a dataset consisting of 50 cranial tomographic X-ray sequences. We find that Mutual Information significantly outperforms alternative loss functions for pose estimation, avoiding entrapment in local optima. The alternative discrete (CBCT) and neural (NeTT and mNeRF) volumetric scene representations yield comparable performance (3D angle errors, mean ≤ 3.2° and 90% quantile ≤ 3.4°); however, the neural scene representations incur a considerable computational expense to train.
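As an illustration of the Mutual Information loss mentioned in the abstract, the following is a minimal sketch of a histogram-based estimate between a synthesized DRR and a target image. It is not the authors' implementation: the function names `mutual_information` and `mi_loss`, the bin count, and the use of NumPy are all assumptions made for this example. A hard histogram is also not differentiable, so a gradient-descent pipeline like the one described would presumably need a smooth (for example, kernel-based) variant of this quantity.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Histogram-based mutual information (in nats) of two equally sized images.

    Hypothetical helper for illustration; not taken from the paper.
    """
    a = np.asarray(img_a, dtype=float).ravel()
    b = np.asarray(img_b, dtype=float).ravel()
    # Joint intensity histogram, normalized to a joint probability table.
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()
    # Marginals, kept 2D so their product has the same shape as p_ab.
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    # Sum only over occupied cells to avoid log(0).
    nz = p_ab > 0
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])))

def mi_loss(drr, target, bins=32):
    """Loss to minimize: negative mutual information (hypothetical name)."""
    return -mutual_information(drr, target, bins)

# Toy check with synthetic data (no real fluoroscopy involved).
rng = np.random.default_rng(0)
target = rng.random((64, 64))
unrelated = rng.random((64, 64))
print(mi_loss(target, target) < mi_loss(unrelated, target))
```

Because the estimate depends only on the joint distribution of intensities, an image compared with an intensity-inverted copy of itself still scores highly, whereas a pixel-wise difference would not. That property is one common motivation for using Mutual Information when comparing synthetic and real X-ray images.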