
DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation

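Computer science, Transformer, Point cloud, Computation, Voxel, Algorithm, Architecture, Artificial intelligence, Electrical engineering, Engineering, Voltage, Art, Visual arts
Authors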
Shentong Mo,Enze Xie,Ruihang Chu,Lewei Yao,Lanqing Hong,Matthias Nießner,Zhenguo Li
Source
Journal: Cornell University - arXiv  Cited by: 4
Identifier
DOI:10.48550/arxiv.2307.01831
Abstract

Recent Diffusion Transformers (e.g., DiT) have demonstrated their powerful effectiveness in generating high-quality 2D images. However, it is still being determined whether the Transformer architecture performs equally well in 3D shape generation, as previous 3D diffusion methods mostly adopted the U-Net architecture. To bridge this gap, we propose a novel Diffusion Transformer for 3D shape generation, namely DiT-3D, which can directly operate the denoising process on voxelized point clouds using plain Transformers. Compared to existing U-Net approaches, our DiT-3D is more scalable in model size and produces much higher quality generations. Specifically, the DiT-3D adopts the design philosophy of DiT but modifies it by incorporating 3D positional and patch embeddings to adaptively aggregate input from voxelized point clouds. To reduce the computational cost of self-attention in 3D shape generation, we incorporate 3D window attention into Transformer blocks, as the increased 3D token length resulting from the additional dimension of voxels can lead to high computation. Finally, linear and devoxelization layers are used to predict the denoised point clouds. In addition, our transformer architecture supports efficient fine-tuning from 2D to 3D, where the pre-trained DiT-2D checkpoint on ImageNet can significantly improve DiT-3D on ShapeNet. Experimental results on the ShapeNet dataset demonstrate that the proposed DiT-3D achieves state-of-the-art performance in high-fidelity and diverse 3D point cloud generation. In particular, our DiT-3D decreases the 1-Nearest Neighbor Accuracy of the state-of-the-art method by 4.59 and increases the Coverage metric by 3.51 when evaluated on Chamfer Distance.
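To make the token-length argument in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`voxelize`, `patchify`, `attention_pairs`), the normalization of points to [-1, 1]^3, the binary occupancy grid, and the specific grid, patch, and window sizes are all assumptions chosen for illustration. It shows how a point cloud might be voxelized, cut into non-overlapping 3D patches that act as tokens, and how restricting self-attention to local 3D windows reduces the number of query-key pairs.

```python
from itertools import product


def voxelize(points, resolution):
    """Map points in [-1, 1]^3 to a dense binary occupancy grid (assumed format)."""
    grid = [[[0.0] * resolution for _ in range(resolution)]
            for _ in range(resolution)]
    for point in points:
        idx = []
        for c in point:
            i = int((c + 1.0) / 2.0 * resolution)
            idx.append(min(max(i, 0), resolution - 1))  # clamp boundary points
        grid[idx[0]][idx[1]][idx[2]] = 1.0
    return grid


def patchify(grid, patch):
    """Split a V^3 grid into non-overlapping p^3 patches, one flat token each."""
    size = len(grid)
    assert size % patch == 0, "grid side must be divisible by patch size"
    n = size // patch
    tokens = []
    for bi, bj, bk in product(range(n), repeat=3):
        token = [grid[bi * patch + di][bj * patch + dj][bk * patch + dk]
                 for di, dj, dk in product(range(patch), repeat=3)]
        tokens.append(token)
    return tokens


def attention_pairs(tokens_per_axis, window=None):
    """Count query-key pairs for global attention, or for attention restricted
    to non-overlapping cubic windows of `window` tokens per axis."""
    n = tokens_per_axis ** 3
    if window is None:
        return n * n
    assert tokens_per_axis % window == 0
    num_windows = (tokens_per_axis // window) ** 3
    per_window = window ** 3
    return num_windows * per_window * per_window


# Hypothetical sizes: an 8^3 grid with 4^3 patches gives 2^3 = 8 tokens.
grid = voxelize([(-1.0, -1.0, -1.0), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)], 8)
tokens = patchify(grid, 4)

# With 8 tokens per axis (512 tokens), windows of 4 tokens per axis
# cut the pair count by a factor of 8 in this toy setting.
global_pairs = attention_pairs(8)
window_pairs = attention_pairs(8, window=4)
```

In this toy setting the token count grows with the cube of the tokens per axis, so global attention cost grows with its sixth power, which is the pressure the abstract cites as motivation for 3D window attention.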
