Topics
Computer science, Scalability, Transformer, Artificial intelligence, Block (permutation group theory), Computer engineering, Generative grammar, Computer vision, Geometry, Mathematics, Quantum mechanics, Database, Physics, Voltage
Authors
Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan C. Bovik, Yinxiao Li
Identifier
DOI:10.1007/978-3-031-20053-3_27
Abstract
Transformers have recently gained significant attention in the computer vision community. However, the lack of scalability of self-attention mechanisms with respect to image size has limited their wide adoption in state-of-the-art vision backbones. In this paper, we introduce an efficient and scalable attention model we call multi-axis attention, which consists of two aspects: blocked local and dilated global attention. These design choices allow global-local spatial interactions on arbitrary input resolutions with only linear complexity. We also present a new architectural element by effectively blending our proposed attention model with convolutions, and accordingly propose a simple hierarchical vision backbone, dubbed MaxViT, by simply repeating the basic building block over multiple stages. Notably, MaxViT is able to “see” globally throughout the entire network, even in earlier, high-resolution stages. We demonstrate the effectiveness of our model on a broad spectrum of vision tasks. On image classification, MaxViT achieves state-of-the-art performance under various settings: without extra data, MaxViT attains 86.5% ImageNet-1K top-1 accuracy; with ImageNet-21K pre-training, our model achieves 88.7% top-1 accuracy. For downstream tasks, MaxViT as a backbone delivers favorable performance on object detection as well as visual aesthetic assessment. We also show that our proposed model expresses strong generative modeling capability on ImageNet, demonstrating the superior potential of MaxViT blocks as a universal vision module. The source code and trained models will be available at https://github.com/google-research/maxvit.
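The multi-axis attention described in the abstract factorizes full self-attention into two sparse forms: block attention over non-overlapping local windows, and grid attention over a strided grid that spans the whole image. The NumPy sketch below is our illustration only, not the authors' released code; the names block_partition, grid_partition, and attention are hypothetical. It shows how both partitionings reduce to reshapes and transposes, so every attention group holds a fixed number of tokens and the overall cost stays linear in the number of pixels.

```python
# Minimal sketch of the two partitioning schemes behind multi-axis
# attention (illustrative only; not the google-research/maxvit code).
import numpy as np

def block_partition(x, p):
    """(H, W, C) -> (num_windows, p*p, C): dense non-overlapping p x p
    windows for local attention. Assumes H and W are divisible by p."""
    h, w, c = x.shape
    x = x.reshape(h // p, p, w // p, p, c)
    x = x.transpose(0, 2, 1, 3, 4)        # group the two window axes together
    return x.reshape(-1, p * p, c)

def grid_partition(x, p):
    """(H, W, C) -> (num_groups, p*p, C): sparse p x p uniform grid for
    dilated global attention. Tokens in one group are spaced H/p (resp.
    W/p) pixels apart, so attention within a group spans the image."""
    h, w, c = x.shape
    x = x.reshape(p, h // p, p, w // p, c)
    x = x.transpose(1, 3, 0, 2, 4)        # gather strided tokens together
    return x.reshape(-1, p * p, c)

def attention(tokens):
    """Plain softmax self-attention within each group (no learned
    projections, just to show where the quadratic cost is confined)."""
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(tokens.shape[-1])
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return weights @ tokens

x = np.random.randn(16, 16, 8)            # toy 16x16 feature map, 8 channels
local = attention(block_partition(x, 4))  # attends inside each 4x4 window
global_ = attention(grid_partition(x, 4)) # attends across the 4x4 grid
print(local.shape, global_.shape)         # (16, 16, 8) twice: 16 groups of 16 tokens
```

Because each group always contains p*p tokens, the quadratic attention cost is confined to a constant-size window, and the number of groups grows linearly with image area. In the actual MaxViT block, per the paper, each attention step additionally uses learned Q/K/V projections and relative position biases, and is paired with an MBConv layer; the sketch keeps only the partitioning logic that produces the local-global split.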