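Computer science
Convolutional neural network
Artificial intelligence
Inference
Architecture
Representation (politics)
Image processing
Noise (video)
Computer vision
External Data Representation
Machine learning
Pattern recognition (psychology)
Artificial neural network
Image (mathematics)
Computer engineering
High memory
Transformer
Medical imaging
Feature extraction
Spatial analysis
Digital signal processing
Deep learning
Systems architecture
Real-time computing
Data mining
Data modeling
Noise reduction
Computational model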
Authors
Ali Emre Gök,Mustafa Yurdakul,Şakir Taşdemir
Identifier
DOI:10.20944/preprints202604.0948.v1
Abstract
In medical image analysis, modeling both local and global features in high-resolution data is a significant challenge. While the widely used Convolutional Neural Networks (CNNs) struggle to capture long-range dependencies between distant pixels, the high computational cost (O(N^2)) of Vision Transformer (ViT) architectures causes bottlenecks in clinical applications. This study investigates the integration of Mamba models, which were developed to overcome these limitations and have linear complexity, into medical image analysis, along with recent studies in the literature. Rooted in continuous-time control theory, this architecture dynamically adapts to hardware resolution. Through their selective mechanism, Mamba models effectively retain anatomical structures and lesions in memory while filtering out irrelevant noise. Moreover, bidirectional scanning (Vision Mamba) and cross-scan (VMamba) methods are used to prevent the loss of spatial information and to overcome the restriction to one-dimensional sequences imposed by the models' language-based structure. The reviewed literature can be categorized under three main headings: hybrid models, efficient and lightweight designs, and spatial representation studies. Comprehensive analyses of the literature indicate that Mamba models deliver significantly higher inference speed and memory efficiency than traditional CNN and ViT approaches, owing to their hardware-aware design and linear computational efficiency. In conclusion, the Mamba architecture has the potential to become a next-generation standard that delivers high performance while maintaining global contextual integrity across diverse medical fields such as radiology, ophthalmology, and dermatology.
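The scanning idea mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: the function names `bidirectional_scan` and `cross_scan` are hypothetical, the grid is a toy 2x3 patch of integers, and the code only shows how a 2D image can be unrolled into several 1D sequences so that a sequence model sees each pixel in more than one spatial order. Real implementations operate on feature tensors and pass each sequence through a selective state space layer, which is omitted here.

```python
from typing import List

Grid = List[List[int]]


def bidirectional_scan(seq: List[int]) -> List[List[int]]:
    """Return a sequence and its reverse (hypothetical helper).

    Illustrates the forward/backward traversal idea only.
    """
    return [list(seq), list(seq)[::-1]]


def cross_scan(grid: Grid) -> List[List[int]]:
    """Unroll a 2D grid into four 1D sequences (hypothetical helper).

    Order of the result: row-major, column-major, reversed row-major,
    reversed column-major.
    """
    rows, cols = len(grid), len(grid[0])
    row_major = [grid[r][c] for r in range(rows) for c in range(cols)]
    col_major = [grid[r][c] for c in range(cols) for r in range(rows)]
    return [row_major, col_major, row_major[::-1], col_major[::-1]]


# Toy 2x3 "image"; the values stand in for pixel or patch indices.
patch = [[1, 2, 3],
         [4, 5, 6]]

sequences = cross_scan(patch)
for name, seq in zip(["rows", "cols", "rows-rev", "cols-rev"], sequences):
    print(name, seq)
```

Each unrolled sequence has the same length as the number of pixels, so the number of elements to process grows linearly with image size for a fixed number of scan directions.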