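Artificial intelligence
Computer science
Computer vision
Image sensor
Encoding
Computation
MNIST database
Iterative reconstruction
Cognitive neuroscience of visual object recognition
Coded aperture
Multiplexing
Pattern recognition (psychology)
Optics
Object (grammar)
Physics
Artificial neural network
Algorithm
Detector
Gene
Telecommunications
Chemistry
Biochemistry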
Authors
Xiuxi Pan,Xiaohong Chen,Tomoya Nakamura,Masahiro Yamaguchi
Source
Journal: Optics Express
[Optica Publishing Group]
Date: 2021-10-14
Volume (issue): 29 (23): 37962-37962
Citations: 25
Abstract
A mask-based lensless camera uses a thin mask to optically encode the scene and records the encoded pattern on an image sensor. Such a camera can be thinner, lighter, and cheaper than a lensed camera, but additional computation is required to reconstruct an image from the encoded pattern. Because inference may be a major application of the lensless camera, we propose to perform object recognition directly on the encoded pattern. Avoiding image reconstruction not only saves computational resources but also averts reconstruction errors and artifacts. We theoretically analyze the multiplexing property of mask-based lensless optics, which maps local information in the scene to overlapping global information in the encoded pattern. To better extract global features, we propose a simplified Transformer-based architecture. This is the first study of a Transformer-based architecture for encoded pattern recognition in mask-based lensless optics. In the optical experiment, the proposed system achieves 91.47% accuracy on Fashion MNIST and 96.64% ROC AUC on the cats-vs-dogs dataset. The feasibility of physical object recognition is also evaluated.
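The multiplexing property described in the abstract can be illustrated with a small numerical sketch. It assumes a shift-invariant forward model in which the encoded pattern is the circular 2D convolution of the scene with the mask pattern; this model, the random binary mask, the 32x32 grid, and the function name `encode` are illustrative assumptions, not details taken from the paper. Under that assumption, a single point in the scene spreads over many sensor pixels.

```python
import numpy as np

def encode(scene, mask):
    """Circular 2D convolution of scene with mask, computed via FFT.

    Assumed shift-invariant forward model, not the paper's exact optics.
    """
    return np.real(np.fft.ifft2(np.fft.fft2(scene) * np.fft.fft2(mask)))

rng = np.random.default_rng(0)
size = 32  # hypothetical sensor and scene resolution

# Hypothetical random binary mask (1 = transparent, 0 = opaque).
mask = (rng.random((size, size)) < 0.5).astype(float)

# A scene containing one point source: purely local information.
scene = np.zeros((size, size))
scene[10, 12] = 1.0

pattern = encode(scene, mask)

# Count sensor pixels that receive light from that single scene point.
lit_pixels = int(np.sum(pattern > 0.5))
print(f"one scene point lights {lit_pixels} of {size * size} sensor pixels")
```

In this toy model the pattern produced by a point source is just a shifted copy of the mask, so every scene point contributes to pixels across the whole sensor and the contributions of neighboring points overlap. That is the sense in which local scene information becomes global in the encoded pattern, and why an architecture with a global receptive field is a natural fit.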