Computer science
Artificial intelligence
Modal verb
Pattern recognition (psychology)
Series (stratigraphy)
Fusion
Multivariate statistics
Machine learning
Geology
Materials science
Linguistics
Philosophy
Paleontology
Polymer chemistry
Authors
Hao Jiang, Lianguang Liu, Cheng Lian
Identifier
DOI:10.1109/icaci55529.2022.9837525
Abstract
With the development of sensor technology, multivariate time series classification has become an essential element of temporal data mining. Multivariate time series are everywhere in daily life, for example in finance, weather, and healthcare systems. Meanwhile, Transformers have achieved excellent results on NLP and CV tasks. The Vision Transformer (ViT), when pre-trained on large amounts of data and transferred to multiple small-to-medium image recognition benchmarks, achieves excellent results compared to state-of-the-art (SOTA) convolutional networks while requiring significantly fewer computing resources. At the same time, multi-modal approaches can extract richer features, and related research has developed rapidly. In this work, we propose a multi-modal fusion transformer for time series classification. We use the Gramian Angular Field (GAF) to convert time series into 2D images, then use CNNs to extract features from the 1D time series and the 2D images separately and fuse them. Finally, the fused output of the transformer encoder is fed into a ResNet for classification. We conduct extensive experiments on twelve time series datasets. Compared to several baselines, our model achieves higher accuracy.
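The abstract's image branch relies on the Gramian Angular Field transform, which maps a 1D series to a 2D matrix that a CNN can consume. Below is a minimal sketch of that standard transform only; the function name `gramian_angular_field`, the min-max rescaling choice, and the example series are illustrative assumptions and are not taken from the paper, which does not specify its exact preprocessing here.

```python
import numpy as np

def gramian_angular_field(x, summation=True):
    """Convert a 1D time series into a 2D Gramian Angular Field image.

    Steps: rescale the series to [-1, 1], map each value to a polar angle
    phi = arccos(x), then build the pairwise matrix cos(phi_i + phi_j)
    (summation field, GASF) or sin(phi_i - phi_j) (difference field, GADF).
    """
    x = np.asarray(x, dtype=float)
    # Min-max rescale to [-1, 1] so that arccos is well defined (assumed scaling).
    x_min, x_max = x.min(), x.max()
    x_scaled = 2.0 * (x - x_min) / (x_max - x_min) - 1.0
    # Clip to guard against floating-point overshoot.
    x_scaled = np.clip(x_scaled, -1.0, 1.0)
    phi = np.arccos(x_scaled)
    if summation:
        return np.cos(phi[:, None] + phi[None, :])   # GASF
    return np.sin(phi[:, None] - phi[None, :])        # GADF

# Example: turn a length-64 series into a 64x64 image channel for a 2D CNN.
series = np.sin(np.linspace(0, 4 * np.pi, 64))
image = gramian_angular_field(series)  # shape (64, 64)
```

In the paper's pipeline, such an image would be fed to a 2D CNN while the raw series goes through a 1D CNN, and the two feature streams are fused before the transformer encoder and the final ResNet classifier.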