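Inference
Software deployment
Computer science
Latency (audio)
Scheme (mathematics)
Throughput
Server
Computer graphics
Real-time computing
Artificial intelligence
Distributed computing
Operating system
Wireless
Mathematics
Telecommunications
Mathematical analysis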
Authors
Jiacong Fang, Qiong Liu, Jingzheng Li
Identifier
DOI:10.1109/icccbda51879.2021.9442557
Abstract
Object detection constitutes a large part of computer vision applications. You Only Look Once (YOLO) v5 is a salient object detection algorithm that provides high accuracy and real-time performance. This paper illustrates a deployment scheme of YOLOv5 with inference optimizations on Nvidia graphics cards using an open-source deep-learning deployment framework named Triton Inference Server. Moreover, we developed a non-maximum suppression (NMS) operator with dynamic-batch-size support in TensorRT to accelerate inference. The experimental results show that both throughput and latency are improved significantly through our deployment scheme.
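For readers unfamiliar with the NMS step mentioned in the abstract, the following is a minimal pure-Python sketch of greedy non-maximum suppression applied independently to each image in a batch of arbitrary size. It is not the authors' TensorRT operator: the function names (`iou`, `nms`, `batched_nms`), the `(x1, y1, x2, y2)` box layout, and the default IoU threshold of 0.5 are all assumptions made for illustration.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


def batched_nms(batch_boxes: Sequence[Sequence[Box]],
                batch_scores: Sequence[Sequence[float]],
                iou_threshold: float = 0.5) -> List[List[int]]:
    """Run NMS per image; the batch size is whatever the caller passes in."""
    return [nms(b, s, iou_threshold)
            for b, s in zip(batch_boxes, batch_scores)]


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    # A batch of two images: one with three candidates, one with none.
    print(batched_nms([boxes, []], [scores, []]))
```

Because each image is processed independently, the same routine handles any batch size, which is the property a dynamic-batch NMS operator has to preserve when it is fused into the inference engine rather than run as a separate post-processing step.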