Crowd Counting using Deep Learning Model on FPGA card

Thi Thu Thao Khong; Van Loc Tran; Hai Phong Phan; Duc Hung Duong

doi:10.26459/hueunijtt.v133i2B.7495

Vol. 133 No. 2B (2024), Research Articles

https://doi.org/10.26459/hueunijtt.v133i2B.7495

Published 2024-06-12

Thi Thu Thao KHONG*

Faculty of Electrics, Electronics and Material Technology, Hue University of Sciences, Hue University, 77 Nguyen Hue, Hue, Vietnam

Van Loc TRAN

Brycen Viet Nam Co., LTD, 25 Nguyen Van Cu, Hue, Vietnam

Hai-Phong PHAN

Faculty of Electrics, Electronics and Material Technology, Hue University of Sciences, Hue University, 77 Nguyen Hue, Hue, Vietnam

Duc-Hung DUONG

Hue University, 3 Le Loi, Hue, Vietnam

Abstract

Machine learning and deep learning are becoming important tools for video processing in artificial intelligence applications, especially real-time tasks that demand speed, accuracy, and flexibility. We therefore introduce a system that detects and counts people in RTSP video streams using a deep learning model. The system uses two FPGA cards, the Xilinx Alveo U30 and Alveo U200, to accelerate video stream transmission and deep learning inference, respectively. On the input and output streams, the Vitis Video Analytics SDK with GStreamer is used to leverage the Alveo U30 for streaming RTSP video. For inference, a trained YOLOX model detects and counts people in the video frames; YOLOX is accelerated on the Alveo U200 through the Mipsology Zebra framework. The proposed system not only processes multiple streams but also achieves faster inference and lower CPU usage than a system that runs deep learning inference on the CPU alone.
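
The counting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the detector output has already been decoded into boxes with a confidence score and a class index, that index 0 means "person" (as in COCO-style label maps), and that a score threshold of 0.5 is acceptable. The names `Detection`, `count_people`, and `count_per_stream` are hypothetical and introduced only for illustration.

```python
from typing import Dict, List, NamedTuple


class Detection(NamedTuple):
    """One decoded detection box (hypothetical structure, not from the paper)."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float   # detector confidence in [0, 1]
    class_id: int  # index into the label map


# Assumption: a COCO-style label map in which index 0 is "person".
PERSON_CLASS_ID = 0


def count_people(detections: List[Detection], score_threshold: float = 0.5) -> int:
    """Count detections labelled as a person with sufficient confidence."""
    return sum(
        1
        for det in detections
        if det.class_id == PERSON_CLASS_ID and det.score >= score_threshold
    )


def count_per_stream(
    detections_by_stream: Dict[str, List[Detection]],
    score_threshold: float = 0.5,
) -> Dict[str, int]:
    """Apply the same counting rule to the latest frame of each video stream."""
    return {
        stream_id: count_people(dets, score_threshold)
        for stream_id, dets in detections_by_stream.items()
    }
```

In a multi-stream setting, `count_per_stream` would be called once per batch of frames, with one list of detections per stream identifier. The threshold trades missed people against false positives and would need tuning for a given camera and scene.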

This work is licensed under a Creative Commons Attribution 4.0 International License.