This paper describes a system that can successfully distinguish between visually similar objects, specifically bicycles and motorcycles, from the vantage point of traffic surveillance cameras. You Only Look Once (YOLO) is used as the main framework in this research because of its speed relative to the other machine learning models and methods evaluated. For this project, we built a dataset of motorcycle and bicycle images drawn from different CCTV footage. The footage may vary in viewing angle, image resolution, and ambient environment. Using this dataset, we trained YOLOv3-based models and compared their performance against the vanilla version of YOLOv3 and other pre-trained models. Four models were trained and compared; the best-performing model is shown to be the one associated with a properly labeled dataset (i.e., one in which every instance of the object of interest is marked) that has the largest number of instances in the training and testing sets.
Keywords: YOLO-based Network, Visual Detection
@ARTICLE{Cordel2021a,
  author    = {Dequito, C. J. M. and Dichaves, I. J. L. and Juan, R. J. G. and Miyanaga, M. Y. K. T. and Ilao, J. P. and Cordel, M. O. and Del Gallego, N. P. A.},
  title     = {Vision-based bicycle and motorcycle detection using a {YOLO}-based Network},
  journal   = {Journal of Physics: Conference Series},
  publisher = {{IOP} Publishing},
  volume    = {1922},
  number    = {1},
  year      = {2021},
}