Optimizing Object Detection in Video with Region of Interest Compression Using the YOLOv8 Model

Azhryl Assagaf; Muis Muhtadi

doi:10.29408/edumatic.v9i2.30007

Authors

Azhryl Assagaf, Informatics Engineering Study Program, Universitas Negeri Malang, https://orcid.org/0009-0000-7779-0686
Muis Muhtadi, Informatics Engineering Study Program, Universitas Negeri Malang, https://orcid.org/0009-0009-4265-8138

Keywords:

object detection, video compression, region of interest, YOLOv8, mAP evaluation

Abstract

Real-time object detection systems, such as those used in video surveillance and autonomous vehicles, need efficient data storage and transmission without sacrificing accuracy. One promising approach is Region of Interest (ROI)-based video compression, which preserves visual quality in the areas that matter most. This study evaluates the impact of video compression on the object detection accuracy of the YOLOv8 model, using Analysis of Variance (ANOVA) for statistical testing, and compares the effectiveness of uniform and ROI-based compression. Videos from the VIRAT Video Dataset were compressed by adjusting the Constant Rate Factor (CRF) parameter and evaluated on mAP_50, mAP_50_95, and file size. The ANOVA results indicate no statistically significant difference between the two methods. At CRF 50, file size can be reduced by more than 60%, but mAP_50 drops below 50% because quality degradation in non-ROI areas disrupts the spatial context the model relies on. The study contributes an examination of the compression tolerance limits of YOLOv8 and shows that overall visual quality, not only the quality of the object regions, plays a crucial role in model performance. These findings matter for real-time applications such as CCTV and autonomous vehicles, where compression efficiency must be balanced against detection accuracy. Future work may explore adaptive ROI approaches that account for dynamic object movement.
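
The two quantities the abstract relies on, the one-way ANOVA F statistic and the percentage reduction in file size, can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the mAP_50 lists and file sizes are placeholder values invented for the example, not results from the study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of scores per compression method.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    df_between = k - 1
    df_within = n_total - k
    if df_between < 1 or df_within < 1:
        raise ValueError("need at least two groups and more samples than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def size_reduction_pct(original_bytes, compressed_bytes):
    """Percentage by which compression shrinks a file."""
    return 100.0 * (1 - compressed_bytes / original_bytes)


if __name__ == "__main__":
    # Placeholder mAP_50 scores per method (assumed values, NOT from the paper).
    uniform_map50 = [0.70, 0.65, 0.60, 0.48]
    roi_map50 = [0.71, 0.66, 0.59, 0.47]
    f_stat, df_b, df_w = one_way_anova_f([uniform_map50, roi_map50])
    print(f"F({df_b}, {df_w}) = {f_stat:.4f}")

    # Placeholder file sizes in bytes (assumed values, NOT from the paper).
    print(f"size reduction: {size_reduction_pct(100_000_000, 38_000_000):.1f}%")
```

An F value close to zero means the group means barely differ relative to the spread within each group, which is the pattern behind a "no statistically significant difference" conclusion. Turning F into a p-value requires the F distribution, for example from a statistics library.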
