Classification of Crime Types Based on the Verdict Text of KUHP Criminal Court Decisions Using IndoBERT

Tirtanusa Kurnia Adhi Perdana; Dewi Soyusiawaty

doi:10.29408/edumatic.v9i2.30326

Authors

Tirtanusa Kurnia Adhi Perdana, Informatics Study Program, Universitas Ahmad Dahlan https://orcid.org/0009-0000-3041-1861
Dewi Soyusiawaty, Informatics Study Program, Universitas Ahmad Dahlan https://orcid.org/0000-0003-2862-2904

Keywords:

judicial decision text analysis, IndoBERT, criminal law article classification, natural language processing in law, legal text

Abstract

The growing number of court rulings each year presents a challenge for the judiciary. One strategic solution is the application of Artificial Intelligence (AI). Indonesian-language models such as IndoBERT have the potential to ease workloads by automatically classifying legal cases. This study explores the capability of IndoBERT to automatically classify the verdict section (amar putusan) of rulings under the Indonesian Criminal Code (KUHP) in order to accelerate crime type identification. This is an experimental study using supervised text classification. The dataset consists of 12,000 verdicts collected from the Indonesian Supreme Court website, classified using IndoBERT fine-tuned with various hyperparameter configurations. Our findings show that the model with a batch size of 8 and a learning rate of 5e-5 achieved an accuracy of 92.59%, precision of 92.93%, recall of 92.59%, and F1-score of 92.59% on unseen test data. The high accuracy is supported by the explicit mention of crime types within verdict texts. To our knowledge, no prior study has specifically used IndoBERT or other models for automatic classification of KUHP articles. The fine-tuned model has the potential to be integrated into the Supreme Court’s Directory of Decisions as a support tool for automatic classification and legal document archiving.
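
The abstract reports accuracy, precision, recall, and F1-score but does not say how the per-class scores were averaged. The sketch below is one plausible reading, assuming support-weighted averaging: under that scheme recall always equals accuracy, which would be consistent with both being reported as 92.59%. The function name `weighted_metrics` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true examples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        # Per-class scores; a class that is never predicted gets precision 0.
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        # Weight each class by its share of the true labels.
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Toy labels (hypothetical crime-type classes, not the paper's data).
y_true = ["theft", "theft", "theft", "fraud", "fraud", "assault"]
y_pred = ["theft", "theft", "fraud", "fraud", "fraud", "theft"]
acc, prec, rec, f1 = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```

With weighted averaging, each class's recall is multiplied by its support, so the sum collapses to the total number of correct predictions divided by the number of examples, which is the definition of accuracy. Precision and F1 can still differ from accuracy, as they do in the reported results.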
