Comparative Analysis of Topic Modeling of Online Gambling Promotion in YouTube Comments Using Latent Dirichlet Allocation and BERTopic
DOI:
https://doi.org/10.29240/arcitech.v6i1.16764
Keywords:
Topic Modeling, LDA, BERTopic, YouTube, Online Gambling
Abstract
This study analyzes topics in YouTube comments related to online gambling using Latent Dirichlet Allocation (LDA) and BERTopic, and compares the performance of the two methods. The dataset consists of 6,350 YouTube comments obtained from Kaggle. The analysis comprises preprocessing, topic modeling, and evaluation with two metrics: topic coherence and topic diversity. LDA achieves a topic coherence score of 0.511 and a topic diversity score of 1.0, while BERTopic achieves a topic coherence score of 0.667 and a topic diversity score of 0.449. These findings indicate that BERTopic produces more semantically coherent topics than LDA, although its lower diversity score reflects greater overlap between topics. Interpretation of the results shows that several identified topics relate to online gambling promotion, while others are influenced by noise in the comment data. Given its higher coherence, BERTopic is considered the more effective method for analyzing short, unstructured text data.
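The abstract does not define how topic diversity is computed. A minimal sketch, assuming the commonly used definition (the proportion of unique words among the top-k words of all topics), may help in reading the reported scores. Under that assumption, a score of 1.0 means no top word is shared between topics, while a score of 0.449 means fewer than half of the top-word slots hold distinct words. The function name `topic_diversity`, the `top_k` parameter, and the example word lists are hypothetical and are not taken from the study.

```python
from typing import Sequence


def topic_diversity(topics: Sequence[Sequence[str]], top_k: int = 10) -> float:
    """Return the share of unique words among the top_k words of every topic.

    Assumed definition (not confirmed by the abstract):
    unique top words / total top-word slots across all topics.
    """
    # Flatten the first top_k words of each topic into one list of slots.
    top_words = [word for topic in topics for word in topic[:top_k]]
    if not top_words:
        raise ValueError("topics must contain at least one word")
    return len(set(top_words)) / len(top_words)


# Made-up topics for illustration only; these are not the study's results.
disjoint = [["slot", "gacor", "deposit"], ["video", "bagus", "mantap"]]
overlapping = [["slot", "gacor", "deposit"], ["slot", "gacor", "bonus"]]

print(topic_diversity(disjoint, top_k=3))     # no shared words between topics
print(topic_diversity(overlapping, top_k=3))  # two of three words are shared
```

In this sketch the two disjoint topics fill six slots with six distinct words, whereas the overlapping topics fill six slots with only four distinct words, which is the kind of overlap a lower diversity score signals.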
License
Copyright (c) 2026 Nur Aisyah Wahyuni, Hafiz Irsyad

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Authors who publish with Arcitech: Journal of Computer science and Artificial Intelligence agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).







