Comparison of Text Representation Methods for Sentiment Analysis Using Support Vector Machine

Authors

  • Heri Suroyo, Bina Darma University, Indonesia
  • Eric Juanda Pratama, Bina Darma University, Indonesia

DOI:

https://doi.org/10.52435/jaiit.v7i1.610

Keywords:

Sentiment Analysis, SVM, Text Representation, TF-IDF, TikTok

Abstract

This study analyses the sentiment of text associated with TikTok hashtags about public services in Lampung Province, assigning each text to one of three classes: positive, negative, or neutral. The data are comments collected from TikTok, a social media platform that offers users unique and engaging special effects. Netizens were recently stirred by a viral TikTok video titled 'Alasan Lampung Tidak Maju-maju' (Reasons Lampung Is Not Progressing), which criticised Lampung's poor road conditions and drew supportive, critical, and neutral comments. The study follows the KDD (Knowledge Discovery in Databases) process to extract insights from the collected data. The comments are labelled manually and then classified with the Support Vector Machine (SVM) algorithm, implemented in Python. The findings show that the classifier's accuracy depends on the text representation technique: of the three techniques compared, Bag of Words reached 48% accuracy, TF-IDF reached 71%, and FastText reached 50%. For TikTok content about public services in Lampung Province, the Support Vector Machine combined with TF-IDF therefore delivers the highest accuracy of the three.
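
To illustrate how two of the compared representations differ, the following is a minimal sketch in plain Python and is not the authors' pipeline. The toy comments, the whitespace tokenizer, and the function names are assumptions made for illustration, and the TF-IDF weighting shown (smoothed IDF with L2 normalisation) is one common variant; the abstract does not state which formula the study used. The SVM step and FastText are omitted.

```python
import math
from collections import Counter

# Hypothetical toy comments (Indonesian); NOT taken from the study's dataset.
docs = [
    "jalan rusak parah",       # "the road is badly damaged"
    "jalan bagus",             # "the road is good"
    "pelayanan bagus sekali",  # "the service is very good"
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def bag_of_words(documents):
    """Return a sorted vocabulary and one raw term-count vector per document."""
    vocab = sorted({tok for doc in documents for tok in tokenize(doc)})
    vectors = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(documents):
    """Return the vocabulary and L2-normalised TF-IDF vectors.

    Assumed weighting: idf(t) = ln((1 + n) / (1 + df(t))) + 1.
    """
    vocab, counts = bag_of_words(documents)
    n = len(documents)
    # Document frequency: in how many documents each term appears.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n) / (1 + d)) + 1 for d in df]
    vectors = []
    for row in counts:
        weighted = [c * w for c, w in zip(row, idf)]
        norm = math.sqrt(sum(x * x for x in weighted)) or 1.0
        vectors.append([x / norm for x in weighted])
    return vocab, vectors


vocab, bow = bag_of_words(docs)
_, tfidf = tf_idf(docs)

print(vocab)
for doc, counts, weights in zip(docs, bow, tfidf):
    print(doc, counts, [round(w, 3) for w in weights])
```

In the first toy comment, Bag of Words gives "jalan" and "rusak" the same count, while TF-IDF gives "jalan" a lower weight because it also appears in another comment. Either set of vectors could then be passed to a classifier such as an SVM.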

Published

2025-05-20

How to Cite

Suroyo, H., & Pratama, E. J. (2025). Comparison of Text Representation Methods for Sentiment Analysis Using Support Vector Machine. Journal of Advances in Information and Industrial Technology, 7(1), 21–30. https://doi.org/10.52435/jaiit.v7i1.610

Section

Research Article