The Effect of the ADASYN and SMOTE Algorithms on Support Vector Machine Performance on the Imbalanced Airbnb Dataset
DOI:
https://doi.org/10.29408/edumatic.v5i1.3125
Keywords:
ADASYN, Classification, SMOTE, SVM, Traveling
Abstract
Traveling is becoming increasingly common around the world. Hotels are hard to reach from some tourist attractions because those attractions are far from the city center. Airbnb is a platform that provides home- or apartment-based rentals. Its lodging offers come from two types of hosts: non-super hosts and super hosts. The super-host badge is awarded when a host has a good reputation and meets the requirements. Being a super host brings advantages such as more visibility, increased earning potential, and exclusive rewards. In this study, the Support Vector Machine (SVM) algorithm classifies hosts using data on these criteria. The data set is imbalanced: the super-host population is smaller than the non-super-host population. To address the imbalance, oversampling is carried out using ADASYN and SMOTE. The research goal was to determine the performance of the SVM algorithm when combined with the ADASYN and SMOTE sampling techniques. The data analysis used oversampling to handle the imbalanced data set, and a confusion matrix to evaluate precision, recall, F1-score, and accuracy. The results show that SMOTE with SVM increases accuracy by 1 percentage point, from 80% to 81%, which is influenced by an increase in the True (minority) label test results and a decrease in the False (majority) label test results. SMOTE with SVM performs better than both ADASYN with SVM and SVM without oversampling.
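The two mechanical steps named in the abstract, synthetic oversampling of the minority class and scoring from a confusion matrix, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `smote_like_oversample` and `metrics_from_confusion`, the feature vectors, and the confusion-matrix counts are all hypothetical and chosen only for illustration. The oversampler follows the general SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbors. ADASYN differs in that it generates more synthetic points around minority samples that have more majority-class neighbors, which this sketch does not attempt.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each placed on the line segment
    between a minority sample and one of its k nearest minority neighbors.
    Assumes `minority` holds at least two points."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


def metrics_from_confusion(tp, fp, fn, tn):
    """Compute precision, recall, F1-score, and accuracy from the four
    cells of a binary confusion matrix (positive = minority class)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical minority-class feature vectors (not data from the study).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like_oversample(minority, n_new=6)

# Hypothetical confusion-matrix counts (not results from the study).
scores = metrics_from_confusion(tp=30, fp=10, fn=20, tn=140)
print(len(new_points), scores)
```

In a setup like the one the abstract describes, the synthetic points would be added to the training split only, so that the test set still reflects the original class distribution. Accuracy alone can hide poor minority-class performance on imbalanced data, so it helps to read precision, recall, and F1-score alongside it.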
License
All articles in this journal are the sole responsibility of their authors. Edumatic: Jurnal Pendidikan Informatika can be accessed free of charge, in accordance with the Creative Commons license used.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.