Building Synonym Sets for English WordNet with Robust Clustering using Links Method
DOI:
https://doi.org/10.29408/edumatic.v4i1.2063
Keywords:
F-measure, Gold Standard, Robust Clustering Using Links, WordNet
Abstract
English WordNet is a lexical database whose synonym sets (synsets) represent similarity of meaning between words. Unlike an ordinary dictionary, WordNet links each word to other related words. In this study, synonym sets are built from the Oxford Thesaurus, accessed through lexico.com, which serves as part of the lexical resource. An extraction process over the Oxford Thesaurus yields candidate synonym sets of words that share a meaning. These candidates are then clustered with the Robust Clustering Using Links (ROCK) method, which uses the similarity values and the extracted synonym sets to build the lexical database. The main goal of developing the English WordNet is to produce accurate synonym sets through clustering. Results are evaluated with the F-measure against a gold standard. With ROCK, the accuracy of the resulting English WordNet improved compared with the previous method, hierarchical clustering. A more accurate English WordNet can also serve as a reference model for the research and development of wordnets in other languages. These results indicate that ROCK is a suitable method for building the English WordNet.
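The clustering and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy thesaurus, the Jaccard similarity over each word's synonym list, the threshold `theta`, and the function names `link_counts`, `rock`, and `f_measure` are all assumptions made for illustration. The goodness measure follows the general ROCK formulation (cross-cluster links divided by the expected number of links), and the F-measure is computed for a single candidate synset against a single gold synset, which may differ from the scoring scheme used in the study.

```python
from itertools import combinations

# Hypothetical toy thesaurus (word -> synonyms); not data from the study.
THESAURUS = {
    "big": {"large", "huge", "great"},
    "large": {"big", "huge", "great"},
    "huge": {"big", "large", "great"},
    "small": {"little", "tiny"},
    "little": {"small", "tiny"},
    "tiny": {"small", "little"},
}


def jaccard(a, b):
    """Jaccard similarity between two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_counts(data, theta):
    """Map each word pair to its number of common neighbors (its 'links')."""
    # Assumption: compare each word's synonym list plus the word itself.
    closed = {w: syns | {w} for w, syns in data.items()}
    nbrs = {
        w: {v for v in closed if v != w and jaccard(closed[w], closed[v]) >= theta}
        for w in closed
    }
    return {
        frozenset((p, q)): len(nbrs[p] & nbrs[q])
        for p, q in combinations(sorted(closed), 2)
    }


def rock(data, theta, k):
    """Merge clusters greedily by ROCK goodness until k clusters remain.

    Assumes 0 <= theta < 1; theta == 1 would make the normalizer zero.
    """
    links = link_counts(data, theta)
    exponent = 1 + 2 * (1 - theta) / (1 + theta)
    clusters = [frozenset([w]) for w in sorted(data)]

    def goodness(a, b):
        cross = sum(links[frozenset((p, q))] for p in a for q in b)
        expected = (
            (len(a) + len(b)) ** exponent - len(a) ** exponent - len(b) ** exponent
        )
        return cross / expected

    while len(clusters) > k:
        a, b = max(combinations(clusters, 2), key=lambda pair: goodness(*pair))
        if goodness(a, b) == 0:
            break  # no remaining pair of clusters shares any link
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


def f_measure(candidate, gold):
    """Harmonic mean of precision and recall for one synset vs. its gold synset."""
    overlap = len(set(candidate) & set(gold))
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    for synset in rock(THESAURUS, theta=0.5, k=2):
        print(sorted(synset))
    print(f_measure({"big", "huge", "large"}, {"big", "large", "huge", "great"}))
```

Counting links rather than raw similarity means two words are merged only when they share neighbors, which makes the grouping less sensitive to a single spurious synonym pair.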
License
All articles in this journal are the full responsibility of their authors. Edumatic: Jurnal Pendidikan Informatika can be accessed free of charge, in accordance with the Creative Commons license used.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.