Building Synonym Sets for English WordNet with Robust Clustering using Links Method




F-measure, Gold Standard, Robust Clustering Using Links, WordNet


English WordNet is an important lexical resource whose synonym sets represent the similarity of meaning between words. In this study, the synonym sets are built from the Oxford Thesaurus, which forms part of the lexical database used. Extracting entries from the Oxford Thesaurus produces synonym sets of words that share a meaning. Unlike an ordinary dictionary, WordNet interconnects each word with other words. One method suited to this task is Robust Clustering Using Links (ROCK), which uses similarity values and the extracted synonym sets to build the lexical database. The main purpose of developing the English WordNet is therefore to produce accurate synonym sets using clustering techniques. The results are evaluated with the F-measure, calculated against a gold standard. With the ROCK method, the accuracy of the synonym sets produced from the input dataset increases compared with the previous method, hierarchical clustering. A more accurate English WordNet can also serve as a role model for the research and development of wordnets for other languages. These results indicate that ROCK is a good method for developing the English WordNet.
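To make the pipeline described in the abstract concrete, the following is a minimal Python sketch of ROCK-style clustering followed by an F-measure check against a gold-standard synset. It is an illustration under stated assumptions, not the study's implementation: the toy thesaurus entries, the Jaccard similarity, the threshold `theta`, the target cluster count `k`, the choice to count each word as its own neighbour, and the function names `jaccard`, `rock`, and `f_measure` are all hypothetical. The merge criterion uses the standard ROCK goodness measure, which divides the number of cross-cluster links by the expected number of links.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity between two sets of synonyms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rock(points, theta, k):
    """Cluster words (dict: word -> set of synonyms) into at most k clusters.

    Assumes 0 <= theta < 1; theta == 1 would make the denominator zero.
    """
    words = sorted(points)
    # Two words are neighbours when their similarity reaches theta.
    # Assumption of this sketch: every word is also its own neighbour.
    nbrs = {w: {v for v in words if jaccard(points[w], points[v]) >= theta}
            for w in words}
    # link(a, b) = number of neighbours that a and b have in common.
    link = {frozenset((a, b)): len(nbrs[a] & nbrs[b])
            for a, b in combinations(words, 2)}
    exponent = 1 + 2 * (1 - theta) / (1 + theta)

    def goodness(c1, c2):
        # Cross-cluster links, normalised by the expected number of links.
        cross = sum(link[frozenset((a, b))] for a in c1 for b in c2)
        n1, n2 = len(c1), len(c2)
        return cross / ((n1 + n2) ** exponent - n1 ** exponent - n2 ** exponent)

    clusters = [frozenset([w]) for w in words]
    while len(clusters) > k:
        best = max(combinations(clusters, 2), key=lambda pair: goodness(*pair))
        if goodness(*best) == 0:
            break  # no remaining pair of clusters shares any links
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return clusters

def f_measure(cluster, gold):
    """Harmonic mean of precision and recall of one cluster against one gold synset."""
    overlap = len(cluster & gold)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cluster), overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

# Hypothetical thesaurus extract: each word maps to itself plus its listed synonyms.
toy = {
    "big": {"big", "large", "huge"},
    "large": {"big", "large", "huge"},
    "huge": {"big", "large", "huge", "vast"},
    "quick": {"quick", "fast", "rapid"},
    "fast": {"quick", "fast", "rapid"},
    "rapid": {"quick", "fast", "rapid"},
}
clusters = rock(toy, theta=0.5, k=2)
for synset in clusters:
    print(sorted(synset))
```

Each resulting cluster can then be scored with `f_measure(cluster, gold_synset)`, where `gold_synset` is a manually validated synonym set. How clusters are matched to gold synsets and how the scores are averaged is a design choice left open in this sketch.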

Author Biography

Sarah Suryaningsih, Department of Informatics Engineering, Universitas Telkom


