Wordnets
Mustafa Jarrar
Birzeit University
Jarrar © 2025 1
Keywords:
Arabic, Arabic NLP, NLP, Natural Language Processing, Thesauri, Linguistic Ontology, Arabic Ontology, Wordnet, Arabic WordNet, EURO WordNet, Global WordNet, Lexical Semantics, Word Sense Disambiguation, Mental Lexicon, Synset, Concept, Gloss, Polysemy, Synonymy, Semantic Relations, Hyponymy, Meronymy (part-whole relations), Antonymy, Classification of Meanings, Ontology, Computational Semantics, Multilinguality, Language Computing
Mustafa Jarrar: Lecture Notes on Introduction to Wordnets
Birzeit University, 2025
Part 3: EuroWordnet
Part 4: Global Wordnet
Part 5: Discussion
Part 6: Practice
Reading
Why Lexical Semantic Resources?
Examples:
• Thesaurus of English Words and Phrases, Peter Mark Roget, 1883
• The Contemporary Arabic Thesaurus (المكنز العربي المعاصر), Mahmoud Ismail Sini, Nasif Mustafa Abdulaziz, and Mustafa Ahmad Suleiman, 1993
Thesaurus (مكنز) as a source of semantics
Part 2: WordNet
What is WordNet?
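(Arabic: شبكة مفردات, "network of words")
[Figure: example synsets, each shown with an ID and a gloss]
• {Table, Tabular Array}: a set of data arranged in rows and columns
• {Bureau, Dresser, Chest of Drawers}: furniture with drawers for keeping clothes
• {work table}
Synset IDs shown in the figure: 03018908, 04615793, 06501650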
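The synsets in this figure can be sketched as a small data structure. The following is a minimal illustration, not the actual WordNet storage format: the class name, field names, and the IDs "s1" to "s3" are hypothetical placeholders, because the extracted slide does not show which numeric ID belongs to which synset.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    synset_id: str   # placeholder ID, not a real WordNet identifier
    lemmas: tuple    # the synonymous words grouped in this synset
    gloss: str = ""  # short definition; empty when the slide shows none


# Synsets and glosses taken from the slide; IDs are made up for illustration.
SYNSETS = [
    Synset("s1", ("table", "tabular array"),
           "a set of data arranged in rows and columns"),
    Synset("s2", ("bureau", "dresser", "chest of drawers"),
           "furniture with drawers for keeping clothes"),
    Synset("s3", ("work table",)),
]


def build_index(synsets):
    """Map each lowercased lemma to the synsets that contain it."""
    index = defaultdict(list)
    for synset in synsets:
        for lemma in synset.lemmas:
            index[lemma.lower()].append(synset)
    return index


INDEX = build_index(SYNSETS)


def synonyms(word):
    """Return the other lemmas that share a synset with `word`."""
    key = word.lower()
    found = set()
    for synset in INDEX.get(key, []):
        found.update(l for l in synset.lemmas if l.lower() != key)
    return sorted(found)


print(synonyms("dresser"))  # ['bureau', 'chest of drawers']
```

A word that appears in more than one synset would simply have several entries in the index, which is how polysemy shows up in this kind of structure.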
WordNet Relations: Another Example
[Vossen]
[Figure: relations around the synset {car, auto, automobile, machine, motorcar}. Surviving nodes: {bumper}, {car window}, {hinge, flexible joint}. Surviving relation label: hyponym]
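As a rough sketch of how such relations might be stored, each relation can be written as a typed edge between two synsets. The relation name "has_part" and the two edges below are assumptions made for illustration: only the "hyponym" label survives from the figure, and each synset is abbreviated here to a single lemma.

```python
# Each relation is a (source, relation_type, target) triple.
# "has_part" is an assumed label for the part-whole links in the figure.
RELATIONS = [
    ("car", "has_part", "bumper"),
    ("car", "has_part", "car window"),
]


def related(source, relation):
    """Targets reachable from `source` through one edge of type `relation`."""
    return [t for s, r, t in RELATIONS if s == source and r == relation]


def inverse(target, relation):
    """Sources that point to `target` through one edge of type `relation`."""
    return [s for s, r, t in RELATIONS if t == target and r == relation]


print(related("car", "has_part"))     # ['bumper', 'car window']
print(inverse("bumper", "has_part"))  # ['car']
```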
Other WordNet Relations
Is WordNet a Thesaurus?
Yes:
• it groups together meaningfully related words
and more:
Ontological Precision:
WordNet: based on what native speakers roughly agree on.
Ontology: based on scientific and philosophical findings.
Classification:
WordNet: based on what native speakers roughly agree on (Student IsA Person)
Ontology: based on strict formal methodologies (Student IsA Role)
Formal Specification:
WordNet: logically vague (and contains concepts without instances)
Ontology: strictly formal (every concept can be instantiated)
Examples of ontological matters in WordNet
Example problems in WordNet that have limited its use in IT applications:
• (Nile Is-a River) is a formal mistake; Nile is an instance of River.
• (Student Is-a Person) is ontologically incorrect; Student is a Role.
• (Italy Is-a Land5) and (Italy Is-a Nation) are ontologically incorrect together; Italy cannot be
subsumed by the two disjoint concepts Land5 and Nation at the same time.
• (Reflate2 Is-a Inflate3), (Inflate3 Is-a Change1), and (Reflate2 Is-a Change1): the third
relation is meaningless to state, since it is already implied by the first two.
• (Restrain1 Is-a Inhibit4) and (Inhibit4 Is-a Restrain1) form a cycle.
• (Islamic Month Is-a Month) is inaccurate; Month is one of the twelve divisions of the
Gregorian year (i.e., 30.43 days on average), but an Islamic month is 29.53 days.
• Treating the Morning Star and the Evening Star as different stars is inaccurate. They are the
same instance (i.e., Venus), which people see on different occasions.
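Two of the problems listed above, cycles and implied (redundant) relations, can be detected mechanically in an Is-a hierarchy. The following is a minimal sketch, assuming the hierarchy is available as a list of (child, parent) pairs; the function names are hypothetical, and the edges are the ones quoted in the examples above.

```python
from collections import defaultdict

# Is-a edges quoted in the examples above, written as (child, parent).
EDGES = [
    ("Reflate2", "Inflate3"),
    ("Inflate3", "Change1"),
    ("Reflate2", "Change1"),
    ("Restrain1", "Inhibit4"),
    ("Inhibit4", "Restrain1"),
]


def build_graph(edges):
    """Map each child to the set of its direct parents."""
    graph = defaultdict(set)
    for child, parent in edges:
        graph[child].add(parent)
    return graph


def ancestors(graph, node, skip_edge=None):
    """All concepts reachable from `node` via Is-a, optionally ignoring one edge."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent in graph.get(current, ()):
            if (current, parent) == skip_edge or parent in seen:
                continue
            seen.add(parent)
            stack.append(parent)
    return seen


def find_cycles(graph):
    """Concepts that are their own ancestor, i.e. that lie on an Is-a cycle."""
    return sorted(n for n in list(graph) if n in ancestors(graph, n))


def find_implied(edges, graph):
    """Edges whose parent is still reachable after removing the edge itself."""
    return [(c, p) for c, p in edges
            if p in ancestors(graph, c, skip_edge=(c, p))]


GRAPH = build_graph(EDGES)
print(find_cycles(GRAPH))          # ['Inhibit4', 'Restrain1']
print(find_implied(EDGES, GRAPH))  # [('Reflate2', 'Change1')]
```

The other problems in the list (instance versus class, role versus type, disjointness) need ontological judgment and cannot be found by graph traversal alone.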
From thesaurus to wordnet to linguistic ontology
Part 3: EuroWordnet
EURO WordNet [Vossen]
• Languages covered:
EuroWordNet-1 (LE2-4003): English, Dutch, Spanish, Italian.
EuroWordNet-2 (LE4-8328): German, French, Czech, Estonian.
• Size of vocabulary:
EuroWordNet-1: 30,000 concepts - 50,000 word meanings.
EuroWordNet-2: 15,000 concepts - 25,000 word meanings.
• Type of vocabulary:
the most frequent words of the languages
all concepts needed to relate more specific concepts.
EURO WordNet Model
[Vossen]
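[Figure: EuroWordNet architecture. Surviving labels: Domains (e.g., Traffic), Ontology (e.g., 2OrderEntity), two Lexical Items Tables with the entries move, go (English) and bewegen, gaan (Dutch), and link labels I, II, III]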
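EuroWordNet connects its language-specific wordnets through a shared, language-neutral index (the Inter-Lingual-Index), so words in two languages are related by pointing to the same index record. The sketch below illustrates that idea only: the record identifiers "ili:move" and "ili:go" and the function name are hypothetical, and the word pairs are the ones that survive from the figure.

```python
# Each language-specific table maps a word to a shared index record.
# Record identifiers are made up for illustration.
ILI_LINKS = {
    "en": {"move": "ili:move", "go": "ili:go"},
    "nl": {"bewegen": "ili:move", "gaan": "ili:go"},
}


def translate(word, source, target):
    """Words in `target` linked to the same index record as `word` in `source`."""
    record = ILI_LINKS.get(source, {}).get(word)
    if record is None:
        return []
    return sorted(w for w, r in ILI_LINKS.get(target, {}).items()
                  if r == record)


print(translate("go", "en", "nl"))       # ['gaan']
print(translate("bewegen", "nl", "en"))  # ['move']
```

Because every lookup goes through the shared records, two wordnets can only be compared this way when they are linked to the same version of the index.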
Some Downsides of the EuroWordNet Model
[Vossen]
• Coverage differs
• Not all wordnets can communicate with one another, because they are linked
to different versions of the English WordNet
Part 4: Global Wordnet
From EuroWordNet to Global WordNet
http://www.globalwordnet.org
Arabic WordNet
• A literal, ad hoc translation of 10,000 English synsets that was
never extended.
• The base concepts were then extended mostly downwards with more
specific concepts, and upwards with more general concepts, to improve
the maximal connectivity of those base concepts.
Part 5: Discussion
Part 6: Practice
References
[Vossen] Piek Vossen: From WordNet, to EuroWordNet, to the Global Wordnet Grid. Lecture notes.
[MBC93] George A. Miller, Richard Beckwith, Christiane Fellbaum, Derek Gross, and Katherine Miller: Introduction to WordNet: An On-line Lexical Database. International Journal of Lexicography, Vol. 3, Nr. 4. Pages 235-244. 1990
[J21] Mustafa Jarrar: The Arabic Ontology - An Arabic Wordnet with Ontologically Clean Content. Applied Ontology Journal, 16:1, 1-26. IOS Press. 2021
1. Mustafa Jarrar, Tymaa Hammouda: Qabas: An Open-Source Arabic Lexicographic Database. In Proceedings of LREC-COLING 2024, pages 13363–13370, Torino, Italia. ELRA and ICCL.
2. Mustafa Jarrar, Sanad Malaysha, Tymaa Hammouda, Mohammed Khalilia: SALMA: Arabic Sense-Annotated Corpus and WSD Benchmarks. Proceedings of the 1st ArabicNLP, Part of the ACL 2023. ACL.
3. Sana Ghanem, Mustafa Jarrar, Radi Jarrar, Ibrahim Bounhas: A Benchmark and Scoring Algorithm for Enriching Arabic Synonyms. In Proceedings of GWC2023, (pp.274-283). Spain, 2023
4. Sanad Malaysha, Mustafa Jarrar, Mohammed Khalilia: Context-Gloss Augmentation for Improving Arabic Target Sense Verification. In Proceedings of GWC2023, (pp.274-283). Spain, 2023
5. Moustafa Al-Hajj, Mustafa Jarrar: ArabGlossBERT: Fine-Tuning BERT on Context-Gloss Pairs for WSD. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021). Pages 40-48, 2021
6. Moustafa Al-Hajj, Mustafa Jarrar: LU-BZU at SemEval-2021 Task 2: Word2Vec and Lemma2Vec performance in Arabic Word-in-Context disambiguation. In Proceedings of the Fifteenth Workshop on Semantic Evaluation (SemEval2021) Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC). Pages 748-755, Association for Computational Linguistics. 2021
7. Mustafa Jarrar, Eman Karajah, Muhammad Khalifa, Khaled Shaalan: Extracting Synonyms from Bilingual Dictionaries. The 11th International Global Wordnet Conference (GWC2021), Global Wordnet
Association. (pp. 215-222). Pretoria, South Africa, 2021
8. Mustafa Jarrar, Hamzeh Amayreh: An Arabic-Multilingual Database with a Lexicographic Search Engine. The 24th International Conference on Applications of Natural Language to Information Systems
(NLDB 2019). Pages(234-246). LNCS 11608, Springer. 2019
9. Mustafa Jarrar, Hamzeh Amayreh, John P. McCrae: Representing Arabic Lexicons in Lemon - a Preliminary Study. The 2nd Conference on Language, Data and Knowledge (LDK 2019). Pages(29-33). CEUR,
Volume 2402. ISSN:1613-0073. Leipzig, Germany. 2019
10. Diana Alhafi, Anton Deik, Mustafa Jarrar: Usability Evaluation of Lexicographic e-Services. The 16th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA). Pages(1-7). IEEE. Abu
Dhabi, UAE. 2019
11. Mustafa Jarrar, Fadi Zaraket, Rami Asia, Hamzeh Amayreh: Diacritic-Based Matching of Arabic Words. ACM Asian and Low-Resource Language Information Processing. Volume 18, No 2, Pages(10:1-10:21),
ACM, ISSN:2375-4699. December, 2018
12. Mustafa Jarrar, Werner Ceusters: Classifying Processes and Basic Formal Ontology. Proceedings of the 8th International Conference on Biomedical Ontology (ICBO 2017), Newcastle, UK. 2017
13. Mustafa Jarrar: Building a Formal Arabic Ontology (Invited Paper). In proceedings of the Experts Meeting on Arabic Ontologies and Semantic Networks. Alecso, Arab League. Tunis, July 26-28, 2011.
14. Mustafa Jarrar: Towards the notion of gloss, and the adoption of linguistic resources in formal ontology engineering. In proceedings of the 15th International World Wide Web Conference (WWW2006).
Edinburgh, Scotland. Pages 497-503. ACM Press. ISBN: 1595933239. May 2006.
15. Mustafa Jarrar, Anton Deik, Bilal Faraj: Ontology-based Data and Process Governance Framework -The Case of e-Government Interoperability in Palestine. Proceedings of the IFIP International Symposium
on Data-Driven Process Discovery and Analysis (SIMPDA’11). Pages(83-98). 2011.
16. Mustafa Jarrar and Robert Meersman: Ontology Engineering -The DOGMA Approach. Book Chapter in "Advances in Web Semantics I". Chapter 3. Pages 7-34. LNCS 4891, Springer. (2008).
17. Mustafa Jarrar: Tutorial on Arabic Ontology Engineering. The ACS/IEEE International Conference on Computer Systems and Applications. Tunis, 2017
18. Mustafa Jarrar, Maria Keet, and Paolo Dongilli: Multilingual verbalization of ORM conceptual models and axiomatized ontologies. Technical report. STARLab, Vrije Universiteit Brussel, February 2006.
19. Mustafa Jarrar: Mapping ORM into the SHOIN/OWL Description Logic - Towards a Methodological and Expressive Graphical Notation for Ontology Engineering. In OTM 2007 workshops: Proceedings of the International Workshop on Object-Role Modeling (ORM'07). Pages (729-741), LNCS 4805, Springer. ISBN: 9783540768890. Portugal. November, 2007