
VRLLab at HSD-2Lang 2024: Turkish Hate Speech Detection Online with TurkishBERTweet

Ali Najafi (1), Onur Varol (1,2,*)

(1) Faculty of Engineering and Natural Sciences, Sabanci University
(2) Center of Excellence in Data Analytics, Sabanci University
(*) Corresponding author

{ali.najafi, onur.varol}@sabanciuniv.edu

Abstract

Social media platforms like Twitter, recently rebranded as X, produce nearly half a billion tweets daily and host a significant number of users who can be affected by content that is not properly moderated. In this work, we present an approach that ranked third in subtask A of the HSD-2Lang 2024 competition, along with additional methodology developed for this task and an evaluation of different approaches. We utilize three different models, and the best-performing approach uses the publicly available TurkishBERTweet model with low-rank adaptation (LoRA) for fine-tuning. We also experiment with another publicly available model and a novel methodology to ensemble different hand-crafted features and the outcomes of different models. Finally, we report the experimental results and competition scores, and discuss how to improve this effort further.

1 Introduction

Despite the significant opportunities presented by the use of social media, these platforms are shifting towards more hostile environments, especially for marginalized groups. Social networks have been used to access information efficiently (Aral et al., 2009; Wang et al., 2022), participate in important societal events (Bas et al., 2022; Ogan and Varol, 2017), and discuss political issues online (Varol et al., 2014; Tufekci, 2017; Jackson et al., 2020). The increasing popularity of social networks and the opportunity they present to reach millions of individuals simultaneously have made these platforms vulnerable to manipulation of discourse by bad actors who utilize automated accounts (Ferrara et al., 2016; Varol et al., 2017), spread disinformation (Mosleh and Rand, 2022; Keller et al., 2020), and coordinate targeted attacks (Shao et al., 2018; Varol and Uluturk, 2020). These targeted attacks can be coordinated or organic, and their targets are mostly minority and vulnerable groups. To protect vulnerable groups and improve their experience in the online sphere, researchers develop systems to automatically identify these activities, and platforms build systems to moderate content and accounts. Hate speech detection is the task of identifying hateful content aimed at groups such as refugees or individuals with certain beliefs or ethnicities (Waseem and Hovy, 2016; Zhang and Luo, 2019; MacAvaney et al., 2019). In this work, we demonstrate our approach as part of the HSD-2Lang 2024 challenge to detect hate speech from the textual information presented in social media posts.

2 Data

This challenge is organized in collaboration with the Hrant Dink Foundation as part of their ongoing project "Media Watch on Hate Speech." Collaborative efforts of computational and social scientists defined hate speech on social media and carried out a detailed procedure to annotate posts around specific topics and keywords. The dataset provided in this competition contains 9,140 tweets in the context of the Israel-Palestine and Turkish-Greek conflicts, as well as content produced against refugees and immigration (Uludogan et al., 2024).

We preprocessed the dataset by removing samples with inconsistent ground truth information (the exact same text with different labels) and applying deduplication, resulting in 8,805 tweets. Figure 1 shows the word and character length distributions. Considering the ground-truth labels, we measure that 30.5% of the dataset contains hate speech, indicating an imbalance between the two classes. Since the dataset only contains the textual information presented in each tweet, we further processed the tweets to take platform-specific features into account.
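The deduplication and consistency filtering described above can be reproduced with a few lines of pandas. The snippet below is a minimal sketch under assumed column names ("text" and a binary 0/1 "label") and an assumed file name; it is not the released competition code:

```python
import pandas as pd

df = pd.read_csv("hsd2lang_train.csv")  # hypothetical file with "text", "label"

# Deduplication: keep one copy of each (text, label) pair.
df = df.drop_duplicates(subset=["text", "label"])

# Remove samples with inconsistent ground truth: the exact same text
# annotated with more than one label.
labels_per_text = df.groupby("text")["label"].nunique()
consistent_texts = labels_per_text[labels_per_text == 1].index
df = df[df["text"].isin(consistent_texts)].reset_index(drop=True)

print(len(df), "tweets remain")                 # the paper reports 8,805
print(f"{df['label'].mean():.1%} hate speech")  # ~30.5%, assuming 0/1 labels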
[Figure 1: Tweet statistics. Distributions of word count (left) and character length (right) for the dataset. Character lengths exhibit Twitter-specific limitations, while some tweets may contain fewer words and possibly consist of hashtags.]

Removal of hyperlinks and mentions of other accounts in the tweets. This information could be valuable if we had a chance to process real-time data by scraping external web content or using profile information of accounts from Twitter's API; however, these fields are omitted in the dataset. Since we do not incorporate them into our analysis, we omit them from the dataset.

Preprocessing pipeline for the TurkishBERTweet model. We consider different special tags for Twitter-specific entities and translate the Unicode characters of emojis into words describing their meaning, using the preprocessor created for the TurkishBERTweet project (Najafi and Varol, 2023).
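The two preprocessing steps above can be approximated in a few lines. This is a rough stand-in using regular expressions and the third-party emoji package, not the actual TurkishBERTweet preprocessor:

```python
import re
import emoji  # pip install emoji

def preprocess(text: str) -> str:
    """Remove hyperlinks and mentions, and turn emojis into descriptive words."""
    text = re.sub(r"https?://\S+", " ", text)           # strip hyperlinks
    text = re.sub(r"@\w+", " ", text)                   # strip account mentions
    text = emoji.demojize(text, delimiters=(" ", " "))  # e.g. a fire emoji -> " fire "
    return re.sub(r"\s+", " ", text).strip()

print(preprocess("selam @kullanici https://t.co/abc"))  # -> "selam"
```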
3 Methodologies

In this challenge, we built different approaches. We considered not only the textual data for fine-tuning models, but also incorporated additional signals obtained from the text and from blacklisted-word dictionaries. Here, we present the language models used as the foundation and the additional features we extracted to improve model performance. For the competition, we submitted the model with the best public leaderboard score; however, one of our approaches achieved an even higher score in the private evaluation. We present all approaches and their respective performances in the results section.

TurkishBERTweet (https://huggingface.co/VRLLab/TurkishBERTweet) is a new language model that was specifically trained on nearly 894M Turkish tweets, and it offers a special tokenizer that takes social media entities such as hashtags and emojis into account. We fine-tune this model with LoRA (Hu et al., 2021), a novel way of fine-tuning LLMs efficiently, and recent research reports state-of-the-art performance and generalizability for this model (Najafi and Varol, 2023).

BERTurk (https://huggingface.co/dbmdz/bert-base-turkish-128k-uncased) is a pre-trained model that utilizes a large-scale corpus from various sources. It is a well-known model in the Turkish NLP community (Schweter, 2020).

Ensemble of models (EoM) combines the outputs of the aforementioned hate speech models with custom features extracted for this task. These additional features consist of i) logit scores retrieved from an emotion classifier based on a BERT-base model fine-tuned for emotion analysis (https://huggingface.co/maymuni/bert-base-turkish-cased-emotion-analysis), ii) logit scores of a sentiment classifier using the TurkishBERTweet sentiment analysis model, and iii) a collection of Turkish blacklisted words (https://github.com/ooguz/turkce-kufur-karaliste) used for token-level features such as a binary exact-match feature, Levenshtein distance, hashtag exact match, and hashtag Levenshtein distance. These features are concatenated, resulting in 16 features for a RandomForest classifier with 100 estimators trained to optimize Gini impurity. Since the outputs of ensemble models for imbalanced datasets can be biased, we calibrated the outputs of the model using Platt's scaling so that the output scores can be interpreted as probabilities (Niculescu-Mizil and Caruana, 2005).
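For the two LoRA fine-tuned models above, a minimal sketch with the HuggingFace peft library looks as follows. The LoRA hyperparameters and target modules are illustrative assumptions, not our reported competition configuration:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base = "VRLLab/TurkishBERTweet"  # or "dbmdz/bert-base-turkish-128k-uncased"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                   # rank of the low-rank update matrices (illustrative)
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "value"],  # attention projections in BERT-style models
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights is trainable
# The wrapped model can then be fine-tuned with the standard Trainer API.
```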
4 Results

This section presents the experimental evaluation of the approaches we tested on the dataset using stratified 5-fold cross-validation. We also report the performance of the models we submitted to the challenge for comparison. As Table 1 demonstrates, the Ensemble of Models (EoM) achieves the best performance compared to the other approaches when all models are evaluated with 5-fold cross-validation. The TurkishBERTweet+LoRA model achieved the best private score, which led us to the third-best rank, although we observed lower performance than the EoM model in the cross-validated experiments. The BERTurk+LoRA model performed similarly to the TurkishBERTweet model in the 5-fold setting; however, it led to a lower private score. We suspect that the BERTurk model, with standard or LoRA fine-tuning, was also used by other teams, considering the popularity and availability of that model.

Considering the performance differences between the public and private leaderboards, the EoM demonstrates less variability than the other two approaches. Even though it is not our best-performing model in both settings, we may consider it for our research projects, since the cross-validated scores point to better performance and the leaderboard score differences are negligible and can be due to noise in the test set of the competition.
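A compact sketch of the EoM classifier and the evaluation protocol behind the F1-Weighted column of Table 1: a 16-feature RandomForest calibrated with Platt's scaling, scored with stratified 5-fold cross-validation. Feature extraction is abstracted away, and the file names are placeholders:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import f1_score

# X: (n_samples, 16) matrix concatenating, per tweet, the logits of the two
# fine-tuned hate speech models, the emotion and sentiment classifier logits,
# and the four blacklist features (exact match, Levenshtein distance,
# hashtag exact match, hashtag Levenshtein distance). Placeholder files:
X = np.load("eom_features.npy")
y = np.load("labels.npy")

scores = []
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, test_idx in skf.split(X, y):
    rf = RandomForestClassifier(n_estimators=100, criterion="gini")
    # Platt's scaling: fit a sigmoid on internal folds so the forest's
    # scores can be read as probabilities despite the class imbalance.
    clf = CalibratedClassifierCV(rf, method="sigmoid", cv=5)
    clf.fit(X[train_idx], y[train_idx])
    preds = clf.predict(X[test_idx])
    scores.append(f1_score(y[test_idx], preds, average="weighted"))

print(f"F1-weighted: {np.mean(scores):.4f} ± {np.std(scores):.4f}")
```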

Table 1: Model comparisons. Weighted F1-score of the models in a 5-fold cross-validation setting, alongside the public and private leaderboard scores. Best scores are presented in bold font, and more than one model is highlighted when the difference is not significant.

Model                  | F1-Weighted     | Public Score | Private Score
TurkishBERTweet+LoRA   | 0.8137 ± 0.0059 | 0.70697      | 0.66431
BERTurk+LoRA           | 0.8132 ± 0.0054 | 0.70476      | 0.64944
Ensemble of Models     | 0.8941 ± 0.0073 | 0.68544      | 0.66103

We also conducted an error analysis to identify the misclassifications our model makes. This effort can reveal additional features we could implement, as well as issues in the labeled dataset. Table 2 shows examples of wrongly classified tweets. We first focus on false negatives, since we can learn from these mistakes to improve our model. For instance, we could split hashtags into words to handle cases like #ülkemdemülteciistemiyorum (Turkish for #wedontwantrefugees) or handle popular hashtags differently. Regarding false positives, we noticed that our model correctly classifies these tweets as hate speech based on our own judgment. We suspect the existence of mistakes in the ground-truth labels, considering the examples presented in Table 2, where we highlight the words within the tweets that we suspect indicate mislabeling.
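One way to implement the hashtag-splitting idea is greedy longest-match segmentation against a Turkish word list. The sketch below is illustrative and was not part of our submitted system; vocab stands in for any Turkish lexicon:

```python
def split_hashtag(tag: str, vocab: set) -> list:
    """Greedy longest-match segmentation of a hashtag body against a word list."""
    tag = tag.lstrip("#").lower()
    words, i = [], 0
    while i < len(tag):
        for j in range(len(tag), i, -1):          # try the longest word first
            if tag[i:j] in vocab or j == i + 1:   # fall back to a single character
                words.append(tag[i:j])
                i = j
                break
    return words

vocab = {"ülkemde", "mülteci", "istemiyorum"}
print(split_hashtag("#ülkemdemülteciistemiyorum", vocab))
# -> ['ülkemde', 'mülteci', 'istemiyorum']
```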
5 Discussion

In the provided dataset, we noticed tweets written in languages other than Turkish, such as Arabic and Hebrew. This could be an artifact of the data collection process, and one can consider i) language-level features, ii) filtering out such tweets, or iii) obtaining representations from LLMs. Furthermore, a study of the annotators' influence on annotation quality for hate speech datasets shows that the expertise of annotators positively influences data quality (Waseem, 2016). Considering the annotators' influence, applying impurity analysis by randomly or strategically changing the annotations and monitoring the hate speech system's performance could be a good practice.

Moreover, in this competition, we only consider the text data to detect the existence of hate speech. Infusing account information, such as the number of followers, number of accounts followed, and account creation date, into these systems could make them more accurate and reliable.

Another approach for improving the performance of these systems is to expose pre-trained models to hateful content by further masked-language modeling on the hate speech dataset, as Caselli et al. (2020) presented in their recent work, which improved their system's performance (see the sketch at the end of this section).

Multilingual models could also be utilized for this challenge: since Turkish is a low-resource language, a model can benefit from other languages' hate speech datasets to infuse broader knowledge of hate speech and thereby obtain better performance (Röttger et al., 2022).

Recently, commercial models like ChatGPT have been used in various challenges. Huang et al. (2023) suggest that ChatGPT demonstrates high accuracy and can be considered an alternative to human annotators in detecting implicit hate speech (Gilardi et al., 2023). Other work has also investigated the performance of LLMs for hate speech or offensive language detection in English (Guo et al., 2024), Portuguese (Oliveira et al., 2023), and Turkish (Çam and Özgür, 2023). However, we want to raise a concern about the adversarial use of these models to attack vulnerable groups and bypass detection systems. Additional information about accounts, network structure, and temporal activities should be incorporated into detection systems to address this risk.
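The further masked-language-modeling idea referenced above could be sketched with the HuggingFace Trainer as follows, in the spirit of HateBERT (Caselli et al., 2020). The file name and hyperparameters are illustrative assumptions:

```python
from transformers import (AutoModelForMaskedLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import Dataset

# Continue masked-language-model pretraining on the (unlabeled) hate speech
# tweets before fine-tuning for classification.
tokenizer = AutoTokenizer.from_pretrained("VRLLab/TurkishBERTweet")
model = AutoModelForMaskedLM.from_pretrained("VRLLab/TurkishBERTweet")

texts = Dataset.from_dict({"text": open("tweets.txt").read().splitlines()})
tokenized = texts.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"],
)

collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mlm-adapted", num_train_epochs=1,
                           per_device_train_batch_size=32),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```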
6 Conclusion

In this challenge, the collective effort of the research teams points to best practices and demonstrates the capabilities of state-of-the-art models. Here, we demonstrated different approaches and their respective performances in detecting online hate speech toward three different groups. We obtained the third rank in the final leaderboard of the competition with the TurkishBERTweet+LoRA model.

We hope language models like TurkishBERTweet will be used in different downstream tasks on Turkish social media. Research efforts especially need to assess the online participation of minority groups. There is a significant need for publicly available models, since the quality of content moderation and the use of automated accounts on platforms like X have been questionable after the acquisition of Twitter (Varol, 2023a; Hickey et al., 2023).
Table 2: Misclassification analysis. We explored the errors of our model to further improve our approach (studying false negatives) and to investigate issues with the ground-truth dataset (pointing to false positives). Here, we select instances where our model produces the correct outcome, but the annotation process suggests otherwise. In the original paper, the text that we believe suggests hate speech is colored red; English translations are given in brackets.

False positives (model predicts hate speech; labeled as not hate speech):

• #Katilİsrail [URL] [English: #MurdererIsrael]

• Hükümet Cumhurbaşkanı Erdoğan Şerefsiz Suriyeliler Yağma Sizler şu an hem suç hem cinayet işliyorsunuz. İnsanlar Twitter ı kullanmak için VPN kullanıyor ve VPN mobil cihazların şarj süresini oldukça azaltıyor. Tarihe böyle geçeceksiniz. [English: Government, President Erdoğan, dishonorable Syrians, looting. Right now you are committing both crime and murder. People use a VPN to access Twitter, and VPNs significantly reduce the battery life of mobile devices. This is how you will go down in history.]

• onursuz ırkıcılar kökünüz kurusun lanet olsun size evet kürdüz türküz ermeniyiz afgan'ız arabız ırkcı itler geberin lan bu ülke hepimizin # #hepimizkürdüz [English: dishonorable racists, may your line wither, damn you; yes, we are Kurds, Turks, Armenians, Afghans, Arabs; racist dogs, drop dead; this country belongs to all of us # #weareallkurds]

• İnsanlık yapıp ülkeye alıyorsun hainlik, bu zor günde yağmacılık yapıyorlar. Bazı şeref yoksunu suriyeliler yüzünden masum olan insanlar arada kaynıyor. Açıkçası #ülkemdemülteciistemiyorum! Allah herkesin yardımcısı olsun yardıma ihtiyacı olana koşulsun ama ülkemi terketsinler. [URL] [English: You act humanely and take them into the country, and they respond with treachery; they loot in these hard times. Because of some honorless Syrians, innocent people get caught in the middle. Frankly, #ülkemdemülteciistemiyorum (#idontwantrefugeesinmycountry)! May God help everyone, and may help reach those in need, but they should leave my country.]

False negatives (model predicts not hate speech; labeled as hate speech):

• #UELKEMDEMUELTECİİSTEMİYORUM [URL] [English: #IDONTWANTREFUGEESINMYCOUNTRY]

• Heryerde bilim uzmanı ve yer bilimci prof hocalar. Gerçeği açıklıyor. Sonra unutulup, açgözlü, rantçı, yağmacı yöneticiler soyguna devam eder. 3 yıllık bina yıkılmış, 3 yıl. #deprem #earthquake #Yağmacılar. [English: Science experts and geoscience professors everywhere, explaining the truth. Then it is forgotten, and the greedy, rent-seeking, looting administrators continue the robbery. A 3-year-old building has collapsed; 3 years. #deprem #earthquake #Yağmacılar (#Looters).]

• sayıları 8 milyon olan suriyeli, afgan, irak ne varsa çok acil ülkelerine geri gönderilmeli. *güvenlik tehdidi oluşturuyorlar. *işsizlik sorunu oluşturuyorlar. bill gates #billgates #sedatpeker10 [English: the Syrians, Afghans, Iraqis, whoever they are, numbering 8 million, must very urgently be sent back to their countries. *they pose a security threat. *they create an unemployment problem. bill gates #billgates #sedatpeker10]

Publicly available models will help researchers monitor these platforms more closely and even help them develop models to protect vulnerable groups.

Pre-trained models available online or developed through challenges can easily be adapted for other projects. Publicly available datasets like #Secim2023 can be used to study political discourse (Pasquetto et al., 2020; Najafi et al., 2022; Varol, 2023b), and models can be utilized to study these datasets. The TurkishBERTweet approach that we used is publicly available on the HuggingFace platform, along with the LoRA adapters for different tasks (Najafi and Varol, 2023).

Open source models: The TurkishBERTweet model used in this challenge is available online on the HuggingFace platform: https://huggingface.co/VRLLab/TurkishBERTweet

Acknowledgements: We thank Hasan Kemik for discussing and supporting the challenge. We thank TUBITAK (121C220 and 222N311) for partially funding this project. The TurkishBERTweet model was trained and made publicly available thanks to the Google Cloud Research Credits program with the award GCP19980904.

References

Sinan Aral, Lev Muchnik, and Arun Sundararajan. 2009. Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proceedings of the National Academy of Sciences, 106(51):21544–21549.

Ozen Bas, Christine L Ogan, and Onur Varol. 2022. The role of legacy media and social media in increasing public engagement about violence against women in Turkey. Social Media + Society, 8(4):20563051221138939.

Nur Bengisu Çam and Arzucan Özgür. 2023. Evaluation of ChatGPT and BERT-based models for Turkish hate speech detection. In Intl. Conf. on Computer Science and Engineering, pages 229–233. IEEE.

Tommaso Caselli, Valerio Basile, Jelena Mitrović, and Michael Granitzer. 2020. HateBERT: Retraining BERT for abusive language detection in English. arXiv preprint arXiv:2010.12472.

Emilio Ferrara, Onur Varol, Clayton Davis, Filippo Menczer, and Alessandro Flammini. 2016. The rise of social bots. Communications of the ACM, 59(7):96–104.

Fabrizio Gilardi, Meysam Alizadeh, and Maël Kubli. 2023. ChatGPT outperforms crowd-workers for text-annotation tasks. arXiv preprint arXiv:2303.15056.

Keyan Guo, Alexander Hu, Jaden Mu, Ziheng Shi, Ziming Zhao, Nishant Vishwamitra, and Hongxin Hu. 2024. An investigation of large language models for real-world hate speech detection. arXiv preprint arXiv:2401.03346.

Daniel Hickey, Matheus Schmitz, Daniel Fessler, Paul E Smaldino, Goran Muric, and Keith Burghardt. 2023. Auditing Elon Musk's impact on hate speech and bots. In Proc. of the Intl. AAAI Conf. on Web and Social Media, volume 17, pages 1133–1137.

Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2021. LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685.

Fan Huang, Haewoon Kwak, and Jisun An. 2023. Is ChatGPT better than human annotators? Potential and limitations of ChatGPT in explaining implicit hate speech. arXiv preprint arXiv:2302.07736.

Sarah J Jackson, Moya Bailey, and Brooke Foucault Welles. 2020. #HashtagActivism: Networks of race and gender justice. MIT Press.

Franziska B Keller, David Schoch, Sebastian Stier, and JungHwan Yang. 2020. Political astroturfing on Twitter: How to coordinate a disinformation campaign. Political Communication, 37(2):256–280.

Sean MacAvaney, Hao-Ren Yao, Eugene Yang, Katina Russell, Nazli Goharian, and Ophir Frieder. 2019. Hate speech detection: Challenges and solutions. PLoS One, 14(8):e0221152.

Mohsen Mosleh and David G Rand. 2022. Measuring exposure to misinformation from political elites on Twitter. Nature Communications, 13(1):7144.

Ali Najafi, Nihat Mugurtay, Ege Demirci, Serhat Demirkiran, Huseyin Alper Karadeniz, and Onur Varol. 2022. #Secim2023: First public dataset for studying Turkish general election. arXiv preprint arXiv:2211.13121.

Ali Najafi and Onur Varol. 2023. TurkishBERTweet: Fast and reliable large language model for social media analysis. arXiv preprint arXiv:2311.18063.

Alexandru Niculescu-Mizil and Rich Caruana. 2005. Predicting good probabilities with supervised learning. In Proc. of the Intl. Conf. on Machine Learning, pages 625–632.

Christine Ogan and Onur Varol. 2017. What is gained and what is left to be done when content analysis is added to network analysis in the study of a social movement: Twitter use during Gezi Park. Information, Communication & Society, 20(8):1220–1238.

Amanda S Oliveira, Thiago C Cecote, Pedro HL Silva, Jadson C Gertrudes, Vander LS Freitas, and Eduardo JS Luz. 2023. How good is ChatGPT for detecting hate speech in Portuguese? In Anais do XIV Simpósio Brasileiro de Tecnologia da Informação e da Linguagem Humana, pages 94–103. SBC.

Irene V Pasquetto, Briony Swire-Thompson, Michelle A Amazeen, Fabrício Benevenuto, Nadia M Brashier, Robert M Bond, Lia C Bozarth, Ceren Budak, Ullrich KH Ecker, Lisa K Fazio, et al. 2020. Tackling misinformation: What researchers could do with social media data. The Harvard Kennedy School Misinformation Review.

Paul Röttger, Haitham Seelawi, Debora Nozza, Zeerak Talat, and Bertie Vidgen. 2022. Multilingual HateCheck: Functional tests for multilingual hate speech detection models. arXiv preprint arXiv:2206.09917.

Stefan Schweter. 2020. BERTurk - BERT models for Turkish.

Chengcheng Shao, Giovanni Luca Ciampaglia, Onur Varol, Kai-Cheng Yang, Alessandro Flammini, and Filippo Menczer. 2018. The spread of low-credibility content by social bots. Nature Communications, 9(1):1–9.

Zeynep Tufekci. 2017. Twitter and tear gas: The power and fragility of networked protest. Yale University Press.

Gokce Uludogan, Somaiyeh Dehghan, Inanc Arin, Elif Erol, Berrin Yanıkoglu, and Arzucan Ozgur. 2024. Overview of the Hate Speech Detection in Turkish and Arabic Tweets (HSD-2Lang) shared task at CASE 2024. In Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE).

Onur Varol. 2023a. Should we agree to disagree about Twitter's bot problem? Online Social Networks and Media, 37:100263.

Onur Varol. 2023b. Who follows Turkish presidential candidates in 2023 elections? In Signal Processing and Communications Applications Conference, pages 1–4. IEEE.

Onur Varol, Emilio Ferrara, Clayton Davis, Filippo Menczer, and Alessandro Flammini. 2017. Online human-bot interactions: Detection, estimation, and characterization. In Proc. of the Intl. AAAI Conf. on Web and Social Media, volume 11, pages 280–289.

Onur Varol, Emilio Ferrara, Christine L Ogan, Filippo Menczer, and Alessandro Flammini. 2014. Evolution of online user behavior during a social upheaval. In Proc. of the ACM Conf. on Web Science, pages 81–90.

Onur Varol and Ismail Uluturk. 2020. Journalists on Twitter: Self-branding, audiences, and involvement of bots. Journal of Computational Social Science, 3(1):83–101.

Xindi Wang, Onur Varol, and Tina Eliassi-Rad. 2022. Information access equality on generative models of complex networks. Applied Network Science, 7(1).

Zeerak Waseem. 2016. Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In Proc. of the First Workshop on NLP and Computational Social Science, pages 138–142.

Zeerak Waseem and Dirk Hovy. 2016. Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In Proc. of the NAACL Student Research Workshop, pages 88–93.

Ziqi Zhang and Lei Luo. 2019. Hate speech detection: A solved problem? The challenging case of long tail on Twitter. Semantic Web, 10(5):925–945.

