TurkishBERTweet

Ali Najafi1, Onur Varol1,2,*

1 Faculty of Engineering and Natural Sciences, Sabanci University
2 Center of Excellence in Data Analytics, Sabanci University
* Corresponding author
{ali.najafi, onur.varol}@sabanciuniv.edu
Table 1: Model comparisons. Weighted F1-score of the models in a 5-fold cross-validation setting. Best scores are
presented in bold font, and more than one model is highlighted when the difference is not significant.
noise in the test set of the competition.

We also conduct an error analysis to identify the misclassifications our model makes. This effort can reveal additional features we could implement, as well as issues in the labeled dataset. Table 2 shows example tweets that were classified incorrectly. We first focus on false negatives, since we can learn from these mistakes to improve our model. For instance, we could split hashtags into words to handle cases like #ülkemdemülteciistemiyorum (Turkish for #wedontwantrefugees), or handle popular hashtags differently; a sketch of such a segmentation step is given below.
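One lightweight way to realize this hashtag splitting is dictionary-based segmentation. The following is a minimal sketch assuming access to Turkish unigram frequencies; the FREQ table and its counts are illustrative placeholders, not real corpus statistics:

```python
import math

# Hypothetical unigram counts; a real system would estimate these from a
# large Turkish corpus (the values below are illustrative placeholders).
FREQ = {"ülkemde": 120, "mülteci": 950, "istemiyorum": 800, "ülke": 400}
TOTAL = sum(FREQ.values())

def word_logprob(word: str) -> float:
    # Unknown words receive a length-proportional penalty.
    return math.log(FREQ[word] / TOTAL) if word in FREQ else -10.0 * len(word)

def segment_hashtag(tag: str) -> list:
    """Viterbi-style segmentation of a hashtag body into words."""
    text = tag.lstrip("#").lower()
    n = len(text)
    best = [(-math.inf, 0)] * (n + 1)  # best[i] = (score, previous split)
    best[0] = (0.0, 0)
    for i in range(1, n + 1):
        for j in range(max(0, i - 20), i):  # cap candidate word length at 20
            score = best[j][0] + word_logprob(text[j:i])
            if score > best[i][0]:
                best[i] = (score, j)
    words, i = [], n
    while i > 0:  # backtrace the best split points
        j = best[i][1]
        words.append(text[j:i])
        i = j
    return words[::-1]

print(segment_hashtag("#ülkemdemülteciistemiyorum"))
# ['ülkemde', 'mülteci', 'istemiyorum']
```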
Regarding false positives, we noticed instances where, based on our own judgment, our model classifies the tweets as hate speech correctly. We therefore suspect mistakes in the ground-truth labels, considering the examples presented in Table 2, where we highlight the words within the tweets that we suspect led to the mislabeling.

5 Discussion

In the provided dataset, we noticed tweets written in languages other than Turkish, such as Arabic and Hebrew. This could be an artifact of the data collection process, and one can consider i) adding language-level features, ii) filtering these tweets out, or iii) obtaining representations from multilingual LLMs; a sketch of option ii) is given below.
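A straightforward realization of option ii) is to drop tweets whose detected language is not Turkish before training. This is a minimal sketch using the langdetect package as one possible off-the-shelf detector (fastText's lid.176 model is a common alternative):

```python
# pip install langdetect
from langdetect import DetectorFactory, LangDetectException, detect

DetectorFactory.seed = 0  # make langdetect's predictions deterministic

def keep_turkish(tweets):
    """Drop tweets whose detected language is not Turkish ('tr')."""
    kept = []
    for text in tweets:
        try:
            if detect(text) == "tr":
                kept.append(text)
        except LangDetectException:
            continue  # too short / emoji-only: cannot detect, drop it
    return kept

print(keep_turkish(["Bugün hava çok güzel", "The weather is great today"]))
# ['Bugün hava çok güzel']
```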
Furthermore, a study of the annotators' influence on annotation quality for hate speech datasets shows that the expertise of annotators positively affects data quality (Waseem, 2016). Considering this influence, a good practice could be an impurity analysis: randomly or strategically changing the annotations and monitoring how the hate speech system's performance degrades, as sketched below.
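The following sketch illustrates such an impurity analysis. We use TF-IDF features with logistic regression as a lightweight stand-in for the full fine-tuned model; `texts` and `labels` are assumed to hold the annotated dataset:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def impurity_curve(texts, labels, noise_rates=(0.0, 0.05, 0.1, 0.2), seed=42):
    """Flip a growing fraction of training labels at random and report
    the weighted F1 on a clean held-out split for each noise rate."""
    rng = np.random.default_rng(seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        texts, labels, test_size=0.2, random_state=seed, stratify=labels)
    classes = np.unique(labels)
    scores = {}
    for rate in noise_rates:
        y_noisy = np.array(y_tr)
        for i in np.where(rng.random(len(y_noisy)) < rate)[0]:
            # Replace the label with a different, randomly chosen class.
            y_noisy[i] = rng.choice(classes[classes != y_noisy[i]])
        vec = TfidfVectorizer(max_features=20000)
        clf = LogisticRegression(max_iter=1000)
        clf.fit(vec.fit_transform(X_tr), y_noisy)
        pred = clf.predict(vec.transform(X_te))
        scores[rate] = f1_score(y_te, pred, average="weighted")
    return scores
```

A flat curve at small noise rates would indicate a robust pipeline, while a sharp drop would suggest the system is highly sensitive to annotation quality.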
Moreover, in this competition we only consider the text data when detecting the existence of hate speech. Infusing account information into these systems, such as the number of followers, number of followings, or account creation date, could help them be more accurate and reliable; one way of fusing such features is sketched below.
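A simple fusion strategy, which was not part of our submitted system, is to concatenate the tweet embedding with normalized account features before the classification head. Below is a minimal PyTorch sketch assuming the encoder produces a 768-dimensional [CLS] vector:

```python
import torch
import torch.nn as nn

class TextPlusAccountClassifier(nn.Module):
    """Concatenates a fixed-size tweet embedding (e.g., the [CLS] vector
    of TurkishBERTweet) with account features such as follower count,
    following count, and account age before the classification head."""

    def __init__(self, text_dim=768, account_dim=3, num_labels=2):
        super().__init__()
        self.account_norm = nn.BatchNorm1d(account_dim)
        self.head = nn.Sequential(
            nn.Linear(text_dim + account_dim, 256),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(256, num_labels),
        )

    def forward(self, text_emb, account_feats):
        # log1p tames heavy-tailed counts before batch normalization.
        acct = self.account_norm(torch.log1p(account_feats))
        return self.head(torch.cat([text_emb, acct], dim=-1))

# Example: a batch of 4 tweets, each with 3 account-level features.
model = TextPlusAccountClassifier()
logits = model(torch.randn(4, 768), torch.rand(4, 3) * 10_000)
print(logits.shape)  # torch.Size([4, 2])
```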
Another approach for improving the performance of these systems is to expose pre-trained models to hateful content by further masked language modeling on the hate speech dataset, as Caselli et al. (2020) presented in their recent work, which improved their system's performance; the sketch below outlines this step.
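With the HuggingFace ecosystem, such continued masked language modeling could look like the following sketch. The hyperparameters are illustrative, and `hate_speech_tweets` is a placeholder for an unlabeled tweet collection:

```python
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("VRLLab/TurkishBERTweet")
model = AutoModelForMaskedLM.from_pretrained("VRLLab/TurkishBERTweet")

# Placeholder: in practice, load the unlabeled tweets of the task data.
hate_speech_tweets = ["..."]

ds = Dataset.from_dict({"text": hate_speech_tweets})
ds = ds.map(lambda batch: tokenizer(batch["text"], truncation=True,
                                    max_length=128),
            batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="tbtweet-hate-mlm",
                           per_device_train_batch_size=32,
                           num_train_epochs=3),
    train_dataset=ds,
    # 15% of tokens are masked, as in standard BERT-style pretraining.
    data_collator=DataCollatorForLanguageModeling(tokenizer,
                                                  mlm_probability=0.15),
)
trainer.train()
```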
Multilingual models could also be utilized for this challenge: since Turkish is a low-resource language, a model can benefit from other languages' hate speech datasets to infuse broader knowledge of hate speech and thereby obtain better performance (Röttger et al., 2022).

Recently, commercial models like ChatGPT have been used in various challenges. Huang et al. (2023) suggest that ChatGPT demonstrates high accuracy and can be considered an alternative to human annotators in detecting implicit hate speech (Gilardi et al., 2023). Other work has also investigated the performance of LLMs on hate speech or offensive language detection tasks in English (Guo et al., 2024), Portuguese (Oliveira et al., 2023), and Turkish (Çam and Özgür, 2023). However, we want to raise a concern about the adversarial use of these models to attack vulnerable groups and bypass detection systems. Additional information about accounts, network structure, and temporal activities should be incorporated into detection systems to address this risk.

6 Conclusion

In this challenge, the collective effort of research teams points to best practices and demonstrates the capabilities of state-of-the-art models. Here, we presented different approaches and their respective performances in detecting online hate speech toward three different groups. We obtained the third rank on the final leaderboard of the competition with the TurkishBERTweet+LoRA model.

We hope language models like TurkishBERTweet will be used in different downstream tasks on Turkish social media. Research efforts especially need to assess the online participation of minority groups. There is a significant need for publicly available models, since the quality of content moderation and the use of automated accounts on platforms like X have been questionable since the acquisition of Twitter (Varol, 2023a; Hickey et al., 2023).
Table 2: Misclassification analysis. We explored the errors of our model to further improve our approach (studying false negatives) and to investigate issues with the ground-truth dataset (pointing to false positives). Here, we select instances where our model produces the correct outcome, but the annotation process suggests otherwise. We color in red the text that we believe suggests hate speech.
Publicly available models will help researchers monitor these platforms more closely and even help them develop models to protect vulnerable groups.

Pre-trained models available online or developed through challenges can be easily adapted for other projects. Publicly available datasets like #Secim2023 can be used to study political discourse (Pasquetto et al., 2020; Najafi et al., 2022; Varol, 2023b), and models can be utilized to study these datasets. The TurkishBERTweet approach that we used is publicly available on the HuggingFace platform, along with the LoRA adapters for different tasks (Najafi and Varol, 2023).

Open source models: The TurkishBERTweet model used in this challenge is available online on the HuggingFace platform at https://huggingface.co/VRLLab/TurkishBERTweet; a minimal loading sketch is given below.
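Loading the model together with a task-specific LoRA adapter could look like the following sketch using the peft library. The adapter repository name below is hypothetical; the adapters actually published for each task are listed on the model page:

```python
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

base_id = "VRLLab/TurkishBERTweet"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForSequenceClassification.from_pretrained(base_id,
                                                          num_labels=2)

# Hypothetical adapter repo name; see the model page for the real ones.
model = PeftModel.from_pretrained(base, "VRLLab/TurkishBERTweet-Lora-HS")

inputs = tokenizer("örnek tweet metni",  # Turkish for "sample tweet text"
                   return_tensors="pt")
prediction = model(**inputs).logits.argmax(dim=-1).item()
```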
Acknowledgements: We thank Hasan Kemik for discussing and supporting the challenge. We thank TUBITAK (121C220 and 222N311) for partially funding this project. The TurkishBERTweet model was trained and made publicly available thanks to the Google Cloud Research Credits program with the award GCP19980904.

References

Sinan Aral, Lev Muchnik, and Arun Sundararajan. 2009. Distinguishing influence-based contagion from homophily-driven diffusion in dynamic networks. Proceedings of the National Academy of Sciences, 106(51):21544–21549.

Ozen Bas, Christine L Ogan, and Onur Varol. 2022. The role of legacy media and social media in increasing public engagement about violence against women in Turkey. Social Media + Society, 8(4):20563051221138939.

Nur Bengisu Çam and Arzucan Özgür. 2023. Evaluation of ChatGPT and BERT-based models for Turkish hate speech detection. In Intl. Conf. on Computer Science and Engineering, pages 229–233. IEEE.

Tommaso Caselli, Valerio Basile, Jelena Mitrović, and Michael Granitzer. 2020. HateBERT: Retraining BERT for abusive language detection in English. arXiv preprint arXiv:2010.12472.

Emilio Ferrara, Onur Varol, Clayton Davis, Filippo Menczer, and Alessandro Flammini. 2016. The rise of social bots. Communications of the ACM, 59(7):96–104.

Fabrizio Gilardi, Meysam Alizadeh, and Maël Kubli. 2023. ChatGPT outperforms crowd-workers for text-annotation tasks. arXiv preprint arXiv:2303.15056.

Keyan Guo, Alexander Hu, Jaden Mu, Ziheng Shi, Ziming Zhao, Nishant Vishwamitra, and Hongxin Hu. 2024. An investigation of large language models for real-world hate speech detection. arXiv preprint arXiv:2401.03346.
Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2021. LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685.

Fan Huang, Haewoon Kwak, and Jisun An. 2023. Is ChatGPT better than human annotators? Potential and limitations of ChatGPT in explaining implicit hate speech. arXiv preprint arXiv:2302.07736.

Sarah J Jackson, Moya Bailey, and Brooke Foucault Welles. 2020. #HashtagActivism: Networks of race and gender justice. MIT Press.

Franziska B Keller, David Schoch, Sebastian Stier, and JungHwan Yang. 2020. Political astroturfing on Twitter: How to coordinate a disinformation campaign. Political Communication, 37(2):256–280.

Sean MacAvaney, Hao-Ren Yao, Eugene Yang, Katina Russell, Nazli Goharian, and Ophir Frieder. 2019. Hate speech detection: Challenges and solutions. PLoS One, 14(8):e0221152.

Mohsen Mosleh and David G Rand. 2022. Measuring exposure to misinformation from political elites on Twitter. Nature Communications, 13(1):7144.

Ali Najafi, Nihat Mugurtay, Ege Demirci, Serhat Demirkiran, Huseyin Alper Karadeniz, and Onur Varol. 2022. #Secim2023: First public dataset for studying Turkish general election. arXiv preprint arXiv:2211.13121.

Ali Najafi and Onur Varol. 2023. TurkishBERTweet: Fast and reliable large language model for social media analysis. arXiv preprint arXiv:2311.18063.

Alexandru Niculescu-Mizil and Rich Caruana. 2005. Predicting good probabilities with supervised learning. In Proc. of the Intl. Conf. on Machine Learning, pages 625–632.

Christine Ogan and Onur Varol. 2017. What is gained and what is left to be done when content analysis is added to network analysis in the study of a social movement: Twitter use during Gezi Park. Information, Communication & Society, 20(8):1220–1238.

Amanda S Oliveira, Thiago C Cecote, Pedro HL Silva, Jadson C Gertrudes, Vander LS Freitas, and Eduardo JS Luz. 2023. How good is ChatGPT for detecting hate speech in Portuguese? In Anais do XIV Simpósio Brasileiro de Tecnologia da Informação e da Linguagem Humana, pages 94–103. SBC.

Irene V Pasquetto, Briony Swire-Thompson, Michelle A Amazeen, Fabrício Benevenuto, Nadia M Brashier, Robert M Bond, Lia C Bozarth, Ceren Budak, Ullrich KH Ecker, Lisa K Fazio, et al. 2020. Tackling misinformation: What researchers could do with social media data. The Harvard Kennedy School Misinformation Review.

Paul Röttger, Haitham Seelawi, Debora Nozza, Zeerak Talat, and Bertie Vidgen. 2022. Multilingual HateCheck: Functional tests for multilingual hate speech detection models. arXiv preprint arXiv:2206.09917.

Stefan Schweter. 2020. BERTurk - BERT models for Turkish.

Chengcheng Shao, Giovanni Luca Ciampaglia, Onur Varol, Kai-Cheng Yang, Alessandro Flammini, and Filippo Menczer. 2018. The spread of low-credibility content by social bots. Nature Communications, 9(1):1–9.

Zeynep Tufekci. 2017. Twitter and tear gas: The power and fragility of networked protest. Yale University Press.

Gokce Uludogan, Somaiyeh Dehghan, Inanc Arin, Elif Erol, Berrin Yanıkoglu, and Arzucan Ozgur. 2024. Overview of the Hate Speech Detection in Turkish and Arabic Tweets (HSD-2Lang) Shared Task at CASE 2024. In Proceedings of the 7th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE).

Onur Varol. 2023a. Should we agree to disagree about Twitter's bot problem? Online Social Networks and Media, 37:100263.

Onur Varol. 2023b. Who follows Turkish presidential candidates in 2023 elections? In Signal Processing and Communications Applications Conference, pages 1–4. IEEE.

Onur Varol, Emilio Ferrara, Clayton Davis, Filippo Menczer, and Alessandro Flammini. 2017. Online human-bot interactions: Detection, estimation, and characterization. In Proc. of the Intl. AAAI Conf. on Web and Social Media, volume 11, pages 280–289.

Onur Varol, Emilio Ferrara, Christine L Ogan, Filippo Menczer, and Alessandro Flammini. 2014. Evolution of online user behavior during a social upheaval. In Proc. of the ACM Conf. on Web Science, pages 81–90.

Onur Varol and Ismail Uluturk. 2020. Journalists on Twitter: Self-branding, audiences, and involvement of bots. Journal of Computational Social Science, 3(1):83–101.

Xindi Wang, Onur Varol, and Tina Eliassi-Rad. 2022. Information access equality on generative models of complex networks. Applied Network Science, 7(1).

Zeerak Waseem. 2016. Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In Proc. of the First Workshop on NLP and Computational Social Science, pages 138–142.

Zeerak Waseem and Dirk Hovy. 2016. Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In Proc. of the NAACL Student Research Workshop, pages 88–93.

Ziqi Zhang and Lei Luo. 2019. Hate speech detection: A solved problem? The challenging case of long tail on Twitter. Semantic Web, 10(5):925–945.