
Analysis of a Large Joke Corpus

Abstract—Humor is a complex and subjective human experience that is challenging to model and quantify. This project proposes a comprehensive approach to analyzing a large corpus of one million jokes sourced from Reddit, using state-of-the-art machine learning algorithms and explainable AI techniques not only to detect and categorize humor but also to understand and predict its effectiveness. The project addresses four primary tasks: identifying genuine jokes, detecting duplicate jokes, categorizing jokes into various humor types, and predicting the effectiveness of jokes. We plan to employ a variety of models, including deep neural networks such as BERT and GPT-3, enhanced with explainable AI methodologies such as SHAP, LIME, and visualization of attention weights for greater transparency and interpretability. Metrics such as accuracy, F1 score, Jaccard index, and human-alignment score will be used to evaluate model performance and explanation effectiveness. The insights derived from this project are expected not only to advance the understanding of computational humor but also to support the development of AI-driven humor applications, improving user interaction and engagement on digital platforms.

I. BACKGROUND RESEARCH

This project aims to advance the field of humor detection by developing a sophisticated natural language processing system that uses state-of-the-art machine learning techniques, including Transformer architectures. Focusing on the analysis of jokes from diverse datasets, the project integrates advanced computational models with explainable AI to enhance both the accuracy and the transparency of humor recognition.

[5] Inácio et al. (2023) explore the dynamics of humor recognition in Portuguese texts using a BERT-based classifier, achieving an impressive F1-score of 99.6%. Their research highlights a critical insight: while these models are adept at identifying stylistic cues such as punctuation, they often overlook deeper elements of humor such as linguistic incongruity and contextual nuance. This finding underscores a potential area for improvement in current models, which could benefit from a better understanding of the subtleties that define humor.

[4] Peyrard et al. (2021) demonstrate the capabilities of transformer models in distinguishing between humorous and serious sentences. Their work not only confirms the effectiveness of transformers in humor detection but also delves into how these models prioritize and interpret different elements of the text, particularly through the lens of attention mechanisms. This approach provides valuable insight into the specific aspects of text that are most influential in determining humor.

[3] Miraj and Aono (2021) present a technique that combines BERT with other embedding technologies such as Word2Vec and FastText in a neural network ensemble. This method significantly reduces error rates, illustrating the potential of hybrid models to improve the accuracy of humor detection systems.

[1] In a similar vein, Weller and Seppi (2019) use transformer architectures to assess the humor content of jokes collected from Reddit. Their model leverages community ratings to gauge humor, achieving results that parallel human judgment. This showcases not only the model's efficacy but also its potential application in real-world scenarios where public perception is key.

[2] Hasan et al. (2021) develop a Humor Knowledge Enriched Transformer (HKT) that integrates multimodal data (text, audio, and visual inputs) with external humor knowledge. This model sets new performance benchmarks across several datasets and offers a comprehensive framework for analyzing humor in diverse communicative settings.

II. DATASET

To ensure a comprehensive approach to humor detection and analysis, this project will use multiple datasets sourced from various platforms. Each dataset offers unique characteristics and content types, which are instrumental in training and evaluating the machine learning models effectively.

1. 1 Million Reddit Jokes Dataset [6]: A comprehensive collection of approximately one million jokes from the r/Jokes subreddit, used for recognizing diverse humor styles and for initial model training.
2. 200K Humor Detection Dataset [8]: Contains 200,000 labeled instances for binary humor classification, well suited to refining models and evaluating performance in a controlled setting.
3. Puns Dataset [9]: Specializes in pun-based humor with detailed annotations, suitable for developing models that analyze and understand wordplay.
4. Short Jokes Dataset [7]: A compact collection of jokes from a Kaggle competition, useful for quick model prototyping and iterative testing.
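As a concrete starting point, the sketch below shows one plausible way to load and lightly clean the Reddit jokes file before analysis. The file name (one_million_reddit_jokes.csv) and column names (title, selftext) are assumptions for illustration and may not match the actual downloads.

```python
import pandas as pd

# Hypothetical file and column names; adjust to the actual
# layout of the Kaggle/GitHub downloads.
reddit = pd.read_csv("one_million_reddit_jokes.csv")

# Combine title and body into a single joke text field.
reddit["text"] = (
    reddit["title"].fillna("") + " " + reddit["selftext"].fillna("")
).str.strip()

# Basic cleaning: drop empty rows and exact duplicates.
reddit = reddit[reddit["text"].str.len() > 0]
reddit = reddit.drop_duplicates(subset="text").reset_index(drop=True)

print(f"{len(reddit)} unique jokes after cleaning")
```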

III. CHALLENGES

The most obvious challenge in joke detection is the absence of a fixed signal to search for. Jokes depend heavily on the context of their environment and thus have many moving parts that must all land correctly for the joke to succeed. The presence of funny, casual words may indicate the onset of a joke in a sentence, but the roots of humor lie in the sentiment of the sentence being delivered.

Another significant challenge is the explainability of predictions made by the complex machine learning models used in joke detection. Given the subjective nature of humor, it is crucial not only to predict whether something is humorous but also to understand and communicate why a particular text is considered a joke. This is particularly difficult with deep learning models, which are often seen as black boxes.

IV. METHODOLOGY

Fig. 1: System Flowchart

The goal of this project is to create a sophisticated system for humor analysis and detection by combining various machine learning approaches and datasets in a thorough methodology. Through a series of clearly defined steps, the methodology addresses the identification, classification, and evaluation of humor, using the power of both advanced deep learning networks and conventional machine learning models.

1) Input and Data Collection
We begin by aggregating jokes from various sources, including online platforms such as Reddit. This stage involves intensive data cleaning to remove duplicates, fix formatting errors, and handle missing values, ensuring high-quality data for analysis.

2) Data Processing and Analysis
Post-cleaning, the data is processed through two main pathways: identifying genuine jokes using NLP techniques to filter out non-humorous content, and detecting duplicates using feature extraction methods such as TF-IDF and Word2Vec. These methods enable the clustering of similar jokes, enhancing the dataset's utility for precise humor analysis (a duplicate-detection sketch follows this list).

3) Model Development
Using Transformer architectures, we develop models fine-tuned for humor detection, employing strategies such as ensembling for optimal performance. Rigorous testing and validation ensure that the models handle diverse humor types effectively and accurately reflect nuanced humor distinctions.

4) Explainability and Visualization
A significant emphasis is placed on making the models explainable through the visualization of attention mechanisms and the use of SHAP and LIME. These tools help illuminate how specific text elements influence humor classification, making the models' decision-making processes transparent and understandable to end users.
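The sketch below illustrates the duplicate-detection pathway from step 2, using TF-IDF vectors and cosine similarity; the 0.8 similarity cutoff is an assumed value for illustration, not one fixed by the proposal, and would be tuned on held-out duplicate pairs.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

jokes = [
    "Why did the chicken cross the road? To get to the other side.",
    "Why did the chicken cross the road? To reach the other side!",
    "I told my wife she should embrace her mistakes. She hugged me.",
]

# Represent each joke as a TF-IDF vector over word unigrams.
vectors = TfidfVectorizer(stop_words="english").fit_transform(jokes)

# Pairwise cosine similarity; values near 1 suggest near-duplicates.
sim = cosine_similarity(vectors)

THRESHOLD = 0.8  # assumed cutoff for flagging near-duplicates
for i in range(len(jokes)):
    for j in range(i + 1, len(jokes)):
        if sim[i, j] >= THRESHOLD:
            print(f"Possible duplicates ({sim[i, j]:.2f}): {i} and {j}")
```

Flagged pairs could then be collapsed into clusters, keeping one representative joke per cluster.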
V. PRELIMINARY WORK

We conducted exploratory data analysis on four of the datasets above and present our findings below.

A. Length of Jokes

The first and simplest metric we examine is the length of the joke. Is there any correlation between how long a statement runs and how funny it sounds? Indeed there is: an ideal joke is neither too short nor too long, averaging roughly 15 words (about 75 characters). Naturally, the distribution varies between extremely short puns and relatively long, story-based jokes.

Fig. 2: Distribution of Joke Lengths for Puns
Fig. 3: Distribution of Joke Lengths for Storytelling-based Jokes
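A minimal sketch of the length computation behind Figs. 2 and 3, assuming a DataFrame with a cleaned "text" column as in the loading sketch earlier:

```python
import pandas as pd

# Stand-in data; in practice this is the cleaned jokes DataFrame.
jokes = pd.DataFrame({"text": [
    "I used to be a banker, but I lost interest.",
    "Why don't scientists trust atoms? They make up everything.",
]})

jokes["n_chars"] = jokes["text"].str.len()
jokes["n_words"] = jokes["text"].str.split().str.len()

# Summary statistics of joke length; a histogram of n_words
# reproduces the per-dataset distributions shown in the figures.
print(jokes[["n_words", "n_chars"]].describe())
```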

B. Commonly Occurring Words

Next, we map the most commonly occurring words as a word cloud to check whether any words appear frequently across different jokes. Surprisingly, no words stand out as 'funny' or universal to all jokes, suggesting that jokes vary in style, dialogue, and delivery from one to another. Only function words, such as articles, common verbs, and markers of active or passive voice, top this statistic.

Fig. 4: K-Means Clustering of Jokes
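A minimal sketch of the underlying frequency count; a library such as wordcloud would render the same counts graphically:

```python
from collections import Counter
import re

jokes = [
    "Why did the chicken cross the road?",
    "I told my wife she should embrace her mistakes. She hugged me.",
]

# Lowercase and tokenize on word characters.
tokens = [w for joke in jokes for w in re.findall(r"[a-z']+", joke.lower())]

# On real data the top tokens are dominated by articles, pronouns,
# and common verbs rather than distinctly "funny" words.
print(Counter(tokens).most_common(10))
```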
C. Sentiment Analysis

Lastly, we apply sentiment analysis to all the datasets in search of a more context-based signal for detecting a joke.

Fig. 5: Sentiment analysis on Jokes

1) Polarity: While jokes do vary in how extreme they sound, in terms of positive, negative, or neutral context we observe that the sentiment and tone of jokes are mostly neutral.

2) Subjectivity: An important variable to account for is subjectivity in the interpretation of a joke (i.e., varying context creating a gap in how the joke is understood). Our analysis shows that subjectivity varies widely across all kinds of jokes, and hence the interpretation of a text as a joke in the first place is also affected.
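The proposal does not name the sentiment tool used; the sketch below uses TextBlob, one common choice that reports exactly the polarity and subjectivity scores discussed above.

```python
from textblob import TextBlob

jokes = [
    "I used to be a banker, but I lost interest.",
    "My dog is terrible at poker. He barks when he has a good hand.",
]

for joke in jokes:
    s = TextBlob(joke).sentiment
    # polarity in [-1, 1] (negative to positive),
    # subjectivity in [0, 1] (objective to subjective)
    print(f"polarity={s.polarity:+.2f} "
          f"subjectivity={s.subjectivity:.2f} | {joke}")
```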
D. Explainable AI

We employed BERT visualization techniques to analyze the attention mechanisms within our models. This involved tools such as BERTViz, which provides a detailed view of how attention weights are distributed across input sequences. By visualizing these attention patterns, we were able to pinpoint which specific words or phrases the model focused on when making predictions about humor. This insight was instrumental in understanding the influence of individual words on the model's output, enabling targeted improvements in the model's ability to detect and interpret humor accurately.

Fig. 6: XAI: BERTViz
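A minimal sketch of this kind of attention inspection with BERTViz, assuming a stock bert-base-uncased checkpoint rather than our fine-tuned humor model; the final call renders an interactive per-head attention view in a Jupyter notebook.

```python
import torch
from transformers import AutoModel, AutoTokenizer
from bertviz import head_view

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased",
                                  output_attentions=True)

joke = "I used to be a banker, but I lost interest."
inputs = tokenizer.encode(joke, return_tensors="pt")

with torch.no_grad():
    attention = model(inputs).attentions  # one tensor per layer

tokens = tokenizer.convert_ids_to_tokens(inputs[0])

# Interactive visualization of attention weights per layer and head.
head_view(attention, tokens)
```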
VI. METRICS

To ensure a comprehensive evaluation of the humor detection system, we will primarily use accuracy, precision, recall, and F1-score to assess the overall effectiveness and classification capabilities of the model. In addition, we will use:

• Regression Metrics for Predictive Tasks:
  – MSE (Mean Squared Error) and RMSE (Root Mean Squared Error): Measure the average errors in humor rating predictions to assess accuracy.
• Model Testing and Validation:
  – Cross-Validation: Ensure model robustness and reliability.
  – Confusion Matrices: Visualize performance across different humor categories to identify potential biases or weaknesses for improvement.
• Explainability and User Engagement:
  – SHAP Values and Feature Importance Scores: Enhance the transparency and interpretability of the model's decisions.
  – User Satisfaction Ratings: Gathered during interactions to measure user engagement.
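A minimal sketch of how these classification and regression metrics might be computed with scikit-learn, using made-up predictions purely for illustration:

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score, precision_recall_fscore_support,
    confusion_matrix, mean_squared_error,
)

# Made-up labels: 1 = joke, 0 = not a joke.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)
print(f"accuracy={acc:.2f} precision={prec:.2f} "
      f"recall={rec:.2f} f1={f1:.2f}")
print(confusion_matrix(y_true, y_pred))

# Regression metrics for predicted humor ratings (e.g., upvote-based).
r_true = np.array([3.5, 1.0, 4.2, 2.8])
r_pred = np.array([3.0, 1.5, 4.0, 3.1])
mse = mean_squared_error(r_true, r_pred)
print(f"MSE={mse:.3f} RMSE={np.sqrt(mse):.3f}")
```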
REFERENCES

[1] Orion Weller and Kevin Seppi. "Humor detection: A transformer gets the last laugh". In: arXiv preprint arXiv:1909.00252 (2019).
[2] Md Kamrul Hasan et al. "Humor knowledge enriched transformer for understanding multimodal humor". In: Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 35. 14. 2021, pp. 12972–12980.
[3] Rida Miraj and Masaki Aono. "Humor Detection Using a Bidirectional Encoder Representations from Transformers (BERT) based Neural Ensemble Model". In: 2021 8th International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA). IEEE. 2021, pp. 1–6.
[4] Maxime Peyrard et al. "Laughing Heads: Can Transformers Detect What Makes a Sentence Funny?" In: arXiv preprint arXiv:2105.09142 (2021).
[5] Marcio Inácio, Gabriela Wick-Pedro, and Hugo Gonçalo Oliveira. "What do humor classifiers learn? An attempt to explain humor recognition models". In: Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature. 2023, pp. 88–98.
[6] Priyam Choksi. 1 Million Reddit Jokes - r/Jokes. https://www.kaggle.com/datasets/priyamchoksi/1-million-reddit-jokes-rjokes. Accessed: 2024-11-14.
[7] Kaggle Humor Detection Competition - Dataset. https://www.kaggle.com/competitions/humor-detection/data. Accessed: 2024-11-14.
[8] Vishnu. Humour Detection Using NLP. https://www.kaggle.com/code/vishnu0399/humour-detection-using-nlp/input. Accessed: 2024-11-14.
[9] Orion Weller. Reddit Humor Detection - Full Datasets. https://github.com/orionw/RedditHumorDetection/tree/master/full_datasets. Accessed: 2024-11-14.
