Can AI Keep Online Gaming Civil?
How Machines Learn to Detect Toxicity — and What It Says About Us
By Siddharth Rao Kartik | CMPE 255 – Data Mining
Short Story Assignment
Introduction
Online games today are massive social platforms where millions of players communicate every minute. Toxic chat behavior, however, is a growing concern. Human moderation alone cannot handle the scale, which is why researchers are turning to AI to detect harmful messages automatically.
Research Focus
The study "A Comparative Study on Toxicity Detection in Gaming Chats Using Machine Learning and Large Language Models" by Yehor Tereshchenko and Mika Hämäläinen (2025) compares multiple AI techniques for moderation:
• Traditional ML models using TF-IDF and embeddings
• Fine-tuned transformer models such as DistilBERT
• Large Language Models (LLMs) for zero/few-shot learning
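As a rough illustration of the first approach, a TF-IDF pipeline feeding a linear classifier might look like the sketch below. The toy chat messages, labels, and model choice are invented for illustration and are not taken from the paper:

```python
# Minimal sketch: TF-IDF features + logistic regression for toxicity.
# The tiny toy dataset below is invented purely for demonstration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

messages = [
    "gg well played everyone",
    "nice shot, team!",
    "uninstall the game you are trash",
    "you are so bad, quit now",
]
labels = [0, 0, 1, 1]  # 0 = clean, 1 = toxic

# Unigrams + bigrams give the linear model a little context.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(messages, labels)

preds = clf.predict(["gg nice game", "you are trash"])
print(preds)
```

In a real system the training set would be thousands of labeled chat lines; the pipeline shape, however, stays this simple, which is part of why such models are fast.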
Key Findings
DistilBERT achieved the best trade-off between accuracy, processing speed, and computational cost. Traditional models were fast but less context-aware, while LLMs performed well but were too expensive and slow for real-time moderation.
Relevance to Data Mining
This study demonstrates several core data mining principles:
• Text classification and pattern recognition
• Dimensionality reduction via embeddings
• Clustering of behavior and anomaly detection
• Evaluation metrics such as precision, recall, and F1-score
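To make the metrics concrete, here is how precision, recall, and F1 work out for a toxicity classifier given some hypothetical confusion-matrix counts (the numbers are illustrative only, not results from the study):

```python
# Illustrative precision/recall/F1 computation for a toxicity classifier.
# The confusion-matrix counts below are made up for demonstration.
tp = 80  # toxic messages correctly flagged
fp = 10  # clean messages wrongly flagged (false alarms)
fn = 20  # toxic messages the model missed

precision = tp / (tp + fp)  # of everything flagged, how much was truly toxic
recall = tp / (tp + fn)     # of all toxic messages, how many were caught
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(round(precision, 3), round(recall, 3), round(f1, 3))
# → 0.889 0.8 0.842
```

The two errors pull in opposite directions: false alarms (fp) hurt precision and annoy innocent players, while misses (fn) hurt recall and let abuse through, which is exactly the fairness tension discussed below.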
Ethical and Practical Insights
Automated moderation must balance fairness and empathy. Mislabeling harmless banter can alienate players, while missing genuine abuse harms communities. A hybrid approach—AI for filtering, humans for review—offers the best path forward.
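One minimal way to sketch such a hybrid pipeline is to act automatically only on high-confidence scores and route ambiguous messages to a human queue. The thresholds and the idea of a scalar toxicity score are assumptions for illustration, not a design taken from the study:

```python
# Sketch of a hybrid moderation router: confident cases are handled
# automatically, uncertain ones are escalated to human reviewers.
# The 0.9 / 0.1 thresholds are illustrative assumptions.

def route(message: str, toxicity_score: float) -> str:
    """Decide what to do with a message given a model's toxicity score in [0, 1]."""
    if toxicity_score >= 0.9:   # confidently toxic -> auto-filter
        return "blocked"
    if toxicity_score <= 0.1:   # confidently clean -> let through
        return "allowed"
    return "human_review"       # ambiguous banter -> escalate to a person

print(route("you are trash", 0.95))    # → blocked
print(route("gg wp", 0.02))            # → allowed
print(route("nice camping lol", 0.5))  # → human_review
```

In practice the thresholds would be tuned against the precision/recall trade-off above: tightening the auto-block threshold reduces wrongly punished banter at the cost of a larger human review queue.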
Conclusion
This research highlights how data mining and NLP can support healthier online communities. Combining the efficiency of AI with human understanding ensures both fairness and scalability in gaming moderation.
Thank you
