Ensuring Safe and Respectful Online Spaces: A Look at AI-Based Text Moderation

Product
March 12, 2025
Alokika Dash, Paul Haley, Abhishek Pradhan, Arya Bulusu, Mohammad Niknazar, Shom Ponoth

With the explosion of user-generated content, moderating digital conversations has become critical in keeping online platforms welcoming and respectful. Across the AI community, a variety of techniques—from detailed multi-attribute scoring to constitution-based frameworks—are being explored to address the complexities of filtering out harmful or off-topic material.

At Emergence, we’ve been experimenting with two specific approaches (a brief illustrative sketch follows the list):

· General Text Moderation, which tags content across multiple dimensions such as “Derogatory,” “Insult,” or “Toxic.”

· Child Text Moderation, which uses a “constitution” (or set of rules) to align content with specific age groups: elementary, middle, or high school.
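
To make the distinction concrete, here is a minimal, purely illustrative Python sketch. The attribute names mirror the dimensions above, but everything else (function names such as score_attributes, the keyword triggers, and the CONSTITUTION rules) is hypothetical placeholder logic, not our production models.

# Illustrative sketch only: attribute names come from the post above, but the
# scoring logic is a trivial keyword placeholder, not our actual models.

ATTRIBUTES = ["Derogatory", "Insult", "Toxic"]

def score_attributes(text: str) -> dict:
    """General Text Moderation: return a 0-to-1 score per attribute."""
    flagged_terms = {
        "Derogatory": ["loser"],
        "Insult": ["idiot", "stupid"],
        "Toxic": ["hate"],
    }
    lowered = text.lower()
    return {
        attr: 1.0 if any(term in lowered for term in flagged_terms[attr]) else 0.0
        for attr in ATTRIBUTES
    }

# Child Text Moderation: a "constitution" of rules per age group (hypothetical rules).
CONSTITUTION = {
    "elementary": ["no violence", "no profanity"],
    "middle": ["no graphic violence", "no profanity"],
    "high": ["no graphic violence"],
}

def violated_rules(text: str, age_group: str) -> list:
    """Return the constitution rules this text appears to violate."""
    triggers = {
        "no violence": ["fight"],
        "no graphic violence": ["gore"],
        "no profanity": ["damn"],
    }
    lowered = text.lower()
    return [
        rule for rule in CONSTITUTION[age_group]
        if any(term in lowered for term in triggers[rule])
    ]

print(score_attributes("You are an idiot"))                   # {'Derogatory': 0.0, 'Insult': 1.0, 'Toxic': 0.0}
print(violated_rules("They got into a fight", "elementary"))  # ['no violence']

In practice both paths call a model rather than keyword lists; the point is simply that one returns per-attribute scores while the other returns rule violations for a chosen age group.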

Recently, we ran several popular models head-to-head, comparing sensitivity (the ability to catch problematic content) and F1 score (how well the model balances pinpointing issues with avoiding false positives). The table below highlights some results from our testing:
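
For readers less familiar with the metrics, the relationship between sensitivity and F1 can be shown with a tiny Python example; the counts here are invented for illustration only and do not come from our evaluation.

tp = 80   # harmful items correctly flagged
fn = 20   # harmful items the model missed
fp = 30   # benign items incorrectly flagged

sensitivity = tp / (tp + fn)                                   # recall = 0.80
precision = tp / (tp + fp)                                     # about 0.73
f1 = 2 * precision * sensitivity / (precision + sensitivity)   # about 0.76

print(f"sensitivity={sensitivity:.2f}, precision={precision:.2f}, F1={f1:.2f}")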

Overall, different models showcase different strengths. Some favor high sensitivity (flagging a broad range of content), while others aim for a more even balance of precision and recall (a higher F1 score). Our own experiments with multi-attribute and constitution-based moderation reflect the variety of ways AI can help address the complex task of policing online content.
