What this benchmark is
Controlled, interpretable, and release-closed
The benchmark comprises three linked components: a released form layer with adjudicated gold labels, a sense layer linked to it with source-level evidence, and one canonical audio clip per retained form. Under a single locked data line, it supports form-level text classification, sense-level classification, audio-only coarse classification, controlled text-audio fusion, and additional speech-conditioned benchmark views.
What this benchmark is not
Not a broad spoken moderation crawl
- Not an in-the-wild conversational speech corpus
- Not a speaker-generalization benchmark or a speaker-disjoint split design
- Not a deployment-ready moderation standard
- Not a dataset whose form gold should be inferred by flattening raw sense evidence
Start here
Where to read first
Release completeness
What the site closes explicitly
- Public workbook alias and manifest-backed release identity
- Readable public pages for scope, QC, ethics, access, and reproducibility
- Download hub with stable public filenames and speaker-packaged audio
- Benchmark summary aligned with the manuscript and public release files
Stable benchmark anchors
Representative benchmark summary
The table below lists the best configured model for each task, as summarized in the manuscript and public release notes.
| Task | Metric | Best configured model | Score |
|---|---|---|---|
| Form coarse classification | Macro-F1 | XLM-R | 0.5353 |
| Form fine classification | Macro-F1 | Hierarchical TF-IDF + Linear SVM | 0.4594 |
| Severity prediction | Spearman | char TF-IDF + Ridge | 0.6037 |
| Fusion coarse classification | Macro-F1 | Tuned late fusion | 0.5415 |
| Sense coarse classification | Macro-F1 | TF-IDF + LogReg | 0.5033 |
| Sense fine classification | Macro-F1 | TF-IDF + LogReg | 0.4057 |
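
For readers unfamiliar with the two metrics in the table, the following is a minimal, dependency-free sketch of how macro-F1 and Spearman rank correlation are conventionally defined. It is not the benchmark's evaluation script; the function names `macro_f1`, `average_ranks`, and `spearman` are illustrative assumptions, and the official protocol may handle absent classes or ties differently.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Pearson correlation computed on the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

Because macro-F1 weights every class equally, rare classes affect the score as much as frequent ones, which is worth keeping in mind when comparing the coarse and fine rows.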
Download Hub
Actual release files, not just descriptions
Open the download hub for the public workbook alias, release manifest, statistics, benchmark highlights, metadata bundle zip, and the speaker-packaged audio downloads.
Protocol
Clear task and input contract
Input availability, forbidden fields, split policy, and benchmark interpretations are documented in one place.
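
As a rough illustration of what an input contract with forbidden fields can look like in practice, the sketch below restricts a record to the fields a task is allowed to read. The helper `apply_input_contract` and every field name in it are hypothetical and chosen only for illustration; the actual allowed and forbidden fields are defined on the protocol page.

```python
def apply_input_contract(record, allowed_fields):
    """Return a copy of record limited to the fields a task may read."""
    return {key: value for key, value in record.items() if key in allowed_fields}


# Hypothetical record and field names, not taken from the release files.
record = {
    "form_text": "example",
    "audio_path": "clip.wav",
    "coarse_label": "x",
    "severity": 2,
}

# A text-only task sees the text field and nothing else.
text_only_view = apply_input_contract(record, allowed_fields={"form_text"})
```

Filtering with an allow-list rather than a deny-list means a newly added field stays hidden from models until it is explicitly permitted.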
Package Identity
Paper truth and website truth aligned
Release snapshot, workbook hash, exclusions, and public package names are all tied back to the same locked release contract.
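
One way to check a downloaded package against a published hash is sketched below. It assumes SHA-256 digests and a mapping from filename to expected hex digest; both are assumptions, since this page does not specify the hash algorithm or manifest layout. The names `sha256_of_file` and `find_mismatches` are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected_hashes, root):
    """List files that are missing or whose digest differs.

    expected_hashes is an assumed {relative filename: sha256 hex} mapping,
    not the release manifest's real schema.
    """
    mismatches = []
    for name, expected in expected_hashes.items():
        path = Path(root) / name
        if not path.is_file() or sha256_of_file(path) != expected.lower():
            mismatches.append(name)
    return mismatches
```

An empty list from `find_mismatches` would indicate that every listed file is present and matches its expected digest.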