Main benchmark summary

Representative result matrix

Task                           Metric     Model                               Value
Form coarse classification     Macro-F1   XLM-R                               0.5353
Form fine classification      Macro-F1   Hierarchical TF-IDF + Linear SVM    0.4594
Form severity regression       Spearman   char TF-IDF + Ridge                 0.6037
Fusion coarse classification   Macro-F1   Tuned late fusion                   0.5415
Sense coarse classification    Macro-F1   TF-IDF + LogReg                     0.5033
Sense fine classification      Macro-F1   TF-IDF + LogReg                     0.4057
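The exact training code is not part of this page, but as a rough illustration of the kind of baseline several rows report (TF-IDF features with logistic regression, scored with Macro-F1) and of the Spearman metric used for severity regression, here is a minimal scikit-learn/SciPy sketch. All texts, labels, and severity values below are made-up placeholders, not benchmark data.

```python
from scipy.stats import spearmanr
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

# Toy stand-in data; the real benchmark uses its own splits and label sets.
train_texts = ["mild slip of the tongue", "severe word substitution",
               "mild hesitation", "severe phoneme swap"]
train_labels = [0, 1, 0, 1]
test_texts = ["mild pause", "severe substitution error"]
test_labels = [0, 1]

# TF-IDF + LogReg, the baseline named in several table rows.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)
preds = model.predict(test_texts)

# Classification rows report Macro-F1: the unweighted mean of per-class F1.
macro_f1 = f1_score(test_labels, preds, average="macro")

# The severity regression row reports Spearman rank correlation between
# predicted and gold severity scores (values here are placeholders).
rho, _ = spearmanr([0.1, 0.4, 0.35, 0.8], [0.2, 0.3, 0.5, 0.9])
```

Macro-F1 treats every class equally regardless of frequency, which is why it is the headline metric for the imbalanced fine-grained label sets.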

Reading frame

How these anchors are meant to be used

Anchor family                  Why it stays foregrounded
Text and sense tasks           They summarize the main lexical and ambiguity-sensitive benchmark results.
Severity prediction            It remains part of the core manuscript narrative and is interpretable within the benchmark rubric.
Fusion coarse classification   It provides a compact speech-grounded extension in the public summary.
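The "Tuned late fusion" entry combines text-side and speech-side predictions; the exact recipe is defined in the manuscript. A common minimal form is weighted averaging of per-class probabilities, with the mixing weight chosen on development data. The sketch below uses that form; all probabilities, labels, and the tuning grid are illustrative assumptions, not the published configuration.

```python
import numpy as np

def fuse(text_probs, speech_probs, alpha):
    """Late fusion: weighted average of per-class probabilities."""
    return alpha * text_probs + (1.0 - alpha) * speech_probs

# Hypothetical dev-set probabilities for 3 examples over 2 classes.
text_probs = np.array([[0.7, 0.3], [0.4, 0.6], [0.55, 0.45]])
speech_probs = np.array([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])
dev_labels = np.array([0, 1, 1])

# "Tuned" means the mixing weight is selected on held-out data; a proper
# setup would evaluate the chosen weight on a separate test split.
best_alpha = max(
    np.linspace(0.0, 1.0, 11),
    key=lambda a: (fuse(text_probs, speech_probs, a).argmax(axis=1)
                   == dev_labels).mean(),
)
preds = fuse(text_probs, speech_probs, best_alpha).argmax(axis=1)
```

Late fusion of this kind needs no joint training: each modality's model is trained independently and only their output distributions are combined.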

How to read this page

Representative summary

  • This page groups the main benchmark results reported in the manuscript.
  • The public metadata bundle is aligned to the same release snapshot.
  • Task definitions for the full benchmark suite remain documented in Protocol.
Bundle note. This supplementary page is aligned to the current public release snapshot. For the exact workbook, manifest, statistics, benchmark highlights, and speaker-packaged audio files, use Downloads.