Reference files

Use these files as the release snapshot

  • cantharm_release_workbook.xlsx
  • cantharm_release_manifest.json
  • cantharm_dataset_statistics.csv and cantharm_benchmark_highlights.csv
  • cantharm_audio_inventory.csv and the speaker-level audio packages

Public reproducibility boundary

What this page is meant to support

Use the released workbook, manifest, statistics file, benchmark summary, and audio inventory as the reference snapshot for counts, splits, and benchmark interpretation. On this site, reproducibility means matching the public release state and the benchmark summary reported in the manuscript.

Public consistency check

What should stay aligned

The workbook, manifest, statistics file, benchmark summary, and website text should all refer to the same release snapshot and the same benchmark definitions.
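One way to check this alignment is sketched below. It assumes you have already read a snapshot label out of each artifact by hand or with your own loader. The function name and the dictionary keys are hypothetical and are not part of the release.

```python
from collections import Counter


def find_snapshot_outliers(labels):
    """Return the sources whose snapshot label differs from the most common one.

    `labels` maps a source name (hypothetical, chosen by the caller) to the
    snapshot label found in that source. An empty result means every source
    reports the same label. If two labels are equally common, the one seen
    first is treated as the reference.
    """
    counts = Counter(labels.values())
    if len(counts) <= 1:
        return {}
    reference, _ = counts.most_common(1)[0]
    return {src: lab for src, lab in labels.items() if lab != reference}


# Illustrative labels only; the real values come from the files themselves.
labels = {
    "workbook": "2026-04-02",
    "manifest": "2026-04-02",
    "statistics": "2026-04-02",
    "website": "2026-04-02",
}
outliers = find_snapshot_outliers(labels)
print("aligned" if not outliers else f"mismatched sources: {outliers}")
```

The check only compares labels. It does not confirm that benchmark definitions agree, which still needs a manual read of the benchmark summary.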

Sanity checks

What should match in the release snapshot

Check                    Expected value
Forms                    4823
Senses                   6365
Canonical audio files    4823
Form coarse split        3375 / 483 / 965
Sense fine split         4451 / 634 / 1266
Release snapshot         2026-04-02 public release

Consistency rule. This site, the paper, and the reproducibility note all point to the same final release snapshot.
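The expected values above can be compared against counts from a local copy of the release with a small script. This is a minimal sketch. The expected numbers are copied from the table, but the key names and the `observed` dictionary are assumptions made for illustration; the layout of the released manifest and statistics file is not described on this page.

```python
# Expected values copied from the sanity-check table on this page.
# The key names are hypothetical and chosen only for this sketch.
EXPECTED = {
    "forms": 4823,
    "senses": 6365,
    "canonical_audio_files": 4823,
    "form_coarse_split": (3375, 483, 965),
    "sense_fine_split": (4451, 634, 1266),
}


def check_counts(observed, expected=EXPECTED):
    """Return (key, expected, observed) for every value that does not match.

    Missing keys are reported with an observed value of None. Split values
    may be given as lists or tuples; they are compared in order.
    """
    mismatches = []
    for key, want in expected.items():
        got = observed.get(key)
        if isinstance(want, tuple) and got is not None:
            got = tuple(got)
        if got != want:
            mismatches.append((key, want, got))
    return mismatches


# Illustrative input: counts you computed yourself from the released files.
observed = {
    "forms": 4823,
    "senses": 6365,
    "canonical_audio_files": 4823,
    "form_coarse_split": [3375, 483, 965],
    "sense_fine_split": [4451, 634, 1266],
}
for key, want, got in check_counts(observed):
    print(f"{key}: expected {want}, found {got}")
```

An empty result means the local counts agree with the table. Any printed line names the check that differs and shows both values.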