Current release files

Files listed here describe the 2026-04-02 locked public release only. Public mirrors and archives are listed below for reviewer access and reproducible citation.

Included public files

Only files listed in the current public manifest are included in the downloadable release, which remains the 2026-04-02 locked release.

File | Description | License scope or caveat | Status note
cantharm_release_manifest.json | Release identity, counts, public file names, and audio package list. | Release metadata. | Current release manifest.
cantharm_release_workbook.xlsx | Public workbook for form and sense records. | Workbook artifact under CC BY 4.0, excluding restricted dictionary-derived definitions or source text unless source-specific permission is confirmed. | Current release workbook.
cantharm_release_metadata_bundle.zip | Manifest, workbook, README, statistics, benchmark highlights, reproducibility note, and audio inventory. | Mixed documentation and workbook caveats apply. | Current release metadata bundle.
cantharm_dataset_statistics.csv | Counts and split summaries. | Open release metadata. | Current release table.
cantharm_benchmark_highlights.csv | Representative benchmark summary for the locked release. | Open release metadata. | Does not include non-release benchmark scores.
cantharm_audio_inventory.csv | Inventory for 4,823 canonical audio clips. | Open metadata. Speaker IDs are pseudonymous. | Current release inventory.
cantharm_audio_spk01.zip to cantharm_audio_spk11.zip | Speaker-packaged canonical audio clips. | Audio under CC BY 4.0 with speaker consent, accompanied by acceptable-use guidance against voice misuse. | Current release audio packages.
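
As a convenience for reviewers, the file list above can be checked against a local download folder. The sketch below is illustrative only: the file names are copied from the table, while the function names and the folder name "downloads" are hypothetical. It does not read the manifest itself, whose internal layout is not described here.

```python
from pathlib import Path

# File names copied from the table above.
CORE_FILES = [
    "cantharm_release_manifest.json",
    "cantharm_release_workbook.xlsx",
    "cantharm_release_metadata_bundle.zip",
    "cantharm_dataset_statistics.csv",
    "cantharm_benchmark_highlights.csv",
    "cantharm_audio_inventory.csv",
]


def expected_release_files() -> list[str]:
    """Return the core files plus the audio packages spk01 through spk11."""
    audio = [f"cantharm_audio_spk{n:02d}.zip" for n in range(1, 12)]
    return CORE_FILES + audio


def missing_files(download_dir: str) -> list[str]:
    """List expected files that are absent from download_dir (a hypothetical local folder)."""
    root = Path(download_dir)
    return [name for name in expected_release_files() if not (root / name).is_file()]


if __name__ == "__main__":
    # "downloads" is an assumed folder name; adjust it to the actual location.
    for name in missing_files("downloads"):
        print("missing:", name)
```

An empty result from missing_files means every file named in the table is present; it says nothing about file integrity.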

Public mirrors and archives

These channels mirror the 2026-04-02 locked public release, which is also the release boundary.

Channel | Status | Contents | Link / identifier | Notes
Official project website | Current primary website | Primary access point and release documentation. | https://cantharm.dataset.aidimsum.com/ | Current public website.
Direct website downloads | Current direct release downloads | Manifest-backed current release objects. | https://cantharm.dataset.aidimsum.com/downloads.html | Current official downloads page.
GitHub repository | Public reachable | Reviewer-facing documentation, citation files, checksums, and lightweight metadata. | https://github.com/GZU-JK/CantHarm | Audited public channel.
GitHub release | Public reachable | Release assets for the 2026-04-02 public release. | v2026.04.02 release | Audited public channel.
HuggingFace Dataset | Public reachable | Dataset card, metadata, allowed files, and access instructions. | https://huggingface.co/datasets/jk-gjom/CantHarm | Audited public channel.
Zenodo | Public reachable; DOI minted | Archival snapshot and DOI. | Zenodo record 20511573; 10.5281/zenodo.20511573 | DOI string: 10.5281/zenodo.20511573.
OSF | Public reachable | Public OSF project mirror. | https://osf.io/3uhpx/ | Audited public channel.
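
Because the GitHub repository carries checksums and Zenodo provides a DOI, a downloaded file can be checked locally and the archive cited by resolver URL. The following is a minimal sketch under stated assumptions: it assumes the published checksums are SHA-256 hex digests, which this page does not state, and the helper names are hypothetical. The https://doi.org/ prefix is the standard DOI resolver.

```python
import hashlib
from pathlib import Path


def doi_url(doi: str) -> str:
    """Build the standard doi.org resolver URL for a DOI string."""
    return f"https://doi.org/{doi}"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in chunks so large audio packages are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, published_hex: str) -> bool:
    """Compare a local file's digest with a published value.

    Assumption: the published value is a SHA-256 hex digest.
    Surrounding whitespace and letter case are ignored.
    """
    return sha256_of(path) == published_hex.strip().lower()


if __name__ == "__main__":
    # DOI string taken from the Zenodo row above.
    print(doi_url("10.5281/zenodo.20511573"))
```

If the repository publishes a different hash algorithm, the hashlib constructor in sha256_of would need to change accordingly.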

Versioned candidate materials

v1.1-gb-candidate publishes a complete second-recording candidate line for the original 4,823 forms, together with candidate result tables and archive identifiers for inspection.

Scope note: v1.0 remains the stable public release; v1.1-gb materials are cited and inspected as a separate candidate line.

Candidate channel | Link
Website candidate page | release-v2026-06-gb-candidate.html
Candidate results page | revised-candidate-results.html
GitHub pre-release | v2026.06.05-gb-candidate
HuggingFace candidate branch | jk-gjom/CantHarm/tree/v2026.06.05-gb-candidate
Zenodo candidate DOI | 10.5281/zenodo.20558851
OSF candidate component | https://osf.io/pwgcy/

Excluded from this release

  • Materials outside the current manifest-backed release, benchmark summary, public channels, and release documentation.
  • Non-release operational notes, non-release result artifacts, trained checkpoints, and non-release annotation notes.
  • Dictionary-derived definitions or source text, whose redistribution requires source-specific permission.

Release documentation

Release documentation is available below.

Document | Purpose
README.md | Release overview and file index.
LICENSE_DATA.md | Data license wording and source-text caveat.
ACCEPTABLE_USE.md | Research use and unacceptable use notes.
DATASHEET.md | Detailed dataset documentation for the current release.
Public release manifest summary | Public-facing release manifest summary for this website documentation.