Current release files
Files listed here describe the 2026-04-02 locked public release only. Public mirrors and archives are listed below for reviewer access and reproducible citation.
Included public files
Only files listed in the current public manifest are included in this release.
| File | Description | License scope or caveat | Status note |
|---|---|---|---|
| cantharm_release_manifest.json | Release identity, counts, public file names, and audio package list. | Release metadata. | Current release manifest. |
| cantharm_release_workbook.xlsx | Public workbook for form and sense records. | Workbook artifact under CC BY 4.0, excluding restricted dictionary-derived definitions or source text unless source-specific permission is confirmed. | Current release workbook. |
| cantharm_release_metadata_bundle.zip | Manifest, workbook, README, statistics, benchmark highlights, reproducibility note, and audio inventory. | Mixed documentation and workbook caveats apply. | Current release metadata bundle. |
| cantharm_dataset_statistics.csv | Counts and split summaries. | Open release metadata. | Current release table. |
| cantharm_benchmark_highlights.csv | Representative benchmark summary for the locked release. | Open release metadata. | Does not include non-release benchmark scores. |
| cantharm_audio_inventory.csv | Inventory for 4,823 canonical audio clips. | Open metadata. Speaker IDs are pseudonymous. | Current release inventory. |
| cantharm_audio_spk01.zip to cantharm_audio_spk11.zip | Speaker-packaged canonical audio clips. | Audio under CC BY 4.0 with speaker consent, accompanied by acceptable-use guidance against voice misuse. | Current release audio packages. |
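A reviewer who has downloaded the release may want to confirm that every listed file is present locally. The following is a minimal sketch, not an official tool. It hard-codes the file names from the table above rather than reading `cantharm_release_manifest.json`, because the manifest's internal schema is not described here. It also assumes the audio packages are numbered `spk01` through `spk11` without gaps. The function names and the `download_dir` parameter are hypothetical.

```python
from pathlib import Path

# File names copied from the table above.
CORE_FILES = [
    "cantharm_release_manifest.json",
    "cantharm_release_workbook.xlsx",
    "cantharm_release_metadata_bundle.zip",
    "cantharm_dataset_statistics.csv",
    "cantharm_benchmark_highlights.csv",
    "cantharm_audio_inventory.csv",
]

# Assumption: the audio packages run spk01..spk11 with no gaps.
AUDIO_PACKAGES = [f"cantharm_audio_spk{n:02d}.zip" for n in range(1, 12)]


def expected_release_files() -> list[str]:
    """Return every file name listed for the current release."""
    return CORE_FILES + AUDIO_PACKAGES


def missing_release_files(download_dir: str) -> list[str]:
    """Return the listed file names that are not present in download_dir."""
    root = Path(download_dir)
    return [name for name in expected_release_files()
            if not (root / name).is_file()]
```

Calling `missing_release_files("path/to/downloads")` returns an empty list when all listed files are present. It checks names only and does not validate file contents.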
Public mirrors and archives
These channels mirror the 2026-04-02 locked public release, which also defines the release boundary.
| Channel | Status | Contents | Link / identifier | Notes |
|---|---|---|---|---|
| Official project website | Current primary website | Primary access point and release documentation. | https://cantharm.dataset.aidimsum.com/ | Current public website. |
| Direct website downloads | Current direct release downloads | Manifest-backed current release objects. | https://cantharm.dataset.aidimsum.com/downloads.html | Current official downloads page. |
| GitHub repository | Publicly reachable | Reviewer-facing documentation, citation files, checksums, and lightweight metadata. | https://github.com/GZU-JK/CantHarm | Audited public channel. |
| GitHub release | Publicly reachable | Release assets for the 2026-04-02 public release. | v2026.04.02 release | Audited public channel. |
| HuggingFace Dataset | Publicly reachable | Dataset card, metadata, allowed files, and access instructions. | https://huggingface.co/datasets/jk-gjom/CantHarm | Audited public channel. |
| Zenodo | Publicly reachable; DOI minted | Archival snapshot and DOI. | Zenodo record 20511573 | DOI string: 10.5281/zenodo.20511573. |
| OSF | Publicly reachable | Public OSF project mirror. | https://osf.io/3uhpx/ | Audited public channel. |
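For reproducible citation, the Zenodo DOI string can be turned into a resolvable link through the standard `https://doi.org/` resolver. The sketch below is illustrative only. The helper name `doi_to_url` is hypothetical, and the code builds the URL without contacting the network.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Build a resolver URL from a bare DOI string such as 10.5281/zenodo.20511573."""
    doi = doi.strip()
    # Tolerate an optional "doi:" prefix, which some citation tools add.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(doi, safe="/")


print(doi_to_url("10.5281/zenodo.20511573"))
# prints https://doi.org/10.5281/zenodo.20511573
```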
Versioned candidate materials
v1.1-gb-candidate publishes a complete second-recording candidate line for the original 4,823 forms, together with candidate result tables and archive identifiers for inspection.
Scope note: v1.0 remains the stable public release; v1.1-gb-candidate materials are cited and inspected as a separate candidate line.
| Candidate channel | Link |
|---|---|
| Website candidate page | release-v2026-06-gb-candidate.html |
| Candidate results page | revised-candidate-results.html |
| GitHub pre-release | v2026.06.05-gb-candidate |
| HuggingFace candidate branch | jk-gjom/CantHarm/tree/v2026.06.05-gb-candidate |
| Zenodo candidate DOI | 10.5281/zenodo.20558851 |
| OSF candidate component | https://osf.io/pwgcy/ |
Excluded from this release
- Materials outside the current manifest-backed release, benchmark summary, public channels, and release documentation.
- Non-release operational notes, non-release result artifacts, trained checkpoints, and non-release annotation notes.
- Dictionary-derived definitions or source text, unless source-specific permission for redistribution is confirmed.
Release documentation
Release documentation is available below.
| Document | Purpose |
|---|---|
| README.md | Release overview and file index. |
| LICENSE_DATA.md | Data license wording and source-text caveat. |
| ACCEPTABLE_USE.md | Research use and unacceptable use notes. |
| DATASHEET.md | Detailed dataset documentation for the current release. |
| Public release manifest summary | Public-facing summary of the release manifest, provided as part of this website's documentation. |