Printed from https://ideas.repec.org/a/gam/jdataj/v11y2026i6p136-d1963069.html

Cervical Cancer Dataset Catalog (CCDCAT-U_v1.0; Release v0.1): A Machine-Readable, Reproducible Catalog of Discoverable Human Cervical Cancer and Pre-Cancer Datasets Across Modalities

Author

Listed:
  • Kula Kekeba Tune

    (Department of Software Engineering, HPC and Big Data Analytics Center of Excellence, Addis Ababa Science and Technology University, Addis Ababa 16417, Ethiopia)

  • Foziya Ahmed Mohammed

    (Department of Software Engineering, HPC and Big Data Analytics Center of Excellence, Addis Ababa Science and Technology University, Addis Ababa 16417, Ethiopia
    Lina Pharmaceuticals and Medical Devices Inc., Addis Ababa, Ethiopia
    Enkoy LLC, 6418 Tiffany Ct, Lanham, MD 20706, USA)

  • Juhar Ahmed Mohammed

    (Lina Pharmaceuticals and Medical Devices Inc., Addis Ababa, Ethiopia)

  • Seid Muhie

    (Enkoy LLC, 6418 Tiffany Ct, Lanham, MD 20706, USA
    The Geneva Foundation, Silver Spring, MD 20910, USA)

Abstract

Human cervical cancer and pre-cancer research relies on datasets scattered across modality-specific archives, imaging repositories, benchmark platforms, trial registries, and controlled-access catalogs. This fragmentation—combined with heterogeneous metadata, ambiguous use of “cervical” terminology, and inconsistent indexing of pre-cancer and screening/triage resources—limits reproducible discovery, access planning, and cross-modal benchmarking. We present the Cervical Cancer Dataset Catalog (CCDCAT), a machine-readable, versioned dataset of datasets that enumerates host-specific dataset-instance records anchored to stable identifiers and resolvable landing records within an explicitly declared discoverable source universe (U_v1.0) and a frozen discovery/labeling lexicon (Q_v1.0). The CCDCAT spans invasive cervical cancer, pre-cancer/dysplasia, and cervix-focused screening and triage phenotypes, and it covers molecular omics, imaging and microscopy (including cervix photography, cytology, and digital pathology), trial registry records, benchmark resources, and controlled-access catalogs represented as metadata with explicit access pathways. Eligibility and labels are assigned conservatively from source-provided metadata; when evidence is insufficient, the CCDCAT abstains rather than infers. In the initial release (CCDCAT-U_v1.0; v0.1), we enumerate 14 eligible dataset instances across 11 host systems within a declared universe of 21 sources. Releases include manuscript-ready tables and interoperable artifacts (schema, controlled vocabularies, provenance logs, abstention ledgers, and a queryable database), enabling reproducible filtering, linkage, and auditable reuse planning.
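
The conservative labeling rule described in the abstract (assign a label only when source metadata supports it, otherwise abstain) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the lexicon entries, the ambiguity terms, the label names, and the `assign_label` function are hypothetical and are not taken from the CCDCAT schema or from Q_v1.0.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lexicon mapping metadata terms to phenotype labels.
# These entries are illustrative only; they are NOT the actual Q_v1.0 lexicon.
LEXICON = {
    "cervical cancer": "invasive_cancer",
    "cervical intraepithelial neoplasia": "pre_cancer",
    "cervical dysplasia": "pre_cancer",
    "cervical cytology": "screening_triage",
}

# Hypothetical terms where "cervical" does not refer to the uterine cervix.
AMBIGUOUS_TERMS = ("cervical spine", "cervical vertebra", "cervical lymph node")


@dataclass
class Decision:
    record_id: str
    label: Optional[str]  # None means the catalog abstains
    reason: str           # recorded so that abstentions remain auditable


def assign_label(record_id: str, metadata_text: str) -> Decision:
    """Assign a label only when the metadata gives unambiguous evidence."""
    text = metadata_text.lower()
    if any(term in text for term in AMBIGUOUS_TERMS):
        return Decision(record_id, None, "ambiguous use of 'cervical'")
    labels = {label for term, label in LEXICON.items() if term in text}
    if len(labels) == 1:
        return Decision(record_id, labels.pop(), "single lexicon match")
    if not labels:
        return Decision(record_id, None, "no lexicon match")
    return Decision(record_id, None, "conflicting lexicon matches")


if __name__ == "__main__":
    examples = [
        ("rec-1", "Cervical cancer RNA-seq cohort"),
        ("rec-2", "MRI of the cervical spine"),
        ("rec-3", "Cervical cancer and cervical dysplasia samples"),
    ]
    for record_id, text in examples:
        print(assign_label(record_id, text))
```

In this sketch, a record with no match, an ambiguous match, or conflicting matches is returned with `label=None` and a stated reason, which is one way an abstention ledger of the kind the abstract mentions could be populated.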

Suggested Citation

  • Kula Kekeba Tune & Foziya Ahmed Mohammed & Juhar Ahmed Mohammed & Seid Muhie, 2026. "Cervical Cancer Dataset Catalog (CCDCAT-U_v1.0; Release v0.1): A Machine-Readable, Reproducible Catalog of Discoverable Human Cervical Cancer and Pre-Cancer Datasets Across Modalities," Data, MDPI, vol. 11(6), pages 1-30, June.
  • Handle: RePEc:gam:jdataj:v:11:y:2026:i:6:p:136-:d:1963069

    Download full text from publisher

    File URL: https://www.mdpi.com/2306-5729/11/6/136/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2306-5729/11/6/136/
    Download Restriction: no
