The ICH Framework, Global Regulators, and What Statistical Programmers Should Know

1. Introduction: Why the Upstream Architecture Matters

Most statistical programmers enter the clinical data world through SDTM, ADaM, and Define-XML. These are the daily tools of the trade, and for good reason — they are the standards that directly govern how datasets are structured, analysis is performed, and submission packages are assembled. But these standards did not emerge in a vacuum. They sit at the implementation layer of a much larger regulatory architecture, one that begins with international harmonization efforts and cascades through regional regulators before reaching a programmer’s desk.

Understanding this upstream architecture is not academic. It answers practical questions: Why does the FDA mandate CDISC standards while some other regulators do not? Why do PMDA submissions have additional requirements beyond what the FDA expects? Why is SEND required for certain nonclinical studies but not others? And critically, when new standards like Dataset-JSON or Analysis Results Metadata (ARM) emerge, where do they fit in the regulatory hierarchy, and when will they actually become mandatory?

This article traces the full top-down path — from ICH guidelines to CDISC implementation standards — and maps the global regulatory landscape that statistical programmers operate within.

2. ICH: The Apex of the Data Standards Framework

The International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) is the starting point. Established in 1990, ICH is not a regulator. It cannot approve drugs, issue marketing authorizations, or enforce compliance. Instead, it is a harmonization body where regulators and pharmaceutical industry representatives jointly develop guidelines intended to reduce duplication and inconsistency across regulatory regions.

2.1 ICH Structure and Membership

ICH operates through a structured membership model. Its founding regulatory members are the U.S. Food and Drug Administration (FDA), the European Commission (EC, supported by the European Medicines Agency, EMA), and Japan’s Ministry of Health, Labour and Welfare (MHLW) together with the Pharmaceuticals and Medical Devices Agency (PMDA), each paired with an industry counterpart (PhRMA, EFPIA, JPMA). Over time, additional regulatory members have joined, including Health Canada, China’s National Medical Products Administration (NMPA), the UK’s Medicines and Healthcare products Regulatory Agency (MHRA), Swissmedic, Brazil’s ANVISA, the Republic of Korea’s MFDS, and Singapore’s HSA, among others.

ICH also includes observer organizations such as the World Health Organization (WHO) and the International Federation of Pharmaceutical Manufacturers and Associations (IFPMA). Regulators from countries like Australia (TGA), South Africa (SAHPRA), and others may participate as observers or through regional harmonization initiatives but are not full ICH members.

2.2 ICH Guideline Categories

ICH guidelines are organized into four topic categories, each identified by a letter prefix. Two of these — the E-series and the M-series — are directly relevant to clinical data standards and deserve close attention from statistical programmers.

Category	Prefix	Focus	Relevance to Programmers
Quality	Q	Drug substance and product manufacturing, stability, specifications	Low direct relevance
Safety	S	Nonclinical safety studies (carcinogenicity, genotoxicity, reproductive toxicity)	Moderate — drives SEND requirements
Efficacy	E	Clinical trial design, conduct, statistical analysis, reporting	High — foundational to data collection and analysis
Multidisciplinary	M	Cross-cutting topics: electronic submissions, CTD structure, MedDRA, clinical terminology	High — defines submission format and electronic standards

3. The E-Series: Efficacy Guidelines That Shape Data Collection and Analysis

The E-series guidelines are the regulatory backbone of clinical trial conduct and reporting. While programmers may associate “data standards” primarily with CDISC, it is ICH E-series guidelines that define why data must be collected in certain ways, what statistical analyses must demonstrate, and how clinical study reports must be structured.

3.1 ICH E6: Good Clinical Practice (GCP)

ICH E6(R2), together with the E6(R3) revision that ICH finalized in January 2025, is the cornerstone guideline for clinical trial conduct. GCP establishes the ethical and scientific quality standard for designing, conducting, recording, and reporting trials. For statistical programmers, E6 matters because it mandates that all clinical trial data be recorded, handled, and stored in a way that allows accurate reporting, interpretation, and verification. This principle directly underpins why sponsors adopt standardized data models.

Practical example: E6(R2) Section 5.1.3 states that quality control should be applied to each stage of data handling to ensure that all data are reliable and have been processed correctly, and Section 5.5 sets out the sponsor’s data handling and record keeping obligations. When your organization implements edit checks on CDASH-based CRF data before mapping it to SDTM, that process traces back to these GCP requirements. The traceability expectation, from CRF to SDTM to ADaM to TFL, is a GCP-derived principle.

3.2 ICH E3: Structure and Content of Clinical Study Reports

ICH E3 defines the structure of the Clinical Study Report (CSR), including the appendices that contain patient data listings, summary tables, and datasets. Section 16 of the E3 guideline, which describes the CSR appendices, is where the submission datasets that programmers produce ultimately reside.

Practical example: When you produce an SDTM dataset package with a Define-XML file, and your biostatistician produces ADaM datasets with corresponding analysis results, the resulting tables and listings populate Sections 14 and 16 of the CSR as defined by ICH E3, while the datasets themselves accompany the CSR in the submission. The Analysis Results Metadata (ARM) extension to Define-XML, published by CDISC, is explicitly designed to link the analysis results in Section 14 of the CSR back to the ADaM datasets and programs that generated them. ARM, in this sense, is an implementation of the E3 traceability expectation.

3.3 ICH E8: General Considerations for Clinical Trials

ICH E8(R1) provides the overarching framework for clinical trial design, including the principle that data collection should be fit for purpose. E8(R1) introduced the concept of “quality by design” in clinical trials, encouraging sponsors to identify which data elements are critical to the trial’s objectives and to design collection processes accordingly. This philosophy influenced CDISC’s development of CDASH, which aims to standardize data collection at the CRF level to reduce downstream variability.

3.4 ICH E9: Statistical Principles for Clinical Trials

ICH E9, together with its E9(R1) addendum on estimands and sensitivity analysis, is the guidance that biostatisticians and statistical programmers should know intimately. E9 establishes the statistical framework for confirmatory clinical trials, including principles around analysis populations, multiplicity adjustments, and interim analyses; E9(R1) adds the estimand framework. For programmers, E9 drives the structure of ADaM datasets, particularly the requirement for clear population flags (SAFFL, ITTFL, EFFFL, RANDFL), analysis-ready endpoints, and well-defined baseline values. The E9(R1) addendum has further sharpened expectations around how intercurrent events are handled in analysis datasets.

Practical example: When you create ADSL with population flags derived from protocol-defined criteria, or when you derive AVAL and BASE in an ADaM BDS dataset, you are implementing E9 principles. The estimand framework from E9(R1) also affects how DTYPE (derivation type) is used in ADaM, since the handling of missing data and intercurrent events must be transparent in the analysis dataset.
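The AVAL/BASE derivation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference ADaM implementation: the sample records, the baseline rule (last non-missing value on or before first dose), and the helper name derive_bds are all assumptions made for the example. A real study takes its baseline definition from the SAP.

```python
from datetime import date

# Hypothetical analysis records for one subject and one parameter.
records = [
    {"USUBJID": "S-001", "PARAMCD": "SYSBP", "ADT": date(2024, 6, 1), "AVAL": 130.0},
    {"USUBJID": "S-001", "PARAMCD": "SYSBP", "ADT": date(2024, 6, 15), "AVAL": 128.0},
    {"USUBJID": "S-001", "PARAMCD": "SYSBP", "ADT": date(2024, 7, 15), "AVAL": 120.0},
]
# First dose date per subject; in practice this would come from ADSL.TRTSDT.
first_dose = {"S-001": date(2024, 6, 15)}


def derive_bds(rows, trtsdt):
    """Return copies of rows with ABLFL, BASE and CHG added.

    Assumed rule: baseline is the last non-missing AVAL on or before first dose.
    """
    rows = sorted(rows, key=lambda r: (r["USUBJID"], r["PARAMCD"], r["ADT"]))

    # Pass 1: pick the baseline record per subject and parameter.
    baseline = {}
    for r in rows:
        if r["AVAL"] is not None and r["ADT"] <= trtsdt[r["USUBJID"]]:
            baseline[(r["USUBJID"], r["PARAMCD"])] = r  # later dates overwrite earlier ones

    # Pass 2: attach BASE to every record and CHG to post-baseline records.
    out = []
    for r in rows:
        base_row = baseline.get((r["USUBJID"], r["PARAMCD"]))
        base = base_row["AVAL"] if base_row else None
        post_baseline = r["ADT"] > trtsdt[r["USUBJID"]]
        can_compute = post_baseline and base is not None and r["AVAL"] is not None
        out.append({
            **r,
            "ABLFL": "Y" if r is base_row else "",
            "BASE": base,
            "CHG": r["AVAL"] - base if can_compute else None,
        })
    return out


adbds = derive_bds(records, first_dose)
```

Keeping the baseline rule in one clearly named place is the point of the sketch: when the estimand or the SAP changes how baseline is defined, the change is visible in the program and traceable in the dataset.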

3.5 Other Notable E-Series Guidelines

Guideline	Title	Programmer Relevance
E2A/E2B(R3)	Pharmacovigilance and Individual Case Safety Reports	Drives AE collection standards; E2B(R3) XML format used for safety reporting interconnects with SDTM AE domain structure
E14/E14(R3)	Clinical Evaluation of QT/QTc Interval Prolongation	Directly relevant to ECG/EG domain in SDTM and ADEG in ADaM; defines the Thorough QT (TQT) study design and analysis expectations
E17	Multi-Regional Clinical Trials	Impacts pooling strategies and how regional subgroup analyses are structured in ADaM, particularly for global submissions to multiple regulators simultaneously

4. The M-Series: Multidisciplinary Guidelines That Define Submission Infrastructure

If the E-series defines what data must demonstrate, the M-series defines how data is transmitted and packaged for regulatory review. For statistical programmers, M-series guidelines are the bridge between producing standards-compliant datasets and successfully delivering them to regulators.

4.1 ICH M2: Electronic Standards for the Transfer of Regulatory Information

M2 established the principle that regulatory submissions should use electronic standards for data exchange. While M2 itself is a high-level framework document, it set the stage for more specific guidance on electronic submission formats that followed. M2 is the conceptual ancestor of requirements like the FDA’s Study Data Standards Resources (SDSR) and the EMA’s eSubmission gateway specifications.

4.2 ICH M4: The Common Technical Document (CTD)

M4 defines the Common Technical Document, the standardized format for organizing a marketing authorization application. The CTD is organized into five modules:

Module	Content	Programmer Touchpoint
Module 1	Administrative and prescribing information (region-specific)	Not standardized by ICH; varies by regulator
Module 2	Summaries (quality, nonclinical, clinical overview and summary)	Clinical Summary (2.7) relies on analysis outputs programmers produce
Module 3	Quality (CMC data)	Minimal direct programmer involvement
Module 4	Nonclinical study reports	SEND datasets reside here for nonclinical studies
Module 5	Clinical study reports and datasets	SDTM, ADaM, Define-XML, and analysis outputs reside here

Practical example: When a regulatory affairs team assembles an NDA or MAA, the datasets you produce as a statistical programmer are placed in Module 5 of the CTD structure. Your SDTM and ADaM datasets, along with Define-XML, reviewer’s guides, and analysis programs, all slot into specific locations within this architecture. Understanding Module 5 helps you understand why certain file naming conventions, folder structures, and metadata requirements exist.
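To make the folder convention concrete, here is a small Python sketch that builds the expected location for each class of study data file. The folder names follow the layout described in the FDA Study Data Technical Conformance Guide, but treat them as an assumption to verify against the current TCG; the study identifier and the function name dataset_folder are hypothetical.

```python
from pathlib import PurePosixPath

# Assumed layout: <module>/datasets/<study-id>/<subfolder>.
LAYOUT = {
    "sdtm": ("m5", "tabulations/sdtm"),
    "adam_datasets": ("m5", "analysis/adam/datasets"),
    "adam_programs": ("m5", "analysis/adam/programs"),
    "send": ("m4", "tabulations/send"),  # nonclinical data sits under Module 4
}


def dataset_folder(study_id, kind):
    """Return the conventional eCTD folder for one class of study data files."""
    module, subfolder = LAYOUT[kind]
    return PurePosixPath(module, "datasets", study_id, subfolder)


# Hypothetical study identifier, for illustration only.
sdtm_dir = dataset_folder("study001", "sdtm")
define_path = sdtm_dir / "define.xml"
```

Here str(define_path) evaluates to "m5/datasets/study001/tabulations/sdtm/define.xml", which shows why Define-XML travels in the same folder as the datasets it describes.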

4.3 ICH M8: Electronic Common Technical Document (eCTD)

M8 takes the CTD concept digital. The eCTD specification defines the electronic format, folder structure, and XML backbone for submitting the CTD electronically. The eCTD is the actual vehicle that carries your datasets to regulators. FDA, EMA, PMDA, Health Canada, and many other regulators now mandate eCTD submissions for marketing applications. The current eCTD v4.0 specification aligns across ICH regions, although implementation timelines vary.

4.4 ICH M11: Clinical Electronic Structured Harmonised Protocol (CeSHarP)

M11 is one of the newest ICH guidelines and represents a significant step toward upstream data standardization. It defines a structured, machine-readable format for clinical trial protocols. For statistical programmers, M11 matters because a structured protocol can automate downstream processes — including CRF design (CDASH), data collection standards, and even preliminary SDTM mapping specifications. As M11 adoption matures, the manual interpretation steps between protocol and database design may become increasingly automated.

4.5 Other M-Series Guidelines of Note

Guideline	Title	Relevance
M1 (MedDRA)	Medical Dictionary for Regulatory Activities	The controlled medical terminology used to code adverse events, medical history, and indications in SDTM domains such as AE and MH. Programmers use MedDRA-coded terms extensively in safety analyses.
M5	Data Elements and Standards for Drug Dictionaries	Addresses data elements for identifying medicinal products. In practice, concomitant medications in the SDTM CM domain are usually coded with WHODrug, which is maintained outside ICH by the Uppsala Monitoring Centre.
M10	Bioanalytical Method Validation	Relevant to PK/PD data that programmers handle in SDTM PC/PP domains.

5. From ICH Guidelines to Regulatory Implementation

A critical concept for statistical programmers to internalize is that ICH guidelines are not law. They are harmonized recommendations. Each regulatory authority must independently adopt, transpose, or reference ICH guidelines into its own legal and regulatory framework before they have binding force. This is why the same ICH guideline can produce different practical requirements depending on which regulator you are submitting to.

5.1 FDA (United States)

The FDA is the most prescriptive regulator regarding CDISC data standards for study data submissions. Key implementation mechanisms include:

21 CFR Part 11: Governs electronic records and electronic signatures, establishing the legal basis for accepting electronic datasets.

Study Data Technical Conformance Guide (TCG): Published by the FDA’s Center for Drug Evaluation and Research (CDER) and Center for Biologics Evaluation and Research (CBER), the TCG provides detailed technical requirements for study data submissions, including which CDISC standard versions are supported, file format requirements, dataset size limits, and Define-XML expectations. The TCG is the single most important reference document for a statistical programmer submitting to the FDA.

Study Data Standards Resources (SDSR): The FDA’s catalog of supported CDISC standards and terminology versions. The SDSR is updated periodically and specifies which versions of SDTMIG, ADaMIG, SEND, controlled terminology, and other standards the FDA currently accepts.

FDA Data Standards Catalog: Lists the data standards and versions that FDA supports for regulatory submissions, together with the dates on which support begins and ends. Importantly, SDTM and ADaM became a requirement (not just a recommendation) for NDA and BLA submissions containing studies that started after December 17, 2016.

Practical example: If you are submitting SEND datasets for a 2-year carcinogenicity study to the FDA, you must consult both the SDSR (to confirm the supported SENDIG version) and the TCG (for technical formatting requirements). The FDA has required SEND for certain nonclinical study types since 2017, making it one of the first regulators to mandate nonclinical data standardization.
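The date-based requirement described in this section lends itself to a simple check. The sketch below is a simplification under stated assumptions: the cutoff dates reflect the FDA's published implementation dates as commonly cited (studies starting after December 17, 2016 for NDAs, BLAs, and ANDAs; one year later for commercial INDs), the function name is hypothetical, and real decisions must also account for the specific standard version and any exemptions listed in the Data Standards Catalog.

```python
from datetime import date

# Assumed cutoffs; confirm against the current FDA guidance and Data Standards Catalog.
REQUIRED_AFTER = {
    "NDA": date(2016, 12, 17),
    "BLA": date(2016, 12, 17),
    "ANDA": date(2016, 12, 17),
    "COMMERCIAL_IND": date(2017, 12, 17),
}


def standards_required(submission_type, study_start):
    """True when a study started after the cutoff for its submission type."""
    return study_start > REQUIRED_AFTER[submission_type]


# Hypothetical study start dates, for illustration only.
legacy_study = standards_required("NDA", date(2015, 3, 1))
recent_study = standards_required("NDA", date(2019, 9, 30))
```

The comparison is on the study start date, not the submission date, which is why a single application can mix legacy-format studies with fully standardized ones.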

5.2 PMDA (Japan)

PMDA adopts ICH guidelines through the Japanese regulatory framework and has additional specific technical requirements:

PMDA Technical Conformance Guide: Similar in concept to the FDA’s TCG but with Japan-specific requirements. PMDA requires CDISC standards for new drug applications and has its own expectations around study data validation rules, Japanese language handling, and specific dataset structures.

PMDA-specific validation rules: PMDA publishes its own set of validation rules for SDTM and ADaM datasets, which differ in content and severity from the FDA’s validator rules. Both rule sets are commonly run through third-party tools such as Pinnacle 21 (now part of Certara). Programmers working on submissions to both agencies must run both the FDA and the PMDA checks.

Japanese language data: PMDA requires that certain data elements (such as site names and investigator names) be provided in Japanese, which creates encoding and dataset structure considerations that do not apply to FDA submissions.

Practical example: A global Phase III trial submitting simultaneously to FDA and PMDA may require two sets of validation outputs — one using FDA business rules and one using PMDA-specific rules. The SDTM datasets may be identical, but the validation reports and any required remediation may differ.
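The idea of one dataset producing two different validation reports can be sketched as follows. Everything specific here is hypothetical: the rule IDs (X001, X002), the rule logic, and the severities are invented for illustration and do not correspond to any published FDA or PMDA rule. The only point is the structure, where shared checks carry regulator-specific severities.

```python
# Hypothetical DM records; the second one has two problems.
dm = [
    {"USUBJID": "S-001", "AGE": 54, "AGEU": "YEARS", "RFSTDTC": "2024-06-15"},
    {"USUBJID": "S-002", "AGE": 61, "AGEU": "", "RFSTDTC": ""},
]

# Rule id -> (description, predicate that returns True when the row passes).
RULES = {
    "X001": ("AGEU populated when AGE is populated",
             lambda r: r["AGE"] is None or bool(r["AGEU"])),
    "X002": ("RFSTDTC populated", lambda r: bool(r["RFSTDTC"])),
}

# Invented severities: the same rule can matter more to one regulator.
SEVERITY = {
    "FDA": {"X001": "Error", "X002": "Warning"},
    "PMDA": {"X001": "Reject", "X002": "Warning"},
}


def validate(rows, regulator):
    """Run every rule configured for a regulator and list the failures."""
    findings = []
    for rule_id, severity in SEVERITY[regulator].items():
        description, passes = RULES[rule_id]
        for row in rows:
            if not passes(row):
                findings.append((rule_id, severity, row["USUBJID"], description))
    return findings


fda_report = validate(dm, "FDA")
pmda_report = validate(dm, "PMDA")
```

Both reports flag the same two records, but the severity attached to X001 differs, so the remediation priority differs even though the SDTM data is identical.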

5.3 EMA (European Union)

The EMA has taken a more incremental approach to CDISC adoption:

Current state: The EMA has not yet mandated CDISC standards for all submissions in the same way the FDA has. However, CDISC standards are increasingly expected and accepted. The EMA’s Clinical Data publication policy and its evolving eSubmission requirements signal a trajectory toward greater standardization.

EU Clinical Trials Regulation (CTR) No 536/2014: The CTR became applicable on 31 January 2022, and since 31 January 2023 all new clinical trial applications in the EU must be submitted through the Clinical Trials Information System (CTIS). The CTR introduces structured data requirements for clinical trial applications. While it does not directly mandate CDISC, its structured data expectations align with the direction of standardization.

EMA/HMA collaboration: The Heads of Medicines Agencies (HMA) and EMA have signaled future alignment with international data standards, including potential mandating of CDISC standards for centralized marketing authorization procedures.

Practical example: A sponsor submitting an MAA to the EMA may currently submit datasets in CDISC format voluntarily. However, if the same datasets are being submitted to the FDA, they are already in CDISC format, making dual submission straightforward. The gap is primarily in enforcement and validation expectations, not in the standards themselves.

5.4 NMPA (China)

China’s National Medical Products Administration (NMPA) joined ICH as a regulatory member in 2017 and has been progressively aligning with ICH guidelines:

CDISC adoption: NMPA has announced plans to adopt CDISC standards and has published draft guidance on study data requirements. The timeline for full mandatory adoption is still evolving, but the trajectory is clear.

China-specific considerations: Similar to PMDA, NMPA may require Chinese language elements in certain data fields, and local regulatory requirements may add supplementary data expectations beyond what ICH guidelines specify.

5.5 The Broader Global Landscape

Country/Region	Regulatory Authority	ICH Status	CDISC Adoption Status
Canada	Health Canada	Regulatory Member	CDISC accepted; moving toward expected for new submissions; aligns closely with FDA technical requirements
United Kingdom	MHRA	Regulatory Member (post-Brexit)	CDISC accepted and increasingly expected; MHRA collaborates with FDA and EMA on data standards; post-Brexit regulatory independence adds nuance
Australia	TGA	Observer	CDISC accepted but not mandated; TGA recognizes FDA/EMA assessments, so CDISC-formatted data submitted to those agencies often flows through
Switzerland	Swissmedic	Regulatory Member	Aligns with EU/ICH standards; CDISC accepted for submissions
Brazil	ANVISA	Regulatory Member	Early-stage CDISC adoption; ICH guideline implementation underway
South Korea	MFDS	Regulatory Member	Active CDISC adoption; published guidance on CDISC-based submissions
India	CDSCO	Observer	Limited CDISC adoption; relies on national guidelines with some ICH alignment
South Africa	SAHPRA	Observer	ICH guidelines referenced but CDISC not yet mandated

Key takeaway for programmers: For global submissions, always design your data packages to the most stringent regulator’s requirements (typically the FDA), then layer on any additional region-specific requirements (e.g., PMDA validation, Japanese language fields). This “design to the highest standard” approach minimizes rework across multi-regional filings.

6. Where CDISC Fits: The Implementation Layer

If ICH defines what must be standardized and why, CDISC defines how. The Clinical Data Interchange Standards Consortium is a global, non-profit organization that develops the implementation-level standards used to structure, format, and transmit clinical trial data. CDISC standards translate the high-level principles of ICH guidelines into concrete, machine-readable specifications that programmers implement daily.

Most programmers know the core trio well: SDTM (Study Data Tabulation Model) for tabulating collected data, ADaM (Analysis Data Model) for structuring analysis-ready datasets, and Define-XML for providing metadata about both. These standards, along with Controlled Terminology and the Therapeutic Area Standards, form the backbone of CDISC’s portfolio. Rather than revisiting these in depth, this article focuses on the less-discussed but equally important CDISC standards that programmers should understand.

6.1 CDASH: Clinical Data Acquisition Standards Harmonization

CDASH operates upstream of SDTM. While SDTM defines how data should be structured after collection, CDASH defines how data should be collected at the CRF (Case Report Form) level. CDASH provides recommended fields, question text, and data collection conventions for common clinical assessments, ensuring that data enters the pipeline in a form that maps cleanly to SDTM.

Why it matters for programmers: When CDASH is implemented correctly, the SDTM mapping process is significantly simplified. Conversely, when CRFs are designed without CDASH alignment, programmers face complex, study-specific mappings that increase both effort and risk.

CDASH to SDTM Mapping Example: Vital Signs

Consider a blood pressure measurement collected on a CRF. Under CDASH:

CDASH CRF Field	CDASH Variable	Maps to SDTM	Example Value
Vital Sign Test	VSTEST	VS.VSTEST (with VS.VSTESTCD = SYSBP / DIABP)	Systolic Blood Pressure / Diastolic Blood Pressure
Result	VSORRES	VS.VSORRES	120 / 80
Unit	VSORRESU	VS.VSORRESU	mmHg
Date of Assessment	VSDAT	VS.VSDTC	2024-06-15
Position	VSPOS	VS.VSPOS	SITTING
Location	VSLOC	VS.VSLOC	ARM

When a CRF follows CDASH conventions, the mapping from CRF to SDTM VS domain is nearly one-to-one. Variable names align, value-level conventions match, and the programmer’s task shifts from interpretation to transformation. Without CDASH, the CRF might collect blood pressure as a single field (“120/80 mmHg sitting, left arm”), forcing the programmer to parse, split, and derive multiple SDTM variables from free text.
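The contrast between the two situations can be sketched in Python. The first function handles a CDASH-aligned record, assuming the CRF collects the date in a CDASH-style VSDAT field that is already in ISO 8601 form; the second has to parse the free-text variant with a regular expression. The record layouts, the test-code lookup, and both function names are assumptions made for illustration.

```python
import re

# Small lookup for this example only; real studies use CDISC Controlled Terminology.
TESTCD = {"Systolic Blood Pressure": "SYSBP", "Diastolic Blood Pressure": "DIABP"}


def map_cdash_vs(crf, studyid, usubjid, seq):
    """CDASH-aligned input: mapping is close to a rename."""
    return {
        "STUDYID": studyid, "DOMAIN": "VS", "USUBJID": usubjid, "VSSEQ": seq,
        "VSTESTCD": TESTCD[crf["VSTEST"]], "VSTEST": crf["VSTEST"],
        "VSORRES": crf["VSORRES"], "VSORRESU": crf["VSORRESU"],
        "VSPOS": crf["VSPOS"], "VSLOC": crf["VSLOC"],
        "VSDTC": crf["VSDAT"],  # assumed to be ISO 8601 already
    }


# Pattern for text shaped like "120/80 mmHg sitting, left arm".
FREE_TEXT = re.compile(
    r"(\d+)\s*/\s*(\d+)\s*(\w+)\s+(\w+),\s*(left|right)?\s*(\w+)", re.IGNORECASE
)


def parse_free_text(text):
    """Non-CDASH input: every SDTM variable must be recovered from one string."""
    match = FREE_TEXT.match(text.strip())
    if match is None:
        raise ValueError(f"Unrecognized blood pressure text: {text!r}")
    systolic, diastolic, unit, position, laterality, location = match.groups()
    return {
        "SYSBP": systolic, "DIABP": diastolic, "VSORRESU": unit,
        "VSPOS": position.upper(), "VSLAT": (laterality or "").upper(),
        "VSLOC": location.upper(),
    }


crf_record = {"VSTEST": "Systolic Blood Pressure", "VSORRES": "120", "VSORRESU": "mmHg",
              "VSPOS": "SITTING", "VSLOC": "ARM", "VSDAT": "2024-06-15"}
vs_row = map_cdash_vs(crf_record, "STUDY01", "S-001", 1)
parsed = parse_free_text("120/80 mmHg sitting, left arm")
```

The parser works only for text that happens to match the pattern; any site that writes the value differently raises an error, which is exactly the study-specific risk that CDASH alignment avoids.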

6.2 SEND: Standard for Exchange of Nonclinical Data

SEND is the nonclinical counterpart to SDTM. It provides a standardized structure for submitting nonclinical (animal) study data, including toxicology, carcinogenicity, and pharmacokinetic studies. SEND follows the same foundational model as SDTM but with domains tailored to nonclinical endpoints.

FDA mandate: The FDA has required SEND for certain nonclinical study types since 2017. The current SENDIG (SEND Implementation Guide) versions cover general toxicology studies, carcinogenicity studies, and cardiovascular/respiratory safety pharmacology studies.

SEND Domain Example: Carcinogenicity Study

A 2-year rat carcinogenicity study submitted in SEND format would include domains such as:

SEND Domain	Code	Content	Parallels SDTM
Body Weight	BW	Individual animal body weights over study duration	VS (Vital Signs) conceptually
Body Weight Gain	BG	Calculated weight changes between timepoints	Derived, similar to ADaM derivations
Clinical Observations	CL	In-life clinical signs (palpable masses, behavioral changes)	CE (Clinical Events)
Macroscopic Findings	MA	Gross pathology findings at necropsy	No direct SDTM parallel
Microscopic Findings	MI	Histopathology results for examined tissues	MI (Microscopic Findings) in the SDTMIG
Organ Measurements	OM	Organ weights at necropsy	No direct SDTM parallel
Tumor Findings	TF	Tumor incidence and classification	Conceptually parallel to Oncology TAUG TU/TR domains

SEND shares SDTM’s foundational structure — Findings, Events, and Interventions observation classes; standard variables like --TESTCD, --ORRES, --STRESC; and controlled terminology. Programmers experienced with SDTM will find SEND structurally familiar, though the scientific content and validation rules reflect the nonclinical context.
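Because SEND reuses the Findings structure, a derivation such as Body Weight Gain looks much like any SDTM-style transformation. The sketch below derives BG-style records from consecutive BW records per animal. The sample weights, the function name, and the use of BGTESTCD = "BWGAIN" with BGDY and BGENDY marking the interval are assumptions to check against the SENDIG version in use.

```python
# Hypothetical body weights (grams) for one animal at three study days.
bw = [
    {"USUBJID": "R-101", "BWDY": 1, "BWSTRESN": 210.0, "BWSTRESU": "g"},
    {"USUBJID": "R-101", "BWDY": 8, "BWSTRESN": 225.0, "BWSTRESU": "g"},
    {"USUBJID": "R-101", "BWDY": 15, "BWSTRESN": 237.5, "BWSTRESU": "g"},
]


def body_weight_gain(bw_rows):
    """Derive one gain record per animal for each pair of consecutive weights."""
    by_animal = {}
    for row in sorted(bw_rows, key=lambda r: (r["USUBJID"], r["BWDY"])):
        by_animal.setdefault(row["USUBJID"], []).append(row)

    bg_rows = []
    for usubjid, rows in by_animal.items():
        for previous, current in zip(rows, rows[1:]):
            bg_rows.append({
                "DOMAIN": "BG", "USUBJID": usubjid, "BGTESTCD": "BWGAIN",
                "BGSTRESN": current["BWSTRESN"] - previous["BWSTRESN"],
                "BGSTRESU": current["BWSTRESU"],
                "BGDY": previous["BWDY"],    # interval start
                "BGENDY": current["BWDY"],   # interval end
            })
    return bg_rows


bg = body_weight_gain(bw)
```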

6.3 Therapeutic Area User Guides (TAUGs)

TAUGs provide disease-specific extensions to the base SDTM and CDASH standards. They add domains, variables, and conventions that address the unique data requirements of particular therapeutic areas. TAUGs do not replace SDTMIG; they layer on top of it.

Why TAUGs matter: The base SDTMIG provides a general-purpose framework, but many therapeutic areas collect data that does not map neatly to the general-purpose domains. Oncology trials, for example, require tumor measurement tracking, response assessments, and specific lesion-level data structures. The TU, TR, and RS domains developed for this purpose have since been incorporated into the SDTMIG itself, and the oncology TAUGs show how to apply them to a given indication and response criterion. TAUGs fill this gap.

Available TAUGs (Selected)

Therapeutic Area	Key Additions	Example Use Case
Oncology	TU (Tumor Identification), TR (Tumor Results), RS (Disease Response) domains	RECIST-based tumor assessment data: TU identifies each lesion, TR captures measurement at each visit, RS records investigator and independent assessments (CR, PR, SD, PD)
Alzheimer’s Disease	Cognitive assessment domains, biomarker data structures	Structured collection of ADAS-Cog, CDR-SB, and amyloid PET imaging endpoints
Cardiovascular	ECG interval data structures, cardiac event adjudication	Thorough QT study data with interval-level ECG measurements mapped to EG domain per E14 requirements
Vaccines	Immunogenicity domains (IS), reactogenicity data	Antibody titer data in IS domain, solicited adverse event collection aligned with vaccine-specific CRF conventions
Pain	Pain assessment scales, rescue medication tracking	VAS/NRS pain scores collected on specific schedules with standardized scoring variables
Diabetes	HbA1c, glucose monitoring, hypoglycemic event classification	Continuous glucose monitoring (CGM) data structured with time-series measurement conventions

TAUG Practical Example: Oncology Tumor Assessment

Consider a solid tumor trial using RECIST 1.1 criteria. Without the Oncology TAUG, a programmer would need to improvise how to represent lesion-level data in SDTM. The TAUG standardizes this:

TU (Tumor/Lesion Identification): Each target and non-target lesion is identified at baseline, assigned a unique TULNKID, and classified by location (TULOC) and method of measurement (TUMETHOD). For example, TULNKID = “TL01”, TULOC = “LIVER”, TUMETHOD = “CT SCAN”.

TR (Tumor/Lesion Results): At each assessment visit, measurements are recorded for each identified lesion. TRLNKID links back to the TU record, TRORRES contains the measurement value, and TRORRESU contains the unit. For target lesions, this might be TRTESTCD = “DIAMETER”, TRORRES = “25”, TRORRESU = “mm”.

RS (Disease Response): Overall response assessments are captured here. RSTESTCD = “OVRLRESP”, RSORRES = “PARTIAL RESPONSE”, with RSEVAL indicating who made the assessment (e.g., “INVESTIGATOR” or “INDEPENDENT ASSESSOR”).
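The TU-to-TR link described above can be exercised with a short Python sketch that checks every TR record against its TU parent and then computes the sum of diameters per visit. The sample lesions, visit names, and function names are hypothetical, and the calculation stops at percent change from baseline; it does not attempt to implement RECIST 1.1 response classification.

```python
tu = [
    {"USUBJID": "S-001", "TULNKID": "TL01", "TULOC": "LIVER", "TUMETHOD": "CT SCAN"},
    {"USUBJID": "S-001", "TULNKID": "TL02", "TULOC": "LUNG", "TUMETHOD": "CT SCAN"},
]
tr = [
    {"USUBJID": "S-001", "TRLNKID": "TL01", "VISIT": "SCREENING",
     "TRTESTCD": "DIAMETER", "TRORRES": "25", "TRORRESU": "mm"},
    {"USUBJID": "S-001", "TRLNKID": "TL02", "VISIT": "SCREENING",
     "TRTESTCD": "DIAMETER", "TRORRES": "15", "TRORRESU": "mm"},
    {"USUBJID": "S-001", "TRLNKID": "TL01", "VISIT": "WEEK 8",
     "TRTESTCD": "DIAMETER", "TRORRES": "15", "TRORRESU": "mm"},
    {"USUBJID": "S-001", "TRLNKID": "TL02", "VISIT": "WEEK 8",
     "TRTESTCD": "DIAMETER", "TRORRES": "9", "TRORRESU": "mm"},
]


def sum_of_diameters(tu_rows, tr_rows):
    """Total lesion diameter per subject and visit, after checking the TU link."""
    known = {(r["USUBJID"], r["TULNKID"]) for r in tu_rows}
    totals = {}
    for r in tr_rows:
        link = (r["USUBJID"], r["TRLNKID"])
        if link not in known:
            raise ValueError(f"TR record has no matching TU record: {link}")
        if r["TRTESTCD"] == "DIAMETER":
            key = (r["USUBJID"], r["VISIT"])
            totals[key] = totals.get(key, 0.0) + float(r["TRORRES"])
    return totals


def percent_change(totals, usubjid, baseline_visit, visit):
    """Percent change in the sum of diameters relative to the baseline visit."""
    base = totals[(usubjid, baseline_visit)]
    return 100.0 * (totals[(usubjid, visit)] - base) / base


totals = sum_of_diameters(tu, tr)
change = percent_change(totals, "S-001", "SCREENING", "WEEK 8")
```

The link check is the part that depends on the standardized structure: because every measurement carries a TRLNKID, an orphaned measurement is detectable before any analysis is run.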

When to consult a TAUG: At the start of any new study in a covered therapeutic area, before CRF design and SDTM mapping specification work begins. TAUGs inform both upstream CRF design (via CDASH extensions) and downstream SDTM mapping decisions.

6.4 The Broader CDISC Ecosystem

Beyond the standards discussed above, several additional CDISC components merit brief mention:

Controlled Terminology (CT): The standardized code lists used across all CDISC standards. CT is updated quarterly, and submissions must reference specific CT versions. Programmers must ensure that coded values in SDTM, SEND, and ADaM datasets align with the CT version specified in the submission.

Dataset-JSON: CDISC’s next-generation transport format, intended to replace the legacy SAS Version 5 Transport (XPT) format. Dataset-JSON offers advantages including support for longer variable names, better handling of special characters, and modern tooling compatibility. FDA has evaluated Dataset-JSON in a pilot with CDISC and PHUSE and has signaled interest in it as a future exchange format, but XPT remains the format to submit until the FDA’s Data Standards Catalog lists Dataset-JSON as supported.

Analysis Results Metadata (ARM): A standard for describing the relationship between analysis results (tables, figures, listings) and the analysis datasets that produced them. ARM provides traceability from the statistical output back to the ADaM datasets, programs, and parameters used. This directly serves the ICH E3 requirement for transparent, reproducible clinical study reporting.

Define-XML: The metadata standard that describes the contents and structure of submission datasets. Define-XML provides variable-level documentation, value-level metadata, and computational algorithms, enabling regulatory reviewers to understand datasets without external documentation.

SDTM, ADaM, and associated IGs: The core standards most programmers work with daily. SDTM provides the tabulation model for collected data; ADaM provides the analysis-ready model. Both are supported by Implementation Guides (SDTMIG, ADaMIG) and supplementary guidance like the Metadata Submission Guidelines (MSG).
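Two of the points above, Controlled Terminology alignment and the constraints of the XPT format, translate naturally into pre-submission checks. The sketch below is illustrative only. The codelist is a small hand-typed subset, not a CDISC CT release, and the function names are hypothetical. The XPT limits encoded here (variable names of at most 8 characters, labels of at most 40, character lengths of at most 200) are the commonly cited SAS Version 5 Transport limits and should be confirmed against the regulator's technical conformance guide.

```python
import re

# Hand-typed subset for illustration; load the real codelist from the CT version in use.
CODELISTS = {"VSPOS": {"SITTING", "STANDING", "SUPINE"}}

# Assumed XPT v5 naming rule: letter or underscore first, at most 8 characters.
XPT_NAME = re.compile(r"^[A-Z_][A-Z0-9_]{0,7}$")


def check_terminology(rows, codelists):
    """List (row index, variable, value) for values outside their codelist."""
    issues = []
    for index, row in enumerate(rows):
        for variable, allowed in codelists.items():
            value = row.get(variable)
            if value and value not in allowed:
                issues.append((index, variable, value))
    return issues


def check_xpt_limits(variables):
    """List (variable name, problem) for metadata that XPT v5 cannot hold."""
    issues = []
    for var in variables:
        if not XPT_NAME.match(var["name"]):
            issues.append((var["name"], "name invalid for XPT v5"))
        if len(var["label"]) > 40:
            issues.append((var["name"], "label longer than 40 characters"))
        if var["type"] == "char" and var["length"] > 200:
            issues.append((var["name"], "character length over 200"))
    return issues


vs_rows = [{"VSPOS": "SITTING"}, {"VSPOS": "Seated"}, {"VSPOS": ""}]
metadata = [
    {"name": "VSPOS", "label": "Vital Signs Position of Subject", "type": "char", "length": 8},
    {"name": "VSCOMMENTLONG", "label": "Comment", "type": "char", "length": 500},
]
ct_issues = check_terminology(vs_rows, CODELISTS)
xpt_issues = check_xpt_limits(metadata)
```

A transport format without the 8-character and 200-character limits would make the second check unnecessary, which is one practical reason the move away from XPT matters to programmers.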

7. Putting It All Together: The Full Standards Cascade

The entire framework, from ICH guideline to programmer implementation, can be understood as a cascade:

Layer	Who	What	Example
Harmonization	ICH	E-series and M-series guidelines establishing principles	ICH E6 (GCP) requires reliable data; ICH M4 (CTD) defines where datasets reside in submissions
Regulation	FDA, EMA, PMDA, etc.	Adoption of ICH guidelines into binding regulatory requirements	FDA binding guidance on standardized study data (required for studies starting after December 17, 2016); PMDA Technical Conformance Guide
Standards Development	CDISC	Implementation-level specifications for data structure and exchange	SDTMIG v3.4, ADaMIG v1.3, SENDIG v3.1, CDASH v2.2, TAUGs
Technical Execution	Statistical Programmers, Data Managers	Building datasets, running validations, assembling submission packages	Creating VS domain from CDASH CRF, deriving ADAE from SDTM AE, validating with Pinnacle 21

Every task a statistical programmer performs sits at the bottom of this cascade, but understanding the layers above provides essential context. When a new CDISC standard is published, its adoption timeline depends on where regulators choose to place it in their own implementation frameworks. When an ICH guideline is revised (such as E6 R3 or E9 R1), the downstream effects eventually reach the programmer’s desk in the form of new CDISC standard versions, updated validation rules, or revised regulatory guidance.

8. Practical Guidance for Statistical Programmers

8.1 ICH Guidelines to Bookmark

Guideline	Why You Need It
ICH E3	Defines the CSR structure that your datasets and outputs support
ICH E6(R2/R3)	GCP requirements that underpin data quality and traceability expectations
ICH E9(R1)	Statistical principles and the estimand framework that drive ADaM design
ICH M4/M8	CTD/eCTD structure — understand where your datasets physically reside in a submission
ICH M11	Structured protocols — emerging standard that will reshape upstream processes

8.2 How to Use TAUGs Effectively

TAUGs should be consulted at the beginning of a new study’s data standards planning, not after CRFs have been designed and data has been collected. The most common mistake is treating TAUGs as an afterthought during SDTM mapping, when their primary value lies in informing CRF design (via CDASH alignment) and establishing domain structures before data flows in.

Step 1: Identify whether a TAUG exists for your therapeutic area at the CDISC website.

Step 2: Review the TAUG’s domain and variable additions alongside the protocol’s endpoints.

Step 3: Incorporate TAUG-specified domains into your SDTM mapping specifications and annotated CRFs.

Step 4: Ensure controlled terminology versions align with both the TAUG requirements and the target regulator’s list of supported standards (for the FDA, the Data Standards Catalog).

8.3 Navigating Multi-Regional Submissions

When submitting to multiple regulators, the practical approach is:

Design to the strictest standard: Typically the FDA, which has the most detailed CDISC requirements and validation expectations.

Layer on regional requirements: PMDA-specific validation rules, Japanese language fields, and any PMDA-unique technical specifications.

Monitor evolving landscapes: EMA’s CDISC mandate timeline, NMPA’s adoption progress, Health Canada’s alignment updates.

Maintain a single source of truth: Use one SDTM/ADaM dataset package as the master, with region-specific supplementary materials (e.g., additional Define-XML annotations, reviewer’s guide sections) added per regulator.
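The "single source of truth" approach above can be sketched as a base package plus regional overlays. The file names for the validation reports and the PMDA supplement are hypothetical, as is the function name build_package; the sketch only shows that datasets are defined once and never forked per regulator.

```python
# One master package shared by every region.
BASE_PACKAGE = {
    "datasets": ["dm.xpt", "ae.xpt", "adsl.xpt"],
    "metadata": ["define.xml"],
    "documents": ["csdrg.pdf", "adrg.pdf"],
}

# Region-specific supplements layered on top (hypothetical file names).
REGIONAL_ADDITIONS = {
    "FDA": {"validation": ["fda_validation_report.xlsx"]},
    "PMDA": {
        "validation": ["pmda_validation_report.xlsx"],
        "documents": ["pmda_rule_explanations.pdf"],
    },
}


def build_package(region):
    """Copy the base package and append the additions for one region."""
    package = {section: list(files) for section, files in BASE_PACKAGE.items()}
    for section, files in REGIONAL_ADDITIONS.get(region, {}).items():
        package.setdefault(section, []).extend(files)
    return package


fda_package = build_package("FDA")
pmda_package = build_package("PMDA")
```

Because each regional package is built from a copy, adding a PMDA document never alters what is sent to the FDA, and a regulator with no overlay simply receives the base package.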

9. Conclusion

Data standards in clinical trials are not defined by any single organization. They emerge from a layered, collaborative architecture: ICH establishes harmonized principles, regulators adopt and enforce them within their jurisdictions, and CDISC develops the implementation-level standards that programmers execute. Understanding this architecture transforms CDISC standards from arbitrary rules into logical implementations of specific regulatory requirements.

For statistical programmers, the practical value of this understanding is significant. It explains why standards change, predicts how they will evolve, and provides the context needed to make informed decisions when guidance is ambiguous. Whether you are mapping a CDASH CRF to SDTM, building a SEND dataset for a carcinogenicity study, or consulting an Oncology TAUG for tumor assessment domains, you are operating at the implementation layer of a framework that begins with ICH and flows through the world’s regulatory authorities.

The standards exist for a reason. Knowing that reason makes you a better programmer.

References and Official Sources

1. ICH Official Website — https://www.ich.org

2. ICH E6(R2) Guideline: Good Clinical Practice — https://database.ich.org/sites/default/files/E6_R2_Addendum.pdf

3. ICH E3 Guideline: Structure and Content of Clinical Study Reports — https://database.ich.org/sites/default/files/E3_Guideline.pdf

4. ICH E9(R1) Guideline: Statistical Principles for Clinical Trials — https://database.ich.org/sites/default/files/E9-R1_Step4_Guideline_2019_1203.pdf

5. ICH M4 Guideline: The Common Technical Document — https://ich.org/page/ctd

6. ICH M8 Guideline: eCTD — https://ich.org/page/ectd

7. ICH M11 Guideline: Clinical Electronic Structured Harmonised Protocol — https://www.ich.org/page/multidisciplinary-guidelines

8. FDA Study Data Technical Conformance Guide — https://www.fda.gov/regulatory-information/search-fda-guidance-documents/study-data-technical-conformance-guide

9. FDA Study Data Standards Resources (SDSR) — https://www.fda.gov/industry/fda-data-standards-advisory-board/study-data-standards-resources

10. PMDA Technical Conformance Guide — https://www.pmda.go.jp/english/review-services/reviews/advanced-efforts/0002.html

11. EMA eSubmission Gateway — https://esubmission.ema.europa.eu

12. CDISC Official Website — https://www.cdisc.org

13. CDISC CDASH Standard — https://www.cdisc.org/standards/foundational/cdash

14. CDISC SEND Standard — https://www.cdisc.org/standards/foundational/send

15. CDISC Therapeutic Area Standards — https://www.cdisc.org/standards/therapeutic-areas

Disclaimer: This article references ICH guidelines, CDISC standards, and regulatory guidance as of the publication date. Regulatory requirements evolve; always consult the latest versions of official documents before making submission decisions. Some portions of this article are generated with the help of AI and users should be cautious and check for errors while using AI generated content.
