A working reference for biostatisticians and statistical programmers
GxP is shorthand for the family of regulations and quality guidelines that apply to any work that touches a regulated medicinal product. The G stands for Good, the x is a placeholder for the discipline (Clinical, Laboratory, Manufacturing, Pharmacovigilance, Distribution), and the P stands for Practice. For a biostatistician or statistical programmer, GxP is the framework that decides whether the analysis you ran on Tuesday can be trusted by an FDA reviewer two years later.
This piece keeps the survey of GLP, GMP, and 21 CFR Part 11 short, and spends most of its weight on Good Clinical Practice, which is the GxP that statistical programming actually lives inside. Examples are given in SAS, R, and Python because submission shops increasingly run on all three.
| GxP | Domain | Primary regulator references | Programmer relevance |
|-----|--------|------------------------------|----------------------|
| GLP | Non-clinical / preclinical labs | FDA 21 CFR Part 58, OECD Principles | Low to medium (tox, PK datasets) |
| GCP | Clinical trials in humans | ICH E6(R3), FDA 21 CFR 312, EU CTR 536/2014 | High (SDTM, ADaM, TLF, eCTD m5) |
| GMP | Drug substance and product manufacturing | FDA 21 CFR 210/211, EU GMP Vol. 4 | Low (CMC stability tables, batch data) |
| GVP | Post-market safety and pharmacovigilance | EMA GVP Modules, FDA 21 CFR 314.80 | Medium (PSUR, DSUR, signal datasets) |
| GDP / GSP | Distribution and storage | EU GDP 2013/C 343/01, WHO TRS 957 | Low |
Table 1. The GxP family at a glance, with the regulator references a programmer is most likely to encounter on a global submission.
GLP governs how non-clinical safety studies (tox, carcinogenicity, reproductive, safety pharmacology) are planned, performed, monitored, recorded, archived, and reported. It is the oldest formal GxP, born from the Industrial Bio-Test Laboratories scandal of the mid-1970s and codified in 21 CFR Part 58 (final rule 1978, effective 1979). The OECD Principles of GLP are the international counterpart, with data mutually accepted across OECD member countries under the Mutual Acceptance of Data framework.
For a programmer the GLP touchpoints are narrow but real. Datasets coming out of a GLP-compliant tox lab carry a Quality Assurance Unit (QAU) signature. If your team does PK/PD or tox-table programming on top of those data, the source data, derivation specifications, and final outputs need to be traceable to that QAU-released source. Mixing GLP source with non-GLP data in the same dataset without flagging it is a finding waiting to happen.
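As a hedged sketch of the flagging idea, assuming a simple pandas workflow and an illustrative `SRCGLP` column (a made-up name for this example, not a CDISC-defined variable):

```python
# Hypothetical sketch: flag record provenance before pooling GLP and
# non-GLP sources into one analysis dataset, so the mixed origin is
# visible to any downstream reviewer. Column names are illustrative.
import pandas as pd

glp = pd.DataFrame({"SUBJID": ["T001", "T002"], "AVAL": [1.2, 3.4]})
nonglp = pd.DataFrame({"SUBJID": ["T003"], "AVAL": [2.1]})

glp["SRCGLP"] = "Y"      # QAU-released GLP source
nonglp["SRCGLP"] = "N"   # exploratory / non-GLP source

pooled = pd.concat([glp, nonglp], ignore_index=True)
assert set(pooled["SRCGLP"]) <= {"Y", "N"}   # every record is flagged
```

The point is not the two lines of pandas but the habit: the flag travels with the data, so the GLP/non-GLP boundary survives every downstream merge.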
GMP applies to the drug substance and drug product itself, not the analyses around them. FDA codifies it in 21 CFR 210 and 211; the EU codifies it in EudraLex Volume 4. A statistical programmer rarely writes GMP-relevant code, but stability analysis, content uniformity, and batch-release decisions are statistical activities, and the data systems that hold them are GMP-validated. If you are pulled into CMC, expect tighter change control, a formal validated environment, and a different SOP set than the one you use for clinical work.
Part 11 is the FDA rule that defines when an electronic record or electronic signature is acceptable as the equivalent of paper. EU Annex 11 covers the same ground for computerised systems used in any GxP activity in the EU. Part 11 is the rule that makes a SAS or R script production-grade, because it is what forces validated environments, audit trails, role-based access, and qualified backups.
| Part 11 / Annex 11 area | What the rule wants | What it looks like in a programming shop |
|-------------------------|---------------------|------------------------------------------|
| Validation | Documented evidence the system does what it should | IQ/OQ/PQ for SAS, R, Python build; URS, FS, Test Cases under change control |
| Audit trail | Independent, computer-generated, time-stamped record of changes | Versioned repository (Git, SVN), JOBSCAN logs, RStudio Workbench audit logs |
| Access control | Authority checks; unique user IDs | AD groups for SDTM/ADaM/TLF folders; locked production after database lock |
| E-signatures | Identification of signer, meaning, link to record | Sign-off in the eTMF or DocuSign on SAP, ADaM specs, validation reports |
| Copies of records | Accurate, complete copies for inspection | Reproducible builds; archived program + log + lst + dataset for every run |
Table 2. How Part 11 / Annex 11 expectations land on the day-to-day work of a statistical programming team.
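The "accurate, complete copies" row can be made mechanical. The sketch below is a minimal illustration rather than any sponsor's SOP: it hashes the program and its inputs and outputs into a manifest archived with the run, so an inspector's copy can be verified bit-for-bit. All names and paths are hypothetical.

```python
# Hedged sketch: tie a production run to its exact program, inputs and
# outputs by SHA-256 hash, so an archived copy can be verified later.
import datetime
import hashlib
import json
import pathlib

def sha256(path: pathlib.Path) -> str:
    """Content hash of one file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def run_manifest(program: pathlib.Path,
                 inputs: list[pathlib.Path],
                 outputs: list[pathlib.Path]) -> dict:
    """Manifest to archive alongside program + log + lst + dataset."""
    return {
        "program": {"name": program.name, "sha256": sha256(program)},
        "inputs":  {p.name: sha256(p) for p in inputs},
        "outputs": {p.name: sha256(p) for p in outputs},
        "run_utc": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }

# To archive, e.g.:
# (run_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
```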
GCP is an international ethical and scientific quality standard for designing, conducting, recording, and reporting trials that involve human subjects. The current global anchor is ICH E6(R3), adopted at ICH Step 4 in January 2025. FDA implemented the previous revision (R2) through 21 CFR 312 and its 2018 guidance, and is moving toward harmonisation with R3. The EU implements GCP through the Clinical Trials Regulation (EU) No 536/2014 (which replaced Directive 2001/20/EC in 2022), with the EMA GCP inspection procedures supplying enforcement detail.
ICH E6 is not the only guideline that matters for a programmer. The bundle that actually drives day-to-day work is the table below.
| Guideline | Subject | Why a programmer cares |
|-----------|---------|------------------------|
| ICH E6(R3) | Good Clinical Practice | Defines the quality system, sponsor responsibilities, computerised system requirements (Annex 1) |
| ICH E8(R1) | General Considerations for Clinical Studies | Quality by Design, critical-to-quality factors that flow into the SAP |
| ICH E9 | Statistical Principles for Clinical Trials | Defines analysis populations, multiplicity, missing data, interim analyses |
| ICH E9(R1) | Estimands and Sensitivity Analyses | Five-attribute estimand framework; intercurrent event strategies in the SAP and ADaM |
| ICH E3 | Structure and Content of Clinical Study Reports | Section 14 tables, figures, and listings; Section 16.2 patient data listings |
| ICH E2A / E2B(R3) | Safety reporting and ICSR | SAE/SUSAR datasets; CIOMS, E2B(R3) XML |
| CDISC SDTM, ADaM, Define-XML | Data standards required by FDA, PMDA; recommended by EMA | Conformance is the mechanical face of GCP for the programmer |
Table 3. ICH guidelines and CDISC standards a stat programmer touches on a typical NDA, BLA, or MAA.
R3 is the first GCP revision built around fit-for-purpose quality and computerised systems rather than the paper-trial model E6(R2) inherited. Three shifts matter for programmers. First, Annex 1 (interventional trials) carries an explicit set of expectations for computerised systems used to generate or hold trial data, formalising what most sponsors had already pulled from EMA's Notice to Sponsors and FDA's 2007 guidance on computerised systems in clinical investigations. Second, the principle of proportionate quality means a SAP risk assessment is no longer an internal nicety; it is the basis for what gets validated and how heavily. Third, sponsor oversight of vendors (CRO, EDC, central lab, IRT, ePRO) is sharpened, which directly affects the audit trails and dataset provenance a programmer has to be able to defend.
The flow below is the practical version of what E6(R3) calls the data lifecycle. Every box has a GCP control around it. If you can name the SOP, the validated tool, the approver, and the audit trail for each box on your own study, you are GCP-ready.
| Stage | Programmer activity | GCP / regulatory hook |
|-------|---------------------|------------------------|
| Protocol & SAP | Review estimands, analysis populations, planned tables | ICH E8, E9, E9(R1); SAP signed before database lock |
| CRF / EDC build | Review of CRF annotations, edit checks, controlled terminology | E6(R3) Annex 1; CDASH; SDTM IG |
| Source data → SDTM | SDTM mapping, define-xml, reviewer's guide (cSDRG) | FDA Study Data Technical Conformance Guide; PMDA notification |
| SDTM → ADaM | ADaM build, ADaM IG conformance, validation logs | ADaM IG v1.3+; CDISC ADaM Validation Checks |
| ADaM → TLF | Production + QC of section 14 tables, listings, figures | ICH E3; SAP traceability |
| DBL → Lock | Final run, freeze, archival; signatures on outputs | 21 CFR 11; Annex 11; E6(R3) record retention |
| Submission (eCTD m5) | Transport datasets (XPT/Dataset-JSON), ADRG, define-xml | FDA TCG, EMA eCTD EU M1, PMDA validator rules |
Figure 1. The clinical data lifecycle viewed from the programmer's chair, with the GCP and regional regulatory hook for each stage.
Three habits separate a GCP-ready programming team from a team that will fail a sponsor audit. The first is independent verification, usually called double programming. The second is full traceability, source data through to displayed result. The third is a clean, reviewable log for every production run.
ICH E6 does not literally say "double programme your tables", but it requires that data are recorded, handled, and stored in a way that allows accurate reporting, interpretation, and verification. Sponsors operationalise this through a producer/QC workflow on every key derivation and every primary efficacy or safety output. In SAS the comparison is conventionally PROC COMPARE; in R it is most often diffdf or arsenal::comparedf; in Python it is pandas.testing.assert_frame_equal.
```sas
/* SAS: production vs QC ADaM compare for ADSL */
proc compare base    = prod.adsl
             compare = qc.adsl
             out     = work.adsl_diff outnoequal outbase outcomp
             listall criterion = 1e-8;
  id usubjid;
run;

/* Read the return code; fail the job if not equal.
   Open-code %if requires SAS 9.4M5 or later; on earlier
   releases wrap this check in a macro. */
%if &sysinfo ne 0 %then %do;
  %put ERROR: ADSL prod vs QC mismatch, sysinfo=&sysinfo;
  endsas;
%end;
```
```r
# R: diffdf for ADaM QC
library(haven)
library(diffdf)

prod <- read_xpt("prod/adsl.xpt")
qc   <- read_xpt("qc/adsl.xpt")

diff <- diffdf(prod, qc, keys = "USUBJID", strict_numeric = TRUE)
if (diffdf::diffdf_has_issues(diff)) stop("ADSL QC failed")
```

```python
# Python: pandas-based ADaM compare
import sys

import pandas as pd
import pyreadstat

prod, _ = pyreadstat.read_xport("prod/adsl.xpt")
qc, _   = pyreadstat.read_xport("qc/adsl.xpt")

# Align row order on the key before an element-wise compare
prod = prod.sort_values("USUBJID").reset_index(drop=True)
qc   = qc.sort_values("USUBJID").reset_index(drop=True)

try:
    pd.testing.assert_frame_equal(prod, qc, check_exact=False, atol=1e-8)
except AssertionError as e:
    sys.exit(f"ADSL QC failed: {e}")
```
Define-XML is the artefact that carries traceability into the submission. Inside the team, the same idea has to live in the program headers, the spec, and the reviewer's guide. A compliant header tells an inspector who wrote the program, when, against which spec version, and what input data it consumed. Anything the inspector cannot reconstruct from your archive is, for GCP purposes, not reproducible.
```sas
/*--------------------------------------------------------------------
  Program : t_ae_soc_pt.sas
  Purpose : Table 14.3.1.1 AEs by SOC and PT, Safety population
  Spec    : SAP v3.0 dated 2026-02-14, Table shell T-14.3.1.1 v2
  Inputs  : ADaM.ADSL, ADaM.ADAE (locked snapshot 2026-04-10)
  Outputs : t_ae_soc_pt.rtf, t_ae_soc_pt.lst
  Author  : V. Doth (programmer), A. Reviewer (QC)
  History : 1.0 2026-02-20 First production run
            1.1 2026-04-12 Updated PT MedDRA v27.0 dictionary
--------------------------------------------------------------------*/
```
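Traceability in the spec itself can also be checked mechanically before submission. The sketch below is a hypothetical illustration assuming a flat spec table with `DATASET`, `VARIABLE`, and `SOURCE` columns (an assumption of this example, not a CDISC-mandated layout); it flags the "untraceable derivation" finding before an inspector does.

```python
# Hedged sketch: flag ADaM spec rows with no documented source mapping.
# The spec layout below is an assumption, not a CDISC requirement.
import pandas as pd

spec = pd.DataFrame({
    "DATASET":  ["ADSL", "ADSL", "ADAE"],
    "VARIABLE": ["AGE", "TRT01P", "AESEV"],
    "SOURCE":   ["DM.AGE", "", "AE.AESEV"],   # blank = no traceability
})

untraceable = spec[spec["SOURCE"].str.strip() == ""]
if not untraceable.empty:
    missing = (untraceable["DATASET"] + "." + untraceable["VARIABLE"]).tolist()
    print("No source mapping for:", ", ".join(missing))
```

In practice the same scan would run against the exported spec or the define-xml itself rather than a hand-built frame.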
An auditor will not read your code first, they will read your log. A production run that emits WARNING, ERROR, uninitialized, or NOTE: MERGE statement has more than one... is, by sponsor SOP, a failed run. Most shops automate a log scrubber that fails the job if forbidden strings are found, and stores the scrubbed log alongside the dataset and the lst output.
```python
# Python log scrubber, runs at the end of every batch
import pathlib
import re
import sys

FORBIDDEN = [r"^ERROR", r"^WARNING", r"uninitialized",
             r"more than one", r"converted from char",
             r"Invalid (data|numeric|argument)"]
rx = re.compile("|".join(FORBIDDEN), re.I | re.M)

fails = []
for log in pathlib.Path("prod/logs").glob("*.log"):
    text = log.read_text(errors="ignore")
    if rx.search(text):
        fails.append(log.name)

if fails:
    sys.exit(f"Log scrubber failed for: {fails}")
```
An FDA BIMO inspection or an EMA GCP inspection rarely starts in the statistics function, but if the inspector has questions about how a primary efficacy result was derived, statistics is where they end. Three artefacts are normally requested: the locked SAP, the reviewer's guides (cSDRG and ADRG), and the program-plus-log archive that produced the result. If those three line up, the conversation is short.
| Finding category | What it usually looks like in practice |
|------------------|----------------------------------------|
| SAP / ADaM divergence | ADaM derives a parameter (e.g. baseline definition) differently from the SAP without a documented amendment |
| Untraceable derivation | ADaM variable has no source mapping in the spec or define-xml; reviewer cannot trace it back to SDTM |
| Unresolved log warnings | Production run archived with WARNING messages; no documented rationale |
| Late or missing QC | Primary table not independently programmed, or QC done after sign-off |
| Version drift | Submission dataset built with a different SAS / R / Python version than was qualified |
| Open production environment | Programmers retain write access to the production folder after database lock |
| Missing 21 CFR 11 controls | No audit trail on the validated environment; shared user accounts |
Table 4. The findings that, in the author's submission experience across NDA, BLA, and MAA filings, recur most often in the statistical programming function.
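The version-drift finding in particular is easy to gate automatically. A minimal sketch, assuming the qualified versions are recorded in a manifest produced at IQ/OQ time (the package names and version strings here are placeholders):

```python
# Hedged sketch of a version-drift gate: compare the live environment
# against the versions recorded at qualification. Manifest contents
# are illustrative placeholders.
from importlib.metadata import PackageNotFoundError, version

def drift(qualified: dict[str, str], live: dict[str, str]) -> list[str]:
    """Packages whose live version differs from the qualified record."""
    return [f"{pkg}: qualified {want}, live {live.get(pkg, 'missing')}"
            for pkg, want in qualified.items() if live.get(pkg) != want]

def live_versions(packages) -> dict[str, str]:
    """Collect installed versions for the packages under control."""
    out = {}
    for pkg in packages:
        try:
            out[pkg] = version(pkg)
        except PackageNotFoundError:
            out[pkg] = "missing"
    return out

# e.g. qualified = {"pandas": "2.2.3", "numpy": "2.1.2"}  # from IQ record
# problems = drift(qualified, live_versions(qualified))
# if problems: raise SystemExit("Version drift: " + "; ".join(problems))
```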
GxP is not paperwork bolted onto the analysis after the fact. It is the set of habits that make the analysis defensible: a signed SAP before the lock, an ADaM that traces cleanly to SDTM, a production run with a clean log, an independent QC that compared equal, and an archive an auditor can open three years from now and re-run. Build the habits into the project plan and the inspection looks after itself.
The questions below are the ones a sponsor lead programmer should be able to answer yes to on the morning of database lock. They are stitched together from the GCP, Part 11, and CDISC requirements covered above, and from the recurring findings in Table 4.
| # | Question | Anchor |
|---|----------|--------|
| 1 | Is the SAP signed and dated, with all amendments captured before lock? | ICH E9, E6(R3) §5 |
| 2 | Does every ADaM variable trace cleanly to SDTM via the spec and define-xml? | ADaM IG, Define-XML 2.1 |
| 3 | Has every primary and key secondary table been independently QC'd to PROC COMPARE / diffdf / assert_frame_equal pass? | Sponsor SOP, E6(R3) Annex 1 |
| 4 | Is the production environment locked to read-only, with named approvers on file? | 21 CFR 11, EU Annex 11 |
| 5 | Are program, log, lst, and dataset archived together with the validated tool versions? | 21 CFR 11.10(c), GCP record retention |
| 6 | Is the cSDRG / ADRG aligned with the as-built submission package? | FDA TCG, PMDA validator |
Table 5. Six questions to clear before any GCP-regulated database lock.
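Questions 4 and 5 are mechanically checkable, and some teams script them into a pre-lock gate. A hedged sketch, with the folder layout and file extensions as assumptions of this example:

```python
# Hedged sketch of a pre-lock gate for the two mechanical checks.
import os
import pathlib

def archive_complete(run_dir: pathlib.Path, stem: str) -> bool:
    """Q5: program, log, lst and dataset archived together for one run."""
    return all((run_dir / f"{stem}{ext}").exists()
               for ext in (".sas", ".log", ".lst", ".xpt"))

def production_locked(folder: pathlib.Path) -> bool:
    """Q4: the production folder is not writable by the current user."""
    return not os.access(folder, os.W_OK)
```

The remaining questions need human sign-off, but putting the mechanical ones in code means the lock-day checklist starts from a known-good state.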
ICH E6(R3) Good Clinical Practice (Step 4, January 2025); ICH E8(R1) General Considerations for Clinical Studies (2021); ICH E9 Statistical Principles for Clinical Trials and ICH E9(R1) Estimands; FDA 21 CFR Parts 11, 50, 54, 56, 312, 314; FDA Study Data Technical Conformance Guide (current version); EU Clinical Trials Regulation (EU) No 536/2014; EMA Annex 11 to EU GMP; CDISC SDTM IG, ADaM IG, Define-XML 2.1, Dataset-JSON 1.1; PMDA Notification 0427001 on Electronic Study Data Submission. For working examples and CDISC tooling, see clinstandards.org.
