The Analysis Data Model (ADaM) is a foundational CDISC standard that defines the structure and content of analysis-ready datasets for regulatory submissions to the FDA and other global health authorities. Under the FDA's current study data standards requirements, analysis data in electronic submissions must follow ADaM. Historically, these datasets have been created almost exclusively in SAS, a language entrenched in the pharmaceutical industry for decades.
That paradigm is changing. The {admiral} package — short for ADaM in R Asset Library — represents a landmark open-source effort to bring ADaM dataset generation into the R ecosystem. Developed collaboratively by Roche and GSK under the pharmaverse umbrella, {admiral} provides a modular, well-documented, and comprehensively tested toolbox that enables statistical programmers to create CDISC-compliant ADaM datasets entirely in R.
This article provides a comprehensive, technical deep dive into {admiral}: its architecture, its function taxonomy, practical code examples for common ADaM datasets, the broader ecosystem of extension packages, and strategic considerations for organizations evaluating adoption.
The {admiral} project was born from a recognized gap in the R ecosystem. While the R/Pharma community had developed numerous packages for tables, listings, and figures (TLFs), no comprehensive solution existed for the upstream challenge of creating the ADaM datasets that feed those outputs.
Roche and GSK — two of the world's largest pharmaceutical companies — joined forces to address this gap. Their collaboration, formalized under the pharmaverse initiative, produced {admiral} as an open-source, cross-company R package licensed under Apache License 2.0. The project has since expanded to include contributors from Cytel, Johnson & Johnson, Bayer, and numerous other organizations.
R has been used in regulatory submissions to the FDA and EMA, including production workflows built with {admiral}, a significant milestone in the industry's acceptance of R.
As of early 2026, the latest stable release of {admiral} can be found on CRAN. The package follows a regular release cycle, with core updates and ongoing expansion into therapeutic area extensions.
The simplest way to install {admiral} is from CRAN:
install.packages("admiral")
For the latest features and bug fixes, install directly from GitHub:
# install.packages("pak")
pak::pkg_install("pharmaverse/admiral", dependencies = TRUE)
A typical {admiral} workflow relies on several companion packages:
# Core dependencies and companions
library(admiral)         # ADaM derivation functions
library(pharmaversesdtm) # Test SDTM datasets (CDISC Pilot)
library(dplyr)           # Data manipulation (tidyverse)
library(lubridate)       # Date/time handling
library(stringr)         # String manipulation
library(tibble)          # Enhanced data frames
# For metadata and transport files
library(metacore)        # Dataset metadata management
library(metatools)       # Metadata utilities
library(xportr)          # XPT file generation for submission
The central design principle of {admiral} is that an ADaM dataset is built through a sequence of derivations. Each derivation is a function call that adds one or more variables or records to the dataset being constructed. This modular approach provides several advantages: derivations can be easily added, removed, or reordered; each step is independently testable; and the resulting code reads as a clear, linear pipeline of transformations.
This stands in deliberate contrast to a "black-box" approach where a single function call generates an entire dataset. The {admiral} team has explicitly stated that their goal is not to automate ADaM creation with a single command, but rather to provide a toolbox of reusable, composable functions that programmers can assemble according to their study-specific requirements.
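The idea of a dataset built from a sequence of small derivations can be sketched without {admiral} at all. The two helper functions below (derive_agegr and derive_ittfl) are hypothetical stand-ins written for this illustration, not functions from the package, and the input data is made up. The sketch uses the native pipe, so it assumes R 4.1 or later.

```r
# Two toy derivations (hypothetical helpers, not admiral functions).
# Each takes a data frame and returns it with one variable added.
derive_agegr <- function(dataset) {
  dataset$AGEGR1 <- ifelse(dataset$AGE < 65, "<65", ">=65")
  dataset
}

derive_ittfl <- function(dataset) {
  dataset$ITTFL <- ifelse(is.na(dataset$ARM), "N", "Y")
  dataset
}

# Made-up demographics for three subjects
dm <- data.frame(
  USUBJID = c("01", "02", "03"),
  AGE = c(54, 71, 65),
  ARM = c("Placebo", NA, "Drug A")
)

# The dataset is built by chaining derivations; a step can be
# reordered or removed without touching the others.
adsl <- dm |>
  derive_agegr() |>
  derive_ittfl()

print(adsl)
```

Because each step is an ordinary function of a data frame, it can be tested on its own with a few rows of input, which is the property the modular design relies on.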
{admiral} organizes its functions into four primary categories:
1. Derivation Functions: the workhorses of the package. These functions add variables or records to a dataset and follow a consistent naming convention:
- derive_vars_*(): add one or more variables to the input dataset
- derive_var_*(): add a single variable
- derive_param_*(): add new parameter records (rows)
- derive_extreme_*(): derive extreme (first/last) records or events
2. Computation Functions: vector-in, vector-out functions for calculations:
- compute_bmi(): calculate Body Mass Index
- compute_bsa(): calculate Body Surface Area
- compute_map(): calculate Mean Arterial Pressure
- compute_qtc(): calculate corrected QT interval
- compute_egfr(): calculate estimated Glomerular Filtration Rate
- compute_age_years(): convert age to years
- compute_duration(): calculate time durations
3. Higher Order Functions: advanced functions that take other functions as input:
- call_derivation(): call a derivation multiple times with varying arguments
- restrict_derivation(): execute a derivation on a subset of the dataset
- slice_derivation(): execute different derivations on different subsets
4. Utility Functions: supporting functions for common tasks:
- convert_blanks_to_na(): handle SAS-to-R missing value conversion
- convert_dtc_to_dt(): convert character dates to Date objects
- convert_dtc_to_dtm(): convert character dates to datetime objects
- exprs(): create lists of expressions (used extensively in arguments)
{admiral} uses a consistent convention for function arguments, built on R's non-standard evaluation (NSE):
- by_vars: expects a list of symbols, e.g. by_vars = exprs(STUDYID, USUBJID)
- filter: expects a single expression, e.g. filter = PARAMCD == "TEMP"
- order: expects a list of expressions, e.g. order = exprs(AVISIT, desc(AESEV))
- new_vars: expects expressions for new variable definitions, e.g. new_vars = exprs(TRTSDTM = EXSTDTM)
The exprs() function, which {admiral} re-exports from rlang, is used extensively. It returns its arguments as unevaluated expressions, which the derivation functions then evaluate inside the dataset being processed.
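To make the "unevaluated expression" idea concrete, here is a minimal base R analogue. It uses quote() and eval() rather than rlang, and the helper filter_dataset is hypothetical, written only for this sketch; it is not how {admiral} implements its filter argument, just the same general mechanism.

```r
# Made-up vital signs records
advs <- data.frame(
  USUBJID = c("01", "01", "02"),
  PARAMCD = c("TEMP", "PULSE", "TEMP"),
  AVAL = c(36.8, 72, 37.4)
)

# Capture the condition without evaluating it. PARAMCD is not a variable
# in the calling environment; it only exists as a column of the dataset.
filter_expr <- quote(PARAMCD == "TEMP")

# Hypothetical helper: evaluate the captured expression inside the
# dataset, so column names resolve to columns.
filter_dataset <- function(dataset, filter) {
  keep <- eval(filter, envir = dataset, enclos = parent.frame())
  dataset[keep, , drop = FALSE]
}

temp_records <- filter_dataset(advs, filter_expr)
print(temp_records)
```

Passing conditions and variable lists as expressions is what lets a caller write PARAMCD == "TEMP" or exprs(STUDYID, USUBJID) without quoting column names as strings.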
One of the most practical features of {admiral} is its built-in template system. Templates are pre-built R scripts that serve as starting points for common ADaM datasets. They demonstrate the recommended derivation sequence and can be customized to meet study-specific requirements.
library(admiral)
list_all_templates()
#> Existing ADaM templates in package 'admiral':
#>
#> • ADAE (Adverse Event Analysis Dataset)
#> • ADCM (Concomitant Medication Analysis Dataset)
#> • ADEG (ECG Analysis Dataset)
#> • ADEX (Exposure Analysis Dataset)
#> • ADLB (Laboratory Analysis Dataset)
#> • ADLBHY (Hy's Law Analysis Dataset)
#> • ADMH (Medical History Analysis Dataset)
#> • ADPC (Pharmacokinetic Concentration Dataset)
#> • ADPP (Pharmacokinetic Parameter Dataset)
#> • ADPPK (Population PK Analysis Dataset)
#> • ADSL (Subject Level Analysis Dataset)
#> • ADVS (Vital Signs Analysis Dataset)
# Generate the ADSL template script in your working directory
use_ad_template(
  adam_name = "adsl",
  save_path = "./ad_adsl.R"
)
# Generate ADAE template
use_ad_template(
  adam_name = "adae",
  save_path = "./ad_adae.R"
)
# Generate from extension packages
use_ad_template(
  adam_name = "adrs",
  save_path = "./ad_adrs.R",
  package = "admiralonco"
)
These templates are not meant to be used as-is. They are intentionally designed as starting points that programmers customize for their specific study designs, protocol requirements, and company standards.
ADSL is the most fundamental ADaM dataset — a one-record-per-subject dataset containing demographic information, treatment assignments, disposition dates, and population flags. Below is a condensed but representative example.
library(admiral)
library(dplyr, warn.conflicts = FALSE)
library(pharmaversesdtm)
library(lubridate)
library(stringr)
# ── Load Source SDTM Datasets ──
dm <- pharmaversesdtm::dm %>% convert_blanks_to_na()
ds <- pharmaversesdtm::ds %>% convert_blanks_to_na()
ex <- pharmaversesdtm::ex %>% convert_blanks_to_na()
ae <- pharmaversesdtm::ae %>% convert_blanks_to_na()
lb <- pharmaversesdtm::lb %>% convert_blanks_to_na()
# ── Pre-process Exposure Dates ──
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN"
  )
# ── Build ADSL Step-by-Step ──
adsl <- dm %>%
  # Step 1: Derive Treatment Variables
  mutate(
    TRT01P = ARM,
    TRT01A = ACTARM
  ) %>%
  # Step 2: Derive Treatment Start Date (TRTSDTM)
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO"))) &
      !is.na(EXSTDTM),
    new_vars = exprs(TRTSDTM = EXSTDTM, TRTSTMF = EXSTTMF),
    order = exprs(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = exprs(STUDYID, USUBJID)
  ) %>%
  # Step 3: Derive Treatment End Date (TRTEDTM)
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO"))) &
      !is.na(EXENDTM),
    new_vars = exprs(TRTEDTM = EXENDTM, TRTETMF = EXENTMF),
    order = exprs(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = exprs(STUDYID, USUBJID)
  ) %>%
  # Step 4: Convert Datetimes to Dates
  derive_vars_dtm_to_dt(
    source_vars = exprs(TRTSDTM, TRTEDTM)
  ) %>%
  # Step 5: Derive Treatment Duration
  derive_var_trtdurd() %>%
  # Step 6: Derive Disposition Date
  derive_vars_merged(
    dataset_add = ds %>%
      derive_vars_dt(dtc = DSSTDTC, new_vars_prefix = "DSST"),
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(EOSDT = DSSTDT),
    filter_add = DSCAT == "DISPOSITION EVENT" &
      DSDECOD != "SCREEN FAILURE"
  ) %>%
  # Step 7: Derive Age Groups
  # derive_vars_cat() takes a tribble-style definition: a condition
  # column followed by one column per variable to derive.
  derive_vars_cat(
    definition = exprs(
      ~condition,           ~AGEGR1,
      AGE < 18,             "<18",
      between(AGE, 18, 64), "18-64",
      AGE > 64,             ">64"
    )
  ) %>%
  # Step 8: Derive Population Flags
  derive_var_merged_exist_flag(
    dataset_add = ex,
    by_vars = exprs(STUDYID, USUBJID),
    new_var = SAFFL,
    condition = (EXDOSE > 0 |
      (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO")))
  ) %>%
  mutate(
    ITTFL = if_else(!is.na(ARM), "Y", "N")
  )
Key Takeaway: Each derive_* call adds specific variables to the dataset in a clear, sequential pipeline. The code is self-documenting: you can read it top-to-bottom and understand exactly what each step contributes to the final ADSL.
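Steps 2 and 3 above rely on a "sort, keep the first (or last) record per subject, then merge" pattern. As a rough illustration of that logic, the base R sketch below picks the earliest exposure record per subject and left-joins it onto a subject-level table. The data is made up and the code is only an analogue of what the order and mode arguments express; it is not the package's implementation.

```r
# Made-up subject list and exposure records
dm <- data.frame(USUBJID = c("01", "02", "03"))
ex <- data.frame(
  USUBJID = c("01", "01", "02"),
  EXSEQ = c(2, 1, 1),
  EXSTDT = as.Date(c("2024-02-01", "2024-01-15", "2024-03-10"))
)

# Sort by the ordering variables, then keep the first record per subject
# (the analogue of order = exprs(EXSTDT, EXSEQ), mode = "first").
ex_sorted <- ex[order(ex$USUBJID, ex$EXSTDT, ex$EXSEQ), ]
first_ex <- ex_sorted[!duplicated(ex_sorted$USUBJID), c("USUBJID", "EXSTDT")]
names(first_ex)[names(first_ex) == "EXSTDT"] <- "TRTSDT"

# Left join: subjects without any exposure keep a missing TRTSDT.
adsl <- merge(dm, first_ex, by = "USUBJID", all.x = TRUE)
print(adsl)
```

Subject 03 has no exposure record and therefore ends up with a missing treatment start date, while the one-record-per-subject structure of the result is preserved.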
ADVS follows the Basic Data Structure (BDS) pattern — one record per subject per parameter per analysis timepoint.
# ── Load source data ──
vs <- pharmaversesdtm::vs %>% convert_blanks_to_na()
# admiral_adsl is an example dataset included in the package for demonstration
adsl <- admiral::admiral_adsl
# ── Define ADSL variables needed in ADVS ──
adsl_vars <- exprs(TRTSDT, TRTEDT, TRT01P, TRT01A)
# ── Begin ADVS construction ──
advs <- vs %>%
  # Step 1: Merge ADSL variables
  derive_vars_merged(
    dataset_add = adsl,
    new_vars = adsl_vars,
    by_vars = exprs(STUDYID, USUBJID)
  ) %>%
  # Step 2: Map SDTM to ADaM variable names
  mutate(
    PARAMCD = VSTESTCD,
    PARAM = VSTEST,
    AVAL = VSSTRESN,
    AVALU = VSSTRESU
  ) %>%
  # Step 3: Derive Analysis Dates
  derive_vars_dt(
    dtc = VSDTC,
    new_vars_prefix = "A",
    highest_imputation = "D"
  ) %>%
  # Step 4: Derive Analysis Day
  derive_vars_dy(
    reference_date = TRTSDT,
    source_vars = exprs(ADT)
  ) %>%
  # Step 5: Derive BMI (computed parameter)
  # Computed parameters are added before the visit and baseline steps
  # so that the new records also receive AVISIT, ABLFL, BASE, and CHG.
  derive_param_bmi(
    by_vars = exprs(STUDYID, USUBJID, !!!adsl_vars,
      VISIT, VISITNUM, ADT, ADY),
    set_values_to = exprs(
      PARAMCD = "BMI",
      PARAM = "Body Mass Index (kg/m^2)",
      AVALU = "kg/m^2"
    ),
    get_unit_expr = VSSTRESU,
    constant_by_vars = exprs(USUBJID)
  ) %>%
  # Step 6: Derive MAP (Mean Arterial Pressure)
  # VSTPT/VSTPTNUM are included because blood pressure is recorded
  # at several timepoints per visit in the pilot data.
  derive_param_map(
    by_vars = exprs(STUDYID, USUBJID, !!!adsl_vars,
      VISIT, VISITNUM, ADT, ADY, VSTPT, VSTPTNUM),
    set_values_to = exprs(
      PARAMCD = "MAP",
      PARAM = "Mean Arterial Pressure (mmHg)",
      AVALU = "mmHg"
    ),
    get_unit_expr = VSSTRESU
  ) %>%
  # Step 7: Derive Visit Information
  mutate(
    AVISIT = case_when(
      str_detect(VISIT, "SCREEN|UNSCHED|RETRIEVAL|AMBUL") ~
        NA_character_,
      !is.na(VISIT) ~ str_to_title(VISIT),
      TRUE ~ NA_character_
    ),
    AVISITN = as.numeric(case_when(
      VISIT == "BASELINE" ~ "0",
      str_detect(VISIT, "WEEK") ~
        str_trim(str_replace(VISIT, "WEEK", "")),
      TRUE ~ NA_character_
    ))
  ) %>%
  # Step 8: Derive Baseline Flag and Value
  restrict_derivation(
    derivation = derive_var_extreme_flag,
    args = params(
      by_vars = exprs(STUDYID, USUBJID, PARAMCD),
      order = exprs(ADT, VSSEQ),
      new_var = ABLFL,
      mode = "last"
    ),
    filter = !is.na(AVAL) & ADT <= TRTSDT
  ) %>%
  derive_var_base(
    by_vars = exprs(STUDYID, USUBJID, PARAMCD),
    source_var = AVAL,
    new_var = BASE
  ) %>%
  # Step 9: Derive Change from Baseline
  derive_var_chg() %>%
  derive_var_pchg()
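The baseline and change-from-baseline steps reduce to simple arithmetic once the baseline record is identified. The base R sketch below works through it for one made-up subject and parameter: baseline is taken as the last non-missing value on or before treatment start, CHG is AVAL minus BASE, and PCHG is the change as a percentage of the absolute baseline. Dividing by the absolute value is an assumption of this sketch, and the code is an illustration of the logic rather than the package's implementation.

```r
# Made-up systolic blood pressure records for a single subject
advs <- data.frame(
  USUBJID = "01",
  PARAMCD = "SYSBP",
  ADT = as.Date(c("2024-01-01", "2024-01-10", "2024-01-24", "2024-02-07")),
  AVAL = c(120, 124, 130, 118)
)
trtsdt <- as.Date("2024-01-10")

# Baseline: last non-missing record on or before treatment start.
candidates <- which(!is.na(advs$AVAL) & advs$ADT <= trtsdt)
baseline_row <- candidates[which.max(as.numeric(advs$ADT[candidates]))]
advs$ABLFL <- NA_character_
advs$ABLFL[baseline_row] <- "Y"

# Carry the baseline value to every record, then compute the changes.
advs$BASE <- advs$AVAL[baseline_row]
advs$CHG <- advs$AVAL - advs$BASE
advs$PCHG <- 100 * advs$CHG / abs(advs$BASE)

print(advs)
```

In a real ADVS the same logic runs per subject and parameter, which is what the by_vars argument in the pipeline above controls.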
ADAE follows the Occurrence Data Structure (OCCDS) pattern.
# ── Load source data ──
ae <- pharmaversesdtm::ae %>% convert_blanks_to_na()
# admiral_adsl is an example dataset included in the package for demonstration
adsl <- admiral::admiral_adsl
# ── ADSL variables required for ADAE ──
adsl_vars <- exprs(TRTSDT, TRTEDT, DTHDT, EOSDT)
# ── Build ADAE ──
adae <- ae %>%
  # Step 1: Merge ADSL variables
  derive_vars_merged(
    dataset_add = adsl,
    new_vars = adsl_vars,
    by_vars = exprs(STUDYID, USUBJID)
  ) %>%
  # Step 2: Derive Analysis Start Datetime
  derive_vars_dtm(
    dtc = AESTDTC,
    new_vars_prefix = "AST",
    highest_imputation = "M",
    min_dates = exprs(TRTSDT)
  ) %>%
  # Step 3: Derive Analysis End Datetime
  derive_vars_dtm(
    dtc = AEENDTC,
    new_vars_prefix = "AEN",
    highest_imputation = "M",
    date_imputation = "last",
    time_imputation = "last",
    max_dates = exprs(DTHDT, EOSDT)
  ) %>%
  # Step 4: Convert to Dates
  derive_vars_dtm_to_dt(source_vars = exprs(ASTDTM, AENDTM)) %>%
  # Step 5: Derive Relative Days
  derive_vars_dy(
    reference_date = TRTSDT,
    source_vars = exprs(ASTDT, AENDT)
  ) %>%
  # Step 6: Derive Treatment-Emergent Flag
  derive_var_trtemfl(
    trt_start_date = TRTSDT,
    trt_end_date = TRTEDT,
    end_window = 30
  ) %>%
  # Step 7: Derive Severity Variables
  # Note: derive_var_atoxgr() is intended for laboratory toxicity grading
  # (it works from ATOXGRL/ATOXGRH), so severity is mapped from AESEV here.
  mutate(
    ASEV = AESEV,
    ASEVN = as.integer(factor(ASEV, levels = c("MILD", "MODERATE", "SEVERE")))
  ) %>%
  # Step 8: Flag First Occurrence of Maximum Severity
  # Ordering by desc(ASEVN) avoids the alphabetical sort of AESEV,
  # which would put MILD first.
  restrict_derivation(
    derivation = derive_var_extreme_flag,
    args = params(
      by_vars = exprs(STUDYID, USUBJID),
      order = exprs(desc(ASEVN), ASTDT, AESEQ),
      new_var = AOCCIFL,
      mode = "first"
    ),
    filter = TRTEMFL == "Y"
  )
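The treatment-emergent flag in Step 6 can be hard to picture from the argument names alone. The base R sketch below shows a deliberately simplified version of the rule for one made-up subject: an event is flagged when it starts on or after treatment start and no later than treatment end plus the window. The package function covers more situations (for example missing dates), so this is an illustration of the windowing idea only.

```r
# Made-up adverse event start dates for a single subject
adae <- data.frame(
  USUBJID = "01",
  ASTDT = as.Date(c("2024-01-05", "2024-01-20", "2024-03-15", "2024-04-20")),
  TRTSDT = as.Date("2024-01-10"),
  TRTEDT = as.Date("2024-03-10")
)
end_window <- 30 # days after treatment end that still count

# Simplified rule: start within [TRTSDT, TRTEDT + end_window]
on_or_after_start <- adae$ASTDT >= adae$TRTSDT
within_window <- adae$ASTDT <= adae$TRTEDT + end_window
adae$TRTEMFL <- ifelse(on_or_after_start & within_window, "Y", NA_character_)

print(adae[, c("ASTDT", "TRTEMFL")])
```

The first event starts before treatment and the last one starts after the 30-day window has closed, so only the two middle events are flagged.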
Beyond the core package, {admiral} supports a family of extension packages organized by therapeutic area and company-specific needs:
| Package | Therapeutic Area | Key Datasets | Notable Functions |
| --- | --- | --- | --- |
| {admiralonco} | Oncology | ADRS, ADTR, ADTTE | Best overall response (BOR), RECIST 1.1 and iRECIST criteria, tumor response derivations |
| {admiralophtha} | Ophthalmology | ADOE | Affected eye derivation, BCVA to logMAR conversion, Snellen category mapping |
| {admiralvaccine} | Vaccines | ADIS, ADCE | Immunogenicity specimen derivations, fever record detection, criteria evaluation flags |
| {admiralpeds} | Pediatrics | Pediatric ADaMs | Anthropometric indicators, child growth/development chart parameters |
| {admiralmetabolic} | Metabolic Disorders | Metabolic ADaMs | Specialized metabolic disease derivations |
| Package | Purpose |
| --- | --- |
| {admiraldev} | Developer utilities, common functions shared across all {admiral} packages |
| {pharmaversesdtm} | Test SDTM datasets from the CDISC Pilot Project |
| {pharmaverseadam} | Test ADaM datasets generated from {admiral} templates |
Companies can create their own extension packages (e.g., {admiralroche}, {admiralgsk}) that plug into the {admiral} framework with company-specific metadata access, naming conventions, and custom derivation logic.
Every line of {admiral} code is publicly available on GitHub. Functions are comprehensively documented with real-data examples, and the entire test suite is visible. This transparency is critical for regulatory submissions where code reviewability is paramount.
{admiral} is not a single-company effort. Contributors span Roche, GSK, Cytel, Johnson & Johnson, Bayer, and independent contributors. This cross-industry development ensures the package addresses common needs rather than company-specific quirks, and it distributes the maintenance burden across organizations.
Derivation functions are designed to align with the CDISC ADaM Implementation Guide. The package's alignment with CDISC standards helps reduce the risk of non-compliance during regulatory review, though compliance ultimately depends on how the functions are applied within each study context.
{admiral} employs extensive unit testing — a practice less common in traditional SAS macro libraries. Each function includes automated tests that verify correct behavior across edge cases, missing values, and boundary conditions. This level of testing provides confidence in the correctness of derivations.
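As an illustration of what such a test covers, the sketch below defines a hypothetical stand-in for a vector-in, vector-out computation function (named bmi here, assuming height in cm and weight in kg) and checks a typical value, vectorised input, and a missing value. It uses base R's stopifnot() to stay dependency-free; it is not taken from the package's own test suite.

```r
# Hypothetical stand-in for a computation function: vectors in, vector out.
# Assumes height in cm and weight in kg.
bmi <- function(height, weight) {
  weight / (height / 100)^2
}

# Typical value
stopifnot(isTRUE(all.equal(bmi(height = 170, weight = 75), 75 / 1.7^2)))

# Vectorised input gives one result per element
stopifnot(length(bmi(c(160, 180), c(60, 90))) == 2)

# A missing input propagates to a missing result instead of an error
stopifnot(is.na(bmi(NA_real_, 70)))
```

Each check targets one behaviour, so a failure points directly at the case that broke.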
R scripts built with {admiral} are inherently reproducible. Combined with version pinning (via renv or similar tools), an {admiral}-based pipeline produces identical output given the same inputs and package versions — a property that is essential for regulatory submissions and QC workflows.
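With renv, pinning usually amounts to recording versions with renv::snapshot() and reinstating them with renv::restore(). The underlying idea, checking installed versions against a recorded list before a run, can be sketched in base R. The helper check_pins and the pin list below are hypothetical; the pins are filled from the current session only so that the example runs anywhere, whereas a real project would read them from a lockfile.

```r
# Hypothetical pin list: package name -> expected version string.
# Filled from the current session here so the example is runnable.
pins <- c(
  stats = as.character(utils::packageVersion("stats")),
  utils = as.character(utils::packageVersion("utils"))
)

# Return the names of packages whose installed version differs from the pin.
check_pins <- function(pins) {
  installed <- vapply(
    names(pins),
    function(pkg) as.character(utils::packageVersion(pkg)),
    character(1)
  )
  names(pins)[installed != pins]
}

mismatches <- check_pins(pins)

# Stop early if the environment has drifted from the pinned versions.
if (length(mismatches) > 0) {
  stop("Version mismatch for: ", paste(mismatches, collapse = ", "))
}
```

Running such a check at the top of a production script makes version drift fail loudly instead of silently changing results.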
The built-in template system reduces the cognitive overhead of starting a new ADaM dataset. Rather than building from scratch, programmers start with a vetted template and customize it, reducing both development time and the risk of missing standard derivations.
Multiple pharmaceutical regulatory submissions to the FDA and EMA have reportedly used R and {admiral} in production workflows. These successful submissions establish precedent and reduce the regulatory risk perceived by organizations considering adoption.
{admiral} operates within the broader pharmaverse ecosystem. A complete end-to-end clinical reporting workflow in R typically involves:
| Stage | Package(s) | Purpose |
| --- | --- | --- |
| SDTM Creation | {sdtm.oak}, {sdtmchecks} | Transform raw data to SDTM and check the result |
| ADaM Creation | {admiral} + extensions | Transform SDTM to ADaM |
| Metadata Management | {metacore}, {metatools} | Dataset specifications and metadata |
| Transport Files | {xportr} | Generate XPT files for submission |
| Tables & Listings | {rtables}, {Tplyr}, {pharmaRTF} | Create regulatory tables and listings |
| Figures | {tern}, {ggplot2} | Create clinical trial figures |
For organizations and individuals looking to adopt {admiral}, the following roadmap is recommended:
Step 1: Install and Explore
Install {admiral} and its companion packages, then run the built-in ADSL template against the included CDISC Pilot data:
install.packages(c("admiral", "pharmaversesdtm", "dplyr",
"lubridate", "stringr"))
library(admiral)
use_ad_template("adsl", save_path = "./ad_adsl.R")
Run the template script and examine the resulting dataset. Compare it against the ADSL specification from the CDISC Pilot Project.
Step 2: Customize for Your Study
Take the template output and begin modifying it for your study's specific requirements. Add custom derivations, adjust imputation rules, and integrate your company's metadata conventions.
Step 3: Integrate Metadata and Transport
Use {metacore}, {metatools}, and {xportr} to apply variable labels, formats, and lengths from your dataset specification, then generate submission-ready XPT files.
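Before writing transport files it can help to check the dataset against two well-known limits of the SAS V5 transport format: variable names of at most 8 characters and labels of at most 40 characters. The base R sketch below does this with a hypothetical helper, check_xpt_limits, on a made-up dataset whose labels are stored in a "label" attribute. It is a pre-flight check written for this illustration and does not use or replace {xportr}.

```r
# Made-up dataset with one overlong name and one overlong label
adsl <- data.frame(
  USUBJID = c("01", "02"),
  TRT01P = c("Placebo", "Drug A"),
  TREATMENTSTART = as.Date(c("2024-01-10", "2024-01-12"))
)
attr(adsl$USUBJID, "label") <- "Unique Subject Identifier"
attr(adsl$TRT01P, "label") <-
  "Planned Treatment for Period 01, as Randomized at Baseline"

# Hypothetical helper: report variables that break the V5 limits.
check_xpt_limits <- function(dataset) {
  vars <- names(dataset)
  labels <- vapply(
    dataset,
    function(x) {
      label <- attr(x, "label")
      if (is.null(label)) "" else label
    },
    character(1)
  )
  list(
    name_too_long = vars[nchar(vars) > 8],
    label_too_long = vars[nchar(labels) > 40]
  )
}

issues <- check_xpt_limits(adsl)
print(issues)
```

Catching these problems in the specification or the derivation script is cheaper than discovering them when the transport file is generated.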
Step 4: Validate and QC
Establish a double-programming or independent QC process. The modular, function-based structure of {admiral} code makes it straightforward to review and verify individual derivation steps.
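The comparison at the heart of double programming can be sketched in a few lines of base R: sort both datasets by their key variables, then compare them. The helper compare_datasets and both datasets below are made up for illustration; dedicated comparison tools report differences in far more detail.

```r
# Made-up production and independent QC datasets (row order differs)
production <- data.frame(USUBJID = c("02", "01", "03"), AGE = c(71, 54, 65))
qc <- data.frame(USUBJID = c("01", "02", "03"), AGE = c(54, 71, 66))

# Hypothetical helper: sort by key variables, then compare.
# Returns TRUE when the datasets match, otherwise a description
# of the differences as produced by all.equal().
compare_datasets <- function(prod, qc, keys) {
  prod <- prod[do.call(order, unname(as.list(prod[keys]))), , drop = FALSE]
  qc <- qc[do.call(order, unname(as.list(qc[keys]))), , drop = FALSE]
  rownames(prod) <- NULL
  rownames(qc) <- NULL
  all.equal(prod, qc)
}

result <- compare_datasets(production, qc, keys = "USUBJID")
print(result)
```

Here the two versions disagree on the age of subject 03, so the result is a message naming the AGE column instead of TRUE.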
Step 5: Engage with the Community
Join the pharmaverse Slack workspace for support, contribute to GitHub issues, and consider contributing functions or improvements back to the package.
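While {admiral} represents a significant advancement, the practical caveats raised above still apply: the templates are starting points rather than finished programs, CDISC compliance depends on how the functions are applied in each study, and independent QC remains necessary.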
The {admiral} package is not merely an R package — it represents a philosophical shift in how the pharmaceutical industry approaches clinical data programming. By providing an open-source, collaboratively developed, and rigorously tested toolbox for ADaM dataset generation, {admiral} is enabling a transition that many thought was years away: the viable use of R for end-to-end regulatory submissions.
For statistical programmers, the message is clear: {admiral} has been used in regulatory submissions, is backed by an active cross-industry community, and continues to mature with each release. Whether your organization is beginning to explore R or is already deep into its R transition, {admiral} provides the foundation for building CDISC-aligned ADaM datasets with confidence.
| Resource | URL |
| --- | --- |
| {admiral} CRAN Page | https://cran.r-project.org/package=admiral |
| {admiral} Documentation | https://pharmaverse.github.io/admiral/ |
| GitHub Repository | https://github.com/pharmaverse/admiral |
| Pharmaverse Blog | https://pharmaverse.github.io/blog/ |
| Pharmaverse YouTube Channel | https://www.youtube.com/@pharmaverse |
| Pharmaverse Slack | https://pharmaverse.slack.com |
| CDISC ADaM Standard | https://www.cdisc.org/standards/foundational/adam |
This article is published on clinstandards.org — a technical publication serving the statistical programming community in pharmaceutical research.