A guardrailed clinical table shell compiler for validated R and SAS output
TLF Compiler V2 supports a clinical programming workflow where table shells, ADaM metadata, and programming conventions must stay aligned. It separates interpretation from code writing: the LLM parses and proposes structure, but code is generated only after the recipe passes validation.
The design uses a compiler-like pipeline. Instead of asking the model to write a full program directly, the system asks for a structured recipe. This gives the application a place to validate, repair, and fall back before emitting R or SAS.
{
  "approach": "tplyr",
  "dataset_var": "adae",
  "pre_filters": ["SAFFL == 'Y'", "TRTEMFL == 'Y'"],
  "derived_vars": [{"dataset_var": "adae", "name": "ANY_EVENT", "expr": "'Yes'"}],
  "tables": [{
    "table_var": "t1",
    "dataset_var": "adae",
    "treatment_var": "TRTP",
    "add_total": true,
    "layers": [
      {"type": "group_count", "var": "ANY_EVENT", "nested_var": null, "by_var": null, "distinct_by": "USUBJID"},
      {"type": "group_count", "var": "AEBODSYS", "nested_var": "AEDECOD", "by_var": null, "distinct_by": "USUBJID"}
    ]
  }],
  "combine_method": "bind_rows"
}
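The validate, repair, and fall-back sequence around a recipe like this one can be sketched in a few lines. This is an illustration only: the article does not say what language the application is written in, so Python is an assumption, and `compile_recipe`, `max_repairs`, and the three callables are hypothetical names rather than V2's actual interface.

```python
from typing import Callable, Dict, List, Tuple

Recipe = Dict[str, object]
Validator = Callable[[Recipe], List[str]]        # returns issue messages
Repairer = Callable[[Recipe, List[str]], Recipe]  # e.g. a second LLM call
Fallback = Callable[[], Recipe]                   # deterministic template


def compile_recipe(
    proposed: Recipe,
    validate: Validator,
    repair: Repairer,
    fallback: Fallback,
    max_repairs: int = 2,  # hypothetical limit, not taken from the article
) -> Tuple[Recipe, bool]:
    """Return (recipe, used_fallback). Only a recipe with no issues is returned."""
    recipe = proposed
    issues = validate(recipe)
    attempts = 0
    # Repair loop: re-validate after every repair attempt.
    while issues and attempts < max_repairs:
        recipe = repair(recipe, issues)
        issues = validate(recipe)
        attempts += 1
    if not issues:
        return recipe, False
    # Repairs did not clear the issues: use the deterministic fallback,
    # which must itself pass validation before any code is emitted.
    recipe = fallback()
    remaining = validate(recipe)
    if remaining:
        raise ValueError(f"fallback recipe is invalid: {remaining}")
    return recipe, True
```

The code generator is called only with the recipe this function returns, so an unvalidated recipe never reaches the R or SAS emitters.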
V2 validates the recipe at several levels. It checks that required fields exist, that values are plausible variable names, that treatment and layer variables are known, and that adverse-event tables use the required nested system organ class / preferred term (SOC/PT) pattern.
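A minimal sketch of those four checks follows, written in Python as an assumed implementation language. The required-field lists, the name pattern, and the function name `validate_recipe` are illustrative guesses based on the recipe shown earlier, not V2's real rules.

```python
import re
from typing import Any, Dict, List, Set

REQUIRED_TOP = ("approach", "dataset_var", "tables")
REQUIRED_TABLE = ("table_var", "dataset_var", "treatment_var", "layers")
LAYER_VAR_KEYS = ("var", "nested_var", "by_var", "distinct_by")
# Assumed rule for a "plausible" variable name; the real rule is not given.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_recipe(recipe: Dict[str, Any], known_vars: Set[str], is_ae: bool) -> List[str]:
    """Return a list of issue messages; an empty list means the recipe passed."""
    issues: List[str] = []
    # Level 1: required top-level fields.
    issues += [f"missing field: {k}" for k in REQUIRED_TOP if k not in recipe]

    # Derived variables count as known once the recipe declares them.
    allowed = set(known_vars) | {d.get("name") for d in recipe.get("derived_vars", [])}

    for table in recipe.get("tables", []):
        tid = table.get("table_var", "?")
        issues += [f"{tid}: missing field: {k}" for k in REQUIRED_TABLE if k not in table]

        layers = table.get("layers", [])
        names = [table.get("treatment_var")]
        for layer in layers:
            names += [layer.get(k) for k in LAYER_VAR_KEYS]

        for name in names:
            if name is None:
                continue  # null means "not used" in the recipe
            # Level 2: plausible name. Level 3: known in the metadata.
            if not NAME_RE.fullmatch(str(name)):
                issues.append(f"{tid}: implausible variable name: {name!r}")
            elif name not in allowed:
                issues.append(f"{tid}: unknown variable: {name}")

        # Level 4: AE tables need one layer with PT nested inside SOC.
        has_soc_pt = any(
            l.get("var") == "AEBODSYS" and l.get("nested_var") == "AEDECOD" for l in layers
        )
        if is_ae and not has_soc_pt:
            issues.append(f"{tid}: AE table lacks nested AEBODSYS/AEDECOD layer")
    return issues
```

Returning messages instead of raising lets the caller count issues before and after repair, which is what the evaluation harness reports.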
The same validated recipe now drives both program outputs. R uses Tplyr-oriented assembly; SAS uses PROC FREQ, PROC MEANS, DATA steps, and PROC LIFETEST for survival-style outputs. Keeping both outputs tied to the same recipe reduces drift.
A common failure mode is treating the visible PT examples on the shell as the complete output. V2 instead marks SOC and PT rows as dynamic and produces a nested layer so all observed SOC/PT values in ADAE can appear.
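# Piped onto a tplyr_table(); nests PT (AEDECOD) within SOC (AEBODSYS).
add_layer(
  group_count(vars(AEBODSYS, AEDECOD)) %>%
    set_distinct_by(USUBJID) %>%
    # Count layers take an unnamed f_str; distinct_n/distinct_pct count subjects, not events.
    set_format_strings(f_str("xx (xx.x%)", distinct_n, distinct_pct))
)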
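To show how a recipe layer could map onto a call like this, here is a small Python sketch that renders one `group_count` layer as Tplyr source text. The function name `render_count_layer` is hypothetical, Python is an assumed implementation language, and the sketch covers only the nesting, `by`, and distinct-subject parts of a layer. It assumes the recipe has already passed validation, so variable names are inserted as-is.

```python
from typing import Dict, Optional


def render_count_layer(layer: Dict[str, Optional[str]]) -> str:
    """Render one validated group_count layer as Tplyr code text."""
    # A nested layer passes both variables: outer (SOC) first, inner (PT) second.
    if layer.get("nested_var"):
        target = f"vars({layer['var']}, {layer['nested_var']})"
    else:
        target = str(layer["var"])

    args = [target]
    if layer.get("by_var"):
        args.append(f"by = {layer['by_var']}")

    steps = [f"group_count({', '.join(args)})"]
    if layer.get("distinct_by"):
        # Count each subject once per row instead of counting events.
        steps.append(f"set_distinct_by({layer['distinct_by']})")

    body = " %>%\n    ".join(steps)
    return f"add_layer(\n  {body}\n)"


soc_pt = {
    "type": "group_count",
    "var": "AEBODSYS",
    "nested_var": "AEDECOD",
    "by_var": None,
    "distinct_by": "USUBJID",
}
print(render_count_layer(soc_pt))
```

Because the row values come from the data at run time, the rendered layer never lists individual PTs, so the examples printed on the shell are not copied into the program.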
The evaluation harness runs golden cases under controlled settings. It records route accuracy, recipe issue counts before and after repair, fallback use, and AE nested-layer accuracy when applicable.
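As a rough illustration of how those per-case records could roll up into summary metrics, the Python sketch below aggregates a list of results. The `CaseResult` fields and the `summarize` function are hypothetical names chosen to mirror the metrics listed above; the harness's real schema is not shown in this article.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CaseResult:
    case_id: str
    route: str
    expected_route: str
    pre_repair_issues: int
    post_repair_issues: int
    used_fallback: bool
    # None when the case is not an adverse-event table.
    ae_nested_ok: Optional[bool] = None


def summarize(results: List[CaseResult]) -> Dict[str, Optional[float]]:
    """Aggregate golden-case results into the headline metrics."""
    if not results:
        raise ValueError("no results to summarize")
    n = len(results)
    ae_cases = [r for r in results if r.ae_nested_ok is not None]
    return {
        "route_accuracy": sum(r.route == r.expected_route for r in results) / n,
        "mean_pre_repair_issues": sum(r.pre_repair_issues for r in results) / n,
        "mean_post_repair_issues": sum(r.post_repair_issues for r in results) / n,
        "fallback_rate": sum(r.used_fallback for r in results) / n,
        # Computed over AE cases only, so non-AE cases do not dilute it.
        "ae_nested_accuracy": (
            sum(bool(r.ae_nested_ok) for r in ae_cases) / len(ae_cases) if ae_cases else None
        ),
    }
```

Keeping the pre-repair and post-repair counts separate makes it possible to tell whether the model's first proposal is improving or whether repair is doing most of the work.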
The app logs session and run identifiers, event sequence, status, and structured details. When a case fails, the log includes route, expected route, recipe issue counts, repair attempts, fallback use, and error text.
{
  "event": "eval_case_completed",
  "status": "WARNING",
  "details": {
    "case_id": "ae_soc_pt_core",
    "route": "ae",
    "expected_route": "ae",
    "post_repair_recipe_issue_count": 1,
    "used_deterministic_fallback": true
  }
}
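A record like this could be produced by a small structured logger that stamps each event with session and run identifiers and a sequence number. The Python sketch below is one way to do that under stated assumptions: the class name `RunLogger` and the key names `session_id`, `run_id`, and `seq` are hypothetical, since the sample record shows only `event`, `status`, and `details`.

```python
import itertools
import json
import uuid
from typing import Any, Optional


class RunLogger:
    """Emit one JSON line per event, ordered within a run by a sequence number."""

    def __init__(self, session_id: Optional[str] = None) -> None:
        self.session_id = session_id or uuid.uuid4().hex
        self.run_id = uuid.uuid4().hex
        self._seq = itertools.count(1)  # 1, 2, 3, ... within this run

    def event(self, event: str, status: str, **details: Any) -> str:
        record = {
            "session_id": self.session_id,
            "run_id": self.run_id,
            "seq": next(self._seq),
            "event": event,
            "status": status,
            "details": details,
        }
        return json.dumps(record)


log = RunLogger(session_id="demo-session")
line = log.event(
    "eval_case_completed",
    "WARNING",
    case_id="ae_soc_pt_core",
    route="ae",
    expected_route="ae",
    post_repair_recipe_issue_count=1,
    used_deterministic_fallback=True,
)
print(line)
```

Passing details as keyword arguments keeps them as structured fields rather than free text, so failed cases can be filtered by route or fallback use later.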
