A platform that parses a Statistical Analysis Plan (SAP), plus an optional TFL shell and ADaM spec, extracts each output with its class, type, and population, runs QC on the result, and presents an analytics dashboard with copyable SAS pseudocode. Extraction uses deterministic patterns rather than an LLM.
No conformance tool reads a Statistical Analysis Plan. This platform extracts the programming specification from the SAP itself, which is the step before any dataset exists to be checked.
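To make "deterministic patterns" concrete, here is a minimal sketch of pattern-based extraction with Python's `re` module. It is not the platform's actual code: the regular expression, the `TYPE_KEYWORDS` table, the `OutputSpec` fields, and the sample text are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical title pattern, e.g.
# "Table 14.1.1 Demographic Summary (Safety Population)".
OUTPUT_RE = re.compile(
    r"^(?P<cls>Table|Listing|Figure)[ \t]+"
    r"(?P<number>\d+(?:\.\d+)*)[ \t]*[:.]?[ \t]+"
    r"(?P<title>.+?)"
    r"(?:[ \t]*\((?P<population>[^()]*(?:Population|Set))\))?[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

# Assumed keyword-to-type mapping; a real registry would be much larger.
TYPE_KEYWORDS = [
    ("kaplan-meier", "time-to-event"),
    ("adverse event", "safety"),
    ("demograph", "demographics"),
]


@dataclass
class OutputSpec:
    output_class: str          # Table, Listing, or Figure
    number: str                # e.g. "14.1.1"
    title: str
    output_type: str           # inferred from title keywords
    population: Optional[str]  # None when the title names no population


def infer_type(title: str) -> str:
    """Return the first matching type keyword, else 'unclassified'."""
    lowered = title.lower()
    for keyword, output_type in TYPE_KEYWORDS:
        if keyword in lowered:
            return output_type
    return "unclassified"


def extract_outputs(sap_text: str) -> List[OutputSpec]:
    """Scan SAP text line by line for output titles."""
    specs = []
    for m in OUTPUT_RE.finditer(sap_text):
        title = m.group("title").strip()
        specs.append(OutputSpec(
            output_class=m.group("cls").capitalize(),
            number=m.group("number"),
            title=title,
            output_type=infer_type(title),
            population=m.group("population"),
        ))
    return specs


SAMPLE = """\
Table 14.1.1 Demographic Summary (Safety Population)
Efficacy analyses are described in Section 9.
Figure 14.2.1: Kaplan-Meier Plot of Overall Survival (Full Analysis Set)
Listing 16.2.7 Adverse Events
"""

for spec in extract_outputs(SAMPLE):
    print(spec)
```

Because the rules are plain regular expressions and keyword lookups, the same input always yields the same records, and a missed output can be traced to a specific pattern.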
How it works
Typical layout
By the numbers
Screenshots
[Screenshot placeholder: cti-platform-01-upload.png]
[Screenshot placeholder: cti-platform-02-outputs.png]
[Screenshot placeholder: cti-platform-03-analytics.png]
Data flow
A Statistical Analysis Plan describes dozens to hundreds of outputs in prose. Turning that into a structured programming specification by hand is slow, and details about output class, type, and population are easy to misread.
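As an illustration of what one entry in a "structured programming specification" might look like, the sketch below defines a single spec record and renders SAS pseudocode from it. The field names, the `adam` libref, and the generated SAS skeleton are assumptions for this example, not the platform's actual schema or templates.

```python
from dataclasses import dataclass


@dataclass
class OutputSpec:
    number: str           # e.g. "14.1.1"
    title: str
    output_class: str     # "Table", "Listing", or "Figure"
    dataset: str          # inferred ADaM dataset, e.g. "ADSL"
    population_flag: str  # population flag variable, e.g. "SAFFL"


def sas_pseudocode(spec: OutputSpec) -> str:
    """Render a SAS skeleton for one output (illustrative template only)."""
    lines = [
        f"/* {spec.output_class} {spec.number}: {spec.title} */",
        "data work.analysis;",
        f"    set adam.{spec.dataset.lower()};",
        f"    where {spec.population_flag} = 'Y';",
        "run;",
    ]
    # The final step depends on the output class.
    if spec.output_class == "Listing":
        lines += ["proc print data=work.analysis noobs;", "run;"]
    elif spec.output_class == "Figure":
        lines += ["/* plotting step goes here, e.g. proc sgplot */"]
    else:
        lines += ["/* summary step goes here, e.g. proc freq or proc means */"]
    return "\n".join(lines)


EXAMPLE = OutputSpec(
    number="14.1.1",
    title="Demographic Summary",
    output_class="Table",
    dataset="ADSL",
    population_flag="SAFFL",
)

print(sas_pseudocode(EXAMPLE))
```

In a Streamlit app, the returned string could be passed to `st.code` to give the reader a copyable block.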
Input: Primary SAP (DOCX/RTF/TXT)
       + optional TFL shell (DOCX) + optional ADaM spec (XLSX)
  |
  v
Parsers (src/parsers)              read documents into text + structure
  |
  v
Entity Extraction                  identify each output + attributes
  |
  v
Classification (utils/classification.py)
  |                                output class / type / population
  v
Rule Engine + Normalization        dataset inference, ADaM variable registry
  |
  v
QC Engine                          readiness score, category + severity
  |
  v
Streamlit dashboard --> outputs, analytics, SAS pseudocode, Excel export

Engineering trade-offs
At a glance
A quick visual read of the countable facts; full detail in the table.
Processing characteristics
| Metric | Value | Notes |
|---|---|---|
| Inputs | SAP + TFL shell + ADaM spec | SAP: DOCX/RTF/TXT; TFL shell: DOCX; ADaM spec: XLSX |
| Outputs per study | 40-100+ | Shell structure adds outputs beyond the SAP prose |
| Extraction method | Pattern-based | No LLM, no internet required |
| QC | Readiness score | Filterable by category and severity |
| Test fixtures | 5 mock SAPs + 150-page | Phase 3, oncology TTE, crossover PK, safety, adaptive |
| Output | SAS pseudocode + Excel | Copyable st.code SAS blocks; Excel workbook |
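
The QC row above mentions a readiness score that can be filtered by category and severity. The source does not say how the score is computed, so the following is only a sketch under assumed rules: each finding carries a category and a severity, and the score starts at 100 and loses a fixed penalty per finding. The penalty weights, category names, and messages are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed penalty per severity level; not taken from the platform.
SEVERITY_PENALTY = {"error": 25, "warning": 10, "info": 0}


@dataclass
class Finding:
    output_number: str
    category: str   # e.g. "population", "dataset", "title" (hypothetical)
    severity: str   # "error", "warning", or "info"
    message: str


def readiness_score(findings: List[Finding]) -> int:
    """Start at 100 and subtract a penalty per finding, floored at 0."""
    penalty = sum(SEVERITY_PENALTY[f.severity] for f in findings)
    return max(0, 100 - penalty)


def filter_findings(
    findings: List[Finding],
    category: Optional[str] = None,
    severity: Optional[str] = None,
) -> List[Finding]:
    """Keep findings matching the given category and/or severity."""
    return [
        f for f in findings
        if (category is None or f.category == category)
        and (severity is None or f.severity == severity)
    ]


FINDINGS = [
    Finding("14.1.1", "population", "error", "No analysis population identified"),
    Finding("14.2.1", "dataset", "warning", "Source dataset could not be inferred"),
    Finding("16.2.7", "title", "info", "Title differs between SAP and shell"),
]

print(readiness_score(FINDINGS))
print(filter_findings(FINDINGS, severity="error"))
```

A simple additive penalty like this keeps the score explainable: every lost point maps back to a specific finding that a programmer can filter for and resolve.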
Functional wins
Module dependencies
- Python
- streamlit
- pandas
- python-docx
- openpyxl
- pytest