Clinical Integrity Suite (CIS)

A local Streamlit tool that runs a large library of built-in checks across SDTM/ADaM datasets and presents the findings in a two-tab dashboard. Each finding carries an impact level, a plain-English explanation, a resolution hint, and reproducible SAS for drilling into the offending records.

What it adds

Pinnacle 21 checks conformance to the CDISC standard. CIS runs locally before a P21 run and adds what a programmer actually needs to act: one impact scale, a plain-English explanation per finding, and copyable SAS to reproduce it.

How it works

Inputs: SAS7BDAT datasets; XPT datasets; SDTM + ADaM

Process: 1. Parser to common frame; 2. Domain-modular check registry; 3. Engine runs 260+ checks; 4. Impact + aggregator (1-5)

Outputs: Issue Summary tab; Detail tab + SAS; Rotating audit log

By the numbers

260+ built-in checks

109 test functions

1-5 impact scale

2 dashboard tabs

Screenshots

[Screenshot: sdtm-checks-cis-01-upload.png] The sidebar uploader with SAS7BDAT/XPT files dropped in and the Run Validation button visible.

[Screenshot: sdtm-checks-cis-02-summary.png] The Issue Summary tab showing findings grouped by impact level (Critical/High/Medium/Low/Info) with colour coding.

[Screenshot: sdtm-checks-cis-03-detail.png] The Detail tab for one finding showing the plain-English explanation, resolution hint, and the copyable SAS code block.

Data flow

SDTM and ADaM datasets accumulate data quality issues that are expensive to fix when they are found late. A full Pinnacle 21 run or sponsor review gives slow feedback to a programmer mid-build, so CIS runs its checks locally through the pipeline below.

Input: SAS7BDAT / XPT datasets (SDTM + ADaM)
        |
        v
  Parser (modules/parser.py)        normalises datasets into a common frame
        |
        v
  Check Registry (registry.py)      domain-modular checks: modules/checks/<domain>.py
        |                            (ae, lb, dm, cm, ds, eg, ex, vs, pk, onc ...)
        v
  Engine (engine.py)                runs each check, collects findings
        |
        v
  Impact + Aggregator               single 1-5 impact scale, dedupes, ranks
        |
        v
  Reporter (reporter.py)  -->  Two-tab Streamlit dashboard: Summary + Detail
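
The engine and aggregator stages in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code in engine.py: the `Finding` fields, the `run_checks` and `aggregate` names, and the convention that a lower impact number is more severe are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; frozen so it is hashable for de-duplication."""
    check_id: str
    domain: str
    impact: int   # assumption: 1 is the most severe level, 5 the least
    message: str


# A check takes the rows of one dataset and yields zero or more findings.
Check = Callable[[List[dict]], Iterable[Finding]]


def run_checks(datasets: Dict[str, List[dict]],
               checks: Dict[str, List[Check]]) -> List[Finding]:
    """Run every check registered for each loaded domain and collect the findings."""
    findings: List[Finding] = []
    for domain, rows in datasets.items():
        for check in checks.get(domain, []):
            findings.extend(check(rows))
    return findings


def aggregate(findings: Iterable[Finding]) -> List[Finding]:
    """Drop exact duplicates, then rank by impact, domain and check id."""
    unique = list(dict.fromkeys(findings))  # keeps first-seen order
    return sorted(unique, key=lambda f: (f.impact, f.domain, f.check_id))
```

Keeping the engine ignorant of what any individual check does is what lets the checks live in per-domain files: the engine only needs a mapping from domain code to callables.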

Engineering trade-offs

Domain-modular check files (modules/checks/<domain>.py)

Each SDTM domain owns its checks, so a programmer can find and extend AE checks without reading the whole engine.
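
One common way to get this kind of per-domain organisation is a decorator-based registry. The sketch below is an assumption about how such a registry could look, not the contents of registry.py; the `register` and `checks_for` names and the `AE_EXAMPLE` check id are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical registry: domain code -> check functions registered for it.
_REGISTRY: Dict[str, List[Callable]] = defaultdict(list)


def register(domain: str, check_id: str):
    """Decorator that files a check function under its SDTM domain."""
    def decorator(func: Callable) -> Callable:
        func.check_id = check_id
        _REGISTRY[domain.upper()].append(func)
        return func
    return decorator


def checks_for(domain: str) -> List[Callable]:
    """Return the checks registered for a domain (empty list if none)."""
    return list(_REGISTRY.get(domain.upper(), []))


# In this sketch, the function below would live in modules/checks/ae.py.
@register("AE", "AE_EXAMPLE")
def ae_term_missing(rows: List[dict]) -> List[int]:
    """Return the positions of AE records whose AETERM is blank or missing."""
    return [i for i, row in enumerate(rows)
            if not (row.get("AETERM") or "").strip()]
```

With this shape, adding a check means writing one decorated function in the relevant domain file; nothing in the engine changes.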

Single 1-5 impact scale (v33 change)

Replaced four separate severity taxonomies with one consistent scale so findings are directly comparable across domains.
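
A single scale of this kind maps naturally onto an integer enum. The sketch below is illustrative only: the source gives the range 1-5 and the five labels, but which end of the range is Critical is an assumption here (1 is treated as most severe), and the `Impact`, `label` and `worst` names are hypothetical.

```python
from enum import IntEnum
from typing import Iterable


class Impact(IntEnum):
    """One severity scale for every domain (assumed: 1 = most severe)."""
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4
    INFO = 5


def label(impact: Impact) -> str:
    """Human-readable label such as '2 - High' for display in a dashboard."""
    return f"{impact.value} - {impact.name.title()}"


def worst(impacts: Iterable[Impact]) -> Impact:
    """Most severe impact in a group of findings, whatever domain they came from."""
    return min(impacts)
```

Because the members are plain integers, findings from any domain sort and compare directly, which is the point of replacing the separate taxonomies.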

Reproducible SAS emitted per finding

A programmer acts on a finding faster when they can paste the exact SAS to see the offending records themselves.
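
As a rough illustration of what emitting SAS per finding can involve, the sketch below renders a small PROC PRINT step from a check's filter condition. The function name, its parameters, and the `sdtm` libref are assumptions for the example; the actual SAS that CIS emits is not shown in the source.

```python
from typing import Sequence


def sas_drilldown(check_id: str, dataset: str, where: str,
                  variables: Sequence[str], libref: str = "sdtm") -> str:
    """Build a SAS step that lists the records matching a finding's condition."""
    lines = [
        f"/* {check_id}: list offending records */",
        f"proc print data={libref}.{dataset.lower()};",
        f"    where {where};",
        f"    var {' '.join(variables)};",
        "run;",
    ]
    return "\n".join(lines)


# Example: a hypothetical check that flags AE records with a missing AETERM.
snippet = sas_drilldown("AE_EXAMPLE", "AE", "missing(AETERM)",
                        ["USUBJID", "AESEQ", "AETERM"])
```

Generating the snippet from the same condition the check evaluates keeps the SAS and the Python finding from drifting apart.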

Centralised rotating log at logs/cis.log

Capping each log file at 5 MB and keeping 5 backups gives a durable audit trail without unbounded disk growth in a shared environment.
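
The standard library's `RotatingFileHandler` supports exactly this size-and-count policy. The sketch below shows one way to configure it; the `build_logger` name, the `cis` logger name, and the log format are assumptions, while the path and the 5 MB x 5 limits come from the text above.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def build_logger(log_path: str = "logs/cis.log") -> logging.Logger:
    """Return a logger that rotates at 5 MB and keeps 5 backup files."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger("cis")  # hypothetical logger name
    logger.setLevel(logging.INFO)

    # Attach the handler only once, even if build_logger is called repeatedly
    # (Streamlit re-executes the script on every interaction).
    if not any(isinstance(h, RotatingFileHandler) for h in logger.handlers):
        handler = RotatingFileHandler(
            path, maxBytes=5 * 1024 * 1024, backupCount=5, encoding="utf-8"
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

With these settings, disk use is bounded at roughly 30 MB: the active file plus five backups of up to 5 MB each.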

At a glance

A quick visual read of the countable facts; full detail in the table.

Built-in checks: 260

Test functions: 109

SDTM domains: 20

Unit: count.

Processing characteristics

Metric	Value	Notes
Built-in checks	260+	Domain-modular; v34 vectorised AE018/19/21, MH003, DS010, EG008
Test functions	109	Counted across the test suite
Impact scale	1-5	Critical / High / Medium / Low / Info
Dashboard tabs	2	Issue Summary + Detail (reduced from eight)
Input formats	SAS7BDAT, XPT	Read via pyreadstat
Logging	Rotating 5MB x 5	logs/cis.log
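
The table notes that both input formats are read via pyreadstat. A minimal sketch of a suffix-based loader is shown below. `pyreadstat.read_sas7bdat` and `pyreadstat.read_xport` are real functions that each return a `(DataFrame, metadata)` pair; everything else is an assumption, including the `reader_name` and `load_dataset` names and the guess that a "common frame" means upper-cased column names plus a domain tag taken from the file name.

```python
from pathlib import Path

# File suffix -> name of the pyreadstat reader function to use.
_READERS = {".sas7bdat": "read_sas7bdat", ".xpt": "read_xport"}


def reader_name(path) -> str:
    """Pick the pyreadstat reader for a dataset path, based on its suffix."""
    suffix = Path(path).suffix.lower()
    try:
        return _READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported dataset format: {suffix or path}") from None


def load_dataset(path):
    """Read a SAS7BDAT or XPT file and normalise it (assumed normalisation)."""
    import pyreadstat  # imported here so reader_name works without pyreadstat installed

    frame, _meta = getattr(pyreadstat, reader_name(path))(str(path))
    frame.columns = [col.upper() for col in frame.columns]
    frame.attrs["domain"] = Path(path).stem.upper()  # e.g. "ae.xpt" -> "AE"
    return frame
```

Rejecting unknown suffixes up front gives the user a clear error in the uploader instead of a parser failure deeper in the run.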

Functional wins

01 Runs 260+ SDTM/ADaM checks locally before a Pinnacle 21 run or sponsor review, shortening the feedback loop for the programmer.

02 Every finding carries an impact level, a plain-English explanation, a resolution hint, and copyable SAS to reproduce it.

03 Domain-modular check organisation lets new checks be added in the relevant domain file without touching the engine.

04 Hot-path checks are vectorised to keep large-dataset runs responsive.
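
To illustrate what vectorising a check means in pandas, the sketch below writes the same rule two ways: once row by row and once as whole-column operations. The rule itself (AE end date earlier than start date) and both function names are hypothetical; the source does not say what the vectorised checks such as AE018 actually test. The example assumes complete ISO 8601 dates; partial dates would need extra handling.

```python
import pandas as pd


def end_before_start_rowwise(ae: pd.DataFrame) -> list:
    """Row-by-row version: simple to read, slow on large datasets."""
    bad = []
    for idx, row in ae.iterrows():
        start = pd.to_datetime(row["AESTDTC"], errors="coerce")
        end = pd.to_datetime(row["AEENDTC"], errors="coerce")
        if pd.notna(start) and pd.notna(end) and end < start:
            bad.append(idx)
    return bad


def end_before_start_vectorised(ae: pd.DataFrame) -> list:
    """Vectorised version: parse each column once, compare as whole Series."""
    start = pd.to_datetime(ae["AESTDTC"], errors="coerce")
    end = pd.to_datetime(ae["AEENDTC"], errors="coerce")
    mask = start.notna() & end.notna() & (end < start)
    return ae.index[mask].tolist()


ae = pd.DataFrame({
    "AESTDTC": ["2024-01-10", "2024-02-01", "2024-03-05"],
    "AEENDTC": ["2024-01-12", "2024-01-15", ""],
})
flagged = end_before_start_vectorised(ae)
```

Both functions return the same index labels; the vectorised form avoids the per-row Python overhead, which is where the time goes on large datasets.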

Module dependencies

core

Python 3.9+
pyyaml

streamlit
fastapi
uvicorn

data

pandas
pyreadstat
openpyxl
xlsxwriter

testing

pytest