Smart Validation Tool — Nilesh Borade


A tool that scans SAS logs and LST listings, maps them to the TFL plan, checks run order and timing, detects empty and redundant outputs, and rolls everything into a health score with an HTML report.

What it adds

Conformance tools check datasets; this checks the delivery — logs, listings, run order, TFL coverage and redundancy. It is the operational QC layer no SDTM checker looks at.

How it works

Inputs: SAS logs, .LST listings, TFL plan

Process:
1. Scanner discovers files
2. Parsers extract status + timing
3. Order / coverage / redundancy
4. Severity + health score

Outputs: Health score, HTML report, CLI summary

Typical layout

By the numbers

Version: v0.13.1
Test functions: 315
Interfaces (CLI+UI): 2
Check families: 5

Screenshots

Screenshot 1 (smart-validation-01-scan.png): The app after a deliverable folder is selected, showing the list of logs and listings discovered by the scanner.

Screenshot 2 (smart-validation-02-score.png): The summary view showing the overall health score with severity breakdown across logs, timing, coverage and redundancy.

Screenshot 3 (smart-validation-03-coverage.png): The TFL coverage view mapping produced outputs to the planned TFL shells, highlighting missing or extra outputs.

Data flow

Before a deliverable goes out, a programmer must confirm that logs are clean, outputs ran in the right order, TFL coverage matches the planned shells, and nothing is empty or redundant. Doing this by hand across many files is slow, and problems are easy to miss.

Input: SAS logs + LST listings + TFL plan
        |
        v
  Scanner (scanner.py)            discovers logs/listings in the delivery
        |
        v
  Parsers (lst_parser.py, lst.py) extract status, timing, content signals
        |
        +--> Run-order check (run_order.py)
        +--> Timing checks (timing_checks.py)
        +--> TFL coverage (tfl_map.py, tnf_coverage.py)
        +--> Empty-dataset + redundancy (empty_dataset.py, redundancy.py)
        |
        v
  Severity + Health score (severity.py, health_score.py)
        |
        v
  Report (html_report.py)  -->  HTML report   |   CLI (Typer/Rich)
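
To make the parser stage concrete, here is a minimal sketch of pulling status and timing out of a SAS log. It relies only on two conventions of SAS logs: messages start with ERROR: or WARNING: (optionally with a numeric code such as "ERROR 22-322:"), and step timings appear on "real time" lines. The names summarize_log, LogSummary and SAMPLE_LOG are hypothetical and are not taken from the tool's modules.

```python
import re
from dataclasses import dataclass, field

# SAS writes messages as "ERROR: ..." or "ERROR 22-322: ..." at the start of a line.
MESSAGE_RE = re.compile(r"^(ERROR|WARNING)(?: \d+-\d+)?:", re.MULTILINE)
# Step timing looks like "real time   0.03 seconds" or "real time   1:02.50".
REAL_TIME_RE = re.compile(r"^\s*real time\s+([\d:.]+)", re.MULTILINE)


@dataclass
class LogSummary:  # hypothetical container, not the tool's own type
    errors: int = 0
    warnings: int = 0
    real_times: list[float] = field(default_factory=list)


def _to_seconds(stamp: str) -> float:
    """Convert 'ss.ss', 'm:ss.ss' or 'h:mm:ss.ss' to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def summarize_log(text: str) -> LogSummary:
    """Count ERROR/WARNING messages and collect every reported real time."""
    kinds = MESSAGE_RE.findall(text)
    return LogSummary(
        errors=kinds.count("ERROR"),
        warnings=kinds.count("WARNING"),
        real_times=[_to_seconds(s) for s in REAL_TIME_RE.findall(text)],
    )


SAMPLE_LOG = """\
NOTE: There were 10 observations read from the data set WORK.ADSL.
WARNING: Variable AGE is uninitialized.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.01 seconds
ERROR: File WORK.ADAE.DATA does not exist.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           1:02.50
      cpu time            0.50 seconds
"""

summary = summarize_log(SAMPLE_LOG)
print(summary.errors, summary.warnings, summary.real_times)
```

A real parser would likely also need to handle site-specific message whitelists and the session total that SAS prints at the end of a log; this sketch does neither.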

Engineering trade-offs

Both a Typer/Rich CLI and a Streamlit UI

The CLI fits a scripted pre-delivery gate; the UI fits an interactive review. Same engine, two front doors.

Single health score over many signals

Reviewers need one go/no-go read; the score rolls up logs, order, timing, coverage, and redundancy into a comparable number.
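
One way such a roll-up could work is a penalty model: start at 100 and subtract a weight per finding according to its severity. The sketch below illustrates that idea only. The severity labels, the weights in PENALTY and the function name health_score are assumptions, since the page does not state how the tool computes its score.

```python
from collections import Counter

# Hypothetical penalty per finding; the tool's real weights are not stated.
PENALTY = {"critical": 25, "major": 10, "minor": 2}


def health_score(severities: list[str]) -> int:
    """Return a 0-100 score: 100 minus the summed penalties, floored at 0."""
    counts = Counter(severities)
    penalty = sum(PENALTY[level] * n for level, n in counts.items())
    return max(0, 100 - penalty)


print(health_score([]))                              # 100
print(health_score(["critical", "minor", "minor"]))  # 71
print(health_score(["critical"] * 5))                # 0
```

Flooring at zero keeps scores comparable across deliverables of different sizes, at the cost of hiding how far below zero a very bad run would fall.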

TFL plan as the coverage source of truth

Mapping outputs to the planned shells catches both missing outputs and unexpected extras, not just log errors.
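
At its core, this kind of coverage check is a set comparison between what the plan expects and what the delivery contains. The sketch below assumes that plan entries and output files can be matched on a lowercase file stem; the matching rule and the names normalize and coverage are illustrative, not the tool's API.

```python
from pathlib import Path


def normalize(name: str) -> str:
    # Assumption: plan entries and output files match on lowercase file stem.
    return Path(name).stem.lower()


def coverage(planned: list[str], produced: list[str]) -> dict[str, list[str]]:
    """Compare planned shells with produced outputs using set differences."""
    plan = {normalize(p) for p in planned}
    made = {normalize(p) for p in produced}
    return {
        "matched": sorted(plan & made),
        "missing": sorted(plan - made),  # planned but never produced
        "extra": sorted(made - plan),    # produced but not in the plan
    }


result = coverage(
    planned=["t_14_1_1", "l_16_2_1"],
    produced=["T_14_1_1.lst", "t_99_9_9.lst"],
)
print(result)
# {'matched': ['t_14_1_1'], 'missing': ['l_16_2_1'], 'extra': ['t_99_9_9']}
```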

Pure Python with optional .exe packaging

It runs in a locked-down environment without installation, and it can be packaged as a standalone .exe where Python is unavailable.

At a glance

A quick visual read of the countable facts; full detail in the table.

Test functions: 315

Check families: 5

Interfaces: 2


Processing characteristics

Metric	Value	Notes
Test functions	315	Largest suite in the portfolio
Version	0.13.1	From the package __version__
Interfaces	CLI + Streamlit	Typer/Rich CLI and a Streamlit UI
Inputs	Logs, LST, TFL plan	SAS logs and listings mapped to the plan
Output	HTML report	Health score with severity breakdown
Checks	Order/timing/coverage/empty/redundant	Rolled into the health score
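
The order check in the table can be pictured as a comparison of finish times against an expected program sequence. This is a sketch under assumptions: the expected order is given as a plain list, finish times come from log metadata, and the function name out_of_order is hypothetical.

```python
from datetime import datetime


def out_of_order(
    expected: list[str], finished: dict[str, datetime]
) -> list[tuple[str, str]]:
    """Return (earlier, later) pairs where the later program finished first.

    Programs with no recorded finish time are skipped, not flagged.
    """
    ran = [name for name in expected if name in finished]
    return [(a, b) for a, b in zip(ran, ran[1:]) if finished[b] < finished[a]]


finish_times = {
    "adsl": datetime(2024, 1, 5, 10, 0),
    "adae": datetime(2024, 1, 5, 9, 30),   # ran before adsl: suspicious
    "t_ae": datetime(2024, 1, 5, 10, 30),
}
print(out_of_order(["adsl", "adae", "t_ae"], finish_times))
# [('adsl', 'adae')]
```

Only adjacent pairs are compared here, which keeps the report short; a stricter variant could compare every program against all of its predecessors.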

Functional wins

1. Scans an entire deliverable of SAS logs and listings and maps them to the planned TFL shells in one pass.

2. Detects empty outputs, redundant outputs, run-order problems, and timing anomalies that are easy to miss by eye.

3. Rolls all signals into a single health score so a reviewer gets a clear go/no-go read.

4. Runs as a CLI for an automated pre-delivery gate or as a Streamlit app for interactive review.
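
Empty and redundant outputs, as mentioned in the list above, can be approximated with two simple tests: whitespace-only content counts as empty, and files with identical content hashes count as redundant. The sketch below is a deliberately crude illustration. Real listings carry titles, page numbers and timestamps that would need stripping first, and the names is_empty and find_duplicates are hypothetical.

```python
import hashlib


def is_empty(text: str) -> bool:
    """Treat an output with no non-whitespace characters as empty."""
    return not text.strip()


def find_duplicates(outputs: dict[str, str]) -> list[list[str]]:
    """Group output names whose text is byte-identical (SHA-256 of UTF-8)."""
    groups: dict[str, list[str]] = {}
    for name, text in outputs.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        groups.setdefault(digest, []).append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


outputs = {
    "t_14_1_1.lst": "Table 14.1.1\nN=120\n",
    "t_14_1_1_copy.lst": "Table 14.1.1\nN=120\n",
    "l_16_2_1.lst": "Listing 16.2.1\n",
    "t_14_3_1.lst": "   \n",
}
print([name for name, text in outputs.items() if is_empty(text)])
# ['t_14_3_1.lst']
print(find_duplicates(outputs))
# [['t_14_1_1.lst', 't_14_1_1_copy.lst']]
```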

Module dependencies

core

Python
typer
rich
pyyaml

streamlit
plotly

data

pandas
openpyxl

testing

pytest