DrugSynthAI
User Manual
AI-Governed Drug Discovery for Rare Diseases
DrugSynthAI is an AI-governed drug discovery platform that compresses the computational phases of drug development from years to minutes. It replaces manual target identification, molecular screening, and safety profiling with a 57-agent computational system (5 tiers: 11 domain pipeline, 15 platform validators, 6 orchestrator, 2 intelligence, 8 personalized medicine, plus 15 assistants) governed by stage gates, kill conditions, and immutable audit trails. A stage gate is a governance checkpoint between pipeline stages: all preconditions must be met and a StageDecisionRecord issued before advancement. The audit trail is a complete, immutable log of every decision, stage transition, and agent action in a campaign, written to YAML and SQLite by AuditEmitter.
The platform was built as a production research artifact under FxMEDUS LLC, with affiliation to Boston University's Department of Computer Science. The first domain application (a therapeutic-area-specific configuration running on DrugSynthAI) targets mitochondrial therapeutics for rare diseases with no approved treatments, including Leigh syndrome, MELAS, MERRF, LHON, and CPEO.
Every compound recommendation is backed by a StageDecisionRecord (SDR), a machine-readable governance document that records which agent evaluated the candidate, which criteria were applied, and what score was assigned. Each SDR is a cryptographically signed record of a stage transition, and no agent can advance to the next stage without an SDR approved by the RunController. No candidate advances without a valid SDR chain.
Intellectual Property Notice: DrugSynthAI platform architecture, multi-objective scoring function, reinforcement learning optimization protocol, and novel compound structures are protected under US Provisional Patent Application 64/018,624, filed March 27, 2026. Non-provisional filing target: March 27, 2027.
Start Here: 5 Minute Quick Guide
New to DrugSynthAI? This section gives you everything you need to understand the platform and see your first drug candidate in under five minutes. No prior drug discovery knowledge required.
Pick a Disease
Open the Dashboard and select any condition from the disease selector. Each one is mapped to real genetic targets from ClinVar and OMIM databases.
Run the Pipeline
Click Run Pipeline. All 57 agents activate in sequence: finding targets, designing molecules, scoring them, and optimizing the best candidates.
Explore Results
View designed molecules in 3D, inspect ADMET safety profiles, read the auto-generated IND filing package, and download the complete audit log.
Who Is This For?
DrugSynthAI is designed for three primary user groups, each engaging with the platform at different levels of technical depth.
What Problem Does It Solve?
Drug discovery is among the most expensive and failure-prone processes in science. The industry average for bringing a drug to approval is 10–15 years and $2.6 billion, and 95% of candidates still fail in clinical trials.
For rare diseases, the problem is even more acute: patient populations are too small to attract commercial investment, academic labs lack the computational infrastructure to run systematic screening, and most targets have never been computationally validated.
How Does It Work?
DrugSynthAI consists of two layers: the platform (governance, AI optimization, audit) and the domain application (disease-specific targets, scoring weights, constraints). The mitochondrial therapeutics program is the first domain application built on the platform.
Two-Layer Architecture
Platform layer: stage gate controller, SDR authority, audit trail, RL optimizer, registry snapshot manager. Domain-agnostic.
Domain application layer: 21 mitochondrial drug targets, disease-specific scoring weights, ADMET thresholds for mitochondrial therapeutics.
S00–S10 Stage Gate Pipeline
Every campaign flows through 11 stages, each with defined entry criteria, exit criteria, and kill conditions. A campaign is a complete drug discovery run from S00 (init) through S10 (IVVP packaging); each has a unique CMP-{8hex} ID and produces a governed artifact bundle. No candidate advances without a valid StageDecisionRecord (SDR).
- S00: Init
- S01: Acquisition
- S02: Prep
- S03: Generation
- S04: Expansion
- S05: Docking
- S06: Safety
- S07: Screening
- S08: Prior Art
- S09: Scoring
- S10: IVVP (In Vitro Validation Protocol), where top-scoring compounds are selected for wet lab testing; Tier 1 = highest experimental priority
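The ordering rule behind the stage list (no stage can be skipped, and a stage only counts once its SDR is issued) can be illustrated with a small sketch. This is not the platform's RunController: the `Campaign` class, its methods, and the `criteria_met` flag are hypothetical names used only for illustration, and kill conditions are left out for brevity.

```python
from dataclasses import dataclass, field

# Stage IDs and names as listed above (S00 through S10).
STAGES = [
    ("S00", "Init"), ("S01", "Acquisition"), ("S02", "Prep"),
    ("S03", "Generation"), ("S04", "Expansion"), ("S05", "Docking"),
    ("S06", "Safety"), ("S07", "Screening"), ("S08", "Prior Art"),
    ("S09", "Scoring"), ("S10", "IVVP"),
]


@dataclass
class Campaign:
    """Hypothetical campaign state: the stages that already have an SDR."""
    campaign_id: str
    sdr_chain: list = field(default_factory=list)

    def next_stage(self):
        """Return the ID of the next stage to run, or None when finished."""
        done = len(self.sdr_chain)
        return STAGES[done][0] if done < len(STAGES) else None

    def can_run(self, stage_id):
        """A stage may run only if it is the next one in order (no skipping)."""
        return stage_id == self.next_stage()

    def advance(self, stage_id, criteria_met):
        """Record an SDR for stage_id if it is next in order and its criteria pass."""
        if not self.can_run(stage_id):
            raise ValueError(f"cannot run {stage_id}: next stage is {self.next_stage()}")
        if not criteria_met:
            raise ValueError(f"{stage_id} exit criteria not met: no SDR issued")
        self.sdr_chain.append(stage_id)
```

Calling `advance("S05", True)` on a fresh campaign raises an error because S00 through S04 have no SDR yet, which is the behavior the gating rule describes.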
7-Component Reward Function
Stages S09–S10 score every candidate using a weighted composite of seven components: binding affinity, ADMET Tier A rate, rescue mechanism compatibility, structural novelty, synthetic accessibility, network perturbation score, and ΔΨm mitochondrial accumulation. The network perturbation score measures how strongly a candidate disrupts the disease-relevant signaling network; higher perturbation means greater therapeutic potential. Weights are calibrated for mitochondrial therapeutics and are IP-protected.
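A weighted composite of this kind can be sketched as follows. The real weights are not public, so the equal weights below are placeholders, and the component key names and the assumption that each component is already normalized to 0–1 are illustrative choices rather than the platform's definitions.

```python
import math

# Component names follow the seven components listed above; the keys are illustrative.
COMPONENTS = [
    "binding_affinity", "admet_tier_a_rate", "rescue_mechanism",
    "structural_novelty", "synthetic_accessibility",
    "network_perturbation", "delta_psi_m_accumulation",
]

# Placeholder only: the calibrated weights are IP-protected, so equal weights are assumed.
PLACEHOLDER_WEIGHTS = {name: 1.0 / len(COMPONENTS) for name in COMPONENTS}


def composite_score(component_scores, weights=None):
    """Weighted sum of component scores, each assumed to be normalized to [0, 1]."""
    weights = PLACEHOLDER_WEIGHTS if weights is None else weights
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1.0")
    for name in COMPONENTS:
        if not 0.0 <= component_scores[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    return sum(weights[name] * component_scores[name] for name in COMPONENTS)
```

Because the weights sum to 1.0 and each component lies in 0–1, the composite also stays in 0–1, which matches the score range shown on the Drug Candidates page.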
Page-by-Page Guide
- RL convergence chart (reward over iterations)
- Docking score scatter (ΔG vs composite score, a single 0–1 score combining binding affinity, ADMET, novelty, and synthetic accessibility, weighted by the reward function)
- ADMET tier distribution (Tier A / B / C)
- Score histogram (all 163 candidates)
- Reward radar (7-component breakdown)
- Agent fleet status grid (57 agents: 42 autonomous + 15 assistants)
- KPI row: candidates, Tier A count, novelty %, top score
- Click any chart title → tooltip explains what you're seeing
- Hover chart points → candidate ID and scores
- Click KPI cards → navigate to relevant page
- Top-right: export campaign summary PDF
- Search diseases by name (e.g. "Parkinson", "MELAS")
- 3D DNA helix visualization loads gene associations
- Gene panel shows GDA scores, evidence type, OMIM ID
- Select 1–10 genes → step tracker advances
- Configure campaign name and target constraints
- Click "Launch Campaign" → enters Wizard/S00
- GDA: Gene-Disease Association score (0–1)
- Evidence: genetic / literature / animal model
- OMIM: OMIM disease identifier
- Drug targets: whether an approved drug exists
- Druggability: predicted binding pocket quality
- Stage progress bar (S00–S10, 11 steps)
- Active stage: entry criteria, running agents, exit criteria
- Live agent activity feed (rolling log)
- Kill condition counter (how many candidates filtered)
- Intermediate candidate count per stage
- SDR chain viewer (click any stage for its SDR)
- Final candidate count and score distribution
- Export SDR chain as JSON or PDF
- Navigate to Drug Candidates for detailed review
- Navigate to Analytics for statistical breakdown
- Rotate: click + drag
- Zoom: scroll wheel
- Pan: right-click + drag
- Atom inspection: hover for element/charge
- Reset view: double-click canvas
- Molecular formula and InChIKey
- MW, LogP, HBD, HBA (Lipinski)
- SAScore (synthetic accessibility, 1–10; lower = easier to synthesize in a real chemistry lab)
- Composite score and ADMET tier
- Primary target and docking energy (predicted binding ΔG in kcal/mol; more negative = stronger binding)
- Orchestrator (S00): campaign init, config validation
- Target Agents (S01–S02): target acquisition, structure prep
- Chemistry Agents (S03–S04): fragment selection, analog expansion
- Docking Agent (S05): Vina scoring, binding pose
- ADMET Agents (S06–S07): Lipinski, hERG, Ames, hepatotox
- Novelty Agent (S08): ChEMBL Tanimoto check
- Scoring + RL (S09–S10): composite score, IVVP selection
- Click any node → agent detail panel
- Node size = number of decisions made
- Edge color = data flow direction
- Green pulse = currently active
- Filter by agent type using the legend
- ID: MITO_CPD_XXXX compound identifier
- Score: composite multi-objective score (0–1)
- ΔG (kcal/mol): Vina docking energy (more negative = better)
- ADMET Tier: A (best) / B / C based on 5 dimensions
- Novelty: Tanimoto similarity vs ChEMBL v34 (lower = more novel)
- SA Score: synthetic accessibility (1 = easy, 10 = hard)
- Target: primary protein target (gene symbol)
- Click column header to sort ascending / descending
- Click row → expanded detail panel with full ADMET breakdown
- Filter by ADMET tier using tier buttons
- Export button → CSV with all 32 fields
- IVVP button → export synthesis priority list
- IP Sentinel → run patent FTO check on selected candidates
- Active Programs — all campaigns with status and metrics
- IP Sentinel — patent FTO check against USPTO/EPO
- Competitive Landscape — approved drugs for same targets
- CYP Safety Matrix — CYP3A4/2D6/2C9 inhibition risk
- BBB Penetration — CNS accessibility predictions
- Patient Stratification — biomarker-linked subgroup analysis
- Compare two campaigns side by side
- Identify competitive differentiation for fundraising
- Check CYP liability before synthesis
- Assess program portfolio for investor review
- Export audit trail for regulatory submission
- RL convergence — reward vs iteration (moving average)
- Score distribution — histogram all 163
- Docking scatter — ΔG vs composite, colored by tier
- ADMET radar — per-compound 5-axis profile
- PCA clustering — chemical space coverage
- Volcano plot — score vs novelty
- Pathway heatmap — target coverage by pathway
- Correlation matrix — all scoring components
- Tanimoto diversity analysis (internal + vs approved)
- Lipinski compliance breakdown (76.1% pass)
- Veber compliance (84.1% pass)
- Oral BA prediction distribution
- Novelty rate vs ChEMBL v34 reference
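For readers unfamiliar with the Lipinski figure in the list above, the rule of five can be checked with the four properties shown on the compound detail panel (MW, LogP, HBD, HBA). The sketch below uses the standard published cutoffs; how the platform itself counts a "pass" is not documented here, so the `max_violations=1` default (Lipinski's original tolerance of one violation) is an assumption, as are the function names.

```python
def lipinski_violations(mw, logp, hbd, hba):
    """Count rule-of-five violations using the standard cutoffs."""
    checks = [
        mw > 500,   # molecular weight above 500 Da
        logp > 5,   # octanol-water partition coefficient above 5
        hbd > 5,    # more than 5 hydrogen-bond donors
        hba > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


def passes_lipinski(mw, logp, hbd, hba, max_violations=1):
    """Assumed pass criterion: no more than max_violations cutoffs exceeded."""
    return lipinski_violations(mw, logp, hbd, hba) <= max_violations
```

A compliance percentage like the one reported on the Analytics page would then be the share of candidates for which `passes_lipinski` returns true, under whatever tolerance the platform actually applies.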
- "What is the druggability of PINK1?"
- "Which candidates have the best ADMET profiles?"
- "Summarize the campaign results"
- "What is the mechanism of [compound ID]?"
- "Compare LONP1 and PINK1 as drug targets"
- "What diseases are associated with POLG mutations?"
- Queries the live registry data (not static responses)
- Explains any term or score in the platform
- Generates comparative summaries across candidates
- Retrieves literature context for targets
- Answers governance questions about SDR chains
- target_registry: 25 targets with druggability class
- compound_library: 201 reference compounds
- fragment_registry: 90 privileged fragments
- admet_baselines: 32-field ADMET profiles, 201 records
- defect_registry: 25 molecular defect entries
- constraint_policy: 5 hard + 3 soft constraints
- docking_results: AutoDock Vina run results
- novelty_assessment: ChEMBL Tanimoto scores
- ivvp_v1: IVVP synthesis priority records
- Browse by registry name from dropdown
- Search across all registries by field value
- Click any record → full YAML view
- Copy record to clipboard
- Download individual registry as YAML or CSV
The constraint_policy registry is the governance document that defines all kill conditions and soft constraints. The ivvp_v1 registry is the final output: the set of candidates selected for wet lab validation.
- Profile: name, email, ORCID, institution
- Platform: active campaign selector, data sources
- Appearance: dark / light mode toggle, font size
- Notifications: campaign completion alerts
- Export full audit trail as JSON
- Export SDR chain for regulatory review
- Export campaign summary as PDF report
- Download all registry snapshots (ZIP)
Example Workflow: End-to-End
Walk through a complete drug discovery program from disease selection to synthesis priority list. This example targets Alzheimer's disease.
Example Workflow: Precision Medicine (Single Patient)
This workflow demonstrates the patient-specific generative therapeutics pipeline. A clinician or patient uploads molecular data. The platform infers the disease-specific regulatory circuitry, simulates interventions, and generates multi-modal therapeutic candidates under full governance.
Pricing
- Full S00-S10 pipeline
- All 11 platform pages
- 300 disease database
- SDR audit trail export
- IVVP synthesis report
- 25 AI messages/month
- 1 active campaign
- CC BY 4.0 data license
- Everything in Explorer
- Patient data upload (VCF, RNA-seq)
- Regulatory graph inference
- Multi-modal candidates (small molecules + peptides)
- Patient therapeutic report (PDF)
- HIPAA-compliant data handling
- 10 campaigns/month
- Or $2,499/mo unlimited
- Everything in Precision Medicine
- Unlimited campaigns
- 500 AI messages/month
- Antibody design module
- CRO dispatch packages
- Regulatory readiness reports
- Programmatic API access
- BYOK for unlimited AI usage
- Everything in Professional
- Unlimited all dimensions
- Multi-user org accounts (RBAC)
- Private deployment (on-premise/cloud)
- Custom disease domains
- White-label branding
- 99.9% uptime SLA
- 15% annual discount
All tiers include full governance audit trail. Patent Pending. For Enterprise pricing: jyborges@bu.edu — FxMEDUS LLC, Boston, MA
Precision Medicine Module
Patient-Specific Generative Therapeutics extends DrugSynthAI from population-level drug discovery to individualized therapeutic design. Available in the Precision Medicine tier and above.
| Agent | ID | Function |
|---|---|---|
| Data Ingest Agent | DIA_001 | Parses VCF, clinical reports, RNA-seq into standardized patient profile |
| GRN Inference Agent | GRN_001 | Builds patient-specific gene regulatory network from multi-omic data |
| Perturbation Agent | PRT_001 | Simulates gene knockdowns, enhancer silencing, pathway inhibition |
| Antibody Designer | ABD_001 | CDR loop design, humanization scoring against target epitopes |
| Antibody Validator | ABV_001 | Developability, immunogenicity, manufacturability assessment |
| Cell Type Resolver | CTR_001 | Deconvolves bulk expression into cell-type fractions |
| Enhancer Mapper | ENH_001 | Maps enhancer-promoter interactions using Activity-by-Contact model |
| Motif Scanner | MOT_001 | Scans for transcription factor binding sites (JASPAR, HOCOMOCO) |
Platform Outputs
Every report, export, and data package the platform generates. Click any item for a detailed explanation of its contents and use case.
25 disease gene targets with evidence scores
25 targets, 31 edges, signaling network
25 binding pockets with structural data
5 kill conditions (a safety kill condition through a toxicity kill condition)
Available under NDA — exact weights redacted in public versions
Campaign objective parameters
These endpoints return real-time data and are consumed by the platform UI. They can also be accessed programmatically for integration with external tools.
Production level, Vina status, receptor count
Entry counts for all 26+ registry files
Campaign metrics, KPIs, agent throughput
Chromosome, protein, function, GDA score
Vina binding energies for all candidates
MD stability rankings for top compounds
BRICS retrosynthesis feasibility scores
8-node signaling network impact scores
25 Tanimoto novelty comparisons
Full compliance status and audit summary
OpenTargets genetic evidence
ChEMBL bioactivity data
FDA pharmacovigilance signals
ClinicalTrials.gov active trials
AIDD-GOV Governance Standard
DrugSynthAI is the reference implementation of AIDD-GOV (AI Drug Discovery Governance), the first open governance standard for AI-driven drug discovery pipelines. The standard defines machine-readable schemas for every governance artifact the platform produces.
What AIDD-GOV Requires
| Schema | Level | Platform Implementation |
|---|---|---|
| StageDecisionRecord | 1 (Core) | Immutable SDRs with SHA-256 checksums, issued at every stage gate |
| AuditEvent | 1 (Core) | Append-only audit log across 6 categories, SDK event publishing |
| StageGatePipeline | 1 (Core) | S00-S10 ordered pipeline, no skip, SDR gating enforced |
| ConstraintPolicy | 2 (Standard) | Hard/soft constraints + 5 kill conditions, locked at campaign init |
| RewardArchitecture | 2 (Standard) | 7-component RL reward function, weights sum to 1.0, convergence tracked |
| ConvergenceCriteria | 2 (Standard) | Plateau detection, Mann-Whitney U test, min/max epoch bounds |
| ObjectiveDefinitions | 3 (Full) | 5 objectives with normalization, direction, and weight specification |
| ToxicityAlerts | 3 (Full) | 72 structural alerts (PAINS + Brenk), hard_block/soft_flag disposition |
| ExclusionZones | 3 (Full) | Patent exclusion zone schema, Tanimoto similarity boundaries |
| KillSwitch | 3 (Full) | 5 kill conditions evaluated per epoch, campaign-level emergency halt |
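The SHA-256 checksum on a StageDecisionRecord can be illustrated with a short sketch. The record fields and the canonical JSON serialization (sorted keys, compact separators) are assumptions made for this example; the AIDD-GOV schema defines the actual field set and serialization.

```python
import hashlib
import json


def sdr_checksum(record):
    """SHA-256 over a canonical JSON form of the record (assumed serialization)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_sdr(record, expected_checksum):
    """Return True if the record still hashes to the checksum issued with it."""
    return sdr_checksum(record) == expected_checksum


# Hypothetical record: field names are illustrative, not the published schema.
example_sdr = {"campaign_id": "CMP-0000abcd", "stage": "S05", "decision": "ADVANCE"}
example_checksum = sdr_checksum(example_sdr)
```

Sorting keys before hashing means two records with the same content always produce the same checksum, while any change to a field value produces a different one.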
Blockchain Anchoring (Roadmap)
AIDD-GOV governance reports contain SHA-256 checksums for every StageDecisionRecord. These checksums form a verifiable chain of provenance from campaign initialization through final candidate nomination. A planned integration will anchor these checksum chains to a public blockchain, providing tamper-proof, third-party-verifiable evidence that no governance record was modified after issuance.
How it works: At campaign completion, the platform computes a Merkle root over all SDR checksums in the campaign ledger. This single hash is written to a public blockchain (target: Ethereum or Polygon). Anyone with the governance report can independently verify that every SDR matches the anchored Merkle root — without revealing any proprietary compound data or reward weights.
What this enables: A regulator, CRO partner, or disease foundation can verify that the governance trail presented to them is identical to the trail that existed at campaign completion. No trust in the platform operator is required — the blockchain provides the proof.
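Since this integration is on the roadmap, the following is only a sketch of how a Merkle root over SDR checksums could be computed. The pairing scheme (concatenate two child digests, hash with SHA-256, duplicate the last node on odd levels) is one common convention and an assumption here, not the platform's specified construction.

```python
import hashlib


def merkle_root(checksums):
    """Compute a Merkle root from a list of hex-encoded SHA-256 checksums."""
    if not checksums:
        raise ValueError("at least one checksum is required")
    level = [bytes.fromhex(c) for c in checksums]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

A verifier who holds the governance report recomputes the root from the report's SDR checksums and compares it with the anchored value; a change to any single checksum changes the root.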
Regulatory Alignment
AIDD-GOV aligns with FDA Draft Guidance on AI/ML in Drug Development (2023), ICH Q8(R2) Quality-by-Design principles, and 21 CFR Part 11 electronic records requirements. The open specification provides a framework for demonstrating algorithmic accountability when AI methods are used in drug candidate nomination.
Governance Intelligence
DrugSynthAI includes eight governance-exclusive features that are structurally impossible for competitors to replicate. Each feature requires the governed pipeline infrastructure (immutable StageDecisionRecords, append-only audit trails, pre-declared constraints) as a prerequisite. Platforms that operate as black boxes cannot build these capabilities without first retrofitting their entire architecture.
1. Compound Provenance Waterfall
What it proves: Any compound is traceable from gene selection to nomination in one visual.
Select any drug candidate and see its complete governed history: which gene was selected and why, which binding pocket was analyzed, which fragment was chosen, how the analog was expanded, what the docking result was, how ADMET profiling classified it, whether it passed novelty screening, how RL optimization ranked it, and whether it was nominated for experimental validation. Every checkpoint shows the corresponding StageDecisionRecord with its SHA-256 integrity checksum.
Endpoint: GET /api/provenance/{compound_id}
Access: Drug Candidates page → select compound → Provenance button
2. AIDD-GOV Self-Assessment Report
What it proves: Machine-readable compliance proof, third-party verifiable.
Generates a structured compliance report mapping each of the 10 AIDD-GOV schemas to its implementation artifact in the platform. A regulator, CRO partner, or disease foundation can independently verify that the platform implements every required schema at the claimed conformance level. The report is exportable as JSON for automated compliance checking.
Endpoint: GET /api/aidd-gov/self-assessment
Access: Dashboard → AIDD-GOV Compliance card → click for full report
3. Regulatory Readiness Score
What it proves: Computed percentage of the FDA pre-IND package already generated.
Maps platform artifacts to FDA pre-IND submission requirements and computes a readiness percentage. For each requirement (target identification, lead characterization, ADMET summary, synthesis feasibility, novelty assessment, clinical protocol), the score shows whether the governed pipeline has already produced the necessary documentation. Items requiring wet lab validation (GLP toxicology, clinical pharmacology) are marked as pending with clear next steps.
Endpoint: GET /api/regulatory/readiness/{campaign_id}
Access: Dashboard → Regulatory Readiness gauge → click for full FDA checklist
4. Campaign Replay (DVR)
What it proves: Step-by-step playback of a governed campaign.
Replays an entire campaign execution from S00 initialization through S10 nomination, showing stage-by-stage timing, inputs, outputs, and SDR decisions. Functions like a DVR for drug discovery: pause at any stage, inspect the governance decision, examine the data that was evaluated, and see the exact criteria that were applied. The RL convergence trace animates as the replay progresses, showing how candidate scores improved across optimization iterations.
Endpoint: GET /api/replay/{campaign_id}
Access: Dashboard → Campaign card → Replay button
5. Governance Diff Engine
What it proves: Side-by-side comparison of two campaign governance trails.
Compares two campaigns stage by stage: how target selection differed, whether constraint policies changed, how RL convergence rates compared, and where scoring outcomes diverged. Color-coded diffs highlight improvements (green), regressions (red), and unchanged parameters (gray). Enables systematic analysis of how different configurations affect pipeline outcomes while maintaining governance traceability for both campaigns.
Endpoint: GET /api/governance/diff?campaign_a={id}&campaign_b={id}
Access: Analytics → Governance Analytics → Campaign Diff selector
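For programmatic access, the diff endpoint's URL can be assembled as in the sketch below. The base URL is a placeholder (the manual does not state the deployment host), the function name is hypothetical, and authentication is not covered.

```python
from urllib.parse import urlencode, urljoin


def governance_diff_url(base_url, campaign_a, campaign_b):
    """Build the URL for GET /api/governance/diff with both campaign IDs."""
    query = urlencode({"campaign_a": campaign_a, "campaign_b": campaign_b})
    return urljoin(base_url, "/api/governance/diff") + "?" + query


# Placeholder host and campaign IDs in the documented CMP-{8hex} format.
url = governance_diff_url("https://drugsynthai.example", "CMP-1a2b3c4d", "CMP-5e6f7a8b")
```

The resulting URL can be passed to any HTTP client; `urlencode` takes care of escaping if a campaign ID ever contains characters that are not URL-safe.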
6. Kill Switch Dashboard
What it proves: Real-time visualization of safety monitoring.
Displays all kill switch conditions defined for a campaign with their current values, thresholds, and headroom. Each kill switch (reward collapse, constraint violation rate, convergence failure, toxicity rate, resource exhaustion) shows a gauge meter indicating how far the current state is from the trigger threshold. Evaluation counts and trigger history are displayed for full audit transparency. A green shield badge confirms when all switches are in safe state.
Endpoint: GET /api/kill-switches/{campaign_id}
Access: Dashboard → Kill Switch Status card → click for full panel
7. Constraint Sensitivity Analysis
What it proves: Constraints are meaningful, not arbitrary.
For each hard constraint in the constraint policy, simulates what happens if the threshold is tightened by 10%, 25%, and 50%. Shows how many candidates would survive at each threshold level. This proves that constraints are calibrated to the actual compound library rather than set at arbitrary default values. A constraint where tightening by 50% eliminates 40% of candidates is meaningfully engaged with the data. A constraint where even 50% tightening changes nothing indicates headroom in the current library.
Endpoint: GET /api/sensitivity/{campaign_id}
Access: Analytics → Governance Analytics → Constraint Sensitivity chart
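The tightening procedure can be sketched for a single constraint as follows. The sketch assumes an upper-bound constraint (a candidate survives when its value is at or below the threshold) and uses made-up molecular weights; lower-bound constraints would tighten in the opposite direction, and the function name is illustrative.

```python
def constraint_sensitivity(values, threshold, tighten_by=(0.10, 0.25, 0.50)):
    """Count survivors of an upper-bound constraint at each tightened threshold."""
    survivors = {}
    for fraction in tighten_by:
        tightened = threshold * (1.0 - fraction)  # e.g. 500 -> 450 at 10%
        survivors[fraction] = sum(1 for v in values if v <= tightened)
    return survivors


# Illustrative data only: five candidate molecular weights against a 500 Da limit.
example = constraint_sensitivity([200, 300, 400, 460, 480], threshold=500)
```

With these illustrative values, all five candidates pass the original limit, but the survivor count drops at every tightening step, which is the pattern the report describes as a constraint that is engaged with the data.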
8. Cross-Campaign Meta-Analysis
What it proves: Aggregate learning across campaigns without exposing compound structures.
Aggregates statistical patterns across all governed campaigns: ADMET score distributions, molecular weight histograms, binding affinity ranges, RL convergence rates, constraint violation counts, and governance compliance rates. No individual compound data is exposed. This enables the platform to demonstrate learning effects (do later campaigns converge faster?) and quality trends (are ADMET profiles improving?) while preserving compound confidentiality for each campaign commissioner.
Endpoint: GET /api/meta-analysis
Access: Dashboard → Meta-Analysis card → Analytics → Cross-Campaign section
Why Competitors Cannot Build These
| Feature | Prerequisite | Competitor Status |
|---|---|---|
| Compound Provenance | Immutable SDR chain across all stages | No SDRs exist in any competitor platform |
| AIDD-GOV Self-Assessment | Published governance standard | No open standard exists outside DrugSynthAI |
| Regulatory Readiness | Governed artifacts mapped to FDA requirements | Black-box outputs cannot map to regulatory structure |
| Campaign Replay | Append-only audit trail with timestamps | No competitor logs decisions at this granularity |
| Governance Diff | Structured governance trails in both campaigns | Cannot compare what is not recorded |
| Kill Switch Dashboard | Pre-declared kill conditions with monitoring | No competitor publishes kill switch definitions |
| Constraint Sensitivity | Pre-declared constraints locked at init | Competitors modify constraints mid-run |
| Cross-Campaign Meta | Multiple governed campaigns with structured outputs | Black-box outputs cannot be aggregated structurally |
Real-World Evidence Integration
When you select a disease in the Discovery Lab, DrugSynthAI automatically fetches epidemiological and clinical evidence from four public databases in parallel, grounding your drug discovery campaign in real patient data before a single compound is generated.
Live Public Sources
Planned Registry Sources (Pending Data Sharing)
Infrastructure is ready. Institutional data sharing agreements required.
- UMDF Patient Registry — United Mitochondrial Disease Foundation longitudinal patient data
- NIH RDCRN — Rare Diseases Clinical Research Network multi-site cohorts
- MitoSHARE — International mitochondrial disease biobank
- EHR via FHIR R4 — De-identified electronic health records for real-world outcome correlations
Privacy and Data Safety
All evidence displayed in the Real-World Evidence panel is aggregate, population-level data only. No patient-identifiable information is fetched, stored, or displayed. ClinVar variant data is de-identified by definition. Orphanet and GARD data are fully public.
Evidence Panel (Discovery Lab)
When you select a disease in Step 1 of the Discovery Lab, the RWE panel automatically appears below the disease grid showing: prevalence class, total pathogenic variant count across your selected genes (ClinVar), HPO phenotype count, and inheritance pattern. The top 12 HPO phenotype terms are displayed as clickable chips. All four sources query in parallel — typical response time is under 3 seconds.
GOV-FM: Governance Foundation Model
GOV-FM is a self-improving governance quality scoring system built into DrugSynthAI. Every completed campaign generates a GovernanceCampaignRecord — a structured snapshot of governance metadata. The record is scored across 5 dimensions, gaps are identified, and recommendations are stored as governance memory for the next campaign. The loop compounds over time.
What GOV-FM Evaluates (5 Dimensions)
| Dimension | Weight | What Is Scored |
|---|---|---|
| D1 Decision Provenance | 25% | SDR chain completeness across all 11 stages (S00–S10); checksum coverage on each StageDecisionRecord |
| D2 Constraint Integrity | 20% | Whether constraints were declared before campaign start, if they were modified mid-run, kill condition count, and violation rate |
| D3 Optimization Transparency | 20% | Reward function documentation, weights summing to 1.0, pre-specified convergence criteria, trace recorded, and improvement demonstrated |
| D4 Audit Completeness | 20% | Coverage across 6 required audit categories (stage, job, artifact, policy, security, rl), event density per stage, and actor attribution rate |
| D5 Regulatory Alignment | 15% | FDA section coverage (ADMET, docking, toxicity, novelty, provenance, audit, governance), 21 CFR Part 11 compatibility, and compound provenance chain completeness |
Risk Thresholds
| Score | Risk Level | Implication |
|---|---|---|
| > 80% | LOW | Campaign is publication-ready; governance moat established |
| 61–80% | MEDIUM | Minor gaps present; address before regulatory submission |
| 36–60% | HIGH | Significant governance deficiencies; remediate before advancing |
| ≤ 35% | CRITICAL | Governance not established; do not advance to wet lab |
How Scoring Works — Phase 1 (Rule-Based)
GOV-FM Phase 1 uses a deterministic rule engine. Each dimension is scored 0–1 using conditional logic derived from the AIDD Governance Standards v1.0 and 21 CFR Part 11. Gaps are identified as specific violations; recommendations are actionable remediation steps. The composite score is the weighted average of all 5 dimension scores.
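The weighted average and the risk mapping can be sketched directly from the two tables above. The dimension keys are illustrative names, and because the published bands leave small gaps (for example between 60% and 61%), the cutoffs below treat each band as "greater than the lower band's upper bound", which is an assumption.

```python
# Weights from the dimension table (D1 through D5); key names are illustrative.
GOV_FM_WEIGHTS = {
    "decision_provenance": 0.25,
    "constraint_integrity": 0.20,
    "optimization_transparency": 0.20,
    "audit_completeness": 0.20,
    "regulatory_alignment": 0.15,
}


def gov_fm_composite(dimension_scores):
    """Weighted average of the five dimension scores, each in [0, 1]."""
    return sum(w * dimension_scores[name] for name, w in GOV_FM_WEIGHTS.items())


def risk_level(composite):
    """Map a composite score in [0, 1] to a risk level (band edges assumed)."""
    if composite > 0.80:
        return "LOW"
    if composite > 0.60:
        return "MEDIUM"
    if composite > 0.35:
        return "HIGH"
    return "CRITICAL"
```

For example, a campaign scoring 1.0 on D1 and D5 and 0.5 on the other three dimensions has a composite of 0.70 and falls in the MEDIUM band.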
Phase 2 (planned): LLM-based semantic scoring of governance narratives, cross-campaign pattern learning, and automated gap prioritization using campaign outcome data.
The Self-Improving Loop
Governance Memory
The file data/gov_fm_training/governance memory persists across campaigns and contains:
- last_campaign: ID of the most recently scored campaign
- last_score: composite governance score (0.0–1.0)
- trend: FIRST_CAMPAIGN | IMPROVING | STABLE | DECLINING
- active_recommendations: actionable steps from the last scoring run
- total_campaigns_scored: how many campaigns have been scored to date
Improvement Trend Tracking
| Trend | Condition |
|---|---|
| FIRST_CAMPAIGN | No previous score exists in governance memory |
| IMPROVING | Current composite > previous + 0.02 |
| STABLE | Within ±0.02 of previous score |
| DECLINING | Current composite < previous − 0.02 |
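The trend rules in the table translate into a few comparisons. The function name and the use of `None` to represent "no previous score" are illustrative choices, not the platform's interface.

```python
def classify_trend(current, previous=None, tolerance=0.02):
    """Classify the governance score trend using the ±0.02 band from the table."""
    if previous is None:
        return "FIRST_CAMPAIGN"  # no previous score in governance memory
    delta = current - previous
    if delta > tolerance:
        return "IMPROVING"
    if delta < -tolerance:
        return "DECLINING"
    return "STABLE"
```

A move from 0.70 to 0.71 is reported as STABLE because it falls inside the tolerance band, while a move from 0.70 to 0.80 is IMPROVING.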
Data Privacy
GovernanceCampaignRecord contains only governance metadata, never compound SMILES, molecular weights, patient data, or optimization reward weights. Training data is safe for institutional sharing under standard data use agreements.
Training Data Location
All training data accumulates at data/gov_fm_training/:
- {campaign_id}.yaml: GovernanceCampaignRecord (training input)
- {campaign_id} score data: GovernanceScore (training label)
- governance memory: active recommendations for next campaign
Dashboard Integration
The GOV-FM Score card on the Dashboard displays the composite score, risk level, trend indicator, 5-dimension progress bars, detected gaps, and the top recommendation for any selected campaign. Select a campaign from the dropdown to load its score in real time via GET /api/governance/score/{campaign_id}.
Synthesis & CRO Dispatch
DrugSynthAI closes the gap between in silico optimization and the first physical molecule. After campaign completion, any nominated compound can be routed through the synthesis planning pipeline to produce a complete CRO dispatch package — ready to send to a contract research organization for synthesis and biological testing.
ASKCOS Retrosynthesis (MIT)
ASKCOS routes include: step-by-step retrosynthetic disconnections, reaction template scores (success probability), literature precedent counts per reaction, commercially available starting materials (eMolecules 7M+ compounds, ZINC15 230M+ compounds), and estimated synthesis difficulty (LOW / MODERATE / HIGH / VERY_HIGH). Availability rate and price estimates are included in every synthesis route response.
CRO Brief Generator
The CRO Brief Generator produces a complete dispatch package from campaign data. A single API call generates everything a contract research organization needs to quote and execute.
Assay Protocol Library
The library (registries/assay protocol library) contains pre-defined protocols for 5 mitochondrial disease targets, plus three standard panels applied to all compounds.
| Target | Assays |
|---|---|
| DNM1L (DRP1) | GTPase activity (malachite green), mitochondrial morphology (confocal) |
| PINK1 | Kinase activity (ubiquitin S65-P), mitophagy flux (mito-Keima) |
| NFE2L2 (Nrf2) | Keap1-Nrf2 PPI displacement (FP), ARE reporter (luciferase) |
| NDUFV1 | Complex I activity (spectrophotometric), OCR (Seahorse XF) |
| SDHA | Complex II activity (DCPIP reduction), OCR (Seahorse XF) |
How to Use
API Endpoints Phase A
Patient Intelligence
Phase B closes the gap between optimized compounds and real patients. Every gene click in the Discovery Lab now fires the Patient Intelligence panel, connecting genotype data to candidate compounds, population estimates, and companion diagnostic specifications.
Genotype-to-Compound Matching
Given a patient's genetic variant, the platform automatically selects the correct rescue compound using a 4-tier matching hierarchy.
Population Estimation
For each variant, the platform estimates the global patient population using gnomAD allele frequencies, Orphanet prevalence data, and Hardy-Weinberg equilibrium calculations. The pre-computed registry covers 14 variants representing 54,204 patients globally.
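The manual does not spell out the exact calculation, so the following is only a minimal sketch of the Hardy-Weinberg step for an autosomal recessive variant. It assumes a single allele frequency q (for example from gnomAD) and a reference population size. The function names and the input values in the usage note are hypothetical, and the Orphanet prevalence adjustment is not modeled.

```python
def estimate_affected(allele_freq: float, population: float) -> float:
    """Expected homozygous (affected) individuals: q^2 * N."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be between 0 and 1")
    return allele_freq ** 2 * population


def estimate_carriers(allele_freq: float, population: float) -> float:
    """Expected heterozygous carriers: 2 * p * q * N, with p = 1 - q."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be between 0 and 1")
    return 2.0 * (1.0 - allele_freq) * allele_freq * population
```

Calling `estimate_affected(0.001, 8_000_000_000)` with these illustrative inputs gives the expected number of homozygous individuals under equilibrium. Mitochondrial-DNA variants do not follow Hardy-Weinberg proportions, so this sketch applies only to nuclear genes.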
Companion Diagnostic Specification
Each campaign generates a 3-tier companion diagnostic panel specification, defining the genetic test needed to identify eligible patients.
| Tier | Technology | Cost | TAT | Use Case |
|---|---|---|---|---|
| 1 | Targeted PCR | $250 | 7 days | Known variants, clinical screening |
| 2 | Gene Panel (NGS) | $1,500 | 14 days | All coding variants in target genes |
| 3 | Whole Exome (WES) | $3,500 | 28 days | Novel variant discovery, research cohorts |
Discovery Lab Integration
When you click a gene in the Discovery Lab, the Patient Intelligence panel fires automatically, displaying: matched compound candidates with rescue direction, estimated patient population for the selected variant, and the recommended diagnostic tier. This data flows directly into the CRO Brief and foundation proposal generator.
API Endpoints Phase B
Molecular Dynamics & Free Energy Perturbation
Phase C adds physics-based validation to every shortlisted compound. Molecular dynamics (MD) simulates how each compound behaves in a solvated protein environment over time, while free energy perturbation (FEP) estimates binding affinity as ΔΔG = ΔG_complex − ΔG_solvent.
MD Simulation Pipeline
| Stage | Method | Duration / Parameters |
|---|---|---|
| Structure repair | PDBFixer | Missing residues, non-standard AA, hydrogens at pH 7.4 |
| Solvation | TIP3P + 150 mM NaCl | 10 Å padding, periodic box |
| Force field | AMBER14-all + tip3pew | PME, HBonds constraints |
| HMR | Hydrogen mass repartitioning | 4 fs timestep (2× standard) |
| Energy minimize | L-BFGS | 2,000 steps |
| NVT equilibration | Langevin Middle | 100 ps, 310 K |
| NPT equilibration | Langevin + Monte Carlo barostat | 200 ps, 1 atm |
| Production MD | NPT, 4 fs timestep | 500 ps, trajectory recorded every 4 ps |
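The production-stage parameters translate into integrator steps and trajectory frames as follows. This is plain arithmetic on the values in the table; the constant and function names are illustrative only.

```python
TIMESTEP_FS = 4          # HMR-enabled timestep from the table
PRODUCTION_PS = 500      # production MD length
REPORT_INTERVAL_PS = 4   # trajectory recording interval


def ps_to_steps(duration_ps: int, timestep_fs: int = TIMESTEP_FS) -> int:
    """Convert a duration in picoseconds to integrator steps (1 ps = 1000 fs)."""
    return duration_ps * 1000 // timestep_fs


production_steps = ps_to_steps(PRODUCTION_PS)    # steps in the production run
report_every = ps_to_steps(REPORT_INTERVAL_PS)   # steps between saved frames
n_frames = production_steps // report_every      # frames written to the trajectory
```

With these settings the production run is 125,000 steps, with one frame saved every 1,000 steps, for 125 frames in total.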
Stability Scoring
Each compound receives a stability score (0.0–1.0) based on three trajectory metrics:
| Metric | Weight | Threshold |
|---|---|---|
| RMSD mean (lower → better) | 40% | < 0.20 nm = STABLE |
| RMSD std dev (lower → better) | 30% | < 0.10 nm = STABLE |
| H-bond occupancy (higher → better) | 30% | > 60% = good network |
Classification: STABLE (RMSD < 0.20 nm, H-bond > 60%) · FLEXIBLE (RMSD < 0.35 nm) · UNSTABLE (RMSD ≥ 0.35 nm).
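The classification rule can be sketched as below. The function name is hypothetical, and H-bond occupancy is assumed to be a fraction between 0 and 1. One assumption is worth flagging: a compound with RMSD below 0.20 nm but H-bond occupancy at or below 60% falls through to FLEXIBLE in this reading. The formula behind the weighted 0.0–1.0 score is not given in this manual, so it is not reproduced here.

```python
def classify_stability(rmsd_mean_nm: float, hbond_occupancy: float) -> str:
    """Classify a trajectory from mean RMSD (nm) and H-bond occupancy (0-1)."""
    if rmsd_mean_nm < 0.20 and hbond_occupancy > 0.60:
        return "STABLE"
    if rmsd_mean_nm < 0.35:
        # Includes low-RMSD compounds whose H-bond network is weak (assumption).
        return "FLEXIBLE"
    return "UNSTABLE"
```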
Free Energy Perturbation
FEP uses alchemical λ-windows (12 equidistant steps, λ = 0 → 1) to perturb each compound between its bound (complex) and unbound (solvent) states. ΔΔG is computed using a descriptor-based proxy, with BAR/TI scaffolding in place for full alchemical activation. Each result is classified as:
- FAVORABLE: ΔΔG < −1.0 kcal/mol (predicted binder)
- NEUTRAL: −1.0 ≤ ΔΔG ≤ +1.0 kcal/mol
- UNFAVORABLE: ΔΔG > +1.0 kcal/mol (predicted non-binder)
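As a sketch of the λ schedule and the classification thresholds above, assuming the 12 windows include both endpoints (λ = 0 and λ = 1). The function names are illustrative, and the code does not compute ΔΔG itself.

```python
from typing import List


def lambda_windows(n_windows: int = 12) -> List[float]:
    """Equidistant lambda values from 0.0 to 1.0, endpoints included (assumption)."""
    if n_windows < 2:
        raise ValueError("need at least two windows")
    return [i / (n_windows - 1) for i in range(n_windows)]


def classify_ddg(ddg_kcal_mol: float) -> str:
    """Apply the FAVORABLE / NEUTRAL / UNFAVORABLE thresholds (kcal/mol)."""
    if ddg_kcal_mol < -1.0:
        return "FAVORABLE"
    if ddg_kcal_mol > 1.0:
        return "UNFAVORABLE"
    return "NEUTRAL"  # -1.0 <= ddG <= +1.0, boundaries included
```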
GPU Compute Platforms
| Platform | Device | Est. Speed |
|---|---|---|
| CUDA | NVIDIA GPU (T4, A100) | ~500 ns/day |
| OpenCL | Apple M2 Metal | ~120 ns/day |
| CPU | Multi-core CPU | ~15 ns/day |
| Reference | Single-thread fallback | ~1.5 ns/day |
The platform is auto-detected at runtime (CUDA > OpenCL > CPU > Reference). For cloud GPU runs, use the Export Colab button in the MD panel to download a pre-wired Colab notebook configured for T4/A100.
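The detection order can be illustrated with a short priority lookup. This is a sketch, not the platform's implementation: `select_platform` is a hypothetical name, and the list of available platforms is assumed to come from whatever the MD backend reports at runtime.

```python
from typing import Iterable

# Priority order stated above: CUDA > OpenCL > CPU > Reference.
PLATFORM_PRIORITY = ("CUDA", "OpenCL", "CPU", "Reference")


def select_platform(available: Iterable[str]) -> str:
    """Return the highest-priority platform present in `available`."""
    available_set = set(available)
    for name in PLATFORM_PRIORITY:
        if name in available_set:
            return name
    raise RuntimeError("no supported compute platform found")
```

On a machine that reports only `["Reference", "CPU", "OpenCL"]`, this sketch would pick OpenCL.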
MD Panel — Discovery Lab
The MD Validation panel appears in the Discovery Lab (right column). For each shortlisted compound:
- Click Run MD to submit a preparation + simulation job.
- Results appear as a stability badge: STABLE / FLEXIBLE / UNSTABLE with RMSD, H-bond occupancy, and score.
- Click FEP Compare to rank shortlisted compounds by ΔΔG. The table sorts with FAVORABLE (green) on top.
- Click Export Colab to download the Jupyter notebook for the selected compound. Open in Google Colab and select Runtime → T4 GPU for production-quality MD.
API Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /api/dynamics/prepare | PDBFixer + solvation + equilibration |
| POST | /api/dynamics/simulate | Production NPT MD + trajectory analysis |
| GET | /api/dynamics/result/{run_id} | Retrieve stored MD result |
| POST | /api/dynamics/fep | FEP ΔΔG for single compound |
| POST | /api/dynamics/fep/rank | Rank compound list by ΔΔG |
| GET | /api/dynamics/compute | Platform detection info |
| GET | /api/dynamics/colab/{campaign_id} | Download Colab notebook (.ipynb) |
Dashboard MD Validation Badge
The Campaign Dashboard shows an MD Validation badge summarizing the count of STABLE / FLEXIBLE / UNSTABLE compounds for the active campaign. Click the badge to expand the full MD results table with per-compound RMSD, stability score, and FEP ΔΔG. Compounds rated STABLE + FAVORABLE are highlighted as Stage Gate ready.
Delivery Engineering
Phase D closes the gap between optimized compounds and the target compartment. Nine of 25 mitochondrial targets sit inside the mitochondrial matrix, behind a double membrane and a −180 mV electrochemical gradient, and every compound must reach its site of action. Phase D addresses the delivery problem computationally before synthesis.
Decision Tree: Target Location → Delivery Strategy
| Target Location | Delivery Requirement | Recommended Route | Strategy |
|---|---|---|---|
| Cytoplasmic | NONE | ORAL | Standard capsule/tablet |
| OMM surface | NONE | ORAL | Standard — no penetration needed |
| IMS / IMM | MINIMAL | IV or ORAL | Moderate lipophilicity sufficient |
| Matrix (TPSA <100) | PRODRUG | IV | Methyl/ethyl ester prodrug |
| Matrix (TPSA ≥100) | TPP+ CONJUGATION | IV | TPP+-C6-ester + sterile injectable |
| CNS / neuronal | BBB PENETRATION | INTRANASAL or LNP | Nasal spray or lipid nanoparticle |
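The decision table can be read as a lookup keyed on target location, with a TPSA split for matrix targets. The sketch below is one possible encoding: the location keys, the `DeliveryPlan` class, and `plan_delivery` are hypothetical names, and TPSA is assumed to be in Å².

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeliveryPlan:
    requirement: str
    route: str
    strategy: str


_SIMPLE_PLANS = {
    "cytoplasmic": DeliveryPlan("NONE", "ORAL", "Standard capsule/tablet"),
    "omm_surface": DeliveryPlan("NONE", "ORAL", "Standard, no penetration needed"),
    "ims_imm": DeliveryPlan("MINIMAL", "IV or ORAL", "Moderate lipophilicity sufficient"),
    "cns": DeliveryPlan("BBB PENETRATION", "INTRANASAL or LNP",
                        "Nasal spray or lipid nanoparticle"),
}


def plan_delivery(location: str, tpsa: Optional[float] = None) -> DeliveryPlan:
    """Map a target location (and TPSA for matrix targets) to a delivery plan."""
    if location == "matrix":
        if tpsa is None:
            raise ValueError("matrix targets need a TPSA value")
        if tpsa < 100:
            return DeliveryPlan("PRODRUG", "IV", "Methyl/ethyl ester prodrug")
        return DeliveryPlan("TPP+ CONJUGATION", "IV",
                            "TPP+-C6-ester + sterile injectable")
    if location not in _SIMPLE_PLANS:
        raise ValueError(f"unknown target location: {location!r}")
    return _SIMPLE_PLANS[location]
```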
TPP+ Conjugate Designer
Triphenylphosphonium (TPP+) cations accumulate 100–1000× in the mitochondrial matrix driven by the electrochemical gradient (ΔΨm = −180 mV). The Nernst equation gives:
$\text{Accumulation} = 10^{-zF\Delta\Psi_m / (2.303\,RT)} \approx 850\times$ at 37 °C, with $z = +1$ and $\Delta\Psi_m = -180$ mV
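The Nernst estimate can be checked numerically. This is a minimal sketch using standard physical constants: the function name is illustrative, and the minus sign in the exponent reflects that a cation (z = +1) accumulates on the negative side of the membrane.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def tpp_accumulation(delta_psi_mv: float = -180.0,
                     temp_c: float = 37.0,
                     z: int = 1) -> float:
    """Equilibrium matrix/cytosol concentration ratio from the Nernst equation."""
    temp_k = temp_c + 273.15
    delta_psi_v = delta_psi_mv / 1000.0
    # exp(-zF*dPsi/RT) is the same quantity as 10 ** (-zF*dPsi / (2.303*RT)).
    return math.exp(-z * F * delta_psi_v / (R * temp_k))
```

At −180 mV and 37 °C this evaluates to roughly 840-fold, in line with the rounded figure quoted above. The ratio is an equilibrium upper bound and ignores binding and efflux.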
| Linker Type | Length | MW Added | logP Change | Cleavage |
|---|---|---|---|---|
| Alkyl C2 | 2 carbons | ~91 Da | +5.5 | Ester (t½ 2h) |
| Alkyl C6 | 6 carbons | ~147 Da | +7.5 | Ester (t½ 2h) |
| Alkyl C10 | 10 carbons | ~203 Da | +9.5 | Ester (t½ 2h) |
| PEG-3 | 3 units | ~175 Da | +3.0 | Ester (t½ 2h) |
| Ether | C6 | ~147 Da | +6.3 | Non-cleavable |
TPP+ adds ~263 Da and +4.5 logP units. Alkyl-C6-ester is the default recommendation: optimal membrane partitioning, matrix esterase cleavage at pH 8.0 (t½ ~2 hours), releasing the unmodified parent compound.
Prodrug Designer — 7 Strategies
| Strategy | Masks | MW Change | logP Change | Activation Site | t½ |
|---|---|---|---|---|---|
| Ester (ethyl) | -COOH | +28 Da | +1.0 | GI/liver CES1/CES2 | 1.5 h |
| Methyl ester | -COOH | +14 Da | +0.7 | GI/liver CES1/CES2 | 1.0 h |
| Phosphate | -OH | +80 Da | −2.0 | Intestinal ALP | 0.5 h |
| Amino acid (Val) | -OH or -NH | +99 Da | −0.5 | PEPT1 → peptidases | 2.0 h |
| Carbonate | -OH | +72 Da | +1.2 | Plasma esterases | 3.0 h |
| Carbamate | -NH₂ | +58 Da | +0.8 | Plasma pH 7.4 | 4.0 h |
| TPP+ ester | -COOH / -OH | +347 Da | +5.0 | Matrix esterases pH 8.0 | 2.0 h |
Formulation Specifications — 5 Routes
| Route | Type | Key Parameters | Est. COGS/dose |
|---|---|---|---|
| ORAL | IR capsule | MCC + lactose, pH 6.8, 25°C/60% RH 24 mo | $0.50 |
| IV | Sterile solution | 0.9% NaCl, pH 7.4, 0.22 μm filtration, Type I vial | $15 |
| SC | Pre-filled syringe | PBS pH 7.0, PS-20 0.02%, 2–8°C 18 mo | $25 |
| INTRANASAL | Metered nasal spray | HPMC 0.5%, BKC 0.01%, pH 6.0, 100 μL/actuation | $8 |
| LNP | Lipid nanoparticle | 80 nm, ζ = −5 mV, PEG 1.5%, EE ~85%, −20°C 12 mo | $150 |
LNP composition: ionizable lipid 50% (ALC-0315 or DLin-MC3-DMA) + DSPC 10% + cholesterol 38.5% + PEG-DMG 1.5%. Near-neutral zeta potential (−5 mV) minimizes non-specific protein adsorption while maintaining BBB transcytosis.
CRO Brief Integration
The CRO Brief generator now includes a Delivery & Formulation section per compound. Instead of:
"Synthesize [COMPOUND] (MW 380, NDUFV1 stabilizer)"
The brief now reads:
"Synthesize [COMPOUND]-TPP-C6 (TPP+-alkyl-C6-ester conjugate, MW 690).
Linker: hexyl chain with ester cleavage site.
Predicted matrix accumulation: 850× at ΔΨm = −180 mV.
Ester hydrolysis t½: 2 hours (matrix esterases, pH 8.0).
Formulation: sterile injectable, 10 mg/mL in normal saline.
Excipients: NaCl 0.9%, Polysorbate 80 0.1%.
Sterilization: 0.22 μm filtration + aseptic fill.
Container: Type I borosilicate glass vial."
Delivery Design API
| Method | Path | Description |
|---|---|---|
| GET | /api/delivery/tpp/{compound_id} | Design TPP+ conjugate (linker_type, linker_length, linkage) |
| GET | /api/delivery/tpp/batch | Batch TPP+ for all matrix-targeted compounds in campaign |
| GET | /api/delivery/prodrug/{compound_id} | Design prodrug (strategy param) |
| GET | /api/delivery/prodrug/recommend/{compound_id} | Auto-recommend prodrug strategy |
| GET | /api/delivery/formulation/{compound_id} | Formulation spec (route, dose_mg params) |
| GET | /api/delivery/formulation/recommend/{compound_id} | Recommend formulation routes |
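As a sketch of how a client might address the TPP+ design endpoint, the helper below only builds the request URL. The base URL, the compound ID, and the parameter values are placeholders: the manual lists the parameter names (linker_type, linker_length, linkage) but not their accepted values.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # hypothetical deployment address


def tpp_design_url(compound_id: str, linker_type: str,
                   linker_length: int, linkage: str) -> str:
    """Build the GET URL for /api/delivery/tpp/{compound_id}."""
    query = urlencode({
        "linker_type": linker_type,
        "linker_length": linker_length,
        "linkage": linkage,
    })
    # Percent-encode the ID so it is safe to place in the path.
    return f"{BASE_URL}/api/delivery/tpp/{quote(compound_id, safe='')}?{query}"
```

The same pattern would apply to the prodrug (strategy) and formulation (route, dose_mg) endpoints.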
Dashboard Delivery Design Badge
The Campaign Dashboard Delivery Design badge shows the distribution across primary candidates:
- Oral: 5 compounds — cytoplasmic/OMM targets, no mitochondrial targeting required
- Intranasal: 1 compound — CNS-penetrant target requiring BBB bypass
- IV + TPP+: 2 compounds — matrix-targeted FAD/FMN binding sites
- IV + Prodrug: 1 compound — methyl ester strategy for TPSA reduction
- IV minimal: 1 compound — low TPSA, partial passive matrix uptake
Regulatory Intelligence
Phase E adds a regulatory document engine that transforms campaign data into FDA-formatted document drafts. A researcher clicks Generate Full IND Package for any compound and receives a complete draft package (Pre-IND meeting request, CMC modules, and nonclinical pharmacology summary) in seconds. The goal is to reduce regulatory consultant review time to weeks rather than months.
FDA IND Structure (21 CFR 312.23)
| Module | Name | Auto-Generated | Source |
|---|---|---|---|
| 2.4 | Nonclinical Overview | YES | nonclinical_summary.py |
| 2.6 | Nonclinical Written Summary | YES | nonclinical_summary.py |
| 3.2.S | Drug Substance (CMC) | YES | cmc_drafter.py |
| 3.2.P | Drug Product (CMC) | YES | cmc_drafter.py |
| 4.2/4.3 | Study Reports | POST-CRO | After experimental data |
| Pre-IND | Type B Meeting Request | YES | preind_generator.py |
Pre-IND Meeting Request (21 CFR 312.82)
The Pre-IND generator produces three deliverables per compound:
- Meeting request letter — FDA Type B format with sponsor details, indication, patient population, and orphan drug eligibility
- Briefing document — 9-section structured document: executive summary, product information, disease background, nonclinical summary, proposed program, CMC summary, clinical plan, regulatory strategy, proposed questions
- Proposed questions — 7–9 FDA-formatted questions covering nonclinical adequacy, CMC, Phase I design, safety pharmacology, adaptive trial use, and orphan designation
CMC Documentation (ICH CTD Module 3.2.S/3.2.P)
| Section | Contents | ICH Reference |
|---|---|---|
| 3.2.S.1 | Nomenclature, structure, general properties | ICH Q6A |
| 3.2.S.2 | Manufacture, process, control of materials | ICH Q7 |
| 3.2.S.3 | Characterization: NMR, HRMS, IR, X-ray | ICH Q2 |
| 3.2.S.4 | Specifications, analytical procedures | ICH Q6A |
| 3.2.S.7 | Stability: long-term (25°C/60%RH) + accelerated (40°C/75%RH) | ICH Q1A |
| 3.2.P.5 | Drug product controls: assay, uniformity, dissolution/sterility | ICH Q6A |
SMILES redaction policy: All regulatory documents display [REDACTED — disclosed under NDA only] in place of compound structures. The patent-pending IP is never exposed in document exports.
Nonclinical Pharmacology (Modules 2.4 + 2.6)
The nonclinical summary engine drafts:
- Primary pharmacology — target binding assay design, functional assay selection (target-specific: Complex I OCR for NDUFV1, mitophagy flux for PINK1, ARE-luciferase for NFE2L2…), in silico docking ΔG and composite score
- Secondary pharmacology — hERG patch clamp, CYP inhibition (3A4/2D6/2C9), Eurofins SafetyScreen44, kinase selectivity
- Safety pharmacology (ICH S7A) — cardiovascular, CNS (modified Irwin), respiratory
- Proposed toxicology program (5 studies) — single-dose MTD, 14-day rat, 14-day dog, Ames, in vivo micronucleus
- Proposed PK program (4 studies) — rat/dog single-dose PK, in vitro metabolic stability, plasma protein binding
Regulatory Designation Eligibility
| Designation | Basis | Status |
|---|---|---|
| Orphan Drug (ODD) | < 200,000 US patients — primary mitochondrial disease | ELIGIBLE |
| Fast Track | Serious condition with unmet medical need | ELIGIBLE |
| Rare Pediatric Disease | Primarily affects individuals aged 0–18 | ELIGIBLE |
| Breakthrough Therapy | Requires preliminary clinical evidence | PENDING |
API Endpoints
| Endpoint | Description |
|---|---|
| GET /api/regulatory/preind/{id} | Pre-IND package JSON (letter + briefing + questions) |
| GET /api/regulatory/preind/{id}/markdown | Downloadable Markdown export |
| GET /api/regulatory/cmc/{id} | CMC package (3.2.S + 3.2.P + specs + stability) |
| GET /api/regulatory/nonclinical/{id} | Nonclinical summary (2.4 + 2.6 + tox + PK) |
| GET /api/regulatory/ind-package/{id} | Complete IND package (all three combined) |
Key Concepts Glossary
| Term | Definition |
|---|---|
| ADMET | Absorption, Distribution, Metabolism, Excretion, Toxicity. The five pharmacokinetic and safety dimensions evaluated for every drug candidate. Tier A = all five favorable; Tier C = one or more critical failures. |
| Campaign | A complete drug discovery run from disease selection through IVVP output. Each campaign has an immutable registry snapshot taken at S00 that governs all downstream evaluation. |
| ChEMBL | Open-access bioactivity database from EMBL-EBI with 2.4 million compounds and 15,000 biological targets. Used in S08 for Tanimoto-based novelty assessment of all candidates. |
| Composite Score | Weighted sum of 7 reward components computed at S09. Ranges 0–1; higher is better. IP-protected weight ratios are calibrated for mitochondrial therapeutics specifically. |
| ΔG (docking) | Predicted binding free energy from AutoDock Vina, in kcal/mol. More negative means stronger predicted binding. Values ≤ −8 kcal/mol are considered strong binders in the mitochondrial therapeutics context. |
| ΔΨm | Mitochondrial membrane potential (typically −180 mV). A critical domain-specific endpoint — compounds are evaluated for their predicted ability to restore or maintain ΔΨm in dysfunctional mitochondria. |
| FTO | Freedom-to-Operate. IP analysis confirming no patent conflicts exist for a compound structure. Platform IP Sentinel provides structural screening; formal FTO requires legal counsel. |
| Fragment | A small molecular building block (MW 100–300 Da, typically) used to construct drug candidates. The privileged fragment library contains 90 fragments targeting mitochondrial proteins. |
| GDA | Gene-Disease Association score (0–1). Quantifies the strength of evidence linking a specific gene to a disease. Sources: DisGeNET, Open Targets, ClinVar. Used in Discovery Lab gene ranking. |
| Governance | The set of platform-level rules, stage gate conditions, kill conditions, and SDR authority that control candidate advancement. Domain applications cannot modify governance rules — only the platform can. |
| hERG | Human Ether-à-go-go Related Gene. Encodes a cardiac potassium channel. hERG inhibition (IC50 < 10 µM) causes QT prolongation and is a hard cardiac safety kill condition. It is a common reason for late-stage drug failure. |
| IVVP | In Vitro Validation Protocol. The S10 output document listing top-priority candidates for wet lab synthesis and testing, with ranked rationale and synthetic accessibility scores. |
| Kill Condition | Hard safety rule (platform safety kill conditions) that instantly disqualifies a candidate regardless of its composite score. Includes PAINS alerts, hERG cardiotoxicity, reactive groups, Lipinski violations, and logP out of range. |
| Lipinski Rule of 5 | Criteria for oral drug-likeness: MW ≤500 Da, LogP ≤5, HBD ≤5, HBA ≤10. Violations predict poor oral bioavailability. DrugSynthAI enforces MW ≤500 and logP 1.5–5.5 as hard constraints. |
| LogP | Octanol-water partition coefficient. Measure of lipophilicity. The platform requires logP 1.5–5.5 for mitochondrial membrane permeability. Values outside this range are a hard kill condition. |
| NPS | Network Perturbation Score. Measures how broadly a compound affects the mitochondrial disease signaling network (not just its primary target). Higher NPS = more pathway coverage = higher reward component weight. |
| PAINS | Pan-Assay Interference Compounds. Structural fragments (catechols, rhodanines, reactive quinones, etc.) that produce artifactual activity in biochemical assays. Any PAINS alert triggers a safety kill condition. |
| PDBQT | Protein Data Bank, Partial Charges + Atom Types. Receptor file format used by AutoDock Vina for molecular docking. Produced from PDB structures by receptor preparation scripts at S02. |
| Pocket | Three-dimensional cavity on a protein surface where a drug molecule binds. Identified at S02 using pocket detection (open-source) or SiteMap (commercial). Binding pocket coordinates define the Vina docking box. |
| RL | Reinforcement Learning. AI optimization that iteratively improves the candidate pool by adjusting the generative model based on composite score feedback. Applied at S10. Convergence typically achieved by iteration 100. |
| SAScore | Synthetic Accessibility Score (1–10). Estimates how difficult a compound is to synthesize. 1 = simple natural product analog; 10 = de novo design requiring multi-step custom synthesis. Target: SA ≤5.0 for prioritization. |
| SDR | StageDecisionRecord. Machine-readable governance document (JSON) issued by the platform's SDR Authority for each candidate at each stage. Records: agent ID, evaluation criteria, score, outcome, timestamp. Immutable once issued. |
| Stage Gate | Checkpoint in the S00–S10 pipeline where candidates must pass defined entry criteria to advance to the next stage. Each gate has entry conditions, exit conditions, and kill conditions evaluated by the StageGateController. |
| Tanimoto | Structural similarity coefficient (0–1) based on molecular fingerprints. Below 0.35 = structurally novel vs reference compound. Used in S08 for ChEMBL novelty assessment. |
| Tier A / B / C | ADMET quality classification. Tier A: all five ADMET dimensions favorable (oral BA high/moderate, no hERG/hepatotox risk, Ames negative, BBB penetration predicted). Tier C: one critical failure. Tier B: borderline on ≥1 dimension. |
| Vina | AutoDock Vina. Open-source molecular docking program using a gradient optimization algorithm. The platform uses Vina for S05 binding affinity prediction. Typical runtime <2 minutes per compound-target pair. |
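To make the Tanimoto entry concrete, the sketch below computes the coefficient on fingerprints represented as sets of "on" bit indices and applies the 0.35 novelty cutoff. The function names are hypothetical. In practice the fingerprints would come from a cheminformatics toolkit, which is not shown here.

```python
from typing import Iterable, Set

NOVELTY_CUTOFF = 0.35  # below this, a candidate counts as structurally novel


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def is_novel(candidate: Set[int], references: Iterable[Set[int]]) -> bool:
    """True if the candidate is below the cutoff against every reference."""
    return all(tanimoto(candidate, ref) < NOVELTY_CUTOFF for ref in references)
```

For instance, fingerprints {1, 2, 3} and {2, 3, 4} share two of four distinct bits, giving a coefficient of 0.5, which is not novel under the 0.35 cutoff.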