AI-Authored Source Code — DEV TOOL

AI at runtime: Zero. AI (Copilot, Claude, ChatGPT) generates device source code during development. A human reviews, modifies, and commits. The shipped code is static — indistinguishable from human-written code. AI is an authoring assistant.

US (FDA): QMSR (ISO 13485 §4.1.6 / §7.5.6) · IEC 62304 · QMS obligation
No AI-specific submission requirement. FDA does not regulate development tools — but manufacturers must validate automated tools per QMSR requirements (formerly 21 CFR 820.70(i), now via ISO 13485 §4.1.6 for QMS software and §7.5.6 for production processes, incorporated by reference). The 510(k) submission sees only the output code, not how it was written. FDA's position: validate the tool for its intended use, even if it's ChatGPT.

EU: ISO 13485 §7.5.2 · IEC 62304 §8 · QMS obligation
No specific MDR/IVDR submission requirement to disclose AI-authored code. However, IEC 62304 Clause 8 (software configuration management) and ISO 13485 tool-validation procedures apply. Notified Body auditors may ask about development tools during QMS audits.

Documentation & V&V: AI tool validation under QMSR process validation requirements — demonstrate that the tool produces outputs of acceptable quality with a known risk of error. Code review processes must be robust enough to catch AI-generated defects (hallucinated logic, insecure patterns, incorrect assumptions). Standard IEC 62304 V&V applies to the code itself regardless of who (or what) wrote it. Document the AI tool in your SDP as part of the development environment.

Risk & cybersecurity: AI may introduce subtle logic errors, insecure code patterns, or non-obvious dependencies that pass cursory review. Static analysis and MISRA/CERT compliance scanning become essential quality gates. Cybersecurity risk is indirect — the AI didn't create a vulnerability intentionally, but it may reproduce common vulnerability patterns from training data. No ISO 81001-5-1 AI-specific requirements are triggered.
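The quality gates described above are typically enforced automatically in CI, not left to manual review. A minimal sketch, assuming static-analysis findings are exported as a JSON list of finding objects — the report schema and rule names here are illustrative, not any specific tool's output format:

```python
# Minimal CI quality gate for AI-authored code: fail the build when any
# static-analysis finding at or above a severity floor remains unresolved.
# The report format (list of {"rule", "severity", "file"} dicts) is a
# hypothetical example, not a real tool's schema.
import json

SEVERITY_RANK = {"info": 0, "warning": 1, "error": 2}

def gate(report_json: str, floor: str = "warning") -> list[dict]:
    """Return the findings that block the merge (empty list == gate passes)."""
    findings = json.loads(report_json)
    return [f for f in findings
            if SEVERITY_RANK[f["severity"]] >= SEVERITY_RANK[floor]]

if __name__ == "__main__":
    report = json.dumps([
        {"rule": "MISRA-C:2012 R17.7", "severity": "error", "file": "pump.c"},
        {"rule": "style/naming", "severity": "info", "file": "ui.c"},
    ])
    print(len(gate(report)))  # only the error-level finding blocks the merge
```

In a regulated context the gate itself is part of the validated toolchain, so its configuration (the severity floor, suppressed rules) belongs under configuration management too.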
AI-Generated / Executed Tests — DEV TOOL

AI at runtime: Zero. AI writes unit tests, integration tests, or V&V test cases, and may also generate test data or automate test execution. AI is a verification tool — its outputs are evidence used to justify safety, not part of the device itself.

US (FDA): QMSR (ISO 13485 §4.1.6 / §7.5.6) · IEC 62304 · QMS obligation · gray area
Tool validation per QMSR requirements (ISO 13485 §4.1.6 / §7.5.6) is required — but here the stakes are higher. AI-generated tests are the evidence that goes into the submission. If the AI writes weak tests that miss defects, the V&V evidence is compromised. FDA sees the test results, not who authored them — but inadequate coverage is a submission risk.

EU: ISO 13485 §7.5.2 · IEC 62304 §5.7 · QMS obligation · gray area
IEC 62304 Clause 5.7 (software system testing) defines the testing process and documentation requirements, not who or what writes the tests. AI-generated test cases must meet the same coverage and traceability requirements. Notified Body auditors increasingly ask about automation tools. Test-adequacy reviews by qualified humans become critical.

Documentation & V&V: AI tool validation under QMSR with heightened scrutiny — the tool's output directly becomes regulatory evidence. Validate that AI-generated tests achieve required code coverage, boundary-condition coverage, and requirements traceability. Require independent review of test adequacy by a qualified human (not just "AI wrote the tests and they passed"). Document the test-generation methodology in the Software V&V Plan. Consider: can you demonstrate test independence if the same AI wrote both the code and the tests?

Risk & cybersecurity: The risk is primarily false confidence: AI-generated tests may achieve high coverage metrics while missing critical edge cases or negative test scenarios. Risk analysis per ISO 14971 should consider the hazard of inadequate V&V evidence. If AI generates test data (synthetic patients, simulated inputs), validating that synthetic data against real-world distributions is a risk control. No direct cybersecurity implications unless the AI accesses production or patient data during test generation.
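The traceability requirement above is mechanically checkable: every requirement ID must be claimed by at least one test case, regardless of who or what authored the tests. A minimal sketch — the SRS/TC identifier scheme is illustrative, not from any real project:

```python
# Sketch of an automated requirements-to-tests traceability check. A gap in
# the returned set means a requirement has no covering test case, which is a
# finding whether the tests were human-written or AI-generated.

def untraced_requirements(requirements: set[str],
                          test_trace: dict[str, set[str]]) -> set[str]:
    """Return requirement IDs that no test case claims to cover."""
    covered = set().union(*test_trace.values()) if test_trace else set()
    return requirements - covered

reqs = {"SRS-001", "SRS-002", "SRS-003"}
tests = {
    "TC-01": {"SRS-001"},
    "TC-02": {"SRS-001", "SRS-002"},  # AI-generated tests trace the same way
}
print(sorted(untraced_requirements(reqs, tests)))  # ['SRS-003'] has no test
```

A check like this catches missing coverage, not weak coverage — the human test-adequacy review the row calls for is still the control against tests that trace correctly but assert nothing meaningful.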
🔧 QMS / Tool Validation ◀ — Disclosure Boundary — ▶ Regulatory Submission 📋
AI-Derived / Rule-Executed — LOCKED

AI at runtime: None. ML selects thresholds during development; production code is deterministic if/then rules. AI is a design tool only. Gray area: not clearly "AI in the device" — but AI shaped the device's clinical performance.

US (FDA): 510(k) / De Novo · IEC 62304 · gray area
Standard SaMD pathway. No AI-specific FDA guidance is triggered — the cleared device is rule-based. ML methodology is documented as design rationale within the IEC 62304 lifecycle. The submission would not appear on FDA's AI-enabled device list. There is no requirement to state "AI is used in the device," since it isn't — but a reviewer asking "how did you pick these cutoffs?" exposes the ML methodology.

EU: Annex II/III · CS for SaMD · gray area
Standard conformity assessment. The Notified Body reviews clinical evidence for threshold selection. IVDR performance evaluation covers the analytical/clinical validity of the cutoff values. The AI Act likely does NOT apply, since the marketed device does not use AI — but this is untested territory.

Documentation & V&V: Document the ML training methodology as design-input rationale (in the DHF, not necessarily the submission). V&V focuses on the locked thresholds — standard test-set performance (sensitivity, specificity, PPV/NPV). Cross-validate cutoff stability across subpopulations. Run a bias analysis on training-data demographics. The ML pipeline may require tool validation under QMSR if treated as an automated design process.

Risk & cybersecurity: Standard risk management per ISO 14971. Cybersecurity scope is narrow — protect the data integrity of input values only. No model-specific attack surfaces. ISO 81001-5-1 applies to the software, not to an AI model. Risk: if thresholds were overfit to training data, real-world performance degrades — a design risk, not a cybersecurity risk.
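The pattern this row describes — statistics pick a cutoff once, the device ships only the frozen rule — can be sketched in a few lines. Youden's J is one common way to choose a cutoff (an assumption here, not the only method), and the data are toy values:

```python
# Development-time sketch: analysis picks a cutoff on labelled data; the
# shipped device contains only the resulting deterministic if/then rule.
# Youden's J as selection criterion and the toy data are illustrative.

def youden_cutoff(values, labels):
    """Pick the threshold maximizing sensitivity + specificity - 1."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    best_t, best_j = None, -1.0
    for t in sorted(set(values)):
        sens = sum(v >= t for v in pos) / len(pos)
        spec = sum(v < t for v in neg) / len(neg)
        if sens + spec - 1 > best_j:
            best_t, best_j = t, sens + spec - 1
    return best_t

# --- development: derive the cutoff once, document the rationale ---
dev_values = [0.2, 0.3, 0.35, 0.6, 0.7, 0.8, 0.9]
dev_labels = [0,   0,   0,    1,   1,   1,   1]
CUTOFF = youden_cutoff(dev_values, dev_labels)

# --- production: deterministic rule, no ML at runtime ---
def classify(measurement: float) -> str:
    return "flag" if measurement >= CUTOFF else "normal"

print(CUTOFF, classify(0.65))  # 0.6 flag
```

The V&V target is `CUTOFF` and `classify`, exactly as the row says; `youden_cutoff` lives in the design history file, not in the device.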
Classical ML Models — LOCKED

AI at runtime: Moderate. A trained model (SVM, random forest, logistic regression) performs inference. Decision boundaries are more complex than thresholds, but the model is interpretable and frozen.

US (FDA): 510(k) / De Novo · AI/ML SaMD Guidance · IEC 62304
Falls under FDA's AI/ML SaMD framework as a locked algorithm. Software documentation per IEC 62304. Performance testing with predefined acceptance criteria. No PCCP needed.

EU: Annex II/III · MDCG 2019-11 · ISO 81001-5-1
GSPR compliance with clinical evidence for the intended purpose. The algorithm description in the technical documentation must explain feature selection and decision logic.

Documentation & V&V: Full training-data documentation — provenance, labeling process, class balance. Feature-engineering rationale. Model-selection justification. Hold-out and/or k-fold cross-validation. Subgroup performance analysis. Comparison to a predicate/reference method. Software verification per IEC 62304 at unit, integration, and system levels.

Risk & cybersecurity: ISO 14971 risk analysis includes ML-specific hazards — feature drift, out-of-distribution inputs — and corresponding controls such as input-distribution checks. Cybersecurity per ISO 81001-5-1 covers data-pipeline integrity. Adversarial input risk is low but should be assessed. The SBOM includes ML frameworks/libraries.
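The out-of-distribution control mentioned above can be as simple as rejecting inputs outside the per-feature range observed at model freeze. A minimal sketch — the feature names and bounds are illustrative, and real devices often use richer checks (density or distance based) than plain range gating:

```python
# Sketch of a simple out-of-distribution guard for a locked classical model:
# refuse to run inference on inputs outside the per-feature range captured
# from the training set. Feature names and bounds are hypothetical.

TRAINING_RANGES = {            # recorded once, when the model is frozen
    "age_years":  (18.0, 90.0),
    "creatinine": (0.4, 6.0),
}

def in_distribution(sample: dict[str, float]) -> bool:
    """True when every feature lies inside its training-time range."""
    return all(lo <= sample[k] <= hi
               for k, (lo, hi) in TRAINING_RANGES.items())

print(in_distribution({"age_years": 45.0, "creatinine": 1.1}))   # True
print(in_distribution({"age_years": 45.0, "creatinine": 12.5}))  # False
```

The guard's behavior on rejection (alarm, fallback, operator message) is itself a risk control that belongs in the ISO 14971 file.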
Deep Learning Models — LOCKED

AI at runtime: High. A full neural network (CNN, RNN, Transformer) with millions of frozen weights performs all inference. The AI is the product. Low inherent interpretability.

US (FDA): 510(k) / De Novo · AI/ML SaMD Guidance · Transparency Guidance · IEC 62304
FDA expects an algorithm description with transparency measures (GMLP). Locked — no PCCP. But reviewers scrutinize robustness, generalizability, and explainability measures more heavily.

EU: Annex II/III · MDCG 2019-11 · AI Act (high-risk, from 2027) · ISO 81001-5-1
Likely high-risk under the EU AI Act if used for clinical decision-making. Requires conformity assessment, a risk management system, data governance, and human oversight measures. Note: AI Act high-risk obligations for AI embedded in regulated products apply from August 2, 2027.

Documentation & V&V: Everything from Classical ML plus: architecture description and layer-by-layer rationale; hyperparameter-tuning documentation; training-convergence evidence; augmentation strategies; robustness testing (noise, artifacts, edge cases); explainability outputs (Grad-CAM, SHAP, attention maps); multi-site clinical validation; a real-world performance monitoring plan.

Risk & cybersecurity: Expanded risk analysis: adversarial inputs, data poisoning (training phase), model evasion, input manipulation. Cybersecurity must cover the inference pipeline end-to-end. ISO 81001-5-1 Section 5.5 (secure deployment). The threat model includes model extraction and intellectual-property risks.
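One concrete form of the robustness testing listed above is perturbation stability: add small noise to inputs and measure how often the predicted class flips. A toy sketch — the "model" is a stand-in threshold function, not a real network, and real protocols perturb with clinically motivated artifacts rather than uniform noise:

```python
# Sketch of a perturbation-robustness check for a locked model: small input
# noise should rarely change the output class. The model below is a stand-in
# for the frozen network; noise magnitude and trial count are illustrative.
import random

def model(x: float) -> int:            # stand-in for the frozen network
    return 1 if x >= 0.5 else 0

def flip_rate(inputs, noise=0.01, trials=200, seed=0) -> float:
    """Fraction of perturbed predictions that differ from the clean one."""
    rng = random.Random(seed)
    flips = total = 0
    for x in inputs:
        base = model(x)
        for _ in range(trials):
            flips += model(x + rng.uniform(-noise, noise)) != base
            total += 1
    return flips / total

print(flip_rate([0.1, 0.9]))   # far from the decision boundary: 0.0
print(flip_rate([0.499]) > 0)  # near the boundary: some flips expected
```

Acceptance criteria for the flip rate, and the choice of perturbations, belong in the V&V plan alongside the rest of the robustness evidence.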
Locked Ensembles / Pipelines — LOCKED

AI at runtime: High and complex. Multiple chained models (segmentation → detection → classification). Each component is frozen. Failure modes compound across stages.

US (FDA): 510(k) / De Novo · AI/ML SaMD Guidance · Transparency Guidance · IEC 62304
Each model component may need individual and system-level validation. FDA may request decomposed performance data. Architecture-level documentation of inter-model dependencies.

EU: Annex II/III · AI Act (high-risk, from 2027) · ISO 81001-5-1
The Notified Body reviews the pipeline architecture. Each model stage contributes to the overall risk classification. EU AI Act transparency obligations apply to the composite system (from August 2, 2027 for AI in regulated products).

Documentation & V&V: All deep-learning documentation for each component, plus: pipeline architecture diagram and data flow; component-level V&V (unit testing each model); integration testing across model boundaries; error-propagation analysis — how does upstream model failure affect downstream outputs?; system-level clinical validation of end-to-end performance.

Risk & cybersecurity: Compound threat modeling: each model interface is an attack surface. Data integrity between pipeline stages. Failure-mode analysis must consider cascading errors. Cybersecurity architecture documents must map trust boundaries between components. SBOM complexity increases significantly.
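One common architectural control against the cascading errors described above is confidence gating between stages: if an upstream model's confidence falls below a set floor, the chain aborts with a traceable reason instead of feeding a suspect intermediate result downstream. A minimal sketch with stand-in callables in place of real models:

```python
# Sketch of inter-stage confidence gating in a locked model pipeline.
# Each stage returns (output, confidence); a stage below its gate aborts
# the run with a record of where and why. Stages here are stand-ins.

def run_pipeline(data, stages):
    """stages: list of (name, fn, min_confidence); fn -> (output, confidence)."""
    for name, fn, min_conf in stages:
        data, conf = fn(data)
        if conf < min_conf:
            return {"status": "abort", "stage": name, "confidence": conf}
    return {"status": "ok", "result": data}

stages = [
    ("segmentation",   lambda x: (x, 0.97), 0.90),
    ("detection",      lambda x: (x, 0.55), 0.80),  # weak stage trips its gate
    ("classification", lambda x: (x, 0.99), 0.90),
]
print(run_pipeline("scan-001", stages))  # aborts at the detection stage
```

The abort record is exactly the kind of evidence the error-propagation analysis needs: it makes "upstream failure never silently reaches downstream stages" a testable claim.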
🔒 Locked ◀ — PCCP Boundary — ▶ Adaptive 🔄
Adaptive / Continuously Learning — ADAPTIVE

AI at runtime: High and evolving. The model updates post-deployment from real-world data: it may retrain on site-specific data, recalibrate thresholds, or fine-tune weights. The algorithm changes over its lifecycle.

US (FDA): 510(k) / De Novo / PMA · PCCP · AI/ML SaMD Guidance · GMLP · IEC 62304
A Predetermined Change Control Plan (PCCP) defines what can change, how changes are validated, and performance boundaries. FDA reviews the modification protocol, not just the initial model. PCCPs are available across the 510(k), De Novo, and PMA pathways.

EU: Annex II/III · AI Act (high-risk, from 2027) · Significant Change Guidance · ISO 81001-5-1
Each model update is a potential significant change requiring Notified Body notification or re-assessment. The EU AI Act requires ongoing monitoring, logging, and human oversight for high-risk AI (obligations for AI in regulated products apply from August 2027).

Documentation & V&V: Everything from locked deep learning plus: PCCP documentation — SaMD Pre-Specifications (what will change) and an Algorithm Change Protocol (how changes are controlled); ongoing performance monitoring with statistical triggers for retraining; a re-validation framework; version control and a model registry; drift-detection methodology; real-world performance reporting.

Risk & cybersecurity: All prior cybersecurity concerns plus: the retraining data pipeline as an attack surface (post-market data poisoning); integrity of the update mechanism — malicious model injection; monitoring for adversarial drift; ISO 81001-5-1 post-market cybersecurity management; incident response for model-degradation events.
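The "statistical triggers for retraining" above are often built on a drift metric over input histograms. One widely used choice is the Population Stability Index (PSI); the 0.2 alert threshold is a common convention, and the bin fractions below are toy values:

```python
# Sketch of a PSI-based drift trigger: compare the production input histogram
# against the histogram captured at model freeze, and flag when the index
# crosses a predefined review threshold. Bin fractions are illustrative.
import math

def psi(expected_fracs, actual_fracs, eps=1e-6):
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected_fracs, actual_fracs))

baseline = [0.25, 0.25, 0.25, 0.25]   # input distribution at model freeze
current  = [0.10, 0.20, 0.30, 0.40]   # input distribution in production

score = psi(baseline, current)
print("review" if score > 0.2 else "stable")  # this shift trips the trigger
```

Under a PCCP, the metric, the threshold, and the action it triggers (investigation, retraining, rollback) are all pre-specified — the sketch is the mechanism, the plan is the regulatory substance.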
LLM / Foundation Model — GENAI

AI at runtime: Maximum. Billions of parameters. Generative, probabilistic, prompt-sensitive. May hallucinate. Non-deterministic outputs. Emergent capabilities not fully characterized. The regulatory framework is nascent.

US (FDA): TBD / De Novo likely · AI/ML SaMD Guidance · GMLP · LDT considerations
No established FDA pathway specific to LLMs in SaMD; De Novo is likely. Expect extreme scrutiny of intended-use scoping, output validation, hallucination mitigation, and human-in-the-loop requirements. FDA is actively developing guidance.

EU: AI Act (high-risk, from 2027) · AI Act (GPAI, from August 2025) · Annex II/III · ISO 81001-5-1
EU AI Act GPAI provisions apply to foundation models (in force since August 2025). If used in medical devices, dual obligations apply: GPAI transparency plus high-risk conformity (high-risk obligations for regulated products from August 2027). Notified Bodies have limited precedent for review.

Documentation & V&V: Unprecedented documentation burden. Foundation-model provenance and training-data disclosure (to the extent known). Prompt-engineering documentation as design controls. An output-validation framework — how are hallucinations detected and mitigated? Guardrails architecture. Red-teaming and adversarial testing at scale. Clinical validation must account for non-determinism. Human oversight and override protocols. Ongoing output monitoring and feedback loops.

Risk & cybersecurity: Maximal attack surface. Prompt injection as a first-class threat. Data exfiltration via crafted prompts. Model jailbreaking. Supply-chain risk from the foundation-model provider. Cybersecurity architecture must isolate the LLM from clinical data stores. ISO 81001-5-1 plus the emerging NIST AI RMF. Continuous red-teaming post-market.
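The guardrails architecture above usually includes an output-side filter: the model's raw response is checked against scope rules before anything reaches the user, and policy-violating content is replaced by a refusal that escalates to a human. A deliberately simplified sketch — the model call is a stub and the two patterns (an injection echo, out-of-scope dosing advice) are illustrative stand-ins for a real policy engine:

```python
# Sketch of an output guardrail wrapping a generative model. Pattern-based
# filtering like this is only one layer; real deployments combine it with
# input sanitization, retrieval grounding, and human oversight.
import re

BLOCK_PATTERNS = [
    re.compile(r"ignore (all|previous) instructions", re.I),  # injection echo
    re.compile(r"\bdosage\b.*\bmg\b", re.I),   # out-of-scope dosing advice
]

def guarded(model_fn, prompt: str) -> str:
    """Run the model, then refuse any response that violates output policy."""
    raw = model_fn(prompt)
    if any(p.search(raw) for p in BLOCK_PATTERNS):
        return "[blocked: response failed output policy; escalate to clinician]"
    return raw

fake_llm = lambda p: "Recommended dosage is 500 mg twice daily."  # stub model
print(guarded(fake_llm, "summarize the report"))  # refusal, not the raw text
```

Because the guard, not the model, determines what can reach the user, the guard's rules and their validation evidence become design controls in their own right — consistent with the prompt-engineering-as-design-controls point above.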