Disclaimer
This template is designed for companies developing medical devices that incorporate machine learning (ML) algorithms.
Why is this template necessary?
Because many ML algorithms are effectively “black boxes,” their behavior typically cannot be verified the way traditional software is, for example through code review. Instead, these models must be rigorously tested against independent test datasets. This document serves as a repository for all validation results, such as sensitivity, specificity, accuracy, ROC curve values, and discussions of outliers and edge cases.
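For illustration, a minimal Python sketch of how such metrics might be computed on an independent test set, assuming a binary classifier and scikit-learn; all labels, scores, and the 0.5 threshold below are fabricated placeholders:

```python
# Minimal sketch: common validation metrics on an independent test set.
# Labels, scores, and the decision threshold are illustrative placeholders.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])   # hypothetical ground truth
y_score = np.array([0.1, 0.4, 0.8, 0.7, 0.9, 0.3, 0.6, 0.2])  # model scores
y_pred = (y_score >= 0.5).astype(int)          # example decision threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # true positive rate (recall)
specificity = tn / (tn + fp)   # true negative rate
accuracy = (tp + tn) / (tp + tn + fp + fn)
auc = roc_auc_score(y_true, y_score)

print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} "
      f"accuracy={accuracy:.2f} AUC={auc:.2f}")
```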
Important Note:
There is often a disconnect between the definitions of “validation” and “verification” in data science versus regulatory affairs: in data science, a “validation set” is typically the data used for model selection and hyperparameter tuning, whereas in regulatory affairs “validation” means confirming that the device meets user needs and its intended use. Be mindful of these differences to avoid confusion.
1. Overview
This document outlines the specification of the machine learning (ML) model and its testing methodology.
Algorithm validation results should also inform and be referenced in your clinical evaluation or performance evaluation report.
Regulatory References:
Standard | Relevant Sections |
---|---|
ISO 13485:2016 | 4.1; 4.2.3; 5.2; 6.2; 7.2.1; 7.3.2; 7.3.3; 7.3.6; 7.3.7 |
ISO 14971:2019 | 4.3; 5.2 |
IEC 82304-1:2016 | 6.1 |
IEC 62366-1:2015 | 5.1; 5.2 |
IEC 62304:2006 | 5.2; 5.3; 8.1.2 |
ISO/IEC TR 24028:2020 | 9.8.1; 9.8.2.1; 9.8.2.2; 10.5 |
Other Relevant Documentation:
- SOP Software Development
- Software Development and Maintenance Plan
- SOUP List
2. Development Resources
This section complements the device’s software development and maintenance plan.
2.1 Developer Team
Name | Role | Qualifications |
---|---|---|
(…) | (…) | (…) |
2.2 SOUP / Frameworks
Name | Version |
---|---|
PyTorch | (…) |
2.3 Data
Count | Description |
---|---|
(…) | e.g., annotated heart rate dataset from wearables |
2.4 Development Planning
- Intended Purpose: Outline the model’s intended task and use case.
- Clinical Environment: Reference the clinical evaluation plan and describe the clinical context of use.
- Software Architecture: Describe how the model integrates into the system with reference to a software architecture diagram.
- Quality Requirements: Define and justify quality metrics (e.g., sensitivity, specificity, accuracy) and benchmark them against the state of the art; a minimal sketch of checking computed metrics against such requirements follows this list.
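A minimal sketch, assuming metrics have already been computed elsewhere; the threshold values are placeholders, not recommendations:

```python
# Illustrative check of computed metrics against predefined quality
# requirements. Thresholds are placeholders to be justified per project.
ACCEPTANCE_CRITERIA = {
    "sensitivity": 0.95,  # e.g., derived from state-of-the-art benchmarks
    "specificity": 0.90,
    "accuracy": 0.92,
}

def meets_requirements(metrics: dict[str, float]) -> bool:
    """Return True only if every metric reaches its predefined threshold."""
    return all(metrics[name] >= threshold
               for name, threshold in ACCEPTANCE_CRITERIA.items())

print(meets_requirements({"sensitivity": 0.97, "specificity": 0.93,
                          "accuracy": 0.95}))  # -> True
```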
3. Data Management
3.1 Data Acquisition
Document data sources, inclusion/exclusion criteria, potential biases, and data protection measures. Estimate the dataset size required and reference relevant QMS processes.
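One possible way to estimate the required test-set size is a normal-approximation confidence-interval calculation; the sketch below is illustrative only, and the expected sensitivity, CI half-width, and prevalence are assumptions to be replaced with project-specific values:

```python
# Rough sample-size estimate: how many positive cases are needed so that a
# sensitivity estimate has a 95% CI of a given half-width (normal approx.).
import math

def required_positives(expected_sensitivity: float, ci_half_width: float,
                       z: float = 1.96) -> int:
    p = expected_sensitivity
    return math.ceil(z**2 * p * (1 - p) / ci_half_width**2)

n_pos = required_positives(expected_sensitivity=0.95, ci_half_width=0.03)
prevalence = 0.20  # assumed prevalence in the sampled population
print(n_pos, math.ceil(n_pos / prevalence))  # positives needed, total cases
```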
3.2 Data Annotation
Explain how data was labeled and ground truth determined. Include labeling requirements, qualifications of annotators, and quality assurance measures. Reference QMS processes if applicable.
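As one example of an annotation quality-assurance measure, inter-annotator agreement could be quantified with Cohen's kappa; the sketch below assumes scikit-learn, and the labels are fabricated placeholders:

```python
# Illustrative QA measure: inter-annotator agreement via Cohen's kappa.
from sklearn.metrics import cohen_kappa_score

annotator_a = [1, 0, 1, 1, 0, 1, 0, 0]  # fabricated labels
annotator_b = [1, 0, 1, 0, 0, 1, 0, 1]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # e.g., flag for review if agreement is low
```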
3.3 Data Pre-Processing
Detail steps like anonymization, handling outliers, or splitting datasets for training, validation, and testing. Reference QMS processes where relevant.
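A minimal sketch of a reproducible, patient-level split, assuming records carry a patient identifier; grouping by patient before splitting prevents the same patient's data from leaking across training, validation, and test sets:

```python
# Deterministic patient-level train/validation/test split (illustrative).
import random

def split_by_patient(patient_ids, seed=42, val_frac=0.15, test_frac=0.15):
    ids = sorted(set(patient_ids))            # deterministic order
    random.Random(seed).shuffle(ids)          # fixed seed -> reproducible
    n = len(ids)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    return (set(ids[n_test + n_val:]),        # train
            set(ids[n_test:n_test + n_val]),  # validation
            set(ids[:n_test]))                # test

train_ids, val_ids, test_ids = split_by_patient([f"P{i:03d}" for i in range(100)])
print(len(train_ids), len(val_ids), len(test_ids))  # 70 15 15
```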
4. Model Training and Description
Describe the model type (e.g., CNN), provide a diagram of the architecture, and detail the training process, including feature selection and hyperparameters.
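For illustration only, a small PyTorch classifier and a single training step; the architecture, optimizer, and hyperparameters are placeholders standing in for whatever the actual model uses:

```python
# Illustrative PyTorch CNN and one training step on a fabricated batch.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = SmallCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # placeholder
criterion = nn.CrossEntropyLoss()                          # hyperparameters

x = torch.randn(8, 1, 64, 64)       # fabricated single-channel images
y = torch.randint(0, 2, (8,))       # fabricated labels
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.3f}")
```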
5. Model Evaluation
Outline the testing methodology, including dataset specifics, evaluation metrics, and acceptance criteria.
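Because acceptance decisions benefit from an estimate of statistical uncertainty, a bootstrap confidence interval around a test-set metric can be reported alongside the point estimate; the sketch below uses fabricated data and an assumed 1,000 resamples:

```python
# Illustrative bootstrap 95% confidence interval for test-set accuracy.
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)              # fabricated labels
y_pred = np.where(rng.random(200) < 0.9, y_true,   # ~90% correct predictions
                  1 - y_true)

accuracies = []
for _ in range(1000):
    idx = rng.integers(0, len(y_true), size=len(y_true))  # resample w/ replacement
    accuracies.append(np.mean(y_true[idx] == y_pred[idx]))

low, high = np.percentile(accuracies, [2.5, 97.5])
print(f"accuracy 95% bootstrap CI: [{low:.3f}, {high:.3f}]")
```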
6. Conclusion
Summarize evaluation results, demonstrating alignment with predefined quality standards and the device’s intended use. Address potential risks, biases, and limitations.