Introduction
ChatGPT, the artificial intelligence (AI) chat technology developed by OpenAI, has impressed the world with features that nearly match human performance. In January 2023, Microsoft announced that it would invest billions of dollars in OpenAI to further improve this technology and other areas of AI (Microsoft 2023). AI is projected to add US$15.7 trillion to the global economy by 2030 and will almost certainly impact healthcare (Medhealthreview 2022). As VP of Engineering at IMVARIA, I am excited to be involved with this revolution in AI and its application to improving healthcare outcomes for all.
The media hype and excitement associated with AI in healthcare often ignore the major challenge of implementing AI in the digital healthcare space: regulatory review. That has been the experience of the IMVARIA team as we have worked towards FDA clearance for AI-derived software as a medical device (SaMD). The International Medical Device Regulators Forum (IMDRF) defined SaMD as “software intended to be used for one or more medical purposes that perform these purposes without being part of a hardware medical device”. Development of SaMD requires additional testing of any third-party or open-source software that the SaMD uses, formally termed software of unknown provenance (SOUP), and extensive documentation under an ISO 13485:2016 quality management system. Neither is required for non-regulated software development, and together this extra testing and documentation can easily double the time it takes to bring a SaMD to market. We welcome innovative efforts to improve regulatory processes in the assessment of medical software technologies (Torous, Stern, and Bourgeois 2022) and to harmonize regulation across regulatory bodies (Mazur, n.d.). Regardless, knowing that regulation will always be a component of SaMD development, at IMVARIA we have tightly integrated our regulatory documentation with the software continuous integration/continuous delivery (CI/CD) system. This process, which we term computer-aided regulatory documentation engineering (CARDE), leverages software engineering principles and tools to streamline the regulatory documentation process. Our goal in implementing CARDE is to make the regulatory documentation and submission process as efficient and organized as possible so that we can focus our efforts on developing critical AI-based digital biomarkers. In the following sections, I will describe CARDE, as implemented at IMVARIA, in more detail.
Version Control
Modern software engineering relies heavily on version control, a component of software configuration management, to manage revisions to computer software source code and associated documentation. Multiple revisions to individual files are recorded and maintained with a timestamp and the author making the change. Modern version control systems have become very powerful, allowing the following:
- Tracking changes to files is handled automatically, and many software tools exist for visually viewing the changes from one revision to the next.
- Concurrent revisions to files by multiple individuals are allowed, and tools are provided for merging the changes and handling merge conflicts (simultaneous changes to the same line in a file).
- Distributed version control mirrors the complete history of revisions on each developer’s computer, allowing offline work and faster local operation (an internet connection is required only for synchronization with the central repository).
- Because all revisions are tracked, reverting to a previous revision and identifying and fixing mistakes is relatively easy.
At IMVARIA we use the popular Git distributed version control system, with GitLab hosting the central repository. The use of version control is critical for implementing CARDE.
Templates
Software version control systems are designed to manage changes to source code files written in plain text. Therefore, to leverage the powerful version control features, our regulatory documents are written in plain text, as either Markdown or reStructuredText, and stored in folders alongside our SaMD source code. Somewhat analogous to template metaprogramming, the regulatory documents are stored in our global quality management system as form templates, with expected headings, subheadings, and placeholders for product-specific text. When development of a SaMD product begins, these documents are copied into the product version control repository and “instantiated” with the product-specific design inputs, design outputs, user manual, medical device file, etc. When the documents are viewed online in a web browser, GitLab automatically displays the Markdown and reStructuredText files with formatting applied. GitLab also provides online editing tools for quick edits. In addition, many integrated development environment (IDE) software tools provide editors with split views, which allow editing on one side and real-time formatted viewing on the other, so that the end user has the same what-you-see-is-what-you-get (WYSIWYG) experience found in traditional word processors. By taking the time to create these semi-automated templates, we save time as we develop and document SaMD products with key regulatory frameworks in mind.
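As a rough illustration of what “instantiating” a form template can look like, the following minimal sketch fills placeholders in a Markdown form using Python's standard `string.Template`. The headings, placeholder names, and values are hypothetical and chosen only for illustration; they are not IMVARIA's actual forms or tooling.

```python
from string import Template

# Hypothetical form template: headings and placeholder names are
# illustrative only, not taken from an actual quality management system.
FORM_TEMPLATE = Template("""\
# Design Inputs: $product_name

Document ID: $document_id

## Intended Use

$intended_use
""")


def instantiate(template: Template, values: dict) -> str:
    """Fill every placeholder; raises KeyError if a placeholder has no value."""
    return template.substitute(values)


document = instantiate(
    FORM_TEMPLATE,
    {
        "product_name": "Example SaMD",
        "document_id": "DI-001",
        "intended_use": "Placeholder text describing the intended use.",
    },
)
print(document)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice in this sketch: a forgotten placeholder raises an error instead of silently surviving into the finished document.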
Automation
Storing the regulatory documentation with the SaMD source code allows us to combine SaMD testing and final documentation generation as part of CI/CD. CI/CD ensures that incremental changes to software are continuously and automatically built and tested. Typical CI/CD practices mandate that code changes be tested on a development branch before merging into the “main” or “master” branch. GitLab, like other popular online software version control repositories such as GitHub, provides an online user interface (UI) for reviewing requests to merge a development branch into the main branch. These UIs provide fields for developers to comment on code and regulatory document changes and, critically for SaMD quality management, for key individuals within the company to approve or reject the changes. At IMVARIA the following regulatory documentation tasks are handled automatically as part of the CI/CD pipeline:
- Tabulated SOUP test reports are downloaded and “included” or “imported” into the designated SOUP form template.
- Tabulated SaMD test reports, generated downstream in the CI/CD pipeline, are automatically imported into the appropriate regulatory form template.
- Each form’s revision history is generated automatically from the git commit history. Each commit lists the changes made, the change author, and the date the change was made.
- HTML versions of the regulatory documentation are automatically generated and published online for immediate private, secure internal review.
- PDF versions of the regulatory documentation, suitable for submission to a regulatory body, are automatically generated and uploaded to private secure cloud storage.
Automation reduces mistakes due to human error and decreases the time required for editorial review.
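One of the tasks above, deriving a form's revision history from the git commit history, can be sketched as follows. This is a minimal sketch under stated assumptions, not the pipeline used at IMVARIA: the function names, the table layout, and the sample commits are hypothetical. The `git log` options used (`--date=short` and `--pretty=format:` with `%h`, `%ad`, `%an`, `%s`, and `%x1f`) are standard git features.

```python
import subprocess

SEP = "\x1f"  # ASCII unit separator; unlikely to appear in commit metadata
GIT_FORMAT = "%h%x1f%ad%x1f%an%x1f%s"  # hash, date, author, subject


def read_git_log(path: str) -> str:
    """Return one line per commit touching `path` (needs git and a repository)."""
    result = subprocess.run(
        ["git", "log", "--date=short", f"--pretty=format:{GIT_FORMAT}", "--", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def revision_table(log_text: str) -> str:
    """Convert separator-delimited log lines into a Markdown revision table."""
    rows = ["| Revision | Date | Author | Change |", "| --- | --- | --- | --- |"]
    for line in log_text.split("\n"):
        if not line.strip():
            continue
        commit, date, author, subject = line.split(SEP, 3)
        subject = subject.replace("|", "\\|")  # keep the table cells intact
        rows.append(f"| {commit} | {date} | {author} | {subject} |")
    return "\n".join(rows)


# Hypothetical sample standing in for read_git_log("docs/design_inputs.md").
sample_log = "\n".join(
    [
        SEP.join(["a1b2c3d", "2023-02-01", "A. Author", "Update hazard analysis"]),
        SEP.join(["e4f5a6b", "2023-01-15", "B. Author", "Fix typo | update table"]),
    ]
)
table = revision_table(sample_log)
print(table)
```

In a CI/CD job, the output of `read_git_log` for a given document would replace the sample text, and the resulting table would be written into that document's revision history section before the HTML and PDF versions are built.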
Reuse
Reusability in software engineering refers to designing software so that it may be easily reused in multiple software applications. Reuse relies on the principle that larger, more complex systems may be built from multiple smaller, less complex components. In many instances, these smaller components are common to many systems and can be reused. Reuse reduces maintenance costs because changes and fixes are applied only in the smaller component, so systems that rely on that component automatically inherit the changes. The alternative is to apply changes and fixes at every location, and in such scenarios a location is often inadvertently missed. At IMVARIA we’ve discovered that regulators expect certain portions of text to be present in multiple locations within the regulatory documents. The temptation is to simply copy and paste those portions of text to satisfy these requests. For small portions of text, copy and paste may be fine. It may also make sense to make such portions a part of the form template. However, for larger portions of text not suitable for inclusion in a form template, copy and paste creates a maintenance nightmare. In these instances, we maintain these portions in a separate file and then include them using special Markdown and reStructuredText formatting directives. This ensures that editing is required only in the single file and that the text may be reused wherever needed. We also anticipate that reuse will be beneficial for submissions to international regulatory bodies, where the regulatory form templates will differ from those required by the FDA. We expect large portions of text to be common and will leverage reuse as much as possible.
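The idea of maintaining shared text in one file and including it elsewhere can be illustrated with a small sketch. reStructuredText provides an `.. include::` directive for this purpose, and documentation toolchains normally resolve it themselves. The simplified resolver below is only a stand-in to show the principle: it handles plain `.. include:: name` lines with no options, reads from an in-memory dictionary instead of the file system, and uses hypothetical file names and text.

```python
import re

# Matches a bare reStructuredText include line such as ".. include:: file.rst".
INCLUDE = re.compile(r"^\.\. include:: (?P<name>\S+)\s*$")


def expand(name: str, sources: dict, _stack: tuple = ()) -> str:
    """Return the text of `name` with include lines replaced by their content."""
    if name in _stack:
        raise ValueError(f"circular include: {' -> '.join(_stack + (name,))}")
    expanded = []
    for line in sources[name].splitlines():
        match = INCLUDE.match(line)
        if match:
            # Recurse so that shared fragments may include other fragments.
            expanded.append(expand(match["name"], sources, _stack + (name,)))
        else:
            expanded.append(line)
    return "\n".join(expanded)


# Hypothetical documents: the shared paragraph lives in exactly one place.
sources = {
    "intended_use.rst": "Placeholder paragraph describing the intended use.",
    "design_inputs.rst": "Design Inputs\n=============\n\n.. include:: intended_use.rst",
    "user_manual.rst": "User Manual\n===========\n\n.. include:: intended_use.rst",
}

design_inputs = expand("design_inputs.rst", sources)
user_manual = expand("user_manual.rst", sources)
print(design_inputs)
print(user_manual)
```

Editing `intended_use.rst` changes both generated documents at once, which is the maintenance benefit described above.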
Conclusion
Adoption of AI and machine learning (ML) in healthcare will increase rapidly through the end of this decade. Perhaps during that time, chat technology like ChatGPT will improve to the point that generating the required regulatory documentation becomes an afterthought. In the meantime, at IMVARIA we’re adopting practices like CARDE to ensure that regulatory requirements do not become an overly burdensome impediment to the rapid application of ML in medicine.
By Don C. Bigler, Ph.D., VP of Engineering, IMVARIA