2006 IGT Workshop Validation Breakout
A Discussion Outline with embedded questions: Validation of Image Guided Therapy Systems (DRAFT: please correct and expand)
Questions to be answered in the report-out session:
- Identify 3 main challenges in this area
- Identify 3 specific problems that can be solved by a collaborative effort between academic and industry partners
- Identify a problem that the NCIGT can help address in the next year
- David Holmes (holmes.david3 at mayo dot edu)
- Guy Shechter (guy.shechter at philips dot com)
- Steve Pizer (pizer at cs dot unc dot edu)
- Bob Galloway (Bob.Galloway at vanderbilt dot edu)
- Kirby Vosburgh (kirby at bwh dot harvard dot edu)
- Basic requirements for new procedures and systems
- Reliability and consistency
- Appropriate ep procedure time
- Capital cost
- Operating cost
- Operational efficiency
- No new complications
- Such as “pump brain” for bypass surgery
- What is being Validated?
- Procedure outcomes
- Surrogates for outcomes
- Classes of Validation
- As used for robots that work with humans
- Correct performance of clinical task
- Meeting hardware/software specifications on the bench
- In test settings
- In clinical practice
- Characterization of Error modes
- Response/recovery plans
- Task Analysis
- Task kinematics
- Lin approach
- CIMIT approach
- Display Physiology
- CIMIT/NASA model
- Forum for discussing, gaining consensus
- Focused workshops
- Working groups
- Publications (where?)
- Similar communities to serve as models
The conundrum of specifications
- Prototypes and products are built to meet design goals, which are represented by specifications. In developing new techniques, there is an implicit assumption (which should be verified under use-testing, as described below) that meeting the specifications will create a tool or system that enables superior clinical results.
- Specifications are developed through a “requirements elicitation” process. However, therapeutic tasks are complex, and the new system can only be characterized in limited ways. There is a tendency to equate greater precision with improved clinical outcomes, which is not always true. Thus specifications may be too tight for a particular clinical need; on the other hand, operator acceptance alone is too low a standard.
- After bench tests meet specification, new systems are typically evaluated in more realistic settings to determine
- Operating Range
- Fault Modes
- Peri-system Compatibility
Initial user evaluation
- Comparative studies may be undertaken, successively, through retrospective analysis, simulators, phantoms, animal models, and human subjects. Present generations of simulators are insufficiently realistic to provide much assurance that, say, a new device design is better than an old one for a complex task. Animal models provide much more realistic test conditions, but suffer from the obvious differences in anatomy and physiology when serving as surrogates for humans. Thus some level of human testing will be necessary.
- Various groups are using techniques developed in other fields to characterize system performance. Several studies of simulators for laparoscopic surgery training have been conducted. More recently, tests have been made under actual OR conditions in animal or human models: for example, the Hager group at JHU has analyzed the kinematic data in the da Vinci system, and the Vosburgh group at CIMIT/BWH has studied the performance kinematics and also the display utility in laparoscopic and endoscopic systems.
- At this level also, various possible system error modes can be delineated and avoidance, mitigation, or response plans developed.
- The standard method for validation of a new therapy is its performance relative to standard practice. Almost always, a prospective clinical trial is necessary to validate a new approach. As examples of the level of effort traditionally required, consider
- The studies by Berger and colleagues for validating new methods for the treatment of hybrid astrocytoma. These took five years, and were well supported with clinical infrastructure.
- The Scottish study of 107 liver resections (pub: Gut). In this work, the fraction of liver left after various procedures was measured. The study was helped by the fact that liver resections have highly indicative near-term outcomes.
- In comparison with testing new surgical therapies, drug or vaccine trials have defined end points: markers or direct measurements such as tumor size. Controls may be easily implemented through placebos, which are much simpler than sham surgery. Drug trials, of course, are primarily interested in finding side effects…but for devices the standard has been lower. Surgical side effects (complications) are limited in number and somewhat predictable.
- Clinical outcomes are hard to measure, and proper control groups are difficult to establish. It is often challenging to develop adequate patient numbers to give statistical power, particularly for identifying rare unsafe conditions. Additionally, multi-site studies are needed for eventual FDA approval.
- This complexity may drive the adoption of a partitioned approach, in which anecdotal analysis is combined with statistically valid tests on lower-dimensional factors. A model is then required to combine these dissimilar observations. Thus, as was stated: “one needs standard deviations but also the estimate of the number of dimensions.”
- Also, investigators will be well served to find creative ways to study multiple approaches simultaneously, so that some level of serial analysis may be precluded.
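As a rough illustration of why patient numbers are a problem when looking for rare unsafe conditions, the sketch below asks how many patients must be enrolled to have a given chance of seeing at least one occurrence of a complication. It assumes independent patients and a fixed per-patient complication rate, which is a simplification; the function names `patients_needed` and `upper_bound_zero_events` are made up for this example and do not come from the workshop discussion.

```python
import math


def patients_needed(event_rate: float, confidence: float = 0.95) -> int:
    """Smallest n with P(at least one event among n patients) >= confidence.

    Assumes each patient independently has the complication with
    probability `event_rate` (a simplifying assumption).
    """
    if not 0.0 < event_rate < 1.0:
        raise ValueError("event_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(no events in n patients) = (1 - event_rate) ** n <= 1 - confidence
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - event_rate))


def upper_bound_zero_events(n: int, confidence: float = 0.95) -> float:
    """One-sided upper confidence bound on the event rate when
    zero events were observed in n patients."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


# A complication occurring in 1% of procedures:
print(patients_needed(0.01))          # 299
# Seeing no complications in 100 patients only bounds the rate loosely:
print(upper_bound_zero_events(100))
```

Under these assumptions, a complication with a 1% rate requires roughly 300 patients for a 95% chance of observing it even once, which is consistent with the need for multi-site studies noted above.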
The way forward
- Between the cultural extremes of the bench engineer or scientist and the practicing clinician, we should build teams that can move us toward a unified philosophical approach and a mutually agreed representative paradigm for effective validation. This will not be static, but rather improved over time.
- We suggest that RFAs be more explicit in the requirement for validation plans. We also suggest that the NIH and the broader research and industrial community work together toward standard models for these plans, and criteria for their evaluation.
- The open sharing of data and validation techniques will assist this effort, and permit useful discussion on the optimum approaches.
Forums for discussing, gaining consensus
- The National Center for IGT should take the lead to address the validation protocol question. Further sessions like today’s and standing working groups would be helpful.
- There is a possible role for NIST in establishing standards and phantoms and procedures for validating subsystems
- Publications validating the validation methodology should be encouraged. These might best be accomplished by dual publications in surgical specialty and clinical engineering or medical physics journals. We need named champions with clout to bring this off.
Breakout report presentation (Kirby Vosburgh)