Ranica Arrowsmith, Associate Editor | 05.08.14
This year, Stanford University opened a “bad science” lab. The Meta-Research Innovation Center at Stanford, or METRICS, was established to “advance excellence in scientific research.” That is a polite way of saying that the center’s mission is to expose shoddy scientific studies that rely on statistical manipulation, skewed sample groups, and other tactics that produce biased results. The center will, essentially, be researching research.
Founder John P. A. Ioannidis, M.D., D.Sc., is the C.F. Rehnborg professor in disease prevention, professor of medicine, professor of health research and policy, and professor (by courtesy) of statistics at Stanford. He has made it his life’s mission to assess biases, replication, and reliability of research findings in biomedicine and other fields. In 2005, he published a controversial article titled “Why Most Published Research Findings Are False,” which discussed “study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field.”
He concluded that it is impossible to attain a gold standard in research, because it is impossible to eliminate all elements of bias, chance, sample size, and so on. However, one of his major conclusions was the need for “better powered evidence, e.g., large studies or low-bias meta-analyses.” He noted that large-scale evidence should be targeted at research questions where the pre-study probability is already considerably high, so that a significant research finding leads to a post-test probability that can be considered quite definitive. He also said that large-scale evidence is particularly indicated when it can test major concepts rather than narrow, specific questions.
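For readers who want to see the arithmetic behind that argument, the 2005 paper expresses the positive predictive value (PPV) of a claimed finding in terms of the pre-study odds R that a probed relationship is true, the statistical power 1 − β, and the significance threshold α. Below is a minimal sketch of the no-bias case of that formula; the worked numbers are illustrative choices, not figures from the paper.

```python
def ppv(R, alpha=0.05, power=0.8):
    """Positive predictive value of a 'significant' finding (no-bias case):
    PPV = (1 - beta) * R / (R - beta * R + alpha),
    where R is the pre-study odds that a probed relationship is real."""
    beta = 1.0 - power
    return (1.0 - beta) * R / (R - beta * R + alpha)

# A field where only 1 in 10 probed relationships is real:
print(ppv(R=0.1))              # ~0.62: most "positive" findings are true, but barely
print(ppv(R=0.1, power=0.2))   # ~0.29: underpowered studies -> most findings false
print(ppv(R=1.0))              # ~0.94: high pre-study odds -> far more definitive
```

The third call illustrates his point about targeting large-scale evidence: when the pre-study odds of a relationship being real are already high, a significant result is far more likely to be true.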
Fast forward five years to 2010, when the U.S. Food and Drug Administration (FDA) released a guidance document encouraging “adaptive designs” for drug and biologics trials. Rather than cautiously selecting a larger sample size than the trial really requires and trimming it down as needed, sponsors select an initial sample size and trial parameters, then adapt the trial as data accumulate until all requirements are met.
“Adaptive design is defined as a multistage study design that uses accumulating data to decide how to modify aspects of the study without undermining the validity and integrity of the trial,” Vladimir Dragalin, Ph.D., senior vice president of software development at Reston, Va.-based contract research organization Aptiv Solutions Inc., recently told Technology Networks. “By validity, we mean the minimization of statistical bias by using the correct statistical methods—and by integrity, we mean the minimization of operational bias through the use of appropriate trial execution technologies and working procedures, including the use of operational firewalls and independent data monitoring committees.”
In 2008, Howard S. Hochster, M.D., published an article in Gastrointestinal Cancer Research titled “The Power of ‘P’: On Overpowered Clinical Trials and ‘Positive’ Results.” Overpowered trials sit at the opposite end of the spectrum from adaptive clinical trials (ACTs): they enroll a very large sample in order to, ostensibly, produce better data. However, while undersized trials risk type II errors, i.e., false negatives that miss a real effect, oversized trials make it more likely that a difference too small to matter clinically will still register as statistically significant. Hochster used the trial of perioperative FOLFOX chemotherapy for resectable liver metastases vs. surgery alone as an example of an overpowered trial, and indeed, as a 2004 BMJ rapid response claimed, cancer trials often are overpowered. “Very little attention is paid to the pitfall of overpowering and thereby making a type I error more likely: finding an association which is not clinically important (but only statistically significant),” it read. (1)
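The statistical-versus-clinical-significance pitfall the BMJ letter describes is easy to reproduce numerically. The sketch below is hypothetical and hand-rolled; the response rates and arm sizes are invented for illustration, not drawn from the FOLFOX trial. A two-percentage-point difference that no clinician would act on still crosses p < 0.05 once the arms are large enough.

```python
import math

def two_proportion_p(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value for a two-proportion z-test (pooled standard error)."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the standard normal distribution
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Hypothetical 52% vs. 50% response rate -- a clinically trivial difference.
print(two_proportion_p(104, 200, 100, 200))            # ~0.69: not significant at 200 per arm
print(two_proportion_p(10_400, 20_000, 10_000, 20_000))  # ~0.00006: "significant" at 20,000 per arm
```

The effect is identical in both calls; only the sample size changes, which is exactly the overpowering trap.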
“When you set up a clinical study, you set it up with a significant number of patients, depending on what type of product it is, in order to make sure you guarantee meeting the requirements of the FDA or a European Union Notified Body (NB),” explained Bernard Sweeney, senior vice president of medical devices for Reston, Va.-based contract research organization Aptiv Solutions Inc., to Medical Product Outsourcing. “Consequently you select a higher number of subjects and, depending on the probability, reduce it slightly. If you use an adaptive design, you have the ability after a period to analyze the different arms of the group, and if you’re finding it’s been very positive and the product has been performing well, better than expected against the existing standard of care, then you have the ability to reduce the number of patients, particularly in the standard of care arm, and apply to the FDA or NB much earlier for approval. Conversely, if it’s not working, you have a greater ability to stop the trial. It gives you the chance to readjust the number of patients to suit the outcomes whilst you’re in it, and therefore under most situations reduce the number of patients needing to be treated.”
ACTs could easily be called “right-sized” trials, because they are focused on streamlining and efficiency. Aptiv Solutions is a strong advocate for ACTs, and encourages its clients to consider this approach to medical device and drug trial design. The company provides software called ADDPLAN, a tool designed to assist with the planning, simulation, and evaluation of adaptive clinical trials. The FDA has placed a unique stamp of approval on the software by purchasing ADDPLAN licenses.
Sweeney told MPO that medical products are becoming more and more complex, both in the diseases they treat and their etiologies, and are entering larger markets. The FDA is encouraging the move toward ACTs in order to limit the number of patients in trials, but also because ACTs allow researchers to concentrate on small subgroups within a trial and figure out why a treatment may not be working, or is working differently than expected.
“Let’s say a trial has 85 patients. You’ve got to wait until those 85 patients are on board and have been treated,” Sweeney continued. “But if you use an adaptive design, you have the ability to evaluate, let’s say, after 60 patients. Depending on those outcomes you may well find that the last 20-25 patients are not necessary. Or you may find that if you do something differently, you will have to increase patient numbers. But it’s all about optimizing patient numbers, which is why the FDA is so encouraging of ACTs. With device trials, you are very likely to be able to reduce the number of patients, because most devices are basically engineered so you can more reliably predict the outcome early on than with something that is a therapeutic agent, for example.”
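Sweeney’s 85-versus-60-patient scenario can be sketched as a single interim look with an early-stopping rule. The toy simulation below uses assumed success rates and an assumed stopping margin; it is not Aptiv’s ADDPLAN methodology or any FDA-sanctioned design, and a real adaptive trial would also control the overall type I error across interim looks (for example, with group-sequential boundaries).

```python
import random

def run_adaptive_trial(true_rate, interim_n=60, max_n=85,
                       control_rate=0.50, stop_margin=0.15, seed=1):
    """Toy single-look adaptive design: plan for up to max_n patients, but take an
    interim look after interim_n. If the observed success rate beats the assumed
    standard-of-care rate by stop_margin, stop enrollment early. All thresholds
    here are illustrative, not a validated stopping boundary."""
    rng = random.Random(seed)
    outcomes = [rng.random() < true_rate for _ in range(max_n)]

    interim_rate = sum(outcomes[:interim_n]) / interim_n
    if interim_rate >= control_rate + stop_margin:
        return interim_n, interim_rate   # early stop: the last 25 patients are spared
    final_rate = sum(outcomes) / max_n
    return max_n, final_rate             # otherwise continue to the full sample

print(run_adaptive_trial(true_rate=0.75))  # strong effect: likely stops at 60 patients
print(run_adaptive_trial(true_rate=0.55))  # modest effect: likely runs the full 85
```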
1. http://www.bmj.com/rapid-response/2011/10/30/are-cancer-trials-frequently-overpowered