
Evaluating the results of a Systematic Review/Meta-Analysis

by Michael Turlik, DPM1

The Foot and Ankle Online Journal 2 (7): 5

This is the second of two articles discussing the evaluation of systematic reviews for podiatric physicians. This article focuses on publication bias, heterogeneity, meta-analytic models, and sensitivity analysis. A recent article related to plantar foot pain is critically evaluated using the principles discussed in the paper.

Key words: Evidence-based medicine, review article, meta-analysis.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License.  It permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. ©The Foot and Ankle Online Journal (www.faoj.org)

Accepted: June, 2009
Published: July, 2009

ISSN 1941-6806
doi: 10.3827/faoj.2009.0207.0005

In the event that the primary studies selected for the systematic review are so dissimilar (heterogeneous) that it would be inappropriate to combine their treatment effects, the systematic review will end with a table describing all of the articles abstracted. The table should list each individual reference with the abstracted information, including the results of the study as well as the quality evaluation of the article performed by the authors of the systematic review. In that case the results of the systematic review are qualitative rather than quantitative (meta-analysis). The evaluation of individual randomized controlled trials has been covered earlier in this series. [1,2,3] In the narrative results section the authors should explain why the studies could not be combined into a pooled estimate of effect (meta-analysis).


The results of a systematic review are a function of the quantity and quality of the studies found during the review. The conclusion of a systematic review may be that, after reviewing the published studies, the clinical question cannot be answered and that a larger or more rigorously designed study is needed to answer it. [4,5]

This article is the second and final article explaining systematic reviews/meta-analysis. The first article evaluated the internal validity of a systematic review. [6] The purpose of this article is to explain the results section of a meta-analysis using a recent meta-analysis of extracorporeal shockwave therapy (ESWT) for mechanically induced heel pain [7] as a guide.

A meta-analysis uses statistical techniques to combine data from various studies into a weighted pooled estimate of effect. Meta-analysis overcomes the small sample sizes of primary studies to achieve a more precise estimate of the treatment effect. In addition, meta-analysis is thought to increase power and settle controversies arising from primary studies. A meta-analysis should not be performed when the studies are of poor quality, when serious publication bias is detected, or when the study results are too diverse.

Publication Bias

Reporting bias can be defined as an author’s inclination not to publish either an entire study or portions of a study based upon the magnitude, direction, or statistical significance of the results. [8] Publication bias is the type of reporting bias in which an entire study goes unpublished.

Systematic reviews that fail to search for and find unpublished studies reporting negative results may overestimate the treatment effect.

Small trials with negative results are unlikely to be published and, if they are, may appear in less prominent journals.

Large studies reporting positive results may receive a disproportionate amount of attention and may actually be published more than once. This is the opposite of publication bias. It is therefore important for the authors of a meta-analysis to eliminate duplicate publications; otherwise the treatment effect will be overestimated.

A common method to search for publication bias is to construct a funnel plot (Fig 1, 2). A funnel plot for evaluation of publication bias is a scatter diagram of randomized controlled trials found as a result of the systematic review in which the treatment effect of the intervention appears along the X axis while the trial size appears along the Y axis.

Figure 1  Hypothetical funnel plot which does not show publication bias.

Figure 2  Hypothetical funnel plot which does show publication bias.

The precision of estimating a treatment effect from a clinical trial increases with increasing sample size and event rate. Smaller studies show a large variation in treatment effect at the bottom of the funnel plot. When no publication bias is present the graphical representation reveals an inverted funnel (Fig 1).

When publication bias is present, it will typically be noticed that small studies which do not favor the intervention are missing, usually from the lower right-hand side of the plot, resulting in an asymmetrical presentation (Fig 2). It is difficult to evaluate publication bias in a meta-analysis using a funnel plot if the study is composed of a small number of trials with small sample sizes. [9] The reader is referred to the following references for a more complete explanation of the subject matter. [10,11]
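The mechanism behind funnel-plot asymmetry can be illustrated with a small simulation. The trial counts, sizes, and true effect below are invented for illustration, not drawn from any cited study: when small trials with unfavorable results go unpublished, the mean published effect drifts upward.

```python
import random
import statistics

random.seed(0)

# Simulate 40 hypothetical trials of a treatment with true effect 0.2.
# Smaller trials give noisier estimates -- the funnel plot's wide base.
true_effect = 0.2
trials = []
for _ in range(40):
    n = random.randint(20, 400)               # trial size
    se = 1 / (n ** 0.5)                       # standard error shrinks with n
    estimate = random.gauss(true_effect, se)  # observed treatment effect
    trials.append((estimate, n))

# Publication bias: small trials with unfavorable results go unpublished.
published = [(e, n) for e, n in trials if not (n < 100 and e < 0)]

all_mean = statistics.mean(e for e, _ in trials)
pub_mean = statistics.mean(e for e, _ in published)
print(f"mean effect, all trials:     {all_mean:.3f}")
print(f"mean effect, published only: {pub_mean:.3f}")
```

Plotting `estimate` against trial size for the published trials would show the asymmetric funnel of Figure 2, with the lower corner of small negative trials missing.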

Returning to our article evaluating ESWT for mechanically induced heel pain, [7] in the methods section the authors state that they will use a funnel plot to evaluate for publication bias. However, no funnel plot can be found among the figures in the results section. At the end of the article the authors discuss their findings regarding publication bias in the narrative of the study. The authors were unable to rule out the existence of small, unpublished studies showing no statistically significant benefit. As a result, the treatment effect found may overestimate the actual treatment effect.


Heterogeneity

It is common to expect some variability between studies. However, if the variability between studies is significant, the inference from the meta-analysis is weakened, and it may no longer make sense to pool the results from the various studies into a single effect size.

There are two types of heterogeneity, clinical and statistical. [12] Are the patient populations, interventions, outcome instruments and methods similar from study to study (clinical heterogeneity)?

Are the results similar from study to study (statistical heterogeneity)? Large differences in clinical heterogeneity improve generalizability; however, they may produce large differences in results, which weakens any inference drawn from the study.

Clinical heterogeneity is best evaluated qualitatively. It is a clinical judgment based upon the reader’s understanding of the disease process. The reader needs to ask the following question: is it likely, based upon the patient populations, the outcomes used, the interventions evaluated, and the methodology of the study, that the results would be similar between studies? If the answer to this question is no, then a meta-analysis does not make sense. If the answer is yes, the authors should proceed to evaluate statistical heterogeneity.

Statistical heterogeneity can be evaluated both qualitatively and quantitatively. Qualitative evaluation involves developing a forest plot of the point estimates and corresponding 95% confidence intervals of the various primary studies selected for pooling (Fig 3). Are the point estimates from the various primary trials similar from study to study, and do the 95% confidence intervals about the point estimates overlap? If the answer is yes, there is not significant heterogeneity and a pooled treatment estimate makes sense. For example, in the forest plot from the ESWT study [7] (Fig 3), although the point estimates do not all favor the intervention, they are fairly close to each other. In addition, there appears to be overlap of the 95% confidence intervals for all of the studies. The conclusion one should reach is that there is not significant heterogeneity in this systematic review and therefore one should proceed to pool the data. In contrast, when the point estimates are not grouped together and the 95% confidence intervals do not overlap, significant heterogeneity exists and the data should not be pooled.

Figure 3  Results from the ESWT study [7] (presented in a forest plot).
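The qualitative overlap check described above can be sketched in a few lines of Python. The study names, point estimates, and standard errors here are hypothetical placeholders, not the ESWT data:

```python
# Do the 95% confidence intervals of the primary studies overlap?
# (point estimate, standard error) per study -- illustrative values only.
studies = {
    "Trial 1": (0.10, 0.12),
    "Trial 2": (0.25, 0.10),
    "Trial 3": (0.18, 0.15),
}

# 95% CI: point estimate +/- 1.96 standard errors.
cis = {name: (e - 1.96 * se, e + 1.96 * se)
       for name, (e, se) in studies.items()}

names = list(cis)
# Two intervals overlap when each lower bound is below the other's upper bound.
all_overlap = all(
    cis[a][0] <= cis[b][1] and cis[b][0] <= cis[a][1]
    for i, a in enumerate(names) for b in names[i + 1:]
)
print(cis)
print("all CIs overlap:", all_overlap)
```

Overlapping intervals are only a rough screen; the statistical tests described next quantify the same question.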

Statistical heterogeneity can also be evaluated by statistical tests. [13] The two common tests are Cochran’s Q and the I² statistic. Cochran’s Q is the traditional test for heterogeneity. It begins with the null hypothesis that the magnitude of the effect is the same across the entire study population, and it generates a probability based upon the Chi-squared distribution. Because the test is underpowered, p > 0.1 is taken to indicate a lack of heterogeneity. I² is a more recent statistic for evaluating heterogeneity. [14] The closer I² is to zero, the more likely it is that any difference in variability is due to chance. A value less than 0.25 is considered mild, between 0.25 and 0.5 moderate, and greater than 0.5 a large degree of heterogeneity.
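As a rough sketch of how these two statistics are computed (with invented effect estimates and standard errors, not the ESWT data):

```python
# Cochran's Q and I^2 for a set of study effect estimates.
# w_i = 1/se_i^2 are the usual inverse-variance weights.
effects = [0.10, 0.25, 0.18, 0.30, 0.15]   # point estimates (illustrative)
ses     = [0.12, 0.10, 0.15, 0.09, 0.20]   # standard errors (illustrative)

weights = [1 / se**2 for se in ses]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Q: weighted squared deviations of each study from the pooled estimate.
Q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
df = len(effects) - 1

# I^2: proportion of total variation due to between-study heterogeneity,
# floored at zero when Q falls below its degrees of freedom.
I2 = max(0.0, (Q - df) / Q) if Q > 0 else 0.0
print(f"Q = {Q:.2f} on {df} df, I^2 = {I2:.2f}")
```

With these numbers Q is below its degrees of freedom, so I² is zero: the variation between the invented studies is no more than chance would produce.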

The options for a systematic review which demonstrates significant heterogeneity are the following: do not perform a meta-analysis; perform a meta-analysis using a random effects model; or explore and explain the heterogeneity [15] using sensitivity analysis or meta-regression.

The authors of the ESWT study [7] present in the results section in narrative and table format clinical characteristics of the primary studies.

In addition, they presented the point estimates and 95% confidence intervals of the primary studies in a forest plot, along with the results of Cochran’s Q as well as I² (Fig 3). Their conclusion was that significant heterogeneity was not present and therefore pooling of the data was appropriate.

Meta-Analytic Models

The two models used to combine data in a meta-analysis are the fixed effect and random effects models. [8] Both involve calculating a weighted average from the results of the primary studies; the larger the study, the more impact it has on the combined treatment effect. The fixed effect model assumes the studies are estimating the same underlying effect and that any differences between them are due to random error. There are different fixed effect tests which can be used depending upon the type of data and the precision of the studies included. The random effects model is used when heterogeneity is encountered in the primary studies and offers a more conservative estimate; the main method is the DerSimonian-Laird test. The random effects model gives less weight to larger studies and generates wider confidence intervals about the effect size. The estimates of effect should be similar between the fixed effect and random effects models if the studies do not show heterogeneity; if there is significant heterogeneity, the results will differ, sometimes greatly. If the meta-analysis combines different types of outcomes, the results may be reported as an effect size. An effect size less than 0.2 indicates no effect; greater than 0.2, a small effect; greater than 0.5, a moderate effect; and greater than 0.8, a large effect.
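A minimal sketch of the two pooling models, using the standard inverse-variance fixed effect estimate and the DerSimonian-Laird estimate of the between-study variance (illustrative numbers, not the ESWT data):

```python
# Fixed effect vs DerSimonian-Laird random effects pooling.
# tau^2 is the estimated between-study variance.
effects = [0.10, 0.25, 0.18, 0.30, 0.15]   # point estimates (illustrative)
ses     = [0.12, 0.10, 0.15, 0.09, 0.20]   # standard errors (illustrative)

w_fixed = [1 / se**2 for se in ses]
fixed = sum(w * e for w, e in zip(w_fixed, effects)) / sum(w_fixed)

# DerSimonian-Laird: tau^2 from Cochran's Q, floored at zero.
Q = sum(w * (e - fixed) ** 2 for w, e in zip(w_fixed, effects))
df = len(effects) - 1
c = sum(w_fixed) - sum(w**2 for w in w_fixed) / sum(w_fixed)
tau2 = max(0.0, (Q - df) / c)

# Random effects weights add tau^2, which down-weights large studies.
w_rand = [1 / (se**2 + tau2) for se in ses]
rand = sum(w * e for w, e in zip(w_rand, effects)) / sum(w_rand)

print(f"fixed effect:   {fixed:.3f}")
print(f"random effects: {rand:.3f}")
```

With these homogeneous numbers tau² is zero and the two estimates coincide, illustrating the statement that the models agree in the absence of heterogeneity.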

The results of the meta-analysis should be presented as a summary point estimate with 95% confidence intervals. The authors of the meta-analysis should place the results in a clinical perspective and determine if the results are clinically significant.

The authors of the ESWT study [7] chose a fixed effect model to pool the data from the primary studies. The authors presented their findings in the results section using figures (Fig 3) and text. The pooled estimate of 10-cm VAS scores for morning pain at 12 weeks, with 95% confidence intervals, is reported. The authors conclude that the pooled estimate, although statistically significant in favor of ESWT, is not clinically significant.

Sensitivity Analysis

Sensitivity analysis is often carried out in a meta-analysis to evaluate potential sources of bias. For example, do the results of the meta-analysis vary with trial quality, trial size, type of intervention, patient characteristics, outcome, or any other variable, usually determined a priori? As with any other type of subgroup analysis, precautions should be taken when interpreting the results. [8]

The authors of the ESWT study [7] performed a sensitivity analysis comparing the results as a function of study quality. When only the trials judged to be of higher quality were included in the meta-analysis, the results failed to reveal a statistically significant effect. This is consistent with the concept that trials which lack methodological rigor overestimate the treatment effect of interventions. The authors conclude that the meta-analysis does not support the use of ESWT in the treatment of mechanically induced heel pain.


1. Turlik M: Evaluating the internal validity of a randomized controlled trial. Foot and Ankle Online Journal. 2 (3): 5, 2009.
2. Turlik M: How to interpret the results of a randomized controlled trial. Foot and Ankle Online Journal. 2 (4): 4, 2009.
3. Turlik M: How to evaluate the external validity of a randomized controlled trial. Foot and Ankle Journal 2 (5): 5, 2009.
4. Edwards J: Debridement of diabetic foot ulcers. Cochrane Reviews, http://www.cochrane.org/reviews/en/ab003556.html. Accessed 2/23/09.
5. Valk G, Kriegsman DMW, Assendelft WJJ: Patient education for preventing diabetic foot ulceration. Cochrane Reviews.
http://www.cochrane.org/reviews/en/ab001488.html. Accessed 2/23/09.
6. Turlik M: Evaluation of a review article. Foot and Ankle Online Journal 2, 2009.
7. Thomson CE, Crawford F, Murray GD: The effectiveness of extra corporeal shock wave therapy for plantar heel pain: a systematic review and meta-analysis. BMC Musculoskeletal Disorders 6: 19, 2005.
8. Guyatt G, Drummond R, Meade M, Cook D: Users’ guides to the medical literature. New York, McGraw-Hill Medical, 2008.
9. Egger M, Davey Smith G: Bias in meta-analysis detected by a simple, graphical test. BMJ 315: 629 – 634, 1997.
10. Sterne JAC, Egger M, Davey Smith G: Systematic reviews in health care: Investigating and dealing with publication and other biases in meta-analysis. BMJ 323: 101 – 105, 2001.
11. Ioannidis JPA, Trikalinos TA: The appropriateness of asymmetry tests for publication bias in meta-analyses: a large survey. CMAJ 176 (8): 1091 – 1096, 2007.
12. Hatala R, Keitz S, Wyer P, Guyatt G: Tips for teachers of evidence-based medicine: 4. Assessing heterogeneity of primary studies in systematic reviews and whether to combine their results. CMAJ 172: 661 – 665, 2005.
13. Fletcher J: What is heterogeneity and is it important? BMJ 334: 94 – 96, 2007.
14. Higgins JPT, Thompson SG, Deeks JJ, Altman DG: Measuring inconsistency in meta-analyses. BMJ 327:557 – 560, 2003.
15. Ioannidis J, Patsopoulos NA, Rothstein HR: Reasons or excuses for avoiding meta-analysis in forest plots. BMJ 336: 1413 – 1415, 2008.

Address correspondence to: Michael Turlik, DPM
Email: mat@evidencebasedpodiatricmedicine.com

1 Private practice, Macedonia, Ohio.

© The Foot and Ankle Online Journal, 2009

Evaluating the Internal Validity of a Randomized Controlled Trial

by Michael Turlik, DPM1

The Foot and Ankle Online Journal 2 (3): 5

This paper discusses the important elements to look for when evaluating a randomized controlled trial for internal validity. At the end of the paper, the randomized controlled trial found in the first article of this series is analyzed. This is the second in a series of articles introducing practicing podiatric physicians to evidence-based medicine.

Key words: EBM, evidence-based medicine, randomized controlled trial

This is an Open Access article distributed under the terms of the Creative Commons Attribution License. It permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. ©The Foot and Ankle Online Journal (www.faoj.org)

Accepted: February, 2009
Published: March, 2009

ISSN 1941-6806
doi: 10.3827/faoj.2009.0203.0005

Is this treatment effective? The best study design to answer this question is a randomized controlled trial (RCT). Once an RCT is found to answer the therapeutic question of interest, the reader must decide whether or not the authors took all of the important steps to minimize bias. The strength of the inference we can draw from the trial is based upon how well the authors have planned, executed, and reported safeguards to minimize bias in their study. The authors could reach an invalid conclusion either because of bias or because of random error. The evaluation of the safeguards implemented by the authors is referred to as assessing the internal validity of an RCT. Methodological quality is a continuum, not a dichotomy, and is often in the eye of the beholder.

Even if the authors have described in sufficient detail all of the methods we would commonly associate with minimizing bias, the results of the study may still be affected by random error (chance), which is why we can never be sure exactly where the truth lies. This is the second in a series of articles about evidence-based medicine (EBM) for practicing podiatrists. This article will help the podiatric physician evaluate the internal validity of an RCT.


Returning to the initial article of this series, [1] you will recall that we had found two articles [2,3] to help us evaluate whether magnetic insoles are an effective treatment for reducing pain from diabetic neuropathy. In the initial article of this series a reference was made to a PhD who advocated the use of magnetic insoles for the treatment of symptomatic diabetic neuropathy. In addition, she had developed and was marketing a new type of magnetic insole for this purpose.

The person presenting this information could be said to be biased toward the use of magnetic insoles, specifically her own design, in the treatment of symptomatic diabetic neuropathy. Bias can be defined as an opinion or feeling that strongly favors one side in an argument or one item in a group or series; a predisposition or prejudice.

However, when we consider bias in an RCT, we specifically refer to non-random systematic errors in the design or conduct of the study. Bias is described as any process, at any stage of the study, which produces results that deviate systematically from the truth. Bias in RCTs is usually not intentional, but it is pervasive and insidious. There are many specific types of bias associated with an RCT, including selection, measurement, and analysis bias. The result of bias in a clinical study is to overestimate the treatment effect; as a result, the intervention may appear to work when it really doesn’t. Methods used in clinical trials to minimize bias include randomization, concealment allocation, blinding, and intention to treat analysis.


Case reports and case series, while popular in the podiatric literature, cannot be used to demonstrate treatment efficacy/effectiveness. These study designs lack a control group and as a result are best used to generate, rather than test, hypotheses of clinical efficacy/effectiveness. Case-control and cohort studies are observational studies which utilize a control group; however, the question always is: are the two groups similar enough? Is the treatment effect seen in these studies due to the intervention or to differences in prognostic factors between the groups? It is well accepted that non-randomized controlled trials exhibit a greater treatment effect than randomized controlled trials. [4] The only method currently available to provide two groups which are similar for both known and unknown prognostic factors is randomization.

Randomization

Selection bias is the term used to describe the preferential allocation of participants with similar prognostic factors to the same arm of a clinical trial. Random allocation is a study method which minimizes selection bias.

Random allocation can be defined as a process by which all participants have the same chance of being assigned to either treatment arm. One of the first things to evaluate when reading the methods section of an RCT is whether the authors generated a randomized sequence consistent with the preceding definition. The following methods are not consistent with random allocation: date of birth, hospital chart number, alternate selection, date of entry, or day of the week. Methods of random allocation rarely used in clinical trials are flipping a coin, rolling dice, or choosing colored balls out of a bag. Typically, random allocation in clinical trials is achieved by the use of a random number table or a computer-generated series of numbers. Another question which should be addressed in the methods section is who generated the randomization sequence; it is best when an independent third party not associated with the study generates the sequence. In a study of RCTs published in popular podiatric medical journals, only two of the nine studies described a random allocation process. [5]
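A computer-generated allocation sequence of the kind described above might look like the following sketch, which uses permuted blocks to keep the two arms balanced throughout enrolment. The block size and seed are illustrative choices, not taken from any cited trial:

```python
import random

# Permuted-block randomization: each block of four contains two "A" and
# two "B" assignments in a random order, so the arms never drift far apart.
def blocked_allocation(n_participants, block_size=4, seed=42):
    rng = random.Random(seed)   # an independent party would hold the seed
    sequence = []
    while len(sequence) < n_participants:
        block = ["A", "B"] * (block_size // 2)
        rng.shuffle(block)      # random order within each block
        sequence.extend(block)
    return sequence[:n_participants]

seq = blocked_allocation(20)
print(seq)
print("arm A:", seq.count("A"), "arm B:", seq.count("B"))
```

In practice the generated list would be handed to a third party for concealment, as discussed in the next section, rather than given directly to the enrolling investigator.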

Concealment allocation

After the randomization sequence is generated, the list may be given to the investigator responsible for enrolling participants in the study. This is referred to as unconcealed participant allocation. The investigator may steer participants to certain treatment arms based upon prognostic factors, either consciously or unconsciously. Concealment allocation can be defined as the process by which the enrolling physician is blinded to the randomization sequence that was generated. The person who enrolls participants in the trial should not be the same person who generates the allocation sequence.

In RCTs where concealment allocation has not been utilized there is an overestimation of the treatment effect compared to trials which conceal the allocation sequence; the increase in treatment effect may be 20% to 30%. [6] The average bias associated with lack of adequate concealment allocation was less for outcomes which were objectively evaluated (death, ulcer closure) than for those subjectively evaluated (pain, patient-reported outcomes).

The description the authors used for concealment allocation is usually found in the methods section of the paper. For example, a common description might be “…a neutral third party has generated a series of sequentially numbered opaque sealed envelopes (SNOSE) containing the randomization sequence for each participant to be opened at the time the investigator enrolls the participant in the study.” As a result the investigator is blinded to the treatment arm to which the participant will be enrolled.

It is very common today to have a centralized allocation process: when the investigator enrolls a participant in the study, he or she calls an off-site location to determine the next group to which the subject is assigned. A similar system is also used by computer over the internet. Sometimes the allocation sequence may be kept in the pharmacy, which is contacted by the investigator prior to enrolling the participant. Adequate concealment allocation helps to limit selection bias. In a study of RCTs published in common podiatric medical journals, none of the nine studies described a process for concealment allocation. [5]

Were patients in the treatment and control groups similar with respect to known prognostic factors? Problems with the random allocation sequence and the concealment allocation process may result in an imbalance in baseline prognostic factors. If both of these steps have been followed and there still remains an imbalance in some important prognostic factor, it can be assumed that this is due to chance rather than bias. The larger the study, the less likely this is to occur.


Blinding

Blinding in a clinical trial can be defined as withholding information about treatment allocation from those who could potentially be influenced by this information. Unblinded studies exhibit an increased treatment effect compared with blinded studies. [7] In the methods section the authors should describe in some detail who was blinded, how they were blinded, and the success of blinding.

Who was blinded? Certainly participants and investigators can be blinded. Less commonly recognized is that data collectors and analysts should be blinded. Participants should be blinded because they may use other effective interventions, may report symptoms differently, or may drop out if they perceived they have received a placebo therapy. Investigators should be blinded because they may prescribe effective co-interventions, influence follow-up, or patient reporting. Data collectors and analysts should be blinded because they may exhibit differential encouragement during performance testing, exhibit variable recordings of outcomes, or differential timing and frequency of outcome measurements.

How was blinding achieved? In the case of medication the placebo should be the same size, shape, color and taste as the therapeutic intervention. When using a gold standard a double dummy process should be employed. When using a sham procedure both instruments should look alike, sound alike, have the same lights and duration. Separate waiting rooms may be necessary for each treatment arm to prevent interactions between groups. Sometimes the therapeutic intervention under investigation precludes the investigators and/or participants from being blinded however, it is difficult to understand why data collectors and analysts cannot be blinded.

There is no universal agreement upon how to assess blinding [8] or even whether it should be assessed. Oftentimes study authors will ask investigators and participants to guess their treatment allocation and report the results. Some would suggest instead looking for bias-generating consequences such as contamination and co-interventions.

Measurement bias is defined as inaccurate measurement due either to inaccuracy of the measurement instrument or to bias arising from the study expectations of participants and investigators. Blinding helps to limit measurement bias. In a study of RCTs published in popular podiatric journals, only two of the nine trials described a process for blinding. [5]

Intention to Treat Analysis

Intention to treat (ITT) analysis can be defined as the strategy for the analysis of randomized controlled trials that ensures all participants are compared in the groups to which they were originally randomly assigned. Although this sounds simple, it is difficult to understand and often confused in the literature. In general, all patients who were randomized at the beginning of the trial must be accounted for in the analysis. There are some exceptions; [9] however, failure to account for all participants at the conclusion of the trial will result in analysis bias, overestimating the treatment effect. ITT preserves the prognostic balance between treatment arms achieved by randomization and increases generalizability.

During the course of the study, participants may elect not to participate after they have been randomized, or they may change treatment arms for various reasons. In ITT analysis they should always be analyzed in the group to which they were originally allocated, regardless of the treatment received. In addition, during the course of the study participants may be lost to follow-up.

It is well established that participants who drop out of a study have a different prognosis than those who remain. [10] There are many different methods to infer or impute missing study results, for example last value carried forward or worst-case scenario. Unfortunately these are only estimates for missing data. ITT prevents conscious or unconscious attempts to influence study results by excluding participants.

Results of an RCT may be presented in two different manners. A per protocol analysis includes only patients who successfully completed the trial, whereas an ITT analysis accounts for all participants. Per protocol analysis answers the question: what will happen if my patients all comply with the treatment intervention (explanatory)? Intention to treat analysis answers the question: what will happen in real life using this treatment intervention (pragmatic)? If there is a large difference between the per protocol and ITT analyses, the loss to follow-up has been large and the inference from the study is reduced. Intention to treat analysis is more pragmatic, provides a more conservative estimate of treatment effect, and minimizes a type I error.
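The contrast between the two analyses can be made concrete with a toy data set of entirely hypothetical participants. Dropouts are excluded under per protocol analysis; under ITT they stay in their assigned arm and, in this sketch, are imputed as non-responders (a worst-case imputation, one of several choices):

```python
# Each record: (assigned_arm, completed_trial, improved).
participants = [
    ("treatment", True,  True), ("treatment", True,  True),
    ("treatment", True,  False), ("treatment", False, False),
    ("treatment", False, False),
    ("control",   True,  True), ("control",   True,  False),
    ("control",   True,  False), ("control",   False, False),
    ("control",   False, False),
]

def response_rate(records, arm, per_protocol):
    # Per protocol keeps completers only; ITT keeps everyone randomized,
    # with dropouts already coded as non-responders in the data above.
    group = [r for r in records
             if r[0] == arm and (r[1] or not per_protocol)]
    return sum(r[2] for r in group) / len(group)

for label, pp in (("per-protocol", True), ("intention-to-treat", False)):
    t = response_rate(participants, "treatment", pp)
    c = response_rate(participants, "control", pp)
    print(f"{label}: treatment {t:.2f} vs control {c:.2f}")
```

Here the per protocol response rate in the treatment arm (2 of 3 completers) exceeds the ITT rate (2 of all 5 randomized), illustrating why ITT is the more conservative analysis.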

When reviewing the methods section of an RCT to evaluate intention to treat analysis, the reader should look for how the intention to treat analysis was performed. It is unacceptable for the authors simply to state that the data were analyzed on an intention to treat principle without explanation. In a study of RCTs published in common podiatric journals, four of the nine papers reported that the data were analyzed on an intention to treat basis. [5]

Magnets in the treatment of diabetic neuropathy

In the first article of this series, [1] an RCT [3] was identified which evaluated the usefulness of magnetic insoles in reducing the symptoms of painful diabetic neuropathy. Using the information presented earlier, the internal validity of this article is critically analyzed below.

Randomization: The authors in the methods section describe an equal random allocation procedure utilizing a computer. It is unclear who actually generated the sequence and how it was accomplished. In addition, randomization was stratified by center and gender.

Concealment allocation: In the methods section the authors report that neither the participants nor the investigators were aware of the treatment allocation. However, they did not elaborate as to the method. It may be that a centralized allocation process was used via the computer.

Baseline comparison: Although the active treatment group and the sham group appear to be similar with regards to baseline characteristics, the groups appear to be dissimilar when baseline outcome measurements are compared. The baseline outcome measures for the sham group appear to be worse than the intervention group.

Blinding: The authors state that the sham and active magnetic insoles were identical with regard to appearance, consistency, and weight. In the event the insole did not fit the shoe and needed to be trimmed, the authors described a process by which an uninvolved third party would make the adjustments. In addition, all data were submitted blindly to an uninvolved third party for analysis. In the results section the authors report their efforts at assessing the effectiveness of the methods used to blind the investigators and participants. There was no indication that any contamination or co-interventions were detected; however, in the methods section there was no indication that the authors were attempting to measure contamination or co-interventions.

Intention to treat analysis: The results of the study were not analyzed on an ITT basis. Furthermore, the authors state that participants with incomplete data were excluded from the analysis. In the results section the authors discuss the dropouts and their decision not to analyze the data on an intention to treat basis. The authors chose to use four different primary outcomes, each with different numbers of participants.

Summary of internal validity: Based upon the methods and results section of the paper it is clear that the authors attempted and succeeded in blinding participants, investigators and data analysts. It is less clear as to the method of random allocation and concealment allocation. The authors made no attempt to analyze the data utilizing the intention to treat principle.


1. Turlik M: Introduction To Evidence-based Medicine. Foot and Ankle Journal 2: 4, 2009.
2. Pittler M, Brown EM, Ernst E: Static magnets for reducing pain: systematic review and meta-analysis of randomized trials. CMAJ 177: 736 – 742, 2007.
3. Weintraub MI, Wolfe GI, Barohn RA, Cole SP, Parry GJ, Hayat G, Cohen JA, Page JC, Bromberg MB, Schwartz SL, Magnetic Research Group: Static magnetic field therapy for symptomatic diabetic neuropathy: a randomised, double-blind, placebo-controlled trial. Arch Phys Med Rehabil 84 (5): 736 – 746, 2003.
4. Moore A, McQuay H: Bandolier’s Little Book of Making Sense of the Medical Evidence. Oxford University Press, Oxford, England, 2006.
5. Turlik M., Kushner D, Stock D: Assessing the Validity of Published Randomized Controlled Trials in Podiatric Medical Journals. JAPMA 93 (5): 392 – 398, 2003.
6. Wood L: Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study. BMJ 336: 601 – 605, 2008.
7. Poolman R: Reporting of Outcomes in Orthopaedic Randomized Trials: Does Blinding of Outcome Assessors Matter? J Bone Joint Surg 89A: 550 – 558, 2007.
8. Hróbjartsson A: Blinded trials taken to the test: an analysis of randomized clinical trials that report tests for the success of blinding. International Journal of Epidemiology 36 (3): 654 – 663, 2007.
9. Fergusson D, Aaron SD, Guyatt G, Hébert P: Post-randomisation exclusions: the intention to treat principle and excluding patients from analysis. BMJ 325: 652 – 654, 2002.
10. Altman D: Clinical trials. In: Practical statistics for medical research. London: Chapman & Hall, 1991.

 Address correspondence to: Michael Turlik, DPM
Email: mat@evidencebasedpodiatricmedicine.com

1 Private practice, Macedonia, Ohio.

© The Foot and Ankle Online Journal, 2009

EBMR: Comparison of Negative Pressure Wound Therapy using Vacuum-Assisted Closure with Advanced Moist Wound Therapy in the Treatment of Diabetic Foot Ulcers

Evidence Based Medicine Review

Blume P., Walters J., Payne W., Ayala J., and Lantis J.

Diabetes Care 31:631-636, 2008

Michael Turlik, DPM

The Foot & Ankle Journal 1 (12): 5

This is an Open Access article distributed under the terms of the Creative Commons Attribution License.  It permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. ©The Foot & Ankle Journal (www.faoj.org)


This was a multi-center randomized controlled efficacy trial of 342 diabetic subjects who were followed for a minimum of 112 days. The study was performed across 37 diabetic foot clinics and hospitals, principally in the United States. The primary outcome was complete ulcer closure. There was a thorough description of the method of randomization as well as allocation concealment. Subjects and investigators were not blinded, and it was unclear whether data collectors and analysts were blinded. The safety and effectiveness analyses were conducted by the company sponsoring the study.


A sample size calculation was carried out with the expectation of a 20% difference between groups (absolute risk reduction, ARR). Enough subjects were enrolled to satisfy the sample size calculation. Examination of the baseline data reveals no differences between groups. Data for the primary outcome were analyzed on an intention-to-treat as well as a per-protocol basis. Efficacy of the intervention was statistically significant in both analyses, favoring the vacuum-assisted closure method. Point estimates were reported for the primary outcome, but 95% confidence intervals were not.
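The expected 20% difference drives the required enrollment directly. As a rough illustration only (not the authors' actual calculation; the baseline closure rates they assumed are not stated here), the standard normal-approximation formula for comparing two proportions can be sketched as follows. The 30% and 50% closure rates are assumed values chosen solely to produce a 20% ARR:

```python
from math import ceil
from statistics import NormalDist

def per_group_sample_size(p_control, p_treatment, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided
    comparison of two independent proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # ~0.84 for 80% power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    delta = abs(p_treatment - p_control)            # the expected ARR
    return ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)

# Assumed rates: 30% closure with moist wound therapy vs 50% with NPWT
n = per_group_sample_size(0.30, 0.50)   # roughly 90 subjects per group
```

Note that the expected difference enters the denominator squared, which is why trials powered to detect more modest ARRs require far larger enrollments.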


The manufacturer of the wound closure vacuum assisted device (KCI) supported the study. It was not indicated if the study was registered. The primary author has received payments from the manufacturer for speaking engagements. No other disclosures were noted for other authors or investigators. The article has been marked as an advertisement by the publisher.


The study contains several well-described methodological techniques to limit bias: randomization, allocation concealment, and intention-to-treat analysis. However, due to the nature of the study, investigators and subjects could not be blinded.

Although unblinded studies are associated with inflated treatment effects, this is less likely when the primary outcome is objective, such as resolution of an ulcer, than when it is a soft measurement such as a patient-reported outcome. [1] However, no mention was made of blinding the data collectors and analysts.

The data from the study were analyzed by the company funding the study. This may be perceived as a potential source of bias; it is more reassuring to the reader when the data are analyzed by a neutral third party. The results of the primary outcome were presented as intention-to-treat and per-protocol analyses. It appears the authors chose to assign the worst-case scenario to the data lost to follow-up in the ITT analysis. Both analyses were statistically significant but differed in their point estimates. Was this study clinically significant? The authors expected a 20% difference (ARR) between groups when they calculated their sample size. If that 20% difference is accepted as a clinically significant result, then the primary outcome under the intention-to-treat analysis was statistically but not clinically significant, whereas the per-protocol analysis was both clinically and statistically significant. Furthermore, it is difficult to interpret results presented only as point estimates without 95% confidence intervals (CI). Why 95% CIs were reported for secondary measures but not the primary outcome is unclear.

There appears to be a fairly high loss to follow-up in both arms of the study (approximately 30% per treatment arm). The prognosis of subjects lost to follow-up is thought to differ from that of patients who remain in the study. [2] This loss of data may compromise the randomization. Did the loss to follow-up affect the results of the study? The strength of the inference drawn from the study is weakened by the magnitude of the difference between the intention-to-treat and per-protocol analyses. It would have been instructive if the authors had addressed this point in their discussion of the results.

Interpretation of the study’s results would be easier if the authors had stated a clear clinically important difference and reported 95% CIs about the point estimate of the primary outcome.

Using the data from this study, 95% CIs can be calculated for the absolute risk reduction (ARR) and number needed to treat (NNT) for both the intention-to-treat and per-protocol analyses. [3] (Table 1)

Table 1

The ARR exceeds 20% only in the per-protocol analysis. The lower end of the 95% CI for both the ITT and PP analyses is greater than 0, which is consistent with a statistically significant result. Although the point estimate (ARR) for the intention-to-treat analysis is less than 20%, a risk reduction of more than 20% cannot be ruled out given the upper end of the 95% CI, which suggests that a larger study, or less loss to follow-up, is needed.

The NNT is a more clinician-friendly metric for assessing efficacy in studies with dichotomous outcomes. The NNTs for the two analyses are similar, 5 (PP) and 8 (ITT); however, the upper limits of the 95% CIs, the worst-case scenarios, are 12 (PP) and 24 (ITT). This appears to be a large difference.
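The relationship between the ARR, the NNT and their confidence limits can be sketched numerically. The counts below are hypothetical, chosen only to illustrate the arithmetic (NNT = 1/ARR, with the CI bounds inverting and swapping); they are not the trial's actual data:

```python
from math import sqrt
from statistics import NormalDist

def arr_with_ci(healed_treat, n_treat, healed_ctrl, n_ctrl, alpha=0.05):
    """ARR (difference in healing proportions) with a Wald
    normal-approximation confidence interval."""
    p_t, p_c = healed_treat / n_treat, healed_ctrl / n_ctrl
    arr = p_t - p_c
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(p_t * (1 - p_t) / n_treat + p_c * (1 - p_c) / n_ctrl)
    return arr, arr - z * se, arr + z * se

# Hypothetical counts: 60/120 ulcers closed with treatment, 40/120 with control
arr, lo, hi = arr_with_ci(60, 120, 40, 120)
nnt = 1 / arr                           # point estimate
nnt_best, nnt_worst = 1 / hi, 1 / lo    # inverting the CI swaps the bounds
```

The worst-case NNT comes from the lower ARR bound, so a statistically significant but imprecise ARR can still imply a clinically unhelpful NNT.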

Although the use of vacuum-assisted closure appears to be more efficacious, the magnitude of the effect is unclear and the strength of the inference reduced. It is up to the reader to determine whether the loss to follow-up, lack of blinding and lack of clinical significance weaken the inference that can be drawn from this study.

In addition, since the study was designed as an efficacy rather than an effectiveness study, generalizing the results to clinical practice should be undertaken with caution.

The safety data were presented as treatment-related rates at six months. However, the trial evaluated treatment only until day 112 or ulcer closure by any means. It would be informative to review the safety data before and after termination of the intervention. Two meta-analyses of vacuum-assisted closure for diabetic foot ulcers have been published this year. [4,5]


1. Wood L, Egger M, Gluud LL, Schulz KF, Jüni P, Altman DG, Gluud C, Martin RM, et al: Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study. BMJ 336: 601 – 605, 2008.
2. Montori VM, Guyatt GH. Intention-to-treat principle. Can Med Assoc J 165 (10): 1339 – 1341, 2001.
3. Graphpad. http://www.graphpad.com/quickcalcs/NNT1.cfm Accessed 11/3/2008.
4. Gregor S, Maegele M, Sauerland S, Krahn JF, Peinemann F, Lange S: Negative pressure wound therapy: a vacuum of evidence? Arch Surg 143 (2): 189 – 196, 2008.
5. Bell G, Forbes A. A systematic review of the effectiveness of negative pressure wound therapy in the management of diabetes foot ulcers. Int Wound J 5 (2): 233 – 242, 2008.

© The Foot & Ankle Journal, 2008