Anesth Analg 2003;97:1549-1551
© 2003 International Anesthesia Research Society
LETTERS TO THE EDITOR
The Importance of Methodology
L. Mario Vaquero, MBBS,
Francisco J. Sánchez-Montero, MBBS, and
Clemente Muriel, MD
Servicio de Anestesiología y Reanimación, Hospital Universitario de Salamanca, Spain
To the Editor:
We have read the article by Schumann et al. (1) and have several questions regarding its methodology and results.
First, the formulation of their primary objective does not seem very clear. Three elements are needed to formulate an objective (2): first, the study factor, which corresponds to the exposure or intervention of interest; second, the evaluation criterion or response variable with which the effect or association is to be measured; and finally, the study population, that is, the individuals on whom measurements will be made. Their stated primary objective "was to incorporate multimodal analgesia including preemptive antinociception into the analgesic regimen of patients undergoing open gastric bypass surgery." From the content of the work, and according to the criteria mentioned above, we deduce that it should instead be expressed as "to determine if the postoperative multimodal analgesia (study factor) produces better results than the classical analgesia (study factor) on the pain control (evaluation criterion) of the patients undergoing gastric surgery for obesity treatment (population)."
Second, in the Methods section of their article, we cannot find the estimated number of individuals who needed to be included in the study to detect the desired effect, which is, to cite one example, an important prior calculation for obtaining results (3). We do see an attempted objective in that the "primary end-point of this study was to assess the analgesic efficacy of a multimodal approach compared with the standard unimodal approaches in patients undergoing gastric bypass surgery."
In the Results section, on analyzing the visual analog scale (VAS) variable, the authors point out as significant that "VAS pain intensity scores between groups differed in the PACU and at 36 h after surgery." But what difference did they expect to find: 6 points, the difference at 0 h between the group receiving morphine analgesia (Group C) and the group receiving epidural analgesia (Group B), or 1 point, the difference at 36 h between the group receiving local analgesia (Group A) and the group receiving epidural analgesia (Group B)? For which of the two values was the sample size calculated? And therefore, what is the clinically important difference? Although a difference of 1 point is statistically significant, does it have clinical importance in this case? These questions are difficult to answer if the sample size is not calculated beforehand and the expected or clinically relevant result is not estimated.
On the other hand, the authors subdivided Group A into patients who responded and those who did not, and they applied statistical techniques to compare the results of a subgroup whose average sample size is 10 individuals at each time period. They found significant differences among groups that range from 0 to 6 points. Considering the minimal power of a test with such small sample sizes, would it not be more suitable simply to describe this difference as such, rather than to apply statistical tests that have only relative validity and that can confuse the reader by declaring the results "statistically significant"?
The authors have also tested for differences among the analgesia groups in general, applying statistical tests one after another to see whether one variable or another differed significantly. We cannot see in any case that they have taken into account the so-called "submachine gun effect," whereby the alpha error increases with the number of times the test is applied, and which can be corrected by the Bonferroni or Newman-Keuls methods (4). Had these methods been used, this problem could have been avoided.
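As a rough illustration of this effect, the following Python sketch computes how the probability of at least one false-positive result grows when a test is repeated, and the per-test threshold that a Bonferroni correction would impose. It assumes the tests are independent and uses five tests purely as an illustrative count; the function names are hypothetical and nothing here is taken from the article under discussion.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the overall
    (family-wise) error rate at or below alpha."""
    return alpha / n_tests


alpha = 0.05
n_tests = 5  # hypothetical number of repeated tests, for illustration only

fwer = familywise_error_rate(alpha, n_tests)
threshold = bonferroni_threshold(alpha, n_tests)

print(f"Family-wise error rate over {n_tests} tests: {fwer:.3f}")
print(f"Bonferroni per-test threshold: {threshold:.3f}")
```

Under these assumptions, the overall alpha error is several times the nominal 0.05 unless each individual test is held to the stricter threshold.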
Finally, as we have commented above, it would be rash to conclude that infiltration of the incision can be an effective component of multimodal analgesia for obese patients undergoing open gastric surgery. These results are shown for only a small number of patients, and the real validity of the results and their clinical importance are unknown; therefore, it seems to us too risky to highlight that as the first conclusion of the work.
Despite our comments, which we hope can help authors to improve the quality of articles published in this publication, we would like to sincerely congratulate the authors for the originality of their work.
References
1. Schumann R, Shikora S, Weiss JM, et al. A comparison of multimodal perioperative analgesia to epidural pain management after gastric bypass surgery. Anesth Analg 2003; 96: 469–74.
2. Argimón Pallas JM, Jiménez Villa J. Métodos de investigación aplicados a la atención primaria de salud. Barcelona: Ediciones Doyma, 1991.
3. Wittes J. Sample size calculations for randomized controlled trials. Epidemiol Rev 2002; 24: 39–53.
4. Martín Andrés A, Luna Castillo JD. Bioestadística para las ciencias de la salud. Barcelona: Ediciones Norma, 1990.
Response
Roman Schumann, MD,
Scott Shikora, MD,
Jocelyn M. Weiss, MPH,
Heinrich Wurm, MD,
Scott Strassels, PharmD, and
Daniel B. Carr, MD
Tufts University School of Medicine and Tufts-New England Medical Center, Boston, MA
In Response:
We thank Vaquero et al. for their interest in our paper (1), their detailed and thoughtful comments, and their compliments about its originality. We write now to address their concern about the methodology of this trial. We regret their difficulty in following our presentation and welcome the opportunity to reiterate the salient features of our study.
With respect to the first question posed, the three arms of our study evaluated two interventions (Groups A and B) for pain control after open gastric bypass surgery and compared them with a standard unimodal approach (controls, Group C). To quote from our paper, "The primary end-point of this study was to assess the analgesic efficacy of a multimodal approach compared with the standard unimodal approaches in patients undergoing gastric bypass surgery. To do so, we assessed visual analog scale (VAS; 0–10) pain scores every 6 h at rest, postoperative morphine consumption in the two groups (A and C) treated with PCA, patient satisfaction, and length of stay." It is unclear to us how we could better state the objective, intervention, or response variables. As should be apparent from Figure 1 in our article, multimodal analgesic regimens appeared to produce better results than unimodal, traditional interventions with respect to the response variable of VAS pain intensity.
As to their second question about sample size, we agree with the assertion of Vaquero et al. that power analyses and sample size calculations are indispensable. We did not describe this preliminary portion of our study because of space limitations and welcome the opportunity to do so now. In our power analysis, the primary end-point for the sample size calculation of this study was the average pain score (VAS at rest) over the first 24 h following the operation. We carried out an initial, pretrial sample size calculation based on observations of 16 patients with various postoperative analgesic regimens following gastric bypass operations in our institution. These data showed that mean VAS pain scores during the first 24 h had a standard deviation of 1.6. At the time we were aware of an emerging literature on the patient-rated significance of declines in pain intensity. Farrar had reported that patients with chronic cancer-related (2) or non-cancer pain (3) of moderate intensity, defined as from 4 through 6 on a 0–10 scale, deemed a one-third decline in their pain intensity to be meaningful. Such a decline could be as small as 1.3 points if the initial pain intensity is 4. We have subsequently observed that patients with acute postoperative pain report a similar decline in pain intensity to be meaningful (4). A standard power calculation formula (5) to determine the number of patients to allocate into each treatment arm is:

$$ n = \frac{2\sigma^{2}}{(\mu_{2} - \mu_{1})^{2}} \, f(\alpha, \beta) $$

Taking σ, the standard deviation, to be 1.6; the minimum difference to be detected between the mean values for pain intensity in the intervention versus control groups, μ₂ − μ₁, as 1.3; and f(α,β) corresponding to an 80% power to detect change at the 0.05 level of significance (= 7.9 from Table 9.1 in reference 5), one finds that approximately 24 patients are required for enrollment in each group. If we conservatively estimate that we would require accrual of 25% higher numbers to overcome attrition due to all causes, we conclude that at least 30 patients per arm are needed. If anything, when one considers that analgesic trials are commonly underpowered (6), we feel that our approach was, statistically speaking, on the conservative side.
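The arithmetic above can be retraced with a short Python sketch. It assumes the standard two-group formula for a continuous outcome, n = 2σ² · f(α,β) / (μ₂ − μ₁)², and uses the values quoted in the text (σ = 1.6, difference = 1.3, f = 7.9). The function name is hypothetical, and the final check, which recomputes f(α,β) from the normal distribution as (z for two-sided α + z for power)², is an added assumption about how the tabulated value arises rather than something stated in this response.

```python
import math
from statistics import NormalDist


def patients_per_arm(sigma: float, delta: float, f_alpha_beta: float) -> float:
    """Unrounded sample size per arm: 2 * sigma^2 * f(alpha, beta) / delta^2."""
    return 2 * sigma**2 / delta**2 * f_alpha_beta


sigma = 1.6    # standard deviation of mean VAS score, from the text
delta = 1.3    # minimum difference to detect, from the text
f_table = 7.9  # tabulated f(alpha, beta) for alpha = 0.05, power = 80%

n_raw = patients_per_arm(sigma, delta, f_table)
n_per_arm = math.ceil(n_raw)                    # round up to whole patients
n_with_attrition = math.ceil(n_per_arm * 1.25)  # 25% allowance for attrition

# Assumption: f(alpha, beta) = (z_{1 - alpha/2} + z_{power})^2
z = NormalDist()
f_normal = (z.inv_cdf(1 - 0.05 / 2) + z.inv_cdf(0.80)) ** 2

print(f"Unrounded n per arm: {n_raw:.2f}")
print(f"Patients per arm: {n_per_arm}")
print(f"Patients per arm with 25% attrition allowance: {n_with_attrition}")
print(f"f(alpha, beta) from the normal distribution: {f_normal:.2f}")
```

With these inputs the sketch reproduces the figures of 24 and 30 patients per arm given above.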
Next, Vaquero et al. suggest that the presentation of a P-value for the subgroup analysis of responders from Group A, compared with Groups B and C, has only "relative validity". The Kruskal-Wallis test statistic that we used, however, is based on the χ² statistic, which is a valid measure (7) for determining the significance level when the sample size for each of the groups being compared is 5 or greater. Our data met this criterion. The authors further recommend that we should have adjusted the P-values to account for multiple comparisons. Comparisons of VAS score among the three treatment groups were performed using ANOVA or the Kruskal-Wallis test, which inherently account for multiple comparisons. Rather than individually compare results from pairs of the treatment groups, we obtained the overall P-values for each time point and present the corresponding graphs for the reader's interpretation. We disagree with the implication of Vaquero et al. that multiple statistical methods were used to hunt for statistical significance. The tests used for all outcomes were well defined in the study protocol, which was submitted to and approved by our institutional human investigation review board prior to the start of the trial. We used and presented the ANOVA and Kruskal-Wallis tests for each time point separately. We could have evaluated the observations at all five time points in a single regression model, which would have obviated the need for multiple comparisons, but that would have altered the interpretation of the outcome measure. Applying the Bonferroni method of adjustment would have overcompensated for the potential inflation of P-values since this already conservative method assumes that the outcome measures are independent (8).
Whether a clinically important difference exists based on initial trial data is not always clear. P-values that are statistically significant are not always clinically significant. However, based upon the results of our study, we stopped using postoperative epidural analgesia in our institution for open gastric bypass surgery, when this was still the predominant mode of analgesic management. Instead, we have shifted to local wound infiltration for all open and laparoscopic bariatric procedures with good subsequent success from a clinical practice (not research) viewpoint.
The feasibility and appropriateness of subgroup analysis within clinical trials has been well recognized and debated in the scientific literature (9,10). As is frequently the case in clinical trials, we tested certain hypotheses that were clearly stated in advance of the trial, but then found some patterns in the data that we did not anticipate and that interested us. We ourselves, in our paper, called for "reexamination [of the findings] in a larger-scale replication study." In summary, we agree with Vaquero et al. that methodology is at the core of all meaningful science and look forward to their and others' reexamination of our findings in future studies.
References
1. Schumann R, Shikora S, Weiss JM, et al. A comparison of multimodal perioperative analgesia to epidural pain management after gastric bypass surgery. Anesth Analg 2003; 96: 469–74.
2. Farrar JT, Portenoy RK, Berlin JA, et al. Defining the clinically important difference in pain outcome measures. Pain 2000; 88: 287–94.
3. Farrar JT, Young JP, LaMoreaux L, et al. Clinical importance of changes in chronic pain intensity measured on an 11-point numerical pain rating scale. Pain 2001; 94: 149–58.
4. Cepeda MS, Africano JM, Polo R, et al. What decline in pain intensity is meaningful to patients with acute pain? In: Dostrovsky JO, Carr DB, Koltzenburg M, eds. Proceedings of the 10th World Congress on Pain, Progress in Pain Research and Management. Vol. 24. Seattle: IASP Press, 2003: 601–9.
5. Pocock SJ. The size of a clinical trial. In: Clinical trials: a practical approach. New York: John Wiley & Sons, 1983: 123–41.
6. Goudas LC, Carr DB, Bloch R, et al. Management of Cancer Pain. Evidence Report/Technology Assessment No. 35. (Prepared by the New England Medical Center Evidence-Based Practice Center under Contract No. 290-97-0019). AHRQ Publication No. 02-E002. Rockville, MD: Agency for Healthcare Research and Quality. October 2001.
7. Fisher LD, van Belle G. Biostatistics: a methodology for the health sciences. New York: John Wiley & Sons, 1993.
8. Rothman KJ. No adjustments are needed for multiple comparisons. Epidemiology 1990; 1: 43–6.
9. Oxman AD, Guyatt GH. A consumer's guide to subgroup analyses. Ann Intern Med 1992; 116: 78–84.
10. Assmann SF, Pocock SJ, Enos LE, Kasten LE. Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet 2000; 355: 1064–9.