Clinicians need a simple, fast, reliable, and inexpensive way of identifying the evidence base relevant to their clinical practice. It is often believed that the only way to identify all relevant evidence is to perform hand-searches of the literature to supplement computer searches; this is complex and labor intensive. However, most quality randomized controlled trials cited in systematic reviews in pain medicine are listed in computer databases. We performed two studies to investigate the efficiency (in terms of sensitivity, specificity, and precision) of three computer search strategies: the Optimally Sensitive Search Strategy, which is used by the Cochrane Collaboration; RCT.pt, a standard MEDLINE strategy; and DBRCT.af, which is a new single-line computer algorithm based on the assumption that double-blinded, randomized controlled trials would be indexed with "double-blind," "random," or variations of these terms in MEDLINE and EMBASE. DBRCT.af was found to be highly sensitive (97%) in identifying quality randomized controlled trials in pain medicine. The precision (ratio of randomized controlled trials to the number of nonrandomized trials identified) was 82%, and the specificity in excluding the nonrandomized controlled trials was 98%. We conclude that clinicians can now use DBRCT.af to update and conduct de novo systematic reviews in pain-relief research. IMPLICATIONS: Quality evidence about what is good clinical practice in pain treatment is buried in the medical literature among large quantities of other information. This article describes how any clinician with access to the Internet can identify those quality studies reliably, quickly, and inexpensively.
"It is surely a great criticism of our profession that we have not organized a critical summary, by specialty or subspecialty, adapted periodically, of all relevant randomized controlled trials." (1) Good medicine is based on scientific evidence. A comprehensive database of randomized controlled trials (RCTs) in medicine would provide the essential data for evaluation of current therapy and pinpoint areas in need of further research. The Oxford University Pain Relief Unit has created a database of RCTs in pain-relief medicine. This database was created by a refined MEDLINE search strategy from 1966 to 1990 and by an exhaustive hand-search of more than 1,000,000 pages from 40 biomedical journals, encompassing all the major anesthetic/analgesic journals published from 1950 to 1990 (2). This database appears in the Cochrane Library and is maintained by the Cochrane Pain, Palliative and Supportive Care Collaborative Group (3). The database is updated with electronic searches in MEDLINE, EMBASE, and the Cochrane Library by using a combination of free text words and subject heading terms. Additional reports are identified from reference lists of retrieved reports, review articles, and specialist textbooks. This database contains a large number of potential articles that constituted the starting point from which many systematic reviews have resulted. However, hand-searching is inefficient and is not feasible on an individual basis. The Oxford Unit concedes that more than 97% of the time spent by hand-searchers was in looking at irrelevant material (2). The large effort, time, and cost of this method restrict the method to large organizations, and even these find it difficult to find the resources to update reviews that have already been performed. This labor-intensive searching method has not changed significantly over the past two decades, in which advances in information technology have allowed unprecedented access to medical information. 
This, along with the increase in awareness of proper indexing of RCTs, suggests that it is time to explore the possibilities of computer search strategies as the sole method of identification of RCTs in the medical literature. However, reports on the sensitivities of computer strategies vary from 50% to 80% (4) compared with the methods involving hand-searching. It has also been found that the more sensitive the computer strategy is made, the less precise it becomes; estimated precisions range from 2% to 82% (4). Low precision leads to large numbers of irrelevant articles concealing a smaller number of articles that might contribute information to a review. These observations make it difficult to justify preparing systematic reviews with a computer strategy alone. These reports, however, do not take into account the quality of the RCTs. It has been well demonstrated that compared with double-blinded RCTs (DBRCTs), the treatment effects are exaggerated if the RCTs are unblinded or if the concealment of randomization is unclear or inadequate (5). It is generally accepted that systematic reviews should be conducted with high-quality RCTs. On the basis of this premise, it is not so important to identify all RCTs, but rather to identify all high-quality RCTs. There is evidence that most quality RCTs are listed in computer databases. Knipschild (6) conducted an exhaustive search for vitamin C-related RCTs and found 61 eligible trials, of which only 22 were computer listed. Interestingly, after grading the trials according to methodological soundness, Knipschild concluded that there were 15 good-quality trials and that all but one were computer listed. There is a similar pattern in systematic reviews pertaining to methods of pain relief. This has led to the investigations reported here on the efficiency of computer strategies in identifying these computer-listed RCTs.
Two studies were performed to define the sensitivity, precision, and specificity of three computer search strategies: the Optimally Sensitive Search Strategy (OSSS), which is used by the Cochrane Collaboration; RCT publication type (RCT.pt), which is a standard MEDLINE strategy; and DBRCTs in all fields (DBRCT.af), which is a new single-line computer algorithm based on the assumption that DBRCTs would be indexed with "double-blind," "random," or variations of these terms in MEDLINE and EMBASE.
Three computer strategies were used to identify RCTs in pain-relief research. The OSSS is a comprehensive, 29-line computer strategy (Table 1). It is contained in the Cochrane Collaboration reviewers' handbook (4,7). It was developed with the intention of casting a wide net to identify as many eligible RCTs as possible.
The United States National Library of Medicine introduced the MEDLINE field of RCT.pt in 1991 (Table 1) (8). The RCTs are identified by trained searchers and coded appropriately into MEDLINE. RCT.pt is the only computer search strategy that has been studied and described in print. RCT.pt is contained within the OSSS and DBRCT.af (detailed below) and contributes to the sensitivity of these two computer strategies when used in MEDLINE. The RCT.pt field does not exist in EMBASE, an important electronic resource that is often used to conduct systematic reviews. The advanced OVID search labels (Ovid Technologies© 2000–2001) within MEDLINE and EMBASE were used to develop a single-line computer algorithm (9–11): (double-blind$ or random$).af. This strategy assumes that DBRCTs will be indexed with "double-blind," "random," or variations of these terms. This algorithm (abbreviated to DBRCT.af) identifies specific terms with unlimited truncation (e.g., "double-blinded," "double-blinding," "randomly," "random allocation," "randomized," "randomised," "randomization," "randomisation," and "randomized controlled trial"). Conversely, DBRCT.af excludes study designs other than DBRCTs, RCTs, and double-blinded non-RCTs, because such studies should not be indexed with the above-mentioned terms. Additional strategies will need to be developed to identify these other studies, which may be better suited to inform about rare events, diagnosis, or prognosis. DBRCT.af identifies these terms in all fields (.af). The fields of most importance are the title, abstract, subject, and publication type. The simple program shown in Table 1 creates a database of eligible RCTs in MEDLINE and EMBASE. Lines 2 and 3 limit the database to human articles; the same limitation phrases were used in the OSSS. These three strategies were evaluated in two studies to determine their relative specificity, precision, and sensitivity in identifying RCTs in the field of pain medicine.
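The matching behavior of (double-blind$ or random$).af can be illustrated outside OVID with a short Python sketch. This is only an approximation under stated assumptions: the regular expression stands in for OVID's unlimited truncation ($), looping over every field of a record stands in for the all-fields suffix (.af), and the record layout and the function name `matches_dbrct_af` are hypothetical. OVID's own handling of hyphens, phrases, and field contents may differ.

```python
import re

# Assumed stand-in for the OVID expression (double-blind$ or random$).af.
# "\w*" mimics unlimited right-hand truncation ("$" in OVID syntax).
DBRCT_PATTERN = re.compile(r"\b(?:double-blind\w*|random\w*)", re.IGNORECASE)


def matches_dbrct_af(record: dict) -> bool:
    """Return True if any field of a (hypothetical) record contains a term."""
    # Searching every field approximates the ".af" (all fields) suffix.
    return any(DBRCT_PATTERN.search(str(value)) for value in record.values())


# Hypothetical records, invented purely for illustration.
records = [
    {"title": "Morphine after surgery", "abstract": "Patients were randomly allocated."},
    {"title": "A double-blinded comparison of two analgesics", "abstract": ""},
    {"title": "Case report of morphine overdose", "abstract": "A single patient."},
]

for record in records:
    print(record["title"], "->", matches_dbrct_af(record))
```

The third record is not retrieved because none of its fields contains either stem, which is the behavior the strategy relies on to exclude non-RCTs.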
Study 1: Morphine.af Study
Hard copies of the 1000 articles were obtained. Two authors (TKFC and ET) read the methods section to classify the articles as RCTs or non-RCTs. Once this was completed, the OSSS and DBRCT.af were applied to the database containing these articles. The specificity and precision of each of the methods were then calculated as shown in the example for DBRCT.af in Table 2 (Lines 5 and 6).
Study 2: Comparison with Published Reviews
Hard copies of the original systematic reviews and the referenced RCTs were obtained. The referenced articles were first categorized into those listed in MEDLINE and EMBASE and those that were not. The computer-listed RCTs were then checked to see whether they were contained in the databases generated by the respective computer search strategies: that is, in Line 29 of Table 1 and in Line 3 of Table 1. These three databases were stored as citations in Reference Manager and Microsoft Excel files.
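The sensitivity of each of the computer search strategies was calculated in two ways: in identifying all of the referenced RCTs, and in identifying only those RCTs listed in MEDLINE or EMBASE.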
Finally, analyses that contained more than three RCTs were extracted from each review. The methods of analysis used by the Oxford Unit to draw their conclusions were repeated with the RCTs identified by the three computer strategies to see whether the same conclusions could be drawn. Regarding the quantitative analyses, the following data were extracted: the clinical setting, treatment groups, patient numbers, duration, mode and dose of treatment, mean and derived pain-relief outcomes (TOTPAR, SPID, VAS-TOTPAR, VAS-SPID), and dichotomous pain-relief outcomes. By applying verified equations developed by McQuay and Moore (12), derived outcomes were calculated (50%maxTOTPAR) to facilitate calculations of relative risk, numbers needed to treat, and corresponding 95% confidence intervals by using a random-effects model. Statistical significance was assumed in the relative risk when the 95% confidence intervals did not include 1 and in the numbers needed to treat when the 95% confidence intervals did not overlap. Regarding the qualitative analyses, the following data were extracted: the clinical setting, treatment groups, patient numbers, duration, mode and dose of treatment, and pain outcomes. Trials were considered as positive or negative according to the original authors' statistical estimate, with P < 0.05 considered as statistically significant. The treatment effect was considered to be present when positive trials outnumbered the negative trials. No distinctions were made between listed and nonlisted RCTs in this evaluation. Analyses containing fewer than four RCTs were deemed inconclusive.
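The relative risk and number-needed-to-treat calculations mentioned above can be sketched for a single trial as follows. This is not the pooled random-effects analysis used in the reviews; it is a minimal illustration using the standard log-based interval for the relative risk and a Wald interval for the absolute risk difference. The function names and the example counts (30 of 50 versus 10 of 50 patients reaching the dichotomous outcome) are hypothetical.

```python
import math


def relative_risk_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Relative risk with an approximate 95% CI (log method), one trial."""
    rr = (events_t / n_t) / (events_c / n_c)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


def nnt_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Number needed to treat with an approximate 95% CI, one trial.

    The interval is only meaningful when the CI of the risk
    difference excludes zero.
    """
    p_t, p_c = events_t / n_t, events_c / n_c
    diff = p_t - p_c  # absolute difference in response rates
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    # Inverting the limits of the difference swaps their order.
    return 1 / diff, 1 / (diff + z * se), 1 / (diff - z * se)


# Hypothetical counts, for illustration only.
rr, rr_lo, rr_hi = relative_risk_ci(30, 50, 10, 50)
nnt, nnt_lo, nnt_hi = nnt_ci(30, 50, 10, 50)
print(f"RR {rr:.2f} (95% CI {rr_lo:.2f} to {rr_hi:.2f})")
print(f"NNT {nnt:.1f} (95% CI {nnt_lo:.1f} to {nnt_hi:.1f})")
```

Under the significance rule stated above, the relative risk in this invented example would count as significant because its interval does not include 1.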
Study 1: Morphine.af Study
A total of 36,780 morphine-related human articles published between 1980 and 1996 were identified by the morphine.af search of MEDLINE and EMBASE (Line 4; Table 2). The period of 1980–1996 was chosen to correspond with the Oxford morphine-related RCTs, and this set of morphine-related articles contained all of the 32 computer-listed RCTs analyzed in the 2 Oxford reviews that studied morphine RCTs (13,14). Morphine-related articles were listed in EMBASE and MEDLINE in a ratio of 4:1. This proportion was preserved in the experimental sample, with 800 articles selected randomly from EMBASE and 200 from MEDLINE. The random selection was evenly distributed, covering 2.7% (SE ± 0.1%) of the articles published per year. From the 1000 morphine-related articles, there were 97 RCTs and 903 non-RCTs (approximately 1 RCT per 10 articles identified; Table 3). The OSSS improved the precision to 32% (3 RCTs per 10 articles), and DBRCT.af improved the precision to 83% (8 RCTs per 10 articles). Both the OSSS and DBRCT.af were highly specific in excluding the non-RCTs. The OSSS excluded 705 (specificity, 78%), whereas DBRCT.af excluded 886 (specificity, 98%). Conversely, the OSSS failed to exclude 198 non-RCTs, compared with 17 for DBRCT.af. This translates into 91% less work if one uses DBRCT.af instead of the OSSS to identify RCTs for a systematic review, because DBRCT.af is better able to exclude non-RCTs. The precision and specificity of RCT.pt could not be compared with the OSSS or DBRCT.af because RCT.pt cannot be used in EMBASE and hence has a denominator of only 200 articles. Of the 17 non-RCTs that were incorrectly identified by DBRCT.af, 6 were reviews that discussed RCTs, 5 were double-blinded but nonrandomized trials, 4 were surveys that included a random sample, 1 was a letter that commented on an RCT, and 1 was a case study that was indexed with "double-blind procedure" in the key words section.
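The specificity figures and the 91% workload reduction can be checked directly from the counts reported above. The sketch below assumes the conventional definitions (specificity as non-RCTs excluded divided by all non-RCTs; precision as RCTs retrieved divided by all articles retrieved); the function and variable names are illustrative. The number of RCTs retrieved by each strategy is given in Table 3 rather than in the text, so precision is defined here but not evaluated.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of non-RCTs that a strategy correctly excludes."""
    return true_negatives / (true_negatives + false_positives)


def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of retrieved articles that are RCTs (conventional form)."""
    return true_positives / (true_positives + false_positives)


NON_RCTS_IN_SAMPLE = 903  # from the 1000-article sample

# Non-RCTs correctly excluded by each strategy, as reported in the text.
excluded = {"OSSS": 705, "DBRCT.af": 886}

for name, true_negatives in excluded.items():
    false_positives = NON_RCTS_IN_SAMPLE - true_negatives
    spec = specificity(true_negatives, false_positives)
    print(f"{name}: {false_positives} non-RCTs retained, specificity {spec:.0%}")

# Relative reduction in non-RCTs that must be read and excluded by hand.
workload_reduction = (198 - 17) / 198
print(f"Workload reduction with DBRCT.af versus OSSS: {workload_reduction:.0%}")
```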
Study 2: Comparison with Published Reviews
Fifteen reviews, nine pertaining to acute pain management (14–22), five pertaining to chronic pain management (19,23–26), and one pertaining to both (27), fulfilled the inclusion criteria (Table 4). Eight reviews were excluded: one non-peer-reviewed publication on the effectiveness of transcutaneous electrical nerve stimulation in chronic pain and two reviews that included nonrandomized and unpublished data (28,29), along with five unpublished reviews relating to epidural corticosteroids for sciatica, spinal cord stimulators for back pain, steroid injections for shoulder disorder, the use of dihydrocodeine in postoperative pain, and the use of topical capsaicin in chronic pain. A total of 288 clinical RCTs were referenced, of which 284 (99%) were listed in MEDLINE or EMBASE and 4 were unlisted (30–33). The sensitivity in identifying all RCTs was 98%, 95%, and 65% for the OSSS, DBRCT.af, and RCT.pt, respectively (Table 4). The sensitivity in identifying MEDLINE- and EMBASE-listed RCTs was 99.6%, 96%, and 65% for the OSSS, DBRCT.af, and RCT.pt, respectively (Table 4). The distribution of publication dates of those RCTs in consecutive 5-yr epochs is presented in Figure 1. This analysis shows that most of the RCTs used by the McQuay and Moore study (12) were contained within the more recent strata. The sensitivities of the 3 computer strategies in identifying computer-listed RCTs over the same 5-yr epochs are plotted in Figure 2; all three strategies improved with later epochs. It can be seen that the OSSS was 100% sensitive for publications after 1970 and that DBRCT.af was >95% sensitive for publications after 1980, whereas RCT.pt remained <90% sensitive for publications in all epochs.
From the 15 systematic reviews, there were 30 analyses containing more than 3 RCTs (Table 5): 17 quantitative and 13 qualitative analyses. DBRCT.af was 97% sensitive in identifying 236 of the 244 MEDLINE/EMBASE-listed RCTs and was 96% (236 of 246) sensitive in identifying all the RCTs used by McQuay and Moore (12) in their reviews (range, 75%–100%; mode, 100%; 21 of 30). When the same analyses performed by McQuay and Moore were applied to the RCTs identified by DBRCT.af, the conclusions drawn were not altered significantly. Twenty-one analyses (14 quantitative and 7 qualitative analyses) drew precisely the same conclusions because DBRCT.af identified all the RCTs analyzed by McQuay and Moore. The remaining three quantitative analyses drew the same conclusions with minor, insignificant differences in the calculated relative risks and numbers needed to treat (17, 17, and 18). All but one of the six remaining qualitative analyses drew the same conclusions with 85%–95% of the original patient numbers (2, 15, 22, 27, and 27). In the last qualitative analysis, DBRCT.af identified three of the four original RCTs, and, hence, the analysis was deemed inconclusive because of the small number of trials identified (20). The data of the three quantitative and six qualitative analyses were extensive and can be viewed on our Web site: http://www.med.monash.edu.au/anesthesia/.
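The two sensitivity figures for DBRCT.af differ only in their denominators, which a short calculation makes explicit. The sketch below assumes the usual definition of sensitivity (RCTs retrieved divided by RCTs that should have been retrieved) and uses only counts reported in the text; the function name is illustrative.

```python
def sensitivity(identified: int, total_relevant: int) -> float:
    """Fraction of the relevant RCTs that a strategy retrieves."""
    return identified / total_relevant


# DBRCT.af across the 30 analyses, counts as reported in the text.
listed_only = sensitivity(236, 244)  # denominator: MEDLINE/EMBASE-listed RCTs
all_rcts = sensitivity(236, 246)     # denominator: all RCTs in the analyses

# Share of all referenced RCTs that were computer listed at all.
listed_share = sensitivity(284, 288)

print(f"DBRCT.af, computer-listed RCTs: {listed_only:.0%}")
print(f"DBRCT.af, all RCTs:             {all_rcts:.0%}")
print(f"Referenced RCTs that are listed: {listed_share:.0%}")
```

Because only two RCTs in these analyses were not computer listed, the two denominators give nearly identical results.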
The OSSS identified all of the computer-listed RCTs included in the 30 analyses. As a result, 28 of the 30 analyses drew precisely the same conclusions. The remaining two analyses drew the same conclusions with 85% of the original patient numbers (Table 4) (23,27). There were insufficient data to reproduce the analyses with the RCTs identified by RCT.pt because of its poor sensitivity.
Study 1: Morphine.af Study
The first step in organizing a critical summary and periodic update of pain-relief research is to identify the relevant RCTs. Ideally, this should be done quickly and with high sensitivity and specificity. Morphine-related articles were used to test the precision and specificity of computer search strategies because morphine is one of the oldest and most commonly used analgesics worldwide. It is the "gold standard" with which all analgesics are compared, and there are many scientific publications pertaining to morphine. This study showed that 90% of the published morphine-related articles are non-RCTs. It would require a large amount of effort, time, and cost to exclude these by hand before a systematic review. To identify all eligible RCTs, it is necessary to read all of the extracted articles to exclude the non-RCTs. Computers are far more efficient in performing such monotonous tasks if the compilation of the databases can be relied on for correct classification. The efficiency of search strategies can be measured by calculating specificity and precision. Specificity (the ability to exclude non-RCTs) is a better measure than precision (the ratio of RCTs and non-RCTs identified). Although both measures depend on the number of non-RCTs identified, the number of RCTs identified, regardless of their quality, confounds the calculated precision. For example, if one calculates precision with all eligible RCTs as the numerator (as is the usual practice), then a strategy that identifies more poor-quality RCTs will appear more precise. Conversely, if one calculates precision with the high-quality RCTs that are normally included in the systematic reviews as the numerator, then all strategies are imprecise. To the best of our knowledge, the specificity of a search strategy has never been reported, because the number of articles excluded has not been reported.
However, it is the low specificity that prevents clinicians with limited resources from performing systematic reviews. Even institutions with large resources are mostly reluctant to update reviews already published, because so much labor, time, and cost is required in separating the quality trials that can go into the analysis from those that need to be excluded. The specificity of DBRCT.af (98%) translates into substantial savings. It cost our department approximately $A10,000 to obtain the 903 non-RCTs identified in this study, to read them, and to exclude them by hand. Using the OSSS reduced this to approximately $A2,000 to obtain the 198 non-RCTs identified by that strategy, and this was reduced further to approximately $A200 for the 17 non-RCTs caught by using DBRCT.af. The effort and time spent to obtain, read, and exclude these articles are in the same proportions. Remembering that these estimates pertain to a random sample consisting of only 2.7% of the 36,780 morphine-related articles, the actual savings are considerable. Having made these savings, it is important to know whether the results of the search identify enough quality RCTs, such that the subsequent systematic review can come to a meaningful conclusion. This was addressed by comparing the results of the computer searches with established high-quality systematic reviews in pain medicine.
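To give a rough sense of what the sample costs imply for the full set of 36,780 articles, the figures above can be scaled up. This is a sketch only: it assumes that cost scales linearly with the number of articles, which the text does not state, and the resulting amounts are illustrative extrapolations rather than reported results. All names in the code are hypothetical.

```python
SAMPLE_SIZE = 1_000
TOTAL_ARTICLES = 36_780

# Approximate costs ($A) of excluding non-RCTs in the sample, from the text.
sample_costs = {"hand exclusion": 10_000, "OSSS": 2_000, "DBRCT.af": 200}

# Assumption: cost grows in proportion to the number of articles searched.
scale = TOTAL_ARTICLES / SAMPLE_SIZE


def extrapolate(sample_cost: float) -> float:
    """Scale a sample cost to the full article set (linear assumption)."""
    return sample_cost * scale


print(f"Sample fraction: {SAMPLE_SIZE / TOTAL_ARTICLES:.1%}")
for method, cost in sample_costs.items():
    print(f"{method}: about $A{extrapolate(cost):,.0f} for all articles")
```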
Study 2: Comparison with Published Reviews
The results of this study showed that most computer-listed RCTs were identified solely by computer strategies such as the OSSS (99.6%) and DBRCT.af (96%), but not with RCT.pt (65%). The OSSS and DBRCT.af were sufficiently sensitive, such that when the RCTs identified by them were used to reproduce the Oxford analyses, the conclusions drawn were representative of the conclusions published by the Oxford Unit. This is a very important result and leads one to ask whether it is necessary to include all RCTs to conduct a systematic review. It is true that when one conducts a clinical trial, only a sample of the entire patient population is examined. Scientific principles of randomization and blinding reduce bias sufficiently so that the studied sample is representative of the entire population. In conducting systematic reviews, it may be sufficient to accept a representative sample of RCTs rather than to strive to capture and include all of these. The results from this study provide support for this suggestion. An important trend in the indexing of RCTs over the past 30 years was revealed in Figure 2, when the sensitivities of the three computer search strategies were compared. The sensitivity of RCT.pt from 1966 to 1997 showed the efforts by the US National Library of Medicine and the Cochrane Collaboration in retrospectively indexing MEDLINE articles. In essence, the sensitivity of RCT.pt was static from 1971 to 1985 and improved thereafter. In comparison, the sensitivity of DBRCT.af progressively increased from 1966 to 1980 and remained more than 95% thereafter (Fig. 2). Although RCT.pt is contained in DBRCT.af (and the OSSS) and contributes to its sensitivity, it is clear that RCT.pt did not contribute to the progressive increase in sensitivity over the 1974 to 1985 period. This period can be thought of as the "pre-Cochrane era," that is, before Cochrane criticized the medical profession for not using RCTs effectively.
The progressive increase was more likely because authors who conducted RCTs recognized the need to index their publications with the appropriate terms. The DBRCT.af strategy was developed with the assumption that DBRCTs would be indexed with these terms. Despite the efforts in retrospective indexing, the sensitivity of RCT.pt did not reach 90%. In part, this may be because RCT.pt is limited to searching MEDLINE. It is also likely that retrospective indexing does not identify all RCTs. Other publication-type strategies, such as controlled clinical trial.pt and clinical trial.pt, were not investigated (8). These two strategies are similar to RCT.pt in that they rely on trained searchers to code the articles appropriately. Unlike RCT.pt, however, they are not contained in DBRCT.af. Because DBRCT.af was highly sensitive and specific, it did not seem relevant to investigate these two strategies. In conclusion, the two studies reported in this article demonstrate that the OSSS and DBRCT.af are highly efficient compared with the Oxford Strategy. Their routine use can lead to considerable savings in effort, time, and cost. In light of these results, we recommend that these computer strategies be used to periodically update the 15 Oxford reviews. The consistent efficiency demonstrated in these quality reviews of wide-ranging topics suggests that this may apply throughout the specialty of pain-relief research. We also believe that a sufficient number of quality RCTs can be identified by the OSSS or DBRCT.af to conduct de novo systematic reviews that give reasonably robust answers on the efficacy of pain-relief treatments. In considering poorly reported data, such as adverse events, one needs to be cautious in drawing conclusions with limited information. A reasonable approach may be to search with DBRCT.af and progress to more extensive strategies if more information is required.
Further work is required to confirm that the conclusions drawn by using computer strategies alone are representative of the true clinical picture. The application of DBRCT.af in medical specialties other than pain relief would require validation. In ophthalmology, for example, the term "masked" is preferred over "blinded." In such a specialty, DBRCT.af would rely on the term "random" or variations thereof to identify DBRCTs. In contrast to the OSSS, DBRCT.af was designed not to include all possible terms. By keeping DBRCT.af simple, we intended that the message to improve the indexing of articles would filter through as researchers read and conduct reviews with DBRCT.af.
We acknowledge the Oxford University Pain Relief Unit for conducting the systematic reviews, without which this article would not be possible. We thank OVID Technology for their technical assistance. We also thank Professor Chris Silagy (since deceased) and Steve McDonald of the Institute of Public Health & Human Services Research, Monash University, Monash Medical Centre, for their invaluable support and advice.