The Israeli Board Examination Committee in Anesthesiology, The Israel Center for Medical Simulation (MSR), Sheba Medical Center, Tel Hashomer, Israel; Tel Aviv University Sackler School of Medicine and the National Institute for Testing & Evaluation, Jerusalem, Israel
Address correspondence and reprint requests to Haim Berkenstadt MD, Director of Neuroanesthesia, Department of Anesthesiology and Intensive Care, Deputy Director, The Israel Center for Medical Simulation, Sheba Medical Center, Tel Hashomer, Israel. Address e-mail to berken{at}netvision.net.il.
| Abstract |
|---|
| Introduction |
|---|
In an attempt to assess to what extent residents in anesthesiology are competent in the "shows how" stage, some institutions have introduced the use of high-fidelity medical simulation. The inter-rater reliability (4) and construct validity (5,6) of simulation-based scenarios have been demonstrated, and a multiinstitutional study has validated simulation-based scenarios as an effective tool for the evaluation of residents (7).
Nevertheless, an international survey found that only 7% to 14% of simulation centers are using advanced simulation for competence evaluation (8), and there have been no reports of the implementation of simulation-based performance assessment in high-stakes board examinations in anesthesiology. Worldwide, board examinations in anesthesiology are predominantly based on the traditional paradigm of oral examinations and multiple-choice questionnaires. For the most part, these examinations evaluate the cognitive aspects of the profession but cannot appraise performance and practical skills.
Acknowledging the fact that the Israeli Board Examination in anesthesiology lacked a performance evaluation element and that this element had not been a substantial part of training programs in the country, the Israeli Board of Anesthesiology Examination Committee (IBAEC) decided to explore the potential of adding a simulation-based objective structured clinical examination (OSCE) component to the Board Examination process. We describe the unique process whereby OSCE has been incorporated into the Israeli board examination in anesthesiology.
| Methods |
|---|
The content of the examination was defined and developed according to the steps and criteria recently described by Newble (9). First, given the relative benefits of medical simulation and the capabilities of the available simulation platforms in anesthesia, clinical conditions that residents nearing the end of their training are required to handle competently were defined on the basis of expert opinion. The second step of the process involved the definition of tasks for each of the clinical conditions. The tasks were selected to represent minimum requirements, which were decided upon on the basis of more than 80% consensus among the members of the examination task force using a variation of the Delphi technique (7).
In the third step of the process, the tasks were incorporated into five 15-min hands-on simulation-based examination stations in the OSCE format:
1. Trauma management: An emergency room environment and Sim-Man (Laerdal, Stavanger, Norway) simulator were used and examinees were expected to evaluate and treat a trauma casualty according to advanced trauma life support (ATLS) guidelines.
2. Resuscitation: As in the trauma station, using a Sim-Man (Laerdal, Stavanger, Norway) simulator, examinees were expected to evaluate and treat a patient according to advanced cardiac life-support (ACLS) guidelines.
3. Operating room crisis management: A full-scale simulated operating room and the High Fidelity Patient Simulator (METI, Gainesville, FL) were used. In the scenario, examinees were called into the operating room to help a junior resident encountering a problem during anesthesia; for example, hypertension after the induction of general anesthesia.
4. Mechanical ventilation: Using a ventilator and an artificial lung, the examinees were asked to adjust the mechanical ventilation in response to changes in lung compliance or the results of arterial blood gases. In the Israeli medical system, these tasks are performed by physicians, not by respiratory therapists.
5. Regional anesthesia: Using a standardized/simulated patient (a role-playing actor) the examinee had to demonstrate familiarity with the relevant surface anatomy, place of needle insertion, needle direction, and amount of local anesthetics injected while performing regional anesthesia block. Complications induced by the procedure, including convulsions and pain during the injection of local anesthetics, were also demonstrated by the actor.
The Scenarios
Two alternative 15-min scenarios were developed for each examination, and their difficulty was compared during analysis of the examination results. The scenarios were piloted on junior attending anesthesiologists before their implementation in the actual examination. To overcome the limitations of the simulation platforms, the "patient's" verbal responses were conveyed, if necessary, by the examiners, and information on the quality of breath sound and/or arterial pulse was provided by the examiners according to the scenario scripts and the examinees' actions. For example, information on neck vein distension was given only if the examinee declared, "I am looking for neck vein distension." A sample excerpt from a scenario is presented in Appendix 1.
The assessment of examinee performance in each scenario was based on a checklist comprising 12 to 20 items (example in Appendix 1). The checklist was developed according to a rigid format, on the basis of "done"/"not done," although a set of criteria for a well-performed task was provided to include a measure of quality. The checklist included tasks and sequencing of performance but no non-technical skills. All actions included in the checklist were weighted equally, except for critical actions. During the examination, two independent examiners completed each checklist, and the examination committee collected the data for further analysis. Examinees received a "pass" score on the scenario if they successfully performed 70% of the station's checklist items, including all critical actions/items. Examiners were also asked to grade the examinees' decision making and situational awareness, as well as their manual abilities, independently and holistically on a scale of 1 to 4, with 1 indicating insufficient performance and 4 indicating excellent performance.
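The pass criterion described above can be illustrated with a short sketch. It is not the committee's scoring software: the `ChecklistItem` class, the `station_passed` function, and the item labels are hypothetical names introduced here, and the sketch assumes that "70% of the station's checklist items" means at least 70% of all items scored "done" on a single examiner's checklist.

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    name: str               # hypothetical item label
    done: bool              # binary "done"/"not done" score
    critical: bool = False  # critical actions must all be performed


def station_passed(items, threshold=0.70):
    """Pass when at least `threshold` of the items are done and no critical item is missed."""
    if not items:
        raise ValueError("checklist is empty")
    proportion_done = sum(item.done for item in items) / len(items)
    all_critical_done = all(item.done for item in items if item.critical)
    return proportion_done >= threshold and all_critical_done


# Illustrative checklist (invented items, not taken from the examination).
example = [
    ChecklistItem("address the patient", True),
    ChecklistItem("give oxygen", True, critical=True),
    ChecklistItem("request 12-lead ECG", True),
    ChecklistItem("request chest radiograph", False),
    ChecklistItem("examine neck veins", True),
]
print(station_passed(example))
```

Under this reading, a missed critical item fails the station even when the overall proportion of completed items is high.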
Examinee Orientation and Preparation
Information on the examination and the list of tasks were sent to the examinees 6 wk in advance of the examination. Standardized and structured exposure of examinees to the examination environment, the simulation devices, and scenarios was conducted 3 to 4 wk before the examination. This orientation took place at the examination site (The Israel Center for Medical Simulation), lasted 2 h, and offered each examinee hands-on practice in a simulated demonstration scenario. In addition, errors that occurred frequently during test periods were reported to the directors of all residency programs immediately after every test period and to the candidates 6 wk before the next test period.
Preparation of Examiners: Training of Raters
The raters were senior anesthesiologists, recommended by their department chairs and selected by the examination committee. Most of the raters did not have any previous experience in medical simulation but received standardized and structured exposure to the examination environment, the simulation devices, scenarios, and the assessment checklists before the examination. In addition, raters were instructed to help the examinees by providing them with relevant information on findings during physical examinations and helping them to prepare medication or perform resuscitation. Help or information was provided only when the request was specific and clear. Such help was restricted to information regarding the patient's symptoms and certain physical examination variables that were not self-evident from the mannequin (i.e., breath sounds, arterial pulsation).
Evolution of the Examination:
The examination has been administered 4 times: April 2003 (34 examinees), October 2003 (26 examinees), April 2004 (21 examinees), and September 2004 (23 examinees). New scenarios were developed for each of the administrations, and the tasks incorporated in each scenario were retained without major changes.
The first two periods of the examination (April and October 2003) were defined as an interim period. In this period, 2 domains of the examination, trauma management and resuscitation, made up 20% of the final board examination pass/fail results, whereas the other 80% were based on the traditional oral examination. The results from the other three domains were used for supporting a pass/fail decision, mainly in cases of borderline results on the oral part of the examination.
Subsequent to this interim period, passing the simulation-based OSCE examination became a prerequisite of the traditional oral board examination. In the third test period (April 2004), candidates were examined in 4 of 5 domains, and in the last test period (September 2004) all 5 clinical domains were included. In the last test period, the candidate failed an examination domain if both examiners evaluated the holistic performance as insufficient, regardless of the checklist score.
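The decision rule used in the last test period can be sketched as follows. This is an illustration under stated assumptions rather than the committee's procedure: the function name `domain_failed` is hypothetical, each examiner is assumed to give one holistic rating on the 1-to-4 scale, and a rating of 1 is taken to mean "insufficient," as defined for the holistic scale in the Methods.

```python
INSUFFICIENT = 1  # lowest point of the 1-to-4 holistic scale


def domain_failed(checklist_passed, holistic_a, holistic_b):
    """Fail the domain when the checklist criterion is not met, or when
    both examiners rate holistic performance as insufficient."""
    for rating in (holistic_a, holistic_b):
        if rating not in (1, 2, 3, 4):
            raise ValueError("holistic ratings must be integers from 1 to 4")
    both_insufficient = holistic_a == INSUFFICIENT and holistic_b == INSUFFICIENT
    return both_insufficient or not checklist_passed


# A candidate who meets the checklist criterion still fails the domain
# when both examiners judge the holistic performance insufficient.
print(domain_failed(True, 1, 1))
print(domain_failed(True, 1, 2))
```

The sketch makes explicit that the holistic override applies only when the two examiners agree.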
The checklists completed by the examiners were analyzed using SAS software (SAS, Cary, NC). For each item in each of the scenarios, the error rate, incongruence rate and mean difficulty were calculated. For each examinee and for each scenario, the proportion correct value was calculated for all items included in the checklist (Total), for the critical items included in the checklist (Critical), and for the general evaluation (Mean general).
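The per-item statistics can be illustrated with a minimal sketch. The analysis itself was run in SAS and the exact definitions are not given in the text, so the definitions below are assumptions: difficulty is taken as the mean proportion of "done" scores across both examiners, error rate as its complement, and incongruence rate as the proportion of examinees on whom the two examiners disagreed. The function name `item_statistics` is hypothetical.

```python
def item_statistics(rater_a, rater_b):
    """Summarize one checklist item.

    rater_a, rater_b: lists of 0/1 scores ("not done"/"done"),
    one entry per examinee, in the same examinee order.
    """
    if not rater_a or len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same, non-empty set of examinees")
    n = len(rater_a)
    pairs = list(zip(rater_a, rater_b))
    # Assumed definition: mean proportion of "done" over both examiners.
    difficulty = sum(a + b for a, b in pairs) / (2 * n)
    # Assumed definition: share of examinees scored differently by the two examiners.
    incongruence_rate = sum(a != b for a, b in pairs) / n
    return {
        "difficulty": difficulty,
        "error_rate": 1 - difficulty,
        "incongruence_rate": incongruence_rate,
    }


# Invented scores for four examinees on a single item.
print(item_statistics([1, 1, 0, 1], [1, 0, 0, 1]))
```

An item with a high incongruence rate would be a candidate for a clearer definition of accepted performance.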
Spearman's correlation coefficients were calculated among the proportion of correct items across all items, the proportion of correct critical items, and the global rating. The inter-correlations among the five simulation stations for the proportion of correct scores and global rating scores were calculated as well. The correlation between the OSCE component and the results of the oral board examination was assessed for the third and fourth test periods. The internal consistency of the scenarios was assessed using Cronbach's α statistic. An overall Kappa (inter-rater agreement coefficient) was also calculated.
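For readers unfamiliar with these reliability measures, the following sketch shows how Cronbach's α and a kappa coefficient can be computed with the Python standard library. It is not the SAS code used for the analysis. The text does not state which kappa variant was used; the sketch assumes Cohen's kappa for two raters, and the function names and example data are hypothetical.

```python
from collections import Counter
from statistics import variance


def cronbach_alpha(scores):
    """scores: one row per examinee, each row a list of k item (or station) scores."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two examinees")
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on categorical scores."""
    if not rater_a or len(rater_a) != len(rater_b):
        raise ValueError("both raters must score the same, non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Invented data: three examinees scored on two items, and two raters' pass/fail calls.
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
print(cohen_kappa([1, 1, 0, 0], [1, 0, 1, 0]))
```

Sample variances are used throughout for α; because the formula depends only on a ratio of variances, using population variances consistently would give the same value.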
The examinees completed feedback questionnaires regarding the difficulty of each of the scenarios and their subjective ability to express their knowledge in comparison with conventional oral examinations. The distribution of answers was calculated.
| Results |
|---|
For the 104 examinees who participated in the 4 test periods, the correlation between the total examination score and the score for the critical items was 0.54 (P < 0.001). The correlation between the total score and global rating was 0.76 (P < 0.001) and between the critical items and the global rating it was 0.48 (P < 0.001).
The inter-rater correlations for total, critical, and global scores were 0.80, 0.81, and 0.75, respectively. The overall inter-rater Kappa agreement coefficients were 0.71, 0.76, and 0.62 for the total, critical, and global scores, respectively. The inter-correlations among the 5 OSCE examination stations were significant (P < 0.01) only between trauma and ventilation for the total score (r = 0.31; n = 63) and between resuscitation and regional or operating room for the global score (r = 0.42 and 0.29; n = 64 and 104, respectively). The internal consistency of the scenarios assessed using Cronbach's α coefficient was fairly low (0.35 to 0.45 for 4 stations from different examination terms).
For the fourth examination period (n = 17), the correlation between the total simulative OSCE examination score or the mean OSCE global rating and the success rate in each of the 8 different subjects of the oral board examination did not reach statistical significance.
According to the subjective questionnaire, most participants found the examination station difficulty level reasonable or difficult (Fig. 1a) and most of them preferred this method to a conventional oral examination (Fig. 1b).
| Discussion |
|---|
A major part of the examination development process was the standardized preparation of the examiners. The involvement of simulation experts inspired the decision to involve the examiners as participants in the simulation instead of leaving the rating to observers. The decision was based on the experience that rating performed by standardized patients is as good as faculty rating (11) and was supported by the examiners, who preferred to be actively involved and helped to compensate for the strict checklist instructions.
Unlike prior investigations (12), we describe an examination process and not a prospective controlled study. As a consequence, the results presented are influenced by the number of participants in each of the examination terms, the need for more than one scenario in each term and the need to change the scenarios from term to term. The limited number of participants in the examination, together with the other confounding variables, led to a low reliability estimate (not including inter-rater reliability), which was accepted by IBAEC as an inherent part of the examination. Reliability is a measure of the reproducibility or consistency of a test and represents the consistency of candidates' performance within and across cases. To assess this variable more accurately, generic tasks from various clinical conditions and scenarios should be incorporated, data from more examination periods should be collected, and the number of stations should be increased.
Even with these statistical limitations, psychometric evaluation conducted by experts from the National Institute for Testing and Evaluation was a major contributor to the objectivity of the examination and demonstrates the value of cooperation with experts in this field. For example, the difficulty of the various scenarios was assessed to ensure "inter-scenario" reliability within a given domain, which is critically important in high-stakes examinations. Moreover, incongruencies between examiners highlighted inadequate definition of accepted performance, leading to improvements in the checklists for future examinations. The incidence of common mistakes made during the examination was calculated, and this information was shared with the chairs of the training programs and the examinees themselves.
Other valuable psychometric information included the correlation between the different assessment variables, the inter-correlation between the stations, and the correlation between the OSCE-based examination and the conventional oral board examination. The correlation between the proportion of correct items across all items included in the checklist and the global rating (0.72) supported the conclusion that the evaluation techniques are similar but not identical. Similar correlations between a checklist scoring system and global scoring system have been demonstrated in anesthesia-related scenarios (11). The low inter-correlations among the 5 stations support the conclusion that this examination has a limited degree of generalizability. This limitation might be overcome by increasing the number of stations but will also increase the cost of the examination process.
Comparison of success rates in the newly developed OSCE and the conventional oral board examination demonstrated a low correlation between the 2 modalities. Previous publications described the correlation between OSCE and written knowledge tests to be as high as 0.72 (13). Others documented a correlation of 0.19 between simulation-based assessment and written tests (14) and, in residents in anesthesiology, correlations of 0.37, 0.44, and 0.47 between simulation-based assessment and faculty assessment, written examinations, and mock oral examinations, respectively (7). Thus, these mediocre correlations are commensurate with the assertion that different examination modalities assess different aspects of performance and yet are related to each other. Recently, the European Board Vascular Examination incorporated different complementary assessment modalities, including evaluation of technical skills using a model of the sapheno-femoral junction, clinical case analysis, scientific evaluation of a publication, and overall clinical experience (15).
The model presented in this manuscript is not necessarily feasible for implementation in other countries. The large number of candidates that the American Board of Anesthesiology examines annually renders OSCE logistically problematic. However, the National Board of Medical Examiners in the United States has recently introduced the simulation-based clinical skills examination (16), reflecting a major shift in the field of medical accreditation and licensure toward acknowledging the crucial role of performance assessment as an important component of professional accreditation.
The process of incorporating OSCE-driven modalities in the certification of anesthesiologists in Israel is still incomplete, and evaluation and assessment are ongoing. We hope that this new format of examination will play formative (training) and summative (testing) roles, involving the anesthesiology board, anesthesia departments, the participating examiners, and the examinees. The examination development process induced a critical appraisal of the current training and assessment paradigm and led to exploration, definition, and prioritization of the critical clinical skills expected from a residency graduate. The examination also provided a rare glimpse at the authentic products of Israeli residencies, highlighting areas of strength and weakness that could serve as guidelines for future modifications in the residency curriculum and practice.
The present process may evolve in the future not only as a constructive form of feedback for residency programs and means of establishing simulation-based training as part of the national residency curriculum but also toward the adoption of full-scale simulation-based accreditation. Future changes might also include the assessment of communication skills, leadership, and teamwork in the operating room (17).
The authors thank all members of the Israeli Board Examination Committee in Anesthesiology and members of the Task Force of the Israeli Board Examination Committee in Anesthesiology.
| Appendix 1: Example of a Resuscitation Scenario Component and its Checklist |
|---|
The examiner plays a junior intern in the emergency room who calls on the anesthesiologist for help. The patient is a 70-yr-old male with known heart failure admitted to the emergency room (ER) with shortness of breath.
The Simulator
Sim-Man simulator, in the sitting position, is connected to a monitor including noninvasive arterial blood pressure, electrocardiogram (ECG) and pulse oximeter. One IV line is inserted; no oxygen is given.
Scenario
Upon the anesthesiologist's (the examinee's) arrival at the ER, the examiner introduces him/herself as a junior intern and the other examiner as an ER nurse. The examiner gives the anesthesiologist the patient's medical history and refers him/her to the monitor. On the monitor: heart rate 120 bpm, oxygen saturation 90%, and arterial blood pressure, measured 10 min previously, at 150/100. Respiratory rate is 26.
Expected Performance and Examiner Response:
1. Address the patient: the patient responds and complains of shortness of breath.
2. Physical examination: the examiner will provide the information that the neck veins are dilated and rales are present on both lung fields, in accordance with the anesthesiologist's performance.
3. Give oxygen: the "nurse" (an examiner) will give oxygen and set the oxygen flow on request.
4. Ask for 12-lead ECG and chest radiograph: The examiner will give the anesthesiologist both items on request and ask for interpretation.
Part of a real checklist used by the examiners: scoring is binary according to the instruction given in the "comments."
| Footnotes |
|---|
| References |
|---|