On April 6, 2009 the National Registry of EMTs (NREMT) was informed by the training division of the Washington DC Fire Department of a possible examination compromise by some of their personnel occurring at the College of Southern Maryland in LaPlata, MD. The NREMT immediately contacted Pearson VUE, the NREMT examination delivery contractor, and an investigation was undertaken by all parties. This report summarizes the outcome of the investigation jointly conducted by the DC Fire Department, the National Registry of EMTs, Pearson VUE, and the College of Southern Maryland.
Certification and licensure in Emergency Medical Services (EMS) are designed to protect the public by assuring practice by only those possessing the knowledge, skills, and abilities to deliver safe and effective care. Proper patient care is the responsibility of every entity that has a role in the delivery of EMS. In Washington DC, this requires coordinated and cooperative oversight and enforcement by many agencies, including the Washington DC Department of Health, the city administrators and Council members, the National Registry of EMTs and, most importantly, those who deliver the service: the DC Fire Department, its administration and medical direction, and the individual caregivers themselves, the firefighters. When one link in this chain of responsibility fails, citizens of a community will question the effective delivery of EMS. It is not the intent of this report to analyze past events or past delivery of EMS in the District of Columbia but to focus on an investigation regarding allegations of "cheating" on the National Registry of EMTs cognitive (computer-based) examination. The authors of this report, the staff of the National Registry of EMTs, will first focus on the investigation completed, then report the findings and explain how computer adaptive testing provides security, uses data forensics, and employs other processes to assure that only valid certification outcomes are reported to licensing agencies.
It is important for the public to know this investigation was initiated by the District of Columbia Fire Department. Great credit should be bestowed upon them for bringing this to the attention of the NREMT. The DC Fire Department, under the leadership of Assistant Chief Milton Douglas of the Internal Affairs Department, completed a thorough investigation that resulted in a lengthy report. The District of Columbia Police Department investigated every rumor spread about any candidate suspected of being involved and those candidates were subsequently interviewed by police officers. Among the activities accomplished by the police officers were:
- Visits to the test center to interview test proctors and review the physical testing facility to better understand the process and security measures taken by the College of Southern Maryland.
- The interviewing of 16 firefighters who had knowledge of the "cheating" allegations and/or were reported by others to have been "cheating" on the NREMT examination.
- Coordinating findings with Pearson VUE Examination Security Officials and NREMT Officials.
- Completion of a lengthy report outlining specifics of all activities accomplished during the investigation.
Here are a few direct quotes from the report:
- "Rumors of cheating on the NREMT exam and discovery of the "cheat sheet" in a Training Academy textbook resulted in this investigation."
- The "cheat sheet" is basically just a sheet with notes on it. (NREMT staff who are familiar with all items in the test bank confirmed these notes were not directly connected to NREMT items and were not an attempt to reconstruct test items).
- "This investigation sought to determine if any illicit usage of a "cheat sheet" or any other materials occurred by DCFEMS (District of Columbia Fire/EMS) members sitting for the NREMT exam, particularly at the LaPlata testing site. No direct evidence of any such illicit use was uncovered." This was confirmed by the separate investigations conducted by Pearson VUE and the NREMT. Although some "cards" were found in December of 2008, these cards were found behind a specific computer after the test center closed. Only one candidate was seated at that computer during the entire day, and that candidate failed that attempt of the examination. The cards were destroyed by the computer center and never mailed to the NREMT for review. It is clear that some irregular activity did occur at this center, but we were unable to detect any evidence that passing scores were obtained by cheating.
- "It is the finding of this investigation that no widespread cheating occurred by DCFEMS members on the NREMT exam administered by Pearson VUE at the LaPlata Testing Center on the Campus of the College of Southern Maryland."
Simultaneously, while the DC Police Department was conducting its investigation, the NREMT and Pearson VUE also conducted separate and independent investigations. Pearson VUE completed the following activities and found these outcomes:
- All authorized test proctors at the College of Southern Maryland were interviewed. No irregular activity was reported other than finding "note cards" during clean-up following one day's administration of examinations in December of 2008 as mentioned above.
- Signatures of all DCFEMS members who registered and took more than one test were reviewed. The handwritten signatures for the first, second, and any subsequent attempt by a candidate matched all other attempts by the same candidate.
- Pearson VUE candidates, who took tests that were not EMS related, were surveyed. No person reported seeing any irregular behavior by DCFEMS personnel, any proctor, or any other person at the test center.
- All testing procedures at the LaPlata test center were reviewed in accordance with Pearson VUE requirements. No irregular behavior or violations of test procedures were found.
Pearson VUE and the NREMT jointly reviewed examination data regarding all candidates from DCFEMS as well as other EMS candidates who tested at either LaPlata or other Pearson test centers near or around the Washington DC area. This review included:
- Pass/fail results of all attempts of the examination taken by DCFEMS members and others at the LaPlata test center and comparison of those with national and regional test center results. It was determined the pass/fail outcomes of DCFEMS personnel who tested at LaPlata indicated a higher failure rate on the examination when compared to DCFEMS personnel testing at other locations. If efforts to "cheat" at the LaPlata center did occur, the results indicated the efforts failed.
- Mean ability scores for all candidates from DCFEMS who took the examination. (The mean ability score is the best statistic for comparing a candidate's performance on the test from one version or attempt to another.) There were no dramatic changes in mean ability scores except for one candidate from DCFEMS, who will be discussed later in this report.
- Test behaviors of all candidates from DCFEMS who took the examination. (Test behavior refers to the number of seconds it takes a candidate to choose an answer for any individual item on the test.) Test behavior should be "normal": if the national average time to answer a specific item is 33 seconds, then candidates who fall more than a standard deviation from that time per item can be identified and investigated. Time per item can be compared among DCFEMS members, between DCFEMS members and candidates from other departments in the DC area, region, state, or nation, between versions of the test banks, and between individual attempts on the examination. Thus time per item can be compared across first, second, and subsequent administrations of the examination by candidate, test center, education program, state, or nation. When the time per item is greatly exaggerated by a candidate on one item compared with the thousands of others who answered the same item, an assumption can be made that some irregular behavior could have occurred when that candidate answered the item. Perhaps the candidate was looking up the answer via some outside source, e.g., finding it on a cheat sheet or in a textbook smuggled into the test center; perhaps the candidate fell asleep (which we suspect happens when one spends up to 1,300 seconds to answer a single item). After reviewing tens of thousands of spreadsheet result lines for all DCFEMS members and representative samples from others across the nation, only one DCFEMS candidate was determined to have irregular test behavior. This is the same candidate who had an unusual improvement in mean ability score and will be discussed later in this report.
- The NREMT, in conjunction with Pearson VUE, surveyed all EMS testing candidates who tested at LaPlata during the same period as DCFEMS personnel. No candidate for EMS certification from departments that were not part of the DCFEMS reported seeing any irregular behavior at the LaPlata test center.
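The time-per-item screening described in the review above can be sketched as a simple outlier check: for each item, flag any candidate whose response time falls more than a chosen number of standard deviations above that item's mean. This is a minimal illustration in Python; the records, names, and one-standard-deviation threshold are assumptions drawn from the description, not the actual Pearson VUE/NREMT forensics tooling.

```python
from statistics import mean, stdev

# Hypothetical response-time records: (candidate_id, item_id, seconds).
# The 1,300-second response mirrors the extreme case mentioned in the report.
responses = [
    ("A", "item1", 31), ("B", "item1", 35), ("C", "item1", 33),
    ("D", "item1", 30), ("E", "item1", 1300),
]

def flag_outliers(responses, n_sds=1.0):
    """Flag responses whose time exceeds the item mean by more than n_sds SDs."""
    by_item = {}
    for cand, item, secs in responses:
        by_item.setdefault(item, []).append((cand, secs))
    flagged = []
    for item, recs in by_item.items():
        times = [s for _, s in recs]
        mu, sd = mean(times), stdev(times)
        for cand, secs in recs:
            if secs > mu + n_sds * sd:
                flagged.append((cand, item, secs))
    return flagged

print(flag_outliers(responses))   # candidate "E" is flagged on item1
```

A real analysis would apply the same grouping-and-thresholding along each dimension the report mentions: by attempt, test center, education program, state, or the nation.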
A close review of the single candidate who demonstrated irregular test behavior and a rise in mean ability level was conducted with the following outcomes:
- The candidate took the examination four times in order to pass. On the first attempt the candidate's average item response time was 83 seconds. Eighty-three seconds per item is outside the normal time and indicates irregular test behavior, such as slow reading, unfamiliarity with the content, or a lack of confidence in answers. The candidate failed the examination at the minimum number of items (i.e., did poorly enough that a fail decision was reached with the shortest possible test).
- The candidate took the examination a second time. On the second attempt the candidate was subjected to an entirely new test bank of items. (Each test bank has 1,100 live items for scoring and 400 pilot items). The average time the candidate took for each item was 38 seconds, (near the national norm). The candidate again failed the examination over an entirely new test bank.
- The candidate took the examination for a third attempt. Again, because the candidate waited a long time between attempts, the NREMT had already switched the entire test bank again so this candidate had no opportunity to see the same items that were on the first and second attempts. During this attempt the candidate's item response time was within the national norm. The candidate failed the third attempt.
- The candidate took the examination for a fourth attempt and passed. During this attempt the candidate's test pool was the same as the one used for the third attempt, but every item seen on the third attempt was "masked" and not presented to the candidate on the fourth attempt. During this attempt the candidate took 84 seconds per item, again outside the national norm, and passed at the minimum test length. Research was conducted regarding the candidate's test preparation and individual item response times. Although the candidate's mean ability level rose substantially on the fourth attempt, it was determined that, to prepare for that attempt, the candidate had repeated the entire EMT course rather than completing just the required refresher course. It was concluded that the candidate's ability level rose substantially for two reasons: first, the candidate learned what was required of an EMT by retaking the entire course; second, the candidate likely learned that it does not matter how fast one completes a computer adaptive test (CAT), but it does matter how one answers the individual test items needed to rise above the entry-level standard, the pass/fail score. There were no irregular time intervals between items, such as one item being answered in 32 seconds and the next taking 140 seconds; the candidate's overall time per item was simply slower than normal.
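The "masking" behavior described above, where every item seen on a prior attempt is withheld from the next one, amounts to filtering the candidate's eligible item pool. A minimal sketch, with hypothetical item IDs and a toy pool size standing in for the 1,500-item bank:

```python
# Illustrative sketch of item "masking" on a retake: items the candidate
# saw on a previous attempt are excluded from the pool for the next attempt.
def mask_pool(full_pool, seen_items):
    """Return the items still eligible to be presented to this candidate."""
    seen = set(seen_items)
    return [item for item in full_pool if item not in seen]

pool = [f"item{i}" for i in range(1, 11)]        # toy stand-in for the bank
seen_on_attempt_3 = ["item2", "item5", "item9"]  # hypothetical prior exposure
eligible = mask_pool(pool, seen_on_attempt_3)
print(len(eligible))   # 7 items remain available for attempt 4
```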
The NREMT reported to the Fire Chief of the Washington DC Fire Department that all scores reported on DCFEMS members were valid.
The cost of this investigation completed by Pearson VUE, the NREMT, and the DC Fire Department was substantial. Many firefighters came under suspicion; rumors harmed their reputations; they had to be interviewed as part of the investigation; and the episode was embarrassing. In the end, nothing was found to substantiate the rumors, either through direct police interviews of those involved or through computer adaptive testing data forensics. More substantial evidence should have been obtained before this investigation was launched and became public. However, once the allegations of "cheating" were released in the Washington DC press and seen on television in the District area, the investigation and its expenses became necessary. The NREMT was confident that the allegations would prove to be false because it is much more difficult to compromise a pool of items than to compromise a single printed test form. The following characteristics of a computer adaptive test (CAT) made the NREMT confident that changing outcomes by "cheating" is much more difficult to accomplish:
- An adaptive test is unique for each individual based upon the candidate's computer calculated ability level as the test progresses. No two candidates receive the same examination. Asking questions or looking at another person's computer screen does not help because every candidate has a unique test.
- When a candidate repeats an examination due to failure every item seen by the candidate on a previous attempt is "masked," or not presented to the candidate again.
- Security at examination sites is well maintained by Pearson VUE. At Pearson operated sites video cameras are in place to record candidate action.
- A test pool contains 1,500 test questions; items are administered to each candidate based upon ability, so the sequence is unique to each candidate. Memorization of that many items is highly unlikely.
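The characteristics above can be illustrated with a toy adaptive test: the next item is chosen to match the candidate's running ability estimate, items already seen are excluded, and the path through the bank therefore differs for every candidate. The item difficulties, the fixed-step ability update, and the answer sequence below are all illustrative assumptions; the NREMT's actual item-selection and scoring algorithms are not described in this report.

```python
# Toy sketch of computer adaptive testing: pick the unseen item whose
# difficulty is closest to the current ability estimate, then nudge the
# estimate up on a correct answer and down on a miss.
def next_item(items, ability, seen):
    """Select the unseen item best matched to the current ability estimate."""
    unseen = [i for i in items if i not in seen]
    return min(unseen, key=lambda i: abs(items[i] - ability))

items = {"easy": -1.0, "medium": 0.0, "hard": 1.0, "harder": 2.0}
ability, seen = 0.0, set()
for correct in (True, True, False):   # a hypothetical answer sequence
    item = next_item(items, ability, seen)
    seen.add(item)
    ability += 0.5 if correct else -0.5   # crude fixed-step update

print(sorted(seen), round(ability, 1))
```

Because the item sequence depends on each candidate's own answer history, two candidates sitting side by side see different tests, which is why copying from a neighbor's screen does not help.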
The NREMT moved from pencil-paper examinations to a computer adaptive testing model in 2007 because of both perceived and real test security issues experienced in EMS testing. A CAT examination is expensive to support: it requires thousands of calibrated test items to build a bank and thousands of candidates in order to gather new item statistics. States that use pencil-paper examinations or linear computer-based examinations delivered on demand (when candidates want to take the examination) cannot attest to the validity of scores on their EMS examinations because of security issues. Data forensics similar to those conducted above cannot be duplicated for pencil-paper or linear computer-based examinations. If the goal is to protect the public, then those who issue licenses and/or allow practice should utilize testing approaches that ensure the highest level of security and validity, like a computer adaptive testing (CAT) model.
The NREMT concluded that only valid scores were presented to the DC Health Department for candidates applying for licensure as DCFEMS members. Cooperation between Pearson VUE, the DC Fire Department Internal Affairs office, and the NREMT during this investigation demonstrated the shared desire to present only valid scores for licensure. We could find no credible evidence of cheating that improved a test score. We will continue to encourage personnel to inform the NREMT in cases where suspected cheating may have influenced a test score, and, as with any investigation, this one will be re-opened if credible evidence is presented. The NREMT remains interested in any credible evidence regarding cheating on any NREMT-delivered examination. Using computer adaptive testing provides enhanced investigative techniques not available when pencil-paper or linear-based examinations are utilized. The NREMT wants to present only psychometrically sound, legally defensible, and valid scores for licensure and certification.