Evaluating the quality of clinical research is challenging. For some, reading a research article in a scholarly journal is like trying to read a foreign language. As such, many EMS providers rely on expert summaries, or they blindly agree with what their colleagues say. News flash: Sometimes the internet is wrong, and sometimes “experts” are just trying to sell you something. This article will teach you seven essential questions to ask when reading about cutting-edge advances in EMS. Decide for yourself if the hottest research should change your practice.
A Fictitious Example
How often do you read a headline like “Wonder Drug has been clinically proven to reduce mortality when given to blunt or penetrating trauma patients within six hours of injury”? Should your department start carrying Wonder Drug? How do you separate truth from hype?
Understanding clinical research is the answer. Gone are the days when the oldest or loudest person in the room dictates patient care with statements like, “Because we’ve always done it that way,” or, “I used that on a patient once and it worked great.” Research enables us to determine, objectively, what’s best for our patients and how to advance clinical practice.
However, the reality is that not all research is created equal.
Imagine you’re reading an article about the newest Wonder Drug for trauma. You’re going to ask seven questions to determine if this is a practice-changing article or something that belongs in the trash. For each question, two opposing sentences are provided as examples: one indicates high-quality research (✔) and one indicates low-quality research (✘).
Question 1: Are you reading a peer-reviewed primary research article?
✔ A recent article in the New England Journal of Medicine suggests Wonder Drug helps stop bleeding after trauma.
✘ In my expert opinion, Wonder Drug truly is the greatest thing since sliced bread.
JEMS is a great way to hear about cutting-edge advances in EMS, but you need to understand it’s an example of secondary literature, which means you’re reading a summary or someone’s opinion. If you’re thinking about changing your practice, you need to read the original article describing the research, called the primary literature. The editors of JEMS ensure every article contains references that cite research and studies, so it’s easy for you to go back and read the primary literature. The difference is similar to reading the CliffsNotes version of a book vs. reading the actual book.
Even more important, however, is the issue of peer review. A peer-reviewed article contains information that was vetted by a group of scientists who weren’t involved in the research. This process is similar to being judged by a jury of your peers in a court of law. It tends to separate the truth from the hype. Primary research articles are peer-reviewed; secondary literature generally isn’t.
In the above examples, the first is telling you where to find the primary article. PubMed (www.pubmed.gov) is an easily searchable online index of primary peer-reviewed research articles, as well as review articles published in secondary literature, supported by the National Institutes of Health. PubMed will also find similar articles on the topic. Unfortunately, articles by “experts” hoping to sway you to buy their product sometimes neglect to tell you about articles that shine a negative light on it, so look around for alternative points of view.
Some of the articles indexed on PubMed only give you limited access without a subscription to the journal that published the article. If you conduct your search through a library, you should be able to use their institutional subscription to access it. (If you have trouble, check with the reference librarian.) Alternatively, ask your medical director to get you the article, or contact the EMS division at the closest university medical center.
Now that you have your hands on the primary research article, you can begin to assess its quality.
Question 2: Is there a control group and an intervention group of patients?
✔ We compared the in-hospital mortality rates of patients who got Wonder Drug to those who didn’t.
✘ We looked at in-hospital mortality rates of the first 50 patients given Wonder Drug by our department.
A fundamental aspect of high-quality research is that there’s a comparison between at least two groups of patients. In the first example, one group received standard medical care plus Wonder Drug (i.e., the intervention group), and the other group didn’t get Wonder Drug, but did receive standard medical care (i.e., the control group). The only difference between the two groups was Wonder Drug. Thus, if the intervention group does better or worse, that’s likely because of the effect of Wonder Drug.
Now look at the second example. There’s only one group! If those first 50 patients all survived, was that because of Wonder Drug? What if all 50 patients died? We have no way of knowing what occurred specifically due to the administration of Wonder Drug. Maybe they all lived because standard medical care is excellent. Maybe they all died because their traumatic injuries were so severe. Ultimately, we can’t say anything with certainty. This research design is called a “case series,” and it shouldn’t change your practice.
Question 3: Is the control group similar to the intervention group?
✔ We randomly assigned half of all patients to receive Wonder Drug, while the other half were given a placebo.
✘ We used injury severity scoring to match patients who received Wonder Drug to those who didn’t.
The gold standard in clinical research is the randomized controlled trial. Patients randomly get either Wonder Drug (the intervention) or a placebo (the control). (See Figure 1.) This powerful technique ensures that the two groups were identical up until the moment of randomization. However, this research design is complex, expensive and takes many years to complete. As such, these studies are unfortunately uncommon in EMS.
Figure 1: Wonder Drug randomized controlled trial group distribution
An easier way to study Wonder Drug is to conduct an observational study. Like the randomized controlled trial, there are two groups, with one receiving Wonder Drug and one not. However, scientists no longer randomly assign the intervention and the control. They simply “observe” what happens. At first glance, that seems like no big deal. However, sicker patients are simultaneously more likely to require multiple interventions, including Wonder Drug, and more likely to die. Now if the Wonder Drug group does poorly, was that because of Wonder Drug or because those patients were sicker? If they do better, was that because of Wonder Drug or because of another intervention?
Scientists try to fix this problem by using complex statistics that compensate for these group imbalances, such as injury severity scoring. However, it’s far from perfect. For example, a gunshot wound to the heart gets the same score as a car crash with a shattered kidney. Do you think those are the same injury? Ultimately, this leaves you with some residual doubt regarding the effect of Wonder Drug.
Question 4: Are the data from a high-quality, dependable source?
✔ Upon arrival, paramedics were asked to fill out forms indicating signs of shock and time of drug administration relative to injury time.
✘ We reviewed the past two years of EMS run sheets to determine signs of shock and time of drug administration relative to injury time.
Just because data exist doesn’t mean the data are correct. Your EMS run sheets are probably perfect, but what about your partner’s run sheets? Have you ever seen an alert and oriented patient who had a pulse oximeter reading of 20 and a respiratory rate of 99? Can you say “typo”? These innocent errors can wreak havoc on clinical research. Resources and procedures dedicated to collecting the exact data needed for the research are best. Run sheets are notoriously inaccurate because they weren’t designed to collect the precise information needed for clinical research.
Data should come from a high-quality and dependable source, but determining what’s high quality can be challenging. For example, even the National EMS Information System (NEMSIS), a database often used for research purposes, is based on run sheets. Thus, the NEMSIS user manual acknowledges that data may be biased due to inconsistencies in how clinical variables are measured and reported.
The central issue here is whether the data are collected prospectively or retrospectively. Prospective means that the research project was designed, and then scientists started collecting data specifically for the purposes of that research. Retrospective means the data already existed for some other purpose, and the project was designed as an afterthought. Demand prospective data if you’re thinking of changing your practice.
Question 5: Are you convinced the intervention caused the outcome?
✔ Trauma mortality was 10% at 1 g of Wonder Drug, 7% at 2 g, and 4% at 3 g.
✘ Trauma mortality decreased in patients given Wonder Drug with MegaClot Gauze and Hemorrhage Pants.
There’s a big difference between association and causation. Association means two things are related; causation means that one thing actually caused the other to occur. What you really want to know is causation, but often we can only demonstrate association.
Here’s an example: Patients transported from the scene of a motor vehicle collision by helicopter are more likely to die than patients transported by ambulance. However, do you think the helicopter and its personnel actually caused the patient to die? Hopefully not! It was probably the severity of the car crash and the patient’s initial injuries that caused the death. Helicopters are just associated with severe car crashes.
This distinction is critically important. Let’s say you incorrectly assume that helicopters cause death. If you want to reduce the number of deaths, then a logical step would be to never dispatch a helicopter. However, if you work an hour away from the nearest trauma center, what do you think will happen to the number of deaths? The number will probably go up—the exact opposite of what you wanted to occur. Good luck explaining that to your chief.
Back to Wonder Drug and question No. 5. The first example sentence is demonstrating causation. Specifically, it’s showing a dose-response relationship: As the dose of Wonder Drug goes up, mortality goes down. That’s strong evidence that Wonder Drug is directly causing fewer people to die. What about the second example? Wonder Drug seems to be associated with fewer deaths, but did it cause fewer deaths? Perhaps it was the MegaClot Gauze, or maybe the Hemorrhage Pants. We can’t know for sure, and that’s a problem. If you can come up with an alternative explanation for the results, you probably shouldn’t change your practice.
Question 6: Will your patients do better if you change your practice?
✔ Trauma mortality decreased from 10% to 4% in patients who received Wonder Drug (p < 0.001).
✘ Volume of IV fluid bolus required was significantly lower in patients who received Wonder Drug (p < 0.001, absolute difference 6%).
At first glance, this sounds like an obvious question, but the issue can get quite tricky. Both examples demonstrate a statistically significant improvement in patient outcomes because of Wonder Drug. That’s what “p < 0.001” tells you: The improvement is very unlikely to be due to chance alone. Both examples also demonstrate a 6% absolute change in their outcome. However, which outcome differed between the two groups? In the first example, mortality decreased by 6%. Impressive! More patients survived. In the second example, the volume of fluid bolus required decreased by 6%. If your patient still dies, do you think anyone will care that you gave a smaller fluid bolus?
Just because an outcome is statistically significant doesn’t mean it’s clinically significant. In the second example, the difference is legitimate, but relatively pointless. Operational issues aside, if you’re going to change your practice, the improvement should be something your patient actually cares about, like not dying.
Question 7: Are the study patients similar to your patients?
✔ Patients of all ages who suffered either blunt or penetrating trauma by any mechanism were included in this study.
✘ Elderly women who suffered blunt trauma from motor vehicle collisions as restrained rear-seat passengers in luxury cars were included in this study.
The first example has a broad range of patients who are likely similar to your patients. The technical term for that is “generalizable.” The second example, however, is demonstrating selection bias. The study results are skewed because only certain patients got Wonder Drug, namely grandmothers with multiple comorbidities. Thus, can you apply the results of this study to the gang member with a gunshot wound? How about the young driver of an economy car who was unrestrained? Probably not.
That being said, it’s critical to note that the answer to this final question depends entirely on where you practice. No one can answer this question for you. If you’re a paramedic who practices near six retirement communities in Florida, then the second example may actually be more applicable to you than the first.
This issue gets even trickier when evaluating non-EMS research. What if Wonder Drug was only given to patients after admission to the intensive care unit in the first example? Are those hospitalized patients similar to your prehospital patients? The answer to that question is clear as mud.
How often is your interest piqued by a cutting-edge device, drug, technique or idea that’s reported to be bigger, better, faster and shinier? Don’t be convinced by a 140-character internet conclusion!
Challenge yourself to read the primary article and decide for yourself. Ensure the research was peer-reviewed, compared at least two groups of patients that were similar to each other, used high-quality data, and demonstrated that the intervention caused an outcome that matters to patients who are similar to the patients you treat in your community.
If you fully understood that last sentence, congratulations! You’re ready to evaluate clinical research.