Many thanks to Liam and Matt for hosting this blog on what many would argue to be a pretty boring / difficult topic! Hopefully it will provide a slightly different perspective on some of the difficulties in being ‘Evidence Based’.
Firstly, before we get started, a bit of background about me, and a disclaimer!
I am currently working as a rotational MSK physiotherapist within the NHS in Liverpool and have been doing this for a number of years. I have developed a strong interest in critical thinking and how this can be utilised in both consuming research and implementing potential findings into clinical practice. I have recently taken to Instagram to spout my thoughts out loud and save my girlfriend and colleagues from relentless physio talk; if you are like-minded, come visit me @jmortonphysio. However, whilst I comment on issues in research, I am currently not actively involved in conducting it (I’m working on that). Therefore, instead of personal experience, this blog features information I have found invaluable from individuals who do have vast experience in these ideas, and commentary within the research base itself. I hope you get something from it!
‘Evidence based practice’ is a concept that has been drummed into all of us through our education and training to become physiotherapists. It’s also come in pretty handy as a buzzword when going for interviews at all levels, but what actually is evidence based practice? Has it always been a thing? And how easy is it to actually be ‘evidence based’? I was always led to believe that to be evidence based is to integrate research findings from papers following the hierarchy of evidence. That is, systematic reviews and meta-analyses hold the most value, and case reports hold the lowest value, given that there is no control of the potentially many confounding factors and sources of bias.
Seems pretty simple, right? I have a clinical question about whether treatment X would be helpful, I have a quick google and find a systematic review that says there are positive effects. I give treatment X and BOOM! Evidence based practice.
Life was a lot simpler when this was the case.
To truly understand evidence based practice, you have to understand why this became a thing in the first place. Surprisingly, it wasn’t until the mid-1980s that it started to gain traction, when an American physician, the late Dr. David Sackett (considered the father of evidence based medicine) released a book, ‘Clinical Epidemiology: A Basic Science for Clinical Medicine’. It was over the next 15 or so years that this idea of using research to guide practice began to take hold in healthcare systems around the world. Before this, the model for healthcare was mostly ‘expert’ or ‘authority based medicine’, which I’m sure we can all appreciate was likely heavily influenced by the bias and personal beliefs / egos of senior clinicians. People back then may well have received the best available care, but equally, they may not have. I’m sure in some places today, authority based medicine is still practiced under the guise of evidence based medicine… or is that being too cynical?
(I want to be clear that I am not having a dig at how medicine was practiced, back then there was nowhere near the volume of research we have today, or perhaps more importantly, the accessibility to it. Let alone the ease of communicating with colleagues at the touch of a button).
In 1992, Guyatt et al released a paper entitled ‘Evidence Based Medicine: A New Approach to Teaching the Practice of Medicine’, which might be considered the official start of the paradigm by some. It took off like wildfire. Suddenly all levels of clinicians had the ability to critically reason treatments, provide better advice to patients and have the ammunition to challenge senior colleagues should their recommendations not fit with what relevant research had suggested.
Evidence based practice is now widely accepted by most healthcare professionals, physiotherapists included, as the best paradigm within which to practice (a paradigm being a thought process or pattern within which to work). But a number of concerns have been raised about it in recent years. Has it lost its way? After nearly 30 years, shouldn’t we be a lot further along than we are? Shouldn’t it be as simple as what I described above, a quick google and then applying the trusted findings into practice?
So why should you keep reading?
This blog will aim to explore two key areas related to evidence based practice, both of them relating to research only (we all know that evidence based practice does go beyond what is printed in an article). The first area will briefly explore which research actually asks the important questions, and whether our practice is actually based on these articles or not.
The second will briefly explore more meta-science, or the science of science, which I assure you is much more interesting and important than it might sound, and I believe is crucial to understand!
By the end of this blog, I would be happy if you have learned something about potential issues in extrapolating research findings, and are maybe a little bit closer to being a medical conservative.
**Sidenote** I am fully aware conservative will be an offensive word to some, so let me give a definition. ‘A conservative is someone who stands athwart history, yelling Stop, at a time when no one is inclined to do so, or to have much patience with those who so urge it.’ (Mandrola et al, 2019)
What does the research base consist of?
Let’s start off with a shocking statistic. In 2012, Smith et al commented on a feature that the British Medical Journal used to run entitled ‘Clinical Evidence’, which demonstrated that out of 3000 NHS treatments, only 11% were found to be clearly beneficial! As if that wasn’t shocking enough, 50% were found to be of unknown effectiveness! Just think of the millions of pounds that are being spent on procedures that might not actually do what they are thought to do, based on poor quality evidence!
Does this fill you with confidence when you go to see a healthcare professional? Are you receiving efficacious treatment? (This is different to effectiveness, which we will explore in a bit.) I worry that too much is reliant upon non-specific effects which, by definition, cannot be relied upon to provide positive outcomes. Unfortunately there is probably no easy solution for this in the short term.
What about focusing more on our interest area of musculoskeletal conditions? This year, Harris et al conducted a search for randomised controlled trials that compared common surgical procedures to non-operative treatment or sham surgery.
Now, it’s important to understand the difference between these two research questions. Comparing a surgery to non-operative treatment can only comment on treatment effectiveness; if results come back in favour of surgery, it does not prove that it is the actual surgical procedure that was the reason someone got better. This is because a whole other host of factors come into play such as the ritual of going to see a surgeon (does this validate someone’s concerns about their problems?), the power of surgical placebo and the potential increased co-operation with post-operative rehabilitation, to name only a few.
The more interesting question is how does the surgery compare to sham surgery? In this question, all patients receive exactly the same type of care except for whether or not they undergo the surgical procedure AFTER the incisions have been made. This can be referred to as exploring the efficacy of a treatment, i.e. is it actually the treatment that helps.
Out of the 6734 randomised controlled trials that this research group found, only 64 asked the question of whether true surgery was better than sham surgery. That’s 1%.
Only 1% of all available surgical trials were designed to answer the question of causality by comparing doing the surgical procedure with not doing it. This alone is a cause for concern but, frustratingly, it isn’t the biggest concern.
So, of this 1% of trials, guess how many showed a clear benefit for surgery? 14%.
To me, that’s crazy! That is 86% that showed the actual surgical procedure WASN’T the mediating factor for recovery. Just let that sink in for a minute before reading any further.
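For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the figures quoted above from Harris et al. The variable names are my own, and because this post doesn't give the actual number of sham trials that favoured surgery, only percentages are calculated.

```python
# Figures quoted in this post from Harris et al (2020).
total_trials = 6734          # randomised controlled trials found by the search
sham_controlled = 64         # trials comparing true surgery with sham surgery
favoured_surgery_pct = 14    # % of sham-controlled trials clearly favouring surgery

# Share of all trials that asked the sham (efficacy) question.
sham_share_pct = 100 * sham_controlled / total_trials

# Share of sham-controlled trials that did not clearly favour surgery.
no_clear_benefit_pct = 100 - favoured_surgery_pct

print(f"Sham-controlled share of trials: {sham_share_pct:.2f}%")
print(f"Sham trials without a clear benefit for surgery: {no_clear_benefit_pct}%")
```

The first figure comes out just under 1%, which is where the rounded ‘1%’ above comes from.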
This is troubling to me for a few reasons. One of the biggest is that currently our whole healthcare system for musculoskeletal pain is set up in a biomedical framework. I find it frustrating that patients can sometimes see other interventions such as those employed within physiotherapy as something to ‘fail’ before they can get ‘gold standard’ care of surgery. How are we meant to succeed in explaining pain and providing ‘evidence based practice’, within a biopsychosocial model when ‘the next step in treatment’ is completely biomedical in nature and not even well supported by high quality research?
Additionally, the economic burden of this is huge. Let’s take a look at subacromial decompression as an example, which is perhaps one of the most ‘debunked’ orthopaedic procedures (possibly behind meniscectomy for degenerative meniscal tears). It has been shown that it’s not the actual shaving of the inferior surface of the acromion that helps people. In England alone (never mind the other parts of the UK), this procedure rose in popularity by 91% in 2016/7 compared to 2007/8 with approximately 52 per 100,000 people having the procedure. The NHS forks out for around 30,000 procedures a year at a cost of roughly £125 million per year.
Again, let that sink in. £125 million a year for a procedure that doesn’t actually do what it was originally hypothesised to.
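To put that in per-patient terms, here is a quick back-of-envelope sketch in Python. It only divides the two rough figures quoted above, so the result is an implied average rather than an official NHS tariff, and the variable names are my own.

```python
# Rough annual figures quoted above for subacromial decompression in England.
annual_cost_gbp = 125_000_000   # approximate yearly spend
procedures_per_year = 30_000    # approximate yearly number of procedures

# Implied average cost per procedure (an estimate, not a published tariff).
cost_per_procedure = annual_cost_gbp / procedures_per_year

print(f"Implied average cost per procedure: £{cost_per_procedure:,.0f}")
```

That works out at a little over £4,000 per procedure on these rough numbers.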
But Jeff, this isn’t physiotherapy so, whilst interesting, doesn’t really apply to us. Okay, but let me ask you this, do you think physiotherapy research is a) as well funded as orthopaedics? or b) as powerful (in terms of participant numbers or background theory) as orthopaedics?
The Harris paper is great for shock value, and really underlines the issues with the availability of high quality research. Within the physiotherapy / MSK rehabilitation world, there are numerous examples of practices that gained huge popularity without the science backing them up. Here are only a few examples:
- ‘Core stability’ for low back pain.
  - Hopefully everyone who is reading this will be off this bandwagon by now. But this all started from some EMG research in the late 90s that showed a tiny delay in transversus abdominis activation in people with low back pain. Cue the next 15 years of core stability research and all people with low back pain being given bridges, TVA activations and bird dogs. (Please read Smith et al’s 2014 systematic review on core stability for low back pain for one of the most concise and clear conclusions you will ever need to read about this topic)
- Vastus Medialis Oblique strengthening for patellofemoral pain
  - This theory came off the back of biomechanical research investigating the ‘Q-angle’, suggesting that if you could improve the strength of the VMO, it would reduce compression at the lateral patella facet and hence reduce pain. Now, people can get better with quadriceps strengthening when they have patellofemoral pain, but it definitely isn’t because they’ve isolated their VMO!
- Isometrics for analgesic effects in tendinopathy
  - This hype was only five years ago. A paper on patellar tendinopathy by Ebonie Rio (Rio et al, 2015) was a huge hit with physios across the world, and many started giving isometrics for immediate pain relief. The hype came from people adopting the findings as fact rather than considering the aim of the research paper.
‘But these are quite old now, it wouldn’t happen these days’
What about the Acute:Chronic Workload Ratio (ACWR) which (questionably) aims to quantify training load to help avoid injury? This has had some recent, serious questions raised about the basic science of it (Impellizzeri et al, 2020), yet it is already in the International Olympic Committee guidelines and consensus!
What about blood flow restriction training? Is there actually high quality research that looks at medium and long term outcomes to support its benefit? I’m not suggesting that it might not be useful, but is the research base strong enough to warrant splashing out on fancy equipment or having it feature as a big part of treatments?
Right, so the first hiccup in evidence based practice within the MSK world is that we just don’t have a vast amount of research that asks the important questions in the right way. This is a strong argument for adopting more of a medical conservative approach.
‘The medical conservative recognizes that many developments promoted as medical advances offer, at best, marginal benefits. And adopts new therapies when the benefit is clear and the evidence strong and unbiased’ Mandrola et al (2019) – well worth a read!
This might seem confusing, especially when physiotherapy has less at stake than say, a disease modifying medication. However, I suspect that a lot of the time, new or fancy modalities can distract the therapist’s focus from the aspects of treatment that we know are beneficial. Some of these aspects being:
- Actually listening and being interested in a patient’s story. Do you have time for this if you have a number of different interventions to fit into a treatment session?
- Building a solid rapport. Depending on patient expectations, some modalities might enhance this; however, it’s often built from history taking.
- Providing advice and guidance on how to lead a more active lifestyle with achievable goals.
- Providing up to date education and allowing the patient to address concerns that they have.
- Actually running through the entire exercise plan with patients so they can violate expectations in the presence of their therapist.
- Signposting to services to address other health concerns e.g. smoking cessation and having a clear explanation as to why this is important.
This isn’t to say other things don’t have a place, but in my opinion, the basics done well should always come first. Especially for someone like me, who has 30 minute appointments in a busy NHS clinic. How many of our ‘fancy treatments’ actually have specific effects? Some have called them ‘theatrical placebos’ and I might be inclined to agree with that. But coming back to high value care, using a theatrical placebo might not be the worst thing in the world as long as the basics have been done well before it! Maybe that’s a topic for another time…
So what’s this about meta-science then?
What about if you do actually find a research article that explores the question you are asking from a clinical perspective? How do you know it’s asking the question in the best way, and is being truthful with the results?
For the first part of that question, I direct you to the Steven Kamper series ‘Evidence in Practice’ in the JOSPT which excellently aims to provide insight into how to appraise literature in no more than 2 pages per piece. Alternatively if you don’t have time for this, I have summarised each piece in an Instagram post over on my page @jmortonphysio
The second part to this question, how can you trust research, may be a new idea for some. Unfortunately there is a reproducibility crisis across many areas of science with physiotherapy being a part of this. A part of this problem is thought to be due to P-Hacking and HARKing.
Imagine you are a researcher, you have a theory that a particular intervention might be beneficial for a certain patient group. You devote a huge amount of time to investigating the effects of this treatment, perhaps a company or institution have provided funding because they share your hypothesis. It comes down to crunching the numbers and… f**k, it hasn’t reached statistical significance.
But it’s close! Maybe if you added a few more participants it would reach it and you could prove that this intervention is worthwhile! (If we temporarily ignore clinical meaningfulness for now)
So to put this in an example, say you collected 20 participants and your P-Value was at 0.08 (close to 0.05) with a trend towards it becoming more significant. You add 3 more participants one at a time until you reach the magic 0.05 number which reaches statistical significance! SUCCESS! Right? Well, maybe not.
Are you just arbitrarily stopping at 23 participants? What if another participant has a negative response, which would push the P-Value back above 0.05 so that the result is suddenly no longer statistically significant?
This is an example of P-Hacking. It might seem innocuous enough, it might even make sense to some people, I must admit I didn’t get it straight away. But this is actually manipulating data to try and reach a statistically significant result.
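To see why adding participants until you hit significance is a problem, here is a minimal simulation sketch in Python. It is my own illustration under simplifying assumptions, not something taken from the papers cited: the treatment truly does nothing, change scores are drawn from a normal distribution with a known standard deviation of 1 (so a simple z-test can be used), and the function names are hypothetical. It compares analysing once at 20 participants against peeking again after each of up to 3 extra participants.

```python
import math
import random


def two_sided_p(z):
    """Two-sided P-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_sims=20000, n_start=20, n_extra=3, alpha=0.05, seed=1):
    """Return (false positive rate with one planned analysis,
    false positive rate when peeking after every extra participant)."""
    rng = random.Random(seed)
    fixed_hits = 0
    peek_hits = 0
    for _ in range(n_sims):
        # Change scores with a true mean of 0: the treatment has no effect.
        scores = [rng.gauss(0, 1) for _ in range(n_start + n_extra)]
        total = sum(scores[:n_start])
        n = n_start
        # z = mean / (sd / sqrt(n)) with sd = 1 simplifies to total / sqrt(n).
        if two_sided_p(total / math.sqrt(n)) < alpha:
            fixed_hits += 1
            peek_hits += 1
            continue
        # Not significant yet: add one participant at a time and re-test.
        for extra in scores[n_start:]:
            total += extra
            n += 1
            if two_sided_p(total / math.sqrt(n)) < alpha:
                peek_hits += 1
                break
    return fixed_hits / n_sims, peek_hits / n_sims


fixed_rate, peek_rate = simulate()
print(f"One planned analysis at n=20: {fixed_rate:.3f}")
print(f"Peeking up to n=23:           {peek_rate:.3f}")
```

The single planned analysis should land close to the nominal 0.05, while the peeking version comes out higher, because every extra look is another chance for noise to cross the line.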
Other types of P-Hacking could involve re-running the statistical analysis in different software if the first attempt doesn’t reach significance, but I don’t know much about this at all so will keep quiet! Just be aware that it’s a thing!
What about if a researcher wants to explore the effect of an intervention on a person with a condition but without a clear aim of exactly what they want to explore? Let’s say for example a particular exercise type for people with persistent low back pain.
Imagine you are the researcher again. You start your participants off with the exercise intervention and have a good feeling about it. You want to make sure you fully explore the effects of this intervention so you can provide as much information to your colleagues as possible! So, you decide to record a lot of outcome measures, maybe;
- Pain (NRS)
- Disability (Oswestry Low Back Pain Questionnaire)
- Amount of medication participants take
- Mean number of days off over a year due to back pain
- Return to work following sick leave
- Range of movement
- Transversus abdominis activation
- Lumbar extensor strength
- Lumbar flexor strength
- Lumbopelvic posture
- Tampa scale for kinesiophobia
At the end of the study you see that there is a significant result for pain. GREAT! Your intervention works to reduce pain in people with persistent low back pain. Right? Well, maybe, then again maybe not.
This is another classic example of P-Hacking. The 0.05 significance level means that, if the intervention truly has no effect on an outcome, there is still a 1 in 20 chance of getting a statistically significant result for that outcome from random statistical noise alone. So what happens if you collect a large number of outcome measures? The chance that at least one of them comes back statistically significant by chance increases! Therefore a significant result might not reflect a true relationship between your exercise intervention and the outcome measure! (But who cares, you can publish it right?)
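How quickly does that chance grow? Here is a short Python sketch of the standard calculation. It assumes the intervention has no true effect on anything and that the outcomes are independent of each other, which real outcomes such as pain and disability are not, so treat the numbers as a rough guide rather than an exact figure. The function name is my own.

```python
def chance_of_false_positive(n_outcomes, alpha=0.05):
    """Chance that at least one of n independent outcomes comes back
    'significant' purely by chance when there is no true effect."""
    return 1 - (1 - alpha) ** n_outcomes


for n in (1, 5, 11, 20):
    print(f"{n:>2} outcomes: {chance_of_false_positive(n):.0%}")
```

With the 11 outcome measures listed above, that works out at roughly a 43% chance of at least one ‘significant’ finding from noise alone.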
When it does come to publishing, do you mention all of the other outcomes you measured for as well? Maybe you think people don’t need to know about these if they aren’t significant anyway? Maybe you actively think it might draw questions about the validity of your results.
If the results are then written up, and the literature review and hypothesis state that you ONLY investigated the effect of this exercise on pain, this is misleading the reader. This is also an example of HARKing, which stands for Hypothesising After the Results are Known.
Again, this might seem not too bad to some. How else are we supposed to come up with relationships to explore? Well, it might not be the actual research practice that is the issue here (unless you collect a crazy amount of outcome measures that can’t be relied upon); the biggest issue is reporting as if you only meant to investigate the significant outcomes. Because the reader is pretty powerless to prove that this wasn’t the case.
So here’s the rub; how do we actually know we can trust the investigative research articles that we read?
What’s the scale of the issue?
Put simply, these issues are enormous. In 2015 the Open Science Collaboration attempted to replicate 100 psychology studies, following the original methods as closely as possible. That is an absolutely mammoth task! Think about what it would take to gather the same number of participants with the same baseline characteristics.
When originally published, 97% of these studies had statistically significant results. However, when they were reproduced by different authors and with no data manipulation, only 36% reached statistical significance! That is quite unbelievable.
What impact does that have on practice for psychologists? How many of their treatments are based off of this research that can’t be reproduced?
But Jeff, this is psychology, it doesn’t relate to us!
Do you really think physiotherapy would be drastically different? Do we not interact with people to take messy measurements about complicated constructs such as pain over long periods of time?
The reproduction of 100 articles is a huge task, and so far hasn’t been done in physiotherapy. In fact, reproducing papers doesn’t really get done much at all, for reasons such as publication bias (which we will touch on in a bit).
Have a look for yourself
Let me draw your attention to this paper:
‘Manual therapy versus therapeutic exercise in non-specific chronic neck pain: a randomized controlled trial’ by Bernal-Utrera et al, 2019. (Don’t worry this isn’t another manual therapy debate)
Their primary outcome measures were pain and disability, and were reported as such in their results paper. However, if you take a look at the protocol they published the year before, the primary outcome measure was POSTURAL STABILITY! What?! There was no mention of this at all in their final paper! Not even an explanation for why this wasn’t reported. Was it measured and didn’t reach significance? Was it just not feasible due to availability or equipment issues?
Not having an explanation makes me question the trustworthiness of these results.
Enough with the problems, what’s the solution?!
What can be done about these issues to restore the faith? Probably one of the biggest things researchers can do is to pre-trial register their protocols. This includes their a priori hypothesis, research method (including pre-determined number of participants) and how they will calculate the results from a statistics perspective.
With this, we can see that there has been no foul play, either intentionally or unintentionally. To check this you can look at clinicaltrials.gov or search for a separately published protocol (such as the Bernal-Utrera paper above).
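As a rough illustration of what that check involves, here is a small Python sketch that compares the primary outcomes named in a protocol with those reported in the final paper. The outcome lists are typed in by hand from the Bernal-Utrera example described above; nothing is fetched from a registry, and the function name is hypothetical.

```python
def compare_outcomes(registered, reported):
    """Compare pre-registered primary outcomes with those in the paper."""
    registered = {o.strip().lower() for o in registered}
    reported = {o.strip().lower() for o in reported}
    return {
        # Promised in the protocol but absent from the results paper.
        "missing_from_paper": sorted(registered - reported),
        # Reported as primary but never pre-registered as primary.
        "added_in_paper": sorted(reported - registered),
    }


# Primary outcomes as described in this post (entered by hand).
protocol_primary = ["Postural stability"]
paper_primary = ["Pain", "Disability"]

result = compare_outcomes(protocol_primary, paper_primary)
print(result)
```

Anything that turns up in either list isn’t proof of foul play, but it is a prompt to look for an explanation in the paper.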
**Side note** One huge orthopaedic paper I am eagerly awaiting the results of is the HIPARTI trial comparing surgery for femoroacetabular impingement syndrome to sham surgery. They have pre-trial registered the study (it is a huge study so I’m not surprised); but if you want to see what pre-trial registration should look like, have a look here https://clinicaltrials.gov/ct2/show/record/NCT02692807?type=Intr&cond=hip+arthroscopy&draw=2
Great. So this makes it more trustworthy for the reader, but what if the results come up not significant? How likely is it that a publisher will want to publish an article that doesn’t actually show anything other than a treatment not being effective?
This is called publication bias, and I imagine it is a huge problem for researchers. After all, researchers need to make a living! That’s probably what influenced the reproducibility crisis in the first place!
Might this make researchers less likely to pre-trial register their protocols? I don’t know.
There is a way around this, however! It is called a registered report. With this method, the paper gets its protocol peer reviewed BEFORE the intervention takes place, with reviewers commenting on how the research can be strengthened and the publisher agreeing, before the trial even takes place, to publish the results, because it will be publishing good science.
This will take a hell of a lot of strain off of the researcher, and true results can hopefully start piling in!
But there is a down side (there has to be one): according to the ‘Science PT’ himself, Erik Meira, there are no physiotherapy journals currently offering this. So it’s not so much a solution as a potential solution, and I guess it’s a case of ‘watch this space’.
And of course, all of this is a problem within the evidence based practice paradigm… but is this even the best way to practice? Maybe I can confuse you (and myself) with philosophical considerations of evidence based healthcare if ThrivePES has me back.
There we have it
- Evidence based practice isn’t as easy as googling a paper to support your theory.
- The evidence base often doesn’t answer the important questions that need to be answered in order to provide evidence based care.
- It’s beneficial to have a healthy dose of scepticism about new ideas and treatments until there is clear benefit shown in research.
- Not all research can be trusted, and this is a problem in physiotherapy.
- Always check to see if an investigative research paper has been pre-trial registered.
- Make sure you head over and read the Steven Kamper series ‘Evidence in Practice’ in JOSPT (or the poor man’s version of the posts on my page @jmortonphysio on Instagram)
The points that we’ve just covered are hugely important to be aware of, and if you found them even remotely interesting I assure you it’s worth investing more time into listening to people much more in the know than me talk about it. The best part is, you can do this for free!
- Everything Hertz podcast
  - This is a great resource run by two non-physiotherapists, for anyone interested in meta-science and how we actually get to the conclusions in research papers before we even get close to trying to implement them into our practice!
- These articles I also highly recommend:
  - Mandrola et al (2019); The case for being a medical conservative; The American Journal of Medicine; https://doi.org/10.1016/j.amjmed.2019.02.005
  - Head et al (2015); The extent and consequences of P-Hacking in Science; PLoS Biology; doi:10.1371/journal.pbio.1002106
  - Ioannidis (2005); Why most published research findings are false; PLoS Med; 2; (8); e124
  - Open Science Collaboration (2015); Estimating the reproducibility of psychological science; Science; 349; (6251); DOI: 10.1126/science.aac4716
Thanks for reading!
Bernal-Utrera et al (2019); Manual therapy versus therapeutic exercise in non-specific chronic neck pain: a randomized controlled trial; Trials; 682; (21)
Harris et al (2020); Surgery for chronic musculoskeletal pain: the question of evidence; Pain; 161; (9); 95-103
Impellizzeri et al (2020); Acute:Chronic Workload Ratio: Conceptual Issues and Fundamental Pitfalls; International Journal of Sports Physiology and Performance; https://doi.org/10.1123/ijspp.2019-0864
Mandrola et al (2019); The case for being a medical conservative; The American Journal of Medicine; https://doi.org/10.1016/j.amjmed.2019.02.005
Open Science Collaboration (2015); Estimating the reproducibility of psychological science; Science; 349; (6251); DOI: 10.1126/science.aac4716
Rio et al (2015); Isometric exercise induces analgesia and reduces inhibition in patellar tendinopathy; British Journal of Sports Medicine; 49; 1277-1283
Smith et al (2012); Differing Levels of Clinical Evidence: Exploring Communication Challenges in Shared Decision Making; Medical Care Research and Review; 70; (1); https://doi.org/10.1177/1077558712468491