Levels of Evidence in Research: Examples, Hierarchies & Practice

Posted on January 8, 2021

Evidence-based practice has been gaining popularity since its introduction in 1992 and has spread to various disciplines such as education, management, allied health, and law. While it may seem obvious that occupational practices should base their knowledge on scientific evidence, the approach has been controversial within the scientific community (Trinder & Reynolds, 2006).

A significant part of evidence-based practice is the levels of evidence, or hierarchy of evidence. Generally, it applies to any type of research and evaluates the strength of scientific results. While various disciplines have their own levels of evidence, the most developed come from medicine and allied health (Hugel, 2013).

This article briefly introduces evidence-based practice and how it is applied to specific disciplines. Furthermore, it shows various examples of levels of evidence in research and how they rank or evaluate scientific research. At the end of the article, the reader should have a clear idea of what evidence-based practice is, how levels of evidence fit in, and the relevant critiques associated with it.

Levels of Evidence in Research Table of Contents

  1. What is evidence-based practice?
  2. What is evidence?
  3. Examples of Levels of Evidence in Research
  4. Criticism of Evidence Hierarchies and Evidence-Based Practice

What is Evidence-Based Practice?

Evidence-based practice (EBP) is the idea that occupational disciplines should be based on scientific evidence (Trinder & Reynolds, 2006). It encourages and, in some cases, forces scientists and other professionals to pay more attention to evidence when making crucial decisions.

EBP aims to replace outdated or unsound practices with more effective, evidence-based ones. It shifts the basis of decision-making from intuition, tradition, and unsystematic experience to well-established and well-researched scientific studies.

Evidence-based practice has been gaining popularity since its introduction back in 1992. It has spread to various fields such as management, law, medicine, education, public policy, and more. There is also an effort to apply EBP in scientific research itself, which is called metascience.

Some examples of the application of EBP in various disciplines are as follows:


Evidence-Based Research

Evidence-based research, also known as metascience, is the use of scientific methodology to study science itself, with the aim of increasing the quality and efficiency of the research process (Ioannidis, 2020). Because metascience concerns itself with all fields of research, it is also referred to as “a bird’s eye view of science.”

Metascience is made up of five major areas of study:

  1. Methods – It seeks to highlight poor practices in research such as biases, manipulation of statistics, and poor study designs. It also finds various ways to reduce such practices.
  2. Reporting – It also seeks to identify issues in disseminating, explaining, reporting, and popularizing research. Poor reporting results in difficulty in the interpretation of results or even miscommunication. Metascience proposes solutions such as establishing reporting standards and improved transparency.
  3. Reproducibility – Replication is an essential part of scientific research. However, there has been a widespread issue in replicating results in various studies, which puts into question their reliability. Such issues often plague psychology and medicine (Lehrer, 2010).
  4. Evaluation – It aims to develop a scientific foundation for peer review, which is currently plagued with issues such as biases, underrepresentation, and more (Smith, 2006). Metascience evaluates various peer review systems, such as open peer review, pre-publication peer review, and post-publication peer review.
  5. Incentives – It also seeks to promote better studies through enhanced incentive systems. Metascience studies the effectiveness, accuracy, costs, and benefits of various approaches in evaluating and ranking research and those who perform them (Ioannidis, 2020).


Evidence-Based Medicine

Medicine and allied health have adopted evidence-based practice under the name evidence-based medicine (EBM), an approach to practice that optimizes decision-making by using evidence from well-conducted and well-designed research. It balances three components: research-based evidence, patient values and preferences, and clinical expertise (Haughom, 2015). With this approach, medical practitioners can improve the quality of healthcare, improve satisfaction among patients, and potentially cut down the cost of treatments.

Components of EBM

Medicine already has some degree of empirical support in itself. However, EBM goes further by classifying evidence into levels according to its epistemological strength. The strongest types (systematic reviews, meta-analyses, and similar studies) produce strong recommendations, while weaker types (such as case-control studies) yield only weak recommendations.

Evidence-based medicine is applied to various parts of the discipline, from education to the administration of health institutions (Eddy, 1990). It advocates that decisions and policies should be based on evidence as much as possible instead of the beliefs of experts, practitioners, or administrators. For example, it ensures that a doctor’s opinion, which may be limited by biases or gaps in knowledge, is supplemented by knowledge and information from the current literature in order to provide the best recommendation.


Evidence-Based Education

Evidence-based education (EBE) is the use of well-designed and well-researched studies to identify which education methods work best and produce the best results. It combines evidence-based learning and evidence-based teaching. While the reception of EBE is generally positive, some critics point out that some educational research is poor in quality and that the approach limits the scope of relevant research (Biesta, 2007). Other studies are often difficult or impossible to replicate.

Some examples of evidence-based learning techniques are as follows:

  • Errorless training – This training style was introduced by Charles Ferster with the assistance of B.F. Skinner. Its teaching methods are designed so that students are not required to make errors, and do not make them, when learning new information or processes. Its proponents noted that errors are not a function of learning or vice versa, and that mistakes are not blamed on the learner.
  • N-back training – Also known as the n-back task, it is a performance task commonly used in cognitive neuroscience and psychology to measure a learner’s working memory capacity. It involves presenting a series of stimuli (a shape, letter, etc.) spaced a few seconds apart. The learner must decide whether the current stimulus matches the one displayed n trials earlier.
  • Spaced repetition – Often used with flashcards, this technique presents information repeatedly, with reviews spaced apart by a certain amount of time. New flashcards and more challenging topics are shown more frequently than less difficult or older flashcards. It takes advantage of the psychological spacing effect: learning is more effective when spaced out over several sessions. Spaced repetition has been shown to improve the rate of learning (Smolen, Zhang, & Byrne, 2016).
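
Two of the techniques above are procedural enough to sketch in code. The following is a minimal Python illustration, not a validated training tool. `n_back_targets` marks the trials in an n-back task where the correct answer is "match", and `review` schedules flashcards with a Leitner-style box scheme, which is one common way to implement spaced repetition. The function names, the `Card` class, and the interval values are assumptions made for illustration; the sources cited above do not prescribe them.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical review intervals (in days) for boxes 0..3; real systems tune these.
INTERVALS_DAYS = [1, 2, 4, 8]


def n_back_targets(stimuli: Sequence, n: int) -> List[bool]:
    """Return True at each trial whose stimulus equals the one shown n trials earlier."""
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


@dataclass
class Card:
    prompt: str
    box: int = 0      # new cards start in the most frequently reviewed box
    due_day: int = 0  # day number on which the card should next be shown


def review(card: Card, correct: bool, today: int) -> Card:
    """Move the card between boxes and schedule its next review."""
    if correct:
        # Well-known cards drift toward boxes with longer intervals.
        card.box = min(card.box + 1, len(INTERVALS_DAYS) - 1)
    else:
        # Missed cards return to the first box and are shown again soon.
        card.box = 0
    card.due_day = today + INTERVALS_DAYS[card.box]
    return card


def due_cards(cards: List[Card], today: int) -> List[Card]:
    """Return the cards scheduled for review on or before `today`."""
    return [c for c in cards if c.due_day <= today]
```

For example, `n_back_targets("ABAB", 2)` flags the third and fourth trials as matches. A card answered correctly on day 0 moves up one box and comes due on day 2, while a missed card returns to the first box and is shown again the following day.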

What is Evidence?

The entire practice revolves around evidence, its various types, and its validity. Evidence is the result or product of scientific research that enables decision-making. It can be divided into two main categories:

  • Primary information (unfiltered) – It contains the original data or results and analysis from the scientific study. It includes no interpretation or external evaluation.
  • Secondary information (filtered) – It is considered the highest-quality evidence. Filtered information includes synthesis, analysis, interpretation, evaluation, and/or commentary on the unfiltered information. Additionally, it may come with relevant recommendations for practice.

Evidence-based practice involves levels of evidence that help practitioners determine the “strength” or value of the evidence. The hierarchy of evidence depends on the discipline itself and how each field develops its standardized process of evidence evaluation. A great example is the evidence pyramid used by medical practitioners and researchers.

Application of Hierarchy of Evidence

Medical experts and researchers rank evidence according to quality, resulting in the evidence pyramid (Queensland University of Technology, 2019).

Secondary Information (Filtered)

The top three levels of evidence in medical research are composed of filtered information. These include the following, starting from the evidence or source with the highest quality:

  • Systematic reviews – Such reviews focus on peer-reviewed journals or publications that discuss a specific health issue or topic. They use standardized processes for selecting and evaluating articles.
  • Critically-appraised topics and individual articles – These two levels are composed of appraisals that synthesize the evidence on a topic or summarize an individual article from a journal or publication. They are often written to answer a specific medical question, problem, or concern.

Primary Information (Unfiltered)

The next three lower levels of evidence include unfiltered information starting from the highest quality:

  • Randomized controlled trials – Experiments in which subjects are randomly divided into groups, and one or more groups receive a procedure, intervention, or substance. A control group does not receive any treatment and serves as the benchmark of the study. The effects of the treatment on each group are then recorded, assessed, and evaluated.
  • Cohort studies – A cohort study observes a group of people who share some defining characteristic, called a cohort, over a long period of time. Typically, a cohort has been exposed to a disease or has taken a treatment at some point. Experts observe the effects of such exposure and how they compare to other cohorts with different exposure levels.
  • Case-control studies and case series – A case-control study observes a group of people with a condition or a disease and a suitable control group. Potential risk factors or attributes associated with the disease are identified and examined by comparing the control group and the diseased subjects. A case series is a collection of patients or subjects sharing common attributes or characteristics related to a condition, a treatment, or a disease. Relevant outcomes are examined and evaluated.
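
The random division of subjects described for randomized controlled trials can be sketched as follows. This is a minimal illustration of simple randomization with balanced group sizes, assuming hypothetical subject IDs and arm names. Real trials often use more elaborate allocation procedures.

```python
import random
from typing import Dict, List, Optional, Sequence


def randomize(
    subjects: Sequence[str],
    arms: Sequence[str],
    seed: Optional[int] = None,
) -> Dict[str, List[str]]:
    """Shuffle subjects, then deal them round-robin so arm sizes differ by at most one."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)

    groups: Dict[str, List[str]] = {arm: [] for arm in arms}
    for i, subject in enumerate(shuffled):
        groups[arms[i % len(arms)]].append(subject)
    return groups


# Hypothetical example: ten subjects split between a treatment and a control arm.
subjects = [f"S{i:02d}" for i in range(10)]
groups = randomize(subjects, ["treatment", "control"], seed=2021)
```

Because the order is shuffled before subjects are dealt out, neither the researcher nor the subject determines who lands in the control group, which is the property the trial design relies on.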

Background Information

The background information, or expert opinion, is not considered evidence. However, it is the foundation level of the pyramid, which supports the other evidence. It is also used to provide context to the interpretation of the evidence when needed.

The hierarchy above is just one example of the application of the hierarchy of evidence. Each discipline and its corresponding sub-fields can have their own process of evaluating evidence. For example, experts have proposed more than 80 different hierarchies for assessing and evaluating medical evidence (Siegfried, 2017).

Examples of Levels of Evidence in Research

As mentioned above, levels of evidence in research are a method of ranking the relative strength or validity of results from scientific studies. Over the years, there have been multiple proposals for assessing levels of evidence in research. These include:

Grading of Recommendations Assessment, Development and Evaluation

The Grading of Recommendations Assessment, Development, and Evaluation (GRADE) process was proposed in 2001 by guideline developers, methodologists, clinicians, public health scientists, and other interested members. It measures the strength of recommendation (or confidence in estimated effect) and certainty in the evidence of a certain scientific study or research.

It is endorsed by various international health organizations, such as the World Health Organization, the Canadian Task Force on Preventive Health Care, and the U.K. National Institute for Health and Care Excellence (NICE), among others (McMaster University & Evidence Prime Inc.). While it has its roots in medicine and allied health disciplines, it can also be applied to research dealing with other topics.

GRADE provides the following ratings to various evidence:

  1. High – High confidence that the true effect is similar or close to the estimated effect.
  2. Moderate – Moderate confidence in the estimated effect. It is likely that the true effect is close to the estimated effect. However, it is possible that substantial differences may exist.
  3. Low – Limited confidence in the estimated effect, which may be substantially different from the true effect.
  4. Very Low – Very little confidence in the estimated effect, which is likely to be substantially different from the true effect.
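
The four ratings form an ordered scale, which can be sketched in code. The article does not describe how a rating is reached. In GRADE guidance, a body of evidence from randomized trials typically starts at High and may be rated down for concerns such as risk of bias, inconsistency, indirectness, imprecision, or publication bias. The sketch below assumes that rule; the names `Certainty`, `rate_down`, and `rate_trial_evidence` are hypothetical and are not part of any official GRADE tool.

```python
from enum import IntEnum
from typing import Dict


class Certainty(IntEnum):
    """The four GRADE ratings listed above, ordered from weakest to strongest."""
    VERY_LOW = 1
    LOW = 2
    MODERATE = 3
    HIGH = 4


def rate_down(start: Certainty, levels: int) -> Certainty:
    """Lower a rating by `levels` steps, never going below VERY_LOW."""
    if levels < 0:
        raise ValueError("levels must be zero or positive")
    return Certainty(max(int(Certainty.VERY_LOW), int(start) - levels))


def rate_trial_evidence(concerns: Dict[str, int]) -> Certainty:
    """Assumed rule: trial evidence starts at HIGH and drops one step per level of concern."""
    return rate_down(Certainty.HIGH, sum(concerns.values()))
```

Under these assumptions, `rate_trial_evidence({"risk of bias": 1, "imprecision": 1})` yields `Certainty.LOW`, and no combination of concerns can push a rating below Very Low.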

Guyatt and Sackett

G.H. Guyatt and D.L. Sackett proposed the first version of the hierarchy of primary studies in 1995 (Guyatt & Sackett, 1995). T. Greenhalgh further modified the ranking (Greenhalgh, 1997), resulting in the following (from highest to lowest in value):

  1. Meta-analyses and systematic reviews of randomized controlled trials (RCTs) with definitive results (exact solution or answer to the query or question).
  2. RCTs with definitive results.
  3. RCTs with non-definitive results.
  4. Cohort studies.
  5. Case-control studies.
  6. Cross-sectional surveys.
  7. Case reports.
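
A ranked list like this one maps naturally onto an ordered type. The following minimal sketch encodes the seven levels above and sorts a set of studies from strongest to weakest design. The `StudyType` and `rank_studies` names and the study titles are hypothetical, and the sort considers study design only, not the quality of any individual study.

```python
from enum import IntEnum
from typing import List, Tuple


class StudyType(IntEnum):
    """Levels from the list above; a lower number means a higher rank."""
    META_ANALYSIS_OR_SYSTEMATIC_REVIEW = 1
    RCT_DEFINITIVE = 2
    RCT_NON_DEFINITIVE = 3
    COHORT = 4
    CASE_CONTROL = 5
    CROSS_SECTIONAL = 6
    CASE_REPORT = 7


def rank_studies(studies: List[Tuple[str, StudyType]]) -> List[Tuple[str, StudyType]]:
    """Sort (title, study type) pairs from the highest to the lowest level."""
    return sorted(studies, key=lambda study: study[1])


# Hypothetical titles, used only to show the ordering.
studies = [
    ("Hypothetical case report", StudyType.CASE_REPORT),
    ("Hypothetical cohort study", StudyType.COHORT),
    ("Hypothetical meta-analysis", StudyType.META_ANALYSIS_OR_SYSTEMATIC_REVIEW),
]
ranked = rank_studies(studies)
```

Here `ranked` lists the meta-analysis first, then the cohort study, then the case report.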

Saunders et al.

B. Saunders and his colleagues also proposed a protocol that assigns research results and reports into six categories. The assignment is based on the theoretical background, research design, general acceptance, and evidence of possible harm (Saunders, Berliner, & Hanson, 2003). Much like other evidence hierarchies, it is rooted in allied health.

The research outcome should also be documented in a descriptive publication, such as a manual or something similar.

  • Category 1 – Treatments or results in this category are well-supported and efficacious. Ideally, two or more randomized controlled outcome studies compare the study results to the target results.
  • Category 2 – Results in this category are supported and possibly efficacious. They are based on positive outcomes of non-randomized study designs with some form of control.
  • Category 3 – Outcomes in this category are supported and acceptable. They may be supported by one controlled or uncontrolled study or by a series of single-subject studies.
  • Category 4 – Promising and acceptable results belong to this category. It may involve studies that have no support aside from general acceptance and existing literature.
  • Category 5 – Innovative and novel results are assigned to this category. It includes studies that are considered not harmful but are not widely discussed or cited in the literature.
  • Category 6 – Concerning results belong to this category. It may include outcomes that could do harm and that have unknown or untested theoretical foundations.

Criticism of Evidence Hierarchies and Evidence-Based Practice

While many disciplines have adopted and accepted evidence-based practice and evidence hierarchies, criticism has increased in the past few years. Much of it highlights the shortcomings of EBP in medicine and allied health, as it is most practiced in these disciplines. For instance, a 2016 survey found that more than 70% of researchers had tried and failed to reproduce another scientist’s experiments (Baker, 2016). The same survey revealed that 52% of the respondents agree that there is a significant reproducibility crisis in research.

Many critical works have been published in the literature in the past decade or so. According to a survey of these works (Solomon, 2011), they usually fall into one of the following categories:

  • Criticism of procedural aspects and issues of evidence-based medicine.
  • Arguments that the fallibility of EBM is greater than expected.
  • Arguments that EBM is incomplete as a philosophy of science.

Furthermore, many practitioners and administrators point out that EBM has limitations in terms of informing the care of patients. The hierarchy of evidence also does not consider the research on the efficacy and safety of medical interventions. Also, studies designed using EBM guidelines often fail to define key terms, to consider the validity of non-randomized controlled trials, and to underscore a list of study design limitations (Gugiu et al., 2012).

Others have specifically criticized the hierarchy of evidence itself. Borgerson wrote that these rankings should not be treated as absolute and that their ordering is not necessarily epistemically justified. Borgerson also noted that researchers should take a closer look at social mechanisms for managing pervasive biases (Borgerson, 2009).

A. La Caze added that although basic science sits on the lower tiers of EBM’s hierarchy, it plays a significant role in specifying and contextualizing experiments. These lower ranks of evidence also help interpret and analyze the resulting data (La Caze, 2010).

How Do Levels of Evidence and Evidence-Based Practice Apply to Your Research?

While many of the protocols in evidence-based practice are specific to particular disciplines, they are a great introduction to how processes and protocols are followed in research. At its most basic, evidence-based practice shows the value of well-supported evidence, especially in scientific studies.

Examining the evidence, whether in EBP or not, is one of the cornerstones of science. Having a standardized method of evaluating evidence, such as a hierarchy, streamlines the entire research workflow. At the very least, it provides researchers with a tool to determine the value of the evidence, its sources, and whether it is relevant to the study.

Although it may seem obvious that evidence and proper sources should be considered in every scientific study, EBP and levels of evidence remain controversial. Experts and practitioners have published critiques that examine the applicability and validity of the protocols in their respective disciplines. However, just like any scientific process, EBP is undergoing improvements through continuous evaluation.



References

  1. Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533, 452–454. https://doi.org/10.1038/533452a
  2. Biesta, G. (2007). Why “what works” won’t work: evidence-based practice and the democratic deficit in educational research. Educational Theory, 57 (1), 1-22. https://doi.org/10.1111/j.1741-5446.2006.00241.x
  3. Borgerson, K. (2009). Valuing evidence: bias and the evidence hierarchy of evidence-based medicine. Perspectives in Biology and Medicine, 52 (2), 218-233. https://doi.org/10.1353/pbm.0.0086
  4. La Caze, A. (2010). The role of basic science in evidence-based medicine. Biology & Philosophy, 26 (1), 81-98. https://doi.org/10.1007/s10539-010-9231-5
  5. Eddy, D. M. (1990). Practice policies: Where do they come from? JAMA: The Journal of the American Medical Association, 263 (9), 1265-1265. https://doi.org/10.1001/jama.263.9.1265
  6. Greenhalgh, T. (1997). How to read a paper: Getting your bearings (deciding what the paper is about). BMJ, 315 (7102), 243-246. https://doi.org/10.1136/bmj.315.7102.243
  7. Gugiu, P. C., Westine, C. D., Coryn, C. L., & Hobson, K. A. (2012). An application of a new evidence grading system to research on the chronic care model. Evaluation & the Health Professions, 36 (1), 3-43. https://doi.org/10.1177/0163278712436968
  8. Guyatt, G. H., & Sackett, D. L. (1995). Users’ guides to the medical literature. IX. A method for grading health care recommendations. Evidence-Based Medicine Working Group. JAMA: The Journal of the American Medical Association, 274 (22), 1800-1804. https://doi.org/10.1001/jama.274.22.1800
  9. Haughom, J. (2015). 5 reasons the practice of evidence-based medicine is a hot topic. Health Catalyst.
  10. Hugel, K. (2013, May 16). The journey of research – Levels of evidence. CAPhO.org.
  11. Ioannidis, J. P. (2020). Meta-research: Evaluation and improvement of research methods and practices. Krise Der Demokratie – Krise Der Wissenschaften? 22, 101-118. https://doi.org/10.7767/9783205233008.101
  12. Lehrer, J. (2010, December 6). The truth wears off. The New Yorker.
  13. McMaster University, & Evidence Prime Inc. (n.d.). Resources. GradePro.org.
  14. Queensland University of Technology. (2019, January 3). Evidence explained. QUT Library.
  15. Saunders, B., Berliner, L., & Hanson, R. (2003). Child Physical and Sexual Abuse: Guidelines for Treatment. PsycEXTRA Dataset. https://doi.org/10.1037/e319002004-001
  16. Siegfried, T. (2017, November 13). Philosophical critique exposes flaws in medical evidence hierarchies. Science News.
  17. Smith, R. (2006). Peer review: A flawed process at the heart of science and journals. Journal of the Royal Society of Medicine, 99 (4), 178-182. https://doi.org/10.1258/jrsm.99.4.178
  18. Smolen, P., Zhang, Y., & Byrne, J. H. (2016). The right time to learn: Mechanisms and optimization of spaced learning. Nature Reviews Neuroscience, 17 (2), 77-88. https://doi.org/10.1038/nrn.2015.18
  19. Solomon, M. (2011). Just a paradigm: Evidence-based medicine in epistemological context. European Journal for Philosophy of Science, (3), 451-466. https://doi.org/10.1007/s13194-011-0034-6
  20. Trinder, L., & Reynolds, S. (2006). Evidence-Based Practice: A Critical Appraisal. Oxford: Blackwell Science. Retrieved from Google Books.