Making Sense of Medical Research
With the popularity of the Internet and public access to the Medline database and Cochrane Consumer, the average person faces the challenge of deciphering the results of medical research. They may be confronted with a variety of unfamiliar terms like 'randomised controlled trial', 'double-blinding', 'p values' and 'statistical significance'. People also have to make sense of media articles on medical research, which tend to report information in the most dramatic way possible. This article provides a brief overview of the different types of medical research and what to look for when reading a medical research paper or a media report on new research.
Different forms of medical research
There are a number of different forms of medical research, some of which are considered to be more valid than others. The two main types of research are experimental/intervention trials and observational studies. An experimental/intervention trial is a study which intentionally administers a particular treatment, procedure or regime to determine if the intervention is beneficial. However, these types of trials are not suitable for testing all hypotheses. For instance, it would be unethical to conduct an experimental/intervention trial to determine whether a particular substance taken during pregnancy increased the risk of birth defects in the baby. Therefore, observational studies are used to investigate research questions of this type. Observational studies are designed to observe a group of people from a particular point in time and report on what happens to them.
Experimental studies allow causal statements to be made, such as 'coffee consumption causes blood pressure to rise'. Observational studies only allow statements of association to be made (eg. 'when coffee consumption increases, blood pressure rises are also observed'). There may be competing explanations for the blood pressure rise (eg. increased stress levels which just happen to coincide with increased coffee intake). So, as study designs become less experimental/interventional, there is a greater chance for competing explanations for the research results.
The different forms of medical research design are listed below in descending order of their ability to avoid competing explanations.
A systematic review involves the review of research available on a particular topic, based on a rigorous and predetermined methodological approach (1). The role of the reviewer is to select the best evidence available (usually confined to randomised controlled trials - see below) and to advise what the current situation is. Some of the most substantial and rigorous reviews are those carried out by the Cochrane Collaboration, an international non-profit organisation which reviews and promotes the best available evidence on the effects of interventions and treatments. Review groups within the organisation gather suitable trials and then evaluate the data, resulting in a 'Cochrane review'. Cochrane reviews are published in the Cochrane Library, which is available by subscription (although the Federal Government is currently negotiating free access for the Australian public). At present, the general public has access to summaries of Cochrane reviews on the Cochrane Consumer website: www.cochraneconsumer.com
A meta-analysis involves pooling the results from a number of studies which have compatible methods and participants. This has the benefit of combining data to produce a more substantial sample size (number of participants). The person conducting the meta-analysis also presents the results of individual studies in the same way and with some form of commentary. This is particularly useful for readers with limited experience in interpreting medical research. However, a meta-analysis is only useful if the individual studies included are comparable and of good quality and if the actual selection process is unbiased (2), which is difficult to achieve in practice.
Randomised controlled trials
A randomised controlled trial (RCT) is most commonly used to investigate drugs and procedures but can also evaluate health services, community interventions or health education/promotion activities. In an RCT, participants are randomly allocated to either the intervention group or the control group. The intervention group receives the intervention (drug, lifestyle change, therapy etc) while the control group receives either an existing service, program or treatment or sometimes no intervention. The randomisation is important as it helps to reduce the possibility of bias. Without randomisation there may be a tendency for researchers to recruit participants into the trial who are more likely to respond to the intervention or to select these participants for the intervention group.
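The random allocation described above can be illustrated with a minimal sketch. It assumes the simplest possible scheme (shuffle the participants, then split the list in half); the participant IDs, the function name and the fixed seed are hypothetical, and real trials typically use more elaborate, concealed allocation procedures.

```python
import random

def randomise(participants, seed=None):
    """Randomly split participants into an intervention and a control group."""
    rng = random.Random(seed)      # a seed is used only to make the example repeatable
    shuffled = list(participants)  # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs P01 to P20
participants = [f"P{i:02d}" for i in range(1, 21)]
intervention, control = randomise(participants, seed=42)

print("Intervention group:", sorted(intervention))
print("Control group:     ", sorted(control))
```

Because chance alone decides who goes where, neither the researchers' expectations nor the participants' characteristics can steer people into a particular group.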
Some RCTs are also double-blinded, meaning that neither the participants nor the researchers know which participants are in which group. This reduces the possibility of bias. For example, if a researcher knows a participant is in the active group they may unintentionally respond to them differently, creating a bias. From the other perspective, participants may offer information that they believe the researcher wants to hear.
Similarly, in some cases the participants in the control group may need to be given a placebo so they remain unaware of which group they are in. A placebo is an inactive treatment that is identical in all other ways to the intervention. Although supplying a placebo seems simple enough, if it differs too much from the intervention, the control group participants may become aware of this. For instance, a placebo of a drug administered orally may need to taste unpleasant like the active drug. It can also be difficult to obtain a placebo for some studies. How can a placebo for massage therapy be found? Or is the use of sham acupuncture sites acceptable (3)?
The larger an RCT sample is, the more precise the results are and the more likely the trial is to detect a real effect. The selected sample size depends on the research method being used, as well as on economic considerations (the larger the sample the more the research will cost). Studies with large numbers of participants (in the thousands) are often referred to as 'mega-trials'. In order to recruit large numbers of people for trials and to exclude local biases, RCTs can also be 'multicentred', meaning that they take place in a number of different locations and/or institutions.
Cohort studies examine a group of people with particular characteristics over a specific period of time. These studies are also more generically referred to as longitudinal studies. A cohort study is conducted either prospectively or retrospectively/historically. A prospective study involves recruiting participants and then following them to count events of interest over a defined period of time. A retrospective study uses existing data, going back in time to examine past events or characteristics. Because of the greater control over how data are collected, prospective studies are considered to be more accurate than retrospective studies. However, prospective studies can take a long time to conduct and, therefore, are an expensive research method. There is also a greater chance that participants will drop out of the study over time.
The Australian Longitudinal Study on Women's Health is one of the largest cohort studies conducted in this country. The study involves three cohorts of women from different age brackets (aged 18-23, 45-50 and 70-75) and is designed to run for 20 years. It is examining a broad range of issues including use of and satisfaction with health care services, life stages and key events, time use, weight and exercise, and violence (4).
Case-control studies involve comparing a group of people (referred to as 'cases') with a condition/disease to a well group from a similar population (referred to as 'controls'). The two groups may be asked to recall past information and/or have their medical records examined. These studies aim to determine the importance of suspected causal agents or to examine possible outcomes. Adjustments are made to allow for other factors that may also have contributed to the development of the disease/condition (confounding factors). Case-control studies can be helpful when investigating conditions that are rare or if a quicker answer is required than can be achieved with a cohort study.
Cross-sectional studies are 'snap-shot' studies, designed to estimate the prevalence of a particular disease, exposure to risk factors or other factors in a population. They involve interviewing or examining a sample of a defined population at a single point in time. For example, the Census is a cross-sectional survey.
A case study is a report of a case involving a person with unusual or interesting medical characteristics (history/symptoms). Sometimes a number of case studies are featured together, which is referred to as a 'case series'. Case studies can be useful with rare diseases, unusual side effects, or when there is very little other information available. They can also provide direction for areas where further research is needed.
Other types of studies
There are a number of other types of studies which people may come across while searching for literature on a particular topic. If the term 'in vitro' is used it means that the study took place outside a living organism, typically in a petri dish or a test tube. The term 'in vitro' is Latin for 'in glass'. In vitro studies are often conducted before animal studies and studies in humans, so while the results may be interesting they cannot be directly applied to the human body. Studies conducted in a living organism are referred to as 'in vivo'.
Other studies are carried out in animals, often laboratory rats and mice (the term 'murine' is used to refer to both). Results from such studies may highlight areas in which further research is required but, like in vitro studies, these results cannot be directly applied to the human body.
Interpreting the results of medical research
Medical research is reported in a standard format but may make little sense to the everyday person. This section explains some of the common terms used.
The point of medical research is to build on what is known. Just as evidence builds in a court case to prove a person's guilt, research collects data as evidence to prove or disprove a relationship. Again, just as in a court case there is the need for a decision 'beyond reasonable doubt', there is the need in research to quantify doubt about making certain conclusions. When inadequate or distorted evidence is collected, sometimes even guilty persons are set free. This also occurs in research; sometimes data collection goes awry for a range of reasons. Medical researchers quantify the amount of doubt they have about presenting the wrong conclusion, and they use a p value or significance level to do this. The p value is the probability of seeing a result at least as extreme as the one observed if there were really no effect, that is, if chance alone were at work. Low p values imply strong evidence that the result is not just a chance finding.
A p value is said to be significant when it is less than 1 in 20, expressed as p<0.05. This means that if there were truly no effect, a result this extreme would turn up by chance in fewer than 1 in 20 studies. The establishment of 0.05 as the cut-off for what is considered 'significant' is arbitrary.
Any difference, no matter how small, may be found to be statistically significant, if the sample size is large enough. A statistically significant result, however, may not be 'clinically significant'. An outcome is said to be clinically significant if it makes enough difference to both patients and health care providers to change current practices. Examining the absolute risk reduction (see following sections) can help determine if a result is clinically significant.
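A short sketch can show how sample size alone can turn the same small difference into a 'significant' one. It assumes a standard two-proportion z-test and entirely hypothetical event rates of 10% versus 11%; the function name and the numbers are illustrative and are not taken from any study.

```python
import math

def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p value for the difference between two proportions (z-test)."""
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)  # overall rate assuming no real difference
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_error
    return math.erfc(abs(z) / math.sqrt(2))       # area in both tails of the normal curve

# The same 10% versus 11% event rates, in a small and a very large hypothetical trial
p_small = two_proportion_p_value(10, 100, 11, 100)
p_large = two_proportion_p_value(10_000, 100_000, 11_000, 100_000)

print(f"100 per group:     p = {p_small:.3f}")
print(f"100 000 per group: p = {p_large:.2e}")
```

With 100 people per group the one percentage point difference is nowhere near significant, while with 100 000 per group it is highly significant. Whether a one point difference matters to patients is a separate, clinical question.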
Confidence interval (CI) and confidence levels
While p values are useful, confidence intervals can help provide a clearer picture of results. Because research is usually done on a sample of people rather than the whole population, it is possible that the sample is unrepresentative. A confidence interval "measures the precision with which an estimate from a single sample approximates the population value" (5). Confidence levels of 90%, 95% or 99% are used by researchers, with 95% being the most common. A 95% confidence level means that if the study were repeated many times on new samples, about 95% of the confidence intervals calculated would contain the true population value (6).
An example of the way a confidence interval may be presented in medical research is 1.5 (95% CI, 0.91-2.4). Here, the 1.5 is the average result of the study, the 95% is the chosen confidence level and the 0.91-2.4 is the theoretical range of values within which the researchers are confident that the true result would lie. So the best estimate is 1.5 but they are 95% certain that the truth lies somewhere between 0.91 and 2.4. In comparison, an average of 1.5 with a confidence interval of 0.3 to 13.7 means that the researchers are less confident of these results because they have less precision.
A narrow confidence interval means that the results are more precise, while a wide confidence interval means poor precision and also suggests an inadequate sample size. Confidence intervals are closely related to sample size; they get narrower as the sample size increases. For example, if a study involving 10 people finds 40% are non-smokers, statistical calculations give a 95% confidence interval of roughly 10% to 74% for the proportion of non-smokers in the population. However, if 40% of people were found to be non-smokers in a sample of 1000, the confidence interval is much narrower, approximately 37%-43%. Therefore, these results are a more precise estimate of the proportion of non-smokers in the general population.
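The effect of sample size on the width of a confidence interval can be sketched as follows. This assumes the simple normal approximation for a proportion (estimate plus or minus 1.96 standard errors), which is only one of several methods; the function name is hypothetical.

```python
import math

def proportion_ci_95(proportion, n):
    """Approximate 95% confidence interval for a proportion (normal approximation)."""
    half_width = 1.96 * math.sqrt(proportion * (1 - proportion) / n)
    lower = max(0.0, proportion - half_width)  # a proportion cannot fall below 0
    upper = min(1.0, proportion + half_width)  # or rise above 1
    return lower, upper

# 40% non-smokers observed in a sample of 10 and in a sample of 1000
for n in (10, 1000):
    lower, upper = proportion_ci_95(0.40, n)
    print(f"n = {n}: {lower:.0%} to {upper:.0%}")
```

For the sample of 1000 this gives about 37% to 43%, in line with the figures above. For the sample of 10 the approximation is rough, and methods designed for small samples give somewhat different limits, which is presumably why the upper limit quoted in the text is a little higher.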
'Relative risk' (RR), a general term that may also be referred to as 'risk ratio' or 'rate ratio', is the most common way that results are reported in randomised controlled trials and cohort studies. Relative risk is the event rate in the intervention group divided by the event rate in the control group. As a ratio, a relative risk of 1.0 stands for no effect on risk at all. An RR bigger than 1 means that the intervention group is at increased risk, while an RR smaller than 1 means that they are at a decreased risk.
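As a minimal sketch, relative risk can be calculated directly from the number of events in each group. The counts below (30 events among 1000 people in the intervention group, 20 among 1000 controls) are hypothetical and chosen only to give a round answer.

```python
def relative_risk(events_intervention, n_intervention, events_control, n_control):
    """Event rate in the intervention group divided by the event rate in the control group."""
    rate_intervention = events_intervention / n_intervention
    rate_control = events_control / n_control
    return rate_intervention / rate_control

# Hypothetical trial: 3% of the intervention group and 2% of the control group have the event
rr = relative_risk(30, 1000, 20, 1000)
print(f"Relative risk = {rr:.2f}")
```

Here the relative risk is about 1.5, meaning the event was one and a half times as common in the intervention group as in the control group.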
For a relative risk to be properly interpreted, a person needs to know what the average risk is. If a person is told they have five times the risk of developing a particular condition it is meaningless unless they know 'five times' what amount. If a disease is relatively rare a 'five times' risk may mean their risk is still small.
While relative risks are presented in research papers as 1.5, 1.26 etc, media articles often report the risk as a percentage. In the recent study linking oestrogen-only hormone replacement therapy (HRT) with ovarian cancer, the RR for women who had ever taken oestrogen-only HRT, compared with women who had never taken HRT, was 1.6. Some media coverage of the study reported that women who had ever taken oestrogen-only HRT during and after menopause had a 60% higher risk of developing ovarian cancer than women who had never taken HRT (7).
Unfortunately, the presentation of risks in the form of percentages is often misinterpreted by people. In the case of the 60% increased risk of ovarian cancer, some people may even think this means they have a 60% risk of developing ovarian cancer in a given year. In fact, the 60% increase means that if the approximate lifetime risk of developing ovarian cancer is 1.7%, they now have a 2.72% risk (a long way off 60%!).
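The arithmetic behind this example can be laid out in a few lines. The sketch uses the figures quoted above (a lifetime risk of 1.7% and a relative risk of 1.6); the variable names are illustrative.

```python
baseline_lifetime_risk = 0.017  # approximate lifetime risk of ovarian cancer quoted above
relative_risk = 1.6             # reported as a '60% higher risk'

# A relative risk multiplies the baseline risk; it does not replace it
risk_with_exposure = baseline_lifetime_risk * relative_risk
absolute_increase = risk_with_exposure - baseline_lifetime_risk

print(f"Risk without exposure: {baseline_lifetime_risk:.2%}")
print(f"Risk with exposure:    {risk_with_exposure:.2%}")
print(f"Absolute increase:     {absolute_increase:.2%} (percentage points)")
```

The '60% higher risk' headline therefore corresponds to an increase of roughly one percentage point in lifetime risk.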
The 'absolute risk' is the rate of a particular disease or condition in the general population. The absolute risk is expressed in either the number of cases per 100 000 people per year or as a cumulative risk up to a particular age. The lifetime risk of getting breast cancer for women is 1 in 12. This is an example of an absolute risk expressed in cumulative terms. Cumulative risks are often misunderstood, with many people assuming, for instance, that in any given year their breast cancer risk rate is 1 in 12. In fact, the 1 in 12 rate is the risk if a person survives to the age specified (in this case to age 74). However, at no one time in a woman's life does she actually have a 1 in 12 risk rate of getting breast cancer. For instance, the annual risk rate for women aged between 30-39 is 1 in 246 and for women aged 50-59 it is 1 in 53 (based on breast cancer statistics from 1921-1994) (8).
It is also important for people to recognise that absolute risks are calculated on the general population (or a given population) and, therefore, are not necessarily an individual's risk. A woman's chance of developing breast cancer, for example, is influenced by other factors like family history, reproductive history and hormone use.
Despite these limitations, the absolute risk provides people with the best interpretation of research results. Presenting information in the form of per 100 000 people or of per 100 000 person years allows people to put the risk rate into perspective.
Absolute risk reduction (ARR)
'Absolute risk reduction' (also referred to as 'risk difference') is used when the intervention reduces the risk of an undesirable event or outcome (9). It is simply the difference in the event rate between the intervention and control groups. For example, if mortality from a condition being treated with a new drug is 20% compared to a 28% mortality with the existing drug (the control group), the ARR of administering the new drug is 8% or 0.08 (28%-20%).
The term 'absolute risk increase' (ARI) is used when the intervention increases the risk of an undesirable outcome and is calculated in the same way as the ARR.
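The two-drug example can be written out as a small sketch. The function name is hypothetical; the sign convention (a positive value is a reduction, a negative value an increase) is an assumption made here for illustration.

```python
def absolute_risk_reduction(control_rate, intervention_rate):
    """Control rate minus intervention rate.

    A positive result is an absolute risk reduction (ARR);
    a negative result corresponds to an absolute risk increase (ARI).
    """
    return control_rate - intervention_rate

# 28% mortality with the existing drug, 20% with the new drug
arr = absolute_risk_reduction(control_rate=0.28, intervention_rate=0.20)
print(f"ARR = {arr:.0%}")
```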
Relative risk reduction (RRR)
'Relative risk reduction' is also used when the intervention reduces the risk of an undesirable event or outcome. It is the absolute risk reduction (ARR) expressed as a percentage of the rate in the absence of the intervention. In the example of the two drug treatments above, the relative risk reduction, or amount by which the new drug treatment reduces mortality, is approximately 29%, calculated as (0.28-0.20)/0.28.
The term 'relative risk increase' (RRI) is used when the intervention increases the risk of an undesirable outcome and is calculated in the same way as the RRR.
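The relative risk reduction for the same two-drug example can be sketched in the same way. The function name is hypothetical. The last lines also check the standard identity that relative risk reduction equals 1 minus the relative risk.

```python
def relative_risk_reduction(control_rate, intervention_rate):
    """Absolute risk reduction as a fraction of the control (no-intervention) rate."""
    return (control_rate - intervention_rate) / control_rate

# 28% mortality with the existing drug, 20% with the new drug
reduction = relative_risk_reduction(control_rate=0.28, intervention_rate=0.20)
print(f"Relative risk reduction = {reduction:.0%}")

# Equivalent route: 1 minus the relative risk (0.20 / 0.28)
via_relative_risk = 1 - 0.20 / 0.28
print(f"1 - relative risk       = {via_relative_risk:.0%}")
```

Note how an absolute reduction of 8 percentage points becomes a relative reduction of about 29%, which is why the two figures should not be confused.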
Number needed to treat (NNT)
'Number needed to treat' is the term used for the number of people that would need to be treated, on average, for one to benefit. The number needed to treat is simply the reciprocal of the absolute risk reduction. Therefore, for the above example the number needed to treat with the new drug for one person to benefit would be 12.5 (1/0.08), which in practice is rounded up to 13 people. The number needed to treat statistic is particularly useful when comparing different treatment options and in evaluating the cost effectiveness of treatments.
The term 'number needed to harm' (NNH) is used for the number of people that would need to be treated, on average, to cause one bad outcome. This is used for instances where the intervention has caused an undesirable outcome.
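A minimal sketch of the number needed to treat calculation, using the same two-drug example, is shown below. The function name is hypothetical, and rounding up to a whole person is a common convention rather than something the calculation itself requires.

```python
import math

def number_needed_to_treat(control_rate, intervention_rate):
    """Reciprocal of the absolute risk reduction."""
    arr = control_rate - intervention_rate
    if arr <= 0:
        # No reduction in risk: a number needed to treat is not meaningful here
        raise ValueError("intervention does not reduce risk")
    return 1 / arr

# 28% mortality with the existing drug, 20% with the new drug
nnt = number_needed_to_treat(control_rate=0.28, intervention_rate=0.20)
print(f"NNT = {nnt:.1f}, rounded up to {math.ceil(nnt)} people")
```

The number needed to harm can be calculated the same way by swapping the two rates, so that the increase in risk becomes the positive quantity.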
References
1. Redmond, A C. et al. 'Horses for courses': The differences between quantitative and qualitative approaches to research. J Am Podiatr Med Assoc 2002; 92:3:159-169
2. Redmond, A C. et al. 'Horses for courses': The differences between quantitative and qualitative approaches to research. Ibid; 166-67
3. Ryan, D. Toward improving the reliability of clinical acupuncture trials: Arguments against the validity of "sham acupuncture" as controls. Am J of Acupunct 1999; 27:1-2:105-109
4. Women's Health Australia. The Australian Longitudinal Study on Women's Health. http://www.newcastle.edu.au/centre/wha/index.htm [website] date accessed: 25 July 2002
5. Opinion Search. Explanation of Confidence Intervals. http://www.opinionsearch.com/english/regular/graphic/ date accessed: 25 July 2002
6. Cochrane Consumer Network. Research Glossary. http://cochraneconsumer. u_consumerglossary.asp [website] date accessed: 28 June 2002
7. Marcus, A. Estrogen therapy linked to ovarian cancer. HealthScoutNews http://story.news.yahoo.com/news?tmpl=story&u=/hsn [website] date accessed: 18 July 2002
8. Kricker, A & Jelfs P. Breast Cancer in Australian Women 1921-1994. Canberra: Australian Institute of Health and Welfare 1996; 13
9. Medical University of South Carolina. Evidence Based Medicine Terms. http://www.musc.edu/dc/icrebm/ebmterms.htm [website] date accessed: 25 July 2002
Further help and information from Women's Health Queensland Wide
Health Information Line:
Our free statewide line is staffed by women's health nurses and midwives. They provide women with up to date information, support and referral to health practitioners and services. Women can contact the Health Information Line by phone or email via the 'Ask a Health Question' page on the website. All phone calls and emails are confidential.
(07) 3839 9988 or 1800 017 676 (toll free outside Brisbane)
Our free lending library offers a select range of books on major women's health topics. Topic-based booklists are available on our website, or can be posted out; books can be requested by phone or email and are posted to borrowers.
Contact administration: (07) 3839 9962
All our factsheets and booklets are available on our website. The website also features articles on women's health from our newsletter, student factsheets, upcoming events, library services and 'Ask a Health Question' page. A list of reputable links is also available where women can search for further information on health topics.
This article was written by Kirsten Braun and reviewed by the Editorial Committee for Health Journey, Vol 3 2002. Women's Health Queensland Wide sincerely thanks Dr Diana Battistutta, Queensland University of Technology, for critically reviewing the penultimate draft of this article.
Please note that this article is an archive. While every effort was made to ensure the information was accurate at the time of publication, the article has not been updated since this time.
September 1, 2002