Biolayne How To Read Research

Uploaded by

Joe Robberechts

Table of Contents

Article 1: Overview of Research 3
- The Scientific Method 3
- Variables 4
- Types of Research 5
- Study Designs 6
- Types of Publications 7

Article 2: Reading and Interpreting Research 9


- Abstract 10
- Introduction 10
- Materials & Methods 11
- Results 11
- Discussion 11
- Conclusion 11

Article 3: Statistical Concepts 12


- Overview of Statistics 12
- Data Representation 16

Article 4: Challenges for Researchers 20
- Funding 20
- Trusting Research 22

Article 5: Common Methods for Measuring Variables 28


- Body Water 28
- Body Composition 29
- Protein Metabolism 33
- Hypertrophy Measurements 36
- Energy Expenditure 37
- Hormones 39
- Muscle Excitation 40
- Strength Testing 40
- Psychometrics 41
- Closing Remarks 43
- References 44

How To Read Research: A Biolayne Guide 2


Article 01

Overview of Research

Science is known as a branch of knowledge or body of truth/facts. Science is based on research. The Oxford University Press defines research as, “the systematic investigation into and study of materials and sources in order to establish facts and reach new conclusions” 1. The term research can have a variety of definitions and meanings depending on the context. There are many different branches of research with diverse focuses. This guide will provide a general understanding of research from a broad perspective and narrow it down to the details that are of specific importance to exercise and nutritional science. It’s important to understand that many of the definitions, topics and concepts that we discuss have multiple definitions and lack clear characteristics. With everything we discuss in this guide, we will provide our best definition and interpretation as we understand it. Our goal with this guide is to provide you with the necessary knowledge and information you need to critically read and interpret scientific publications and their respective findings. Remember, one study doesn’t prove anything. Individual studies are pieces to a much larger puzzle.

Tuckman, 2012 2 characterized the research process by these five properties:

1. Systematic: researchers follow certain rules and parameters when investigating a specific question and designing a research study. This involves identifying variables of interest, designing a study to test the relationships of the variables, and collecting data to evaluate the problem and prediction.

2. Logical: examining the procedures from testing a theory allows for evaluation of the conclusions that are made.

3. Empirical: collecting data.

4. Reductive: evaluating data to establish generalizations for explaining relationships.

5. Replicable: the research process is recorded and described in detail to allow for future studies to test the findings and build future research.

The Scientific Method

You may not remember learning about the scientific method in grade school, so let’s do a quick recap. The scientific method is a formal set of steps that researchers follow to conduct research. The scientific method can be broken up in a variety of different ways, but for the sake of simplicity we will divide the scientific method into these four steps 3.

1. Identifying and developing a problem: all research starts with identifying a problem or topic of interest and defining the study’s purpose.

2. Formulating the hypotheses: a hypothesis is a testable statement of the anticipated results of a study. This is a formal prediction of what will occur when the study is carried out, based on prior results or theory.

3. Gathering data: researchers use processes and validated methods to measure and collect data during the study or experiment.

4. Analyzing and interpreting results: once data is collected from the experiment or study, it is then analyzed using statistical methods to determine the accuracy of the hypothesis. Researchers aim to understand what was found and how it fits within the context of other evidence.

Variables
Variables are factors that can be measured or
manipulated during research. Once the problem is
identified, the variables of interest are determined and
a study is designed to test and measure them.
There are a number of different types
of variables and here we cover the primary variables
that you should know to further understand the
research process.

Independent Variables
Independent variables are what the researcher manipulates
to determine the relationship with, or effect on,
another variable. Independent variables are
also known as the experimental or treatment variable,
input, cause or stimulus. For example, an independent
variable could be the type of diet subjects are following
(e.g., high carb, high fat or low carb). Independent
variables can also have different levels. For example,
if a training study is evaluating high, moderate and
low training volume and muscle hypertrophy, training
volume would be the independent variable with the
different levels being high, moderate and low.

Dependent Variables
Dependent variables are measured following a treatment or stimulus. Dependent variables are known as the output or response variable and they are observed or measured to determine the effect of the independent variable 2. The dependent variable changes as a result of the manipulation of the independent variable. Examples of dependent variables are body composition, strength, resting metabolic rate, blood hormones, etc. If a study is investigating high fat vs. low fat diet and weight loss, weight loss would be considered the dependent variable while the type of diet would be considered the independent variable.

Control Variables
Control variables are factors that could influence the results and are left out of the study 3. Control variables are not a part of a study and are instead controlled by the researcher to cancel out or neutralize any potential effects they may have on the relationship between the independent and dependent variables 2. The caloric intake in a diet study could be viewed as a control variable when comparing two different types of diets.

Extraneous Variables
Extraneous variables are factors that can influence the relationship between the independent and dependent variables but are not identified or controlled in the study 3. This can cause spurious associations between variables. There may be an association between the independent and dependent variables, but it could be due to both variables being affected by a third unknown or uncontrolled variable (extraneous). For example, let’s assume a study is examining differences in weight loss when following a high carb/low fat diet or a high fat/low carb diet and let’s say they don’t equate calories. Because there is no control over caloric intake, it could be an extraneous variable, since it can impact the changes between groups irrespective of the type of diet.
Extraneous variables are usually identified following an experiment when associations between variables have been identified and examined further. They can also be identified by researchers during the study design, but because of lack of resources researchers may be unable to control or account for a specific variable. Other variables known as confounding variables and covariates are similar to extraneous variables and often used synonymously, but are slightly different. Just know that extraneous variables, confounding variables and covariates are additional, unknown variables that weren’t identified or controlled in the study and have some type of impact on the independent and dependent variables.

Types of Research

There are many different types of research to answer different kinds of questions and problems. The different types and categories of research are limitless, so we will discuss the common types that are generally incorporated into exercise and sports science research.

Basic vs. Applied
Research in exercise and nutrition science can be placed somewhere on a spectrum between basic and applied research 3. Basic research is commonly referred to as “bench science”. Basic research is difficult and is generally done in a laboratory under tightly controlled conditions. Basic research operates under scientific theories and often involves animals, but the relevance or direct value to practitioners is limited 3. You can think of this type of research as a scientist in a lab with pipettes and cell cultures, studying underlying molecular mechanisms. In contrast, applied research is limited in the type of control it offers, but it’s much more practical and carries high ecological validity, meaning it applies to real-world settings/conditions. This type of research involves human subjects and is based on common practice and experiences. Comparing different diet and training programs while measuring fat loss or muscle growth would be considered an applied form of research because the comparisons are performed in real-world settings with limited control over the environment.

Quantitative Research
Quantitative research is the most common type of research you will find in exercise and nutrition science. Quantitative research is concerned with numbers and groups; the aim is to determine the relationship between variables 4. The relationships between variables are expressed through statistical analysis (which we’ll cover later). This type of research is objective, tightly follows the scientific method and seeks to determine a cause and effect. Studies that are classified as quantitative research can be further classified into two different study types known as experimental and descriptive (observational).

Experimental - Experimental research involves the manipulation of treatments or interventions. The aim of experimental research is to establish cause-and-effect relationships and commonly utilizes some form of randomization (discussed below) 3. Experimental studies require diligent control over variables and other factors that may impact the outcomes of a study. Experimental studies are also known as longitudinal or repeated-measure studies 4. Experimental studies measure subjects before and following treatments or interventions. This type of research aims to explain phenomena through controlled manipulation of variables and is commonly viewed as the ‘gold-standard’ for research.

Descriptive - Descriptive research is also known as observational research and measures things as they are without intervening 4. There is no attempt to change or modify certain behaviors. This type of research doesn’t attempt to determine cause and effect (although many media outlets and even researchers are guilty of attempting to infer causation from these results) and instead characterizes phenomena as they exist. This type of research is less controlled and utilizes questionnaires, interviews and observation.
Qualitative Research
Qualitative research is concerned with words and individuals. Qualitative research is more subjective and seeks understanding of multiple realities/truths and requires constant comparison and revision. Qualitative research rarely develops hypotheses prior to the study and instead uses more general questions to guide the study 3. Qualitative research has been gaining interest in the field of exercise science and is now being included more frequently. This type of research has been historically used in social sciences like psychology, sociology, and anthropology 5. This type of research is concerned with attitudes, beliefs, motivation and perception, all of which are becoming popular in the field of exercise science and sports medicine. Qualitative research is frequently used to evaluate community and school physical activity programs to understand the less tangible outcomes like the participants’ attitudes and experiences about a program of interest 5. Qualitative methods of data collection can include open-ended questionnaires, interviews or market research focus groups 5.

Study Designs

Animal models
Animal model research commonly includes rats or mice as subjects to perform more intensive and controlled experiments. Other species are included in various types of research and many debate the ethical considerations associated with this design. Nevertheless, humans share many anatomical and physiological similarities with different animals, which allows investigation into underlying mechanisms. Animal models allow for testing of novel therapies before applying to humans, although not all results can be directly translated to humans 6.

Controlled Trials
Controlled trials include a group that does not receive a specific treatment or intervention. This is called the control group and either receives nothing at all or a placebo.

Placebo-Controlled - When one of the treatments is inactive and does not produce any impact or effect on any of the variables it’s considered a placebo. Placebo-controlled trials can be single or double blinded. Single blinded trials are when the subjects are blinded to the type of treatment they’re receiving. In other words, they don’t know if they’re getting the active or inactive treatment. This is done to control for the placebo effect. If subjects believe one treatment is more or less effective than the other it can actually cause a psychosomatic change to occur irrespective of the treatment itself. Double-blind trials include the subject and the researchers being blinded to the treatments and when done properly researchers are blinded to the statistical analysis as well.

Randomized Controlled Trials - Randomizing participants to groups can reduce the risk of researcher bias on the outcomes of interest and assumes both groups to be similar. This type of study design is of the highest quality because it tightly controls for factors and variables that could influence the results, regardless of the effectiveness of the treatments or interventions.

Crossover Designs
This type of study design includes both groups receiving both treatments at different times. For example, group 1 may receive treatment 2 and group 2 may receive treatment 1. After a specified time period, treatments are switched to the other group. These studies are unique in that each subject is able to be used as their own control since every subject receives each treatment.

Case Studies
Case studies observe and report data on one participant (n = 1). Case studies provide an in-depth and detailed analysis that can assist in developing theories, evaluating programs, and developing interventions 7. Case studies lack a specific intervention or treatment and instead observe and control testing procedures. This type of study design is generally categorized as a type of quantitative, descriptive study, but can be used in qualitative research as well 7. This type of study design has gained popularity in physique athlete research because it’s hard to recruit that type of population and implement an intervention that they’re willing to follow. Case studies have also been widely utilized in fields such as medicine, psychology, counseling and sociology 3.

Cohort Studies
This is a type of longitudinal study that investigates a certain sample of people that share defining characteristics. This type of design can be experimental or observational depending on how it is applied.

Types of Publications

After a scientific study is conducted, analyzed and written, it’s then submitted to a journal for peer review. Peer review involves one or more professionals or experts within the same field critically evaluating the submitted manuscript. Reviewers can choose to simply reject the paper after reading it or suggest revisions for the authors to complete before the paper can be accepted. The peer review process is not perfect by any means, but it provides a form of regulation to maintain the quality and integrity of the scientific literature and ensure the study is suitable for publication. Different journals have minor differences in their rules and regulations. They also vary in the way their publications are formatted, while following a general template. All scientific journals have what’s called an impact factor. The higher the impact factor of a journal, the higher the quality and therefore, higher quality studies are published in those journals. The impact factor is calculated based on the number of citations received by the articles published in that journal. There are a number of different types of scientific publications, but here we briefly describe the primary types you’ll encounter.

Original Research
Original research is a standard peer-reviewed publication, what you would consider to be a published scientific study. This type of publication follows a

general format including an introduction, methods, results, discussion and conclusions. Original research is considered a primary source and includes data and results that have not been published previously.

Narrative (Literature) Review
Narrative reviews are considered secondary sources and provide a review and general consensus on a specific topic. Authors collect relevant, primary source articles relating to a specific topic and provide a summary of the most current and relevant evidence pertaining to that topic. Narrative reviews are different from systematic reviews in that they are based on the opinion of the authors and lack strict control over which studies to include in the review. You can think of these as an opinion-based article including a collection and summary of original research. These can be helpful when trying to understand concepts, theories or a body of evidence regarding a specific topic but be careful accepting them as truth since it’s only the opinion of the researchers who wrote it. These reviews can be subject to confirmation bias and cherry picking studies that fit their narrative.

Systematic Reviews
The main purpose of systematic reviews is to create generalizations by integrating empirical research 8. Systematic reviews attempt to answer a specific research question and use a systematic process to collect relevant data sources and synthesize the empirical findings. Systematic reviews address relevant theories, critically analyze the data of the included studies, attempt to resolve conflicting evidence on a topic and identify central issues for future research 8. Systematic reviews are a superior form of a literature review because they use a systematic process to collect, evaluate and synthesize the data on a particular subject. Commonly thought to be the same thing as a meta-analysis, systematic reviews differ in that they don’t use any formal statistical methods to analyze the combined data of studies, they simply summarize the empirical evidence.

Meta-Analysis
Meta-analyses include the results of two or more studies. Meta-analyses were first introduced in 1976 by Gene Glass and defined as “a technique of literature review that contains a definitive methodology and quantifies the results of various studies to a standard metric that allows the use of statistical techniques as a means of analysis” 3. Meta-analyses can be distinguished from literature reviews because they include a definitive methodology for including specific studies in the literature analysis, and the results of various studies are quantified to a standard metric called effect size (which we will cover later) 3. Different from systematic reviews, they use various statistical methods to combine and analyze the data of a number of studies. Meta-regressions are an extension of meta-analyses and include a more effective and advanced statistical tool to assess the relationships between variables. Meta-regressions account for covariates or other study characteristics of interest. When carried out properly, meta-analyses are considered the highest quality of scientific study.
Article 02

Reading and Interpreting Research

Reading research can be a challenging task for those who are not experienced and educated to read scientific publications. Before being able to interpret results and findings from research, it’s necessary to understand the layout and how to read a study. Most peer-reviewed journal publications follow a similar and general format, with minor differences. Understanding the general layout of publications will make it easier to identify key details of studies and understand the findings and takeaways. This section of the guide focuses on how to read scientific studies and interpret their findings. After we cover the general layout and briefly describe each section of a published study, we will cover basic statistics and dig into challenges faced by researchers in exercise and nutritional science. We will finish this section with how to trust studies and evaluate studies reporting conflicting findings.

General Format

The author line of publications follows a specific order. The first author is the one who coordinated and had the largest role or responsibility in the study. Generally, if this is a graduate student’s project or thesis their mentor or supervisor will be listed last. The remaining order of authors will be based on their level of contribution. The general format for peer-reviewed, academic publications includes five sections known as the introduction, methods, results, discussion, and conclusion. The abstract is another section, but it is separate from the actual publication.

Abstract
After the study title and author line you will find the abstract. The abstract is a paragraph summary of the study. The abstract includes one to two sentences from each of the sections of the publication. Don’t be an abstract warrior and only read the abstract to report what the study found. The details are important, and findings are accompanied with caveats.

Introduction
The introduction is the first section of all publications. The introduction includes a discussion of recent and previous studies that relate to the current study of interest. Intros start with more general background information and progress into key details and publications that apply to the current study. The intro also discusses any controversies between theories or hypotheses and highlights the importance of the current study. The intro includes two key pieces of the study known as the purpose and the hypothesis:

Purpose - The purpose of the study is a one- to two-sentence statement that describes the aim or the reason why the study is being carried out.

Hypothesis - Based on previous research and understanding researchers develop what’s known as a hypothesis, a short explanation of the predicted results. Hypotheses cannot be proven, but when the data backs up the hypothesis it is “supported” and when it doesn’t it’s “rejected” 10.

Materials & Methods

The Materials & Methods (methods) section is where the study design is explained, detailing the procedures for each measurement during experimentation. Methods provide specific details of how the experiment was carried out so that future research can attempt to replicate and build on previous results. Key details that you want to focus on are:

• Variables of interest: what did the researchers manipulate and have control over (independent), and which variables were tested or measured (dependent).

• Participants: how many people were studied and what were their characteristics. Were they male? What was their training status? Were they overweight?

• Duration of the study: how long did the experiment occur and how often did they observe and measure changes?

• Instrumentation: which devices and methods were used to collect data. How was body fat percentage (BF%) tested? Did they use appropriate equipment for what they were attempting to test? Were their measurements valid and reliable?

• Level of control: were the participants in a tightly controlled environment (metabolic ward) or was this a free-living experiment? Resistance training studies that include supervision are more tightly controlled than studies that allow subjects to train on their own. Diet studies that provide food to subjects have more control than studies that rely on self-reported nutritional intake. Ethical and diligent researchers will specify their study’s strengths and limitations in the discussion but paying close attention to the details in the methods will allow you to identify the level of control in a particular study.
Results
This is the section of publications that most people skip over or shy away from because most people find math and numbers confusing. Later we will provide a brief and general overview of statistics to help with your confidence and ability to interpret results. In the results section researchers report the outcomes of the statistical tests that include the relationships between data from experimentation 2. The results section also includes the majority of figures and tables that represent the data in a different way than reported in the text. The results section is written so that readers can interpret the data from only reading the text and the figures are designed to represent the data in a way that allows for interpretation without having to read the results section. The results section does not include any of the researcher’s interpretation or explanation of the data; that occurs in the discussion section. There are three types of effects generally (not always) reported in the results section that you should focus on. For the following sections we will reference this table for an example:

Changes in Bodyweight between a high carb and high fat diet.

Diet Group     Baseline     Post-Testing
High Carb      200 lbs      180 lbs
High Fat       190 lbs      175 lbs

Main Time Effect - This simply explains if there was a significant change in the dependent variable from baseline to post-testing for all subjects. Referring to the table above, this will tell us if there was a change in body weight from baseline to post-testing for both groups (high carb & high fat) combined.

Group Effect - This tells us if there was a significant change within a group. It does not compare groups, but rather tells us if a group made a real change. For example, this would tell us if the high carb group experienced a significant change from baseline to post-testing.

Interaction (group x time) Effect - An interaction effect is what you want to focus on if you wish to compare groups. This compares the rate of body weight change from baseline to post-testing between dieting groups. In other words, did the high carb group lose more body weight from baseline to post-testing or did the high fat group lose more body weight from baseline to post-testing.

Discussion
Like the intro, the discussion is a heavier section where researchers provide their interpretation and explanation for the results they found. There is no general format for this section, but it includes an in-depth summary of the results from the study that was conducted. The majority of the discussion is focused on comparing and contrasting the results of the conducted study to what has been previously reported by similar studies. The discussion and intro are good places to learn about other studies you might not have known about. Towards the end of the discussion you’ll generally find a disclosure of the strengths and limitations of the study. Every study has limitations and if a study doesn’t explicitly mention its primary limitations, that could be a red flag. Some publications also include a conclusion within the discussion section, but some journals may include a separate section for conclusions or practical recommendations.

Conclusion
Everyone knows what a conclusion is, but in this short section authors give a final summary of the main takeaways and practical recommendations. This is a more concise version of the discussion, short and practical.
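
To make the three effects from the Results section a little more concrete, here is a minimal Python sketch using only the group averages from the bodyweight table. It is an illustration, not a statistical test: whether a change is "significant" depends on each subject’s individual data, which the table doesn’t give us. The variable names are made up for this example, and the combined change assumes both groups have the same number of subjects.

```python
# Group averages from the example bodyweight table (lbs).
# Illustration only: real studies test these effects with each subject's data.
means = {
    "High Carb": {"baseline": 200, "post": 180},
    "High Fat": {"baseline": 190, "post": 175},
}

# Change within each group (what the guide calls the group effect).
within_group = {
    group: values["post"] - values["baseline"] for group, values in means.items()
}

# Main time effect: change for both groups combined.
# Assumption: equal group sizes, so a plain average of the group means is fair.
baseline_all = sum(v["baseline"] for v in means.values()) / len(means)
post_all = sum(v["post"] for v in means.values()) / len(means)
time_change = post_all - baseline_all

# Interaction (group x time): did one group change more than the other?
interaction = within_group["High Carb"] - within_group["High Fat"]

print("Change within each group:", within_group)
print("Change for all subjects combined:", time_change)
print("Difference in change between groups:", interaction)
```

In this example the high carb group dropped 20 lbs and the high fat group dropped 15 lbs, so the 5 lb gap between those two changes is the number an interaction effect would be testing.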

Article 03

Statistical Concepts

Overview of Statistics
Most people cringe at the word statistics and we understand why. Math and statistics can be complex and difficult to understand. There are various meanings for the word statistics, which adds to its confusion. With a mixture of math and logic, statistics is a branch of mathematics that is concerned with collection, analysis and interpretation of data 3. Data are scores and values that we obtain from measuring the outcomes (dependent variables) of interest in a study. Collecting data is only one piece of the puzzle. If researchers don’t know what to do with the data and how to properly describe the data, then the findings may seem underwhelming. Statistics are a way of describing data characteristics and examining the relationships between variables, which allows for greater objectivity when interpreting research and drawing conclusions. This section provides a simple overview of some common and basic statistical concepts that you will encounter throughout exercise and nutrition research. Again, this is a brief section and doesn’t even scratch the surface of the broader and more complex statistical methods that exist. Statistics operate under a number of assumptions and rules, and if these are violated, they can misrepresent the data. Statistics is not our area of expertise and it’s important to realize that if you don’t fully understand statistics they can be misused to deceive people into believing the data is more appealing than it actually may be.

Percent Change
Very simply, this is the change between two values expressed as a percentage. You have to be careful with percentage change because it can sometimes appear to be a greater change than it actually is. That’s why you also want the raw or true values. For example, if a study is looking at leptin changes and they have a baseline value of 0.3ng/mL and a post-test value of 1.0ng/mL, the absolute change is 0.7ng/mL, but the percentage
change is 233% [(1.0 − 0.3) / 0.3 × 100]. While this change is minimal and may not be meaningful, the percentage change can make it appear as if it’s a big deal.

Central Tendency
The mean is probably one of the most commonly understood mathematical terms. The mean describes the average value of a group of numbers. In statistics, the mean is a measure of central tendency, which represents a central or balance point within a set of data 10. The mode and median are similar to the mean because they represent centrality, but technically they’re slightly different. Mode refers to the most frequent value that appears in a data set, which may or may not be close to the mean. Median refers to the middle point of a data set; in other words, 50% of the scores will fall under the median. For example, let’s assume the following 10 scores were collected during an experiment:

6 6 6 10 11 12 14 14 16 17

Mean = 11.2 The average of all scores
((6+6+6+10+11+12+14+14+16+17) / 10)

Median = 11.5 Middle value
(5 scores below and 5 scores above this value)

Mode = 6 Most frequent score

If the data set had an odd number of values, then the middle value is simply the median (ex. 1, 2, 3; 2 would be the median). Just remember there are slightly different ways to describe central tendency, but most often you’ll hear about the mean since mode and median are only reported for certain instances. When evaluating data based on calculated means it’s important to identify any outliers or extreme values in the data. Outliers and high variability of data can produce inflated or misleading results because the mean is sensitive to outliers and extreme values. In contrast, the median is not sensitive to outliers and extreme values, meaning the median changes very little when a few extreme scores widen the spread of the data. If the mean is being reported it’s important to also take note of the standard deviation to account for this.

Standard Deviation
The standard deviation is concerned with the variability or the spread of a data set. As previously mentioned, the mean is the central point of a data set and the standard deviation is an estimate of the variability around that central point. In other words, the standard deviation represents the typical amount that a score deviates from the mean. When the standard deviation is low, the spread or dispersion of scores is small and the scores are more tightly grouped around the mean. When the standard deviation is large it signifies a wide spread or high variability of scores; when this occurs the mean may not be a good representation of the data. The mean and standard deviation are forms of descriptive statistics, which are useful for summarizing the data of a specific group. Meaning, they are only able to describe the data we have accrued; they cannot tell us if the results we acquired will happen again. Other statistical tests fall under another form known as inferential statistics, which can allow (not always) for conclusions and generalizations of a sample to the larger population.

P-value
Probability, the likelihood that something will occur, is the underlying concept of p-values. P-values reflect the level of significance: the probability of obtaining findings at least as extreme as those observed if chance alone were at work. It’s impossible to have a p-value of exactly 0 3. In exercise science the p-value is considered to be ‘significant’ at p < 0.05. Meaning, if there were truly no effect, findings like these would be expected by chance fewer than 5 times in 100, so researchers treat the observed differences as a real change. In the results section, when changes of a specific variable are reported there is a p-value reported after (e.g., 103.5 ± 15.1 ng/dL (*p* = 0.02)). In exercise and nutritional science, if the p-value is greater than 0.05 the result isn’t deemed to be significant. This is also stated as ‘failing to reject the null hypothesis’ (often loosely described as ‘supporting’ it). The null hypothesis states there isn’t a relationship or difference and instead the findings are due to sampling error or random chance. Statistical tests are performed to either retain or reject the null hypothesis, and a p-value less than 0.05 rejects the null hypothesis in favor of the research hypothesis. Statistical significance is what you should identify when interpreting results, but significant differences aren’t the only thing you want to focus on. A study might show that subjects on one type of diet lost significantly more weight than subjects on another type of diet, but what if it was only by 0.5 lbs? That doesn’t mean much, but how do you determine if significant results are meaningful? While p-values provide statistical significance, effect sizes allow researchers to communicate practical significance of their results 11.

Effect Size
The effect size reflects the meaningfulness in the changes that occurred during an experiment. While the p-value tells us if there was a statistically significant change, the effect size tells us the magnitude of that change. In other words, effect sizes tell us the magnitude of a relationship between two variables 8. Effect size is an absolute value that represents the standardized difference between two means 3. Effect sizes are frequently used in meta-analyses to compare results between different studies. Effect sizes have been considered as the most important result of empirical studies because they are useful for providing the magnitude of effects in a standardized metric despite differences in measurement techniques, they allow for meta-analytic conclusions, and they are commonly used for future study planning using a power analysis 11. Effect sizes can be interpreted based on recommendations by Cohen (1988), which state that effect sizes can range from small (d = 0.2) to medium (d = 0.5) to large (d = 0.8) 12. Larger effect sizes are more meaningful. Effect sizes are also commonly used to plan future studies by predicting the sample size needed to detect a difference; this type of test is known as a power analysis.

Power Analysis
Statistical power relies on the effect size, the significance criterion (generally p < 0.05) and the number of subjects (sample) in a study 11. When researchers are planning a study, they want to know how many subjects they will need to detect a significant difference between treatments or groups. To accomplish this they perform what’s called an ‘a priori power analysis’, which includes using effect size estimates from similar research, the significance criterion of p < 0.05 and a generally accepted minimum level of power (0.80) to calculate the minimum sample size needed to observe an effect of a specific size 11, 12. Researchers could also use the sample size, significance criterion and power to calculate the minimal detectable effect size.
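To make the link between effect size and sample size more concrete, here is a minimal Python sketch. It is only an illustration: the function names `cohens_d` and `n_per_group` are our own, the effect size uses a pooled standard deviation, and the sample size uses a common normal-approximation shortcut for a two-sided comparison of two equal groups. Dedicated power analysis software uses the t-distribution and usually returns a value one or two subjects higher.

```python
import math
from statistics import NormalDist, mean, stdev


def cohens_d(group_a, group_b):
    """Standardized difference between two group means, using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate subjects needed per group (normal approximation, two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance criterion
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Cohen's benchmarks for small, medium and large effects
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} subjects per group")
# d = 0.2: about 393 subjects per group
# d = 0.5: about 63 subjects per group
# d = 0.8: about 25 subjects per group
```

Notice how quickly the required sample size climbs as the expected effect gets smaller. Under these assumptions, detecting a small effect takes roughly fifteen times as many subjects per group as detecting a large one.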

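If you want to check the descriptive statistics from earlier in this article yourself, the short Python sketch below reproduces the leptin percent change and the mean, median, mode and standard deviation of the 10 example scores using only the standard library. The helper name `percent_change` and the extra score of 60 (added to show what an outlier does) are our own illustrative assumptions, not values from any study.

```python
from statistics import mean, median, mode, stdev


def percent_change(baseline, post):
    """Change between two values expressed as a percentage of the baseline."""
    return (post - baseline) / baseline * 100


# Leptin example: 0.3 ng/mL at baseline, 1.0 ng/mL at post-test
print(round(percent_change(0.3, 1.0)))  # 233

# The 10 example scores
scores = [6, 6, 6, 10, 11, 12, 14, 14, 16, 17]
print(mean(scores), median(scores), mode(scores))  # 11.2 11.5 6
print(round(stdev(scores), 2))  # 4.16 (sample standard deviation)

# Hypothetical outlier: the mean jumps, the median barely moves
with_outlier = scores + [60]
print(round(mean(with_outlier), 2), median(with_outlier))  # 15.64 12
```

The last line is the point to remember: one extreme score pulled the mean from 11.2 to about 15.6, while the median only shifted from 11.5 to 12.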


Correlation Coefficient (r)
Most of you have probably heard the saying, “correlation does not equal causation”. Correlation is an association, and in research we often want to know the degree of association between two variables across a group of subjects. In other words, an increase or decrease in one variable may occur with an increase or decrease in another variable, but the changes in one variable are associated with (not caused by) changes in the other variable. There are different types of correlations used in statistics, but here we discuss the r-value, also known as the ‘Pearson product moment coefficient of correlation’.

The correlation coefficient is a statistic used to describe the relationship between two variables (independent & dependent). The r-value can range from -1 to +1. A negative r-value represents an inverse relationship between two variables and a positive r-value indicates a direct relationship (we’ll show you this visually in the ‘data representation’ section). For example, a decrease in body weight is commonly associated with a decrease in leptin; this would be an example of a direct relationship (+r), whereas a decrease in body weight is commonly associated with an increase in ghrelin; this would be considered an inverse relationship (-r). An r-value of 0 indicates no relationship and an r-value of 1 indicates a perfect correlation; however, it is likely impossible to achieve a 0 or 1 due to the variability in subject responses and other influences related to physical characteristics, traits or abilities 10. In the scatterplot section below, we will provide a visual explanation for the strength and relationships of correlations. It is not uncommon to evaluate the strength of correlation on a spectrum (0.1 – 0.3 = weak, 0.3 – 0.5 = moderate, 0.5 – 1 = strong) 2. However, some statisticians advise against this practice because correlation is context specific 10. For example, biological in vitro experiments could commonly consider a 0.9 to be a strong correlation and correlations close to 0.5 to be much weaker, whereas free-living experiments could consider a 0.6 to be a strong correlation. Regardless, it’s important to remember that the closer the r-value is to -1 or +1, the stronger the correlation is.

Coefficient of Determination (r²)
You will also encounter a statistic known as the coefficient of determination (r²). This is commonly used with regression analysis and can be conceptualized as a ‘correlational effect size’. It provides a percentage of variance in one variable (dependent, outcome variable we want to predict) that can be accounted for by the variance in the other variable (independent, predictor variable) 3. By squaring the r-value you obtain R-squared (r²), which can be expressed as a percentage. For example, if we wanted to predict how much leptin (dependent) would decrease as fat mass (independent) decreased in a group of people dieting we would use the coefficient of determination. Let’s assume we obtained an r-value of 0.76 and you square it (r² = 0.76² ≈ 0.58 = 58%); the percentage signifies that 58% of the changes in leptin are predicted or explained by changes in fat mass. If there was a regression line calculated and drawn on a scatterplot with leptin (y-axis) and fat mass (x-axis), that line would account for roughly 58% of the variability in leptin. It does not mean that 58% of the data points would fall on the line.

T-test
The statistical test used to compare the differences between two means is known as a t-test. The larger the t-value, the greater the difference between means relative to the variability in the data, and larger t-values are likely to produce lower p-values. There are two types of t-tests we want to focus on.

Independent - This type of t-test determines whether two sample means are significantly different when the two groups being compared are unrelated. For example, if a study randomized 20 subjects to a high carb diet and 20 subjects to a high fat diet and you wanted to know the extent to which the mean weight loss differed between groups, you would perform an independent or unpaired t-test. This type of t-test could also be used to determine how different the two dieting groups were in terms of body fat percentage at baseline since baseline differences can pose problems.

Dependent - Dependent or paired samples t-tests are used when comparing two groups that are related in some way or one group at multiple points in time (baseline and post-test: repeated measures). For example,
if a study was measuring muscle thickness in 10 males before beginning a training program and then again following a training program, they would use a dependent t-test to evaluate the difference between the mean muscle thickness at baseline and post-testing. If there are more than two groups we use a different statistical test.

Analysis of Variance
If there are more than two means/groups we wish to compare, we need to perform an extension of a t-test known as Analysis of Variance (ANOVA) 10. The score that is generated from running an ANOVA is known as the F-value (similar to a t-value) and indicates the size of group mean differences.

One-Way - A one-way ANOVA is used to determine if statistically significant differences exist between 3 or more means/groups. For example, let’s assume a study is comparing training volume with three groups (low, medium, high) and the dependent variable of interest is muscle growth. The ANOVA would tell us whether a difference exists somewhere among the three groups. However, a one-way ANOVA fails to tell us where the significant difference in muscle growth is (low vs. medium, low vs. high, or medium vs. high). You could evaluate this unofficially by examining group means, but to statistically test where the difference is, you will have to perform a post-hoc test.

Repeated Measures - You will frequently encounter repeated measures ANOVA in the statistical analysis section of many exercise science studies. This statistical test is used to compare the same individuals across time points (repeated measures). For example, let’s assume a study is comparing muscle growth at the beginning, middle and end of a training program. You would run repeated measures ANOVA to determine if there were significant changes between time points.

Post-Hoc - If an ANOVA detects a statistical difference between means, we then want to determine where this significance lies. Is it occurring within a group over time (baseline to post-test) or did one group exhibit a greater difference compared to the others (interaction)? There are a number of different types which we won’t cover here, but just know that this gives a more specific idea of differences between means/groups.

Data Representation

Figures, graphs and tables are used to represent data visually, which can provide a unique perspective and greater understanding of the results. There are tons of different types of figures available, but we’ll talk about a few common types you’ll often see. There are a couple of key points we want to make regarding most figures and graphs. Different journals will have varying formatting requirements, but you can expect some components to be the same. Underneath the actual figure there will be a title and a description of what the figure is displaying. You will also find any special symbols (e.g., *) defined here; generally these symbols are used to represent statistical significance or depict a relationship between variables. It’s important to take notice of the axis titles, units of measurement and the scale that is used. There are instances when the scale of a figure doesn’t start at 0 and this can lead to misunderstanding of the actual data. If a graph or figure scale doesn’t start at 0 there should be some type of break expressed with two dashed lines (//) to represent a nonzero baseline. Generally, graphs are better for providing a general overview or “big picture” view of a set of data, whereas tables are better for exact values and individual raw data.

Histogram
Histograms are a common figure and generally the easiest to understand. Histograms are great when comparing groups or the distribution of a set of scores for a particular group. Most people would also consider or refer to these figures as “bar charts”. However, there’s a slight difference. Bar charts are used for qualitative data that are separated into categories (e.g., gender, race, other specific groups) and the bars are separated and not touching each other. Histograms have vertical bars that are directly adjacent to one another with no
space (unless there’s an interval with no scores), signifying continuity 10.

Scatter Plot
Scatter plots are another type of graph that most people are familiar with. This type of figure commonly reports data points for individual scores for two variables but could also be used to display baseline and post-test scores for an individual 13. You’ll find this type of figure is used most for correlational analyses and while the data points are not connected by lines, a non-vertical line of fit can be generated to summarize or predict the relationship between variables or data points, known as simple regression 13. Simply by looking at scatter plots we can get a pretty good idea of the type of correlation and its strength.

Line Graph
Line graphs depict related data points that are connected with a line; sometimes they include symbols [13]. Line graphs are great when comparing time trials where there are multiple testing points over a period of time. For example, comparing the response of two different supplement treatments over a predefined period of time.

Box and Whisker Plots
Box and whisker plots (box plots) are used to depict the distribution of a data set. Once you understand each component of a box plot, you’ll realize how simple and effective they can be at summarizing a set of scores. Usually box plots are vertical, but we have provided an example below that is not to scale.

There are 5 elements in all box plots that you want to know to understand this type of visual depiction:

1. Q1: This is the first side of the rectangle and signifies the 25th percentile of the data set. Meaning, 25 percent of the scores fall under this line.

2. Median: The median (as described previously) is the middle value and 50% of the scores fall under this value.

3. Q3: This is the right side of the rectangle and represents the 75th percentile, meaning 75% of the scores fall below this value.

4. Whiskers: The whiskers can be found on either
side of the rectangle and depict the minimum and maximum values within a set of scores. However, these do not include any outliers or extreme values.

5. Outliers & extreme values: Outliers and extreme values are any scores or values that are widely different from the rest of the data set and “stick out”. There’s actually a mathematical way to determine these for a box and whisker plot, but we’ll spare you the details. Just know they are represented by the O and E below and can be expressed as other special symbols in different publications.

Forest Plots
You will mostly see forest plots in Joe Rogan podcasts with James Wilks… just kidding. You typically see forest plots in meta-analyses because they depict the individual results as well as the pooled results of the meta-analysis. Forest plots will indicate the strength of the treatment effect with the y-axis containing a list of the studies included in the analysis and the x-axis will have a distinction of what the studies favor (control vs. treatment) 13. Each study will have its mean symbolized as a data marker and its respective confidence interval (we will cover next, but generally 95%) represented as a horizontal line 13. The size of the data marker generally represents the sample size, or the weight carried by that particular study in the meta-analysis. Diamond markers are generally used to represent the overall or pooled result 13. In the example below adapted from Morton et al. (2017), you will find three different diamonds 14. The first two unfilled diamonds represent the pooled results of trained vs. untrained samples and the filled in or dark diamond represents the overall or total results of the meta-analysis (including trained and untrained subjects). Oftentimes forest plots will contain a clear description of what each marker symbolizes underneath the actual figure.

Error bars
Elements that you will commonly see on most figures are error bars. Error bars are lines that represent the variability of the data being reported. There are different types of error bars and if the legend or description under the figure doesn’t explicitly state what kind they are they can be rather meaningless 25. The standard deviation (SD) bars represent the typical difference between the data points and their mean, whereas standard error (SE) bars indicate how variable the mean will be if you repeat the study over and over, and more subjects or samples decrease the SE 15. You’ll notice in the forest plot above that they included 95% CI error bars, which indicate a range expected to contain the true mean on 95% of occasions 15. SE and CI with wider bars indicate larger error and shorter bars indicate higher precision; as sample sizes increase the bars become shorter 15. Error bars are helpful in visually depicting the significance in changes between groups. As a rough guide, when error bars overlap the difference often isn’t significant, or in other words, the larger the gap between error bars the smaller the p-value will be. Error bars can be valuable in justifying the authors’ conclusions, but like any statistic they are only a guide and you should rely upon your own logic and understanding to determine the meaningfulness in the results being reported 15.

Tables
Tables are generally self-explanatory and describe the different symbols in the figure legend/description below. This table is from Layne’s PhD thesis where they examined the time course of plasma amino acid levels in response to ingestion of various protein sources
[63]. What’s important to notice here is how the statistics are portrayed. The first number is the mean for the particular group under the designated time category and the second number is the standard error associated with the mean. The letters after the standard error are used to statistically differentiate the means from each other, while the means with an * indicate that they are different from the baseline levels. For example, let’s compare the 30-minute whey group leucine (Leu) levels to the 30-minute wheat group leucine levels. The whey group has an ‘a*’ whereas the wheat group has a ‘b*.’ This indicates that these values are statistically different from each other (different letters) and both are significantly different than baseline (because they both have a *). However, let’s look at threonine (Thr) levels in the Whey, Wheat, and Wheat + Leu groups at 90 minutes. The Whey group at 90 minutes has an ‘a*’, while the Wheat group has a ‘b’, and the Wheat + Leu group has an ‘ab.’ So what does this mean? It means that the Whey group is statistically different from the Wheat group and from baseline. Also, the Whey group was not statistically different from the Wheat + Leu group since they both share an ‘a.’ The Wheat + Leu group was also not different from Wheat since they both share the letter ‘b’ and they weren’t significantly different from baseline.

Post-prandial changes for plasma amino acids (see notes 1-3)

| Amino acid | Baseline | Whey 30 min | Whey 90 min | Whey 135 min | Wheat 30 min | Wheat 90 min | Wheat 135 min | Wheat + Leu 30 min | Wheat + Leu 90 min | Wheat + Leu 135 min |
|---|---|---|---|---|---|---|---|---|---|---|
| Leu | 86 ± 4 | 226 ± 17 a* | 164 ± 26 a* | 173 ± 22 a* | 151 ± 8 b* | 86 ± 6 b | 99 ± 5 b | 211 ± 8 a* | 137 ± 8 | 148 ± 3 a* |
| Ile | 69 ± 2 | 116 ± 11 a* | 104 ± 4 a* | 134 ± 16 a* | 110 ± 6 b* | 66 ± 3 b | 86 ± 6 b | 98 ± 6 b* | 60 ± 4 b* | 67 ± 1 c |
| Val | 117 ± 5 | 234 ± 17 a* | 161 ± 5 a* | 186 ± 19 a* | 154 ± 8 b* | 91 ± 3 b | 104 ± 7 b | 131 ± 18 b | 77 ± 6 b* | 77 ± 2 c* |
| Lys | 608 ± 24 | 1083 ± 78 a* | 593 ± 34 | 688 ± 62 | 930 ± 64 * | 553 ± 28 | 698 ± 14 | 933 ± 67 | 597 ± 55 | 726 ± 48 |
| Met | 49 ± 2 | 102 ± 6 a* | 62 ± 2 a* | 80 ± 5 a* | 72 ± 3 b* | 42 ± 1 b | 52 ± 3 b | 71 ± 3 b* | 44 ± 4 b | 46 ± 2 b |
| Thr | 309 ± 9 | 594 ± 73 * | 567 ± 18 a* | 554 ± 38 a | 383 ± 21 | 330 ± 18 b | 314 ± 22 b | 387 ± 12 | 382 ± 20 ab | 308 ± 13 b |

1 Plasma amino acids expressed as µmol/L.
2 Data are mean ± SE; n = 5-6. Means without a common letter differ between treatments within time-points, P < 0.05. * Indicates different from fasted (P < 0.05).
3 12 h food-deprived controls.
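The letter and asterisk notation in this table follows a simple rule that can be written out in a few lines of Python. This is only a sketch of the reading rule described above: the function names are our own, and treating a mean with no letter as "no between-treatment comparison reported" is our assumption rather than something stated in the table notes.

```python
def differ(label_a, label_b):
    """True if two means share no letter, i.e. they differ between treatments.

    Returns None when either mean carries no letter (assumption: no
    between-treatment comparison is reported for unlabeled means).
    """
    letters_a = {ch for ch in label_a if ch.isalpha()}
    letters_b = {ch for ch in label_b if ch.isalpha()}
    if not letters_a or not letters_b:
        return None
    return letters_a.isdisjoint(letters_b)


def differs_from_baseline(label):
    """An asterisk marks a mean that differs from the fasted baseline."""
    return "*" in label


# Threonine at 90 minutes, labels copied from the table
thr_90 = {"Whey": "a*", "Wheat": "b", "Wheat + Leu": "ab"}

print(differ(thr_90["Whey"], thr_90["Wheat"]))         # True
print(differ(thr_90["Whey"], thr_90["Wheat + Leu"]))   # False (share 'a')
print(differ(thr_90["Wheat"], thr_90["Wheat + Leu"]))  # False (share 'b')
print(differs_from_baseline(thr_90["Whey"]))           # True
```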

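Before leaving the statistics, here is a small Python sketch of the correlation coefficient (r) and coefficient of determination (r²) discussed earlier in this article. The fat mass and leptin numbers are made up purely for illustration, and `pearson_r` is our own helper written from the standard Pearson formula, not code from any cited study.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical changes in fat mass (kg) and leptin (ng/mL) for six dieters
fat_mass_change = [-1.0, -2.5, -3.0, -4.2, -5.1, -6.0]
leptin_change = [-0.4, -0.6, -1.5, -1.1, -2.4, -2.0]

r = pearson_r(fat_mass_change, leptin_change)
print(f"r = {r:.2f}, r squared = {r ** 2:.2f}")

# The worked example from the text: r = 0.76
print(round(0.76 ** 2, 2))  # 0.58, i.e. about 58% of the variance explained
```

Because both variables fall together in this made-up data, r comes out positive (a direct relationship). Flipping the sign of one variable would flip the sign of r but leave r² unchanged.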


Article 04

Challenges for Researchers

Research critics will often complain about studies not performing a specific measurement or failing to account for some variable. Oftentimes these criticisms are invalid or unwarranted because of the limits imposed on researchers. Armchair scientists who unfairly criticize studies for certain aspects oftentimes fail to recognize the challenges that researchers in nutrition and exercise science face. Depending on the academic institution, labs and universities vary widely in the equipment and funding they have available for research. Obviously, larger labs with graduate and postdoctoral programs are able to attract larger grants and more funding for projects, which leads to more sophisticated testing instruments and a higher level of control over testing conditions. While there is growing interest in exercise and nutritional sciences, which leads to more funding sources, there are still studies that can’t be conducted due to lack of resources.

Funding
The primary challenge for researchers in exercise and nutritional science is funding. There are various funding sources available, such as government agencies like the NIH, University grants, industry funding from food or supplement companies, organizations such as ACSM and NSCA, and other private foundations and non-profit organizations. The unfortunate reality is that even with studies receiving funding, the funding generally isn’t enough to support the desired level of control to be considered a high-quality study. To give you an idea of how quickly the costs for a study can add up, here in Florida the cost of performing a blood hormone test like leptin is roughly $70 per blood draw. So, let’s assume you wanted to test 10 subjects before and after a diet; that’s two leptin tests per subject, which adds up to $1,400 for only 10 subjects. That’s a small sample size and if you wanted to make it a stronger study you would likely need more like 40 people, which could cost upwards of $5,600, and that’s just to test one hormone. That’s not considering other lab supplies you might need, and the researcher wouldn’t be able to pay their staff anything, which means they would need to find students who are willing to volunteer their time on top of their academic responsibilities. If you’re looking at studies that test protein metabolism in rats, the cost of carrying out an experiment could be upwards of $50,000. Many studies need to pay subjects to recruit the necessary sample size and if it’s a dieting study that includes supplying food, the cost of food can be astronomical. Nowadays many supplement companies are becoming more interested in having scientifically validated research to support the efficacy of their products for improved marketing. Some studies sponsored by supplement companies can cost tens of thousands of dollars and can even reach upwards of hundreds of thousands of dollars when offering to pay subjects to participate. We haven’t even discussed the costs associated with the instrumentation necessary to test certain variables in a lab. Generally, departments receive funding from their Universities for lab related costs to maintain, repair
or replace testing equipment. The amount received yearly for department budgets is generally only enough to afford maintenance on their current equipment and replace regularly used supplies; they can’t afford to buy new equipment or replace machines every year. Most exercise science programs have what’s called a metabolic cart (which we’ll discuss later), which costs upwards of $20,000, and that’s not including the costs to maintain normal functioning or replace certain supplies needed for regular use. That is why labs are limited by funding and the equipment they have available.

Available lab equipment
It should now be no surprise why most exercise science programs can’t afford to have sophisticated testing equipment. The type of equipment in a researcher’s lab will determine the type of studies they can conduct. Some labs are focused on more mechanistic studies that involve molecular biology experimentation using cells and microscopes, whereas other labs are focused on more practical and applied research that investigates the effectiveness of a type of training modality. Researchers will focus on a specific area of interest and build their labs around that focus. The majority of exercise science programs will have a metabolic cart, treadmills, cycle ergometers, various types of body composition testing instruments, heart rate and blood pressure monitors, and some other performance-based testing equipment, but again this will depend on the university, the region and the faculty’s research interest. We will cover some common measurement techniques later, but it’s important to understand that very few labs have the most sophisticated testing equipment like a metabolic ward, MRIs or muscle biopsy testing, due to funding. Aside from the major challenges of funding and lab equipment, researchers are governed by their institution to ensure responsible research conduct.

IRB/ethics boards
Academic institutions have ethics boards or governing bodies that oversee experimental research. At many universities the governing body is known as the Institutional Review Board (IRB) for humans and the Institutional Animal Care and Use Committee (IACUC) for animal research. The purpose of these departments is to ensure safe and ethical standards are being followed according to laws and regulations. Before a study can begin recruiting and testing subjects, it must go through a formal review process to obtain study approval. This is one of the most annoying processes involved in research because it’s time consuming and tedious. It’s comparable to filing your taxes, but more detail oriented and time consuming. While necessary, this approval process can take away time from conducting the experiment because most academic institutions operate on semester timelines that may include breaks or holidays that interfere with the study timeline. So, if it takes 8 weeks to approve a study and then another 3 weeks to recruit enough
subjects that’s the majority of the semester and only 50 subjects and the training program consists of 3 full
leaves a few weeks to conduct an experiment. This is body days per week supervised in the lab by research
why you will often see studies that aren’t much longer staff. Not only will you have to create a schedule for
than 12 weeks in duration. The IRB process includes an the research staff to supervise each training day, but
informed consent for subjects and a very formal written you’ll also need to schedule each participant for each
study protocol explaining in detail every aspect of the training session each week. Not to mention, you’ll have
study, including how you intend to recruit subjects. to schedule your baseline testing, mid-point testing
(if there is one) and post-testing. Depending on which
Subject Recruitment measurements will be taken, it could take an hour for
Subject recruitment is the other annoying process each participant, which means 50 hours per testing
for conducting human research. Recruitment can be session multiplied by three testing points and that’s
difficult and time consuming for exercise science and 150 hours only for the measurement testing sessions.
nutrition researchers. As mentioned previously, many That doesn’t account for the hour each subject is
labs don’t have the necessary funding to pay subjects training in the lab 3 days per week over 12 weeks. The
to participate in their studies. Free protein powder and supervised training in the lab can be an appealing incentive to some, but many others don’t want to follow a standardized program for fear of less than optimal results. This is why you generally see sample sizes of fewer than 50 in training studies. Even if a researcher is lucky enough to recruit 50 people, subjects generally drop out for various reasons, and a study can end up losing 20 subjects or more depending on testing or intervention requirements. People have a hard time following specific instructions, especially if it means changing their usual lifestyle to accommodate study procedures when there is no incentive to comply. Think about asking college students to follow a specific diet with no alcohol on the weekends, or asking them to come to the lab early before classes for testing or training. How about asking them if it’s ok to stick a needle as large as a pencil in their leg for a muscle biopsy? Obviously, studies that include animal models don’t have to ‘recruit’ subjects, but they have to pay more for their ‘subjects’.

Scheduling and Testing
As mentioned earlier, scheduling and experimental time frames can be a major issue in conducting experiments, especially if operating under University semester timelines. Even if studies have the opportunity to occur over multiple semesters or with no time restrictions, scheduling can be a logistical nightmare for research staff. For example, let’s assume a study is investigating muscle growth over 12 weeks in

time requirement researchers ask from their subjects can be a lot. This is a good example of why you don’t see many training studies over 12 weeks: it takes a lot of time and money!

Trusting Research

How can you trust research, and how do you evaluate studies that show conflicting findings? Individuals without research experience are at a severe disadvantage when it comes to teasing out the nuances of, and extrapolating from, the results presented in publications.

Bias
We all have our own biases towards certain ideas or topics. Unfortunately, most people either fail to admit or don’t realize they have a bias towards a particular topic. Good scientists recognize and acknowledge their biases in an effort to tightly control for them in their experimental design. Being biased means having an unbalanced opinion or belief regarding a certain topic or idea. This often leads to being close-minded and failing to recognize conflicting or contrary evidence, beliefs or ideas. Scientifically speaking, bias is a systematic deviation between an estimated value and its true value [3]. In other words, it can be used to represent error. There are a few types of biases that are important to understand to become more critical of research.
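To make the statistical definition of bias concrete, here is a minimal Python sketch. The scale, its readings and the function name `systematic_bias` are hypothetical and not taken from any study; the only point is that bias is the average offset from the true value, which is a different thing from the random scatter between repeated readings.

```python
from statistics import mean, stdev


def systematic_bias(estimates, true_value):
    """Average deviation of the estimates from the true value."""
    return mean(estimates) - true_value


# Hypothetical example: a scale that reads about 2 kg too high.
true_weight_kg = 80.0
readings_kg = [82.1, 81.9, 82.0, 82.2, 81.8]

bias_kg = systematic_bias(readings_kg, true_weight_kg)  # systematic error
scatter_kg = stdev(readings_kg)                          # random error

print(f"bias: {bias_kg:+.2f} kg, scatter (SD): {scatter_kg:.2f} kg")
```

In this made-up case the readings are tightly clustered (small scatter) yet consistently too high (large bias), so repeating the measurement more often would not fix the error.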
Confirmation Bias - This is essentially when people cite evidence or report data that fits their bias or belief, while ignoring or failing to provide evidence that says otherwise. You’ll oftentimes see unethical individuals cite one study that supports their argument while failing to acknowledge five other studies that refute it. There could also be a scenario where someone misinterprets or takes very weak evidence and glorifies it to make it seem stronger than it really is. Politics is a good example: you will oftentimes see certain media or news outlets reporting a story that is misleading or simply untrue. They may use a weak study or twist the narrative of a particular topic to support their side of the story. Sometimes you’ll see a news report showing only a piece of an interview or press conference, falsely portraying an individual’s beliefs to make them look bad and push the outlet’s own political agenda. In research you may come across a discussion where authors are comparing their findings to other studies, but they fail to acknowledge other studies that refute their findings.

Publication Bias - Publication bias is actually a pretty common and unfortunate practice in the scientific community. This type of bias is concerned with publishing only studies that report significant results. In 2007, studies that supported their hypothesis represented 85.9% of published studies [16]. Let’s face it, studies with stronger findings or significant results are more appealing to readers, especially editors and publishers, because they’re more likely to get cited in other research, which leads to higher journal impact factors and more revenue for journals [16]. Completing a study with insignificant findings can pose challenges for researchers, and leaving it unpublished also poses a few issues. While the majority of responsibility for publication bias lies with journal editors and publishers, researchers can be guilty also. Researchers are busy and they usually have a research agenda planned out so that once a study is completed, they can begin on the next project, and oftentimes they have multiple research projects occurring at the same time. Earlier we briefly described what goes into designing and carrying out a research study; it’s obvious that research studies are a serious undertaking and require substantial time, money, and effort to complete. When the results turn out to be non-significant it can be crushing to the researcher, and the amount of time and headache they would have to put into getting it published just isn’t worth it, so they store it in a file and forget about it (“file drawer effect”) [17]. There are also scenarios where graduate students carry the responsibility of writing up and submitting their manuscript for publication after completing the research project, and instead they either graduate or move on to another program without completing the publication process. Other times researchers still put in the effort to get their study published, but due to the publication bias of journals it may be difficult or impossible to receive acceptance. On the researchers’ side, the reasons for publication bias include lack of time, a low quality or incomplete study, fear of rejection, or insignificant findings [16].

Apart from the resources, time and effort that go to waste when studies aren’t published, there are other consequences of failing to publish studies with negative results. Before researchers invest time in designing a study, they obviously explore journals to find publications that are similar to their research question or hypothesis and evaluate their findings. If a study isn’t published due to negative results and another researcher wants to test the same hypothesis, they will be wasting valuable time and resources on a study that would produce negative results. Therefore, even though a study produces negative results it should still be published to inform future research. Additionally, unpublished data can misguide meta-analysis findings and conclusions. If meta-analyses use only data that show significant findings when there are unpublished studies that conflict with them, they can produce false positives and misguide recommendations [16]. Appropriately performed meta-analyses of clinical trials are the highest quality of scientific publication and are commonly used for health-care decision making and therapies [18]. One of the more serious consequences of unpublished negative data is the potential harm to individuals from pharmaceutical drugs or even supplements. Publishing these negative
results could improve safety and standards of drugs before they’re released [16, 18]. Maybe a supplement study is carried out and finds no positive effect of the treatment, but there were some subjects who reported adverse symptoms or side effects. This study goes unpublished, which could be detrimental to someone’s health.

Inflation Bias - Commonly referred to as “p-hacking”, this is when unethical researchers try a wide variety of statistical tests and then selectively report the significant results [17]. This is essentially when researchers torture their data until they obtain a significant finding. It’s important to understand that statistical analyses should be pre-determined and a part of the study design process. P-hacking commonly occurs when researchers conduct a study and, after collecting data, decide to perform additional or different statistical tests based on the gathered data. Another common occurrence is when they simply eliminate outlier data from subjects who didn’t respond or responded much more than the rest of the group. Another situation in which researchers are guilty of p-hacking is when they manipulate or change the groups they established at the beginning of the study to make one group look like it experienced greater change. Lastly, p-hacking can occur when researchers perform data analysis part way through the study and discontinue the study based on their results, or simply stop performing other statistical tests once they find significance [17].

Ethical researchers will do their best to address and acknowledge their biases, which sometimes can be unintentional. Unethical researchers obviously make choices with ill intent, and biases are irrelevant in those situations. Science and peer-reviewed research do a pretty good job of weeding out the bad apples, and part of this deals with addressing conflicts of interest.

Funding Sources / Conflicts of Interest - Any time there is a conflict of interest listed at the bottom of a publication it should be evaluated more critically. However, this doesn’t mean you should immediately discredit or dismiss the study or the findings. Ethical researchers list their conflicts of interest to be transparent and acknowledge any potential personal benefit or gain of the researchers or parties involved. This should be a clear indication that they aren’t trying to “hide” something or be dishonest; it should represent the opposite. If dishonest researchers were attempting to conceal some relationship or personal benefit, they would simply take the risk of not listing a conflict of interest. Earlier we mentioned various sources of funding including food and supplement companies, governmental organizations, private companies, etc. When you come across a supplement company funding a dieting study or a study investigating the effectiveness of a particular supplement, this should raise a red flag, as with any type of company funding a study that investigates their product. But again, it just means you should evaluate the findings more critically. Before even evaluating the results, check the study design. Was it a randomized placebo-controlled design? If not, you should be very apprehensive about the findings and results. Randomizing and having a placebo-controlled design is essential when comparing treatments.

Evaluating Conflicting Evidence
Let’s assume there have only been two studies published on a certain topic and they report contrasting findings. How do you determine which study is better or which study to trust? This is a difficult question to answer and involves many considerations, but we will highlight certain aspects and key details you’ll want to focus on.

Results - The level of significance of the results is important and this is one of the first things you should notice, but as mentioned previously (statistical concepts), how meaningful are the results? Remember, we want to see a P-value < 0.05, and the higher the effect size value, the more meaningful it is. After evaluating the statistics, check to see if there is any missing data or if authors also published raw data within the text, appendix or supplementary material. A good example: if a study is comparing two different types of diets, it should have a table showing their respective diet compositions, or at least some type of food records or nutrition data. If there isn’t any type of nutrition data and it’s a diet study, we would be VERY cautious of the findings and the conclusions that are drawn. Publishing
raw data is not necessary, but it’s a good practice, and if there’s raw data available look it over for yourself to see if there are any glaring issues or if some of the numbers don’t add up. Within the results section they obviously will report the results from statistical analysis for the primary variables of interest, but they should also provide some type of figure or table to visually represent the data. Lastly, do the results of the study agree with previous studies? It’s ok if they don’t, but in the discussion the authors should explain conflicting results and whether there is a reason why results don’t agree.

Study Design / Level of control - How much control did the researchers have over the independent variables? Did they provide food to participants if it was a diet study? Were they supervising the resistance training program prescribed to participants? How did they control free-living conditions? Obviously, there are no mandatory requirements researchers should be following for their study design; this will be limited by the laboratory techniques and equipment they have available. But there are some things you should be asking yourself when reading through the methods section: how did they test and control for X, Y and Z? If a study had subjects in a metabolic ward, that’s far more valuable data than any free-living study. Similarly, if a training study doesn’t mention anything about supervised training, it’s going to carry more confounding variables and limitations than a study that included supervised training in the lab for the duration of the study. The level of control is going to significantly impact the sample size and the study duration. Increasing the level of control comes at a cost: higher control = higher cost, which generally leads to a smaller sample size and shorter study durations to maintain that level of control. Unlike human subject designs, rodent models offer a high level of control, longer study duration and a larger sample size at a smaller cost. But the results aren’t always transferable to humans.

Sample size - How many subjects were included in the study? Generally, fewer than 10 subjects is a poor sample size and is less likely to show significant changes in the outcomes. If a study has fewer than 10 subjects it carries a lot less weight than studies with larger cohorts, but it can still be valuable and contribute to the body of literature. Case studies are at the bottom of the totem pole for study designs, but for investigating certain novel topics they can be the only appropriate design available. These types of studies should be interpreted with caution, with the understanding that their ability to support strong conclusions is severely limited. The caveat to this is with studies that are extremely well controlled but have a small subject number. An example of these types of studies would be metabolic ward nutritional studies. In these studies every piece of food is provided to the subjects and they are housed in a ward that measures their energy expenditure. These types of studies do not need to have a high subject number in order to be impactful due to their high degree of control. They are also incredibly expensive, which is why they typically don’t have a high subject number.

Study Duration - You will generally encounter training studies in exercise science with durations around 12 weeks. This isn’t a bad thing, but the strength of
evidence is going to be less than a study of 24 weeks, assuming all else is equal. Longer study durations provide a bigger picture of what could happen. It’s like having two cars drag race: maybe one car has greater acceleration and pulls ahead for the first ¼ mile, but the other car has greater overall speed and ends up winning the race. With longer durations we can have a more dependable and reliable idea of the changes that could occur. The difficulty with longer studies is that they are more expensive and less likely to have a high degree of control as they become more invasive to the subjects’ lives.

In general it’s important to understand the limitations that exist in all scientific studies. If you want to conduct a long term study in humans, it will either have a low subject number or not be well controlled, or both. If you want to conduct a tightly controlled study in humans, it will likely be short in duration or low in subject number, or both. If you want to conduct a long term, tightly controlled study with a high subject number, it will likely be in animals. Below is a Venn diagram providing you with a conceptual framework to give you a better idea of the give and take between variables for study designs.

Treatment/Intervention - For any study that involves groups with different treatments or interventions, it’s important to take note of the dosages or the amount of the treatment or intervention. If a study is investigating a specific supplement, is the dosage clearly stated and is it an appropriate dosage to elicit a response? If you’re looking at two studies that compared the effects of caffeine on heart rate, it should be obvious that whichever study used the higher dose will see a greater heart rate. If two training studies are comparing muscle growth in a specific muscle, the level of training volume and intensity are going to have a major impact on their outcomes. If the study is investigating supplements, it should be randomized and placebo-controlled to account for the various confounding variables and limitations.

Limitations - Every study carries limitations; you can’t account and control for everything, at least not in free-living studies. It’s ok for studies to have limitations and generally they’re outside of the researcher’s control, but major limitations should be clearly stated and explained towards the end of the discussion. With that being said, the researchers aren’t going to state every little thing that’s wrong with their study, so don’t expect that. Any major methodological limitations should be explained. Examples of some limitations are low sample sizes, study durations, lack of control over a specific measurement due to lack of laboratory resources, limited generalizability of the findings, differences in treatments, characteristics of subjects, issues with measurement devices, etc.

Measures - It’s important to evaluate the methods section for the types of measurements used to test the dependent variables. There is no perfect measurement available. It’s impossible to measure someone’s true or exact score on any measure, and every device used in research will have a certain level of error associated with it. We may have “gold standards” or measures that we use to validate other measures, but this is done through correlations, and the criteria we use to validate other measures have their own error rates associated with them. Underwater weighing used to be the “gold standard” for measuring body composition; now we use the 4-compartment model because it’s been shown to be more reliable [19]. This doesn’t mean any study that uses hydrostatic weighing is useless, we just need to be critical of its error rates. There are endless types of available instrumentation to measure certain variables and we’ll cover some common measurements in the next section. When evaluating measurements, we are concerned with the validity and reliability of that measurement.

Validity - Validity is arguably the most important consideration for a measurement technique and indicates the degree to which a device measures what it’s supposed to [3]. This is concerned with how accurate and “true” the measurement technique is. There are a number of types of validity and ways to establish the validity of a measurement technique. Frequently in research, validity is established by comparing one type of measurement to a criterion method. For example,
the 4-compartment model that uses Bod Pod for body volume estimates was used as the criterion to determine if Dual-Energy X-ray Absorptiometry (DXA) would be an acceptable method to measure body volume [20]. The validity of a measurement is more difficult to establish than reliability.

Reliability - Reliability is concerned with the consistency of the measurement technique. Reliability is the degree to which a device produces stable or consistent results. If a measurement technique is not consistent, then you cannot trust the test. In other words, “a test cannot be valid if it’s not reliable” [3]. Before performing an experiment it’s important to test the laboratory equipment that will be measuring our dependent variables to ensure consistent and accurate results. This doesn’t have to be done prior to every experiment, but the equipment used in research should be tested to ensure reliability. The test-retest method is a common technique used to estimate the reliability of testing devices by performing one test, then, after a specified time interval, testing again [3]. We can then perform some stats to obtain some values that tell us how reliable our instruments are. Not all studies do this and some studies test reliability in other ways, but it’s good science to report some type of reliability for testing devices, to ensure changes that occurred are dependable. You will generally find these values in the methods section after a brief explanation of the testing procedures for a specific device.

Intraclass correlation - The intraclass correlation coefficient (ICC) is calculated by running a simple ANOVA to produce a reliability coefficient (similar to the correlation coefficient, as described in the stats section) that provides an estimate of the error variance of a testing device. This is a good indicator of the stability of the measurement. Values closer to 1 represent scores that have a high similarity or high correlation, as in other correlational scores; likewise, scores closer to 0 mean they are less similar. In other words, scores closer to 1 have less error and better reliability.

Standard error of measurement - Standard error of measurement (SEM) is calculated using the ICC and the standard deviation of scores, which means it accounts for the variability and reliability of the test. This value tells us the level of error and precision of a measurement. SEM values can be viewed as a range, plus or minus, around the predicted or measured value. For example, if you’re testing body fat percentage and you measure someone at 15% body fat and the SEM value is 3%, their true percentage is somewhere between 12% and 18% body fat.

Minimal detectable difference - The minimal detectable difference (MDD) is calculated using the SEM. This tells us how sensitive the measurement is. It provides a value in the common unit associated with the testing device and tells us the minimum amount of change needed to exceed measurement error and to be considered a ‘real’ change. For example, if the MDD of an RMR machine is 100 kcal, then the RMR of the person you’re testing would have to increase or decrease by more than 100 kcal between testing points to be considered a real change. You may often see different terms used for these three statistics. SEM can also be called standard error of estimate (SEE), and MDD can also be called minimal detectable change (MDC); just know there may be different names that essentially carry the same meaning. There are also many other statistics available to test the validity and reliability of measurement techniques. These are just a few common ones you might come across and hopefully they give you a better idea of the error rates associated with testing different variables. We want to reiterate that oftentimes people overlook the error rates associated with some measures and assume they are accurate and/or exact scores. With in-vivo studies it’s impossible to know the true and exact score of certain variables. We test them, which gives us a good estimate or prediction of the score, and we have to know there is always a certain level of error associated with the device and/or technician. So long as the same technician and the same device are used and testing is done under the same conditions, we can use measurements to compare changes over time.
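As a rough illustration of how these reliability statistics relate to each other, the sketch below uses one commonly reported set of formulas: SEM = SD × √(1 − ICC) and, for a 95% confidence level, MDD = SEM × 1.96 × √2. The guide does not say which exact formulas it has in mind, so treat these as an assumption; the function names and the input values (SD of 6% body fat, ICC of 0.75) are hypothetical and chosen only so that the SEM matches the 3% used in the body fat example.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and the ICC (assumed formula)."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_difference(sem, z=1.96):
    """MDD at the confidence level implied by z (1.96 for 95%).

    The sqrt(2) accounts for error in both the first and second test.
    """
    return sem * z * math.sqrt(2.0)


# Hypothetical test-retest values for a body fat device.
sd_bf = 6.0    # standard deviation of scores, in % body fat
icc = 0.75     # intraclass correlation coefficient

sem = standard_error_of_measurement(sd_bf, icc)
mdd = minimal_detectable_difference(sem)

measured = 15.0
low, high = measured - sem, measured + sem

print(f"SEM: {sem:.1f}%  range: {low:.1f}% to {high:.1f}%  MDD: {mdd:.1f}%")
```

With these made-up inputs the SEM is 3%, giving the 12% to 18% range from the example, while the MDD comes out at a little over 8%. In other words, under these assumptions a change between two tests would need to exceed roughly 8 percentage points before it could be called a ‘real’ change.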
Article 05
Common Methods for Measuring Variables

Body Water

Deuterium Dilution
Deuterium is a stable isotope of hydrogen, and deuterium dilution serves as the “gold standard” or criterion method for total body water assessment. Researchers use a labeled water that contains a large quantity of deuterium (“heavy water”) and measure concentrations in the urine, blood or saliva to measure total body water. There are other isotopes that can be used in a similar manner to the deuterium dilution method, but most commonly it is deuterium that’s used as a tracer. Using this method, subjects void their bladders, then drink water with the labelled isotope, and after it has equilibrated in the body for a duration of time researchers most commonly collect a urine sample. The urine is then analyzed using a mass spectrometer to determine total body water levels. This method is expensive, time consuming and requires sophisticated laboratory expertise [26]. For this reason, other measures have been developed to more conveniently measure total body water (TBW).

Bioelectrical Impedance Analysis (BIA)
BIA technology uses a small electrical current that is transmitted through your body extremities and between voltage detecting electrodes (contacting hands and/or feet). Water conducts electricity, and tissues like fat mass and bone have very little water, which increases the resistance (impedance) of the electrical current, thereby decreasing the rate of its transmission. Based on fat mass content in your body, the impedance (resistance) of the electrical current is measured using Ohm’s law (resistance = voltage / current), which can then be applied in an equation to quantify water volume, percentage body fat, and FFM [21]. There are many different types of BIA devices available and they vary based on specific frequencies, cost and complexity, which will impact the validity and reliability of the specific device being used. Nowadays you will commonly see BIA technology integrated into at-home body weight scales. When used for body composition assessment, research indicates that BIA is comparable to DXA when estimating BF%, fat mass or fat-free mass (FFM) [27]. However, other research indicates that single assessments using DXA or BIA are questionable because of their accuracy on an individual level [28]. When compared to deuterium dilution for measuring TBW, BIA is close in accuracy, but still slightly underestimates TBW [29]. BIA shows promise in accurately estimating TBW; however, accuracy in measurement can vary based on the population being
studied, and with little research comparing BIA to deuterium dilution, its validity for accurately estimating TBW remains questionable [30]. Nonetheless, evidence suggests BIA is acceptable for assessing TBW and displays acceptable accuracy when assessing body composition if incorporated into a multi-compartment model [28]. Another tool that shares similarities with BIA, known as bioelectrical impedance spectroscopy (BIS), seems to exhibit greater validity and reliability than BIA when assessing TBW [26].

Bioelectrical Impedance Spectroscopy (BIS)
BIS features the same underlying technology as BIA to estimate body composition and water, which includes an electrical current traveling through the body between electrodes to measure the impedance of the electrical current. BIS devices differ from BIA devices by utilizing a ‘spectra’ of frequencies, which is where the term spectroscopy comes from [30]. Although there are single- and multi-frequency BIA devices on the market and it’s unclear at what frequency a BIA device could be considered BIS, BIS uses Cole modelling to predict body fluids, which has been suggested to be superior for assessing body composition using impedance-based methods [30, 32, 33]. BIS is also useful in differentiating between intracellular and extracellular body water. The underlying principles used for BIA and BIS are the same for estimating body composition and either device can acceptably be utilized for body water estimations; however, it appears BIS is more accepted [26, 28, 30, 33]. It’s important to keep in mind that these impedance-based devices were developed primarily for body water assessment. Although they can predict body fat % (BF%), other body composition methods would be more acceptable.

Body Composition
There are a number of techniques and methods available for measuring body composition, specifically fat mass and/or fat-free mass (FFM). The only direct measurement of body composition would involve performing an autopsy on a human cadaver to dissect and weigh various tissues and organs, which is obviously impossible for free living experiments. Therefore, we estimate body composition based on what we know about the weight and composition of various tissues in the body. It’s important to understand that there is no perfect estimate and all techniques and methods have error rates associated with them. For this reason, we cannot place a high level of importance on a specific percentage of body fat. Rather, we use it as an objective measure to quantify and track changes to determine the effectiveness of specific interventions.

Skinfold
The most common and cost-effective method for estimating body composition is the skinfold technique. This technique assumes a 2-compartment (2C) model (more on multi-compartment models later), splitting body weight into fat mass and FFM.

This technique requires firmly grasping the subject’s subcutaneous fat and skin with the thumb and forefingers to measure the thickness (in mm) with a caliper. You can accomplish these measurements with as few as three sites or as many as seven, including the triceps, subscapular, suprailiac, abdominal, upper thigh, chest, and midaxillary. Measuring seven sites
gives a more accurate estimate of BF% because it can account for body fat distribution; some people hold more fat in their lower body compared to upper body. These site measurements are added together and plugged into a prediction equation to estimate body density, which is then plugged into the Siri equation to estimate body fat percentage (BF%) [34]. There are a number of body density prediction equations available and it’s important to use a population specific equation because the coefficients used in the calculations can produce inaccurate estimations for individuals with varying body fat levels. When using an appropriate population specific equation, skinfold fairly accurately predicts BF% (± 3-4%) [35]. The great thing about skinfold is not only the low cost, but you can track site-specific changes to gauge the rate and location of fat loss. Additionally, this is one of the few measurements that actually assess fat thickness; most other measures use X-ray beams and imaging techniques or electrical currents to assess fat mass. This technique is only as accurate and reliable as the technician who is performing the test. The technician must have a lot of experience developing this skill to precisely identify anatomical site location and accurately measure fat thickness consistently. When compared to computed tomography (CT scan), skinfold shows a strong correlation for measurements performed in the abdominal region [36]. However, when studies compare skinfolds to the gold standard 4C model, results indicate large individual error rates, but acceptable group average values [37, 39]. Meaning, when you test one person the error rate can be much higher compared to measuring and averaging the BF% of a group of people. For example, you could compare skinfolds to another method and see an over or under estimation in BF% of 6%, but when comparing the group average BF% the error in BF% estimation could be only 2%. These are arbitrary numbers and don’t reflect the true error rates of skinfolds; those will vary depending on the equation, population and criterion method being used for comparison. Nonetheless, skinfolds are the most cost-effective method, and with a skilled technician and correct equations they can provide an accurate estimate of body composition.

A-mode Ultrasound
A-mode ultrasound uses ultrasonography technology, which transmits a signal through the skin and tissues, and the reflection of the signal at tissue boundaries is transmitted back as an “echo”. There is also another type of ultrasound known as “B-mode” (we’ll cover it later), but we’re specifically referring to A-mode ultrasound. Bodymetrix has developed a handheld portable device that is used similarly to how skinfolds are conducted. The device can be used to measure as few or as many sites as desired; simply select the equation and number of sites from a drop-down menu in the software. This technique also relies on the skill of the technician. One of the primary benefits is being less invasive since it does not include “pinching” the



subject, and while the cost is much lower than that of other sophisticated laboratory equipment, it is still more expensive than skinfolds. The unique aspect of this device is that it can also produce an image of the muscle and fat layers. This device has not been validated to measure muscle thickness, but some researchers suggest it could be a useful tool for measuring acute changes in muscle thickness 40. For body composition it hasn’t been validated to the same degree as other measures, but studies show strong agreement between skinfold and air displacement plethysmography (ADP) 41, 42.

Body Volume Measurement
Underwater weighing (UWW) and air displacement plethysmography (ADP), accomplished via the Bod Pod, are used to measure body volume by applying Archimedes' principle, which allows for calculation of body density. Body density can then be used in an equation (generally the Siri equation) to calculate BF%. Underwater weighing is conducted by having the subject sit on a flimsy carriage that is connected to a scale (it’s like a human produce scale) and lowers them into a pool of water. The subject’s nose is pinched closed and they are instructed to blow out all of their air as they are slowly submerged into the pool in a fetal-like position. The testing procedure for this technique is probably the worst compared to others. Imagine exhaling all of your air while hunched over, remaining as still as possible, while being lowered into a pool while researchers attempt to record your weight. Prior to being submerged in water, researchers measure residual lung volume to account for air trapped in the lungs after full exhalation. The Bod Pod is very similar to underwater weighing, except that it uses air, and involves a much more comfortable testing procedure, although those who are claustrophobic may not agree. Subjects are placed in a large plastic “pod”-like device with a small window. While the subject sits on a small seat wearing a swim cap, body volume is measured within a few minutes by subtracting the reduced air volume with a person inside from the initial volume of the empty chamber. This method's body composition estimates agree closely with those of hydrostatic (underwater) weighing, since the two use similar underlying principles. Underwater weighing was previously considered the gold standard and criterion method to validate other methods; now we have more non-invasive techniques available that can provide greater BF% accuracy.

Multi-compartment Models
The advancement of technology and how we understand body composition has led to the development of more accurate and precise assessment techniques. Multi-compartment models are considered the criterion for validating other methods of body composition estimates 43. By including more measurements and compartments, we can reduce the assumptions made about the weights and volumes of various tissues by measuring them directly, leading to a more precise estimate. It would be reasonable to assume that by introducing more measurements the error rates associated with those measurements could reduce the accuracy, but research shows these error rates are negligible 44. Multi-compartment models range from the traditional 2-compartment (2C) model all the way up to 6-compartment (6C) models. The 4-compartment (4C) model is viewed as the gold standard for body composition assessment 45. How the 4C model is implemented depends on the instrumentation a lab has available, but it is generally accomplished using a DXA and a measurement of body water (generally BIA). It has not been established whether the potential benefits justify the added complexity of these methods 46. The more complex and sophisticated these models become, the more the cost of testing increases due to the instrumentation needed to measure various tissues, putting them out of reach for some research labs.

Dual-Energy X-Ray Absorptiometry (DXA)
Dual-Energy X-Ray Absorptiometry (DXA) is a common and popular method to test body composition. DXA machines were originally developed for bone mass assessment. They have since become a common method for body composition testing, if labs are fortunate enough to have the funding to support the high cost associated with them. DXA is a practical and non-invasive way to measure body fat percentage. Subjects comfortably lie supine on a table for the 10-15 minute

[Figure: 2-, 3-, 4-, 5-, and 6-compartment models of body composition]
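
As a rough illustration of the two-compartment approach behind the body volume methods described earlier, the sketch below turns an underwater weighing result into body density and then into BF% with the Siri equation. The function names, the water density value, and the 0.1 L allowance for gastrointestinal gas are our own assumptions for illustration, not values taken from any specific lab protocol.

```python
def body_density_uww(mass_air_kg, mass_water_kg, residual_volume_l,
                     water_density_kg_l=0.9957, gi_gas_l=0.1):
    """Estimate body density (kg/L) from underwater weighing.

    Body volume comes from Archimedes' principle: the weight "lost"
    under water divided by the density of the water, minus the air
    still in the body (residual lung volume plus an assumed small
    allowance for gas in the gut).
    """
    displaced_volume_l = (mass_air_kg - mass_water_kg) / water_density_kg_l
    body_volume_l = displaced_volume_l - (residual_volume_l + gi_gas_l)
    return mass_air_kg / body_volume_l


def siri_body_fat_percent(body_density_kg_l):
    """Siri equation: BF% = (4.95 / body density - 4.50) x 100."""
    return (4.95 / body_density_kg_l - 4.50) * 100.0


# Hypothetical subject: 80 kg on land, 3.5 kg under water, 1.3 L residual volume.
density = body_density_uww(80.0, 3.5, 1.3)
print(f"Body density: {density:.4f} kg/L")
print(f"Estimated BF%: {siri_body_fat_percent(density):.1f}")
```

Because the Siri equation assumes fixed densities for fat and fat-free mass, a small error in the measured volume (for example, in residual lung volume) shifts the BF% estimate noticeably, which is one reason multi-compartment models try to measure more and assume less.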


test while two low-energy X-ray beams (with minimal radiation exposure) slowly pass across the body. The computer software generates an image of the underlying tissues and quantifies bone mineral content (BMC), total fat mass, and FFM 21. Additionally, DXA can perform regional body tissue analysis to determine if specific areas of the body have lower or higher body fat or BMC. Many believe DXA scans are a superior method for testing BF%. However, if certain variables are not accounted for and if DXA scans are not performed correctly (like any measure), there is potential for high error rates. DXA scans operate under a 3-compartment model, splitting body weight into body fat, fat-free mass (FFM) and BMC. Body water fluctuates throughout the day based on water and glycogen stores, and these fluctuations can lead to large error rates because DXA fails to account for body water. This is supported by previous research showing that a 3C model with a body water measurement produces smaller error rates than DXA when compared to a 4C model 28. When looking at group level comparisons, DXA seems to have pretty good accuracy compared to the gold standard 4-compartment model 28. However, when looking at individual comparisons or changes, the error rates can be much higher, especially if individuals differ in certain characteristics such as sex, size, fatness or nutritional status 28, 47. The error rates of DXA scans will vary from study to study depending on methodological differences of the study design, but research has shown that DXA error rates can be as high as 8-10%, which is similar to the error rates of hydrostatic weighing 48. While DXA shows a strong correlation to CT scans, DXA still underestimated fat mass by 5 kg 49. The accuracy of DXA for evaluating weight changes has also been questioned by a study that simulated weight gain by wrapping lard around subjects and performing a DXA scan. Results showed that the DXA scans quantified the lard as bone mineral content rather than fat 50. For these reasons, results from studies using exclusively a DXA scan to evaluate weight changes should be interpreted with caution. Instead, researchers should incorporate DXA scans into a 4C model that also accounts for body water to more accurately estimate body composition changes.

As you can see, all of these techniques and methods carry some limitations, and while some methods may be more accurate or precise, they are all acceptable methods for estimating fat and fat-free mass. Since a large portion of FFM is muscle, you may see some of these techniques used to infer increases in FFM as increases in muscle growth (hypertrophy) 51. However, there are more direct and appropriate methods available to assess hypertrophy.

Protein Metabolism

In the body, protein is in a continuous state of breakdown and synthesis; this simultaneous process is known as protein turnover. In a typical 70 kg male, about 0.3 kg of protein is degraded and replaced each day to avoid the breakdown of stored protein 22. Protein metabolism is a complex and intricate process that requires sophisticated laboratory equipment and testing techniques. Here we describe a few methods that are commonly used to assess protein turnover.

Isotopic Tracer Method
Muscle Protein Synthesis (MPS) is one of the more complicated measures to explain. To assess the rate of MPS, scientists often use a ‘tracer’, which is a molecule that they can track to ‘see’ which tissues it ends up in. In the case of MPS we use either a radioactive (less common) or stable isotope form of an amino acid. You may remember from general chemistry that an isotope is an atom that has a different number of neutrons than normal, which changes its weight. Since the tracer is heavier than the normal molecule, we can use gas chromatography mass spectrometry (GCMS) to separate it from the ‘normal’ molecules. A common amino acid isotope used to assess MPS is D-5 phenylalanine. D-5 means that five hydrogen atoms on the phenylalanine are replaced with deuterium, a form of hydrogen that contains an extra neutron, thus making the molecule heavier than normal phenylalanine. D-5 phenylalanine is often chosen as a tracer because it is not metabolized by the muscle (although various other amino acids are used
as well), so it can be assumed that any D-5 that winds up in muscle protein did so due to MPS.

To assess MPS, typically the amino acid isotope is infused or injected into the bloodstream of the subject that is undergoing whatever treatment is being provided. The tracer will then be taken up by the muscle in the form of intracellular amino acids or incorporated into proteins via MPS. This ratio of peptide bound tracer vs. intracellular tracer forms the basis behind determining the ‘rate’ of MPS. To put it in more practical terms, if the tracer is found in greater concentrations in muscle proteins in one treatment group vs. another, it is likely that the first treatment group has higher rates of MPS since more of the tracer wound up there.

The actual equation for MPS is a bit more complicated than this. For bolus injections of isotopes (usually done in rodents) the equation is:

$$\text{MPS (\%/hr)} = \frac{E_b \times 100}{E_a \times t}$$

where $t$ is the time interval between isotope injection and snap freezing of muscle, expressed in hours, and $E_b$ and $E_a$ are the enrichments of ²H₅-phenylalanine in hydrolyzed tissue protein and in muscle free amino acids, respectively. In the case of infusing a tracer the equation is:

$$\text{MPS (\%/hr)} = \frac{(E_{p2} - E_{p1}) \times 100}{E_{ic} \times t}$$

where $E_{p2}$ and $E_{p1}$ are the protein-bound enrichments from the muscle biopsy at time 2 and the previous muscle biopsy at time 1, $E_{ic}$ is the mean intracellular phenylalanine enrichment from the biopsies, and $t$ is the tracer incorporation time.

We realize these equations probably look quite daunting, but the only thing you need to know is that we are comparing the incorporation of the tracer at one time point vs. another time point to see how much has been incorporated into muscle tissue and in what timeframe. If we have that information and we have the intracellular concentrations of that tracer, then we can determine the rate of MPS. Once a biopsy (human testing) or sacrifice (animal studies) is performed and the muscle tissue is taken, it is immediately frozen in liquid nitrogen to ‘freeze’ all metabolic processes so that there is now a ‘snapshot’ of the muscle metabolism. The tissue is then later ‘powdered’ (fancy word for grinding it into a powder with a mortar and pestle), homogenized, and then taken through various chemical reactions in order to separate the protein bound amino acids from the intracellular amino acids (this is usually done by adding perchloric acid to the sample). The intracellular amino acids and peptide bound amino acids are then taken through several other chemical reactions to prepare them for the GCMS. Running them through the GCMS allows scientists to determine the concentrations of the tracer in the muscle and intracellular fluid by separating the tracer from the normal amino acid (the gas chromatograph helps separate the isotope based on weight since it’s heavier). Then the concentrations of the tracer in each sample can be determined by comparing them to standardized concentration samples that are also run through the GCMS. Once we have the concentrations of these samples, we can plug them into our equation to determine MPS.

Easy, right? We doubt anyone is saying that, and we can assure you that it’s not. The entire process is extremely sensitive to error, takes around 2 weeks to analyze ~100 samples, and is a minefield of potential mistakes. Scientists have to be borderline obsessive about handling their samples and executing reactions in order to ensure good data.

Nitrogen Balance
Nitrogen balance is the difference between nitrogen intake and nitrogen excretion. A negative nitrogen balance occurs when nitrogen excretion is greater than nitrogen intake, and a positive balance occurs when intake is greater than excretion. A neutral nitrogen balance is said to occur when nitrogen intake is equal to nitrogen excretion. Protein contains roughly 16% nitrogen on average, so by knowing protein intake we can then calculate nitrogen intake 22. Nitrogen excretion, on the other hand, is more complicated to measure and control for. Nitrogen excretion can occur through urine, feces, sweat, and skin 22. One of the primary drawbacks to this method is attempting to quantify nitrogen excretion, which can often lead to an underestimation of total nitrogen excretion. Another drawback of the nitrogen balance method is the effects of dietary intakes. During caloric restriction an increase in nitrogen excretion can occur, even when
protein intake is high, and when protein intake increases nitrogen excretion generally increases as well. These drawbacks can lead researchers to overestimate nitrogen intake and underestimate nitrogen excretion, leading to inaccurate estimates of nitrogen balance 22. Most of the nitrogen stored in the body resides in skeletal muscle tissue, so nitrogen balance is often used to assess muscle protein metabolism; however, this method is more indicative of whole-body protein turnover and does not distinguish tissue-specific protein metabolism. This is important because while skeletal muscle mass is the largest source of nitrogen in the body, the turnover rate for skeletal muscle is very slow at only ~1% per day, whereas the liver and gut tissues turn over at 30-80% per day. Due to this, nitrogen balance changes often reflect what is occurring in those tissues vs. muscle mass.

3-Methylhistidine
3-methylhistidine is an amino acid present in actin and myosin, which are contractile proteins of muscle fibers. 3-methylhistidine can be measured from a muscle biopsy, in the blood or, more commonly, in the urine. Unlike the nitrogen balance method, 3-methylhistidine can be used as a urinary marker for muscle protein breakdown, since roughly 90% of 3-methylhistidine is located in skeletal muscle 22, 23. When skeletal muscle is broken down, 3-methylhistidine is excreted through the urine because it cannot be recycled from degraded contractile proteins 22. However, as much as 25% of urinary 3-methylhistidine could come from other nonmuscle sources 24. Another limitation of this method is that 3-methylhistidine is present in dietary meat, so large intakes of, or increases in, dietary meat would increase urinary excretion and give researchers inaccurate data. So long as these limitations are accounted for and controlled, it can be a viable method to assess skeletal muscle degradation.

The Arteriovenous Net Balance Technique
Unlike the nitrogen balance method for measuring protein metabolism, this technique can measure rates of protein synthesis and breakdown that occur within the muscle tissue. This is accomplished using an amino acid that cannot be metabolized or produced in muscle tissue and monitoring how much goes in and out of the muscle 22, 25. Phenylalanine, tyrosine and lysine are not metabolized in muscle, but most often you’ll see phenylalanine used 22. Arteries carry blood and nutrients to the skeletal muscle, and waste products or nutrients are carried out of the skeletal muscle through the veins. By inserting a catheter into the vein and artery of the leg or arm, researchers can then measure the concentration of phenylalanine in the veins and arteries at those locations. Muscle protein synthesis is then determined by the disappearance of phenylalanine from arterial blood (signifying phenylalanine being deposited in muscle protein), while the appearance of phenylalanine in venous blood signifies muscle protein breakdown 22. The obvious downside to this is you can’t have catheters inserted in subjects indefinitely. This technique is employed for short durations, usually only a few hours after a specific treatment. This gives only a small snapshot of what could occur; the problem is that the observations are generalized or extrapolated into long-term changes.

It’s common for some of these methods to be implemented together in some studies to generate a more reliable picture of protein metabolism due to the limitations associated with each technique. Since protein synthesis leads to more muscle mass and protein breakdown leads to less, a better approach for investigating the long-term changes of protein metabolism may be to specifically measure muscle growth.
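
To make the arithmetic behind two of the methods in this section more concrete, here is a minimal sketch of the tracer-based MPS equations and a nitrogen balance calculation. The function names and example numbers are hypothetical, the infusion form follows our reading of the equation as the usual precursor-product arrangement, and the 16% nitrogen content is the average figure quoted in this section rather than a value for any specific protein source.

```python
def mps_bolus(e_bound, e_free, hours):
    """MPS (%/hr) after a bolus tracer injection: (Eb x 100) / (Ea x t).

    e_bound: tracer enrichment in hydrolyzed tissue protein (Eb)
    e_free:  tracer enrichment in muscle free amino acids (Ea)
    hours:   time between injection and snap freezing (t)
    """
    return (e_bound * 100.0) / (e_free * hours)


def mps_infusion(e_p1, e_p2, e_ic, hours):
    """MPS (%/hr) during a tracer infusion.

    Assumed arrangement: ((Ep2 - Ep1) x 100) / (Eic x t), where Ep1 and
    Ep2 are protein-bound enrichments at the first and second biopsy,
    Eic is the mean intracellular enrichment, and t is the tracer
    incorporation time in hours.
    """
    return ((e_p2 - e_p1) * 100.0) / (e_ic * hours)


def nitrogen_balance(protein_intake_g, nitrogen_excreted_g,
                     nitrogen_fraction=0.16):
    """Nitrogen balance (g/day) = nitrogen intake - nitrogen excretion.

    Nitrogen intake is estimated from protein intake using the ~16%
    average nitrogen content of protein.
    """
    nitrogen_intake_g = protein_intake_g * nitrogen_fraction
    return nitrogen_intake_g - nitrogen_excreted_g


# Hypothetical numbers, for illustration only.
print(mps_bolus(e_bound=0.002, e_free=0.10, hours=0.5))
print(mps_infusion(e_p1=0.0010, e_p2=0.0016, e_ic=0.08, hours=3.0))
print(nitrogen_balance(protein_intake_g=150.0, nitrogen_excreted_g=20.0))
```

Note how sensitive the nitrogen balance result is to the excretion term: if sweat or skin losses are missed, excretion is underestimated and the balance looks more positive than it really is.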

Hypertrophy Measurements

B-mode Ultrasound
The most common device you will encounter for assessing muscle thickness changes is the B-mode ultrasound. This is the same type of ultrasound device that’s used for measuring fetal development during pregnancy. Similar to A-mode ultrasound, the device probe converts electrical energy into high-frequency sound waves that pass through the skin surface and underlying tissues, which reflect from the bone surface to produce an echo 21. Compared to A-mode, B-mode is more expensive and technically demanding, but it also produces a higher resolution image that provides more detail and tissue differentiation 21. This method of assessing muscle thickness is non-invasive, can be done quickly and is less expensive than most other measures of muscle growth. Like the skinfold technique, this assessment is skill dependent and relies on the error rate of the technician. Hypertrophy can vary through different regions of the same muscle, and B-mode measurements only represent the specific site that’s measured; they are not indicative of hypertrophy of the entire muscle 51, 52.

Advanced Imaging Techniques
The two types of advanced imaging techniques we will briefly discuss are computed tomography (CT) and Magnetic Resonance Imaging (MRI). These are highly complex and expensive pieces of equipment, which is why you’ll rarely see them used for body composition or muscle growth research. These two methods are as close as we can get to human cadaver analysis in free-living subjects; their advanced imaging techniques allow for visualizing and quantifying organs and tissues such as muscle and fat 54. CT scans use ionizing X-ray beams that pass through tissues with differing densities, which generates cross-sectional, 2-dimensional radiographic images of body segments 21. Using these images researchers can determine total tissue area, tissue thickness and volume of tissues within an organ 21. CT scans have been shown to be reliable and valid for assessing changes in muscle cross sectional area (CSA) 55. An MRI scanner is the large tunnel-like machine you see in most medical TV shows. These machines use electromagnetic fields and radio waves to generate detailed images of the organs and tissues within the body. Unlike CT scans, which expose the subject to a small amount of radiation, MRI doesn’t use ionizing radiation. MRI can be used for a variety of measurements including total and subcutaneous adipose tissue, a muscle’s lean and fat components, muscle thickness, and muscle volume 21. MRI is viewed as a reference standard for regional muscle mass analysis and is the most accurate in terms of assessing changes in gross muscle size 51, 56. The few downsides associated with MRI (aside from the high cost) include its inability to assess the molecular adaptations that occur within muscle fibers and its failure to evaluate the metabolic and underlying mechanisms of muscle tissue 51, 57.

Muscle Biopsy
Muscle biopsies are a safe procedure. An anesthetic is used to numb the site, then a large, pencil-sized needle is inserted through the skin, underlying subcutaneous tissues and fascia to reach the skeletal muscle, where a tissue sample is clipped and removed. Muscle biopsy samples can be used to assess microscopic and molecular changes to skeletal muscle. When evaluating microscopic changes, the sample is frozen, thinly sliced, attached to a slide and stained (depending on the method used) to determine fiber cross sectional area (fCSA) or fiber type-specific cross sectional area 51. Molecular assessment of muscle growth goes a step deeper than microscopic assessment and analyzes the changes in protein sub-fractions (the components that make up muscle fibers) such as actin, myosin, or sarcoplasmic protein concentrations 51. When evaluating molecular changes there are a variety of different methods and protocols available. The limitations of muscle biopsies share some similarities with those of B-mode ultrasound. Muscle biopsies only measure the site where the sample was extracted, and any observed changes are assumed to also occur in the surrounding fibers even though, as previously mentioned, muscle growth can vary throughout the muscle. Additionally,
the difference in tissue processing methods between labs and the lack of standardization make it difficult to compare findings between studies 51. Lastly, it’s impossible to perform a biopsy in the same location twice, so biopsy samples within the same study could be comparing the changes of different regions of the measured muscle. For a more comprehensive and in-depth review of measurements relating to muscle hypertrophy we strongly suggest a review by Haun et al. (2018) 51.

Energy Expenditure

Energy expenditure is essentially a measure of heat production. Cellular metabolism results in heat production, and measuring the body’s rate of heat production gives a direct assessment of metabolic rate 21. We can measure heat production directly or indirectly by measuring the exchange of gases (carbon dioxide and oxygen).

Indirect Calorimetry
Resting metabolic rate or resting energy expenditure (REE) is an estimation of the amount of energy an individual expends at rest (lying on a bed) over a 24-hour period. This estimation is derived from analysis of the volume of air breathed during a specified period of time and the composition of expired air 21. The most accepted way to determine REE is via indirect calorimetry, using a device known as a metabolic cart. The metabolic cart analyzes the volume and composition of the air the participant is breathing. To accomplish this, the metabolic cart includes a computer interface to display data output recorded by a device that continuously measures the subject’s expired air, a flow-measuring device to record the volume of air breathed and a small gas chamber that analyzes the oxygen and carbon dioxide composition of expired air 21. The subject lies supine on a table with a facemask or a plastic canopy that collects the air breathed, which travels through a long tube to the metabolic cart. The device then estimates the number of calories per day the participant uses at rest, based on the volume of air breathed and the composition of expired air, accounting for ambient air temperature and composition. Substrate utilization is assessed within the same test and is calculated
from the volume of carbon dioxide produced divided by the volume of oxygen consumed, known as the respiratory quotient (RQ). The RQ value is used to determine if a greater percentage of calories burned comes from fat or carbohydrates. This method of measuring energy expenditure is non-invasive and takes approximately 20 minutes to complete, with the first 5 minutes discarded for calibration purposes and the remaining 15 minutes used to extrapolate the data into a 24-hour period.

The obvious drawbacks to this method are the high cost associated with the device and the need for the subject to be at rest (not sleeping) for 20-minute time periods. This method also requires careful calibration between tests and controlling for variables by testing subjects fasted, prior to any food or drink consumption.

Doubly Labeled Water Technique
The doubly labeled water technique involves consuming a quantity of water with a known concentration of non-radioactive stable isotope forms of hydrogen and oxygen 21. This method estimates average daily energy expenditure in free-living conditions once the isotopes have distributed throughout all bodily fluids (roughly 5 hours) 21. These labeled isotopes serve as tracers and can be measured as they leave the body through sweat, urine, pulmonary vapor, and carbon dioxide (CO2). The difference between elimination rates of the two isotopes is determined using an isotope ratio mass spectrometer and allows for an estimate of total CO2 production 21. During the observation period (several days to weeks), researchers measure a urine or saliva sample for concentrations of the enriched isotopes for estimation of CO2 production rate. Researchers then use this estimated carbon dioxide production rate and the subject’s RQ to calculate energy expenditure. This technique has a high cost associated with it, which results in low sample sizes, and it doesn’t allow for evaluation of day to day variations in energy expenditure. However, this method allows prolonged assessment periods that don’t interfere with everyday life or physical activity. This method also serves as a criterion to validate other methods since it is accurate to within 3-5% on average when compared to direct measurements of energy expenditure in controlled settings 21. Drawbacks to this technique are that it does not assess what is contributing to changes in energy expenditure (BMR vs. NEAT vs. TEF vs. exercise) and that it has been demonstrated to possibly overestimate energy expenditure in low carb diets 64.

Direct Calorimetry
Direct calorimetry is the most controlled and accurate measure available for estimating energy expenditure. This is accomplished using a metabolic ward or metabolic chamber that houses subjects in a room-sized chamber. The chamber has an inlet for oxygen to flow into the chamber and an outlet for CO2 to exit. There is also a layer of water surrounding the chamber
and as the subject’s heat is dissipated, it warms that layer of water. By knowing the volume of water and the temperature change of the water, researchers can calculate heat production and, from that, energy expenditure. This type of measurement is highly expensive, which is why few labs have this available for measuring energy expenditure. For this reason, you will often find energy expenditure to be measured using indirect methods. When this type of measurement is used in studies, you’ll see they use small sample sizes to account for the high cost. But these types of studies will always be more powerful than a study that uses an indirect measurement.

Hormones

Hormones are chemical messengers synthesized in specific glands and transported in the blood to targeted cells or receptors to elicit a physiological response. Hormone secretion rarely occurs at a constant rate and adjusts rapidly to meet the demands of the body 21. Various sources can impact a hormone's secretion rate depending on the magnitude of chemical stimulatory or inhibitory input 21. The secreted amount of a hormone is indicative of its blood plasma concentration 21. Hormones are most commonly tested from blood draws and analyzed based on their blood serum concentrations. Some hormones can be tested through saliva, which offers a less invasive and more cost-effective method. Cortisol is commonly assessed from saliva and has been shown to have a linear correlation with blood concentrations 58. However, the correlation was low and blood concentrations could not be inferred from salivary cortisol concentrations 58. Also, the concentrations of hormones in saliva are much lower than the concentrations in the blood, which could indicate that salivary measures are more indirect and imply passive diffusion rather than active secretion 58. There are a number of factors that can impact the results of salivary hormone testing, but when procedures are standardized and certain variables are accounted for, salivary hormone testing can be a good and acceptable method, especially if blood testing is not an option. However, blood testing will give a better indication of the secreted hormone concentration. We have frequently noticed that many individuals in the fitness industry place hormones on a pedestal and unreasonably emphasize hormone data. While hormone results are objective and excellent physiological outcomes, they shouldn’t be exaggerated. There are a few things you want to consider when evaluating hormone results. Most hormones are secreted based on a variety of stimuli, while other hormones follow specific daily secretory cycles (diurnal pattern) or multi-week cycles 21. If not performed properly, one single blood draw will fail to account for the specific secretory pattern of certain hormones and won’t tell you anything about the changes that occurred following a treatment. Even with multiple testing points, the secretory pattern must be acknowledged or flaws in the analysis and interpretation of results can occur. Acute changes in hormones don’t necessarily lead to long-term adaptations. A great example of this is the hormone hypothesis for muscle growth. It was commonly believed that the acute increases in anabolic hormones like testosterone and growth hormone following resistance training lead to greater muscle growth. However, this has been discredited by a comprehensive review that explains how acute post-exercise increases in systemic hormones are not a proxy measure for increased muscle growth 59. Rather, these transient increases in hormone concentrations are more likely due to changes in fuel demand and increased fuel mobilization to support exercise. Hormone data gives us an objective measure for assessing underlying physiological responses to certain treatments, but it only gives us a snapshot of physiological changes during the specific measurement points. Therefore, it’s imperative to have a comprehensive understanding of specific hormones and underlying physiology when evaluating hormone data.
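
Looking back at the energy expenditure methods covered earlier, the arithmetic behind indirect and direct calorimetry can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and example values are hypothetical, the abbreviated Weir equation is one commonly used way to convert gas exchange into calories (not necessarily what any particular metabolic cart uses), and the direct calorimetry function ignores the corrections a real chamber would apply.

```python
def respiratory_quotient(vco2_l_min, vo2_l_min):
    """RQ = volume of CO2 produced / volume of O2 consumed."""
    return vco2_l_min / vo2_l_min


def ree_kcal_per_day(vo2_l_min, vco2_l_min):
    """Resting energy expenditure from gas exchange.

    Uses the abbreviated Weir equation (an assumption here):
    kcal/min = 3.941 x VO2 + 1.106 x VCO2, with both in L/min,
    then extrapolates the per-minute value to 24 hours (1440 min).
    """
    kcal_per_min = 3.941 * vo2_l_min + 1.106 * vco2_l_min
    return kcal_per_min * 1440


def heat_from_water_kcal(water_mass_kg, temp_rise_c, specific_heat=1.0):
    """Heat absorbed by the water layer in direct calorimetry.

    Q = m x c x dT, taking the specific heat of water as roughly
    1.0 kcal per kg per degree Celsius.
    """
    return water_mass_kg * specific_heat * temp_rise_c


# Hypothetical resting values averaged over the 15 usable minutes of a test.
vo2, vco2 = 0.25, 0.20
print(f"RQ: {respiratory_quotient(vco2, vo2):.2f}")
print(f"REE: {ree_kcal_per_day(vo2, vco2):.0f} kcal/day")
print(f"Heat: {heat_from_water_kcal(1000.0, 0.1):.0f} kcal")
```

An RQ near 0.7 suggests mostly fat is being used as fuel and a value near 1.0 suggests mostly carbohydrate, which is how the same test provides information about substrate utilization.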

placement; if variables like these are not accounted for
Muscle Excitation it can make comparisons between different exercises
inappropriate 60. The take home message from this
Electromyography (EMG)

Electromyography is a measure of how the neuromuscular system is behaving 60. In exercise science, EMG is commonly used to investigate variables such as muscle activation, force production, muscle recruitment, muscle strength and hypertrophy (which is problematic, as we’ll discuss). Surface EMG (sEMG) is the most frequently used device and is highly sensitive to increases and decreases in voltage that occur on the muscle fiber membrane 60. Small electrodes are placed on the surface of the skin over the muscle group(s) of interest. The electrodes transmit the detected electrical impulses to a computer that displays a graphical representation of the voltage amplitude readings. Great caution is needed when reading studies using EMG as a primary outcome due to the complicated nature of sEMG and lack of longitudinal work 60. Amplitudes measured with sEMG are the most frequently reported metric in EMG experiments. They are a measure of excitation, not a direct measure of activation, and sEMG amplitudes by themselves cannot be used to infer motor unit recruitment or rate coding 60. In other words, sEMG cannot tell us if a certain exercise is recruiting more muscle fibers or if the muscle fibers that are activated are firing at a faster rate. Additionally, the passive properties of muscle allow force production to occur with a corresponding sEMG amplitude reading of zero, indicating sEMG amplitudes cannot reliably predict muscle force during dynamic tasks 60, 61.

It has been assumed that greater sEMG amplitudes from certain exercises can be used to predict long-term adaptations in strength and muscle hypertrophy. Whether this holds is currently unknown, and conclusions should be interpreted with caution due to sEMG’s inability to account for muscle properties and the number of other variables that can impact hypertrophy and strength adaptations 60. There are a number of factors that can influence sEMG amplitudes aside from muscular effort, such as muscle length, contraction type, contraction speed, tissue conductivity, and electrode [...] discussion is that EMG data can be very messy and misconstrued with inappropriate conclusions, but that’s not to say that all EMG studies/data are useless. The previously mentioned confounding variables need to be controlled when using EMG as a primary outcome. Comparing different exercises using a within-subject and within-muscle design (comparing pre- and post-test results from the same person and same muscle in the same testing session) may provide more reliable data on muscular excitation and force production when amplitude signals are appropriately normalized and other variables are controlled 60. While EMG data can be useful for understanding the neuromuscular system, the conclusions and recommendations are currently limited by lack of longitudinal studies 60.

Strength Testing

Strength is a skill and highly specific, not only to the type of exercise, but also to the intensity and rep range you consistently train at. This can be problematic when attempting to measure and compare strength adaptations between groups exposed to different training programs.

Repetition Maximum (RM)
Repetition maximum (RM) is the most commonly used test of strength you’ll encounter in the literature. An RM test can be used for any number of repetitions to assess the maximum amount of weight a subject can lift for a specified number of repetitions; most often a 1RM test is utilized. This type of test lends itself to a certain level of subjectivity because load selection is dependent on the research staff who are supervising. If researchers over- or underestimate the load change, it can lead to a subject not achieving a true maximum and falling short due to fatigue from repeated max attempts. So long as proper standardization is applied with an established protocol, subjectivity can be minimized.
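
To make the idea of a standardized protocol concrete, here is a minimal Python sketch of a 1RM test in which the next load follows a fixed rule instead of the tester’s judgment. Everything in it is a hypothetical illustration and not a validated protocol: the function name `run_1rm_test`, the fixed increment, the cap of five attempts (standing in for accumulating fatigue), and the assumption that a lift succeeds whenever the load is at or below the subject’s true maximum.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Attempt:
    load_kg: float
    success: bool


def run_1rm_test(
    true_max_kg: float,
    first_attempt_kg: float,
    increment_kg: float = 5.0,
    max_attempts: int = 5,
) -> Tuple[Optional[float], List[Attempt]]:
    """Simulate a 1RM test that uses a fixed load-increase rule.

    Hypothetical simplifications: a lift succeeds whenever the load is at or
    below true_max_kg, and max_attempts stands in for fatigue by capping how
    many maximal attempts are allowed.
    """
    attempts: List[Attempt] = []
    best: Optional[float] = None
    load = first_attempt_kg
    for _ in range(max_attempts):
        success = load <= true_max_kg
        attempts.append(Attempt(load, success))
        if not success:
            break  # the test ends at the first failed lift
        best = load  # heaviest successful lift so far
        load += increment_kg
    return best, attempts


if __name__ == "__main__":
    # Same subject (true max 100 kg, first attempt 90 kg), three increment rules.
    for step in (5.0, 1.0, 20.0):
        best, attempts = run_1rm_test(100.0, 90.0, increment_kg=step)
        print(f"increment {step} kg: measured 1RM = {best} kg in {len(attempts)} attempts")
```

In this toy setup a 5 kg increment finds the 100 kg maximum in four attempts, a 1 kg increment runs out of attempts at 94 kg, and a 20 kg increment overshoots and records only 90 kg. Both poorly chosen increments underestimate the true maximum, which is the load-selection problem described above.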
Generally, most protocols involve a few warm-up sets with progressively heavier weight in each set. After a few minutes of rest between sets, the subject then makes a near-maximal attempt. After each completed attempt, the weight is increased at the researcher’s discretion. A skilled researcher should find a 1RM within three to five attempts. Some researchers suggest that if the only measure of strength in a study is a 1RM, it may overlook strength adaptations because the 1RM is a skill and will improve most when training closely reflects the 1RM test 62. For example, if you have one group training closer to their 1RM during a training period, theoretically they would perform better at a 1RM test than a group who trains further away from their 1RM, making it difficult to compare adaptations. For this reason, it may be more appropriate to include a test that both groups are inexperienced at performing. Tests using dynamometers can accomplish this and provide a more objective measure.

Dynamometry
There are various types of dynamometers used in research. They range from spring- or hydraulic-loaded dynamometers to highly sophisticated computerized dynamometers that can isolate various types of contractions and force outputs. The spring and hydraulic dynamometers are usually used to measure forearm isometric strength with a hand grip dynamometer. These are simply a handle that you squeeze and hold for a few seconds to measure the pounds or kilograms of force that you generate. The more sophisticated computerized devices offer more functions and provide a more comprehensive evaluation of muscle function and strength. These can range from simple handgrip dynamometers to leg extension and mechanized squat devices. Computerized devices, like the knee extension dynamometer, can tightly control the range of motion, the duration of each rep and the force applied throughout different ranges of motion. These devices can measure a number of variables including maximal voluntary isometric contraction, rate of isometric force development, power, torque, and velocity. Generally, you’ll see rate of force development, maximal voluntary contraction and peak power reported for various contraction types (eccentric, concentric, isometric, etc.).

Psychometrics

Psychometrics measure psychological constructs such as moods, behaviors, and personality traits. These are measured using various types of questionnaires, surveys and interviews. These types of measures fall under descriptive research and are widely used in education and behavioral sciences 3. Questionnaires and surveys can utilize open-ended questions, closed questions or a combination of the two. Open-ended questionnaires provide more opportunity for subjects to elaborate or provide detailed information about their feelings or ideas. For example, “Why did you struggle to adhere to your diet?” While these types of questions can gather a lot of detailed information, they require considerable time, and the answers are difficult to score or compare between subjects or groups. Closed questions require a specific response and are commonly yes-or-no questions. These types of questionnaires are faster to administer and score than open-ended questionnaires. With an appropriate scoring system, closed-question surveys can be used to compare answers between subjects or groups of subjects. Closed questions also include different formats such as scaled, ranking or categorical questions. A very common iteration of a scaled questionnaire is known as a visual analog scale (VAS), which has a line with corresponding answer choices and equal intervals between answers that indicate the strength of agreement or disagreement with a statement 3. The problem with some questionnaires and surveys is how questions are worded. Some questions may be worded in a way that makes subjects feel there is a “right” or “wrong” answer and change their true response to try to satisfy the questionnaire. So, it’s important that questionnaires are developed with appropriate wording that doesn’t bias the subject toward a certain answer, and even the order in which the questions are placed can play a role. Also, the more a subject repeats a specific survey or questionnaire the more likely it is to bias their
answers, since they become more familiar with the questionnaire. For this reason, it’s important to allow a proper amount of time between testing points for questionnaires and surveys. It’s not a requirement for a questionnaire to be validated prior to its use in an experiment, but a validated questionnaire carries greater weight. Generally, a lot of questionnaires used in exercise and nutrition science have been validated in the field of psychology or other medical-related fields, which indicates they are acceptable for use, but often these questionnaires should be developed specifically for the sample being studied. There are many different psychometric questionnaires and surveys available. You’ll most often see a number of different Likert or VAS scales used, along with others like the Profile of Mood States questionnaire, the Three-Factor Eating Questionnaire, or the Pittsburgh Sleep Quality Index. Psychometrics can be a very useful, cost-effective tool that is easily implemented alongside other, more objective measures to provide an in-depth evaluation.

Once again, we’d like to reiterate that this is not a complete list of all of the available measurements in exercise and nutrition research. There are many others available, but generally these are the ones you will frequently encounter when reading through nutrition and exercise science research publications. When measuring any outcome, it’s best to use a combination of measures whenever possible. This can help to provide a more comprehensive evaluation of the outcomes of interest and control for more variables. Obviously, that’s not always possible for some labs, and they have to work with the equipment they have available. This means that, as consumers of research, we need to critically evaluate the methods that are used in experiments and take them for what they’re worth. Again, just because a study uses a less valid and reliable method doesn’t mean we should throw it out. Instead, we can use it as a small piece of evidence and compare it with other studies.



Closing Remarks
There is no substitute for spending years in a lab actually working on a study, but hopefully this handbook has provided insight into the research process. Participating in research is really the only way you can appreciate and understand how much goes into conducting and publishing a study. Unfortunately, not everyone has the opportunity to attend higher education, which was the primary motive for this handbook: to share our knowledge and understanding of research based on our experiences conducting studies in nutrition and exercise science. Keep in mind that this is a relatively short and non-comprehensive guide to what goes into conducting and publishing a study. Please take a look at some of our references, which include some excellent books worth investing in if you want to learn more about the research process.

If you want to be more active in research but you don’t have the opportunity to attend a university, many labs in your area would be grateful to have another helping hand. Simply reaching out to a professor in your area who is conducting research that interests you can help guide you in the right direction.

We hope you’ll consider subscribing to our research review as part of the [Biolayne.com](http://Biolayne.com) membership. Each month we review 5 scientific studies related to training, nutrition, supplements, muscle growth, fat loss, and other topics related to health and fitness. Our goal is to provide our own opinions and criticisms and dig into the important nuances, but also to summarize the studies in a digestible format for the non-scientist. Our mission for this review is to establish a resource that allows individuals to stay up to date on current research and aware of the general consensus on specific topics without a major time commitment.



References
1. Oxford University Press. (n.d.). Research English definition and meaning. Lexico Dictionaries | English.
2. Tuckman, B. W., & Harper, B. E. (2012). Conducting educational research. Rowman & Littlefield Publishers.
3. Thomas, J. R., Nelson, J. K., & Silverman, S. J. (2015). Research methods in physical activity. Human Kinetics.
4. Hopkins, W. G. (2000). Quantitative research design. Sportscience, 4(1), 1–8.
5. Draper, C. E. (2009). Role of qualitative research in exercise science and sports medicine. South African Journal of Sports Medicine, 21(1), 27–28.
6. Barré-Sinoussi, F., & Montagutelli, X. (2015). Animal models are essential to biological research: issues and perspectives. Future science OA, 1(4).
7. Baxter, P., & Jack, S. (2008). Qualitative Case Study Methodology: Study Design and Implementation for Novice Researchers. The Qualitative Report, 13(4), 544–559.
8. Cooper, H., Hedges, L. V., & Valentine, J. C. (Eds.). (2019). The handbook of research synthesis and meta-analysis. Russell Sage Foundation.
9. Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational researcher, 5(10), 3–8.
10. Ware, W. B., Ferron, J. M., & Miller, B. M. (2013). Introductory statistics: A conceptual approach using R. Routledge.
11. Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Frontiers in psychology, 4, 863.
12. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Routledge.
13. King, L. (2018). Preparing better graphs. Journal Of Public Health And Emergency, 2(1).
14. Morton, R. W., Murphy, K. T., McKellar, S. R., Schoenfeld, B. J., Henselmans, M., Helms, E., Aragon, A. A., Devries, M. C., Banfield, L., Krieger, J. W., & Phillips, S. M. (2018). A systematic review, meta-analysis and meta-regression of the effect of protein supplementation on resistance training-induced gains in muscle mass and strength in healthy adults. British journal of sports medicine, 52(6), 376–384.
15. Cumming, G., Fidler, F., & Vaux, D. L. (2007). Error bars in experimental biology. The Journal of cell biology, 177(1), 7–11.
16. Mlinarić, A., Horvat, M., & Šupak Smolčić, V. (2017). Dealing with the positive publication bias: Why you should really publish your negative results. Biochemia medica, 27(3), 030201.
17. Head, M. L., Holman, L., Lanfear, R., Kahn, A. T., & Jennions, M. D. (2015). The extent and consequences of p-hacking in science. PLoS biology, 13(3), e1002106.
18. Krleža-Jerić, K. (2014). Sharing of clinical trial data and research integrity. Periodicum biologorum, 116(4), 337–339.
19. Moon, J. R., Eckerson, J. M., Tobkin, S. E., Smith, A. E., Lockwood, C. M., Walter, A. A., Cramer, J. T., Beck, T. W., & Stout, J. R. (2009). Estimating body fat in NCAA Division I female athletes: a five-compartment model validation of laboratory methods. European journal of applied physiology, 105(1), 119–130.
20. Smith-Ryan, A. E., Mock, M. G., Ryan, E. D., Gerstner, G. R., Trexler, E. T., & Hirsch, K. R. (2017). Validity and reliability of a 4-compartment body composition model using dual energy x-ray absorptiometry-derived body volume. Clinical nutrition (Edinburgh, Scotland), 36(3), 825–830.
21. McArdle, W. D., Katch, F. I., & Katch, V. L. (2015). Exercise Physiology: Nutrition, Energy, and Human Performance. United Kingdom: Wolters Kluwer Health/Lippincott Williams & Wilkins.
22. Campbell, B. (Ed.). (2013). Sports nutrition: Enhancing athletic performance. CRC Press.
23. Rooyackers, O. E., & Nair, K. S. (1997). Hormonal regulation of human muscle protein metabolism. Annual review of nutrition, 17, 457–485.
24. Afting, E. G., Bernhardt, W., Janzen, R. W., & Röthig, H. J. (1981). Quantitative importance of non-skeletal-muscle N tau-methylhistidine and creatine in human urine. The Biochemical journal, 200(2), 449–452.
25. Katsanos, C. S., Chinkes, D. L., Sheffield-Moore, M., Aarsland, A., Kobayashi, H., & Wolfe, R. R. (2005). Method for the determination of the arteriovenous muscle protein balance during non-steady-state blood and muscle amino acid concentrations. American journal of physiology. Endocrinology and metabolism, 289(6), E1064–E1070.
26. Kerr, A., Slater, G., Byrne, N., & Chaseling, J. (2015). Validation of Bioelectrical Impedance Spectroscopy to Measure Total Body Water in Resistance-Trained Males. International journal of sport nutrition and exercise metabolism, 25(5), 494–503.
27. Schoenfeld, B. J., Nickerson, B. S., Wilborn, C. D., Urbina, S. L., Hayward, S. B., Krieger, J., Aragon, A. A., & Tinsley, G. M. (2020). Comparison of Multifrequency Bioelectrical Impedance vs. Dual-Energy X-ray Absorptiometry for Assessing Body Composition Changes After Participation in a 10-Week Resistance Training Program. Journal of strength and conditioning research, 34(3), 678–688.
28. Graybeal, A. J., Moore, M. L., Cruz, M. R., & Tinsley, G. M. (2020). Body Composition Assessment in Male and Female Bodybuilders: A 4-Compartment Model Comparison of Dual-Energy X-Ray Absorptiometry and Impedance-Based Devices. Journal of strength and conditioning research, 34(6), 1676–1689.



29. Haas, V., Schütz, T., Engeli, S., Schröder, C., Westerterp, K., & Boschmann, M. (2012). Comparing single-frequency bioelectrical impedance analysis against deuterium dilution to assess total body water. European journal of clinical nutrition, 66(9), 994–997.
30. Moon, J. R. (2013). Body composition in athletes and sports nutrition: an examination of the bioimpedance analysis technique. European journal of clinical nutrition, 67 Suppl 1, S54–S59.
31. Matias, C. N., Santos, D. A., Gonçalves, E. M., Fields, D. A., Sardinha, L. B., & Silva, A. M. (2013). Is bioelectrical impedance spectroscopy accurate in estimating total body water and its compartments in elite athletes? Annals of human biology, 40(2), 152–156.
32. Cole, K. S. (1940). Permeability and impermeability of cell membranes for ions. In Cold Spring Harbor Symposia on Quantitative Biology. Cold Spring Harbor Laboratory Press.
33. Matthie, J. R. (2008). Bioimpedance measurements of human body composition: critical analysis and outlook. Expert review of medical devices, 5(2), 239–261.
34. Siri, W. E., Brozek, J., & Henschel, A. (1961). Techniques for measuring body composition. Washington, DC: National Academy of Sciences, 223–224.
35. Withers, R. T., Craig, N. P., Bourdon, P. C., & Norton, K. I. (1987). Relative body fat and anthropometric prediction of body density of male athletes. European journal of applied physiology and occupational physiology, 56(2), 191–200.
36. Orphanidou, C., McCargar, L., Birmingham, C. L., Mathieson, J., & Goldner, E. (1994). Accuracy of subcutaneous fat measurement: comparison of skinfold calipers, ultrasound, and computed tomography. Journal of the American Dietetic Association, 94(8), 855–858.
37. van Marken Lichtenbelt, W. D., Hartgens, F., Vollaard, N. B., Ebbing, S., & Kuipers, H. (2004). Body composition changes in bodybuilders: a method comparison. Medicine and science in sports and exercise, 36(3), 490–497.
38. Evans, E. M., Saunders, M. J., Spano, M. A., Arngrimsson, S. A., Lewis, R. D., & Cureton, K. J. (1999). Body-composition changes with diet and exercise in obese women: a comparison of estimates from clinical methods and a 4-component model. The American journal of clinical nutrition, 70(1), 5–12.
39. Peterson, M. J., Czerwinski, S. A., & Siervogel, R. M. (2003). Development and validation of skinfold-thickness prediction equations with a 4-compartment model. The American journal of clinical nutrition, 77(5), 1186–1191.
40. Kuehne, T. E., Yitzchaki, N., Jessee, M. B., Graves, B. S., & Buckner, S. L. (2019). A comparison of acute changes in muscle thickness between A-mode and B-mode ultrasound. Physiological measurement, 40(11), 115004.
41. Wagner, D. R. (2013). Ultrasound as a tool to assess body fat. Journal of obesity, 2013, 280713.
42. Schoenfeld, B. J., Aragon, A. A., Moon, J., Krieger, J. W., & Tiryaki-Sonmez, G. (2017). Comparison of amplitude-mode ultrasound versus air displacement plethysmography for assessing body composition changes following participation in a structured weight-loss programme in women. Clinical physiology and functional imaging, 37(6), 663–668.
43. Wang, Z., Pi-Sunyer, F. X., Kotler, D. P., Wielopolski, L., Withers, R. T., Pierson, R. N., Jr, & Heymsfield, S. B. (2002). Multicomponent methods: evaluation of new and traditional soft tissue mineral models by in vivo neutron activation analysis. The American journal of clinical nutrition, 76(5), 968–974.
44. Friedl, K. E., DeLuca, J. P., Marchitelli, L. J., & Vogel, J. A. (1992). Reliability of body-fat estimations from a four-compartment model by using density, body water, and bone mineral measurements. The American journal of clinical nutrition, 55(4), 764–770.
45. Wilson, J. P., Strauss, B. J., Fan, B., Duewer, F. W., & Shepherd, J. A. (2013). Improved 4-compartment body-composition model for a clinically accessible measure of total body protein. The American journal of clinical nutrition, 97(3), 497–504.
46. Nickerson, B. S., & Tinsley, G. M. (2018). Utilization of BIA-Derived Bone Mineral Estimates Exerts Minimal Impact on Body Fat Estimates via Multicompartment Models in Physically Active Adults. Journal of clinical densitometry: the official journal of the International Society for Clinical Densitometry, 21(4), 541–549.
47. Williams, J. E., Wells, J. C., Wilson, C. M., Haroun, D., Lucas, A., & Fewtrell, M. S. (2006). Evaluation of Lunar Prodigy dual-energy X-ray absorptiometry for assessing body composition in healthy persons and patients by comparison with the criterion 4-component model. The American journal of clinical nutrition, 83(5), 1047–1054.
48. Clasey, J. L., Kanaley, J. A., Wideman, L., Heymsfield, S. B., Teates, C. D., Gutgesell, M. E., Thorner, M. O., Hartman, M. L., & Weltman, A. (1999). Validity of methods of body composition assessment in young and older men and women. Journal of applied physiology (Bethesda, Md.: 1985), 86(5), 1728–1738.
49. Kullberg, J., Brandberg, J., Angelhed, J. E., Frimmel, H., Bergelin, E., Strid, L., Ahlström, H., Johansson, L., & Lönn, L. (2009). Whole-body adipose tissue analysis: comparison of MRI, CT and dual energy X-ray absorptiometry. The British journal of radiology, 82(974), 123–130.
50. Tothill, P., & Hannan, W. J. (2000). Comparisons between Hologic QDR 1000W, QDR 4500A, and Lunar Expert dual-energy X-ray absorptiometry scanners used for measuring total body bone and soft tissue. Annals of the New York Academy of Sciences, 904, 63–71.
51. Haun, C. T., Vann, C. G., Roberts, B. M., Vigotsky, A. D., Schoenfeld, B. J., & Roberts, M. D. (2019). A Critical Evaluation of the Biological Construct Skeletal Muscle Hypertrophy: Size Matters but So Does the Measurement. Frontiers in physiology, 10, 247.
52. Vigotsky, A. D., Schoenfeld, B. J., Than, C., & Brown, J. M. (2018). Methods matter: the relationship between strength and hypertrophy depends on methods of measurement and analysis. PeerJ, 6, e5071.
53. Haun, C. T., Vann, C. G., Mobley, C. B., Roberson, P. A., Osburn, S. C., Holmes, H. M., Mumford, P. M., Romero, M. A., Young, K. C., Moon, J. R., Gladden, L. B., Arnold, R. D., Israetel, M. A., Kirby, A. N., & Roberts, M. D. (2018). Effects of Graded Whey Supplementation During Extreme-Volume Resistance Training. Frontiers in nutrition, 5, 84.
54. Ward, L. C. (2018). Human body composition: yesterday, today, and tomorrow. European journal of clinical nutrition, 72(9), 1201–1207.



55. Verdijk, L. B., Gleeson, B. G., Jonkers, R. A., Meijer, K., Savelberg, H. H., Dendale, P., & van Loon, L. J. (2009). Skeletal muscle hypertrophy following resistance training is accompanied by a fiber type-specific increase in satellite cell content in elderly men. The journals of gerontology. Series A, Biological sciences and medical sciences, 64(3), 332–339.
56. Smeulders, M. J., van den Berg, S., Oudeman, J., Nederveen, A. J., Kreulen, M., & Maas, M. (2010). Reliability of in vivo determination of forearm muscle volume using 3.0 T magnetic resonance imaging. Journal of magnetic resonance imaging: JMRI, 31(5), 1252–1255.
57. Hellerstein, M., & Evans, W. (2017). Recent advances for measurement of protein synthesis rates, use of the ‘Virtual Biopsy’ approach, and measurement of muscle mass. Current opinion in clinical nutrition and metabolic care, 20(3), 191–200.
58. Rantonen, P. J., Penttilä, I., Meurman, J. H., Savolainen, K., Närvänen, S., & Helenius, T. (2000). Growth hormone and cortisol in serum and saliva. Acta odontologica Scandinavica, 58(6), 299–303.
59. West, D. W., Burd, N. A., Staples, A. W., & Phillips, S. M. (2010). Human exercise-mediated skeletal muscle hypertrophy is an intrinsic process. The international journal of biochemistry & cell biology, 42(9), 1371–1375.
60. Vigotsky, A. D., Halperin, I., Lehman, G. J., Trajano, G. S., & Vieira, T. M. (2018). Interpreting Signal Amplitudes in Surface Electromyography Studies in Sport and Rehabilitation Sciences. Frontiers in physiology, 8, 985.
61. Roberts, T. J., & Gabaldón, A. M. (2008). Interpreting muscle function from EMG: lessons learned from direct measurements of muscle force. Integrative and comparative biology, 48(2), 312–320.
62. Buckner, S. L., Jessee, M. B., Mattocks, K. T., Mouser, J. G., Counts, B. R., Dankel, S. J., & Loenneke, J. P. (2017). Determining Strength: A Case for Multiple Methods of Measurement. Sports medicine (Auckland, N.Z.), 47(2), 193–195.
63. Norton, L. E., Wilson, G. J., Layman, D. K., Moulton, C. J., & Garlick, P. J. (2012). Leucine content of dietary proteins is a determinant of postprandial skeletal muscle protein synthesis in adult rats. Nutrition & metabolism, 9(1), 67.
64. Hall, K. D., Guo, J., Chen, K. Y., Leibel, R. L., Reitman, M. L., Rosenbaum, M., Smith, S. R., & Ravussin, E. (2019). Methodologic considerations for measuring energy expenditure differences between diets varying in carbohydrate using the doubly labeled water method. The American journal of clinical nutrition, 109(5), 1328–1334.

© Copyright 2022 Biolayne Technologies LLC
