## 4. Basic Statistical Concepts

**Traditional design:** Examine 2 groups of participants: take a pretest measure, apply the intervention to one group, take a post-test measure, and look for statistical differences between the groups.

**1. Probability (p):** What amount of correlation is necessary in order for there to be support for a hypothesis?

**What amount of a measure is due to chance?**

- Prior to analysis, the researcher must decide on the probability level they will accept.
- The vast majority of the time, a probability level of **0.05** is used.
- A "p" of 0.05 means that, if the intervention truly had no effect, you would expect to find a result of this magnitude **by chance** only 5 in 100 times. Conversely, chance alone would fail to produce a result this large 95 times in 100, which is why such a result is treated as unlikely to be a fluke.
- This is often expressed as **statistical significance**: a result is determined to be statistically significant.
- Reaching statistical significance shows only that the study had sufficient statistical power to detect a difference; it does not demonstrate clinical significance.
- There is no "almost significant": p = 0.06 does not meet a 0.05 threshold.
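To make "a result of this magnitude by chance" concrete, here is a minimal sketch of a permutation test applied to the traditional two-group design. It is one simple way to estimate a p-value, not the method of any particular study. The function name `permutation_p_value`, the change scores, and the group sizes are all made up for illustration.

```python
import random

def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the fraction of random relabelings of the participants whose
    mean difference is at least as large (in absolute value) as the
    difference actually observed. That fraction estimates p.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend group labels were handed out by chance
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_shuffles

# Hypothetical change scores (post-test minus pretest), invented for this sketch.
intervention = [6, 8, 5, 9, 7, 10, 6, 8]
control = [2, 3, 1, 4, 2, 5, 3, 2]

ALPHA = 0.05  # level of acceptance, chosen before the analysis
p = permutation_p_value(intervention, control)
print(f"p = {p:.4f}; statistically significant at {ALPHA}: {p <= ALPHA}")
```

If the two groups had identical scores, every relabeling would match the observed difference and the function would return 1.0, meaning the result is fully explained by chance.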

__Take away: Watch for a p-value less than or equal to 0.05.__

**2. Correlation (r):**

**Correlational analysis involves quantifying the relationship between 2 variables**:

- Exercise physiologists look at the relationship between exercise intensity and VO2max.
- Biomechanists look at how changes in lower-extremity (LE) joint contact forces are related to changes in gait kinematics.

Correlations can also be negative (r = -1.0 is a perfect negative correlation), for instance VO2max against time on a 1-mile run: the higher the VO2max, the shorter the run time.

**The stronger the correlation, whether positive or negative, the better the ability to predict.**
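As an illustration, the sketch below computes Pearson's r by hand for the VO2max versus 1-mile run example. The helper name `pearson_r` and the data values are hypothetical, chosen only to show a strong negative correlation.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of the deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return cross / (spread_x * spread_y)

# Hypothetical data for 6 runners, invented for this sketch.
vo2max = [35, 40, 45, 50, 55, 60]                 # ml/kg/min
mile_time_min = [10.5, 9.6, 8.9, 8.1, 7.4, 6.8]   # minutes

r = pearson_r(vo2max, mile_time_min)
print(f"r = {r:.3f}")  # negative: higher VO2max goes with a shorter run time
```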

__Take away: The closer the correlation values (r) in a study are to 1.0 or -1.0, the stronger the relationship between the variables and the better one predicts the other.__

**Validity:** The ability of a study to reflect the **true** state of the variables being tested in the population of interest.

**Internal Validity:** Challenge to keep the environment as controlled as possible.

**External Validity:** Challenge to keep the environment as realistic and natural as possible.

Do you notice how the two can easily cause a conflict?

**Example 1 -- Internal Validity:**

2 scenarios: A researcher wants to compare the efficacy of 2 stretching programs on knee ROM in ACL reconstruction surgery patients.

- Scenario 1: She demonstrates the exercises to 20 patients, who choose 1 of the 2 programs to do at home and then return to be assessed.
- Scenario 2: Patients are assessed and randomly divided into 2 groups. They perform the programs under supervision in the clinic, are not allowed any other programs, and are then re-assessed.

Scenario 2 has the stronger internal validity: random assignment, supervision, and the ban on other programs control variables that Scenario 1 leaves open.
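The random division in Scenario 2 can be sketched in a few lines. This is only an illustration: the helper `randomly_assign`, the patient IDs, and the count of 20 patients are assumptions, not details of the scenario.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into 2 groups of equal size
    (the second group gets the extra person if the count is odd)."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# 20 hypothetical patient IDs, P01 through P20.
patients = [f"P{i:02d}" for i in range(1, 21)]
program_a, program_b = randomly_assign(patients, seed=42)

print("Program A:", sorted(program_a))
print("Program B:", sorted(program_b))
```

Because chance decides who gets which program, neither the researcher nor the patients can steer particular people into a particular program.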

**Example 2 -- External Validity:**

- Participants need to be a representative sample of the chosen population (e.g. college-aged males, post-menopausal women). The population should not be defined too broadly, because that makes it harder to recruit a sample that truly represents it!
- Sample sizes: 10 students do not represent the college student population of the US.

__Take away: Check a study’s protocols and how well its methods control all the variables. Also, watch for sample sizes and types; these are limitations often found in research!__

Internal validity also plays an important role in choosing **tests or measures**. You want to choose the test or measure that most reliably and accurately measures your variable. Most measures have a **Gold Standard**, which is the most accurate tool currently available for that measurement.

For instance, DXA scans are the gold standard for measuring body composition. You most likely don’t have a DXA scanner at home, so you want to get as close as possible to the gold standard while keeping the measures feasible. In this case, a skinfold caliper measurement is a strong alternative, as its results correlate strongly (r-value) with DXA results.

The same applies to measures of strength, flexibility, muscle contraction, load, etc.

__Take away: Watch for a study’s test protocols and check whether the chosen protocols are the most valid ones.__

**Reliability – the repeatability of a test or measure.** Imagine you’re using a new, inexpensive instrument to measure body composition through bioelectrical impedance. You take 3 readings per participant, and for every participant the readings vary widely. That makes you question whether the instrument can assess body composition in a consistent and repeatable way, so you need to make sure you’re using the most reliable instrument you can afford. You also want to make sure the instrument is being used correctly and that the tester (rater) is appropriately trained. These 2 considerations lead to the following types of reliability:

**Test-retest reliability:** One of the ways a test can be shown to be reliable is if test scores from day 1 are highly similar to those of another day. This consistency is important.

**Interrater reliability (Objectivity):** 2 raters score the same participants on the same test, and the two sets of scores are correlated. You want this correlation to be as close to r = 1.0 as possible.

**Reliability is an important part of validity.**

__Take away: Test-retest checks and the use of more than 1 rater are important tools for strengthening a study.__