A Guide to Calculating Power
"Hey McFly! Those boards don’t work on water! Unless you’ve got power!"
This statement by Griff's buddies in Back to the Future Part II applies to the research domain as well. It goes something like this: “Hey McFly! That research won’t work on wishful thinking, unless you’ve got power!”
Statistical power refers to the probability of correctly rejecting the null hypothesis of no effect. It is essential that researchers know their statistical power before launching into a research project. Low statistical power may lead one to conclude that there is no effect from a treatment when there is (called a Type II error), while an “overpowered” study may lead one to conclude that a significant effect has practical or clinical significance when it does not.
Concern over statistical power is a relatively recent phenomenon. Research has shown that several past studies in the social and health sciences were underpowered. Many of these had only a 20-30% chance of correctly rejecting the null hypothesis, possibly leading researchers to incorrectly conclude that treatment effects were not real. Awareness of this problem has led most institutional review boards (IRBs) and granting agencies to require power and sample size calculations before approving studies. But many people do not know where to go to calculate power.
There are a number of commercial power and sample size programs available. PASS, SPSS Power, and NQuery are a few examples. However, these programs are costly (each in the $1000 range). There are also several freeware power and sample size calculators available, but most of these are limited in the number of available power calculations.
I have used several power and sample size programs. My favorite is G*Power. G*Power was created by faculty at the Institute for Experimental Psychology in Dusseldorf, Germany. It offers a wide variety of calculations along with graphics and protocol statement outputs. Best of all, it is free! The developers released version 3.1.3 in 2010. Terms of use and a downloadable zip file are available here.
After downloading the program you may ask yourself, “How do I use it?” There are limited resources. The developers have a tutorial on using G*Power, but it is sparse in some places and thus may be difficult for some people to follow. I created an easy-to-follow guide for using G*Power 3.x. The guide is included below. It is a work in progress and I will update it and add more analyses as time permits. Several of the G*Power examples on this page have been checked against power calculations in SPSS, NQuery, and PASS with excellent results.
I cannot guarantee the completeness and correctness of this material and users assume all risks associated with using the guide. If you have any comments or suggestions on improving the guide, please let me know. (Recently added: power analysis for repeated measures, multivariate ANOVA. See #50 below.)
A Guide to Using G*Power
Exact Tests
The main characteristic of exact methods is that the statistical tests are based on exact probability statements that are valid for any sample size, so you may use exact power calculations for any sample size. At a minimum, use these calculations when sample sizes are small and/or the variances are not equal.
1. Correlation: Bivariate (2 continuous variables) Normal Model
Test whether an r value is statistically different from zero or a known pop r value.
Example: Is a correlation of 0.75 between hours studied and test score significantly different from zero? Or, does my sample's r value of 0.75 differ from a population's r value of 0.65?
Tails = 1 or 2 (use 2-tail if the r value could be pos or neg; otherwise use 1-tail)
Correlation ρ H1 (corr. value assuming H1) = 0.75 (note that r is the effect size)
Alpha = .05 (or .01)
Power = desired level
Correlation ρ Ho (corr. value assuming Ho) = usually 0. However, you may enter any other r value if you want to compare a known null hypothesis population r value (e.g., 0.65) against your sample's r value (0.75).
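As a rough cross-check on the correlation example, here is a minimal Python sketch that estimates the required sample size with the Fisher z approximation. This is not the exact bivariate normal calculation that G*Power performs, so expect its answer to differ somewhat; the function name and the default power of .80 are my own assumptions.

```python
import math
from statistics import NormalDist

def n_for_correlation(r_h1, r_h0=0.0, alpha=0.05, power=0.80, tails=2):
    """Approximate n needed to detect r_h1 against r_h0 (Fisher z approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / tails)  # critical value for the chosen tails
    z_power = z.inv_cdf(power)              # quantile matching the desired power
    # Fisher's z transform (atanh) makes r roughly normal with SE = 1 / sqrt(n - 3).
    delta = abs(math.atanh(r_h1) - math.atanh(r_h0))
    return math.ceil(((z_alpha + z_power) / delta) ** 2 + 3)

# r = 0.75 tested against zero, and against a known population r of 0.65.
print(n_for_correlation(0.75, 0.0))
print(n_for_correlation(0.75, 0.65))
```

Note how much larger the sample must be when the null value (0.65) sits close to the expected r (0.75).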
2. Linear Multiple Regression: Random Model (see also #46 below)
To test whether a group of predictors significantly predicts an outcome variable.
Example: Do IV1, IV2, IV3, and IV4 significantly predict a DV?
Tails = 1 or 2
H1 ρ² = click “Determine” to estimate the population squared multiple correlation coefficient. Choose “from predictor correlations”. Enter the number of predictors. Click on “specify matrices” and enter the IVs’ correlations with the DV. Calculate ρ² and then accept the values.
H0 ρ² = null hypothesis squared multiple correlation coefficient (usually 0)
Power = enter desired power level
Number of predictors = enter number of IVs, in this case 4 IVs.
3. Proportion: Difference from Constant (binomial test, one sample case)
To test whether a sample proportion differs from a population proportion. Use the exact test especially when n*po*qo < 5, or when n*po or n*(1-po) is < 5 (where po is the probability of an event occurring and qo = 1-po is the probability of the event not occurring).
Example: The prevalence of breast cancer among middle aged women in the general population is .02. The breast cancer rate among a sample of women who have a sister with breast cancer is .05. What sample size is needed to detect a significant difference between the population and sample proportions? (To claim that the rate of cancer among women with sister history is 2.5 times [.05/.02] higher than those without sister history?)
Tails = 2 or 1 (if the direction of difference from the Ho(P1) value is known, choose a 1-tail test)
Effect size g = click “Determine”. Enter P1, the Ho proportion (.02), and P2, the H1 (alternative) proportion (.05). Choose one of the "Calc P2 from..." techniques (they all give the same effect size), sync the values, and then calculate effect size g. In this case, g = P2 - P1 = .05 - .02 = 0.03
Alpha = .05
Power = .90
Constant proportion = Ho prop (.02) which is the same as P1
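To make the logic of the exact binomial calculation concrete, the following is a minimal sketch in plain Python. It assumes a one-tailed (upper) test and uses function names of my own; it is not G*Power's implementation. Because the binomial distribution is discrete, exact power does not rise smoothly with n, so treat the printed values as illustrative.

```python
from math import comb

def binom_upper_tail(n, p, c):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))

def exact_power_one_tailed(n, p0, p1, alpha=0.05):
    """Power of the upper-tailed exact binomial test of Ho: p = p0 when the true p is p1."""
    # Smallest critical count whose upper-tail probability under Ho is within alpha.
    c = next(k for k in range(n + 2) if binom_upper_tail(n, p0, k) <= alpha)
    # Power is the probability of reaching that count when H1 is true.
    return binom_upper_tail(n, p1, c)

p0, p1 = 0.02, 0.05   # population and sample proportions from the example
g = p1 - p0           # effect size g = 0.03
for n in (100, 200, 400):
    print(n, round(exact_power_one_tailed(n, p0, p1), 3))
```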
4. Proportions: Inequality, 2 Dependent Groups (McNemar’s)
Compare 2 dependent proportions (people in both groups have been paired/matched).
Example 1: How many people are needed to test whether the proportion of people who quit smoking in a hypnotism smoking cessation program (expected proportion of hypnosis quitters Prop.hyp=.76) differs from the proportion of people who quit smoking by chewing Big Red gum (expected proportion of gum quitters Prop.gum=.54), where both groups are matched by age, sex, and smoking history? Example 2: Based on clinical experience, we've found that about 0.76 of immunocompromised patients test positive for a certain infection. We've also found that this proportion drops to about 0.54 in the same patients after a drug to boost immunity is given. We want to demonstrate the drug's effectiveness by testing the patients for infection (infected vs. not infected) before and after administration of the drug. How many patients do we need to show a difference in before and after infections?
Tails = 1 or 2 (use 1-tail if the difference is expected to go in one direction).
OR = [Ph/(1-Ph)]/[Pg/(1-Pg)] = [.76/(1-.76)]/[.54/(1-.54)] = 3.17/1.17 = 2.70
Alpha = .05
Power = desired level
Prop. Discordant Pairs = estimate how many pairs will not have the same outcome? For instance, pos/pos and neg/neg are concordant pairs, while pos/neg are discordant pairs. Enter the proportion of pos/neg discordant pairs. In healthcare, knowing the sensitivity of tests may help estimate this value.
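The odds ratio arithmetic in this example can be checked with a few lines of Python. The helper names are hypothetical; the two proportions are the ones given above.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1 - p)

def odds_ratio(p_a, p_b):
    """Odds of group A divided by odds of group B."""
    return odds(p_a) / odds(p_b)

# Hypnosis quitters (.76) versus gum quitters (.54).
or_hyp_vs_gum = odds_ratio(0.76, 0.54)
print(round(odds(0.76), 2), round(odds(0.54), 2), round(or_hyp_vs_gum, 2))
```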
5. Proportions: Inequality, 2 Independent Groups (Fisher’s Exact)
Compare 2 independent proportions.
Example: Based on previous data, the expected proportion of students passing a stats course taught by psychology teachers is 0.85. The expected proportion of students passing the same stats class taught by mathematics teachers is 0.95. How many participants are needed to detect a significant difference between the 2 proportions in a prospective study? (Note that this also works with retrospective studies where one wants to know how many cases to extract from a database for both groups.)
Tail = 1 or 2
Prop 1 = 0.85 (You do not have to click on “Determine”)
Prop 2 = 0.95
Alpha = .05
Power = choose your level.
6. Proportions: Inequality, 2 independent groups (unconditional)
Not sure about this one. It uses one proportion and an OR.
7. Proportions: Inequality, (offset) 2 independent groups (unconditional)
Not sure about this one. It uses conditional probabilities.
8. Proportion: sign test (binomial test)
Test whether the occurrence of a binary outcome in a population is 60%. (This calculation can also be done with #9 below.)
Example: How many cases are needed to test whether the number of female students in a group of college students differs from 60% (P1)?
Tails = 1 or 2 (select 1 if the expected difference is unidirectional).
Effect size g = calculate the expected effect size where g = (Prop2 - .50). So in this example, g = (proportion of non-females [males] minus .40). If we expect .40 males, then g = (.60 - .40) = .20.
Alpha = .05 in most cases
Power = select your level (usually > .80)
9. Generic Binomial test
Compare proportions of a binary variable
Example: How many cases are needed to show that there are more females (expected prop = .60) than males (expected prop = .40) included in a binary variable called gender?
Proportion p2 = if this represented males, you would enter the expected proportion of males in the “gender” variable (e.g., 0.40)
Alpha = select your level (.01 or .05)
Power = select your level
Proportion p1 = if this represents females, you would enter proportion of expected females in the “gender” variable (e.g., 0.60)
T-Tests
Cohen’s Effect Size Conventions for “d”
d = 0.20 (small)
d = 0.50 (medium)
d = 0.80 + (large)
10. Correlation: Point Biserial Model
Tests whether a correlation coefficient is significantly different from zero, when one variable is continuous and the other is dichotomous.
Example: How many participants are needed to determine whether an expected r = 0.30 is significantly different from zero when correlating test scores (continuous) and gender (dichotomous)?
Tails = 1 or 2 (use 2-tail if the r value could be pos or neg; otherwise use 1-tail)
Effect Size |r| = enter the correlation coefficient 0.30 (no need to click on “determine”)
Alpha = .05
Power = choose your power
11. Linear Bivariate Regression: one group, size of slope
Determine whether the slope for a predictor variable is significantly different from 0.
12. Linear Bivariate Regression: 2 groups, difference between intercepts
13. Linear Bivariate Regression: 2 groups, difference between slopes
14. Linear Multiple Regression: Fixed model, single regression coefficient
15. Means: Difference between 2 dependent groups
Within, dependent, correlated, paired samples t-test.
Example: What sample size is needed for comparing before and after scores on depression to test whether an antidepressant works. Before treatment mean score = 45 (SDbefore = 2.1); After treatment mean score = 32 (SDafter = 1.6).
Tails = 1 or 2
Effect Size dz = Click “determine” and enter “before” data for group 1, and “after” data for group 2, and enter correlation between the 2 sets of data.
Alpha = .05
Power = select desired level
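The “determine” step for dz can be sketched in Python as follows. The means and SDs come from the depression example; the before/after correlation of 0.5 is purely my assumption, since the example does not give one, and the function name is hypothetical.

```python
import math

def cohen_dz(mean_1, mean_2, sd_1, sd_2, r):
    """Effect size dz for paired data: mean difference over the SD of the differences."""
    # Variance of a difference of two correlated scores.
    sd_diff = math.sqrt(sd_1**2 + sd_2**2 - 2 * r * sd_1 * sd_2)
    return abs(mean_1 - mean_2) / sd_diff

assumed_r = 0.5  # hypothetical correlation between before and after scores
dz = cohen_dz(45, 32, 2.1, 1.6, assumed_r)
print(round(dz, 2))
```

A higher correlation between the two sets of scores shrinks the SD of the differences and therefore raises dz, which is why the correlation entry matters.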
16. Means: Difference between 2 independent groups
Between, independent groups t-test.
Example: What sample size is needed to compare control and treatment groups?
Tails = 1 or 2
Effect size d = use top pane if sample sizes are not equal (using SDpooled). Use bottom pane if sample sizes are equal (balanced).
Alpha = .05
Power = select desired level
Allocation ratio = ratio of sample sizes (enter 1 if sample sizes are expected to be equal).
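For orientation, here is a small sketch of the textbook pooled-SD version of Cohen's d, plus a normal-approximation sample size for equal groups. All input numbers are hypothetical, the function names are mine, and the approximation tends to run slightly below a t-based answer, so use it only as a sanity check.

```python
import math
from statistics import NormalDist

def pooled_sd(sd_1, n_1, sd_2, n_2):
    """Pooled standard deviation weighted by each group's degrees of freedom."""
    return math.sqrt(((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / (n_1 + n_2 - 2))

def cohen_d(mean_1, mean_2, sd_1, n_1, sd_2, n_2):
    """Standardized mean difference using the pooled SD."""
    return abs(mean_1 - mean_2) / pooled_sd(sd_1, n_1, sd_2, n_2)

def n_per_group_normal_approx(d, alpha=0.05, power=0.80, tails=2):
    """Approximate n per group for equal allocation (normal approximation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / tails) + z.inv_cdf(power)
    return math.ceil(2 * (z_sum / d) ** 2)

# Hypothetical pilot data: control (mean 50, SD 10, n 30), treatment (mean 55, SD 12, n 20).
d = cohen_d(50, 55, 10, 30, 12, 20)
print(round(d, 3), n_per_group_normal_approx(d))
```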
17. Means: Difference from constant (single sample t-test)
One sample t-test.
Example: Compare a sample mean against a null hypothesis population mean.
Tails = 1 or 2
Effect size d = enter H1 and H0 means. Enter estimated sigma σ using sample SD.
Alpha = .05
Power = select desired power level
18. Wilcoxon Signed-Ranks Tests (matched pairs)
Non-parametric test for comparing 2 matched groups.
19. Wilcoxon Signed-Ranks Test (one/within sample case)
Non-parametric test for comparing within group data.
20. Wilcoxon Rank-Sum or MWU (2 independent groups)
Non-parametric test for comparing 2 independent groups.
21. Generic t-test
No a priori calculations
Chi-Square Tests
(no chi-square test for independence in here)
Cohen’s Effect Size Conventions for “w”
w = 0.10 (small)
w = 0.30 (medium)
w = 0.50 (large)
22. Goodness of Fit: Contingency Tables
Chi-Square test for Goodness of Fit
Example: Observed number of people belonging to groups A, B, C, and D are compared against expected values.
Tails = 1 or 2
Effect Size w = select “determine”. Number of cells refers to the number of categories; in this case there are 4 (A, B, C, D). P(H0) is the column for expected proportions. P(H1) is the column for observed proportions. The proportions in each column must add up to 1. The 2 cells above the “equal P(H0)” and “equal P(H1)” buttons are for entering a proportion that is applied to every cell in the respective column when you click the ‘equal’ button. I don’t know what the normalize buttons do. The “Auto calc. last cell” button computes the proportion for the last cell in a column so that the total is 1.0.
Alpha = .05
Power = select desired level
DF = (# categories – 1). In this example 4-1 = 3.
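The effect size w that the “determine” window produces can be approximated by hand with Cohen's formula, w = sqrt(sum((P(H1) - P(H0))^2 / P(H0))). The sketch below uses hypothetical proportions for the four groups and a function name of my own.

```python
import math

def cohen_w(p_h0, p_h1):
    """Cohen's w from two lists of cell proportions (each must sum to 1)."""
    if abs(sum(p_h0) - 1) > 1e-9 or abs(sum(p_h1) - 1) > 1e-9:
        raise ValueError("proportions in each column must add up to 1")
    return math.sqrt(sum((p1 - p0) ** 2 / p0 for p0, p1 in zip(p_h0, p_h1)))

p_h0 = [0.25, 0.25, 0.25, 0.25]  # expected: equal split across A, B, C, D
p_h1 = [0.40, 0.30, 0.20, 0.10]  # hypothetical observed proportions
w = cohen_w(p_h0, p_h1)
df = len(p_h0) - 1               # number of categories minus 1
print(round(w, 3), df)
```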
23. Variance: difference from constant (one case)
Not sure about this one
24. Generic χ² Test
No a priori calculations
Z Tests
Effect size conventions for the correlation coefficient r.
r = small 0.10
r = medium 0.30
r = large 0.50
25. Correlation: Tetrachoric Model
Correlate 2 artificially dichotomized variables
26. Correlation: 2 Dependent Pearson r’s (common index)
Correlate Pearson correlation coefficients from 2 dependent samples.
27. Correlation: 2 Dependent Pearson r’s (no common index)
(not sure how this differs from #26)
28. Correlation: 2 independent Pearson r’s
Compare 2 Pearson correlation coefficients from 2 independent samples.
Example: Test whether the correlation between hours studied and test score for group A is statistically different than the correlation between hours studied and test score for group B.
Tails = 1 or 2
Effect size q = click ‘determine’ and then enter both r’s
Alpha = .05
Power = select desired level
Allocation ratio n2/n1 = enter the ratio of participants in the second group (n2, here group B) to the first group (n1, here group A).
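Effect size q is the difference between the two correlations after Fisher's z transformation. The sketch below computes q and a normal-approximation sample size for equal groups. The two r values are hypothetical and the function names are my own; G*Power's own routine may give slightly different numbers.

```python
import math
from statistics import NormalDist

def cohen_q(r_1, r_2):
    """Difference between two correlations on the Fisher z scale."""
    return abs(math.atanh(r_1) - math.atanh(r_2))

def n_per_group_for_q(q, alpha=0.05, power=0.80, tails=2):
    """Approximate n per group (allocation ratio 1) to detect effect size q."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / tails) + z.inv_cdf(power)
    # Each Fisher z has variance 1 / (n - 3), so the difference has variance 2 / (n - 3).
    return math.ceil(2 * (z_sum / q) ** 2 + 3)

q = cohen_q(0.50, 0.30)  # hypothetical r for group A and group B
print(round(q, 3), n_per_group_for_q(q))
```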
29.A Logistic Regression for a continuous predictor (with or without other covariates)
Find out if a continuous predictor is a significant predictor of a binary outcome variable.
Example: Test whether BMI influences mortality (yes 1, no 0) in patients with chronic illness with or without other covariates (e.g., comorbidities, gender, etc.)
Tails: 1 or 2
Click on Options tab at the bottom to enter effect size as OR or 2 probabilities. Since OR = (p1(1-p0))/(p0(1-p1)), this example will use the 2 probabilities option.
p1 -> Pr(Y=1 | X=1) H1. What is the prob of death (Y=1) when the main predictor (BMI) is one SD unit (i.e., one z-score) above its mean and all other covariates, if applicable, are set to their mean values? Let's say that p1 = 0.25.
p0 -> Prob(Y=1 | X=1) Ho. What is the prob of death (Y=1) when the predictor and all other covariates, if applicable, are set to their mean values? Let's say that p0 = 0.15.
Alpha = .05
Power = select desired level
R-squared other X = Enter the expected squared multiple correlation coefficient (R^2) between the main predictor and all other covariates. If there are no other covariates, enter 0. (This can be found by regressing the main predictor onto all other covariates.) So how much of the variance in the main predictor is accounted for by variability in other covariates? If there is just one other covariate, age, and it explains 10% of the variability in BMI, then enter .10. You may find R^2 using G*Power's calculation tool in linear multiple regression (the ρ^2 under Exact or Multivariate tests).
X-Distribution = select normal unless there are reasons to think that the main predictor is distributed differently.
X param mu = the z-score population mean of predictor X (BMI) = 0.
X param sigma = the z-score population SD of predictor X (BMI) = 1.
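If you prefer to enter the effect size as an odds ratio instead of two probabilities, the conversion is simple. Here is a minimal sketch using the BMI example's values of 0.25 and 0.15; the function names are hypothetical. Because the predictor is z-scored, the natural log of the OR is the change in log-odds per one SD of BMI.

```python
import math

def odds_ratio(p1, p0):
    """Odds of the event at p1 divided by the odds at p0."""
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

def p1_from_odds_ratio(or_value, p0):
    """Invert the odds ratio: recover p1 from an OR and the baseline p0."""
    odds_1 = or_value * p0 / (1 - p0)
    return odds_1 / (1 + odds_1)

p1, p0 = 0.25, 0.15           # probabilities from the BMI example
or_bmi = odds_ratio(p1, p0)   # about 1.89
beta_1 = math.log(or_bmi)     # log-odds change per 1 SD of the z-scored predictor
print(round(or_bmi, 2), round(beta_1, 3))
```

The inverse function is handy when a published OR is available and you need the second probability for the 2 probabilities option.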
29.B. Logistic Regression for a dichotomous predictor (with or without other covariates)
Find out if a dichotomous predictor is a significant predictor of a binary outcome.
Example. Test whether smoking (yes vs. no) influences mortality with no other covariates or with other covariates.
Tails = 1 or 2
Click on Options tab at the bottom to enter effect size as OR or 2 probabilities. Since OR = (p1(1-p0))/(p0(1-p1)), this example will use the 2 probabilities option.
p1 -> Pr(Y=1 | X=1) H1. The probability of death (Y=1) given that someone smokes (X=1). Let's assume that p1 = 0.18.
p0 -> Prob(Y=1 | X=1) Ho. The probability of death (Y=1) given that someone is a non-smoker (X=0). Let's assume that p0 = 0.06. [Note that the makers of G*Power should change Prob(Y=1|X=1)Ho to Prob(Y=1|X=0)]
Alpha = .05
Power = select desired level
R-squared other X = enter the expected squared multiple correlation coefficient (R^2) between the main predictor (smoking) and all other covariates. Find R^2 using the calculation tool in linear multiple regression (the ρ^2 under Exact or Multivariate F tests) or select a convention (small= .10, medium= .30, large= .50) and square it. For instance, if the expected correlation of smoking with other covariates is small (~.20), square it (.04). If there are no other covariates, enter 0.0.
X-Distribution = binomial predictor
X param pi = The proportion of cases where X=1. Are the samples for X=0 and X=1 balanced/equal? If balanced, enter 0.50. If 75% of the cases are X=1, then enter 0.75.
30.A. Poisson Regression for a continuous predictor (with or without other covariates)
Find out whether a continuous predictor variable influences count data collected over the same period of time.
Example. Can we predict number of classes missed in a semester based on age?
Tails = 1 or 2
Exp(B1) = enter the increase in the response rate beyond base rate [Exp(Bo)] that you want to detect, with every one unit change in the main predictor. For example, if you wanted to detect an increase of 25% in absentees with every 1 year increase in age, then enter 1.25. If you wanted to detect a 75% increase in absentees, then enter 1.75, etc.
Alpha = .05 or .01
Power = select your level of power (I usually start at 90% but go no lower than 80%).
Base rate Exp(Bo) = the baseline response rate that is expected when the predictor equals the mean. For example, we expect the absentee base rate to be .05 (5%) at the mean age level. (Note that the base rate can refer to the death rate, survival rate, accident rate, or hazard rate, etc.)
Mean exposure = the mean unit of time over which the counts are collected. For example, if one year then enter 1; if 60 days then enter 60.
R^2 other X = enter the expected squared multiple correlation between the main predictor and other covariates, if applicable. If there are no other predictors/covariates then enter 0.
X Distribution = enter the shape of the underlying distribution for the main predictor. Usually 'normal' when the predictor is continuous.
X parm mu = 0
X parm sigma = 1
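To see what the Exp(B1), base rate, and mean exposure entries imply together, the sketch below evaluates the expected count under a log-linear Poisson model. The function name is hypothetical; the base rate of .05 and Exp(B1) of 1.25 are the values from the example.

```python
def expected_count(base_rate, exp_b1, x, mean_exposure=1.0):
    """Expected count: exposure * Exp(B0) * Exp(B1)**x, where x is the predictor value."""
    return mean_exposure * base_rate * exp_b1**x

base_rate = 0.05  # Exp(B0): response rate when the predictor is at its mean (x = 0)
exp_b1 = 1.25     # a 25% increase in the rate per one unit of the predictor
for x in (0, 1, 2):
    print(x, round(expected_count(base_rate, exp_b1, x), 4))
```

Each one-unit step in the predictor multiplies the rate by Exp(B1), so the increase compounds; it does not add a fixed amount.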
30.B. Poisson Regression for a binary predictor (with or without other covariates)
Find out whether a dichotomous predictor variable influences count data collected over the same period of time.
Example. Can we predict number of classes missed in a semester based on gender?
Tails = 1 or 2
Exp(B1) = enter the increase in the response rate beyond the base rate [Exp(Bo)] that you want to detect, with a change in the main predictor. For example, if you wanted to detect an increase of 25% in absentees among males (compared with females), then enter 1.25. If you wanted to detect a 75% increase in absentees, then enter 1.75, etc.
Alpha = .05 or .01
Power = select your level of power (I usually start at 90% and go no lower than 80%).
Base rate Exp(Bo) = the baseline response rate that is expected when the predictor equals zero (female). For example, we expect the absentee base rate to be .05 (5%) for females. (Note that the base rate can refer to the death rate, survival rate, accident rate, or hazard rate, etc.)
Mean exposure = the mean unit of time over which the counts are collected. For example, if one year then enter 1; if 60 days then enter 60.
R^2 other X = enter the expected squared multiple correlation between the main predictor and other covariates, if applicable. If there are no other predictors/covariates then enter 0.
X Distribution = binomial (shape of the underlying distribution for the main predictor)
X parm pi = Enter the proportion of total cases belonging to group 1 (males). If sample sizes are equal, then enter 0.50.
31. Proportions: Difference between 2 independent proportions
Compare 2 proportions from 2 independent groups
Example: The proportion of divorced Baptists in a sample of 100 is 0.22, and the proportion of divorced Catholics in another sample of 100 is 0.31. Is there a significant difference?
Tails = 1 or 2
Proportion 2 = enter Catholic proportion 0.31
Proportion 1 = enter Baptist proportion 0.22
Alpha = .05
Power = select desired level
Allocation ratio n2/n1 = enter ratio of participants in both groups (i.e., 100/100 = 1.0)
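A common normal-approximation formula gives a quick estimate of the sample size for this comparison. The sketch below assumes equal group sizes and a default power of .80 (my choice), and the function name is hypothetical. It is a hand check, not G*Power's routine.

```python
import math
from statistics import NormalDist

def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80, tails=2):
    """Approximate n per group for a z test of two independent proportions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / tails)
    z_power = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled proportion under Ho with equal group sizes
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Baptist (.22) versus Catholic (.31) divorce proportions from the example.
print(n_per_group_two_proportions(0.22, 0.31))
```

Under these assumptions the estimate is well above 100 per group, which suggests that two samples of 100 would be underpowered for a difference of this size.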
32. Generic Z Test
F-Tests
Cohen’s univariate effect size conventions for “f”
f = 0.10 (small)
f = 0.25 (medium)
f = 0.40 (large)
33. ANCOVA: Fixed effects, main effects, and interactions
34. ANOVA: Fixed Effects, omnibus, one-way
One-Way between (fixed effects) groups ANOVA
Example: We want to compare mean scores on an algebra test for students who took the test listening to Rock, Country, and Rap.
Determine Effect Size = Select Procedure > effect size from mean. Enter number of levels (groups) of the fixed variable being compared (in this case 3 music groups). Enter expected SD for all groups (assuming homogeneity of variance). Enter expected mean test scores in the table along with expected sample sizes for each. If sample sizes are equal, then enter the amount in “equal n” and then click (in this case we expect 12 participants per group). Click “calculate effect size” and transfer to main window.
Alpha = .05
Power = select desired power level (use “post hoc” to enter sample size)
Number of Groups = already inserted from effect size calculations, so 3.
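The “effect size from means” procedure corresponds to Cohen's f, the SD of the group means divided by the common within-group SD. Here is a minimal sketch with hypothetical test-score means for the three music groups and 12 participants each; the function name and the SD of 10 are my assumptions.

```python
import math

def cohen_f_from_means(means, sizes, sd_within):
    """Cohen's f: size-weighted SD of the group means over the within-group SD."""
    total_n = sum(sizes)
    grand_mean = sum(m * n for m, n in zip(means, sizes)) / total_n
    var_between = sum(n * (m - grand_mean) ** 2 for m, n in zip(means, sizes)) / total_n
    return math.sqrt(var_between) / sd_within

means = [70, 75, 80]  # hypothetical means for Rock, Country, Rap
sizes = [12, 12, 12]  # equal n of 12 per group, as in the example
f = cohen_f_from_means(means, sizes, sd_within=10)
print(round(f, 3))
```

With these made-up numbers f lands near 0.41, which is large by Cohen's conventions.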
35. ANOVA: Fixed effects, special, main effects and interactions
Two-way (or higher) between (fixed effects) groups ANOVA (single analysis good for both main effects and interactions)
Example: We want to see if there is a difference in test scores based on gender (female vs. male) and race (Caucasian, Hispanic, Black, Native) thus making a 2x4 analysis.
Determine Effect Size = Select Procedure > direct method. Enter partial eta squared (η²) which is the effect size measure indicating the total variance explained by the IVs, main effects, and interactions. Click “calculate effect size” and transfer to main window.
Alpha = .05
Power = desired level (select “post hoc” to enter sample size)
Numerator df = this specifies which main effect or interaction you are testing for. It is found by taking the number of levels and subtracting one. In this case, enter 4-1 = 3 df if testing for race, 2-1=1 df if testing for gender, and (2-1)*(4-1) = 3 df if testing for the interaction.
Number of groups = found by multiplying the levels in both factors (in this case 2x4=8)
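The degrees of freedom and the conversion from partial eta squared to f can be verified with a short sketch. The conversion f = sqrt(η² / (1 - η²)) is the standard one; the function name, dictionary layout, and the η² of .06 are my own choices for illustration.

```python
import math

def f_from_partial_eta_squared(eta_sq):
    """Convert partial eta squared to Cohen's f."""
    return math.sqrt(eta_sq / (1 - eta_sq))

levels = {"gender": 2, "race": 4}            # the 2x4 design from the example
df_gender = levels["gender"] - 1             # main effect of gender
df_race = levels["race"] - 1                 # main effect of race
df_interaction = df_gender * df_race         # gender x race interaction
number_of_groups = levels["gender"] * levels["race"]

print(df_gender, df_race, df_interaction, number_of_groups)
print(round(f_from_partial_eta_squared(0.06), 3))  # a hypothetical medium effect
```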
36. ANOVA: Repeated measures, between factors
RMANOVA (just for comparing levels of a between factor like gender)
Example: The same students took 3 tests under 3 different music conditions (rock, country, and rap). We want to know if there is a significant effect for gender (males vs. females), the between factors variable.
Determine Effect Size = Select Procedure > effect size from mean. Enter number of levels of the fixed variable being compared (in this case 2 genders). Enter expected SD for all groups (assuming homogeneity of variance). Enter expected mean test scores in the table along with expected sample sizes for each. If sample sizes are equal, then enter the amount in “equal n” and then click “calculate effect size” and transfer to main window.
Alpha = .05
Power = desired level (select “post hoc” to enter sample size)
Number of Groups = 2
Repetitions = 3 music conditions
Correlation among repeated measures = enter approximate correlation
37. ANOVA: Repeated measures, within factors
RMANOVA (just for comparing levels of a within factor variable like days)
Example: The same students took 3 tests under 3 different music conditions (rock, country, and rap). We want to know if there is a significant effect for music condition, the within factor variable. Two between factors groups for gender (male vs. female).
Determine Effect Size = Select Procedure > direct method. Enter partial eta squared (η²) which is the effect size measure indicating the total variance explained by the IVs, main effects, and interactions. Click “calculate effect size” and transfer to main window. Eta squared size conventions: small = .01; medium = .06; large = 0.14.
Alpha = .05 (for one tail)
Power = desired level (select “post hoc” to enter sample size)
Number of groups = of the between subjects factor, in this case 2 for gender
Repetitions = number of repeated measures, in this case 3 for music condition
Correlation among repeated measures = whatever you think this might be
Nonsphericity correction ε = 1.0 if sphericity assumption is met, something else if not met. (Highest value is 1.0, and lowest value = 1/[repetitions – 1].)
(Sphericity assumption in univariate RMANOVA – When the repeated measures are transformed by a set of orthogonal weights, they should be uncorrelated with each other but have equal variances. This is the sphericity assumption. If the design includes a between-subjects factor, then sphericity must be met. How well this assumption is met is indicated by the Epsilon statistic, which has an upper limit of 1 (perfect sphericity) and a lower limit of 1/[repetitions – 1] (complete violation). About .75 or higher is usually acceptable in most RMANOVA designs. Failure to meet the sphericity assumption increases the Type I error rate. When the sphericity assumption is met, the univariate test is more powerful than the multivariate test. If the assumption is not met, you may use epsilon multipliers (see SPSS printout) although these may be too conservative, or you may use the multivariate methods below which do not require the sphericity assumption.)
38. ANOVA: Repeated measures, within-between interaction
RMANOVA (just for testing the interaction of within and between variables)
Example: The same students took 3 tests under 3 different music conditions (rock, country, and rap). The within factors is music and the between factors is gender. We want to know if there is a significant effect for the interaction between music and gender.
Determine Effect Size = Select Procedure > direct method. Enter partial eta squared (η²) which is the effect size measure indicating the total variance explained by the IVs, main effects, and interactions. Click “calculate effect size” and transfer to main window. Eta squared size conventions: small = .01; medium = .06; large = 0.14.
Alpha = .05 (for one tail)
Power = desired level (select ‘post hoc’ to enter sample size)
Number of groups = of the between subjects factor, in this case 2 for gender
Repetitions = number of repeated measures, in this case 3 for music condition
Correlation among repeated measures = whatever you think this might be
39. Hotelling’s T²: One group mean vector
Multivariate analysis for comparing within group data on 2 or more DVs.
Example: We want to compare patients’ pre-treatment measures with post-treatment measures based on 2 outcome variables Y1 and Y2 (you may have more than 2 DVs).
Determine Effect Size = enter number of response vbls (in this case 2). There are 3 techniques for calculating effect size. I prefer the variance-covariance matrix (#1) approach.
#1. Variance-covariance matrix (preferred approach): Enter vector/column means for the differences between pre and post outcome data vectors, and fill in the variance-covariance matrix for the differences vectors (var. in diagonal and cov. in off-diagonals). In most cases, "multiply all means by" should be set to 1.
E.g.,
        Pre-intervention    Post-intervention    Differences
        pre1    pre2        post1   post2        Diff1   Diff2
        5       8           4       8             1      0
        4       9           4       7             0      2
        5       7           6       6            -1      1
        6       8           5       7             1      1
Vector means (Diff1, Diff2):   0.25    1.0
Variances (Diff1, Diff2):      0.92    0.67
Covariance: -0.33
Correlation: -0.42
Covariance matrix of differences:
        Y1      Y2
Y1      .92    -.33
Y2     -.33     .67
#2 & 3. SD and correlation (other approaches): Enter vector means for the differences between pre and post outcome data vectors, and fill in the SD-correlation matrix for the differences vectors (SDs in diagonal and corr. in off-diagonals). (Don’t know about “autocorr” and “multiply all means by” windows right now, although the latter should probably be set to 1).
Alpha = .05 (for one tail)
Power = set desired level (use post hoc to enter sample size)
Response variables = enter number of outcome variables, in this case 2.
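The difference vectors, their means, and the variance-covariance matrix in the pre/post example can be reproduced with the Python standard library. This sketch uses the four patients from the example; the variable and function names are my own. Note that the correlation computed from unrounded values is about -0.43, slightly different from the -0.42 obtained from the rounded entries.

```python
from statistics import mean

pre = [(5, 8), (4, 9), (5, 7), (6, 8)]   # (pre1, pre2) per patient
post = [(4, 8), (4, 7), (6, 6), (5, 7)]  # (post1, post2) per patient

# Difference scores (pre minus post) for each outcome variable.
diffs = [(a1 - b1, a2 - b2) for (a1, a2), (b1, b2) in zip(pre, post)]
columns = list(zip(*diffs))              # one tuple per difference variable
means = [mean(col) for col in columns]

def sample_cov(x, y):
    """Sample covariance with n - 1 in the denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

# Variances on the diagonal, covariances off the diagonal.
cov_matrix = [[sample_cov(a, b) for b in columns] for a in columns]
corr = cov_matrix[0][1] / (cov_matrix[0][0] * cov_matrix[1][1]) ** 0.5

print(means)
print([[round(v, 2) for v in row] for row in cov_matrix])
print(round(corr, 2))
```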
40. Hotelling’s T²: Two group mean vectors
Multivariate analysis for comparing 2 independent groups on 2 or more DVs.
Example: We want to compare patients who get therapy 1 with patients getting therapy 2 based on 2 outcome variables Y1 and Y2 (you may have more than 2 DVs).
Determine Effect Size = Enter the number of response variables, in this case 2. There are three input techniques. I prefer method #1.
#1. Variance–Covariance matrix (preferred approach): Enter vector/column means & fill in total variance-covariance matrix for Y1 and Y2 data vectors.
E.g.,    Therapy Group 1    Therapy Group 2
         Y1    Y2           Y1    Y2
         3     6            6     7
         2     4            4     8
         4     2            5     6
The mean vectors are:
         Y1    Y2
Group1   3     4
Group2   5     7
Here’s the total (pooled) covariance matrix representing the population common covariance matrix (1.0 is the pooled variance for both Y1 columns, 2.50 is the pooled variance for both Y2 columns, and -0.75 is the pooled covariance for both groups). Found by adding the SSCP matrixes for both groups and dividing by total degrees of freedom.
Y1 Y2
Y1 1.00 -.750
Y2 -.750 2.50
#2 & 3. SD and corr. matrix (other approaches): Enter vector/column means and fill in the SDs & correlation matrix (SDs in diagonal, corr. in off-diagonals). (Don’t know about “autocorr” and “multiply all means by” windows right now, although the latter should probably be set to 1).
Alpha = .05 (for one tail)
Power = set desired level (use post hoc to enter sample size)
Allocation ratio N2/N1 = sample size for group 2 divided by sample size for group 1
Response variables = enter number of outcome variables, in this case 2
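The pooled covariance matrix for the two therapy groups can be reproduced by adding each group's SSCP matrix and dividing by the total degrees of freedom. The sketch below uses only the standard library; the function names are hypothetical.

```python
from statistics import mean

group_1 = [(3, 6), (2, 4), (4, 2)]  # (Y1, Y2) for therapy group 1
group_2 = [(6, 7), (4, 8), (5, 6)]  # (Y1, Y2) for therapy group 2

def sscp(rows):
    """Sums of squares and cross-products about the group's own means."""
    cols = list(zip(*rows))
    centers = [mean(c) for c in cols]
    p = len(cols)
    return [
        [sum((r[i] - centers[i]) * (r[j] - centers[j]) for r in rows) for j in range(p)]
        for i in range(p)
    ]

def pooled_covariance(*groups):
    """Add the SSCP matrices and divide by the total degrees of freedom."""
    df = sum(len(g) - 1 for g in groups)
    mats = [sscp(g) for g in groups]
    p = len(mats[0])
    return [[sum(m[i][j] for m in mats) / df for j in range(p)] for i in range(p)]

mean_vectors = [[mean(c) for c in zip(*g)] for g in (group_1, group_2)]
pooled = pooled_covariance(group_1, group_2)
print(mean_vectors)
print(pooled)
```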
41. MANOVA – Global Effects
Multivariate analysis for comparing 2 or more independent groups (fixed factors) when we have 2 or more DVs.
Example: We want to compare 4 different groups of patients getting therapies A, B, C, and D based on 5 outcome variables Y1, Y2, Y3, Y4, and Y5.
Options = Select Muller & Peterson (1984) method (used by SPSS)
Determine Effect Size = Enter Pillai’s trace based on analysis with preliminary data set. Enter number of groups, in this case 4. Enter number of response (DV) variables, in this case 5. Enter total sample size, in this case 4 groups x 5 patients/group = 20. Calculate effect size and return to main window. Calculations for f² = [Pillai’s Trace V / (s – V)], where “s” equals the smaller of either number of DVs or number of groups minus 1. f² Effect Size Conventions: Small = (.10)^2 = .01; Medium = (.25)^2 = .06; Large = (.40)^2 = 0.16.
Alpha = .05
Power = set desired level (choose post hoc to enter sample size)
Number of Groups = in this case 4
Response Variables = number of DVs, in this case 5
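The f² calculation from Pillai's trace is easy to do by hand. Here is a minimal sketch for the 4 groups and 5 DVs in the example; the Pillai's V of 0.30 is a hypothetical value standing in for one taken from a preliminary data set, and the function name is my own.

```python
def f2_from_pillai(v, n_groups, n_dvs):
    """f squared = V / (s - V), with s the smaller of the DV count and groups minus 1."""
    s = min(n_dvs, n_groups - 1)
    if not 0 <= v < s:
        raise ValueError("Pillai's V must be at least 0 and less than s")
    return v / (s - v)

pillai_v = 0.30  # hypothetical Pillai's trace from preliminary data
f2 = f2_from_pillai(pillai_v, n_groups=4, n_dvs=5)
print(round(f2, 3))
```

Here s = 3 because the number of groups minus 1 (3) is smaller than the number of DVs (5).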
42. MANOVA – Special Effects and Interactions
Multivariate analysis for comparing the interaction of within and fixed factors (factorial design) when we have 2 or more DVs.
Example: We want to compare 4 groups of patients getting therapies A, B, C, and D based on 5 outcome variables Y1, Y2, Y3, Y4, and Y5, with sex (M vs. F) as a between subjects factor.
Options = Select Muller & Peterson (1984) method (used by SPSS)
Determine Effect Size = Enter Pillai’s trace based on analysis with preliminary data set. Enter number of groups, in this case 4. Enter number of response (DV) variables, in this case 5. Enter total sample size, in this case 4 groups x 5 patients/group = 20. Calculate effect size and return to main window. Calculations for f2 = [Pillai’s Trace V / (s – V)], where “s” equals the smaller of either number of DVs or number of groups minus 1. f2 Effect Size Conventions: Small = (.10)^2 = .01; Medium = (.25)^2 = .06; Large = (.40)^2 = 0.16.
Alpha = .05
Power = set desired level (choose post hoc to enter sample size)
Number of Groups = in this case 4
Response Variables = number of DVs, in this case 5
43. MANOVA: Repeated Measures, Between Factors
Testing between factor effects in univariate RMANOVA using the multivariate approach, when sphericity assumption is not met (see #50 below).
44. MANOVA: Repeated Measures, Within Factors
Testing within factor effects in univariate RMANOVA using the multivariate approach, when sphericity assumption is not met (see #50 below).
45. MANOVA: Repeated Measures, Within-Between Interaction
Testing interaction of within and between factors in univariate RMANOVA using the multivariate approach, when sphericity assumption is not met (see #50 below).
46. Linear Multiple Regression: Fixed model, R2 deviation from zero (see also #2 above)
Evaluate whether a group of predictors significantly predicts a DV. This is done by testing the null hypothesis that the proportion of variance in a DV explained by a set of predictors (R-squared) equals zero.
Example: Do IV1, IV2, and IV3 significantly predict DV?
Determine effect size f2-> Two methods. Use method (a) when you want to just enter the squared multiple correlation coefficient representing the amount of variability in DV accounted for by the IVs. Or use method (b) when you want to enter the specific correlations (a more rigorous approach).
Method (a). Enter the expected squared multiple correlation coefficient p^2 (i.e., R^2), which is the amount of variability in the DV explained by the predictors (R^2 conventions: 0.06 small, 0.25 medium, 0.80 high), and then calculate the effect size f^2 by clicking "calculate". G*Power does the following calculation to get f^2: f^2 = R^2/(1-R^2). f^2 effect size conventions are 0.02 small, 0.15 medium, 0.35 high. An alternative to entering R^2 is to enter an f^2 effect size convention on the main page in the box next to "Effect size f2".
Method (b). Find the squared multiple correlation coefficient p^2 by specifying the correlations between the predictors and the outcome, and then specifying the correlation matrix between predictors.
E.G. With 3 predictors, the correlations between IV1, IV2, IV3 and the DV may be . . .
IV1 IV2 IV3
DV .23 .16 .24
And the correlation matrix between the predictors may be . . .
IV1 IV2 IV3
IV1 1.0 .20 .45
IV2 .20 1.0 .31
IV3 .45 .31 1.0
Calculate the effect size and transfer to main window (Note that f^2 effect size conventions are: small=0.02, medium=0.15, and large=0.35)
Alpha = 0.05 or 0.01
Power = something higher than 0.79
Number of Predictors = enter the number of IVs, in this case 3.
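To see what method (b) is doing, the squared multiple correlation can be worked out from the two sets of correlations with the standard formula R^2 = r' Rxx^-1 r, and then converted to f^2 = R^2/(1-R^2). The sketch below uses the example correlations from above. The helper names are mine, and this is a check on the arithmetic under that textbook formula rather than a copy of G*Power's internal code.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Correlations of IV1, IV2, IV3 with the DV (from the example above).
r_dv = [0.23, 0.16, 0.24]
# Correlation matrix between the predictors (from the example above).
r_xx = [[1.0, 0.20, 0.45],
        [0.20, 1.0, 0.31],
        [0.45, 0.31, 1.0]]

weights = solve_linear(r_xx, r_dv)            # standardized regression weights
r_squared = sum(w * r for w, r in zip(weights, r_dv))
f_squared = r_squared / (1 - r_squared)
print(round(r_squared, 4), round(f_squared, 4))
```

With these correlations R^2 comes out at roughly 0.08 and f^2 at roughly 0.09, which falls between the small and medium conventions.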
47. Linear Multiple Regression: Fixed model, R2 increase
48. Variance: Test of equality (2 sample case)
49. Generic F Test
50. Repeated Measures Multivariate ANOVA
This design involves measuring more than one outcome (dependent) variable in the same group of people on three or more occasions. For instance, researchers might measure blood pressure, heart rate, and oxygen saturation (3 DVs) in the same 10 patients at four different time periods (start trial, 4 months, 8 months, and 12 months) to test the cumulative effect of drug A. In another example, researchers might evaluate obsessive compulsive tendencies, anxiety, and depression in the same group of participants under three different treatment levels of drug B (0 mg, 250 mg, 500 mg) administered at three different time periods (1 month, 2 months, 3 months).
Unfortunately G*Power cannot power the RMMANOVA. If you thought that the “F Test: MANOVA: repeated measures” function can power RMMANOVA, you are not alone. I am sorry to say that it cannot. The “F Test: MANOVA: repeated measures” command is the multivariate equivalent of the univariate ANOVA that is used when the univariate analysis violates the sphericity assumption. So what should you do if you need to power a RMMANOVA design? Just follow the 5 steps below.
Step 1. Estimate the means and standard deviations for each outcome variable at each time measurement period. These should be based on similar studies found in the existing literature, professional/clinical experience, and/or preliminary data gathering. (Note: if you do not have this information, then you need to run a pilot/exploratory study first.)
Step 2. Use a random number generator to create fictional data sets for each outcome at each time period (e.g., Motulsky’s GraphPad website has a free random number generator). The data should be generated assuming a normal distribution (an assumption of ANOVA). How many data points should you generate in each set? The number of data points should equal the number of participants you are planning on using in the study.
Step 3. Copy the fictional data and paste it into an Excel spreadsheet. When all the data sets are in Excel, copy the entire data set and paste it into SPSS. (Note that SPSS will not accept some data straight from a random number generator, but Excel will, and you can copy and paste data from Excel into SPSS.)
Step 4. Run a GLM multivariate analysis on the fictional data and ‘ask’ SPSS to estimate power.
Step 5. Repeat steps 2-4 a few times and then calculate the mean of the SPSS power estimates.
There you have it! An estimate of power for any RMMANOVA. If your mean power is <.80, try adding more data points/participants in step 2.
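If you would rather script the data generation than use a web-based random number generator, steps 2, 3, and 5 can be sketched in a few lines. Everything below is an illustration under assumptions: the means and SDs are invented placeholders for your step 1 estimates, the column names are mine, and the power values averaged at the end are placeholders for numbers you would copy from the SPSS output in step 4.

```python
import random
import statistics


def simulate_dataset(cell_stats, n_participants, seed=None):
    """Step 2: draw one fictional, normally distributed data set.

    cell_stats maps a column name (one outcome at one time period) to its
    estimated (mean, sd). Draws are independent, as with a basic random
    number generator, so correlations between repeated measures are ignored.
    """
    rng = random.Random(seed)
    return {
        name: [rng.gauss(mean, sd) for _ in range(n_participants)]
        for name, (mean, sd) in cell_stats.items()
    }


# Hypothetical step 1 estimates: 3 outcomes at 4 time periods, as (mean, sd).
estimates = {
    "bp": [(140, 12), (136, 12), (132, 11), (128, 11)],
    "hr": [(80, 9), (78, 9), (77, 8), (75, 8)],
    "o2": [(94, 2), (95, 2), (95, 2), (96, 2)],
}
cell_stats = {
    f"{outcome}_t{t + 1}": stats
    for outcome, per_time in estimates.items()
    for t, stats in enumerate(per_time)
}

n = 10  # planned number of participants
data = simulate_dataset(cell_stats, n_participants=n, seed=1)

# Step 3: tab-separated rows that can be pasted into a spreadsheet.
columns = list(data)
rows = ["\t".join(columns)]
for i in range(n):
    rows.append("\t".join(f"{data[c][i]:.1f}" for c in columns))
print("\n".join(rows))

# Step 5: average the power estimates recorded from each repetition.
# These three values are placeholders, not real SPSS output.
recorded_power = [0.74, 0.81, 0.79]
mean_power = statistics.mean(recorded_power)
```

Change the seed (or leave it as None) for each repetition so that every run produces a fresh fictional data set.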
After downloading the program you may ask yourself, “How do I use it?” There are limited resources. The developers have a tutorial on using G*Power, but it is sparse in some places and thus may be difficult for some people to follow. I created an easy-to-follow guide for using GPower 3.x. The guide is included below. It is a work in progress and I will update it and add more analyses as time permits. Several of the G*Power examples on this page have been checked against power calculations in SPSS, NQuery, and PASS with excellent results.
I cannot guarantee the completeness and correctness of this material and users assume all risks associated with using the guide. If you have any comments or suggestions on improving the guide, please let me know. (Recently added: power analysis for repeated measures, multivariate ANOVA. See #50 below.)
A Guide to Using GPower
Exact Tests
The main characteristic of exact methods is that the statistical tests are based on exact probability statements that are valid for any sample size, thus you may use exact power calculations for any sample size. These calculations should at least be used when sample sizes are small and/or there is no equality of variance.
1. Correlation: Bivariate (2 continuous variables) Normal Model
Test whether an r value is statistically different from zero or a known pop r value.
Example: Is a correlation of 0.75 between hours studied and test score significantly different from zero? Or, does my sample's r value of 0.75 differ from a population's r value of 0.65?
Tails = 1 or 2 (use 2-tail if the r value could be pos or neg; otherwise use 1-tail)
Correlation p H1 (corr. value assuming H1) = 0.75 (note that r is the effect size)
Alpha = .05 (or .01)
Power = desired level
Correlation p Ho (corr. value assuming Ho) = usually 0. However, you may enter any other r value if you want to compare a known null hypothesis population r value (e.g., 0.65) against your sample's r value (0.75).
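For a rough cross-check of the sample size G*Power reports here, you can use the Fisher z approximation. Note that this is a different (large-sample) technique from the exact calculation G*Power performs for this test, so expect the two answers to differ a little. The function name is my own.

```python
import math
from statistics import NormalDist


def approx_n_for_correlation(r_h1, r_h0=0.0, alpha=0.05, power=0.80, tails=2):
    """Approximate total N via Fisher's z transformation (not the exact test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)
    z_power = NormalDist().inv_cdf(power)
    q = math.atanh(r_h1) - math.atanh(r_h0)  # difference on the z scale
    return math.ceil(((z_alpha + z_power) / q) ** 2 + 3)


# Is r = 0.75 different from zero?
n_vs_zero = approx_n_for_correlation(0.75)  # 12 with these inputs

# Does r = 0.75 differ from a known population value of 0.65?
n_vs_known = approx_n_for_correlation(0.75, r_h0=0.65)
print(n_vs_zero, n_vs_known)
```

The second comparison needs far more participants because 0.75 and 0.65 are much closer together than 0.75 and 0.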
2. Linear Multiple Regression: Random Model (see also #46 below)
To test whether a group of predictors significantly predicts an outcome variable.
Example: Do IV1, IV2, IV3, and IV4 significantly predict a DV?
Tails = 1 or 2
H1 p2 = click “Determine” to estimate the population multiple correlation coefficient. Choose “from predictor correlations”. Enter number of predictors. Click on “specify matrices” and enter the IVs’ correlations with the DV. Calculate p2 and then accept values.
H0 p2 = null hypothesis multiple correlation coefficient (usually 0)
Power = enter desired power level
Number of predictors = enter number of IVs, in this case 4 IVs.
3. Proportion: Difference from Constant (binomial test, one sample case)
To test whether a sample proportion differs from a population proportion. Especially use the exact test when n*po*qo < 5 or n*po and n*(1-po) < 5 (where po equals the probability of an event occurring and qo equals the probability of an event not occurring).
Example: The prevalence of breast cancer among middle aged women in the general population is .02. The breast cancer rate among a sample of women who have a sister with breast cancer is .05. What sample size is needed to detect a significant difference between the population and sample proportions? (To claim that the rate of cancer among women with sister history is 2.5 times [.05/.02] higher than those without sister history?)
Tails = 2 or 1 (if the direction of difference from the Ho(P1) value is known, choose a 1-tail test)
Effect size g = Click Determine. Enter P1 the Ho prop(.02), and P2 the H1(alternative) prop(.05). Choose one of the "Calc P2 from..." techniques (they all give the same effect size), synch the values, and then calculate effect size g. In this case, g = 0.03
Alpha = .05
Power = .90
Constant proportion = Ho prop (.02) which is the same as P1
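To get a feel for how power behaves in this breast cancer example, the power of a one-tailed exact binomial test can be computed directly from the binomial distribution. This is a sketch under assumptions: it handles only the upper-tail (1-tail) case, the function names are mine, and G*Power's own options for two-tailed tests may give different numbers. The sample size of 200 is just a trial value.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability P(X = k), computed on the log scale for stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)


def upper_tail(c, n, p):
    """P(X >= c) for a binomial(n, p) count."""
    return sum(binom_pmf(k, n, p) for k in range(c, n + 1))


def exact_binomial_power(n, p0, p1, alpha=0.05):
    """One-tailed exact test of H0: p = p0 against H1: p = p1 > p0.

    Returns (critical count, actual alpha, power).
    """
    for c in range(n + 2):
        actual_alpha = upper_tail(c, n, p0)
        if actual_alpha <= alpha:  # smallest count that keeps alpha in bounds
            return c, actual_alpha, upper_tail(c, n, p1)


# Population proportion .02 (H0) vs. expected sample proportion .05 (H1).
critical, actual_alpha, power = exact_binomial_power(200, p0=0.02, p1=0.05)
print(critical, round(actual_alpha, 4), round(power, 4))
```

Because counts are whole numbers, the actual alpha is usually below the nominal .05, and power does not always rise smoothly as n increases, so it is worth trying several sample sizes around the one you plan to use.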
4. Proportions: Inequality, 2 Dependent Groups (McNemar’s)
Compare 2 dependent proportions (people in both groups have been paired/matched).
Example 1: How many people are needed to test whether the proportion of people who quit smoking in a hypnotism smoking cessation program (expected proportion of hypnosis quitters Prop.hyp=.76) is different than the proportion of people who quit smoking by chewing Big Red gum (expected proportion of gum quitters Prop.gum=.54), where both groups are matched by age, sex, and smoking history? Example 2: Based on clinical experience, we've found that about 0.76 of immunocompromised patients test positive for a certain infection. We've also found that this proportion drops to about 0.54 in the same patients after a drug to boost immunity is given. We want to prove the drug's effectiveness by testing the patients for infection (infected vs. not infected) before and after administration of the drug. How many patients do we need to show a difference in before and after infections?
Tails = 1 or 2 (use 1-tail if the difference is expected to go in one direction).
OR = [Ph/(1-Ph)]/[Pg/(1-Pg)] = [.76/(1-.76)]/[.54/(1-.54)] = 3.17/1.17 = 2.70
Alpha = .05
Power = desired level
Prop. Discordant Pairs = estimate how many pairs will not have the same outcome? For instance, pos/pos and neg/neg are concordant pairs, while pos/neg are discordant pairs. Enter the proportion of pos/neg discordant pairs. In healthcare, knowing the sensitivity of tests may help estimate this value.
5. Proportions: Inequality, 2 Independent Groups (Fisher’s Exact)
Compare 2 independent proportions.
Example: Based on previous data, the expected proportion of students passing a stats course taught by psychology teachers is 0.85. The expected proportion of students passing the same stats class taught by mathematics teachers is 0.95. How many participants are needed to detect a significant difference between the 2 proportions in a prospective study? (Note that this also works with retrospective studies where one wants to know how many cases to extract from a database for both groups.)
Tail = 1 or 2
Prop 1 = 0.85 (You do not have to click on “Determine”)
Prop 2 = 0.95
Alpha = .05
Power = choose your level.
6. Proportions: Inequality, 2 independent groups (unconditional)
Not sure about this one. It uses one proportion and an OR.
7. Proportions: Inequality, (offset) 2 independent groups (unconditional)
Not sure about this one. It uses conditional probabilities.
8. Proportion: sign test (binomial test)
Test whether the occurrence of a binary outcome in a population is 60%. (This calculation can also be done with #9 below.)
Example: How many cases are needed to test whether the number of female students in a group of college students differs from 60% (P1)?
Tails = 1 or 2 (select 1 if the expected difference is unidirectional).
Effect size g = calculate the expected effect size where g = (Prop2 - .50). So in this example, g = (proportion of non-females [males] minus .40). If we expect .40 males, then g = (.60 - .40) = .20.
Alpha = .05 in most cases
Power = select your level (usually > .80)
9. Generic Binomial test
Compare proportions of a binary variable
Example: How many cases are needed to show that there are more females (expected prop = .60) than males (expected prop = .40) included in a binary variable called gender?
Proportion p2 = if this represented males, you would enter the expected proportion of males in the “gender” variable (e.g., 0.40)
Alpha = select your level (.01 or .05)
Power = select your level
Proportion p1 = if this represents females, you would enter the expected proportion of females in the “gender” variable (e.g., 0.60)
T-Tests
Cohen’s Effect Size Conventions for “d”
d = 0.20 (small)
d = 0.50 (medium)
d = 0.80 + (large)
10. Correlation: Point Biserial Model
Tests whether a correlation coefficient is significantly different from zero, when one variable is continuous and the other is dichotomous.
Example: How many participants are needed to determine whether an expected r = 0.30 is significantly different from zero when correlating test scores (continuous) and gender (dichotomous)?
Tails = 1 or 2 (use 2-tail if the r value could be pos or neg; otherwise use 1-tail)
Effect Size |r| = enter the correlation coefficient 0.30 (no need to click on “determine”)
Alpha = .05
Power = choose your power
11. Linear Bivariate Regression: one group, size of slope
Determine whether the slope for a predictor variable is significantly different from 0.
12. Linear Bivariate Regression: 2 groups, difference between intercepts
13. Linear Bivariate Regression: 2 groups, difference between slopes
14. Linear Multiple Regression: Fixed model, single regression coefficient
15. Means: Difference between 2 dependent groups
Within, dependent, correlated, paired samples t-test.
Example: What sample size is needed for comparing before and after scores on depression to test whether an antidepressant works? Before treatment mean score = 45 (SDbefore = 2.1); After treatment mean score = 32 (SDafter = 1.6).
Tails = 1 or 2
Effect Size dz = Click “determine” and enter “before” data for group 1, and “after” data for group 2, and enter correlation between the 2 sets of data.
Alpha = .05
Power = select desired level
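The dz value that the “determine” window produces can be checked with the usual formula for paired data: the mean difference divided by the SD of the differences. The sketch below uses the before/after means and SDs from the example. The correlation of 0.5 between the two sets of scores is an assumption I made up, since the example does not give one.

```python
import math


def dz_from_group_parameters(mean1, mean2, sd1, sd2, r):
    """Effect size dz for paired data from means, SDs, and their correlation."""
    sd_diff = math.sqrt(sd1 ** 2 + sd2 ** 2 - 2 * r * sd1 * sd2)
    return abs(mean1 - mean2) / sd_diff


# Before: mean 45, SD 2.1. After: mean 32, SD 1.6. Assumed correlation: 0.5.
dz = dz_from_group_parameters(45, 32, 2.1, 1.6, r=0.5)
print(round(dz, 2))  # 13 / 1.9, an extremely large effect
```

A higher correlation between the before and after scores shrinks the SD of the differences and so increases dz, which is why the correlation entry matters.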
16. Means: Difference between 2 independent groups
Between, independent groups t-test.
Example: What sample size is needed to compare control and treatment groups?
Tails = 1 or 2
Effect size d = use top pane if sample sizes are not equal (using SDpooled). Use bottom pane if sample sizes are equal (balanced).
Alpha = .05
Power = select desired level
Allocation ratio = ratio of sample sizes (enter 1 if sample sizes are expected to be equal).
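As a rough check on the independent groups calculation, Cohen's d and an approximate per-group sample size can be computed as follows. This is a sketch under assumptions: the means and SDs are invented, equal group sizes are assumed (allocation ratio 1), and the sample size uses a normal approximation, so a t-based calculation such as G*Power's will tend to come out slightly larger.

```python
import math
from statistics import NormalDist


def cohens_d(mean1, mean2, sd1, sd2):
    """Cohen's d for equal group sizes, using the root mean square of the SDs."""
    sd_pooled = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return abs(mean1 - mean2) / sd_pooled


def approx_n_per_group(d, alpha=0.05, power=0.80, tails=2):
    """Normal-approximation sample size per group for two equal groups."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Hypothetical control vs. treatment values: means 100 and 105, both SDs 10.
d = cohens_d(100, 105, 10, 10)      # 0.5, a medium effect
n_each = approx_n_per_group(d)      # 63 per group under the approximation
print(d, n_each)
```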
17. Means: Difference from constant (single sample t-test)
One sample t-test.
Example: Compare a sample mean against a null hypothesis population mean.
Tails = 1 or 2
Effect size d = enter H1 and H0 means. Enter estimated sigma σ using sample SD.
Alpha = .05
Power = select desired power level
18. Wilcoxon Signed-Ranks Tests (matched pairs)
Non-parametric test for comparing 2 matched groups.
19. Wilcoxon Signed-Ranks Test (one/within sample case)
Non-parametric test for comparing within group data.
20. Wilcoxon Rank-Sum or MWU (2 independent groups)
Non-parametric test for comparing 2 independent groups.
21. Generic t-test
No a priori calculations
Chi-Square Tests
(no chi-square test for independence in here)
Cohen’s Effect Size Conventions for “w”
w = 0.10 (small)
w = 0.30 (medium)
w = 0.50 (large)
22. Goodness of Fit: Contingency Tables
Chi-Square test for Goodness of Fit
Example: Observed number of people belonging to groups A, B, C, and D are compared against expected values.
Tails = 1 or 2
Effect Size w = select “determine”. Number of cells refers to # of categories; in this case there are 4 (A, B, C, D). P(H0) is the column for the expected proportions. P(H1) is the column for the observed proportions. The proportions in each column must add up to 1. The 2 cells above the “equal P(H0)” and “equal P(H1)” buttons are for entering one proportion to be used in every cell of the respective column; type the value and then click the ‘equal’ button. Don’t know about the normalize buttons. The “Auto calc. last cell” button computes the proportion for the last cell in a column so that the total is 1.0.
Alpha = .05
Power = select desired level
DF = (# categories – 1). In this example 4-1 = 3.
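The effect size w that the “determine” window calculates follows Cohen's formula: the square root of the sum of (P(H1) - P(H0))^2 / P(H0) over all cells. Here is a small sketch. The proportions for groups A to D are hypothetical, chosen only to show the arithmetic.

```python
import math


def effect_size_w(p_h0, p_h1):
    """Cohen's w from expected (H0) and observed (H1) cell proportions."""
    if abs(sum(p_h0) - 1) > 1e-9 or abs(sum(p_h1) - 1) > 1e-9:
        raise ValueError("each column of proportions must add up to 1")
    return math.sqrt(sum((p1 - p0) ** 2 / p0 for p0, p1 in zip(p_h0, p_h1)))


# Hypothetical proportions for groups A, B, C, and D.
expected = [0.25, 0.25, 0.25, 0.25]   # P(H0): equal membership
observed = [0.40, 0.30, 0.20, 0.10]   # P(H1): what we anticipate seeing

w = effect_size_w(expected, observed)
df = len(expected) - 1                # number of categories minus 1
print(round(w, 3), df)
```

With these numbers w is a little under 0.45, which sits between the medium and large conventions listed above.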
23. Variance: difference from constant (one case)
Not sure about this one
24. Generic X2 Test
No a priori calculations
Z Tests
Effect size conventions for the correlation coefficient r.
r = small 0.10
r = medium 0.30
r = large 0.50
25. Correlation: Tetrachoric Model
Correlate 2 artificially dichotomized variables
26. Correlation: 2 Dependent Pearson r’s (common index)
Correlate Pearson correlation coefficients from 2 dependent samples.
27. Correlation: 2 Dependent Pearson r’s (no common index)
(not sure how this differs from #26)
28. Correlation: 2 independent Pearson r’s
Compare 2 Pearson correlation coefficients from 2 independent samples.
Example: Test whether the correlation between hours studied and test score for group A is statistically different than the correlation between hours studied and test score for group B.
Tails = 1 or 2
Effect size q = click ‘determine’ and then enter both r’s
Alpha = .05
Power = select desired level
Allocation ratio n2/n1 = enter ratio of participants in group A to group B.
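The effect size q is the difference between the two correlations after each has been put through Fisher's z transformation. The sketch below computes q and an approximate sample size per group. The two r values are hypothetical, equal group sizes are assumed (allocation ratio 1), and the function names are my own.

```python
import math
from statistics import NormalDist


def effect_size_q(r1, r2):
    """Cohen's q: difference between Fisher z-transformed correlations."""
    return math.atanh(r1) - math.atanh(r2)


def approx_n_per_group(q, alpha=0.05, power=0.80, tails=2):
    """Per-group n for two equal groups, using var(z1 - z2) = 2 / (n - 3)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / abs(q)) ** 2 + 3)


# Hypothetical correlations: r = .75 in group A and r = .50 in group B.
q = effect_size_q(0.75, 0.50)
n_each = approx_n_per_group(q)
print(round(q, 3), n_each)
```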
29.A. Logistic Regression for a continuous predictor (with or without other covariates)
Find out if a continuous predictor is a significant predictor of a binary outcome variable.
Example: Test whether BMI influences mortality (yes 1, no 0) in patients with chronic illness with or without other covariates (e.g., comorbidities, gender, etc.)
Tails: 1 or 2
Click on Options tab at the bottom to enter effect size as OR or 2 probabilities. Since OR = (p1(1-p2))/(p2(1-p1)), this example will use the 2 probabilities option.
p1 -> Pr(Y=1 | X=1) H1. What is the prob of death (Y=1) when the main predictor (BMI) is one SD unit (i.e., one z-score) above its mean, and all other covariates, if applicable, are set to their mean values. Let's say that p1 = 0.25.
p0 -> Prob(Y=1 | X=1) Ho. What is the prob of death (Y=1) when the predictor and all other covariates, if applicable, are set to their mean values? Let's say that p0 = 0.15.
Alpha = .05
Power = select desired level
R-squared other X = Enter the expected squared multiple correlation coefficient (R^2) between the main predictor and all other covariates. If there are no other covariates, enter 0. (This can be found by regressing the main predictor onto all other covariates.) So how much of the variance in the main predictor is accounted for by variability in other covariates? If there is just one other covariate, age, and it explains 10% of the variability in BMI, then enter .10. You may find R^2 using GPower's calculation tool in linear multiple regression (the p^2 under Exact or Multivariate tests).
X-Distribution = select normal unless there are reasons to think that the main predictor is distributed differently.
X param mu = the z-score population mean of predictor X (BMI) = 0.
X param sigma = the z-score population SD of predictor X (BMI) = 1.
29.B. Logistic Regression for a dichotomous predictor (with or without other covariates)
Find out if a dichotomous predictor is a significant predictor of a binary outcome.
Example. Test whether smoking (yes vs. no) influences mortality with no other covariates or with other covariates.
Tails = 1 or 2
Click on Options tab at the bottom to enter effect size as OR or 2 probabilities. Since OR = (p1(1-p2))/(p2(1-p1)), this example will use the 2 probabilities option.
p1 -> Pr(Y=1 | X=1) H1. The probability of death (Y=1) given that someone smokes (X=1). Let's assume that p1 = 0.18.
p0 -> Prob(Y=1 | X=1) Ho. The probability of death (Y=1) given that someone is a non-smoker (X=0). Let's assume that p0 = 0.06. [Note that the makers of G*Power should change Prob(Y=1|X=1)Ho to Prob(Y=1|X=0)]
Alpha = .05
Power = select desired level
R-squared other X = enter the expected squared multiple correlation coefficient (R^2) between the main predictor (smoking) and all other covariates. Find R^2 using the calculation tool in linear multiple regression (the p^2 under Exact or Multivariate F tests) or select a convention (small= .10, medium= .30, large= .50) and square it. For instance, if the expected correlation of the main predictor with other covariates is small (~.20), square it (.04). If there are no other covariates, enter 0.0.
X-Distribution = binomial predictor
X param pi = The proportion of cases where X=1. Are the samples for X=0 and X=1 balanced/equal? If balanced, enter 0.50. If 75% of the cases are X=1, then enter 0.75.
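If you prefer to enter the effect size as an odds ratio rather than as two probabilities, the conversion is a one-liner. The sketch below applies the OR formula quoted above to the probabilities from examples 29.A and 29.B; the function name is mine.

```python
def odds_ratio(p1, p0):
    """Odds ratio comparing the H1 probability p1 with the H0 probability p0."""
    return (p1 * (1 - p0)) / (p0 * (1 - p1))


# 29.A: death probability .25 at one SD above mean BMI vs. .15 at the mean.
or_bmi = odds_ratio(0.25, 0.15)

# 29.B: death probability .18 for smokers vs. .06 for non-smokers.
or_smoking = odds_ratio(0.18, 0.06)

print(round(or_bmi, 2), round(or_smoking, 2))
```

Either form describes the same effect, so the choice between the two input options comes down to which numbers you have on hand.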
30.A. Poisson Regression for a continuous predictor (with or without other covariates)
Find out whether a continuous predictor variable influences count data collected over the same period of time.
Example. Can we predict number of classes missed in a semester based on age?
Tails = 1 or 2
Exp(B1) = enter the increase in the response rate beyond base rate [Exp(Bo)] that you want to detect, with every one unit change in the main predictor. For example, if you wanted to detect an increase of 25% in absentees with every 1 year increase in age, then enter 1.25. If you wanted to detect a 75% increase in absentees, then enter 1.75, etc.
Alpha = .05 or .01
Power = select your level of power (I usually start at 90% but go no lower than 80%).
Base rate Exp(Bo) = the baseline response rate that is expected when the predictor equals the mean. For example, we expect the absentee base rate to be .05 (5%) at the mean age level. (Note that the base rate can refer to the death rate, survival rate, accident rate, or hazard rate, etc.)
Mean exposure = the mean unit of time over which the counts are collected. For example, if one year then enter 1; if 60 days then enter 60.
R^2 other X = enter the expected squared multiple correlation between the main predictor and other covariates, if applicable. If there are no other predictors/covariates then enter 0.
X Distribution = enter the shape of the underlying distribution for the main predictor. Usually 'normal' when the predictor is continuous.
X parm mu = 0
X parm sigma = 1
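To make the Exp(B1) and base rate entries concrete, here is a sketch of what a Poisson regression model implies: the expected count is the base rate multiplied by Exp(B1) once for every unit the predictor sits above its mean, times the exposure. The function name is my own, and the numbers are the ones used in the example above.

```python
def expected_count(base_rate, exp_b1, units_above_mean, exposure=1.0):
    """Expected count under a Poisson regression with a log link."""
    return base_rate * exp_b1 ** units_above_mean * exposure


base_rate = 0.05   # Exp(Bo): absentee rate at the mean of the predictor
exp_b1 = 1.25      # a 25% increase per one-unit increase in the predictor

for units in (0, 1, 2):
    print(units, round(expected_count(base_rate, exp_b1, units), 4))
```

Note that the increase compounds: two units above the mean gives 1.25 x 1.25 times the base rate, not 1.50 times.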
30.B. Poisson Regression for a binary predictor (with or without other covariates)
Find out whether a dichotomous predictor variable influences count data collected over the same period of time.
Example. Can we predict number of classes missed in a semester based on gender?
Tails = 1 or 2
Exp(B1) = enter the increase in the response rate beyond the base rate [Exp(Bo)] that you want to detect, with a change in the main predictor. For example, if you wanted to detect an increase of 25% in absentees among males (compared with females), then enter 1.25. If you wanted to detect a 75% increase in absentees, then enter 1.75, etc.
Alpha = .05 or .01
Power = select your level of power (I usually start at 90% and go no lower than 80%).
Base rate Exp(Bo) = the baseline response rate that is expected when the predictor equals zero (female). For example, we expect the absentee base rate to be .05 (5%) for females. (Note that the base rate can refer to the death rate, survival rate, accident rate, or hazard rate, etc.)
Mean exposure = the mean unit of time over which the counts are collected. For example, if one year then enter 1; if 60 days then enter 60.
R^2 other X = enter the expected squared multiple correlation between the main predictor and other covariates, if applicable. If there are no other predictors/covariates then enter 0.
X Distribution = binomial (shape of the underlying distribution for the main predictor)
X parm pi = Enter the proportion of total cases belonging to group 1 (males). If sample sizes are equal, then enter 0.50.
31. Proportions: Difference between 2 independent proportions
Compare 2 proportions from 2 independent groups
Example: The proportion of divorced Baptists in a sample of 100 is 0.22, and the proportion of divorced Catholics in another sample of 100 is 0.31. Is there a significant difference?
Tails = 1 or 2
Proportion 2 = enter Catholic proportion 0.31
Proportion 1 = enter Baptist proportion 0.22
Alpha = .05
Power = select desired level
Allocation ratio n2/n1 = enter ratio of participants in both groups (i.e., 100/100 = 1.0)
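The Baptist vs. Catholic example can be checked with the textbook normal-approximation formulas for comparing two independent proportions. This is a sketch assuming equal group sizes and a two-tailed test, and it ignores the negligible chance of rejecting in the wrong direction. The function names are mine, and G*Power's result may differ slightly depending on the options chosen.

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80, tails=2):
    """Approximate per-group n for a z test of two independent proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return math.ceil((numerator / (p1 - p2)) ** 2)


def approx_power_two_proportions(p1, p2, n_per_group, alpha=0.05, tails=2):
    """Approximate post hoc power for a given per-group sample size."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / tails)
    p_bar = (p1 + p2) / 2
    shift = abs(p1 - p2) * math.sqrt(n_per_group)
    z = ((shift - z_alpha * math.sqrt(2 * p_bar * (1 - p_bar)))
         / math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return NormalDist().cdf(z)


n_needed = n_per_group_two_proportions(0.22, 0.31)           # for 80% power
power_at_100 = approx_power_two_proportions(0.22, 0.31, 100)
print(n_needed, round(power_at_100, 2))
```

Under this approximation, samples of 100 per group give power of only about .30 for these two proportions, so a non-significant result from samples of that size would say very little.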
32. Generic Z Test
F-Tests
Cohen’s univariate effect size conventions for “f”
f = 0.10 (small)
f = 0.25 (medium)
f = 0.40 (large)
33. ANCOVA: Fixed effects, main effects, and interactions
34. ANOVA: Fixed Effects, omnibus, one-way
One-Way between (fixed effects) groups ANOVA
Example: We want to compare mean scores on an algebra test for students who took the test listening to Rock, Country, and Rap.
Determine Effect Size = Select Procedure > effect size from mean. Enter number of levels (groups) of the fixed variable being compared (in this case 3 music groups). Enter expected SD for all groups (assuming homogeneity of variance). Enter expected mean test scores in the table along with expected sample sizes for each. If sample sizes are equal, then enter the amount in “equal n” and then click (in this case we expect 12 participants per group). Click “calculate effect size” and transfer to main window.
Alpha = .05
Power = select desired power level (use “post hoc” to enter sample size)
Number of Groups = already inserted from effect size calculations, so 3.
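The “effect size from mean” procedure boils down to Cohen's f: the standard deviation of the group means divided by the common within-group SD. The sketch below shows the arithmetic for the three music groups with 12 participants each. The test score means and the SD of 10 are hypothetical values I picked for illustration.

```python
import math


def effect_size_f(means, sd, ns):
    """Cohen's f = (SD of the group means, weighted by n) / (within-group SD)."""
    total_n = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    sigma_m = math.sqrt(
        sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means)) / total_n
    )
    return sigma_m / sd


# Hypothetical algebra test means for the Rock, Country, and Rap groups.
f = effect_size_f(means=[70, 75, 80], sd=10, ns=[12, 12, 12])
print(round(f, 3))  # about 0.41, a large effect by the conventions above
```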
35. ANOVA: Fixed effects, special, main effects and interactions
Two-way (or higher) between (fixed effects) groups ANOVA (single analysis good for both main effects and interactions)
Example: We want to see if there is a difference in test scores based on gender (female vs. male) and race (Caucasian, Hispanic, Black, Native) thus making a 2x4 analysis.
Determine Effect Size = Select Procedure > direct method. Enter partial eta squared (η²) which is the effect size measure indicating the total variance explained by the IVs, main effects, and interactions. Click “calculate effect size” and transfer to main window.
Alpha = .05
Power = desired level (select “post hoc” to enter sample size)
Numerator df = this specifies which main effect or interaction you are testing for. It is found by taking the number of levels and subtracting one. In this case, enter 4-1 = 3 df if testing for race, 2-1=1 df if testing for gender, and (2-1)*(4-1) = 3 df if testing for the interaction.
Number of groups = found by multiplying the levels in both factors (in this case 2x4=8)
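The direct method converts partial eta squared to f with f = sqrt(eta squared / (1 - eta squared)), and the numerator df and number of groups follow from the number of levels in each factor. A small sketch for the 2x4 gender by race example is below. The partial eta squared of .06 is a hypothetical medium value, and the function names are mine.

```python
import math


def f_from_partial_eta_squared(eta_sq):
    """Convert partial eta squared to Cohen's f."""
    return math.sqrt(eta_sq / (1 - eta_sq))


def factorial_design_inputs(levels_a, levels_b):
    """Numerator df for each effect and the number of groups (cells)."""
    return {
        "df_factor_a": levels_a - 1,
        "df_factor_b": levels_b - 1,
        "df_interaction": (levels_a - 1) * (levels_b - 1),
        "number_of_groups": levels_a * levels_b,
    }


f = f_from_partial_eta_squared(0.06)       # about 0.25, a medium effect
inputs = factorial_design_inputs(2, 4)     # gender (2) x race (4)
print(round(f, 3), inputs)
```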
36. ANOVA: Repeated measures, between factors
RMANOVA (just for comparing levels of a between factor like gender)
Example: The same students took 3 tests under 3 different music conditions (rock, country, and rap). We want to know if there is a significant effect for gender (males vs. females), the between factors variable.
Determine Effect Size = Select Procedure > effect size from mean. Enter number of levels of the fixed variable being compared (in this case 2 genders). Enter expected SD for all groups (assuming homogeneity of variance). Enter expected mean test scores in the table along with expected sample sizes for each. If sample sizes are equal, then enter the amount in “equal n” and then click “calculate effect size” and transfer to main window.
Alpha = .05
Power = desired level (select “post hoc” to enter sample size)
Number of Groups = 2
Repetitions = 3 music conditions
Correlation among repeated measures = enter approximate correlation
37. ANOVA: Repeated measures, within factors
RMANOVA (just for comparing levels of a within factor variable like days)
Example: The same students took 3 tests under 3 different music conditions (rock, country, and rap). We want to know if there is a significant effect for music condition, the within factor variable. Two between factors groups for gender (male vs. female).
Determine Effect Size = Select Procedure > direct method. Enter partial eta squared (η²) which is the effect size measure indicating the total variance explained by the IVs, main effects, and interactions. Click “calculate effect size” and transfer to main window. Eta squared size conventions: small = .01; medium = .06; large = 0.14.
Alpha = .05 (for one tail)
Power = desired level (select “post hoc” to enter sample size)
Number of groups = of the between subjects factor, in this case 2 for gender
Repetitions = number of repeated measures, in this case 3 for music condition
Correlation among repeated measures = whatever you think this might be
Nonsphericity correction e = 1.0 if sphericity assumption is met, something else if not met. (Highest value is 1.0, and lowest value = 1/[repetitions – 1].)
(Sphericity assumption in univariate RMANOVA – When the repeated measures are transformed by a set of orthogonal weights, they should be uncorrelated with each other but have equal variances. This is the sphericity assumption. If the design includes a between-subjects factor, then sphericity must be met. How well this assumption is met is determined by the Epsilon statistic which ranges from 0 to 1, with 1 being perfect sphericity and 0 being complete violation. About .75 or higher is usually acceptable in most RMANOVA designs. Failure to meet the sphericity assumption increases the Type I error rate. When sphericity assumption is met, the univariate test is more powerful than the multivariate test. If the assumption is not met, you may use epsilon multipliers (see SPSS printout) although these may be too conservative, or you may use the multivariate methods below which do not require the sphericity assumption.)
38. ANOVA: Repeated measures, within-between interaction
RMANOVA (just for testing the interaction of within and between variables)
Example: The same students took 3 tests under 3 different music conditions (rock, country, and rap). The within factor is music and the between factor is gender. We want to know if there is a significant effect for the interaction between music and gender.
Determine Effect Size = Select Procedure > direct method. Enter partial eta squared (η²), the effect size measure indicating the proportion of variance explained by a given effect (main effect or interaction) once the variance explained by the other effects is excluded. Click “calculate effect size” and transfer to main window. Eta squared size conventions: small = .01; medium = .06; large = .14.
Alpha = .05 (for one tail)
Power = desired level (select ‘post hoc’ to enter sample size)
Number of groups = of the between subjects factor, in this case 2 for gender
Repetitions = number of repeated measures, in this case 3 for music condition
Correlation among repeated measures = whatever you think this might be
39. Hotelling’s T2: One group mean vector
Multivariate analysis for comparing within group data on 2 or more DVs.
Example: We want to compare patients’ pre-treatment measures with post-treatment measures based on 2 outcome variables Y1 and Y2 (you may have more than 2 DVs).
Determine Effect Size = enter number of response variables (in this case 2). There are 3 techniques for calculating effect size. I prefer the variance-covariance matrix (#1) approach.
#1. Variance-covariance matrix (preferred approach): Enter vector/column means for the differences between pre and post outcome data vectors, and fill in the variance-covariance matrix for the difference vectors (variances on the diagonal and covariances on the off-diagonals). In most cases, "multiply all means by" should be set to 1.
E.g.,
   Pre-intervention    Post-intervention    Differences
   pre1    pre2        post1    post2       Diff1    Diff2
   5       8           4        8            1       0
   4       9           4        7            0       2
   5       7           6        6           -1       1
   6       8           5        7            1       1
Vector means of differences:   0.25 (Diff1)    1.0 (Diff2)
Variances of differences:      0.92 (Diff1)    0.67 (Diff2)
Covariance: -0.33
Correlation: -0.43
Covariance matrix of differences:
         Y1      Y2
   Y1    .92    -.33
   Y2   -.33     .67
#2 & 3. SD and correlation (other approaches): Enter vector means for the differences between pre and post outcome data vectors, and fill in the SD-correlation matrix for the differences vectors (SDs in diagonal and corr. in off-diagonals). (Don’t know about “autocorr” and “multiply all means by” windows right now, although the latter should probably be set to 1).
Alpha = .05 (for one tail)
Power = set desired level (use post hoc to enter sample size)
Response variables = enter number of outcome variables, in this case 2.
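If you would rather not compute the difference vectors and their covariance matrix by hand, the worked example under approach #1 can be reproduced with a few lines of Python. This is only a sketch using the four patients from the table; the variable names are mine, and the covariance helper uses the usual n - 1 denominator.

```python
from statistics import mean

def covariance(x, y):
    """Sample covariance with n - 1 in the denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

# Example data from the table (4 patients, 2 outcome variables)
pre1, pre2 = [5, 4, 5, 6], [8, 9, 7, 8]
post1, post2 = [4, 4, 6, 5], [8, 7, 6, 7]

# Difference vectors: pre minus post for each outcome
diff1 = [a - b for a, b in zip(pre1, post1)]
diff2 = [a - b for a, b in zip(pre2, post2)]

# Values to enter in G*Power: mean vector and variance-covariance matrix
means = [mean(diff1), mean(diff2)]
cov_matrix = [
    [covariance(diff1, diff1), covariance(diff1, diff2)],
    [covariance(diff2, diff1), covariance(diff2, diff2)],
]

# Correlation between the difference vectors (needed for the SD/correlation approaches)
correlation = cov_matrix[0][1] / (cov_matrix[0][0] * cov_matrix[1][1]) ** 0.5

print(means)
for row in cov_matrix:
    print([round(v, 2) for v in row])
print(round(correlation, 2))
```

The square roots of the diagonal entries give the SDs that approaches #2 and #3 ask for.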
40. Hotelling’s T2: Two group mean vectors
Multivariate analysis for comparing 2 independent groups on 2 or more DVs.
Example: We want to compare patients who get therapy 1 with patients getting therapy 2 based on 2 outcome variables Y1 and Y2 (you may have more than 2 DVs).
Determine Effect Size = Enter the number of response variables, in this case 2. There are three input techniques. I prefer method #1.
#1. Variance-covariance matrix (preferred approach): Enter vector/column means and fill in the total variance-covariance matrix for the Y1 and Y2 data vectors.
E.g.,
   Therapy Group 1     Therapy Group 2
   Y1    Y2            Y1    Y2
   3     6             6     7
   2     4             4     8
   4     2             5     6
The mean vectors are:
             Y1    Y2
   Group1    3     4
   Group2    5     7
Here’s the total (pooled) covariance matrix representing the population common covariance matrix (1.00 is the pooled variance for both Y1 columns, 2.50 is the pooled variance for both Y2 columns, and -0.75 is the pooled covariance for both groups). It is found by adding the SSCP matrices for both groups and dividing by the total degrees of freedom.
         Y1       Y2
   Y1    1.00    -.75
   Y2   -.75     2.50
#2 & 3. SD and corr. matrix (other approaches): Enter vector/column means and fill in the SDs & correlation matrix (SDs in diagonal, corr. in off-diagonals). (Don’t know about “autocorr” and “multiply all means by” windows right now, although the latter should probably be set to 1).
Alpha = .05 (for one tail)
Power = set desired level (use post hoc to enter sample size)
Allocation ratio N2/N1 = sample size for group 2 divided by sample size for group 1
Response variables = enter number of outcome variables, in this case 2
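The pooled covariance matrix in approach #1 is the step people most often get wrong, so here is a small Python sketch of the calculation described above: add the SSCP matrices for the two groups and divide by the total degrees of freedom. It uses the therapy data from the example; the function names are illustrative only.

```python
from statistics import mean

def sscp(columns):
    """Sums of squares and cross-products of deviations from the column means."""
    centered = []
    for col in columns:
        m = mean(col)
        centered.append([v - m for v in col])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in centered] for ci in centered]

def pooled_covariance(groups):
    """Add the groups' SSCP matrices and divide by the total degrees of freedom."""
    p = len(groups[0])
    total = [[0.0] * p for _ in range(p)]
    df = 0
    for columns in groups:
        m = sscp(columns)
        for i in range(p):
            for j in range(p):
                total[i][j] += m[i][j]
        df += len(columns[0]) - 1  # n - 1 for each group
    return [[v / df for v in row] for row in total]

# Each group is a list of columns: [Y1 values, Y2 values]
group1 = [[3, 2, 4], [6, 4, 2]]
group2 = [[6, 4, 5], [7, 8, 6]]

mean_vectors = [[mean(col) for col in g] for g in (group1, group2)]
pooled = pooled_covariance([group1, group2])

print(mean_vectors)
print(pooled)
```

With 3 patients per group the total degrees of freedom are 2 + 2 = 4, which is the divisor used for every entry of the matrix.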
41. MANOVA – Global Effects
Multivariate analysis for comparing 2 or more independent groups (fixed factors) when we have 2 or more DVs.
Example: We want to compare 4 different groups of patients getting therapies A, B, C, and D based on 5 outcome variables Y1, Y2, Y3, Y4, and Y5.
Options = Select Muller & Peterson (1984) method (used by SPSS)
Determine Effect Size = Enter Pillai’s trace based on analysis of a preliminary data set. Enter number of groups, in this case 4. Enter number of response (DV) variables, in this case 5. Enter total sample size, in this case 4 groups x 5 patients/group = 20. Calculate effect size and return to main window. The calculation is f^2 = V / (s - V), where V is Pillai’s trace and “s” equals the smaller of the number of DVs and the number of groups minus 1. f^2 Effect Size Conventions: Small = (.10)^2 = .01; Medium = (.25)^2 = .06; Large = (.40)^2 = .16.
Alpha = .05
Power = set desired level (choose post hoc to enter sample size)
Number of Groups = in this case 4
Response Variables = number of DVs, in this case 5
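To see what the effect size calculation is doing with Pillai’s trace, the formula f^2 = V / (s - V) can be written out in a few lines of Python. This is a sketch only: the function name is mine, and the trace value of 0.30 is invented for illustration, standing in for whatever your preliminary analysis reports.

```python
def pillai_to_f_squared(pillai_v, n_groups, n_dvs):
    """f^2 = V / (s - V), with s = min(number of DVs, number of groups - 1)."""
    s = min(n_dvs, n_groups - 1)
    if not 0 <= pillai_v < s:
        raise ValueError("Pillai's trace must lie in [0, s)")
    return pillai_v / (s - pillai_v)

# Design from the example: 4 therapy groups and 5 outcome variables, so s = 3.
# V = 0.30 is a made-up value, not a result from any real data set.
f_squared = pillai_to_f_squared(0.30, n_groups=4, n_dvs=5)
print(round(f_squared, 4))
```

Comparing the result against the f^2 conventions (.01, .06, .16) tells you roughly what size of effect your preliminary data suggest.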
42. MANOVA – Special Effects and Interactions
Multivariate analysis for comparing the interaction of within and fixed factors (factorial design) when we have 2 or more DVs.
Example: We want to compare 4 groups of patients getting therapies A, B, C, and D based on 5 outcome variables Y1, Y2, Y3, Y4, and Y5, with sex (M vs. F) as a between subjects factor.
Options = Select Muller & Peterson (1984) method (used by SPSS)
Determine Effect Size = Enter Pillai’s trace based on analysis of a preliminary data set. Enter number of groups, in this case 4. Enter number of response (DV) variables, in this case 5. Enter total sample size, in this case 4 groups x 5 patients/group = 20. Calculate effect size and return to main window. The calculation is f^2 = V / (s - V), where V is Pillai’s trace and “s” equals the smaller of the number of DVs and the number of groups minus 1. f^2 Effect Size Conventions: Small = (.10)^2 = .01; Medium = (.25)^2 = .06; Large = (.40)^2 = .16.
Alpha = .05
Power = set desired level (choose post hoc to enter sample size)
Number of Groups = in this case 4
Response Variables = number of DVs, in this case 5
43. MANOVA: Repeated Measures, Between Factors
Testing between factor effects in univariate RMANOVA using the multivariate approach, when sphericity assumption is not met (see #50 below).
44. MANOVA: Repeated Measures, Within Factors
Testing within factor effects in univariate RMANOVA using the multivariate approach, when sphericity assumption is not met (see #50 below).
45. MANOVA: Repeated Measures, Within-Between Interaction
Testing interaction of within and between factors in univariate RMANOVA using the multivariate approach, when sphericity assumption is not met (see #50 below).
46. Linear Multiple Regression: Fixed model, R2 deviation from zero (see also #2 above)
Evaluate whether a group of predictors significantly predicts a DV. This is done by testing the null hypothesis that the proportion of variance in a DV explained by a set of predictors (R-squared) equals zero.
Example: Do IV1, IV2, and IV3 significantly predict DV?
Determine Effect Size (f^2) = There are two methods. Use method (a) when you just want to enter the squared multiple correlation coefficient representing the amount of variability in the DV accounted for by the IVs. Use method (b) when you want to enter the specific correlations (a more rigorous approach).
Method (a). Enter the expected squared multiple correlation coefficient ρ^2 (i.e., R^2), which is the amount of variability in the DV explained by the predictors, and then calculate the effect size f^2 by clicking "calculate". G*Power does the following calculation: f^2 = R^2 / (1 - R^2). f^2 effect size conventions are 0.02 small, 0.15 medium, 0.35 large, which correspond to R^2 values of about 0.02, 0.13, and 0.26. An alternative to entering R^2 is to enter an f^2 effect size convention on the main page in the box next to "Effect size f2".
Method (b). Find the squared multiple correlation coefficient ρ^2 by specifying the correlations between the predictors and the outcome, and then specifying the correlation matrix among the predictors.
E.g., with 3 predictors, the correlations between IV1, IV2, IV3 and the DV may be . . .
          IV1    IV2    IV3
   DV     .23    .16    .24
And the correlation matrix between the predictors may be . . .
          IV1    IV2    IV3
   IV1    1.0    .20    .45
   IV2    .20    1.0    .31
   IV3    .45    .31    1.0
Calculate the effect size and transfer to main window (Note that f^2 effect size conventions are: small=0.02, medium=0.15, and large=0.35)
Alpha = 0.05 or 0.01
Power = something higher than 0.79
Number of Predictors = enter the number of IVs, in this case 3.
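For anyone curious how the correlations in method (b) turn into an effect size, here is a Python sketch. It assumes the standard result that R^2 equals the outcome correlations multiplied by the standardized regression weights, which are found by solving the predictor correlation matrix against the outcome correlations; f^2 then follows from R^2 / (1 - R^2). The helper names are my own, and the input values are the ones from the example above.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def r_squared(predictor_corr, outcome_corr):
    """R^2 = sum of (standardized weight * predictor-outcome correlation)."""
    betas = solve(predictor_corr, outcome_corr)
    return sum(b * r for b, r in zip(betas, outcome_corr))

# Correlations of IV1, IV2, IV3 with the DV, and among the predictors
outcome_corr = [0.23, 0.16, 0.24]
predictor_corr = [[1.0, 0.20, 0.45],
                  [0.20, 1.0, 0.31],
                  [0.45, 0.31, 1.0]]

r2 = r_squared(predictor_corr, outcome_corr)
f2 = r2 / (1 - r2)
print(round(r2, 3), round(f2, 3))
```

The resulting f^2 can be checked against the conventions (0.02, 0.15, 0.35) before you transfer the value to the main window.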
47. Linear Multiple Regression: Fixed model, R2 increase
48. Variance: Test of equality (2 sample case)
49. Generic F Test
50. Repeated Measures Multivariate ANOVA
This design involves measuring more than one outcome (dependent) variable in the same group of people on three or more occasions. For instance, researchers might measure blood pressure, heart rate, and oxygen saturation (3 DVs) in the same 10 patients at four different time periods (start of trial, 4 months, 8 months, and 12 months) to test the cumulative effect of drug A. In another example, researchers might evaluate obsessive-compulsive tendencies, anxiety, and depression in the same group of participants under three different treatment levels of drug B (0 mg, 250 mg, 500 mg) administered at three different time periods (1 month, 2 months, 3 months).
Unfortunately G*Power cannot power the RMMANOVA. If you thought that the “F Test: MANOVA: repeated measures” function can power RMMANOVA, you are not alone. I am sorry to say that it cannot. The “F Test: MANOVA: repeated measures” command is the multivariate equivalent of the univariate ANOVA that is used when the univariate analysis violates the sphericity assumption. So what should you do if you need to power a RMMANOVA design? Just follow the 5 steps below.
Step 1. Estimate the means and standard deviations for each outcome variable at each time measurement period. These should be based on similar studies found in the existing literature, professional/clinical experience, and/or preliminary data gathering. (Note: if you do not have this information, then you need to run a pilot/exploratory study first.)
Step 2. Use a random number generator to create fictional data sets for each outcome at each time period (e.g., Motulsky’s GraphPad website has a free random number generator). The data should be generated assuming a normal distribution (an assumption of ANOVA). How many data points should you generate in each set? The number of data points should equal the number of participants you are planning on using in the study.
Step 3. Copy the fictional data and paste it into an Excel spreadsheet. When all the data sets are in Excel, copy the entire data set and paste it into SPSS. (Note that SPSS will not accept some data straight from a random number generator, but Excel will, and you can copy and paste data from Excel into SPSS.)
Step 4. Run a GLM multivariate analysis on the fictional data and ‘ask’ SPSS to estimate power.
Step 5. Repeat steps 2-4 a few times and then calculate the mean of the SPSS power estimates.
There you have it! An estimate of power for any RMMANOVA. If your mean power is <.80, try adding more data points/participants in step 2.
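If you prefer a script to a web-based random number generator, steps 2 and 5 can be sketched in Python. Everything numeric below is hypothetical: the means, SDs, outcome names, and the list of power estimates are placeholders that you would replace with your own step 1 estimates and the values SPSS reports in step 4. Note that this sketch, like a simple random number generator, draws every set of scores independently, so the fictional data carry no correlation among the repeated measures.

```python
import random
from statistics import mean

def generate_dataset(cell_means, cell_sds, n_participants, rng):
    """Step 2: draw normally distributed scores for each outcome at each time."""
    data = {}
    for cell, mu in cell_means.items():
        sd = cell_sds[cell]
        data[cell] = [rng.gauss(mu, sd) for _ in range(n_participants)]
    return data

# Step 1 (hypothetical estimates): keys are (outcome, time period)
cell_means = {("anxiety", 1): 30.0, ("anxiety", 2): 27.0, ("anxiety", 3): 25.0,
              ("depression", 1): 20.0, ("depression", 2): 19.0, ("depression", 3): 17.0}
cell_sds = {cell: 5.0 for cell in cell_means}

rng = random.Random(2024)  # fixed seed so the fictional data can be regenerated
datasets = [generate_dataset(cell_means, cell_sds, 10, rng) for _ in range(5)]

# Steps 3-4 happen outside this sketch: analyze each data set in SPSS and
# record its power estimate. These numbers are placeholders, not results.
power_estimates = [0.62, 0.71, 0.58, 0.66, 0.69]

# Step 5: average the recorded estimates and compare with the .80 target
mean_power = mean(power_estimates)
print(round(mean_power, 3), "add participants" if mean_power < 0.80 else "ok")
```

Each entry in datasets holds one column of scores per outcome and time period, ready to be pasted into a spreadsheet for step 3.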