Procedures Described:
ANOVA (Analysis of Variance)
CROSSTABS (Frequency Tables & Chi-Square)
DISCRIMINANT (Discriminant Analysis)
FACTOR (Factor Analysis)
FREQUENCIES (Frequency Distribution)
LIST (List Variables)
PARTIAL CORR (Partial Correlation)
PEARSON CORR (Pearson Correlation)
PLOT (Scatterplot)
REGRESSION (Multiple Regression)
T-TEST (t-Test)
Analysis of Variance provides a test of the effects of different levels of one or more categorical variables (the "independent variables") on a continuous variable (the "dependent variable"). This is done in the same general way that a t-test determines the significance of the difference between two groups. However, instead of two levels of one variable (as in the case of the t-test), ANOVA can simultaneously analyze the effects of up to five variables with two or more levels of each one. For example, one could determine how the size of a group (large, medium and small) and gender (male and female) affect verbal aggression in a group discussion. A major advantage of ANOVA is that along with the effects of the individual variables (main effects) it provides a test of the combined effects of two or more variables (i.e., interaction effects).
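The F ratio behind the simplest case, a single categorical variable, can be sketched in a few lines of Python. The function name and the verbal aggression scores are made up for illustration. This is not SPSSx output, and the sketch does not cover designs with several variables or tests of their combined effects.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical verbal aggression scores for small, medium and large groups
small, medium, large = [1, 2, 3], [2, 3, 4], [5, 6, 7]
f_ratio, df_between, df_within = one_way_f([small, medium, large])
print(round(f_ratio, 2), df_between, df_within)
```

A large F means the group means differ by more than the spread of scores within the groups would suggest.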
Crosstabulation computes and displays crosstabulation tables comparing two or more discrete variables. These tables indicate to what extent characteristics of one variable and characteristics of another variable occur together. To determine whether the results are "significant" or not, a chi-square test (or similar test) must be done. This computes the cell frequencies which would be expected if no relationship were present and compares them to the actual values. The larger the difference, the larger the chi-square value and the less likely it is that the observed pattern arose by chance. For example, using CROSSTABS and chi-square you can determine whether blacks or whites are more likely to live in inner city areas vs. suburban areas (i.e., the relationship between race and place of residence). A correlation coefficient is not used here because the variables are categorical, not continuous.
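The comparison of expected and actual cell frequencies can be sketched in Python as follows. The function name and the counts are hypothetical and chosen only to keep the arithmetic easy to follow. The sketch returns the chi-square value and degrees of freedom but does not look up a significance level.

```python
def chi_square(observed):
    """Return (chi-square value, expected table, degrees of freedom)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count in each cell if the two variables were unrelated
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, expected, df


# Hypothetical counts: rows are two groups, columns are inner city / suburb
table = [[30, 10], [20, 40]]
stat, expected, df = chi_square(table)
print(round(stat, 2), expected, df)
```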
Discriminant Analysis is a method to distinguish between two or more
categories or groups, using a weighted combination of continuous variables.
For example, if you wanted to use age, height and intelligence to predict
whether someone is likely to be a paranoid schizophrenic or a manic-depressive,
you would use a discriminant analysis. It does much the same thing
as multiple regression, but is used to discriminate categories rather than
predict a continuous variable. It also gives you "functions" which
are analogous to the "factors" in factor analysis.
Factor Analysis reduces a larger number of variables to a smaller number of general dimensions or factors. It does this through the use of intercorrelations between the initial variables. Roughly speaking, those variables which are highly correlated are lumped together into categories (factors). For example, a factor analysis of a questionnaire with forty items might reveal that instead of measuring 40 independent variables, the questionnaire actually measures five or six factors defined by groups of the individual items.
The FREQUENCIES procedure computes and presents one-way frequency distribution tables for discrete (categorical) or continuous variables. The standard frequency table consists of the absolute frequency, relative frequency, adjusted frequency and cumulative frequency for each variable value. This procedure is also used to calculate a variety of descriptive statistics and is the only procedure that reports the median in SPSSx. In addition, histograms and bar graphs can be obtained with this procedure.
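The four columns of such a table can be sketched in Python as follows. The sketch assumes that the adjusted frequency is the percentage computed after missing values are excluded and that the cumulative column accumulates the adjusted percentages. The function name, the use of None as the missing-value code and the data are hypothetical.

```python
from collections import Counter


def frequency_table(values, missing=None):
    """Return rows of (value, absolute, relative %, adjusted %, cumulative %)."""
    counts = Counter(values)
    n_total = len(values)
    n_valid = n_total - counts[missing]  # cases left after dropping missing values
    rows = []
    cumulative = 0.0
    for value in sorted(v for v in counts if v != missing):
        relative = 100 * counts[value] / n_total   # percent of all cases
        adjusted = 100 * counts[value] / n_valid   # percent of valid cases only
        cumulative += adjusted
        rows.append((value, counts[value], relative, adjusted, cumulative))
    return rows


# Hypothetical responses to one questionnaire item; None marks a missing answer
answers = [1, 2, 2, 3, 3, 3, None, None]
rows = frequency_table(answers)
for row in rows:
    print(row)
```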
List is not a statistical procedure but simply lists the values of specified
variables for the subjects (cases) selected. Among other things,
this procedure is useful for checking for errors in your program.
You can calculate your "computed" variables by hand for one subject and
then compare these and all the subject's variables with the values listed
by the LIST procedure.
Partial correlation produces a quantitative index (correlation coefficient) of the extent and direction of the relationship between two variables with the effects of one or more other variables eliminated. For example, one might want to know how the number of churches in various towns and the number of crimes are related. Since it is known, however, that the size of the town affects both variables (and the correlation between them), a partial correlation could be used to eliminate this effect and obtain a more valid estimate of how the number of churches and the number of crimes are actually related.
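When only one variable is being controlled, the partial correlation can be computed from the three ordinary correlations among the variables. The Python sketch below uses the standard first-order formula. The function name and the three correlation values are invented for illustration and do not come from real data.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with the effect of z removed."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical correlations: x = churches, y = crimes, z = town size
r_churches_crimes = 0.80
r_churches_size = 0.90
r_crimes_size = 0.85
r_partial = partial_r(r_churches_crimes, r_churches_size, r_crimes_size)
print(round(r_partial, 2))
```

With these made-up values, a strong raw correlation of .80 shrinks to roughly .15 once town size is held constant.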
The Pearson product-moment correlation coefficient (r) is a mathematical
index of the relationship between two continuous variables. The values
for "r" fall between -1.00 and +1.00. The size of the number tells the
strength of the relationship (the further "r" is from zero, the stronger it is)
and the sign tells the direction (whether "r" is negative or positive).
A negative "r" value indicates that low scores on one variable go along
with high scores on the other, while a positive "r" indicates that high
scores on one variable go along with high scores on the other.
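The calculation of "r" can be sketched in Python as follows. The function name and the two sets of scores are hypothetical. Reversing the sign of one variable shows how the direction of the relationship changes while its strength stays the same.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five subjects
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r_pos = pearson_r(hours, score)                # high goes with high
r_neg = pearson_r(hours, [-s for s in score])  # high goes with low
print(round(r_pos, 2), round(r_neg, 2))
```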
Regression analysis allows you to generate an equation to predict scores
on one variable from knowledge of scores on one (simple regression) or
more (multiple regression) other variables. For example, you may
wish to predict success of freshmen in Math 3 and 4 at the U of S with
information about SAT scores, high school class standing, and height.
Multiple correlations are also produced by this procedure, and are
simply a numerical index of the relationship between all of
the predictor variables (SATs, H.S. standing and height) and the criterion
variable (average grade in Math 3 and 4).
All variables must be continuous.
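For the simple (one predictor) case, the prediction equation can be sketched in Python with ordinary least squares. The function name, the SAT scores and the grades are invented for illustration. Multiple regression with several predictors follows the same idea but is not shown here.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: SAT score and average grade in Math 3 and 4
sat = [400, 500, 600, 700]
grade = [1.5, 2.0, 3.0, 3.5]
intercept, slope = fit_line(sat, grade)
predicted = intercept + slope * 650  # predicted grade for an SAT score of 650
print(round(intercept, 2), round(slope, 3), round(predicted, 2))
```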
The T-TEST procedure computes t-values and their corresponding significance levels for either independent or dependent samples.
1. Independent samples: The t-value computed for independent samples compares the means of two different groups. The average of the test scores for one group is compared with the average for the other group to determine if the difference is statistically significant. The value obtained is presented along with the degrees of freedom and the two-tailed probability level (two-tailed because the direction of the difference is not specified in advance).
2. Paired samples: The t-value computed for dependent (paired) samples
compares two means from the same group or from matched groups. For example,
if each person is tested twice, before and after treatment, the two means
can be compared to see if the difference is significant. The value
for t is presented along with the degrees of freedom and the two-tailed
probability level. This procedure can only be used with repeated
measures of the same variable or two variables measured on the same scale.
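Both kinds of t-value can be sketched in Python as follows. The independent-samples version assumes the two groups have equal variances (the pooled-variance form). The function names and scores are hypothetical, and the sketch stops at t and the degrees of freedom without looking up the probability level.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def sample_var(values):
    """Sample variance with n - 1 in the denominator."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


def t_independent(a, b):
    """Pooled-variance t for two independent groups; returns (t, df)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * sample_var(a) + (n2 - 1) * sample_var(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


def t_paired(before, after):
    """t for paired scores, based on each subject's difference; returns (t, df)."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / sqrt(sample_var(diffs) / n)
    return t, n - 1


# Hypothetical test scores
t_ind, df_ind = t_independent([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
t_pair, df_pair = t_paired([10, 12, 14, 16], [12, 15, 15, 18])
print(round(t_ind, 2), df_ind, round(t_pair, 2), df_pair)
```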