SPSSx STATISTICAL PROCEDURES

Procedures Described:

ANOVA (Analysis of Variance)
CROSSTABS (Frequency Tables & Chi-Square)
DISCRIMINANT (Discriminant Analysis)
FACTOR (Factor Analysis)
FREQUENCIES (Frequency Distribution)
LIST (List Variables)
PARTIAL CORR (Partial Correlation)
PEARSON CORR (Pearson Correlation)
PLOT (Scatterplot)
REGRESSION (Multiple Regression)
T-TEST (t-Test)
 
 

I.  ANOVA  

Analysis of Variance tests the effects of different levels of one or more categorical variables (the "independent variables") on a continuous variable (the "dependent variable").  It works in the same general way that a t-test determines the significance of the difference between two groups.  However, instead of two levels of one variable (as in the t-test), ANOVA can simultaneously analyze the effects of up to five variables, each with two or more levels.  For example, one could determine how the size of a group (large, medium and small) and gender (male and female) affect verbal aggression in a group discussion.  A major advantage of ANOVA is that, along with the effects of the individual variables (main effects), it provides a test of the combined effects of two or more variables (i.e., interaction effects).
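
As a rough illustration of the arithmetic behind the F test, the sketch below computes a one-way F ratio by hand in Python.  It covers one independent variable only, so it does not show interaction effects, and the function name and the scores are hypothetical, not SPSSx output.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical verbal-aggression scores for small, medium and large groups.
small, medium, large = [1, 2, 3], [2, 3, 4], [3, 4, 5]
f_ratio, df_b, df_w = one_way_anova([small, medium, large])
print(f_ratio, df_b, df_w)
```

The larger the F ratio relative to its degrees of freedom, the less likely it is that the group means differ by chance alone.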

II.  CROSSTABS

CROSSTABS computes and displays tables that cross-classify two or more discrete variables.  These tables indicate to what extent characteristics of one variable and characteristics of another variable occur together.  To determine whether the results are "significant" or not, a chi-square test (or similar test) must be done.  This computes the cell frequencies which would be expected if no relationship were present and compares them to the actual values.  The larger the difference, the larger the chi-square value and the less likely it is that the result arose by chance.  For example, using CROSSTABS and chi-square you can determine whether blacks or whites are more likely to live in inner city areas vs. suburban areas (i.e., the relationship between race and place of residence).  A correlation coefficient is not used because the variables are categorical, not continuous.
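
The expected-versus-actual comparison described above can be sketched in a few lines of Python.  This is a minimal illustration with made-up counts, not the SPSSx implementation; the function name and the table are assumptions.

```python
def chi_square(table):
    """Return (chi2, expected, df) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count for a cell if the two variables were unrelated.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over every cell.
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, expected, df


# Hypothetical counts: rows are two groups, columns are inner city vs. suburban.
observed = [[30, 10],
            [20, 40]]
chi2, expected, df = chi_square(observed)
print(round(chi2, 2), expected, df)
```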

III.  DISCRIMINANT

Discriminant Analysis is a method to distinguish between two or more categories or groups, using a weighted combination of continuous variables.  For example, if you wanted to use age, height and intelligence to predict whether someone is likely to be a paranoid, schizophrenic or a manic-depressive, you would use a discriminant analysis.  It does much the same thing as multiple regression, but is used to discriminate categories rather than predict a continuous variable.  It also gives you "functions" which are analogous to the "factors" in factor analysis.
 
 

IV.  FACTOR

Factor Analysis reduces a larger number of variables to a smaller number of general dimensions or factors.  It does this through the use of intercorrelations between the initial variables.  Roughly speaking, those variables which are highly correlated are lumped together into categories (factors).  For example, a factor analysis of a questionnaire with forty items might reveal that instead of measuring 40 independent variables, the questionnaire actually measures five or six factors defined by groups of the individual items.

V.  FREQUENCIES

This procedure computes and presents one-way frequency distribution tables for discrete (categorical) or continuous variables.  The standard frequency table consists of the absolute frequency, relative frequency, adjusted frequency and cumulative frequency for each variable value.  This procedure can also calculate a variety of descriptive statistics and is the only SPSSx procedure that reports the median.  In addition, histograms and bar graphs can be obtained with this procedure.
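
To make the four columns of the frequency table concrete, here is a small Python sketch.  It assumes that "relative" is the percentage of all cases, "adjusted" is the percentage of non-missing cases, and "cumulative" is the running total of the adjusted percentages; the function name, the use of None for missing values, and the data are all hypothetical.

```python
from collections import Counter
from statistics import median


def frequency_table(values, missing=None):
    """Return rows of (value, absolute, relative %, adjusted %, cumulative %)."""
    counts = Counter(values)
    n_total = len(values)
    n_valid = n_total - counts.get(missing, 0)

    rows = []
    cumulative = 0.0
    for value in sorted(v for v in counts if v != missing):
        absolute = counts[value]
        relative = 100 * absolute / n_total   # based on all cases
        adjusted = 100 * absolute / n_valid   # based on non-missing cases
        cumulative += adjusted
        rows.append((value, absolute, relative, adjusted, cumulative))
    return rows


# Hypothetical scores; None marks a missing response.
scores = [1, 2, 2, 3, 3, 3, None, None]
for row in frequency_table(scores):
    print(row)

valid = [s for s in scores if s is not None]
print("median:", median(valid))
```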

VI.  LIST

List is not a statistical procedure but simply lists the values of specified variables for the subjects (cases) selected.  Among other things, this procedure is useful for checking for errors in your program.  You can calculate your "computed" variables by hand for one subject and then compare these and all the subject's variables with the values listed by the LIST procedure.
 

VII.  PARTIAL CORR

Partial correlation produces a quantitative index (correlation coefficient) of the extent and direction of the relationship between two variables with the effects of one or more other variables eliminated.  For example, one might want to know how the number of churches in various towns and the number of crimes are related.  Since it is known, however, that the size of the town affects both variables (and the correlation between them), a partial correlation could be used to eliminate this effect and obtain a more valid estimate of how the number of churches and the number of crimes are actually related.
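
For a single control variable, the partial correlation can be computed directly from the three ordinary correlations.  The Python sketch below uses the standard first-order formula; the function name and the three correlation values are hypothetical and chosen only to show the effect.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y with the effect of z removed."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values: x = churches, y = crimes, z = town size.
r_churches_crimes = 0.60
r_churches_size = 0.80
r_crimes_size = 0.75

r_partial = partial_corr(r_churches_crimes, r_churches_size, r_crimes_size)
print(round(r_partial, 3))
```

With these made-up numbers the apparent relationship between churches and crimes essentially disappears once town size is held constant.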

VIII.  PEARSON CORR

The Pearson product-moment correlation coefficient (r) is a mathematical index of the relationship between two continuous variables.  The values for "r" fall between -1.00 and +1.00.  This number tells the strength of the relationship (the farther "r" is from zero in either direction, the stronger the relationship) and the direction of the relationship (whether "r" is negative or positive).  A negative "r" value indicates that low scores on one variable go along with high scores on the other, while a positive "r" indicates that high scores on one variable go along with high scores on the other.
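
The calculation of "r" can be sketched in Python as follows.  The function name and the data are hypothetical; the two examples are chosen so that one relationship is perfectly positive and the other perfectly negative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # high goes with high: positive r
falling = [10, 8, 6, 4, 2]   # high goes with low: negative r
print(pearson_r(hours, rising), pearson_r(hours, falling))
```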
 

IX.  REGRESSION

Regression analysis allows you to generate an equation to predict scores on one variable from knowledge of scores on one (simple regression) or more (multiple regression) other variables.  For example, you may wish to predict the success of freshmen in Math 3 and 4 at the U of S with information about SAT scores, high school class standing, and height.  Multiple correlations are also produced by this procedure, and are simply a numerical index of the relationship between all of the predictor variables (SATs, H.S. standing and height) and the criterion variable (average grade in Math 3 and 4).  All variables must be continuous.
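
The prediction equation for the simplest case, one predictor, can be sketched in Python as below.  Multiple regression fits the same kind of least-squares equation with more than one predictor, which this sketch does not attempt.  The function name and the data are hypothetical.

```python
def simple_regression(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: SAT math score (in hundreds) and average math grade.
sat = [4, 5, 6, 7]
grade = [1.5, 2.0, 2.5, 3.0]

slope, intercept = simple_regression(sat, grade)
predicted = intercept + slope * 6.5   # predicted grade for an SAT score of 650
print(slope, intercept, predicted)
```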

X.  T-TEST

This procedure computes t-values and their corresponding significance levels for either independent or dependent samples.

1. Independent samples:  The t-value computed for independent samples compares the means of two different groups.  The average of the test scores for one group is compared with the average for the other group to determine if the difference is statistically significant.  The value obtained is presented along with the degrees of freedom and the two-tailed probability level (two-tailed because the direction of the difference is not specified in advance).

2. Paired (dependent) samples:  The t-value computed for paired samples compares two means from the same group or from matched groups.  For example, each person is tested twice, before and after treatment, and these means can be compared to see if the difference is significant.  The value for t is presented along with the degrees of freedom and the two-tailed probability level.  This procedure can only be used with repeated measures of the same variable or two variables measured on the same scale.
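
Both kinds of t-value can be sketched in Python as shown below.  The independent-samples version assumes the pooled-variance formula (equal variances in the two groups), and neither function computes the two-tailed probability, which would need a t table or a statistics package.  The function names and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Pooled-variance t for two independent groups; returns (t, df)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


def paired_t(before, after):
    """t for paired scores, based on each pair's difference; returns (t, df)."""
    diffs = [x2 - x1 for x1, x2 in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / sqrt(variance(diffs) / n)
    return t, n - 1


# Hypothetical test scores for two different groups.
print(independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7]))

# Hypothetical scores for the same four people before and after treatment.
print(paired_t([10, 12, 14, 16], [12, 13, 17, 18]))
```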