0.1 Introduction
SPSS download student version: link
SPSS canvas data file: link
SPSS tutorial in English: link
SPSS tutorial in Swedish: link
Note that the instructions become less detailed as you work your way through the tasks.
Open SPSS. Log into the course website at Canvas. Download the file Communicative Organizations Database.
SUPER IMPORTANT!! Save your data file and all changes so you can use it for the following workshops! Also save your output files or copy paste relevant parts to another document if you want to remember what you did in the previous workshop!
1 Workshop: Descriptive statistics
2 Workshop: Data transformation
2.1 Recoding variables
- Create a cross table with Role (nr 5) as the column variable and org_gossip (nr 17) as the row variable. Do managers have more of a problem with gossip than other groups?
- Recode org_gossip into a variable with only three values: disagree, neither, and agree. First, note the different possible values for the variable. Choose Transform -> Recode into Different Variables from the drop-down menu. Insert the variable into the box titled Input Variable -> Output Variable. In the fields Name and Label, write org_gossip_cat (in both fields). Click Change. Click Old and New Values. In Old Value, write 1 and in New Value, write 1. Click Add. Repeat with the other values (2=1, 3=2, 4=3, 5=3). Click Continue and OK. You have now created a new variable, org_gossip_cat, with 3 categories instead of 5.
Go to variable view. Look up the new variable (it’s at the bottom of the list). Mark the empty field in the column Values. Click the three dots. In the pop-up window, write in the new values: In Value, write 1 and in Label Disagree. Repeat for values 2 (Neither) and 3 (Agree). Click OK.
Create a frequency table for the new variable AND for the old variable. Do the numbers match? This is a way of knowing whether you recoded correctly.
Create a cross table with Role as the column variable and the new variable org_gossip_cat as the row variable. Do the differences appear more clearly now?
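The recoding and the frequency check in this section can also be written out as a small Python sketch. This is only an illustration of the logic, not something you need to run in SPSS: the list of responses is invented, and the names `RECODE` and `LABELS` are chosen here for the example.

```python
from collections import Counter

# Hypothetical responses on the 5-point org_gossip scale (not the course data).
org_gossip = [1, 2, 2, 3, 4, 5, 5, 3, 1, 4]

# Same mapping as in the recode dialog: 1 and 2 -> 1, 3 -> 2, 4 and 5 -> 3.
RECODE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}
LABELS = {1: "Disagree", 2: "Neither", 3: "Agree"}

org_gossip_cat = [RECODE[value] for value in org_gossip]

# Frequency tables for the old and the new variable.
old_freq = Counter(org_gossip)
new_freq = Counter(org_gossip_cat)

# Check: each new category must contain exactly the cases of the old values it absorbs.
for new_value, label in LABELS.items():
    expected = sum(n for old, n in old_freq.items() if RECODE[old] == new_value)
    print(label, new_freq[new_value], expected)
```

If the two numbers printed for a category differ, some value was mapped to the wrong category or left out of the recode.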
2.2 Computing variables
The variable Age is measured in birth years. We will now transform it so that it measures age in years. Your current age is 2024 minus your birth year. We shall perform the same kind of calculation for all respondents. The survey was conducted in late 2015 to early 2016 (see Falkheimer et al., 2017), so we will use 2016 for all respondents.
Choose Transform -> Compute Variable. In the field Target Variable, write the name of your new variable. Remember to name it in a way that is understandable to you! Click Type and Label and choose Use expression as label. Write "2016 - Age" (with an ordinary minus sign) in the field Numeric Expression. Click OK.
[1] 1967
Min.  1st Qu.  Median  Mean  3rd Qu.  Max.  NA's
17.0  40.0     49.0    48.1  57.0     74.0  1684
- Inspect the new variable. What is the mean and median age? Does it correspond with the original variable?
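As a cross-check of the arithmetic, the same computation can be sketched in Python. The birth years below are invented for the example, and `None` is used here to stand for a respondent who did not answer.

```python
from statistics import mean, median

SURVEY_YEAR = 2016  # the year used for all respondents in this workshop

# Hypothetical birth years (not the course data); None marks a missing answer.
birth_years = [1970, 1985, None, 1961, 1990, 1950]

# A missing birth year stays missing instead of turning into a bogus age.
ages = [SURVEY_YEAR - year if year is not None else None for year in birth_years]

valid_ages = [age for age in ages if age is not None]
valid_years = [year for year in birth_years if year is not None]

print(mean(valid_ages), median(valid_ages))
# The median age should equal the survey year minus the median birth year.
print(SURVEY_YEAR - median(valid_years))
```

Because the new variable is just a constant minus the old one, its mean and median should mirror those of the birth-year variable, and the number of missing cases should be unchanged.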
2.3 Creating an index
Consider the variables 31-35 (femcomm_performance, femcomm_mgmt, femcomm_opportunity, femcomm_support, and femcomm_barriers). Run frequency tables for all of them. What do they measure? Do they seem to have something in common? (Hint: they do.)
Look at the individual values of the variables. Do they all have the same number of values? Are they coded in the same direction?
Run a reliability test to see whether the variables are suitable for combining into an index. Choose Analyze -> Scale -> Reliability Analysis from the drop-down menu. Transfer the variables into the field Items. Click Statistics. Choose Scale if item deleted, then Continue and OK.
Inspect the table Reliability Statistics. Cronbach's Alpha should have a value above 0.7. In the table Item-Total Statistics you can see how Cronbach's Alpha changes when one item is excluded.
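To see what the reliability analysis computes, here is a minimal Python sketch of Cronbach's Alpha using the standard formula: alpha = k/(k-1) * (1 - sum of the item variances / variance of the total score), where k is the number of items. The function name `cronbach_alpha` and the scores are made up for illustration; they are not the course data, and the sketch assumes no missing values.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's Alpha for a list of item columns.

    Each column holds the scores of the same respondents, in the same order.
    """
    k = len(items)
    # Total score per respondent: the sum across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores: three items, five respondents (not the course data).
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

The more the items vary together, the larger the variance of the total score becomes relative to the individual item variances, and the closer alpha gets to 1.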
Create a summative index variable by choosing Transform -> Compute Variable from the drop-down menu. Name the index variable index_femcomm, genderdiffindex, or something else that allows you to remember what it represents. Simply sum the variables in the field Numeric Expression by double-clicking the first variable, selecting +, double-clicking the second variable, and so on. Click OK.
Run a frequency table of the new variable and select Mean, Median, Std. deviation, S.E. mean, Skewness, and Kurtosis under Statistics. Also produce a Histogram (under Charts) and choose Show normal curve on histogram.
What is the minimum and maximum score? What is the mean and the median? Does the variable seem to be approximately normally distributed?
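The index construction can be sketched in Python as follows. The rows are invented, the helper `summative_index` is a name chosen for this example, and the sketch assumes that an index built by adding variables with + is missing for any respondent who has a missing value on one of the items.

```python
from statistics import mean, median

# Hypothetical rows: one tuple per respondent with five item scores; None = missing.
rows = [
    (4, 5, 3, 4, 2),
    (2, 2, 1, 3, 1),
    (5, None, 4, 4, 3),
    (3, 3, 3, 3, 3),
]

def summative_index(row):
    # Assumption: a single missing item makes the whole index missing.
    if any(value is None for value in row):
        return None
    return sum(row)

index_femcomm = [summative_index(row) for row in rows]
valid = [score for score in index_femcomm if score is not None]

print(min(valid), max(valid), mean(valid), median(valid))
```

A quick plausibility check on a summative index: the lowest possible score is the number of items times the lowest item value, and the highest possible score is the number of items times the highest item value.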
2.4 A first attempt at hypothesis testing
Are female or male communicators more likely to score high on the index we created above? Let's find out! Choose Analyze -> Compare Means -> Means and enter Gender in the Independent List and our new index in the Dependent List. What is the mean score for men and women, respectively?
Is this a statistically significant difference? Let's find out! Choose Analyze -> Compare Means -> Independent-Samples T Test from the drop-down menu. Insert the index variable into the field Test Variable(s). Insert the Gender variable into the field Grouping Variable. Click Define Groups. Write 1 in the field Group 1 and 2 in the field Group 2. This means that we will compare the mean score on the index for the females (1) and the males (2). A t-test is a test for statistical significance which we will talk more about later.
Inspect the tables. The first table, Group Statistics, tells you what we already know: that women score higher than men on the index. The next table reports the results of two tests: Levene's test for equality of variances and the t-test for equality of means. Levene's test tells you which row of the t-test to read. If Levene's test is significant (the value under Sig. is below 0.05), the variances of the two groups differ, so read the row Equal variances not assumed; otherwise, read the row Equal variances assumed. What is the value reported under Sig. (2-tailed) in that row? If it is below 0.05, the test is significant. This means that, if there were no real difference between women and men, a difference in mean scores at least this large would arise from random variation less than 5% of the time.
Congratulations, you have distinguished the signal from the noise and found a non-trivial statistically significant relationship between two variables!
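For the curious, the two rows of the t-test table correspond to two ways of computing the test statistic. The Python sketch below shows both, using invented scores; the function names `pooled_t` and `welch_t` are chosen for this example. It stops at the t value and the degrees of freedom, since turning these into the Sig. value requires the t distribution, which SPSS handles for you.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """t and degrees of freedom assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

def welch_t(a, b):
    """t and degrees of freedom without assuming equal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical index scores for two groups (not the course data).
women = [18, 20, 17, 22, 19]
men = [15, 16, 14, 18, 17]

print(pooled_t(women, men))
print(welch_t(women, men))
```

With equally large groups the two t values coincide and only the degrees of freedom differ; with unequal group sizes and unequal variances the two rows can lead to different conclusions, which is why Levene's test is checked first.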
THAT’S IT FOR TODAY! REMEMBER TO SAVE YOUR DATA AND OUTPUT FILES (THOSE ARE TWO DIFFERENT OPERATIONS)!