A Simulation of Regression with Random Values
This is SPSS code from Terry.Taerum@ualberta.ca that creates simulated data for 40 variables. It loops over 648 cases, assigning a 1 to v40 for the first 10 cases and a 0 for the remainder, and fills v1 through v39 with random integers from 0 to 4. This part of the program can be used to create randomly initialized variables for any subsequent statistical procedure. In this case, the simulation then runs a linear regression and a logistic regression on the created variables. The simulation demonstrates that if you use 39 independents in a regression, one or two will be found to be significant in most runs, even though the data are, in fact, random. This is what chance alone predicts: at the conventional .05 significance level, 39 tests of true null hypotheses produce about 39 × .05 ≈ 2 false positives on average.
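* Start from an empty dataset and declare the 40 variables.
new file.
input program.
numeric v1 to v40.
* The vector vx gives indexed access to the 39 predictors v1 to v39.
vector vx=v1 to v39.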
loop #1=1 to 648.
if (#1 le 10) v40=1.
if (#1 gt 10) v40=0.
* Fill each predictor with a random integer from 0 to 4.
loop #j=1 to 39.
compute vx(#j)=trunc(5*rv.uniform(0,1)).
end loop.
end case.
end loop.
end file.
end input program.
execute.
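* Linear regression of v40 on all 39 random predictors.
regression vars=v1 to v40
/dependent=v40/enter v1 to v39.
* Logistic regression of the same binary outcome on the same predictors.
logistic regression var=v40
/method=enter v1 to v39.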
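
The "one or two significant predictors" result can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the original SPSS program. It assumes a .05 significance level (the source does not state one) and treats the 39 coefficient tests as independent, which is only an approximation for predictors in a single regression. The names N_PREDICTORS, ALPHA, and prob_k_false_positives are illustrative.

```python
from math import comb

N_PREDICTORS = 39   # v1 to v39 in the SPSS program
ALPHA = 0.05        # assumed significance level; the source does not state one


def prob_k_false_positives(k, n=N_PREDICTORS, alpha=ALPHA):
    """Binomial probability that exactly k of n independent null tests have p < alpha."""
    return comb(n, k) * alpha**k * (1 - alpha) ** (n - k)


# Expected number of predictors flagged as significant by chance alone.
expected_false_positives = N_PREDICTORS * ALPHA

# Chance that a single run shows at least one "significant" predictor.
prob_at_least_one = 1 - prob_k_false_positives(0)

# Chance that a single run shows exactly one or exactly two.
prob_one_or_two = prob_k_false_positives(1) + prob_k_false_positives(2)

print(f"expected false positives: {expected_false_positives:.2f}")
print(f"P(at least one):          {prob_at_least_one:.3f}")
print(f"P(exactly one or two):    {prob_one_or_two:.3f}")
```

Under these assumptions, roughly two false positives are expected per run, and most runs show at least one. The exact frequencies from the SPSS simulation may differ somewhat, because the tests share the same data and v40 is heavily unbalanced (10 ones in 648 cases).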