Learning Outcomes:
After having completed this topic you should be able to:
  • determine when regression analysis is appropriate for analysing a problem
  • explain how regression is used in prediction using the least squares concept
  • interpret the linear regression SPSS output
LINEAR REGRESSION

What is Regression Analysis?
Regression analysis is a statistical tool for investigating relationships between variables. In research we are often interested in ascertaining the causal effect of one variable upon another: for example, the effect of an increase in tuition fees on the demand for higher education, or the effect of family income on academic performance. To explore such issues, you have to obtain data on the two sets of variables and employ regression to estimate the quantitative effect of the causal variables (e.g. tuition fees, family income) upon the variable that they influence (e.g. demand for places in higher education, academic performance). Regression techniques have long been used in psychology, education, econometrics, business, law and other disciplines.
For purposes of illustration, let us suppose that we wish to identify and quantify the factors that determine income in the labour market. Obviously, there are many factors associated with differences in income across individuals, and these would include type of occupation, age, experience, educational attainment, motivation, tacit knowledge, gender and even race. Let us restrict our analysis to a single factor and call it Education. Regression analysis with a single explanatory variable is termed "SIMPLE REGRESSION" or "LINEAR REGRESSION". However, we know that relating education to income without taking the other factors mentioned into consideration is unrealistic.
Assuming that we represent 'Education' as the number of years in school, we can formulate a hypothesis stating the relationship between 'Education' and 'Income Earned'. It is generally believed that better educated people tend to make more money. Thus, the tentative hypothesis is "Higher levels of education cause higher levels of earnings, other things being equal".


[Figure: scatter diagram. X axis: Education (years of schooling), 0 to 21. Y axis: Income earned per month ($), 1000 to 14000.]

Then you can plot this information for all of the individuals in the sample, forming a scatter diagram where each point (shown in blue) represents an individual in the sample. The diagram suggests that higher values of education (E) tend to yield higher values of income (I), but the relationship is not perfect. It seems that knowing E alone is not sufficient for an entirely accurate prediction of I. Why?

The diagram also shows that individuals with no education still earn positive amounts of money and that education increases earnings above this baseline. In other words, education affects income in a "linear" fashion; i.e. each additional year of schooling adds the same amount to income.
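One way to write this linear relationship down is sketched here. The symbols are assumptions made for illustration and are not the handout's own notation: $\alpha$ stands for the baseline income at zero years of schooling, $\beta$ for the amount each additional year adds, and $\varepsilon_i$ for the part of individual $i$'s income that education does not explain (occupation, experience and the other factors listed earlier).

```latex
I_i = \alpha + \beta E_i + \varepsilon_i
```

Under this sketch, the scatter of points around the line corresponds to the $\varepsilon_i$ terms, which is one reason E alone does not predict I exactly.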
To enable prediction, the REGRESSION EQUATION has to be computed. How do you compute the regression equation?
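A minimal sketch of one common way to compute such an equation is ordinary least squares, the concept named in the learning outcomes: choose the straight line that makes the sum of squared vertical distances from the data points as small as possible. The sample data and the names `least_squares_fit` and `predict_income` are hypothetical and chosen only for illustration; they are not taken from the handout or from SPSS.

```python
# Hypothetical sample: years of schooling and income earned ($ per month).
# These values are made up for illustration only.
education = [0, 3, 6, 9, 12, 15, 18, 21]
income = [1200, 2500, 3100, 5200, 6400, 8900, 10500, 13000]


def least_squares_fit(x, y):
    """Return (intercept, slope) of the least squares line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Sum of squared deviations of x from its mean.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = least_squares_fit(education, income)


def predict_income(years):
    """Predicted monthly income for a given number of years of schooling."""
    return intercept + slope * years


print(f"Income = {intercept:.2f} + {slope:.2f} * Education")
print(f"Predicted income at 10 years: {predict_income(10):.2f}")
```

The slope estimates how much each additional year of schooling adds to monthly income in this made-up sample, and the intercept estimates the baseline income at zero years of schooling.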

To investigate this hypothesis, let us assume that you have been able to gather data on Education (represented by years of schooling) and Income earned ($ per month) for various individuals. Let the X axis denote EDUCATION (E), the years of schooling for each individual, and let the Y axis denote INCOME (I), the individual's earnings in dollars per month.