This program calculates the regression coefficients for the following model:
G(Y) = C0 + C1 F1(X1) + C2 F2(X2) + ... + Cn Fn(Xn)
where X1, X2, ..., Xn are the independent variables and Y is the dependent variable. G(), F1(), F2(), and so on are optional transformation functions applied to the regression variables. The program also calculates the coefficient of determination, R-Square.
The program displays the following menu:
MULTIPLE LINEAR REGRESSION
==========================
0) QUIT
1) KEYBOARD INPUT
2) DATA STATEMENT INPUT
3) FILE INPUT
4) FILE OUTPUT
5) CALCULATE REGRESSION
SELECT CHOICE BY NUMBER:
Option 1 allows you to enter data from the keyboard. This option performs the following tasks:
1. Prompts you for the number of independent variables.
2. Prompts you for the number of observations.
3. Prompts you for the values in the matrix of independent variables. REMEMBER THAT THE FIRST COLUMN OF THIS MATRIX IS ONES.
4. Prompts you for the values of the dependent variable Y.
Option 2 permits you to obtain data from the program's own DATA statements. The DATA statements must provide values for:
1. The number of independent variables.
2. The number of observations.
3. The values in the matrix of independent variables. REMEMBER THAT THE FIRST COLUMN OF THIS MATRIX IS ONES.
4. The values of the dependent variable Y.
Option 3 allows you to obtain data from a text file. The program prompts you for the input filename. This text file contains values for the following (each value must appear on a separate text line):
1. The number of independent variables.
2. The number of observations.
3. The values in the matrix of independent variables. REMEMBER THAT THE FIRST COLUMN OF THIS MATRIX IS ONES.
4. The values of the dependent variable Y.
Once the program reads the data from the file, it asks whether you want to transform the data (using the code in subroutine TRSNF). Enter Y or Yes to proceed with the data transformation. Otherwise, enter N or No to bypass the transformation step.
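For illustration, an input file holding the same sample values as the program's own DATA statements might look like the sketch below. It assumes the matrix of independent variables is stored row by row (including the leading 1 of each row), which is the order the file-input loop in the listing reads it. The first two lines are the number of independent variables (3) and the number of observations (5), followed by the 20 matrix values and then the 5 values of Y.

```text
3
5
1
7
25
6
1
1
29
15
1
11
56
8
1
11
31
8
1
7
52
6
60
52
20
47
33
```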
Option 4 allows you to store the current data to a text file. The program prompts you for the output filename.
Option 5 triggers the multiple regression calculations, which perform the following tasks:
1. Calculates and displays the regression coefficients C(0), C(1), and so on.
2. Calculates and displays the coefficient of determination R-Square.
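In matrix terms, the two tasks above correspond to the normal equations, as the MAT statements in the listing suggest. The notation here is assumed for illustration and does not appear in the program: X is the data matrix with its leading column of ones, y is the column of dependent values, c is the column of coefficients, and n is the number of observations.

```latex
\mathbf{c} = \left(X^{\mathsf{T}} X\right)^{-1} X^{\mathsf{T}} \mathbf{y},
\qquad
R^2 = \frac{\mathbf{c}^{\mathsf{T}} X^{\mathsf{T}} \mathbf{y} - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^{2}}
           {\sum_{i=1}^{n} y_i^{2} - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^{2}}
```

In the listing, the numerator term c'X'y is accumulated in SUMCX, the sum of the y values in SUMY, and the sum of their squares in SUMY2.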
The DATA statements contain the data shown in the following table (note that X0 is a dummy variable that represents the column of 1's):
X0 | X1 | X2 | X3 | Y |
1 | 7 | 25 | 6 | 60 |
1 | 1 | 29 | 15 | 52 |
1 | 11 | 56 | 8 | 20 |
1 | 11 | 31 | 8 | 47 |
1 | 7 | 52 | 6 | 33 |
The above data yield the following results:
C( 0 ) =103.447316589
C( 1 ) =-1.28409650404
C( 2 ) =-1.03692762188
C( 3 ) =-1.33948793673
R2 = 0.998937219108
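As an independent cross-check of these figures, the following minimal sketch repeats the same normal-equations calculation in Python rather than BASIC. It is not part of the original program: the function names solve and fit are hypothetical, and exact fractions are used only to sidestep rounding questions in this small example. It assumes the matrix X'X is non-singular.

```python
from fractions import Fraction

# Sample data copied from the DATA statements (first column of X is ones).
X = [
    [1, 7, 25, 6],
    [1, 1, 29, 15],
    [1, 11, 56, 8],
    [1, 11, 31, 8],
    [1, 7, 52, 6],
]
Y = [60, 52, 20, 47, 33]


def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination using exact fractions."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero pivot (assumes the matrix is non-singular).
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot becomes 1.
        m[col] = [v / m[col][col] for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]


def fit(x, y):
    """Return (coefficients, R-squared) computed from the normal equations."""
    p, n = len(x[0]), len(y)
    # X'X and X'y, the counterparts of X1 and Y1 in the BASIC listing.
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(p)]
    coeff = solve(xtx, xty)
    sum_y = sum(y)
    sum_y2 = sum(v * v for v in y)
    sum_cx = sum(c * t for c, t in zip(coeff, xty))
    correction = Fraction(sum_y * sum_y, n)
    r2 = (sum_cx - correction) / (sum_y2 - correction)
    return [float(c) for c in coeff], float(r2)


coeff, r2 = fit(X, Y)
for i, c in enumerate(coeff):
    print(f"C({i}) = {c:.6f}")
print(f"R2 = {r2:.6f}")
```

Run on the sample data, this sketch agrees with the results listed above to six decimal places.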
Here is the BASIC listing:
OPTION TYPO
OPTION NOLET
! MULTIPLE LINEAR REGRESSION
DECLARE NUMERIC I, J, NDATA, NVARS, NVARSP1, C, R2, SUMY, SUMCX, SUMY2
DECLARE STRING A$
DIM X(1,1),Y(1,1),X0(1,1),X1(1,1),Y1(1,1),COEFF(1,1)
NVARS=0
NDATA=0
DO
  PRINT
  PRINT TAB(20);"MULTIPLE LINEAR REGRESSION"
  PRINT TAB(20);"=========================="
  PRINT "0) QUIT"
  PRINT "1) KEYBOARD INPUT"
  PRINT "2) DATA STATEMENT INPUT"
  PRINT "3) FILE INPUT"
  PRINT "4) FILE OUTPUT"
  PRINT "5) CALCULATE REGRESSION"
  INPUT PROMPT "SELECT CHOICE BY NUMBER: ":C
  IF C=1 THEN
    ! OPTION 1: KEYBOARD INPUT
    INPUT PROMPT "NUMBER OF X VARS? ": NVARS
    INPUT PROMPT "NUMBER OF POINTS? ": NDATA
    NVARSP1=NVARS+1
    MAT REDIM X(NDATA,NVARSP1),Y(NDATA,1),X0(NVARSP1,NDATA)
    MAT REDIM X1(NVARSP1,NVARSP1),Y1(NVARSP1,1),COEFF(NVARSP1,1)
    PRINT "ENTER VALUES FOR INDEPENDENT VARIABLES MATRIX X (WITH FIRST COLUMN AS ONES):"
    MAT INPUT X
    PRINT "ENTER VALUES FOR DEPENDENT VARIABLE Y:"
    MAT INPUT Y
    CALL TRSNF(X(,),Y(,),NDATA,NVARSP1)
  ELSEIF C=2 THEN
    ! OPTION 2: DATA STATEMENT INPUT
    WHEN ERROR IN
      READ NVARS, NDATA
      NVARSP1 = NVARS+1
      MAT REDIM X(NDATA,NVARSP1),Y(NDATA,1),X0(NVARSP1,NDATA)
      MAT REDIM X1(NVARSP1,NVARSP1),Y1(NVARSP1,1),COEFF(NVARSP1,1)
      PRINT "MATRIX X"
      MAT READ X
      MAT PRINT X
      PRINT
      PRINT "ARRAY Y"
      MAT READ Y
      MAT PRINT Y
      PRINT
      DATA 3,5
      DATA 1, 7, 25, 6
      DATA 1, 1, 29, 15
      DATA 1, 11, 56, 8
      DATA 1, 11, 31, 8
      DATA 1, 7, 52, 6
      DATA 60, 52, 20, 47, 33
      ! RESET THE DATA POINTER SO THAT OPTION 2 CAN BE SELECTED AGAIN
      RESTORE
    USE
      PRINT "ERROR IN READING FROM DATA STATEMENTS"
      NDATA = 0
      NVARS = 0
    END WHEN
  ELSEIF C=3 THEN
    ! OPTION 3: FILE INPUT (ONE VALUE PER TEXT LINE)
    INPUT PROMPT "ENTER FILENAME? ":A$
    WHEN ERROR IN
      OPEN #1: NAME A$, ORG TEXT, CREATE OLD, ACCESS INPUT
      INPUT #1: NVARS
      INPUT #1: NDATA
      NVARSP1 = NVARS+1
      MAT REDIM X(NDATA,NVARSP1),Y(NDATA,1),X0(NVARSP1,NDATA)
      MAT REDIM X1(NVARSP1,NVARSP1),Y1(NVARSP1,1),COEFF(NVARSP1,1)
      PRINT "MATRIX X"
      FOR I = 1 TO NDATA
        FOR J = 1 TO NVARSP1
          INPUT #1: X(I,J)
          PRINT X(I,J);
        NEXT J
        PRINT
      NEXT I
      PRINT
      PRINT "ARRAY Y"
      FOR I = 1 TO NDATA
        INPUT #1: Y(I,1)
        PRINT Y(I,1);
      NEXT I
      PRINT
      CLOSE #1
      INPUT PROMPT "TRANSFORM DATA? (Y/N) ":A$
      IF UCASE$(A$)="Y" OR UCASE$(A$)="YES" THEN
        CALL TRSNF(X(,),Y(,),NDATA,NVARSP1)
      END IF
    USE
      PRINT "COULD NOT OPEN OR READ FROM FILE ";A$
      NDATA = 0
      NVARS = 0
    END WHEN
  ELSEIF C=4 AND NVARS*NDATA>0 THEN
    ! OPTION 4: FILE OUTPUT (SAME LAYOUT AS THE INPUT FILE)
    INPUT PROMPT "ENTER FILENAME? ":A$
    WHEN ERROR IN
      OPEN #1: NAME A$, ORG TEXT, CREATE NEWOLD, ACCESS OUTIN
      ERASE #1
      PRINT #1: NVARS
      PRINT #1: NDATA
      FOR I = 1 TO NDATA
        FOR J = 1 TO NVARSP1
          PRINT #1: X(I,J)
        NEXT J
      NEXT I
      FOR I = 1 TO NDATA
        PRINT #1: Y(I,1)
      NEXT I
      CLOSE #1
    USE
      PRINT "COULD NOT OPEN OR WRITE TO FILE ";A$
    END WHEN
  ELSEIF C=5 AND NVARS*NDATA>0 THEN
    ! OPTION 5: SOLVE THE NORMAL EQUATIONS
    ! COEFF = INV(TRN(X)*X) * (TRN(X)*Y)
    MAT X0=TRN(X)
    MAT X1=X0*X
    MAT Y1=X0*Y
    MAT X1=INV(X1)
    MAT COEFF=X1*Y1
    FOR I=1 TO NVARSP1
      PRINT "COEFF(";I-1;")=";COEFF(I,1)
    NEXT I
    ! COEFFICIENT OF DETERMINATION
    SUMY=0
    SUMCX=0
    SUMY2=0
    FOR I=1 TO NDATA
      SUMY=SUMY+Y(I,1)
      SUMY2=SUMY2+Y(I,1)^2
    NEXT I
    FOR I=1 TO NVARSP1
      SUMCX=SUMCX+COEFF(I,1)*Y1(I,1)
    NEXT I
    R2=(SUMCX-SUMY^2/NDATA)/(SUMY2-SUMY^2/NDATA)
    PRINT "R^2=";R2
  ELSE
    IF C<>0 THEN PRINT "INVALID CHOICE"
  END IF
  IF C<>0 THEN
    PRINT "PRESS ANY KEY TO CONTINUE";
    GET KEY I
  END IF
LOOP UNTIL C=0
PRINT "END OF PROGRAM"
END
SUB TRSNF(X(,),Y(,),NDATA,NVARSP1)
  ! DATA TRANSFORMATION
END SUB
The subroutine TRSNF is where you place any required data transformation statements. In the current version of the code, that subroutine contains no executable statements, which means that the current multiple regression is strictly linear.
To perform a power regression for all the variables, for example, the subroutine TRSNF would look like:
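SUB TRSNF(X(,),Y(,),NDATA,NVARSP1)
  ! DATA TRANSFORMATION
  ! LOOP COUNTERS ARE DECLARED BECAUSE OPTION TYPO IS IN EFFECT
  DECLARE NUMERIC I, J
  FOR I = 1 TO NDATA
    Y(I,1)=LOG(Y(I,1))
  NEXT I
  ! COLUMN 1 OF X HOLDS THE ONES AND IS LEFT UNCHANGED
  FOR I = 1 TO NDATA
    FOR J = 2 TO NVARSP1
      X(I,J) = LOG(X(I,J))
    NEXT J
  NEXT I
END SUB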
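With the power-regression version of the subroutine above, the coefficients the program reports apply to the logarithms of the variables. A short derivation, assuming LOG denotes the natural logarithm and that every Y and X value is positive, shows how they map back to the original variables:

```latex
\ln Y = C_0 + C_1 \ln X_1 + C_2 \ln X_2 + \dots + C_n \ln X_n
\quad\Longrightarrow\quad
Y = e^{C_0}\, X_1^{C_1}\, X_2^{C_2} \cdots X_n^{C_n}
```

In other words, C(1) through C(n) are the exponents of the power model, and the multiplicative constant is the exponential of C(0). The R-Square value reported in this case describes the fit in the transformed (logarithmic) variables.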
Copyright (c) Namir Shammas. All rights reserved.