The program below calculates the regression coefficients of the following polynomial model:
G(Y) = C0 + C1 F(X) + C2 F(X)^2 + ... + Cn F(X)^n
where X is the independent variable, Y is the dependent variable, and G() and F() are optional transformation functions applied to the regression variables. The program also calculates the coefficient of determination, R-Square.
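The model is linear in the coefficients C0 through Cn, so each observation contributes one row of powers of F(X). The following Python sketch shows how such a row could be built. The function name design_row and its arguments are hypothetical and do not come from the BASIC listing; F defaults to the identity, which corresponds to no transformation.

```python
def design_row(x, order, f=lambda v: v):
    """Return [1, F(x), F(x)^2, ..., F(x)^order] for one observation.

    `f` stands in for the optional transformation F(); it is the
    identity by default (an assumption made for this illustration).
    """
    fx = f(x)
    return [fx ** k for k in range(order + 1)]


# One row for x = 2 and a third-order polynomial.
print(design_row(2.0, 3))
```

Stacking one such row per observation gives the matrix that the regression step works with.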
The program displays the following menu:
POLYNOMIAL REGRESSION
=====================
0) QUIT
1) KEYBOARD INPUT
2) DATA STATEMENT INPUT
3) FILE INPUT
4) FILE OUTPUT
5) CALCULATE REGRESSION
SELECT CHOICE BY NUMBER:
Option 1 allows you to enter data from the keyboard. This option performs the following tasks:
1. Prompts you for the polynomial order.
2. Prompts you for the number of observations.
3. Prompts you for the values in the array of independent variable X.
4. Prompts you for the values of the dependent variable Y.
Option 2 permits you to obtain data from the program's own DATA statements. The DATA statements must provide values for:
1. The polynomial order.
2. The number of observations.
3. The values in the array of independent variable X.
4. The values of the dependent variable Y.
Option 3 allows you to obtain data from a text file. The program prompts you for the input filename. This text file contains values for the following (each value must appear on a separate text line):
1. The polynomial order.
2. The number of observations.
3. The values in the array of independent variable X.
4. The values of the dependent variable Y.
Once the program reads the data, it asks you if you want to transform the data (using the code in subroutine TRSNF). Enter Y or Yes if you want to proceed with the data transformation. Otherwise, enter N or No to bypass the transformation step.
Option 4 allows you to store the current data to a text file. The program prompts you for the output filename.
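The file layout used by options 3 and 4 (order, number of observations, the X values, then the Y values, one value per line) is simple enough to produce or check with other tools. The Python sketch below writes and reads a file in that layout. The function names and the file name poly.dat are hypothetical, and the exact number formatting that the BASIC program writes may differ from what this sketch writes.

```python
import os
import tempfile


def write_data(path, order, xs, ys):
    """Write order, count, X values, then Y values, one value per line."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of values")
    with open(path, "w") as fh:
        for value in [order, len(xs), *xs, *ys]:
            fh.write(f"{value}\n")


def read_data(path):
    """Read a file in the layout above and return (order, xs, ys)."""
    with open(path) as fh:
        tokens = [line.strip() for line in fh if line.strip()]
    order, ndata = int(tokens[0]), int(tokens[1])
    values = [float(t) for t in tokens[2:2 + 2 * ndata]]
    if len(values) != 2 * ndata:
        raise ValueError("file has fewer values than the header promises")
    return order, values[:ndata], values[ndata:]


# Round trip with the sample data from the DATA statements.
path = os.path.join(tempfile.mkdtemp(), "poly.dat")
write_data(path, 3, [0.8, 1.0, 1.2, 1.4, 1.6], [24, 20, 10, 13, 12])
print(read_data(path))
```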
Option 5 runs the polynomial regression calculations, which perform the following tasks:
1. Calculates and displays the regression coefficients C(0), C(1), and so on.
2. Calculates and displays the coefficient of determination R-Square.
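For reference, the two quantities computed by option 5 can be written compactly. The notation here is introduced for illustration and does not appear in the original text: N is the number of observations, A is the N-by-(n+1) matrix with entries A_{ij} = F(x_i)^{j-1}, y is the column of G(Y) values, and c is the column of coefficients C0 through Cn. The listing forms these products with MAT statements, inverts A^T A directly, and evaluates R-Square in the form shown.

```latex
\mathbf{c} = \left(A^{\mathsf T} A\right)^{-1} A^{\mathsf T}\,\mathbf{y},
\qquad
R^2 = \frac{\mathbf{c}^{\mathsf T} A^{\mathsf T}\mathbf{y}
            - \frac{1}{N}\left(\sum_{i=1}^{N} y_i\right)^{2}}
           {\sum_{i=1}^{N} y_i^{2}
            - \frac{1}{N}\left(\sum_{i=1}^{N} y_i\right)^{2}}
```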
The DATA statements contain the data shown in the following table:
X | Y |
0.8 | 24 |
1.0 | 20 |
1.2 | 10 |
1.4 | 13 |
1.6 | 12 |
The above data yield the following results:
C( 0 ) =47.942858998
C( 1 ) =-9.761906484
C( 2 ) =-41.071429946
C( 3 ) =20.833331924
R2 = 0.868504071094
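These figures can be cross-checked outside BASIC. The Python sketch below solves the same normal equations for the sample data, using Gaussian elimination with partial pivoting instead of an explicit matrix inverse, and evaluates R-Square with the same expression as the listing. The function name fit_polynomial is hypothetical. Because the arithmetic and the solution method differ from the BASIC program, the trailing digits should not be expected to match the output above exactly.

```python
def fit_polynomial(xs, ys, order):
    """Least-squares polynomial fit via the normal equations (A^T A) c = A^T y."""
    m = order + 1
    rows = [[x ** k for k in range(m)] for x in xs]
    # Build A^T A and A^T y from the rows of powers.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(m)]
    # Forward elimination with partial pivoting on the augmented matrix.
    aug = [ata[i] + [aty[i]] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(aug[i][k] * coeffs[k] for k in range(i + 1, m))
        coeffs[i] = (aug[i][m] - tail) / aug[i][i]
    # R-Square, using the same expression as the BASIC listing.
    n = len(ys)
    sum_y = sum(ys)
    sum_y2 = sum(y * y for y in ys)
    sum_cx = sum(c * v for c, v in zip(coeffs, aty))
    r2 = (sum_cx - sum_y ** 2 / n) / (sum_y2 - sum_y ** 2 / n)
    return coeffs, r2


xs = [0.8, 1.0, 1.2, 1.4, 1.6]
ys = [24, 20, 10, 13, 12]
coeffs, r2 = fit_polynomial(xs, ys, 3)
for k, c in enumerate(coeffs):
    print(f"C({k}) = {c:.6f}")
print(f"R2 = {r2:.6f}")
```

The printed coefficients and R-Square should land close to the values listed above.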
Here is the BASIC listing:
OPTION TYPO
OPTION NOLET
! POLYNOMIAL REGRESSION
DECLARE NUMERIC I, J, NDATA, NORDER, NORDERP1, C, R2, SUMY, SUMCX, SUMY2
DECLARE STRING A$
DIM X(1,1),Y(1,1),X0(1,1),X1(1,1),Y1(1,1),COEFF(1,1),XD(1,1)
NORDER=0
NDATA=0
DO
PRINT
PRINT TAB(20);"POLYNOMIAL REGRESSION"
PRINT TAB(20);"====================="
PRINT "0) QUIT"
PRINT "1) KEYBOARD INPUT"
PRINT "2) DATA STATEMENT INPUT"
PRINT "3) FILE INPUT"
PRINT "4) FILE OUTPUT"
PRINT "5) CALCULATE REGRESSION"
INPUT PROMPT "SELECT CHOICE BY NUMBER: ":C
IF C=1 THEN
INPUT PROMPT "POLYNOMIAL ORDER? ": NORDER
INPUT PROMPT "NUMBER OF POINTS? ": NDATA
NORDERP1=NORDER+1
MAT REDIM X(NDATA,NORDERP1),Y(NDATA,1),X0(NORDERP1,NDATA),X1(NORDERP1,NORDERP1)
MAT REDIM Y1(NORDERP1,1),COEFF(NORDERP1,1),XD(NDATA,1)
PRINT "ENTER VALUES FOR INDEPENDENT VARIABLE X:"
MAT INPUT XD
PRINT "ENTER VALUES FOR DEPENDENT VARIABLE Y:"
MAT INPUT Y
CALL TRSNF(X(,),Y(,),NDATA,NORDERP1)
ELSEIF C=2 THEN
WHEN ERROR IN
READ NORDER, NDATA
NORDERP1 = NORDER+1
MAT REDIM X(NDATA,NORDERP1),Y(NDATA,1),X0(NORDERP1,NDATA),X1(NORDERP1,NORDERP1)
MAT REDIM Y1(NORDERP1,1),COEFF(NORDERP1,1),XD(NDATA,1)
PRINT "ARRAY X"
FOR I = 1 TO NDATA
READ XD(I,1)
PRINT XD(I,1);
NEXT I
PRINT
PRINT "ARRAY Y"
FOR I = 1 TO NDATA
READ Y(I,1)
PRINT Y(I,1);
NEXT I
PRINT
DATA 3, 5
DATA 0.8, 1, 1.2, 1.4, 1.6
DATA 24, 20, 10, 13, 12
RESTORE
USE
PRINT "ERROR IN READING FROM DATA STATEMENTS"
NDATA = 0
NORDER = 0
END WHEN
ELSEIF C=3 THEN
INPUT PROMPT "ENTER FILENAME? ":A$
WHEN ERROR IN
OPEN #1: NAME A$, ORG TEXT, CREATE OLD, ACCESS INPUT
INPUT #1: NORDER
INPUT #1: NDATA
NORDERP1 = NORDER+1
MAT REDIM X(NDATA,NORDERP1),Y(NDATA,1),X0(NORDERP1,NDATA),X1(NORDERP1,NORDERP1)
MAT REDIM Y1(NORDERP1,1),COEFF(NORDERP1,1),XD(NDATA,1)
PRINT "ARRAY X"
FOR I = 1 TO NDATA
INPUT #1: XD(I,1)
PRINT XD(I,1);
NEXT I
PRINT
PRINT "ARRAY Y"
FOR I = 1 TO NDATA
INPUT #1: Y(I,1)
PRINT Y(I,1);
NEXT I
PRINT
CLOSE #1
INPUT PROMPT "TRANSFORM DATA? (Y/N) ":A$
IF UCASE$(A$)="Y" OR UCASE$(A$)="YES" THEN
CALL TRSNF(X(,),Y(,),NDATA,NORDERP1)
END IF
USE
PRINT "COULD NOT OPEN OR READ FROM FILE ";A$
END WHEN
ELSEIF C=4 AND NORDER*NDATA>0 THEN
INPUT PROMPT "ENTER FILENAME? ":A$
WHEN ERROR IN
OPEN #1: NAME A$, ORG TEXT, CREATE NEWOLD, ACCESS OUTIN
PRINT #1: NORDER
PRINT #1: NDATA
FOR I = 1 TO NDATA
PRINT #1: XD(I,1)
NEXT I
FOR I = 1 TO NDATA
PRINT #1: Y(I,1)
NEXT I
CLOSE #1
USE
PRINT "COULD NOT OPEN OR WRITE TO FILE ";A$
END WHEN
ELSEIF C=5 AND NORDER*NDATA>0 THEN
FOR I = 1 TO NDATA
FOR J = 1 TO NORDERP1
X(I,J) = XD(I,1)^(J-1)
NEXT J
NEXT I
MAT X0=TRN(X)
MAT X1=X0*X
MAT Y1=X0*Y
MAT X1=INV(X1)
MAT COEFF=X1*Y1
FOR I=1 TO NORDERP1
PRINT "COEFF(";I-1;")=";COEFF(I,1)
NEXT I
SUMY=0
SUMCX=0
SUMY2=0
FOR I=1 TO NDATA
SUMY=SUMY+Y(I,1)
SUMY2=SUMY2+Y(I,1)^2
NEXT I
FOR I=1 TO NORDERP1
SUMCX=SUMCX+COEFF(I,1)*Y1(I,1)
NEXT I
R2=(SUMCX-SUMY^2/NDATA)/(SUMY2-SUMY^2/NDATA)
PRINT "R^2=";R2
ELSE
IF C<>0 THEN PRINT "INVALID CHOICE"
END IF
IF C<>0 THEN
PRINT "PRESS ANY KEY TO CONTINUE";
GET KEY I
END IF
LOOP UNTIL C=0
PRINT "END OF PROGRAM"
END
SUB TRSNF(X(,),Y(,),NDATA,NORDERP1)
! DATA TRANSFORMATION
END SUB
The subroutine TRSNF is the place to put any data transformation statements you need. In the listing above it contains no executable statements, so the regression is performed on the untransformed X and Y values. Note that option 5 rebuilds the matrix X from the array XD, which is not passed to TRSNF, so as listed only changes to Y affect the fit.
To perform a logarithmic transformation on the Y values, for example, the subroutine TRSNF would look like this:
SUB TRSNF(X(,),Y(,),NDATA,NORDERP1)
! DATA TRANSFORMATION
! DECLARE THE LOOP COUNTER (NEEDED UNDER OPTION TYPO)
DECLARE NUMERIC I
FOR I = 1 TO NDATA
Y(I,1)=LOG(Y(I,1))
NEXT I
END SUB
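With this transformation the fitted model becomes ln(Y) = C0 + C1 X + ... + Cn X^n, since LOG is the natural logarithm, so a value predicted by the model must be passed through the exponential function to return to the original units. The logarithm is also undefined for zero or negative Y values. The Python sketch below illustrates both points; the function name log_transform_y is hypothetical and is not part of the BASIC program.

```python
import math


def log_transform_y(ys):
    """Replace each Y value with its natural logarithm, in place.

    All values are checked first so the list is never left
    half-transformed when a non-positive value is present.
    """
    bad = [y for y in ys if y <= 0]
    if bad:
        raise ValueError(f"cannot take the log of non-positive values: {bad}")
    ys[:] = [math.log(y) for y in ys]


ys = [24.0, 20.0, 10.0, 13.0, 12.0]
log_transform_y(ys)
print(ys)

# To report a value in the original units, invert with exp().
print([round(math.exp(v), 6) for v in ys])
```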
Copyright (c) Namir Shammas. All rights reserved.