True BASIC Program to Calculate General Polynomial Regression

by Namir Shammas

The program below calculates the regression coefficients for the following polynomial model:

G(Y) = C0 + C1 F(X) + C2 F(X)^2 + ... + Cn F(X)^n

Where X is the independent variable and Y is the dependent variable. In addition, G() and F() are optional transformation functions for the regression variables. The program also calculates the coefficient of determination R-Square.

The program displays the following menu:

                   POLYNOMIAL REGRESSION
                   =====================
0) QUIT
1) KEYBOARD INPUT
2) DATA STATEMENT INPUT
3) FILE INPUT
4) FILE OUTPUT
5) CALCULATE REGRESSION
SELECT CHOICE BY NUMBER:

Option 1 allows you to enter data from the keyboard. This option performs the following tasks:

1. Prompts you for the polynomial order.

2. Prompts you for the number of observations.

3. Prompts you for the values of the independent variable X.

4. Prompts you for the values of the dependent variable Y.

Option 2 permits you to obtain data from the program's own DATA statements. The DATA statements must provide values for:

1. The polynomial order.

2. The number of observations.

3. The values of the independent variable X.

4. The values of the dependent variable Y.

Option 3 allows you to obtain data from a text file. The program prompts you for the input filename. This text file contains values for the following (each value must appear on a separate text line):

1. The polynomial order.

2. The number of observations.

3. The values of the independent variable X.

4. The values of the dependent variable Y.
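
For illustration, an input file carrying the same sample data as the program's own DATA statements (polynomial order 3, five observations, then the X values, then the Y values) might look like the fragment below. The layout follows the order of the INPUT #1 statements in the listing; the particular values are simply the sample data reused.

```text
3
5
0.8
1
1.2
1.4
1.6
24
20
10
13
12
```

Option 4 writes the same one-value-per-line layout, so a file saved there should be readable by option 3.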

Once the program reads the data, it asks whether you want to transform the data (using the code in subroutine TRSNF). Enter Y or Yes to proceed with the data transformation. Otherwise, enter N or No to bypass the transformation step.

Option 4 allows you to store the current data to a text file. The program prompts you for the output filename.

Option 5 runs the polynomial regression calculation, which performs the following tasks:

1. Calculates and displays the regression coefficients C(0), C(1), and so on.

2. Calculates and displays the coefficient of determination R-Square.
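
In matrix terms, the calculation behind option 5 can be summarized as follows. This is a sketch read off the MAT statements in the listing, not a separate derivation. The symbols are notation introduced here: A is the matrix of powers, x_i and y_i are the data values (after any transformation), n is the number of observations, m is the polynomial order, and c is the vector of coefficients C0 to Cm.

```latex
\begin{align*}
A_{ij} &= x_i^{\,j-1}, \qquad i = 1,\dots,n, \quad j = 1,\dots,m+1 \\
\mathbf{c} &= \left(A^{\mathsf T} A\right)^{-1} A^{\mathsf T}\,\mathbf{y} \\
R^2 &= \frac{\mathbf{c}^{\mathsf T} A^{\mathsf T}\mathbf{y} \;-\; \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^{2}}
            {\sum_{i=1}^{n} y_i^{2} \;-\; \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^{2}}
\end{align*}
```

In the listing, X plays the role of A, X0 holds its transpose, X1 holds the product of the transpose with X (and later its inverse), and Y1 holds the product of the transpose with Y.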

The DATA statements specify a polynomial order of 3 and contain the data shown in the following table:

X Y
0.8 24
1.0 20
1.2 10
1.4 13
1.6 12

The above data yield the following results:

C( 0 ) =47.942858998

C( 1 ) =-9.761906484

C( 2 ) =-41.071429946

C( 3 ) =20.833331924

R2 = 0.868504071094
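
As a quick check on these figures, the reported coefficients can be substituted back into the cubic. The following stand-alone True BASIC sketch is separate from the regression listing, and its array and variable names (C, X, YFIT) are illustrative. It evaluates the fitted polynomial at the five sample X values.

```basic
! Stand-alone check, separate from the regression listing.
! C(0)..C(3) hold the coefficients reported above; X holds the sample X data.
DIM C(0 TO 3), X(5)
MAT READ C
MAT READ X
DATA 47.942858998, -9.761906484, -41.071429946, 20.833331924
DATA 0.8, 1, 1.2, 1.4, 1.6
FOR I = 1 TO 5
  LET YFIT = 0
  FOR J = 0 TO 3
    LET YFIT = YFIT + C(J)*X(I)^J   ! add the term Cj * X^j
  NEXT J
  PRINT X(I), YFIT
NEXT I
END
```

The fitted values should come out near 24.51, 17.94, 13.09, 10.94 and 12.51, against observed Y values of 24, 20, 10, 13 and 12.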

Here is the BASIC listing:

OPTION TYPO
OPTION NOLET
! POLYNOMIAL REGRESSION
DECLARE NUMERIC I, J, NDATA, NORDER, NORDERP1, C, R2, SUMY, SUMCX, SUMY2
DECLARE STRING A$
DIM X(1,1),Y(1,1),X0(1,1),X1(1,1),Y1(1,1),COEFF(1,1),XD(1,1)
NORDER=0
NDATA=0
DO
  PRINT
  PRINT TAB(20);"POLYNOMIAL REGRESSION"
  PRINT TAB(20);"====================="
  PRINT "0) QUIT"
  PRINT "1) KEYBOARD INPUT"
  PRINT "2) DATA STATEMENT INPUT"
  PRINT "3) FILE INPUT"
  PRINT "4) FILE OUTPUT"
  PRINT "5) CALCULATE REGRESSION"
  INPUT PROMPT "SELECT CHOICE BY NUMBER: ":C
  IF C=1 THEN
    INPUT PROMPT "POLYNOMIAL ORDER? ": NORDER
    INPUT PROMPT "NUMBER OF POINTS? ": NDATA
    NORDERP1=NORDER+1
    MAT REDIM X(NDATA,NORDERP1),Y(NDATA,1),X0(NORDERP1,NDATA),X1(NORDERP1,NORDERP1)
    MAT REDIM Y1(NORDERP1,1),COEFF(NORDERP1,1),XD(NDATA,1)
    PRINT "ENTER VALUES FOR INDEPENDENT VARIABLE X:"
    MAT INPUT XD
    PRINT "ENTER VALUES FOR DEPENDENT VARIABLE Y:"
    MAT INPUT Y
    CALL TRSNF(X(,),Y(,),NDATA,NORDERP1)
  ELSEIF C=2 THEN
    WHEN ERROR IN
      READ NORDER, NDATA
      NORDERP1 = NORDER+1
      MAT REDIM X(NDATA,NORDERP1),Y(NDATA,1),X0(NORDERP1,NDATA),X1(NORDERP1,NORDERP1)
      MAT REDIM Y1(NORDERP1,1),COEFF(NORDERP1,1),XD(NDATA,1)
      PRINT "ARRAY X"
      FOR I = 1 TO NDATA
        READ XD(I,1)
        PRINT XD(I,1);
      NEXT I
      PRINT
      PRINT "ARRAY Y"
      FOR I = 1 TO NDATA
        READ Y(I,1)
        PRINT Y(I,1);
      NEXT I
      PRINT
      DATA 3, 5
      DATA 0.8, 1, 1.2, 1.4, 1.6
      DATA 24, 20, 10, 13, 12
      RESTORE
    USE
      PRINT "ERROR IN READING FROM DATA STATEMENTS"
      NDATA = 0
      NORDER = 0
    END WHEN
  ELSEIF C=3 THEN
    INPUT PROMPT "ENTER FILENAME? ":A$
    WHEN ERROR IN
      OPEN #1: NAME A$, ORG TEXT, CREATE OLD, ACCESS INPUT
      INPUT #1: NORDER
      INPUT #1: NDATA
      NORDERP1 = NORDER+1
      MAT REDIM X(NDATA,NORDERP1),Y(NDATA,1),X0(NORDERP1,NDATA),X1(NORDERP1,NORDERP1)
      MAT REDIM Y1(NORDERP1,1),COEFF(NORDERP1,1),XD(NDATA,1)
      PRINT "ARRAY X"
      FOR I = 1 TO NDATA
        INPUT #1: XD(I,1)
        PRINT XD(I,1);
      NEXT I
      PRINT
      PRINT "ARRAY Y"
      FOR I = 1 TO NDATA
        INPUT #1: Y(I,1)
        PRINT Y(I,1);
      NEXT I
      PRINT
      CLOSE #1
      INPUT PROMPT "TRANSFORM DATA? (Y/N) ":A$
      IF UCASE$(A$)="Y" OR UCASE$(A$)="YES" THEN
        CALL TRSNF(X(,),Y(,),NDATA,NORDERP1)
      END IF
    USE
      PRINT "COULD NOT OPEN OR READ FROM FILE ";A$
    END WHEN
  ELSEIF C=4 AND NORDER*NDATA>0 THEN
    INPUT PROMPT "ENTER FILENAME? ":A$
    WHEN ERROR IN
      OPEN #1: NAME A$, ORG TEXT, CREATE NEWOLD, ACCESS OUTIN
      PRINT #1: NORDER
      PRINT #1: NDATA
      FOR I = 1 TO NDATA
        PRINT #1: XD(I,1)
      NEXT I
      FOR I = 1 TO NDATA
        PRINT #1: Y(I,1)
      NEXT I
      CLOSE #1
    USE
      PRINT "COULD NOT OPEN OR WRITE TO FILE ";A$
    END WHEN
  ELSEIF C=5 AND NORDER*NDATA>0 THEN
    FOR I = 1 TO NDATA
      FOR J = 1 TO NORDERP1
        X(I,J) = XD(I,1)^(J-1)
      NEXT J
    NEXT I
    MAT X0=TRN(X)
    MAT X1=X0*X
    MAT Y1=X0*Y
    MAT X1=INV(X1)
    MAT COEFF=X1*Y1
    FOR I=1 TO NORDERP1
      PRINT "COEFF(";I-1;")=";COEFF(I,1)
    NEXT I
    SUMY=0
    SUMCX=0
    SUMY2=0
    FOR I=1 TO NDATA
      SUMY=SUMY+Y(I,1)
      SUMY2=SUMY2+Y(I,1)^2
    NEXT I
    FOR I=1 TO NORDERP1
      SUMCX=SUMCX+COEFF(I,1)*Y1(I,1)
    NEXT I
    R2=(SUMCX-SUMY^2/NDATA)/(SUMY2-SUMY^2/NDATA)
    PRINT "R^2=";R2
  ELSE
    IF C<>0 THEN PRINT "INVALID CHOICE"
  END IF

  IF C<>0 THEN
    PRINT "PRESS ANY KEY TO CONTINUE";
    GET KEY I
  END IF
LOOP UNTIL C=0
PRINT "END OF PROGRAM"
END

SUB TRSNF(X(,),Y(,),NDATA,NORDERP1)
! DATA TRANSFORMATION
END SUB

The subroutine TRSNF is the place to put any required data transformation statements. In the current version of the code, that subroutine contains no executable statements, so the polynomial is fitted to the untransformed data.

To perform a logarithmic transformation on the Y values, for example, the subroutine TRSNF would look like this:

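SUB TRSNF(X(,),Y(,),NDATA,NORDERP1)
  ! DATA TRANSFORMATION: REPLACE EACH Y VALUE WITH ITS NATURAL LOGARITHM
  DECLARE NUMERIC I   ! LOOP INDEX, DECLARED IN CASE OPTION TYPO IS IN EFFECT
  FOR I = 1 TO NDATA
    Y(I,1)=LOG(Y(I,1))   ! Y VALUES MUST BE POSITIVE
  NEXT I
END SUB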
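
With a logarithmic transformation of Y, the coefficients that option 5 reports describe LOG(Y) rather than Y itself, so a prediction has to be passed through EXP to get back to the original units. The function below is a minimal sketch of that step. PREDY and its parameter XVAL are hypothetical names, not part of the original listing, and it assumes COEFF holds the coefficients in the same layout the listing uses.

```basic
! Hypothetical helper, not part of the original listing.
! Assumes TRSNF replaced Y with LOG(Y), so COEFF describes LOG(Y).
DEF PREDY(COEFF(,), NORDERP1, XVAL)
  LOCAL J, P
  LET P = 0
  FOR J = 1 TO NORDERP1
    LET P = P + COEFF(J,1)*XVAL^(J-1)   ! polynomial in X, as in option 5
  NEXT J
  LET PREDY = EXP(P)                    ! undo the LOG transformation
END DEF
```

If this is placed after the main program as an external function, the calling program would also need a DECLARE DEF PREDY statement before using it, for example in PRINT PREDY(COEFF, NORDERP1, 1.3).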

 


Copyright (c) Namir Shammas. All rights reserved.