python - Not able to calculate y intercept with statsmodels.api for multiple linear regression


My data set of independent variables is the following:

>>> reg_data_pd
             a         b         c
0     0.794527  0.033651  0.352414
1     0.794914  0.001086  0.093222
2     0.794476  0.004711  0.027977
3     0.776916  0.035780  0.023156
4     0.773526  0.002273  0.035269
5     0.797933  0.001838  0.131261
6     0.806997  0.011498  0.180022
7     0.780709  0.000766  0.522399
8     0.779954  0.001397  0.036386
9     0.756837  0.010448  0.035893
10    0.775064  0.029471  0.036798
11    0.787213  0.013467  0.081323
12    0.757511  0.016465  0.021611
13    0.794530  0.004141  0.157539
14    0.783696  0.019909  0.021765
15    0.793892  0.003597  0.063312
16    0.762702  0.003547  0.052479
17    0.780336  0.004958  0.084910
18    0.787005  0.006372  0.048153
19    0.824416  0.000513  0.045102
20    0.790552  0.009652  0.581571
21    0.773064  0.000889  0.263941
22    0.772039  0.021499  0.260455
23    0.780298  0.022814  0.061621
24    0.794924  0.020585  0.020638
25    0.772452  0.085798  0.215673
26    0.784202  0.000013  0.233638
27    0.822010  0.082684  0.028724
28    0.772587  0.027979  0.118953
29    0.765530  0.006655  0.018605
...        ...       ...       ...
4771  0.968364  0.227303  0.153739
4772  0.968401  0.159052  0.132388
4773  0.959733  0.278948  0.132163
4774  0.957354  0.315088  0.136973
4775  0.954627  0.447764  0.139494
4776  0.952442  0.305559  0.206204
4777  0.948925  0.235244  0.116273
4778  0.953192  0.228221  0.247231
4779  0.952769  0.327529  0.229617
4780  0.954471  0.396722  0.210942
4781  0.955292  0.336075  0.179493
4782  0.950516  0.320840  0.289505
4783  0.950454  0.316647  0.200065
4784  0.947313  0.291446  0.155215
4785  0.945677  0.292084  0.585302
4786  0.951083  0.285946  0.536361
4787  0.943909  0.346754  0.457234
4788  0.941971  0.276125  0.207159
4789  0.945111  0.440802  0.222561
4790  0.951011  0.407192  0.167613
4791  0.925485  0.464954  0.237568
4792  0.926332  0.252929  0.190035
4793  0.931606  0.020075  0.179730
4794  0.929963  0.426511  0.134418
4795  0.941986  0.640994  0.123444
4796  0.943526  0.232498  0.139800
4797  0.945268  0.460201  0.106471
4798  0.953572  0.398044  0.151489
4799  0.947673  0.479376  0.174330
4800  0.952663  0.532027  0.409197

[4801 rows x 3 columns]

and the dependent variable of the data set is:

>>> yu_pd
             y
0     0.290740
1     0.295920
2     0.295920
3     0.192100
4     0.266000
5     0.284700
6     0.284700
7     0.272300
8     0.282680
9     0.243260
10    0.243260
11    0.273150
12    0.273150
13    0.282850
14    0.300325
15    0.192525
16    0.192525
17    0.269620
18    0.286825
19    0.207700
20    0.207700
21    0.292380
22    0.292380
23    0.282600
24    0.278212
25    0.243512
26    0.243512
27    0.309025
28    0.361740
29    0.249520
...        ...
4771  0.251480
4772  0.287500
4773  0.287500
4774  0.282071
4775  0.313343
4776  0.287463
4777  0.287463
4778  0.298700
4779  0.272920
4780  0.272920
4781  0.371314
4782  0.388429
4783  0.305200
4784  0.305200
4785  0.296725
4786  0.287920
4787  0.271580
4788  0.305486
4789  0.318571
4790  0.337975
4791  0.337975
4792  0.319988
4793  0.192360
4794  0.312871
4795  0.323000
4796  0.347088
4797  0.347088
4798  0.324986
4799  0.184320
4800  0.352100

[4801 rows x 1 columns]

My code for calculating the multiple linear regression is the following:

>>> import statsmodels.api as sm
>>> model = sm.OLS(yu_pd, reg_data_pd)
>>> results = model.fit()
>>> results.summary()
<class 'statsmodels.iolib.summary.Summary'>
"""
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.896
Model:                            OLS   Adj. R-squared:                  0.896
Method:                 Least Squares   F-statistic:                 1.379e+04
Date:                Thu, 28 Jan 2016   Prob (F-statistic):               0.00
Time:                        16:45:03   Log-Likelihood:                 6693.6
No. Observations:                4801   AIC:                        -1.338e+04
Df Residuals:                    4798   BIC:                        -1.336e+04
Df Model:                           3
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
a              0.1933      0.002     78.058      0.000         0.188     0.198
b              0.0135      0.005      2.796      0.005         0.004     0.023
c             -0.0221      0.006     -3.984      0.000        -0.033    -0.011
==============================================================================
Omnibus:                      151.028   Durbin-Watson:                   0.452
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              166.568
Skew:                           0.430   Prob(JB):                     6.77e-37
Kurtosis:                       3.306   Cond. No.                         6.75
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
"""

I got the coefficients for 'a', 'b', and 'c', but I did not get the value of the y-intercept.

If you have a regression model with 3 independent variables, why would you expect there to be a unique definition of a y-intercept in the data summary? I think that notion applies when there is a single function of a single independent variable.

Consider the following:

import numpy as np
from numpy import random
import statsmodels.api as sm

x = np.linspace(0,1,100)
X = 0.45*x
X1 = 0.45*x
X3 = np.zeros(200).reshape((100,2))
X3[:,0] = X[:]
X3[:,1] = X1[:]

y = 0.45*x

model = sm.OLS(y, X3)
results = model.fit()
print(results.summary())

                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       1.000
Model:                            OLS   Adj. R-squared:                  1.000
Method:                 Least Squares   F-statistic:                 5.710e+33
Date:                Thu, 28 Jan 2016   Prob (F-statistic):               0.00
Time:                        15:11:53   Log-Likelihood:                 3649.3
No. Observations:                 100   AIC:                            -7297.
Df Residuals:                      99   BIC:                            -7294.
Df Model:                           1
Covariance Type:            nonrobust

                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
x1             0.5000   6.62e-18   7.56e+16      0.000         0.500     0.500
x2             0.5000   6.62e-18   7.56e+16      0.000         0.500     0.500

Now, this is telling me that the coefficients are indeed the slopes of the 2 lines. In your case, the values a, b, and c are the slopes of 3 lines, each taking one independent variable as input and matching the y-variable output.
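A minimal NumPy-only sketch of why both coefficients in the example come out as 0.5, assuming the two input columns are identical as in the code above: any pair of coefficients that sums to 1 fits the data exactly, and a pseudoinverse-based least-squares solver returns the minimum-norm pair. The names `coef` and `rank` are illustrative only.

```python
import numpy as np

x = np.linspace(0, 1, 100)

# Two identical regressor columns, as in the example above
X3 = np.column_stack([0.45 * x, 0.45 * x])
y = 0.45 * x

# lstsq returns the minimum-norm least-squares solution when the
# design matrix is rank deficient; rcond sets the singular-value cutoff
coef, residuals, rank, singular_values = np.linalg.lstsq(X3, y, rcond=1e-10)

print("coefficients:", coef)
print("rank of design matrix:", rank)
```

Because no column of ones is present in `X3`, no intercept term is estimated here; the fit is constrained to pass through the origin.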

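If any of these lines had a y-intercept, as in y = a0 + a*x, where x is the independent variable, you could solve for it from the slope by setting x and y equal to values obtained from the data. For example, if the point x = y = 1 appears in the same row, then set x = y = 1, which gives the y-intercept: a0 = 1 - a. Do the same for the other 2 lines in the fit. I believe this is a complete answer to the question.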

