Python中OLS的Newey-West标准错误?

Newey-West standard errors for OLS in Python?(Python中OLS的Newey-West标准错误?)
本文介绍了Python中OLS的Newey-West标准错误?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着跟版网的小编来一起学习吧!

问题描述

我想要一个相关的系数和 Newey-West 标准误差.

I want to have a coefficient and Newey-West standard error associated with it.

我正在寻找可以执行以下 R 代码正在执行的操作的 Python 库(理想情况下,但任何可行的解决方案都可以):

I am looking for Python library (ideally, but any working solutions is fine) that can do what the following R code is doing:

library(sandwich)
library(lmtest)

a <- matrix(c(1,3,5,7,4,5,6,4,7,8,9))
b <- matrix(c(3,5,6,2,4,6,7,8,7,8,9))

temp.lm = lm(a ~ b)

temp.summ <- summary(temp.lm)
temp.summ$coefficients <- unclass(coeftest(temp.lm, vcov. = NeweyWest))

print (temp.summ$coefficients)

结果:

             Estimate Std. Error   t value  Pr(>|t|)
(Intercept) 2.0576208  2.5230532 0.8155281 0.4358205
b           0.5594796  0.4071834 1.3740235 0.2026817

我得到系数并与它们相关联的标准误差.

I get the coefficients and associated with them standard errors.

我看到 statsmodels.stats.sandwich_covariance.cov_hac 模块,但我不知道如何使它与 OLS 一起使用.

I see statsmodels.stats.sandwich_covariance.cov_hac module, but I don't see how to make it work with OLS.

推荐答案

已编辑(2015 年 10 月 31 日)以反映 2015 年秋季 statsmodels 的首选编码风格.

Edited (10/31/2015) to reflect preferred coding style for statsmodels as fall 2015.

statsmodels 0.6.1 版中,您可以执行以下操作:

In statsmodels version 0.6.1 you can do the following:

import pandas as pd
import numpy as np
import statsmodels.formula.api as smf

df = pd.DataFrame({'a':[1,3,5,7,4,5,6,4,7,8,9],
                   'b':[3,5,6,2,4,6,7,8,7,8,9]})

reg = smf.ols('a ~ 1 + b',data=df).fit(cov_type='HAC',cov_kwds={'maxlags':1})
print reg.summary()

                                OLS Regression Results
==============================================================================
Dep. Variable:                      a   R-squared:                       0.281
Model:                            OLS   Adj. R-squared:                  0.201
Method:                 Least Squares   F-statistic:                     1.949
Date:                Sat, 31 Oct 2015   Prob (F-statistic):              0.196
Time:                        03:15:46   Log-Likelihood:                -22.603
No. Observations:                  11   AIC:                             49.21
Df Residuals:                       9   BIC:                             50.00
Df Model:                           1
Covariance Type:                  HAC
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
Intercept      2.0576      2.661      0.773      0.439        -3.157     7.272
b              0.5595      0.401      1.396      0.163        -0.226     1.345
==============================================================================
Omnibus:                        0.361   Durbin-Watson:                   1.468
Prob(Omnibus):                  0.835   Jarque-Bera (JB):                0.331
Skew:                           0.321   Prob(JB):                        0.847
Kurtosis:                       2.442   Cond. No.                         19.1
==============================================================================

Warnings:
[1] Standard Errors are heteroscedasticity and autocorrelation robust (HAC) using 1 lags and without small sample correction

也可以在拟合模型后使用get_robustcov_results方法:

Or you can use the get_robustcov_results method after fitting the model:

reg = smf.ols('a ~ 1 + b',data=df).fit()
new = reg.get_robustcov_results(cov_type='HAC',maxlags=1)
print new.summary()


                                OLS Regression Results
==============================================================================
Dep. Variable:                      a   R-squared:                       0.281
Model:                            OLS   Adj. R-squared:                  0.201
Method:                 Least Squares   F-statistic:                     1.949
Date:                Sat, 31 Oct 2015   Prob (F-statistic):              0.196
Time:                        03:15:46   Log-Likelihood:                -22.603
No. Observations:                  11   AIC:                             49.21
Df Residuals:                       9   BIC:                             50.00
Df Model:                           1
Covariance Type:                  HAC
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
Intercept      2.0576      2.661      0.773      0.439        -3.157     7.272
b              0.5595      0.401      1.396      0.163        -0.226     1.345
==============================================================================
Omnibus:                        0.361   Durbin-Watson:                   1.468
Prob(Omnibus):                  0.835   Jarque-Bera (JB):                0.331
Skew:                           0.321   Prob(JB):                        0.847
Kurtosis:                       2.442   Cond. No.                         19.1
==============================================================================

Warnings:
[1] Standard Errors are heteroscedasticity and autocorrelation robust (HAC) using 1 lags and without small sample correction

statsmodels 的默认值与 R 中等效方法的默认值略有不同.通过将 vcov, 调用更改为以下内容,可以使 R 方法等效于 statsmodels 默认值(我在上面所做的):

The defaults for statsmodels are slightly different than the defaults for the equivalent method in R. The R method can be made equivalent to the statsmodels default (what I did above) by changing the vcov, call to the following:

temp.summ$coefficients <- unclass(coeftest(temp.lm, 
               vcov. = NeweyWest(temp.lm,lag=1,prewhite=FALSE)))
print (temp.summ$coefficients)

             Estimate Std. Error   t value  Pr(>|t|)
(Intercept) 2.0576208  2.6605060 0.7733945 0.4591196
b           0.5594796  0.4007965 1.3959193 0.1962142

您仍然可以在 pandas (0.17) 中执行 Newey-West,尽管我认为该计划是在 pandas 中弃用 OLS:

You can also still do Newey-West in pandas (0.17), although I believe the plan is to deprecate OLS in pandas:

print pd.stats.ols.OLS(df.a,df.b,nw_lags=1)

-------------------------Summary of Regression Analysis-------------------------

Formula: Y ~ <x> + <intercept>

Number of Observations:         11
Number of Degrees of Freedom:   2

R-squared:         0.2807
Adj R-squared:     0.2007

Rmse:              2.0880

F-stat (1, 9):     1.5943, p-value:     0.2384

Degrees of Freedom: model 1, resid 9

-----------------------Summary of Estimated Coefficients------------------------
      Variable       Coef    Std Err     t-stat    p-value    CI 2.5%   CI 97.5%
 --------------------------------------------------------------------------------
             x     0.5595     0.4431       1.26     0.2384    -0.3090     1.4280
     intercept     2.0576     2.9413       0.70     0.5019    -3.7073     7.8226
*** The calculations are Newey-West adjusted with lags     1

---------------------------------End of Summary---------------------------------

这篇关于Python中OLS的Newey-West标准错误?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!

本站部分内容来源互联网,如果有图片或者内容侵犯了您的权益,请联系我们,我们会在确认后第一时间进行删除!

相关文档推荐

Seasonal Decomposition of Time Series by Loess with Python(Loess 用 Python 对时间序列进行季节性分解)
Resample a time series with the index of another time series(使用另一个时间序列的索引重新采样一个时间序列)
How can I simply calculate the rolling/moving variance of a time series in python?(如何在 python 中简单地计算时间序列的滚动/移动方差?)
How to use Dynamic Time warping with kNN in python(如何在python中使用动态时间扭曲和kNN)
Keras LSTM: a time-series multi-step multi-features forecasting - poor results(Keras LSTM:时间序列多步多特征预测 - 结果不佳)
Python pandas time series interpolation and regularization(Python pandas 时间序列插值和正则化)