
Empirical Bayes Estimates of Parameters from the Logistic Regression Model


ACT Research Report Series 97-6

Walter M. Houston
David J. Woodruff

August 1997

For additional copies write:
ACT Research Report Series
P.O. Box 168
Iowa City, Iowa 52243-0168

© 1997 by ACT, Inc. All rights reserved.

ABSTRACT

Maximum likelihood and least-squares estimates of parameters from the logistic regression model are derived from an iteratively reweighted linear regression algorithm. Empirical Bayes estimates are derived using an m-group regression model to regress the within-group estimates toward common values. The m-group regression model assumes that the parameter vectors from m groups are independent and identically distributed observations from a multivariate normal "prior" distribution. Based on the asymptotic normality of maximum likelihood estimates, the posterior distributions are multivariate normal. Under the assumption that the parameter vectors from the m groups are exchangeable, the hyperparameters of the common prior distribution are estimated using the EM algorithm. Results from an empirical study of the relative stability of the empirical Bayes and maximum likelihood estimates are consistent with those reported previously for the m-group regression model. Estimators that use collateral information from exchangeable groups to regress within-group parameter estimates toward a common value are more stable than estimators calculated exclusively from within-group data.

Logistic regression is used in many areas of substantive interest in the social and biological sciences to model the conditional expectation (probability) of a binary dependent variable as a function of an observed (or latent) vector of covariates.
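The iteratively reweighted linear regression algorithm mentioned in the abstract can be sketched as follows. This is a minimal illustration of standard IRLS (Newton-Raphson) for the logistic model, not the report's own implementation; the function and variable names are illustrative.

```python
import numpy as np

def irls_logistic(X, y, tol=1e-8, max_iter=100):
    """Maximum likelihood estimates for logistic regression via IRLS.

    X : (n, p) design matrix (include a column of ones for the intercept).
    y : (n,) binary responses coded 0/1.
    Returns the (p,) vector of estimated coefficients.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        eta = X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))      # fitted probabilities
        W = p * (1.0 - p)                   # diagonal of the weight matrix
        z = eta + (y - p) / W               # "working" response
        # Weighted least-squares step: beta_new = (X'WX)^{-1} X'Wz
        beta_new = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```

Each iteration solves a weighted least-squares problem in which both the weights and the working response depend on the current fitted probabilities, which is why convergence can fail in small samples, as the text below notes.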
In some applications, the parameters of the logistic regression function must be estimated from small samples, which can result in parameter estimates with large sampling variability. For example, ACT uses logistic regression to model the probability of success within specific college courses. Using current estimation procedures, a minimum sample size of 45 within-course observations is required. Further, because the estimation algorithm for the parameters of the logistic regression model is iterative, parameter estimates based on small samples may fail to converge, or may converge to local, rather than global, stationary points.

One way to stabilize parameter estimates is to use collateral information from "exchangeable" groups (Lindley, 1971) to refine the within-group estimates. Bayesian m-group regression models have been shown to increase the prediction accuracy and stability of the parameter estimates, relative to estimates that ignore collateral information.

In m-group regression models, the parameter vectors within each of m exchangeable groups are assumed to be independent and identically distributed observations with a common probability density function. When this common distribution is treated as a "prior" distribution, posterior distributions and associated inferences follow from standard Bayesian theory (DeGroot, 1970). The effect of the prior distribution is to regress the within-group maximum likelihood estimates toward common values. The extent of the regression effect is inversely related to the precision of the within-group estimates: as the precision of a within-group estimate decreases, the empirical Bayes estimate moves closer to the prior mean; as the precision increases, the maximum likelihood and empirical Bayes estimates converge.
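The precision-weighted compromise described above can be sketched directly: with a normal prior and asymptotically normal maximum likelihood estimates, the posterior mean is a precision-weighted average of the within-group MLE and the prior mean. In this illustrative sketch the prior mean and covariance are taken as known, whereas in the report they are hyperparameters estimated from the m groups via the EM algorithm; all names are hypothetical.

```python
import numpy as np

def empirical_bayes_estimate(beta_mle, cov_mle, mu, cov_prior):
    """Posterior mean of a group's parameter vector under a normal prior.

    beta_mle  : (p,)   within-group maximum likelihood estimate
    cov_mle   : (p, p) asymptotic covariance of beta_mle
    mu        : (p,)   common (prior) mean across groups
    cov_prior : (p, p) prior covariance across groups
    """
    prec_mle = np.linalg.inv(cov_mle)      # within-group precision
    prec_prior = np.linalg.inv(cov_prior)  # prior precision
    # Precision-weighted average: as cov_mle grows (small samples),
    # the estimate is pulled toward the prior mean mu; as it shrinks,
    # the estimate approaches the within-group MLE.
    return np.linalg.solve(prec_mle + prec_prior,
                           prec_mle @ beta_mle + prec_prior @ mu)
```

For example, with equal prior and sampling precisions the posterior mean lies halfway between the MLE and the prior mean, which is the shrinkage behavior the paragraph above describes.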
Estimators that regress within-group parameter estimates toward common values (often referred to as "regressed" or "shrinkage" estimators) are also found in classical theory (James & Stein, 1961; Evans & Stark, 1996).

The parameters of the prior distribution are often referred to as "hyperparameters," to distinguish them from the parameters of the logistic regression function. There are different methods for estimating the hyperparameters. A fully Bayesian analysis requires a prior density for the hyperparameters (Novick, Jackson, Thayer, and Cole, 1972). The posterior density of the parameter vector is found by integrating the joint posterior density of the parameters and hyperparameters with respect to the hyperparameters. Since these integrals are seldom expressible in closed form, numerical integration procedures are often required. Markov chain sampling schemes, such as the Gibbs sampler (Tanner, 1993), are currently under investigation as an alternative to the numerical methods used previously.

Empirical