Heavy-tailed longitudinal data modeling using copulas

Times cited: 59
Authors
Sun, Jiafeng [1]
Frees, Edward W. [1]
Rosenberg, Marjorie A. [1]
Affiliations
[1] Univ Wisconsin, Sch Business, Dept Actuarial Sci Risk Management & Insurance, Madison, WI 53706 USA
Funding
National Science Foundation (USA); Agency for Healthcare Research and Quality (USA)
Keywords
healthcare costs; predictive modeling
DOI
10.1016/j.insmatheco.2007.09.009
Chinese Library Classification number
F [Economics]
Discipline classification code
02
Abstract
In this paper, we consider "heavy-tailed" data, that is, data where extreme values are likely to occur. Heavy-tailed data have been analyzed using flexible distributions such as the generalized beta of the second kind, the generalized gamma and the Burr. These distributions allow us to handle data with either positive or negative skewness, as well as heavy tails. Moreover, it has been shown that they can also accommodate cross-sectional regression models by allowing functions of explanatory variables to serve as distribution parameters. The objective of this paper is to extend this literature to accommodate longitudinal data, where one observes repeated observations of cross-sectional data. Specifically, we use copulas to model the dependencies over time, and heavy-tailed regression models to represent the marginal distributions. We also introduce model exploration techniques to help us with the initial choice of the copula and a goodness-of-fit test of elliptical copulas for model validation. In a longitudinal data context, we argue that elliptical copulas will be typically preferred to the Archimedean copulas. To illustrate our methods, Wisconsin nursing homes utilization data from 1995 to 2001 are analyzed. These data exhibit long tails and negative skewness and so help us to motivate the need for our new techniques. We find that time and the nursing home facility size as measured through the number of beds and square footage are important predictors of future utilization. Moreover, using our parametric model, we provide not only point predictions but also an entire predictive distribution. (C) 2007 Elsevier B.V. All rights reserved.
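The construction summarized in the abstract (a copula for dependence over time, heavy-tailed regression models for the margins) can be illustrated with a minimal simulation sketch. This is not the authors' fitted model. It assumes a Gaussian copula (one member of the elliptical family) with an AR(1) correlation structure, a Burr XII margin, and a single covariate entering through the scale parameter. The function names, the link function, and all parameter values are hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist


def burr_quantile(u, scale, c, k):
    """Inverse CDF of the Burr XII distribution,
    F(y) = 1 - (1 + (y / scale)**c)**(-k), for 0 <= u < 1."""
    return scale * ((1.0 - u) ** (-1.0 / k) - 1.0) ** (1.0 / c)


def simulate_subject(x, n_periods, rho, beta0, beta1, c, k, rng):
    """Simulate one subject's repeated observations.

    Dependence over time: Gaussian copula with AR(1) correlation rho
    (an assumed structure, chosen for simplicity).
    Margins: Burr XII with scale = exp(beta0 + beta1 * x), i.e. the
    covariate x acts on the scale parameter (an assumed link).
    """
    std_normal = NormalDist()
    scale = math.exp(beta0 + beta1 * x)
    z = rng.gauss(0.0, 1.0)  # latent standard normal for the first period
    series = []
    for t in range(n_periods):
        if t > 0:
            # AR(1) update keeps z marginally standard normal
            z = rho * z + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Probability integral transform: normal score -> uniform
        u = min(std_normal.cdf(z), 1.0 - 1e-12)  # guard against u == 1
        # Uniform -> heavy-tailed margin via the Burr quantile function
        series.append(burr_quantile(u, scale, c, k))
    return series


if __name__ == "__main__":
    rng = random.Random(0)
    # Seven periods, matching the length of a 1995-2001 annual panel;
    # every numeric value here is hypothetical.
    print(simulate_subject(x=1.0, n_periods=7, rho=0.8,
                           beta0=0.5, beta1=0.3, c=2.0, k=1.5, rng=rng))
```

Because the dependence lives entirely in the latent normal scores, the margin can be swapped (for example, to another heavy-tailed family) without touching the correlation structure, which is the practical appeal of the copula approach.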
Pages: 817-830
Number of pages: 14