9:30 An introduction to copulas for modelling discrete data: Parametric families and methods of inference.
Response variables in biostatistics or actuarial science are often discrete, and covariates must be taken into account. Examples include claims in actuarial science, familial data (measurements for each member of an extended multi-generation family) in medical genetics, repeated measurements in health studies, and item response data in psychometrics. These data characteristics create many challenges for copula-based models, which were originally developed for continuous responses. Applications of multivariate copula models to discrete data are therefore limited. Usually we have to trade off between models with limited dependence (e.g. only positive association) and models with flexible dependence that are not easy to implement for likelihood inference (Nikoloulopoulos and Karlis, 2008, 2009, 2010). Composite likelihood methods, see e.g. Zhao and Joe (2005) and Genest et al. (2013), and simulated likelihood methods (Nikoloulopoulos, 2013b, 2015) will be discussed as ways to overcome the computational complexities.
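As a small illustration of why likelihood inference becomes costly for discrete responses, the sketch below computes a bivariate probability mass function from a copula by differencing it over the rectangle of adjacent marginal cdf values. The Frank copula, the Poisson margins, and all function names are assumptions chosen for illustration only; they are not taken from the cited papers. In d dimensions the same construction needs 2^d copula evaluations per observation, which is one source of the computational burden mentioned above.

```python
import math


def poisson_cdf(y, mu):
    """P(Y <= y) for Y ~ Poisson(mu); returns 0 for y < 0."""
    if y < 0:
        return 0.0
    term = math.exp(-mu)          # P(Y = 0)
    total = term
    for k in range(1, y + 1):
        term *= mu / k            # P(Y = k) from P(Y = k - 1)
        total += term
    return min(total, 1.0)


def frank_copula(u, v, theta):
    """Frank copula cdf C(u, v; theta), theta != 0 (illustrative choice)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    den = math.expm1(-theta)
    return -math.log1p(num / den) / theta


def joint_pmf(y1, y2, mu1, mu2, theta):
    """P(Y1 = y1, Y2 = y2) as a rectangle probability of the copula."""
    u_hi, u_lo = poisson_cdf(y1, mu1), poisson_cdf(y1 - 1, mu1)
    v_hi, v_lo = poisson_cdf(y2, mu2), poisson_cdf(y2 - 1, mu2)
    return (frank_copula(u_hi, v_hi, theta)
            - frank_copula(u_lo, v_hi, theta)
            - frank_copula(u_hi, v_lo, theta)
            + frank_copula(u_lo, v_lo, theta))


if __name__ == "__main__":
    # Hypothetical parameter values, for demonstration only.
    print(joint_pmf(1, 2, mu1=1.5, mu2=3.0, theta=2.0))
```

Summing `joint_pmf` over one argument recovers the Poisson margin of the other, which is a quick sanity check that the copula leaves the univariate models unchanged.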
11:00 Coffee/Tea and Refreshments
11:30 The weighted scores method for regression with dependent data.
To generalize existing methodology in the literature for regression models with dependent data, we concentrate on inference for the univariate parameters with dependence treated as nuisance. The proposed weighted scores method (Nikoloulopoulos et al., 2011) is an extension of the generalized estimating equations (GEE) since it can also be applied to families that are not in the class of generalized linear regression models. The GEE method is a non-likelihood approach based on a "working correlation" matrix. But for non-normal variables Pearson's correlations have constraints that depend on the univariate margins, which the GEE method ignores. Furthermore, in general, the specified "working correlation" matrix may not correspond to any multivariate distribution for binary or count data, so regression parameter estimates and p-values may lack a theoretical probabilistic basis. In the absence of a multivariate distribution with specified "working correlation" matrix, broad claims of consistency of GEE estimates are incorrect. Our new method not only generalizes but also overcomes the theoretical flaws associated with the GEE procedure because our "working model" is a proper multivariate model, and the parameters in the weight matrices are interpretable as latent correlations. Further, asymptotic and small-sample efficiency calculations show that our method is robust and nearly as efficient as maximum likelihood for fully specified copula models.
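The constraint on Pearson's correlation for non-normal margins can be made concrete in the binary case. The sketch below computes the range of correlations attainable by two Bernoulli variables with given success probabilities, using the Fréchet bounds on the joint probability P(Y1 = 1, Y2 = 1). The function name and the example probabilities are hypothetical and serve only to illustrate the point.

```python
import math


def bernoulli_corr_bounds(p1, p2):
    """Attainable range of Pearson's correlation for two Bernoulli variables.

    p1, p2 are the marginal success probabilities, each strictly in (0, 1).
    The joint probability P(Y1 = 1, Y2 = 1) must lie between the Frechet
    bounds max(0, p1 + p2 - 1) and min(p1, p2).
    """
    sd = math.sqrt(p1 * (1.0 - p1) * p2 * (1.0 - p2))
    p11_min = max(0.0, p1 + p2 - 1.0)
    p11_max = min(p1, p2)
    lower = (p11_min - p1 * p2) / sd
    upper = (p11_max - p1 * p2) / sd
    return lower, upper


if __name__ == "__main__":
    # Hypothetical margins: a rare outcome paired with a balanced one.
    lower, upper = bernoulli_corr_bounds(0.1, 0.5)
    print(f"attainable correlation range: [{lower:.3f}, {upper:.3f}]")
```

With margins 0.1 and 0.5 the attainable range is about -1/3 to 1/3, so a "working correlation" of, say, 0.5 for such a pair does not correspond to any joint distribution. Only when the two margins are both 0.5 does the range extend to the full interval from -1 to 1.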
14:00 The weighted scores method in practice.
This talk describes the features of the R package weightedScores (Nikoloulopoulos and Joe, 2011), which implements the weighted scores method for regression models with dependent discrete data. Its use is thoroughly illustrated with longitudinal binary and count data.
- Genest, C., Nikoloulopoulos, A. K., Rivest, L.-P., and Fortin, M. (2013). Predicting dependent binary outcomes through logistic regressions and meta-elliptical copulas. Brazilian Journal of Probability and Statistics, 27:265–284.
- Nikoloulopoulos, A. K. (2013a). Copula-based models for multivariate discrete response data. In Durante, F., Härdle, W., and Jaworski, P., editors, Copulae in Mathematical and Quantitative Finance, pages 231–249. Springer.
- Nikoloulopoulos, A. K. (2013b). On the estimation of normal copula discrete regression models using the continuous extension and simulated likelihood. Journal of Statistical Planning and Inference, 143:1923–1937.
- Nikoloulopoulos, A. K. and Joe, H. (2011). weightedScores: Weighted scores method for regression with dependent data. R package version 0.9.1.
- Nikoloulopoulos, A. K., Joe, H., and Chaganty, N. R. (2011). Weighted scores method for regression models with dependent data. Biostatistics, 12:653–665.
- Nikoloulopoulos, A. K. (2015). Efficient estimation of high-dimensional multivariate normal copula models with discrete spatial responses. Stochastic Environmental Research and Risk Assessment, in press.
- Nikoloulopoulos, A. K. and Karlis, D. (2010). Modeling multivariate count data using copulas. Communications in Statistics: Simulation and Computation, 39:172–187.
- Nikoloulopoulos, A. K. and Karlis, D. (2009). Finite normal mixture copulas for multivariate discrete data modeling. Journal of Statistical Planning and Inference, 139:3878–3890.
- Nikoloulopoulos, A. K. and Karlis, D. (2008). Multivariate logit copula model with an application to dental data. Statistics in Medicine, 27:6393–6406.
- Panagiotelis, A., Czado, C., and Joe, H. (2012). Pair copula constructions for multivariate discrete data. Journal of the American Statistical Association, 107:1063–1072.
- Zhao, Y. and Joe, H. (2005). Composite likelihood estimation in multivariate data analysis. The Canadian Journal of Statistics, 33(3):335–356.