Ranking Accuracy for Logistic-GEE Models

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

Logistic generalized estimating equations (logistic-GEE) models have been extensively used for analyzing clustered binary data. However, assessing the goodness-of-fit and predictability of these models is problematic because no likelihood is available and the observations can be correlated within a cluster. In this paper we propose a new measure for estimating the generalization performance of logistic-GEE models, namely ranking accuracy for models based on clustered data (RAMCD). We define RAMCD as the probability that a randomly selected positive observation is ranked higher than a randomly selected negative observation from another cluster. We propose a computationally efficient algorithm for RAMCD. The algorithm can be applied in two cases: (1) when we estimate RAMCD as a goodness-of-fit criterion and (2) when we estimate RAMCD as a predictability criterion. This is experimentally shown on clustered data from a simulation study and a biomarkers’ study.
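The definition in the abstract can be illustrated with a naive pairwise estimate. The sketch below is not the computationally efficient algorithm the paper proposes; it is a quadratic-time illustration of the stated definition only. The function name `ramcd_naive`, the 0/1 label coding, and the convention of counting tied scores as half a concordant pair are assumptions made for this example.

```python
from typing import Hashable, Sequence


def ramcd_naive(
    scores: Sequence[float],
    labels: Sequence[int],
    clusters: Sequence[Hashable],
) -> float:
    """Naive O(n^2) estimate of ranking accuracy across clusters.

    Counts, over all (positive, negative) pairs drawn from DIFFERENT
    clusters, the fraction in which the positive observation has the
    higher score. Ties count as 0.5 (an assumption, borrowed from the
    usual AUC convention).
    """
    n = len(scores)
    if len(labels) != n or len(clusters) != n:
        raise ValueError("scores, labels and clusters must have equal length")

    concordant = 0.0
    pairs = 0
    for i in range(n):
        if labels[i] != 1:  # i must be a positive observation
            continue
        for j in range(n):
            # j must be a negative observation from another cluster
            if labels[j] != 0 or clusters[i] == clusters[j]:
                continue
            pairs += 1
            if scores[i] > scores[j]:
                concordant += 1.0
            elif scores[i] == scores[j]:
                concordant += 0.5

    if pairs == 0:
        raise ValueError("no cross-cluster positive/negative pairs")
    return concordant / pairs


if __name__ == "__main__":
    # Two clusters, each with one positive and one negative observation.
    scores = [0.9, 0.7, 0.6, 0.8]
    labels = [1, 0, 1, 0]
    clusters = ["a", "a", "b", "b"]
    print(ramcd_naive(scores, labels, clusters))
```

In this toy input only the two cross-cluster pairs are compared; the within-cluster pairs are skipped, which is what distinguishes the measure from an ordinary AUC over all positive/negative pairs.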
Original language: English
Title of host publication: Lecture Notes in Computer Science
Editors: H Boström, A Knobbe, C Soares, P Papapetrou
Publisher: Springer
Chapter: 2
Pages: 14-25
Volume: 9897
ISBN (Electronic): 978-3-319-46349-0
ISBN (Print): 978-3-319-46348-3
DOI: https://doi.org/10.1007/978-3-319-46349-0_2
Publication status: Published - 21 Sep 2016

Publication series

Series: Advances in Intelligent Data Analysis XV
Volume: 9897
ISSN: 0302-9743

Cite this

Davarzani, N., Peeters, R., Smirnov, E., Karel, J., & Brunner-la Rocca, H. (2016). Ranking Accuracy for Logistic-GEE Models. In H. Boström, A. Knobbe, C. Soares, & P. Papapetrou (Eds.), Lecture Notes in Computer Science (Vol. 9897, pp. 14-25). Springer. Advances in Intelligent Data Analysis XV, Vol. 9897. https://doi.org/10.1007/978-3-319-46349-0_2
Davarzani, Nasser ; Peeters, Ralf ; Smirnov, Evgueni ; Karel, Joël ; Brunner-la Rocca, Hans-peter. / Ranking Accuracy for Logistic-GEE Models. Lecture Notes in Computer Science. editor / H Boström ; A Knobbe ; C Soares ; P Papapetrou. Vol. 9897 Springer, 2016. pp. 14-25 (Advances in Intelligent Data Analysis XV, Vol. 9897).
@inproceedings{91b10424900e43ababe37535432e81f9,
title = "Ranking Accuracy for Logistic-GEE Models",
abstract = "Logistic generalized estimating equations (logistic-GEE) models have been extensively used for analyzing clustered binary data. However, assessing the goodness-of-fit and predictability of these models is problematic because no likelihood is available and the observations can be correlated within a cluster. In this paper we propose a new measure for estimating the generalization performance of logistic-GEE models, namely ranking accuracy for models based on clustered data (RAMCD). We define RAMCD as the probability that a randomly selected positive observation is ranked higher than a randomly selected negative observation from another cluster. We propose a computationally efficient algorithm for RAMCD. The algorithm can be applied in two cases: (1) when we estimate RAMCD as a goodness-of-fit criterion and (2) when we estimate RAMCD as a predictability criterion. This is experimentally shown on clustered data from a simulation study and a biomarkers’ study.",
author = "Nasser Davarzani and Ralf Peeters and Evgueni Smirnov and Jo{\"e}l Karel and {Brunner-la Rocca}, Hans-peter",
year = "2016",
month = "9",
day = "21",
doi = "10.1007/978-3-319-46349-0_2",
language = "English",
isbn = "978-3-319-46348-3",
volume = "9897",
series = "Advances in Intelligent Data Analysis XV",
pages = "14--25",
editor = "H Bostr{\"o}m and A Knobbe and C Soares and P Papapetrou",
booktitle = "Lecture Notes in Computer Science",
publisher = "Springer",
address = "United States",

}

Davarzani, N, Peeters, R, Smirnov, E, Karel, J & Brunner-la Rocca, H 2016, Ranking Accuracy for Logistic-GEE Models. in H Boström, A Knobbe, C Soares & P Papapetrou (eds), Lecture Notes in Computer Science. vol. 9897, Springer, Advances in Intelligent Data Analysis XV, vol. 9897, pp. 14-25. https://doi.org/10.1007/978-3-319-46349-0_2

Ranking Accuracy for Logistic-GEE Models. / Davarzani, Nasser; Peeters, Ralf; Smirnov, Evgueni; Karel, Joël; Brunner-la Rocca, Hans-peter.

Lecture Notes in Computer Science. ed. / H Boström; A Knobbe; C Soares; P Papapetrou. Vol. 9897 Springer, 2016. p. 14-25 (Advances in Intelligent Data Analysis XV, Vol. 9897).

TY  - GEN
T1  - Ranking Accuracy for Logistic-GEE Models
AU  - Davarzani, Nasser
AU  - Peeters, Ralf
AU  - Smirnov, Evgueni
AU  - Karel, Joël
AU  - Brunner-la Rocca, Hans-peter
PY  - 2016/9/21
Y1  - 2016/9/21
N2  - Logistic generalized estimating equations (logistic-GEE) models have been extensively used for analyzing clustered binary data. However, assessing the goodness-of-fit and predictability of these models is problematic because no likelihood is available and the observations can be correlated within a cluster. In this paper we propose a new measure for estimating the generalization performance of logistic-GEE models, namely ranking accuracy for models based on clustered data (RAMCD). We define RAMCD as the probability that a randomly selected positive observation is ranked higher than a randomly selected negative observation from another cluster. We propose a computationally efficient algorithm for RAMCD. The algorithm can be applied in two cases: (1) when we estimate RAMCD as a goodness-of-fit criterion and (2) when we estimate RAMCD as a predictability criterion. This is experimentally shown on clustered data from a simulation study and a biomarkers’ study.
AB  - Logistic generalized estimating equations (logistic-GEE) models have been extensively used for analyzing clustered binary data. However, assessing the goodness-of-fit and predictability of these models is problematic because no likelihood is available and the observations can be correlated within a cluster. In this paper we propose a new measure for estimating the generalization performance of logistic-GEE models, namely ranking accuracy for models based on clustered data (RAMCD). We define RAMCD as the probability that a randomly selected positive observation is ranked higher than a randomly selected negative observation from another cluster. We propose a computationally efficient algorithm for RAMCD. The algorithm can be applied in two cases: (1) when we estimate RAMCD as a goodness-of-fit criterion and (2) when we estimate RAMCD as a predictability criterion. This is experimentally shown on clustered data from a simulation study and a biomarkers’ study.
U2  - 10.1007/978-3-319-46349-0_2
DO  - 10.1007/978-3-319-46349-0_2
M3  - Conference article in proceeding
SN  - 978-3-319-46348-3
VL  - 9897
T3  - Advances in Intelligent Data Analysis XV
SP  - 14
EP  - 25
BT  - Lecture Notes in Computer Science
A2  - Boström, H
A2  - Knobbe, A
A2  - Soares, C
A2  - Papapetrou, P
PB  - Springer
ER  -

Davarzani N, Peeters R, Smirnov E, Karel J, Brunner-la Rocca H. Ranking Accuracy for Logistic-GEE Models. In Boström H, Knobbe A, Soares C, Papapetrou P, editors, Lecture Notes in Computer Science. Vol. 9897. Springer. 2016. p. 14-25. (Advances in Intelligent Data Analysis XV, Vol. 9897). https://doi.org/10.1007/978-3-319-46349-0_2