Ranking Accuracy for Logistic-GEE Models

Nasser Davarzani; Ralf Peeters; Evgueni Smirnov; Joël Karel; Hans-peter Brunner-la Rocca

doi:10.1007/978-3-319-46349-0_2

Nasser Davarzani^*, Ralf Peeters, Evgueni Smirnov, Joël Karel, Hans-peter Brunner-la Rocca

^*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic

Abstract

The logistic generalized estimating equations (logistic-GEE) models have been extensively used for analyzing clustered binary data. However, assessing the goodness-of-fit and predictability of these models is problematic because no likelihood is available and the observations can be correlated within a cluster. In this paper we propose a new measure for estimating the generalization performance of the logistic-GEE models, namely ranking accuracy for models based on clustered data (RAMCD). We define RAMCD as the probability that a randomly selected positive observation is ranked higher than a randomly selected negative observation from another cluster. We propose a computationally efficient algorithm for RAMCD. The algorithm can be applied in two cases: (1) when we estimate RAMCD as a goodness-of-fit criterion and (2) when we estimate RAMCD as a predictability criterion. This is experimentally shown on clustered data from a simulation study and a biomarkers’ study.
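
The definition in the abstract can be illustrated with a minimal sketch. This is not the computationally efficient algorithm proposed in the chapter; it is a naive pairwise estimate written only from the one-sentence definition above. The function name `ramcd_naive`, its parameters, and the choice to count tied scores as half a win are assumptions made for illustration.

```python
from typing import Hashable, Sequence


def ramcd_naive(
    scores: Sequence[float],
    labels: Sequence[int],
    clusters: Sequence[Hashable],
) -> float:
    """Naive pairwise estimate of the ranking accuracy described in the abstract.

    Counts, over all (positive, negative) pairs drawn from DIFFERENT clusters,
    how often the positive observation receives the higher score.
    Illustrative sketch only; the tie rule (0.5) is an assumption.
    """
    wins = 0.0
    pairs = 0
    for s_pos, y_pos, c_pos in zip(scores, labels, clusters):
        if y_pos != 1:
            continue  # outer loop visits positives only
        for s_neg, y_neg, c_neg in zip(scores, labels, clusters):
            if y_neg != 0 or c_neg == c_pos:
                continue  # skip non-negatives and same-cluster pairs
            pairs += 1
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5  # assumed tie handling, not taken from the source
    if pairs == 0:
        raise ValueError("no positive/negative pair from different clusters")
    return wins / pairs


if __name__ == "__main__":
    # Two clusters, each holding one positive and one negative observation.
    scores = [0.3, 0.2, 0.6, 0.4]
    labels = [1, 0, 1, 0]
    clusters = ["a", "a", "b", "b"]
    print(ramcd_naive(scores, labels, clusters))  # prints 0.5
```

In the example, only two cross-cluster pairs exist: the positive in cluster "a" (0.3) against the negative in cluster "b" (0.4), which is ranked wrongly, and the positive in cluster "b" (0.6) against the negative in cluster "a" (0.2), which is ranked correctly. The quadratic loop is only practical for small data sets.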

Original language	English
Title of host publication	IDA 2016: Advances in Intelligent Data Analysis XV
Editors	H Boström, A Knobbe, C Soares, P Papapetrou
Publisher	Springer International Publishing AG
Chapter	2
Pages	14-25
Number of pages	12
ISBN (Electronic)	978-3-319-46349-0
ISBN (Print)	978-3-319-46348-3
DOIs	https://doi.org/10.1007/978-3-319-46349-0_2
Publication status	Published - 21 Sept 2016
Event	15th International Symposium on Intelligent Data Analysis (IDA): IDA 2016 - Stockholm, Sweden Duration: 13 Oct 2016 → 15 Oct 2016

Publication series

Series	Lecture Notes in Computer Science
Volume	9897
ISSN	0302-9743

Symposium

Symposium	15th International Symposium on Intelligent Data Analysis (IDA)
Country/Territory	Sweden
City	Stockholm
Period	13/10/16 → 15/10/16

Keywords

Clustered data
Generalized Estimating Equation
Goodness-of-fit
Predictability
Ranking accuracy
OF-FIT TESTS
LONGITUDINAL DATA-ANALYSIS
CONGESTIVE-HEART-FAILURE
MANAGEMENT

Access to Document

10.1007/978-3-319-46349-0_2

Full text: Final published version, 363 KB. Licence: Taverne

Cite this

@inbook{91b10424900e43ababe37535432e81f9,
  title = "Ranking Accuracy for Logistic-GEE Models",
  abstract = "The logistic generalized estimating equations (logistic-GEE) models have been extensively used for analyzing clustered binary data. However, assessing the goodness-of-fit and predictability of these models is problematic because no likelihood is available and the observations can be correlated within a cluster. In this paper we propose a new measure for estimating the generalization performance of the logistic-GEE models, namely ranking accuracy for models based on clustered data (RAMCD). We define RAMCD as the probability that a randomly selected positive observation is ranked higher than a randomly selected negative observation from another cluster. We propose a computationally efficient algorithm for RAMCD. The algorithm can be applied in two cases: (1) when we estimate RAMCD as a goodness-of-fit criterion and (2) when we estimate RAMCD as a predictability criterion. This is experimentally shown on clustered data from a simulation study and a biomarkers{\textquoteright} study.",
  keywords = "Clustered data, Generalized Estimating Equation, Goodness-of-fit, Predictability, Ranking accuracy, OF-FIT TESTS, LONGITUDINAL DATA-ANALYSIS, CONGESTIVE-HEART-FAILURE, MANAGEMENT",
  author = "Nasser Davarzani and Ralf Peeters and Evgueni Smirnov and Jo{\"e}l Karel and {Brunner-la Rocca}, Hans-peter",
  year = "2016",
  month = sep,
  day = "21",
  doi = "10.1007/978-3-319-46349-0_2",
  language = "English",
  isbn = "978-3-319-46348-3",
  series = "Lecture Notes in Computer Science",
  volume = "9897",
  publisher = "Springer International Publishing AG",
  pages = "14--25",
  editor = "H Bostr{\"o}m and A Knobbe and C Soares and P Papapetrou",
  booktitle = "IDA 2016: Advances in Intelligent Data Analysis XV",
  note = "15th International Symposium on Intelligent Data Analysis (IDA) : IDA 2016 ; Conference date: 13-10-2016 Through 15-10-2016",
}

Davarzani, N, Peeters, R, Smirnov, E, Karel, J & Brunner-la Rocca, H 2016, Ranking Accuracy for Logistic-GEE Models. in H Boström, A Knobbe, C Soares & P Papapetrou (eds), IDA 2016: Advances in Intelligent Data Analysis XV. Springer International Publishing AG, Lecture Notes in Computer Science, vol. 9897, pp. 14-25, 15th International Symposium on Intelligent Data Analysis (IDA), Stockholm, Sweden, 13/10/16. https://doi.org/10.1007/978-3-319-46349-0_2

Ranking Accuracy for Logistic-GEE Models. / Davarzani, Nasser; Peeters, Ralf; Smirnov, Evgueni et al.
IDA 2016: Advances in Intelligent Data Analysis XV. ed. / H Boström; A Knobbe; C Soares; P Papapetrou. Springer International Publishing AG, 2016. p. 14-25 (Lecture Notes in Computer Science, Vol. 9897).

TY  - CHAP
T1  - Ranking Accuracy for Logistic-GEE Models
AU  - Davarzani, Nasser
AU  - Peeters, Ralf
AU  - Smirnov, Evgueni
AU  - Karel, Joël
AU  - Brunner-la Rocca, Hans-peter
PY  - 2016/9/21
Y1  - 2016/9/21
AB  - The logistic generalized estimating equations (logistic-GEE) models have been extensively used for analyzing clustered binary data. However, assessing the goodness-of-fit and predictability of these models is problematic because no likelihood is available and the observations can be correlated within a cluster. In this paper we propose a new measure for estimating the generalization performance of the logistic-GEE models, namely ranking accuracy for models based on clustered data (RAMCD). We define RAMCD as the probability that a randomly selected positive observation is ranked higher than a randomly selected negative observation from another cluster. We propose a computationally efficient algorithm for RAMCD. The algorithm can be applied in two cases: (1) when we estimate RAMCD as a goodness-of-fit criterion and (2) when we estimate RAMCD as a predictability criterion. This is experimentally shown on clustered data from a simulation study and a biomarkers’ study.
KW  - Clustered data
KW  - Generalized Estimating Equation
KW  - Goodness-of-fit
KW  - Predictability
KW  - Ranking accuracy
KW  - OF-FIT TESTS
KW  - LONGITUDINAL DATA-ANALYSIS
KW  - CONGESTIVE-HEART-FAILURE
KW  - MANAGEMENT
U2  - 10.1007/978-3-319-46349-0_2
DO  - 10.1007/978-3-319-46349-0_2
M3  - Chapter
SN  - 978-3-319-46348-3
T3  - Lecture Notes in Computer Science
SP  - 14
EP  - 25
BT  - IDA 2016: Advances in Intelligent Data Analysis XV
A2  - Boström, H
A2  - Knobbe, A
A2  - Soares, C
A2  - Papapetrou, P
PB  - Springer International Publishing AG
T2  - 15th International Symposium on Intelligent Data Analysis (IDA)
Y2  - 13 October 2016 through 15 October 2016
ER  -