Estimation After a Group Sequential Trial

E. Milanzi; G. Molenberghs; A. Alonso; M.G. Kenward; A.A. Tsiatis; M. Davidian; G. Verbeke

doi:10.1007/s12561-014-9112-6

E. Milanzi, G. Molenberghs^*, A. Alonso, M.G. Kenward, A.A. Tsiatis, M. Davidian, G. Verbeke

^*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Group sequential trials are one important instance of studies for which the sample size is not fixed a priori but rather takes one of a finite set of pre-specified values, dependent on the observed data. Much work has been devoted to the inferential consequences of this design feature. Molenberghs et al. (Statistical Methods in Medical Research, 2012) and Milanzi et al. (Properties of estimators in exponential family settings with observation-based stopping rules, 2012) reviewed and extended the existing literature, focusing on a collection of seemingly disparate, but related, settings, namely completely random sample sizes, group sequential studies with deterministic and random stopping rules, incomplete data, and random cluster sizes. They showed that the ordinary sample average is a viable option for estimation following a group sequential trial, for a wide class of stopping rules and for random outcomes with a distribution in the exponential family. Their results are somewhat surprising in the sense that the sample average is not optimal and, further, there does not exist an optimal, or even unbiased, linear estimator. However, the sample average is asymptotically unbiased, both conditionally upon the observed sample size and marginalized over it. By exploiting ignorability, they showed that the sample average is the conventional maximum likelihood estimator. They also showed that a conditional maximum likelihood estimator is finite-sample unbiased, but is less efficient than the sample average and has the larger mean squared error. Asymptotically, the sample average and the conditional maximum likelihood estimator are equivalent. This previous work is restricted, however, to the situation in which the random sample size can take only two values, N = n or N = 2n. In this paper, we consider the more practically useful setting of sample sizes in a finite set {n_1, n_2, ..., n_L}. It is shown that the sample average is then a justifiable estimator, in the sense that it follows from joint likelihood estimation, and it is consistent and asymptotically unbiased. We also show why simulations can give the false impression of bias in the sample average when considered conditional upon the sample size. The consequence is that no corrections need to be made to estimators following sequential trials. When small-sample bias is of concern, the conditional likelihood estimator (CLE) provides a relatively straightforward modification to the sample average. Finally, it is shown that classical likelihood-based standard errors and confidence intervals can be applied, obviating the need for technical corrections.
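
The distinction the abstract draws between the sample average viewed conditionally on the sample size and marginally over it can be explored with a small simulation. The sketch below is illustrative only and is not the simulation from the paper: it assumes standard-normal-variance outcomes, three looks at pre-specified sample sizes, and a hypothetical deterministic rule that stops as soon as the running mean exceeds a threshold. The names `simulate_trial` and `summarize`, the threshold 0.2, and the look sizes are all assumptions made for the example.

```python
import random
from collections import defaultdict
from statistics import fmean


def simulate_trial(rng, mu, looks, threshold):
    """Run one trial; return (sample size at stopping, sample average)."""
    total, n_obs = 0.0, 0
    for n_k in looks:
        # Recruit subjects until the next pre-specified sample size n_k.
        while n_obs < n_k:
            total += rng.gauss(mu, 1.0)
            n_obs += 1
        # Hypothetical stopping rule (an assumption, not the paper's):
        # stop once the running mean exceeds the threshold.
        if total / n_obs > threshold:
            break
    # If no look triggered a stop, the trial ends at the final size.
    return n_obs, total / n_obs


def summarize(mu, looks, threshold, n_trials=5000, seed=1):
    """Return the marginal mean of the sample average and its
    mean conditional on each realized sample size."""
    rng = random.Random(seed)
    by_size = defaultdict(list)
    for _ in range(n_trials):
        n, avg = simulate_trial(rng, mu, looks, threshold)
        by_size[n].append(avg)
    all_avgs = [a for avgs in by_size.values() for a in avgs]
    marginal = fmean(all_avgs)
    conditional = {n: fmean(avgs) for n, avgs in sorted(by_size.items())}
    return marginal, conditional


if __name__ == "__main__":
    # True mean is 0; compare small and large look sizes.
    for looks in [(10, 20, 30), (100, 200, 300)]:
        marginal, conditional = summarize(0.0, looks, 0.2)
        print(looks, round(marginal, 4),
              {n: round(m, 4) for n, m in conditional.items()})
```

Under this assumed rule, trials that stop at the first look necessarily have an average above the threshold, so the averages grouped by realized sample size look far from the true mean even though the average taken over all trials is much closer to it. Rerunning with the larger look sizes gives a feel for how the marginal discrepancy shrinks as the sample sizes grow.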

Original language	English
Pages (from-to)	187-205
Number of pages	19
Journal	Statistics in Biosciences
Volume	7
Issue number	2
Early online date	22 Feb 2014
DOIs	https://doi.org/10.1007/s12561-014-9112-6
Publication status	Published - Oct 2015

Keywords

Exponential family
Frequentist inference
Generalized sample average
Joint modeling
Likelihood inference
Missing at random
Sample average
CLINICAL-TRIALS

Cite this

@article{514fb0cfde594f3190616d50b0f18343,

title = "Estimation After a Group Sequential Trial",

keywords = "Exponential family, Frequentist inference, Generalized sample average, Joint modeling, Likelihood inference, Missing at random, Sample average, CLINICAL-TRIALS",

author = "E. Milanzi and G. Molenberghs and A. Alonso and M.G. Kenward and A.A. Tsiatis and M. Davidian and G. Verbeke",

year = "2015",

month = oct,

doi = "10.1007/s12561-014-9112-6",

language = "English",

volume = "7",

pages = "187--205",

journal = "Statistics in Biosciences",

issn = "1867-1764",

publisher = "Springer Verlag",

number = "2",

}

TY - JOUR

T1 - Estimation After a Group Sequential Trial

AU - Milanzi, E.

AU - Molenberghs, G.

AU - Alonso, A.

AU - Kenward, M.G.

AU - Tsiatis, A.A.

AU - Davidian, M.

AU - Verbeke, G.

PY - 2015/10

Y1 - 2015/10

KW - Exponential family

KW - Frequentist inference

KW - Generalized sample average

KW - Joint modeling

KW - Likelihood inference

KW - Missing at random

KW - Sample average

KW - CLINICAL-TRIALS

U2 - 10.1007/s12561-014-9112-6

DO - 10.1007/s12561-014-9112-6

M3 - Article

SN - 1867-1764

VL - 7

SP - 187

EP - 205

JO - Statistics in Biosciences

JF - Statistics in Biosciences

IS - 2

ER -