Using Firth's method for model estimation and market segmentation based on choice data

Roselinde Kessels*, Bradley Jones, Peter Goos

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Using maximum likelihood (ML) estimation for discrete choice modeling of small datasets causes two problems. The first problem is that the data may exhibit separation, in which case the ML estimates do not exist. Also, provided they exist, the ML estimates are biased. In this paper, we show how to adapt Firth's penalized likelihood estimation for use in discrete choice modeling. A powerful advantage of Firth's estimation is that, unlike ML estimation, it provides useful estimates in the case of data separation. For aggregates of six or more respondents, Firth estimates have negligible bias. For preference estimates on an individual level, Firth estimates show little bias as long as each person evaluates a sufficient number of choice sets. Additionally, Firth's individual-level estimation makes it possible to construct an empirical distribution of the respondents' preferences without imposing any a priori population distribution and to effectively predict people's choices and detect market segments. Segment recovery may even be better when individual-level estimates are obtained using Firth's method instead of hierarchical Bayes estimation under a normal prior. We base all findings on data from a stated choice study on various forms of employee compensation.
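The separation problem and the effect of Firth's penalty can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard form of Firth's penalized log-likelihood, l*(β) = l(β) + ½ log det I(β), applied to a conditional logit model with a single attribute, so that the information I(β) is a scalar. The toy choice sets, the function names, and the golden-section optimizer are all hypothetical choices made for illustration. In the toy data the respondent always picks the alternative with the higher attribute level, so the data are perfectly separated.

```python
import math

def choice_probs(beta, xs):
    """Conditional logit probabilities for one choice set (single attribute)."""
    utilities = [beta * x for x in xs]
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def log_lik(beta, choice_sets):
    """Ordinary log-likelihood: sum of log probabilities of the chosen alternatives."""
    return sum(math.log(choice_probs(beta, xs)[chosen]) for xs, chosen in choice_sets)

def information(beta, choice_sets):
    """Fisher information for a scalar beta: summed within-set variance of the attribute."""
    total = 0.0
    for xs, _ in choice_sets:
        p = choice_probs(beta, xs)
        mean = sum(pj * xj for pj, xj in zip(p, xs))
        total += sum(pj * (xj - mean) ** 2 for pj, xj in zip(p, xs))
    return total

def firth_log_lik(beta, choice_sets):
    """Penalized log-likelihood: log-likelihood plus half the log of the information."""
    return log_lik(beta, choice_sets) + 0.5 * math.log(information(beta, choice_sets))

def golden_section_max(f, lo, hi, tol=1e-8):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:        # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2

# Hypothetical toy data: (attribute levels of the alternatives, index of the chosen one).
# The alternative with the higher level is always chosen, i.e. perfect separation.
choice_sets = [([0.0, 1.0], 1), ([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 0)]

# The ordinary log-likelihood keeps increasing with beta, so no finite ML estimate exists.
ml_values = [log_lik(b, choice_sets) for b in (1.0, 5.0, 10.0)]

# The penalized log-likelihood has a finite maximizer.
beta_firth = golden_section_max(lambda b: firth_log_lik(b, choice_sets), -10.0, 10.0)
print(ml_values, beta_firth)
```

For this particular toy data set the penalized objective reduces to 4.5 log σ(β) + 0.5 log(1 − σ(β)) plus a constant, where σ is the logistic function, so the search should return a value close to log 9 ≈ 2.197, whereas the unpenalized log-likelihood has no finite maximizer. With several attributes, the scalar information would be replaced by the information matrix and its determinant.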
Original language: English
Pages (from-to): 1-21
Number of pages: 21
Journal: Journal of Choice Modelling
Publication status: Published - Jun 2019
Externally published: Yes
Event: Interdisciplinary Choice Workshop (ICW 2018) - Santiago, Chile
Duration: 7 Aug 2018 to 10 Aug 2018


Keywords

  • Discrete choice modeling
  • Data separation
  • Firth's penalized maximum likelihood
  • Hierarchical Bayes estimation
  • Individual-level estimates
  • Market segmentation

