Flexible and efficient simulation-based inference for models of decision-making

Curation statements for this article:
  • Curated by eLife

Abstract

Inferring parameters of computational models that capture experimental data is a central task in cognitive neuroscience. Bayesian statistical inference methods usually require the ability to evaluate the likelihood of the model; however, for many models of interest in cognitive neuroscience, the associated likelihoods cannot be computed efficiently. Simulation-based inference (SBI) offers a solution to this problem by requiring only access to simulations produced by the model. Previously, Fengler et al. (2021) introduced likelihood approximation networks (LANs), which make it possible to apply SBI to models of decision-making but require billions of simulations for training. Here, we provide a new SBI method that is substantially more simulation-efficient. Our approach, mixed neural likelihood estimation (MNLE), trains neural density estimators on model simulations to emulate the simulator and is designed to capture both the continuous (e.g., reaction times) and discrete (choices) data of decision-making models. The likelihoods of the emulator can then be used to perform Bayesian parameter inference on experimental data using standard approximate inference methods like Markov Chain Monte Carlo sampling. We demonstrate MNLE on two variants of the drift-diffusion model and show that it is substantially more efficient than LANs: MNLE achieves similar likelihood accuracy with six orders of magnitude fewer training simulations and is significantly more accurate than LANs when both are trained with the same budget. Our approach enables researchers to perform SBI on custom-tailored models of decision-making, leading to fast iteration of model design for scientific discovery.
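
For readers who want a concrete picture of the workflow the abstract describes, here is a minimal sketch in Python. It assumes the interface of the open-source `sbi` toolbox, which ships an MNLE implementation (exact class and method names may differ across versions); `simulate_ddm`, `x_o`, and the prior bounds are hypothetical placeholders for illustration, not the setup used in the paper.

```python
# Hypothetical sketch of the MNLE workflow (assumed `sbi` interface; names may vary).
import torch
from sbi.inference import MNLE
from sbi.utils import BoxUniform

# Illustrative prior over decision-model parameters (e.g., drift, boundary, start, non-decision time).
prior = BoxUniform(low=torch.tensor([-2.0, 0.5, 0.3, 0.2]),
                   high=torch.tensor([2.0, 2.0, 0.7, 1.0]))

# 1) Simulate a training set of ~1e5 (theta, x) pairs, where each x holds a
#    continuous reaction time and a discrete choice for one trial.
theta = prior.sample((100_000,))
x = simulate_ddm(theta)  # user-supplied simulator (placeholder), shape (100_000, 2)

# 2) Train the mixed neural likelihood estimator, i.e., the emulator of the simulator.
trainer = MNLE(prior=prior)
likelihood_estimator = trainer.append_simulations(theta, x).train()

# 3) Combine the emulated likelihood with the prior and sample the posterior
#    for observed trials x_o (placeholder, shape (num_trials, 2)) via MCMC.
posterior = trainer.build_posterior(likelihood_estimator)
samples = posterior.sample((1_000,), x=x_o)
```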

Article activity feed

  1. Evaluation Summary:

    This paper provides a new approach, the Mixed Neural Likelihood Estimator (MNLE), to build likelihood emulators for simulation-based models where the likelihood is unavailable. The authors show that the MNLE approach is equally accurate but orders of magnitude more efficient than a recent proposal, likelihood approximation networks (LAN), on two variants of the drift-diffusion model (a widely used model in cognitive neuroscience). The comparison between the LAN and MNLE approaches could be improved to strengthen the merits of the proposed approach over existing alternatives.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #2 agreed to share their name with the authors.)

  2. Reviewer #1 (Public Review):

    This work is an important contribution to simulator-based inference, substantially improving over previous work by Fengler et al. (2021) with ideas from other modern work in likelihood-free inference (largely driven by the authors' group).

    The authors provide a technique, the Mixed Neural Likelihood Estimator (MNLE), to efficiently build a likelihood emulator (i.e., an approximation of the likelihood function) for simulator-based models for which the likelihood is typically unavailable. The strength of this approach is that the emulated likelihood can then be flexibly plugged in wherever a likelihood is needed, using *any* desired inference scheme and hierarchical structure at inference time. Moreover, it is important to note that, unlike other likelihood-free inference approaches, this method learns an emulator of the likelihood per trial (or per observation), i.e., there are no summary statistics involved and, modulo approximation errors, this method could match exact inference (unlike methods based on non-sufficient summary statistics).

    Another thing to note is that, unlike previous work from the authors, this approach amortizes the training of the likelihood emulator (which needs to be done once per model), but does not amortize inference itself (i.e., the modeller still needs to run MCMC or any other inference method).

    MNLE is similar in spirit to the likelihood approximation networks (LANs) proposed by Fengler et al. (2021), but arguably better in every respect. As well argued by this paper, both in principle and empirically, the main issue with LANs is that they use density estimation to estimate the likelihood *independently* for each parameter setting and then train a neural network to effectively interpolate between these density estimates. Instead, MNLE uses *conditional* density estimation, which trains a density network while sharing information across different parameter settings, an approach that is orders of magnitude more efficient. MNLE performs conditional estimation with mixed observations (discrete and continuous) by first learning a model of the discrete variables and then a model of the continuous variables conditioned on the discrete observations.
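
    To make this two-step factorization concrete, here is a deliberately simplified PyTorch sketch. It is not the authors' architecture (MNLE uses a conditional normalizing flow for the continuous part; the log-normal head below is only an illustrative stand-in), and all names are hypothetical.

    ```python
    import torch
    import torch.nn as nn

    class MixedLikelihoodNet(nn.Module):
        """Simplified emulator of p(choice, rt | theta) = p(choice | theta) * p(rt | theta, choice)."""

        def __init__(self, dim_theta: int, hidden: int = 64):
            super().__init__()
            # Discrete part: Bernoulli logit of the choice given the parameters.
            self.choice_net = nn.Sequential(
                nn.Linear(dim_theta, hidden), nn.ReLU(), nn.Linear(hidden, 1)
            )
            # Continuous part: parameters of a log-normal rt density given (theta, choice).
            # (MNLE itself uses a conditional normalizing flow here instead.)
            self.rt_net = nn.Sequential(
                nn.Linear(dim_theta + 1, hidden), nn.ReLU(), nn.Linear(hidden, 2)
            )

        def log_prob(self, theta, choice, rt):
            """theta: (N, dim_theta); choice: (N,) float in {0., 1.}; rt: (N,) positive."""
            # log p(choice | theta) under a Bernoulli with predicted logits.
            logits = self.choice_net(theta).squeeze(-1)
            log_p_choice = -nn.functional.binary_cross_entropy_with_logits(
                logits, choice, reduction="none"
            )
            # log p(rt | theta, choice) under a log-normal with predicted location/scale.
            mu, log_sigma = self.rt_net(
                torch.cat([theta, choice.unsqueeze(-1)], dim=-1)
            ).unbind(-1)
            log_p_rt = torch.distributions.LogNormal(mu, log_sigma.exp()).log_prob(rt)
            return log_p_choice + log_p_rt

    # Training maximizes the joint log-likelihood of simulated (theta, choice, rt) triples:
    #     loss = -net.log_prob(theta_batch, choice_batch, rt_batch).mean()
    ```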

    On top of the enormous gain in training efficiency (10^5 simulations required for training MNLE vs. 10^11 for LAN), the paper shows that MNLE performs at least as well as (and often better than) LANs on a variety of quality-of-approximation and quality-of-inference metrics (e.g., error, calibration). The authors also show results with a very useful technique, simulation-based calibration (SBC; Talts et al., 2018), which should become a gold standard.
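
    For readers unfamiliar with SBC, the core of the procedure (Talts et al., 2018) is easy to sketch. The function below is a generic illustration, not the paper's implementation; `prior`, `simulator`, and `sample_posterior` are placeholders for whatever inference pipeline is being checked.

    ```python
    import torch

    def sbc_ranks(prior, simulator, sample_posterior, num_runs=200, num_posterior_samples=100):
        """Generic simulation-based calibration (Talts et al., 2018).

        For each run: draw theta* from the prior, simulate data x*, draw posterior
        samples given x*, and record the rank of theta* within those samples.
        For well-calibrated inference, ranks are uniform on {0, ..., num_posterior_samples}.
        """
        ranks = []
        for _ in range(num_runs):
            theta_star = prior.sample((1,))                  # shape (1, dim_theta)
            x_star = simulator(theta_star)                   # simulated "observed" data
            post_samples = sample_posterior(x_star, num_posterior_samples)  # (L, dim_theta)
            ranks.append((post_samples < theta_star).sum(dim=0))  # rank per parameter
        return torch.stack(ranks)  # (num_runs, dim_theta); test each column for uniformity
    ```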

    Important limitations that are worth highlighting and could perhaps be discussed a bit more explicitly in the paper are:
    - The current example models (drift-diffusion models) have a very low-dimensional observation structure (one binary observation + one continuous observation per trial). This is not necessarily a problem, as many models in cognitive neuroscience share exactly this structure (one discrete + one continuous variable per trial), but it is worth mentioning.
    - The method works for i.i.d. data (see the factorization sketched after this list). Any additional structure or dependence (e.g., adding parameters to characterize the stimulus shown in each trial) effectively increases the dimensionality of the likelihood approximation the network needs to learn. For reference, the current examples explore models of medium-low dimensionality (4-5 dimensions). This is mentioned briefly in Section 4.4.
    - Related to the two points above, the study does not truly discuss or explore the scalability of the method. While previous related work (by some of the authors) has shown remarkable results, such as the ability to infer posteriors over up to ~30 parameters (with careful selection and tuning of the neural architecture; Gonçalves et al., 2020), the scalability of the current approach is not analyzed here.
    - Also related, like any neural-network-based approach, it seems there is some art (and brute-force search) required in selecting the hyperparameters and architecture of the network, and it is unclear how far the method can be applied out of the box to different models. For example, in Section 4.5 the authors say that they started with standard hyperparameter choices but ended up having to perform a hyperparameter search over multiple dimensions (number of hidden layers, hidden units, neural spline transforms, spline bins) to achieve the results presented in the paper. In short, while training an MNLE *given the hyperparameters* might require a small number of simulations (10^5), in practice one would need to account for the further cost of finding the correct hyperparameters for training the emulator (presumably requiring an exploration of at least 10-100 hyperparameter settings).
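
    As a brief illustration of the i.i.d. point in the second bullet above, the per-trial emulator suffices because the likelihood of a whole session factorizes over trials:

    $$
    \log p(x_{1:N} \mid \theta) \;=\; \sum_{n=1}^{N} \log p(c_n, \mathrm{rt}_n \mid \theta)
    \;=\; \sum_{n=1}^{N} \Big[ \log p(c_n \mid \theta) + \log p(\mathrm{rt}_n \mid \theta, c_n) \Big],
    $$

    where $c_n$ and $\mathrm{rt}_n$ are the choice and reaction time on trial $n$ of $N$. Any trial-by-trial covariates (e.g., stimulus descriptors) enter as additional conditioning inputs of this per-trial density, which is what increases the dimensionality the network has to cover.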

    Notably, all these limitations also apply to LANs, so while it is important to acknowledge them, it is also clear that MNLE is a radical improvement over the previous approach along every axis.

    In short, this is a strong contribution to the field of computational methods for statistical inference in the sciences (here applied to a common class of models in cognitive neuroscience), and I expect that this method, and others built on top of or inspired by it, will have a large impact on the field, even more so since all code has been made available by the authors.

  3. Reviewer #2 (Public Review):

    This paper describes and evaluates a novel inference algorithm for "likelihood-free" models, i.e., models whose likelihood is difficult or impossible to evaluate analytically. A paradigmatic example of a likelihood-free model in cognitive neuroscience is a variant of the drift-diffusion model (DDM) with collapsing boundaries. In this case, no analytical solution for the joint probability of response times and choices is available, and one has to resort to computationally intensive numerical methods. This computational burden typically prevents the use of such models for quantitative data analysis. In this work, the authors propose a generic method for solving such problems efficiently, with the hope of significantly broadening the family of models that can be employed in computational cognitive neuroscience. The authors call their method "Mixed Neural Likelihood Estimation" (MNLE). They compare MNLE to a similar method called "Likelihood Approximation Network" (LAN), which was recently proposed to solve the exact same sort of problem in a very similar manner. The main result of the work is that MNLE seems to be more efficient than LAN.
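
    As a concrete illustration of why such models are likelihood-free, a minimal simulator for a DDM with linearly collapsing boundaries might look as follows. This is a generic Euler-Maruyama sketch for illustration only; the parameterization and collapse schedule are assumptions, not the exact model used in the paper.

    ```python
    import numpy as np

    def simulate_ddm_collapsing(drift, boundary, collapse_rate, ndt,
                                dt=1e-3, max_t=5.0, rng=None):
        """Simulate one trial of a drift-diffusion model with linearly collapsing bounds.

        Returns (choice, rt): choice is 1 for the upper bound, 0 for the lower bound.
        The joint density of (choice, rt) has no closed form here, so likelihoods must
        be obtained numerically -- the cost that emulators like LAN or MNLE remove.
        """
        rng = np.random.default_rng() if rng is None else rng
        x, t = 0.0, 0.0
        while t < max_t:
            b = max(boundary - collapse_rate * t, 0.05)  # symmetric bound, collapsing over time
            if x >= b:
                return 1, t + ndt
            if x <= -b:
                return 0, t + ndt
            x += drift * dt + np.sqrt(dt) * rng.standard_normal()
            t += dt
        # No boundary crossed within max_t: censor using the sign of the accumulator.
        return int(x > 0), max_t + ndt
    ```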

    Let me first reiterate the value I see in methods such as LAN or MNLE. In brief, I believe that the field of cognitive neuroscience currently suffers from the fact that researchers perform data analyses using a very restricted class of models, namely those that enjoy computationally cheap likelihoods. This is problematic because it ultimately constrains investigators' scientific creativity in many ways. In fact, some (otherwise arguably overly simplistic) models have now become standards in the field, most likely because they are the only ones that can be easily fitted to empirical data. The "vanilla" DDM is such an example. In retrospect, it is somewhat absurd that even minor modifications to this model cannot be considered in most related empirical studies. Methods such as LAN or MNLE offer elegant (and generic) solutions to this problem. This makes the authors' contribution important, timely, and hence likely to be very well received by the community.

    Now, the MNLE method does not do anything more than the LAN method. In fact, its core algorithmic principles are identical to those of LAN. Both methods start with the premise that the cumbersome stochastic model simulations required for deriving a model's likelihood function are essentially the same, irrespective of the empirical data. In turn, both methods approximate the model's likelihood function by training artificial neural networks (ANNs) on massive model simulations that are performed beforehand, once and for all. When fitting model parameters to a given empirical dataset, one then uses the trained ANNs to compute the likelihood function, which is orders of magnitude cheaper than simulating the model. The only difference between MNLE and LAN lies in the way the underlying ANNs are trained.
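
    The amortization described here can be made explicit with a small sketch: the expensive step (training the network on simulations) happens once, after which every likelihood evaluation inside a sampler is a cheap forward pass. Below, `emulator_log_prob` is a hypothetical stand-in for a trained LAN or MNLE network evaluated on the observed data, and the sampler is a generic random-walk Metropolis, not the specific scheme used in the paper.

    ```python
    import numpy as np

    def metropolis_sample(emulator_log_prob, log_prior, theta0,
                          num_steps=5000, step=0.05, rng=None):
        """Random-walk Metropolis using an emulated likelihood.

        `emulator_log_prob(theta)` returns the summed per-trial log-likelihood of the
        observed data under `theta` -- a single network pass instead of thousands of
        fresh model simulations per parameter setting.
        """
        rng = np.random.default_rng() if rng is None else rng
        theta = np.asarray(theta0, dtype=float)
        log_post = emulator_log_prob(theta) + log_prior(theta)
        samples = []
        for _ in range(num_steps):
            proposal = theta + step * rng.standard_normal(theta.shape)
            log_post_prop = emulator_log_prob(proposal) + log_prior(proposal)
            if np.log(rng.uniform()) < log_post_prop - log_post:  # accept/reject
                theta, log_post = proposal, log_post_prop
            samples.append(theta.copy())
        return np.array(samples)
    ```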

    This has one main implication, which triggers the main concern I have with this paper. In my view, the value of MNLE depends upon whether it is more efficient (or more robust) than LAN. However, the authors report little evidence for this. This is why I think this paper would make a significant contribution to the field, provided the authors improve their comparison of the MNLE and LAN methods.