
Monolix 2018R1 User guide

  1. Monolix Documentation
  2. Data and models
    1. Defining a data set
    2. Creating and using models
      1. Libraries of models
      2. Outputs and Tables
    3. Models for continuous outcomes
      1. Residual error model
      2. Handling censored (BLQ) data
      3. Mixture of structural models
    4. Models for non continuous outcomes
      1. Time-to-event data models
      2. Count data model
      3. Categorical data model
    5. Joint models for multivariate outcomes
      1. Joint models for continuous outcomes
      2. Joint models for non continuous outcomes
    6. Models for the individual parameters
      1. Introduction
      2. Probability distribution of the individual parameters
      3. Model for individual covariates
      4. Inter occasion variability (IOV)
      5. Mixture of distributions
    7. Pharmacokinetic models
      1. PK model: single route of administration
      2. PK model: multiple routes of administration
      3. From multiple doses to steady-state
    8. Extensions
      1. Using regression variables
      2. Bayesian estimation
      3. Delayed differential equations
  3. Tasks
    1. Initialization
    2. Population parameter estimation using SAEM
    3. Conditional distribution
    4. EBEs
    5. Standard error using the Fisher Information Matrix
    6. Log Likelihood estimation
    7. Algorithms convergence assessment
    8. What result files are generated by Monolix?
    9. Tests
    10. Monolix API
      1. API concerning the covariate models
      2. API concerning the observation models
      3. API concerning the population parameters
      4. API concerning the individual parameter models
      5. API concerning the scenario
      6. API concerning the results
      7. API concerning the project management
      8. API concerning the settings
  4. Plots
    1. Data
      1. Observed data
    2. Model for the observations
      1. Individual fits
      2. Observation versus Prediction
      3. Scatter plot of the residuals
      4. Distribution of the residuals
    3. Model for the individual parameters
      1. Distribution of the individual parameters
      2. Distribution of the random effects
      3. Correlation between the random effects
      4. Individual parameters versus covariates
    4. Predictive checks and predictions
      1. Visual predictive checks
      2. Numerical predictive checks
      3. BLQ predictive checks
      4. Prediction distribution
    5. Tasks results
      1. Likelihood contribution
      2. Standard errors of the estimates
    6. Convergence diagnosis
      1. SAEM
      2. MCMC
      3. Importance sampling
    7. Export charts
  5. FAQ
    1. Evolutions from Monolix2016R1 to Monolix2018R1
    2. Submission of Monolix analysis to regulatory agencies
    3. Running Monolix using a command line
    4. How to compute AUC using Monolix and Mlxtran
    5. How to export to Datxplore, Mlxplore and Simulx?

1.Monolix Documentation

Version 2018

This documentation is for Monolix Suite 2018.
©Lixoft

Monolix

Monolix (Non-linear mixed-effects models or “MOdèles NOn LInéaires à effets miXtes” in French) is a reference platform for model-based drug development. It combines the most advanced algorithms with unique ease of use. Pharmacometricians in preclinical and clinical groups can rely on Monolix for population analysis and to model PK/PD and other complex biochemical and physiological processes. Monolix is an easy, fast and powerful tool for parameter estimation in non-linear mixed effect models, model diagnosis and assessment, and advanced graphical representation. Monolix is the result of a ten-year research program in statistics and modeling, led by Inria (Institut National de la Recherche en Informatique et Automatique), on non-linear mixed effect models for advanced population analysis, PK/PD, and pre-clinical and clinical trial modeling & simulation.

Objectives

The objectives of Monolix are to perform:

  1. Parameter estimation for nonlinear mixed effects models
  2. Model selection and diagnosis
  3. Easy description of pharmacometric models (PK, PK-PD, discrete data) with the Mlxtran language
  4. Goodness of fit plots

An interface for ease of use

Monolix can be used either via a graphical user interface (GUI) or a command-line interface (CLI) for powerful scripting. This means less programming and more focus on exploring models and pharmacology to deliver on time. The interface is depicted as follows:

The GUI consists of 7 tabs.

Each of these tabs refers to a specific section on this website. An advanced description of the available plots is also provided.

2.Data and models

In the following, all demos of Monolix are presented. They were built to explore all the functionalities of Monolix in terms of model creation, management of continuous and non continuous outcomes, joint models for multivariate outcomes, models for the individual parameters, pharmacokinetic models, and some extensions.

Defining a data set

Creating and using models

  • Libraries of models: learn how to use the Monolix libraries of PKPD models and create your own libraries.
  • Use your own model: learn how to use your own libraries created from scratch or from the libraries.
  • Outputs and Tables: learn how to define outputs and create tables with selected outputs of the model.
  • Residual error model: learn how to use the predefined residual error models.
  • Handling censored data: learn how to easily and properly handle censored data, i.e. data below (resp. above) a lower (resp. upper) limit of quantification (LOQ) or detection (LOD).
  • Mixture of structural models: learn how to implement between subject mixture models (BSMM) and within subject mixture models (WSMM).
  • Time-to-event data model: learn how to implement a model for (repeated) time-to-event data.
  • Count data model: learn how to implement a model for count data, including hidden Markov model.
  • Categorical data model: learn how to implement a model for categorical data, assuming either independence or a Markovian dependence between observations.

2.1.Defining a data set

Data set structure

The data set contains, for each subject, the measurements, dose regimen, covariates, etc., i.e. all collected information. The data must be in the long format, i.e. each line corresponds to one individual and one time point. Different types of information (dose, observation, covariate, etc.) are recorded in different columns, which must be tagged with a column type (see below). The column types are very similar to, and compatible with, the structure used by the NONMEM software (the differences are listed here). The user specifies the type of each column of the data set as in the following picture.

Notice that Monolix often provides an initial guess of the column type based on the column name.
In addition, the DATA VIEWER button allows the user to explore the data set, as in Datxplore.
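As an illustration of the long format described above, here is a minimal Python sketch that writes a tiny data set with one line per individual and time point. The column names (ID, TIME, AMT, Y, WEIGHT), the values, and the use of "." for entries that do not apply to a line are assumptions made for this example, not a prescribed layout.

```python
import csv
import io

# Hypothetical long-format data set: one line per individual and time point.
# Column names are free; the comment lists the kind of information each holds.
header = ["ID", "TIME", "AMT", "Y", "WEIGHT"]  # subject, time, dose, observation, covariate
rows = [
    [1, 0.0, 100, ".", 70],   # dose line for subject 1 (no observation)
    [1, 1.0, ".", 8.2, 70],   # response lines for subject 1
    [1, 4.0, ".", 5.1, 70],
    [2, 0.0, 100, ".", 82],   # dose line for subject 2
    [2, 2.0, ".", 6.4, 82],   # response line for subject 2
]

# Write the header line followed by the data lines as comma-separated text.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
data_text = buffer.getvalue()
print(data_text)
```

Each column of such a file would then be tagged with a column type when the data set is loaded.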

Description of column-types

The first line of the data set must be a header line, defining the names of the columns. The column names are completely free. In the MonolixSuite applications, when defining the data, the user will be asked to assign each column to a column-type (see here for an example of this step). The column type indicates to the application how to interpret the information in that column. The available column types are given below:

Column-types used for all types of lines:

Column-types used for response-lines:

Column-types used for dose-lines:

 

Labeling

The name proposed in the figure and in the data choice is the one defined in the label. The user can modify it. By default, the label used is the one defined in the data set.

 

Loading a new data set

To load a new data set, click on “Browse” to select your data set (green frame), tag all the columns (blue frame), and click on the button ACCEPT (purple frame), as in the following figure.

 

Observation type

There are three types of observations:

  • continuous: The observation is continuous with respect to time. For example, a concentration is a continuous observation.
  • discrete: The observation takes its values in a finite categorical space. For example, the observation can be a categorical observation (an effect can be observed as low, medium, high) or a count observation over a defined time (the number of epileptic seizures in a defined time).
  • event: The observation is an event, for example the occurrence of an epileptic seizure.

The type of observations can be specified by the user in the interface.

2.2.Creating and using models

2.2.1.Libraries of models


Objectives: learn how to use the Monolix libraries of models and use your own models.


Projects: theophylline_project, PDsim_project, warfarinPK_project, TMDD_project, LungCancer_project, hcv_project


For the definition of the structural model, the user can either select a model from the available model libraries or write a model themselves using the Mlxtran language.
Discover how to easily choose a model from the libraries via step-by-step selection of its characteristics. An enriched PK, a PD, a joint PKPD, a target-mediated drug disposition (TMDD), and a time-to-event (TTE) library are now available.







Model libraries

Five different model libraries are available in Monolix, which we will detail below. To use a model from the libraries, in the Structural model tab, click on Load from library and select the desired library. A list of model files appears, as well as a menu to filter them. Use the filters and the indications in the file name (parameter names) to select the model file you need.

The model files are simply text files that contain pre-written models in Mlxtran language. Once selected, the model appears in the Monolix GUI. Below we show the content of the (ka,V,Cl) model:

The PK library

  • theophylline_project (data = ‘theophylline_data.txt’ , model=’lib:oral1_1cpt_kaVCl.txt’)

The PK library includes models with different administration routes (bolus, infusion, first-order absorption, zero-order absorption, with or without Tlag), different numbers of compartments (1, 2 or 3 compartments), and different types of elimination (linear or Michaelis-Menten). More details, including the full equations of each model, can be found on the dedicated page for the model libraries.

The PK library models can be used with single- or multiple-dose data, but they allow only one type of administration in the data set (only oral or only bolus, but not some individuals with bolus and some with oral, for instance). If a model for several types of administration is needed, see below “Using my own model”.

 

 

The PD library

  • PDsim_project (data = ‘PDsim_data.txt’ , model=’lib:immed_Emax_const.txt’)

The PD model library contains direct response models such as Emax and Imax with various baseline models, and turnover response models. These models are PD models only and the drug concentration over time must be defined in the data set and passed as a regressor. The logic of the file names and the full equations are described on the PK/PD model library webpage.

 

 

The PKPD library

  • warfarinPKPD_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_IndirectModelInhibitionKin_TlagkaVClR0koutImaxIC50.txt’)

The PKPD library contains joint PKPD models, which correspond to the combination of the models from the PK and from the PD library. These models contain two outputs, and thus require the definition of two observation identifiers (i.e. two different values in the OBSERVATION ID column).

 

 

The TMDD library

  • TMDD_project (data = ‘TMDD_dataset.csv’ , model=’lib:bolus_2cpt_MM_VVmKmClQV2_outputL.txt’)

The TMDD library contains various models for molecules displaying target-mediated drug disposition (TMDD). It includes models with different administration routes (bolus, infusion, first-order absorption, zero-order absorption, bolus + first-order absorption, with or without Tlag), different numbers of compartments (1 or 2 compartments), different types of TMDD models (full model, MM approximation, QE/QSS approximation, etc.), and different types of output (free ligand or total free+bound ligand). More details about the library and guidelines to choose a model can be found on the dedicated TMDD documentation page.

 

 

The TTE library

  • LungCancer_project (data = ‘lung_cancer_survival.csv’ , model=’lib:gompertz_model_singleEvent.txt’)

The TTE library contains typical parametric models for time-to-event (TTE) data. TTE models are defined via the hazard function. The library provides exponential, Weibull, log-logistic, uniform, Gompertz, gamma and generalized gamma models, for data with single (e.g. death) and multiple events (e.g. seizure) per individual. More details and modeling guidelines can be found on the TTE dedicated webpage, along with case studies.

 

 

Step-by-step example with the PK library

  • theophylline_project (data = ‘theophylline_data.txt’ , model=’lib:oral1_1cpt_kaVCl.txt’)

We would like to set up a one-compartment PK model with first-order absorption and linear elimination for the theophylline data set. We start by creating a new Monolix project. Next, in the Data tab, we click BROWSE and select the theophylline data set (which can be downloaded from the data set documentation webpage). In this example, all columns are already automatically tagged, based on the header names. We click ACCEPT and NEXT and arrive on the Structural model tab, where we click on LOAD FROM LIBRARY to choose a model from the Monolix libraries. The menu at the top allows us to filter the list of models: after selecting an oral/extravascular administration, no delay, first-order absorption, one compartment and a linear elimination, two models remain in the list, (ka,V,Cl) and (ka,V,k). Click on the oral1_1cpt_kaVCl.txt file to select it.

After this step, the GUI moves to the Initial Estimates tab, but it is possible to go back to the Structural model tab to see the content of the file:

[LONGITUDINAL]
input = {ka, V, Cl}

EQUATION:
Cc = pkmodel(ka, V, Cl)

OUTPUT:
output = Cc

Back to the Initial Estimates tab, the initial values of the population parameters can be adjusted by comparing the model prediction using the chosen population parameters and the individual data. Click on SET AS INITIAL VALUES when you are done.

In the next tab, the Statistical model & Tasks tab, we propose by default:

At this stage, the Monolix project should be saved. This creates a human-readable text file with extension .mlxtran, which contains all the information defined via the GUI. In particular, the name of the model file appears in the section [LONGITUDINAL] of the saved project file:

<MODEL>
[INDIVIDUAL]
input = {ka_pop, omega_ka, V_pop, omega_V, Cl_pop, omega_Cl}

DEFINITION:
ka = {distribution=lognormal, typical=ka_pop, sd=omega_ka}
V = {distribution=lognormal, typical=V_pop, sd=omega_V}
Cl = {distribution=lognormal, typical=Cl_pop, sd=omega_Cl}

[LONGITUDINAL]
input = {a, b}
file = 'lib:oral1_1cpt_kaVCl.txt'

DEFINITION:
CONC = {distribution=normal, prediction=Cc, errorModel=combined1(a,b)}

2.2.2.Outputs and Tables


Objectives: learn how to define outputs and create tables from the outputs of the model.


Projects: tgi_project, tgiWithTable_project


About the OUTPUT block

  • tgi_project (data = ‘tgi_data.txt’ , model=’tgi_model.txt’)

We use the Tumor Growth Inhibition (TGI) model proposed by Ribba et al. in this example (Ribba, B., Kaloshi, G., Peyre, M., Ricard, D., Calvez, V., Tod, M., ... & Ducray, F., A tumor growth inhibition model for low-grade glioma treated with chemotherapy or radiotherapy. Clinical Cancer Research, 18(18), 5071-5080, 2012.)

DESCRIPTION: Tumor Growth Inhibition (TGI) model proposed by Ribba et al
A tumor growth inhibition model for low-grade glioma treated with chemotherapy or radiotherapy. 
Clinical Cancer Research, 18(18), 5071-5080, 2012.

Variables 
- PT: proliferative tissue
- QT: nonproliferative quiescent tissue
- QP: damaged quiescent cells 
- C:  concentration of a virtual drug encompassing the 3 chemotherapeutic components of the PCV regimen

Parameters
- K      : maximal tumor size (should be fixed a priori)
- KDE    : the rate constant for the decay of the PCV concentration in plasma
- kPQ    : the rate constant for transition from proliferation to quiescence
- kQpP   : the rate constant for transfer from damaged quiescent tissue to proliferative tissue 
- lambdaP: the rate constant of growth for the proliferative tissue
- gamma  : the rate of damages in proliferative and quiescent tissue
- deltaQP: the rate constant for elimination of the damaged quiescent tissue
- PT0    : initial proliferative tissue
- QT0    : initial nonproliferative quiescent tissue

[LONGITUDINAL]
input = {K, KDE, kPQ, kQpP, lambdaP, gamma, deltaQP, PT0, QT0}

PK:
depot(target=C)

EQUATION:
; Initial conditions
t0    = 0
C_0   = 0
PT_0  = PT0
QT_0  = QT0
QP_0  = 0

; Dynamical model
PSTAR   = PT + QT + QP
ddt_C   = -KDE*C
ddt_PT  = lambdaP*PT*(1-PSTAR/K) + kQpP*QP - kPQ*PT - gamma*KDE*PT*C
ddt_QT  = kPQ*PT - gamma*KDE*QT*C
ddt_QP  = gamma*KDE*QT*C - kQpP*QP - deltaQP*QP

OUTPUT:
output = PSTAR

PSTAR is the tumor size predicted by the model. It is therefore used as a prediction for the observations in the project.
At the end of the scenario or of SAEM, individual predictions of the tumor size PSTAR are computed using the individual parameters available. Specifically, they are computed using the conditional modes (indPred_mode), the conditional means (indPred_mean), and the conditional means estimated during the last iterations of SAEM (indPred_SAEM), and are saved in the table predictions.txt.
Notice that the population prediction is also proposed.
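To make the dynamics of the model file more concrete, the following Python sketch integrates the same four differential equations with a hand-written fourth-order Runge-Kutta scheme and returns PSTAR over time. It is only an illustration and not how Monolix solves the system: the parameter values are invented, a single dose added to C at time 0 is assumed, and the function names are hypothetical.

```python
def tgi_rhs(state, p):
    """Right-hand side of the TGI equations, mirroring the EQUATION block."""
    C, PT, QT, QP = state
    pstar = PT + QT + QP
    dC = -p["KDE"] * C
    dPT = (p["lambdaP"] * PT * (1 - pstar / p["K"]) + p["kQpP"] * QP
           - p["kPQ"] * PT - p["gamma"] * p["KDE"] * PT * C)
    dQT = p["kPQ"] * PT - p["gamma"] * p["KDE"] * QT * C
    dQP = p["gamma"] * p["KDE"] * QT * C - p["kQpP"] * QP - p["deltaQP"] * QP
    return [dC, dPT, dQT, dQP]


def rk4_step(state, p, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = tgi_rhs(state, p)
    k2 = tgi_rhs([s + 0.5 * h * k for s, k in zip(state, k1)], p)
    k3 = tgi_rhs([s + 0.5 * h * k for s, k in zip(state, k2)], p)
    k4 = tgi_rhs([s + h * k for s, k in zip(state, k3)], p)
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def simulate_pstar(p, dose, t_end, h=0.01):
    """Return the times and PSTAR values after a single dose given at t = 0."""
    # Initial conditions as in the model file; the dose is added to C at t = 0
    # (assumption made for this sketch).
    state = [0.0 + dose, p["PT0"], p["QT0"], 0.0]
    times, pstar = [0.0], [p["PT0"] + p["QT0"]]
    n_steps = int(round(t_end / h))
    for i in range(1, n_steps + 1):
        state = rk4_step(state, p, h)
        times.append(i * h)
        pstar.append(state[1] + state[2] + state[3])
    return times, pstar


# Illustrative values only, not estimates from tgi_project.
params = {"K": 100.0, "KDE": 0.3, "kPQ": 0.03, "kQpP": 0.003, "lambdaP": 0.1,
          "gamma": 1.0, "deltaQP": 0.01, "PT0": 5.0, "QT0": 40.0}
times, pstar = simulate_pstar(params, dose=1.0, t_end=50.0)
print(f"PSTAR at t=0: {pstar[0]:.2f}, at t=50: {pstar[-1]:.2f}")
```

In Monolix the same computation is carried out for each individual with that individual's estimated parameters, which is what produces the indPred columns mentioned above.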

Remark: the same model file tgi_model.txt can be used with different tools, including Mlxplore or Simulx (see this Shiny application for instance).

Add additional outputs in tables





  • tgiWithTable_project (data = ‘tgi_data.txt’ , model=’tgiWithTable_model.txt’)

We can save in the tables additional variables defined in the model, such as PT, QT and QP for instance, by adding a table statement to the block OUTPUT: of the model file:

OUTPUT:
output = PSTAR
table  = {PT, QT, QP}

An additional file tables.txt now includes the predicted values of these variables for each individual (columns PT_SAEM, QT_SAEM, QP_SAEM, PT_mean, QT_mean, QP_mean, PT_mode, QT_mode, and QP_mode).

Notice that only continuous variables can be listed in table.

Good to know: it is not allowed to do calculations directly in the output or table statement. The following example is not possible:

; not allowed:
OUTPUT:
output = {Cser+Ccsf}

It has to be replaced by:

EQUATION:
Ctot = Cser+Ccsf
OUTPUT:
output = {Ctot}

 

2.3.Models for continuous outcomes

2.3.1.Residual error model


Objectives: learn how to use the predefined residual error models.


Projects: warfarinPKlibrary_project, bandModel_project, autocorrelation_project, errorGroup_project





Introduction

For continuous data, we are going to consider scalar outcomes (\(y_{ij} \in \mathbb{R}\)) and assume the following general model:

$$y_{ij}=f(t_{ij},\psi_i)+ g(t_{ij},\psi_i,\xi)\varepsilon_{ij}$$

for i from 1 to N, and j from 1 to \(n_i\), where \(\psi_i\) is the parameter vector of the structural model f for individual i. The residual error model is defined by the function g, which depends on an additional vector of parameters \(\xi\). The residual errors \((\varepsilon_{ij})\) are standardized Gaussian random variables (mean 0 and standard deviation 1). In this case, \(f(t_{ij}, \psi_i)\) and \(g(t_{ij}, \psi_i, \xi)\) are the conditional mean and standard deviation of \(y_{ij}\), i.e.,

$$\mathbb{E}(y_{ij} | \psi_i) = f(t_{ij}, \psi_i)~~\textrm{and}~~\textrm{sd}(y_{ij} | \psi_i)= g(t_{ij}, \psi_i, \xi)$$

Available error models

In Monolix, we only consider the function g to be a function of the structural model f, i.e. \(g(t_{ij}, \psi_i, \xi)= g(f(t_{ij}, \psi_i), \xi)\)  leading to an expression of the observation model of the form

$$y_{ij}=f(t_{ij},\psi_i)+ g(f(t_{ij}, \psi_i), \xi)\varepsilon_{ij}$$

The following error models are available:

  • constant : \(y = f + a \varepsilon\). The function g is constant, and the additional parameter is \(\xi=a\)
  • proportional : \(y = f + bf^c \varepsilon\). The function g is proportional to the structural model f, and the additional parameters are \(\xi = (b,c)\). By default, the parameter c is fixed at 1 and the additional parameter is \(\xi = b\).
  • combined1 : \(y = f + (a+ bf^c) \varepsilon\). The function g is a linear combination of a constant term and a term proportional to the structural model f, and the additional parameters are \(\xi = (a, b)\) (by default, the parameter c is fixed at 1).
  • combined2 : \(y = f + \sqrt{a^2+ b^2(f^c)^2} \varepsilon\). The function g combines a constant term a and a term proportional to the structural model f, \(bf^c\), as the square root of the sum of their squares, and the additional parameters are \(\xi = (a, b)\) (by default, the parameter c is fixed at 1).

Notice that the parameter c is fixed to 1 by default. However, it can be unfixed and estimated.
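The four error models can be compared numerically with a short Python sketch that evaluates the standard deviation g(f, ξ) of the residual error for a given prediction f. The function name and the numerical values are illustrative assumptions; only the formulas come from the list above.

```python
import math


def residual_sd(model, f, a=0.0, b=0.0, c=1.0):
    """Standard deviation g(f, xi) of the residual error for a prediction f."""
    if model == "constant":
        return a
    if model == "proportional":
        return b * f ** c
    if model == "combined1":
        return a + b * f ** c
    if model == "combined2":
        return math.sqrt(a ** 2 + b ** 2 * (f ** c) ** 2)
    raise ValueError(f"unknown error model: {model}")


# Illustrative values: prediction f = 10, a = 0.5, b = 0.1, c left at its default of 1.
for name in ("constant", "proportional", "combined1", "combined2"):
    print(name, residual_sd(name, f=10.0, a=0.5, b=0.1))
```

With the same a and b, combined2 always gives a standard deviation no larger than combined1, since the square root of a sum of squares is at most the sum of the (non-negative) terms.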
The assumption that the distribution of any observation \(y_{ij}\) is symmetrical around its predicted value is a very strong one. If this assumption does not hold, we may want to transform the data to make it more symmetric around its (transformed) predicted value. In other cases, constraints on the values that observations can take may also lead us to transform the data.

Available transformations

The model can be extended to include a transformation of the data:

$$u(y_{ij})=u(f(t_{ij},\psi_i)) + g(u(f(t_{ij},\psi_i)) ,\xi)\varepsilon_{ij} $$

As we can see, both the data \(y_{ij}\) and the structural model f are transformed by the function u, so that \(f(t_{ij}, \psi_i)\) remains the prediction of \(y_{ij}\). Classical distributions are proposed as transformations:

  • normal: u(y) = y. This is equivalent to no transformation.
  • lognormal: u(y) = log(y). Thus, for a combined error model for example, the corresponding observation model writes \(\log(y) = \log(f) + (a + b\log(f)) \varepsilon\). It assumes that all observations are strictly positive; otherwise, an error message is thrown. In case of censored data with a limit, the limit has to be strictly positive too.
  • logitnormal: u(y) = log(y/(1-y)). Thus, for a combined error model for example, the corresponding observation model writes \(\log(y/(1-y)) = \log(f/(1-f)) + (a + b\log(f/(1-f)))\varepsilon\). It assumes that all observations are strictly between 0 and 1. However, these bounds can be modified: the logit function can be defined between a minimum and a maximum, in which case the function u becomes u(y) = log((y-y_min)/(y_max-y)). Again, in case of censored data with a limit, the limit has to be strictly inside the chosen interval too.
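The effect of the logit transformation with bounds can be sketched in Python as follows. The example simulates observations under a constant error model on the transformed scale, u(y) = u(f) + a ε, and maps them back to the original scale. The function names, the bounds (0, 100) and the values of f and a are assumptions chosen for illustration.

```python
import math
import random


def logit_transform(y, y_min=0.0, y_max=1.0):
    """u(y) = log((y - y_min) / (y_max - y)) for y strictly inside the bounds."""
    if not y_min < y < y_max:
        raise ValueError("y must lie strictly between y_min and y_max")
    return math.log((y - y_min) / (y_max - y))


def inverse_logit_transform(u, y_min=0.0, y_max=1.0):
    """Inverse of logit_transform: maps a real u back into (y_min, y_max)."""
    return y_min + (y_max - y_min) / (1.0 + math.exp(-u))


def simulate_observation(f, a, y_min, y_max, rng):
    """Draw y such that u(y) = u(f) + a * eps, with eps standard normal."""
    eps = rng.gauss(0.0, 1.0)
    u = logit_transform(f, y_min, y_max) + a * eps
    return inverse_logit_transform(u, y_min, y_max)


# Illustrative values: prediction close to the upper bound of (0, 100).
rng = random.Random(0)
samples = [simulate_observation(95.0, 0.5, 0.0, 100.0, rng) for _ in range(1000)]
print(min(samples), max(samples))
```

Because the noise is added on the transformed scale, the simulated values never leave the interval, and their spread shrinks as the prediction approaches a bound.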

Wondering which formula lies behind your observation model? The FORMULA button on the interface, shown in the figure below, displays the observation model linking the observation (named CONC in that case) and the prediction (named Cc in that case). Note that \(\varepsilon\) is denoted e there.

Remarks: In previous Monolix versions, only the error model could be chosen. What happens to the error models that are no longer proposed, such as “exponential”, “logit”, “band(0,10)”, and “band(0,100)”? They are still available: in this version, the observation model is split into an error model and a distribution. The purpose is to give a more unified view of the models and to increase the number of possibilities. Here is how to configure new projects with the previous error model definitions:

  • “exponential” is an observation model with a constant error model and a lognormal distribution.
  • “logit” is an observation model with a constant error model and a logitnormal distribution.
  • “band(0,10)” is an observation model with a constant error model and a logitnormal distribution with min and max at 0 and 10 respectively.
  • “band(0,100)” is an observation model with a constant error model and a logitnormal distribution with min and max at 0 and 100 respectively.

Defining the residual error model from the Monolix GUI

A menu in the frame Statistical model|Tasks of the main GUI allows one to select both the error model and the distribution, as in the following figure (in green and blue respectively).

A summary of the statistical model, which includes the residual error model, can be displayed by clicking on the button FORMULA.

Some basic residual error models

  • warfarinPKlibrary_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

The residual error model used with this project for fitting the PK of warfarin is a combined error model, i.e. \(y_{ij} = f(t_{ij}, \psi_i) + (a + b f(t_{ij}, \psi_i))\varepsilon_{ij}\). Several diagnostic plots can then be used to evaluate the error model. The observation versus prediction figure below looks satisfactory.

Remarks:

  • Figures showing the shape of the prediction interval for each observation model available in Monolix are displayed here.
  • When the residual error model is defined in the GUI, a block DEFINITION: is then automatically added to the project file in the section [LONGITUDINAL] of <MODEL> when the project is saved:
DEFINITION:
y1 = {distribution=normal, prediction=Cc, errorModel=combined1(a,b)}

Residual error models for bounded data

  • bandModel_project (data = ‘bandModel_data.txt’, model = ‘lib:immed_Emax_null.txt’)

In this example, data are known to take their values between 0 and 100. We can use a constant error model and a logitnormal distribution with bounds (0,100) if we want to take this constraint into account.

In the Observation versus prediction plot, one can see that the error is smaller when the observations are close to 0 and 100, as expected. To assess the relevance of the predictions, one can look at the 90% prediction interval. With a logitnormal distribution, this prediction interval has a very different shape, which reflects the bounds.

VPCs obtained with this error model do not show any misspecification.

This residual error model is implemented in Mlxtran as follows:

DEFINITION:
effect = {distribution=logitnormal, min=0, max=100, prediction=E, errorModel=constant(a)}

Autocorrelated residuals

For any subject i, the residual errors \((\varepsilon_{ij},1 \leq j \leq n_i)\) are usually assumed to be independent random variables. The extension to autocorrelated errors is possible by assuming that \((\varepsilon_{ij})\) is a stationary autoregressive process of order 1, AR(1), whose autocorrelation decreases exponentially:

$$ \textrm{corr}(\varepsilon_{ij},\varepsilon_{i,{j+1}}) = r_i^{(t_{i,j+1}-t_{ij})}$$

where \(0 \leq r_i \leq 1\) for each individual i. If \(t_{ij}=j\) for any (i,j), then \(t_{i,j+1}-t_{i,j}=1\) and the autocorrelation function \(\gamma_i\) for individual i is given by

$$\gamma_i(\tau) = \textrm{corr}(\varepsilon_{ij}, \varepsilon_{i,j+\tau}) = r_i^{\tau}$$

The residual errors are uncorrelated when \(r_i=0\).
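As a numerical illustration of this exponentially decaying autocorrelation, the Python sketch below builds the correlation matrix of the residual errors of one individual from the sampling times, using corr = r^|Δt| for every pair of times. The function name and the example time grid are assumptions made for this example.

```python
def ar1_correlation_matrix(times, r):
    """Correlation r ** |t_k - t_j| between the residual errors at times t_j and t_k."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be between 0 and 1")
    return [[r ** abs(tk - tj) for tk in times] for tj in times]


# Irregular time grid (illustrative): close time points are more correlated.
corr = ar1_correlation_matrix([0.0, 0.5, 1.0, 3.0], r=0.8)
for row in corr:
    print([round(value, 3) for value in row])
```

With r = 0 the matrix reduces to the identity, which corresponds to the usual assumption of independent residual errors.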

  • autocorrelation_project (data = ‘autocorrelation_data.txt’, model = ‘lib:infusion_1cpt_Vk.txt’)

Autocorrelation is estimated since the checkbox r is ticked in this project:

Estimated population parameters now include the autocorrelation r:

Important remarks:

  • Monolix accepts both regular and irregular time grids.
  • Rich data (i.e. a large number of time points per individual) are required to properly estimate the autocorrelation structure of the residual errors.
  • To add autocorrelation, the user should either use the connectors, or write it directly in the Mlxtran file:
    • add “autoCorrCoef=r” in the definition of the observation, for example “DV = {distribution=normal, prediction=Cc, errorModel=proportional(b), autoCorrCoef=r}”
    • add “r” as an input parameter.

Using different error models per group/study

  • errorGroup_project (data = ‘errorGroup_data.txt’, model = ‘errorGroup_model.txt’)

Data come from 3 different studies in this example. We want to have the same structural model but use different error models for the 3 studies. A solution consists in tagging the column STUDY with the reserved keyword OBSERVATION ID. It is then possible to define one error model per outcome:
We use here the same PK model for the 3 studies:

[LONGITUDINAL]
input = {V, k}

PK:
Cc1 = pkmodel(V, k)
Cc2 = Cc1
Cc3 = Cc1

OUTPUT:
output = {Cc1, Cc2, Cc3}

Since 3 outputs are defined in the structural model, we can now define 3 error models in the GUI:
Different residual error parameters are estimated for the 3 studies. Note that, even though proportional error models are used for both of the first 2 studies, different parameters b1 and b2 are estimated:

2.3.2.Handling censored (BLQ) data







Objectives: learn how to easily and properly handle censored data, i.e. data below (resp. above) a lower (resp. upper) limit of quantification (LOQ) or below a limit of detection (LOD).


Projects: censored1log_project, censored1_project, censored2_project, censored3_project, censored4_project


Introduction

Censoring occurs when the value of a measurement or observation is only partially known. For continuous data measurements in the longitudinal context, censoring refers to the values of the measurements, not the times at which they were taken. For example, the lower limit of detection (LLOD) is the lowest quantity of a substance that can be distinguished from its absence. Therefore, any time the quantity is below the LLOD, the “observation” is not a measurement but the information that the measured quantity is less than the LLOD. Similarly, in longitudinal studies of viral kinetics, measurements of the viral load below a certain limit, referred to as the lower limit of quantification (LLOQ), are so low that their reliability is considered suspect. A measuring device can also have an upper limit of quantification (ULOQ) such that any value above this limit cannot be measured and reported.
As hinted above, censored values are not typically reported as a number, but their existence is known, as well as the type of censoring. Thus, the observation \(y^{(r)}_{ij}\) (i.e., what is reported) is the measurement \(y_{ij}\) if not censored, and the type of censoring otherwise.
We usually distinguish between three types of censoring: left, right and interval. In each case, the SAEM algorithm implemented in Monolix properly computes the maximum likelihood estimate of the population parameters, combining all the information provided by censored and non censored data.

Theory

In the presence of censored data, the conditional density function needs to be computed carefully. To cover all three types of censoring (left, right, interval), let \(I_{ij}\) be the (finite or infinite) censoring interval existing for individual i at time \(t_{ij}\). Then,

$$\displaystyle p(y^{(r)}|\psi)=\prod_{i=1}^{N}\prod_{j=1}^{n_i}p(y_{ij}|\psi_i)^{1_{y_{ij}\notin I_{ij}}}\mathbb{P}(y_{ij}\in I_{ij}|\psi_i)^{1_{y_{ij}\in I_{ij}}}$$

where

$$\displaystyle \mathbb{P}(y_{ij}\in I_{ij}|\psi_i)=\int_{I_{ij}} p_{y_{ij}|\psi_i} (u|\psi_i)du$$

We see that if \(y_{ij}\) is not censored (i.e. \(1_{y_{ij}\notin I_{ij}}=1\)), its contribution to the likelihood is the usual \(p(y_{ij}|\psi_i)\), whereas if it is censored, the contribution is \(\mathbb{P}(y_{ij}\in I_{ij}|\psi_i)\).
For the calculation of the likelihood, this is equivalent to the M3 method in NONMEM when only the CENSORING column is given, and to the M4 method when both a CENSORING column and a LIMIT column are given.
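The two kinds of contribution can be illustrated with a small Python sketch for a single observation, assuming a normal observation model with mean `pred` and standard deviation `sd`. The function name, its arguments and the numerical values are hypothetical; the censoring codes follow the CENSORING and LIMIT columns of the data set (1 for left censoring, -1 for right censoring, 0 for no censoring).

```python
import math
from statistics import NormalDist


def log_likelihood_contribution(y, pred, sd, cens=0, limit=None):
    """Log-likelihood contribution of one observation under a normal model.

    cens = 0: y is observed, the contribution is the density at y.
    cens = 1: the true value lies below y (or in [limit, y] if limit is given).
    cens = -1: the true value lies above y (or in [y, limit] if limit is given).
    """
    dist = NormalDist(mu=pred, sigma=sd)
    if cens == 0:
        return math.log(dist.pdf(y))
    if cens == 1:
        lower = dist.cdf(limit) if limit is not None else 0.0
        prob = dist.cdf(y) - lower
    elif cens == -1:
        upper = dist.cdf(limit) if limit is not None else 1.0
        prob = upper - dist.cdf(y)
    else:
        raise ValueError("cens must be 0, 1 or -1")
    return math.log(prob)


# Illustrative values: prediction 0.3, residual standard deviation 0.4.
ll_observed = log_likelihood_contribution(0.9, pred=0.3, sd=0.4)
ll_left = log_likelihood_contribution(0.588, pred=0.3, sd=0.4, cens=1)
ll_interval = log_likelihood_contribution(0.588, pred=0.3, sd=0.4, cens=1, limit=-1.0)
print(ll_observed, ll_left, ll_interval)
```

Giving a limit narrows the censoring interval, so the probability of the interval, and hence the contribution, can only decrease compared with the case where no limit is given.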

Censoring definition in a data set

To define that a measurement is censored, you have to

  • Set your censored measurement in the OBSERVATION column.
  • Have a CENSORING column and put 1 if the value in the OBSERVATION column is a lower limit (left censoring) or -1 if it is an upper limit (right censoring).
  • Optionally, have a LIMIT column to set the other limit.

If the measurement is not censored, just put 0 in the CENSORING column and the regular value in the OBSERVATION column. Examples are provided below and here.
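
As an illustration of this coding, the sketch below maps one row (OBSERVATION, CENSORING, LIMIT) to the interval that is known to contain the true value. The function name is hypothetical, and reading "the other limit" as the upper bound when CENSORING is -1 is an assumption.

```python
import math

def censoring_interval(y, cens, limit=None):
    """Return the interval containing the true value, or None if y is a real measurement.

    cens = 0  : not censored, y is the measurement itself
    cens = 1  : true value is below y; LIMIT, if given, is the lower bound
    cens = -1 : true value is above y; LIMIT, if given, is taken as the upper bound (assumption)
    """
    if cens == 0:
        return None
    if cens == 1:
        lower = -math.inf if limit is None else limit
        return (lower, y)
    if cens == -1:
        upper = math.inf if limit is None else limit
        return (y, upper)
    raise ValueError("CENSORING must be 0, 1 or -1")
```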

PK data below a lower limit of quantification

Left censored data

  • censored1log_project (data = ‘censored1log_data.txt’, model = ‘pklog_model.txt’)

The PK data are log-concentrations in this example. The limit of quantification of 1.8 mg/l for concentrations becomes log(1.8)=0.588 for log-concentrations. The column of observations (Y) contains either the LLOQ for data below the limit of quantification (BLQ data) or the measured log-concentrations for non BLQ data. Furthermore, Monolix uses an additional column CENSORING to indicate whether an observation is left censored (CENS=1) or not (CENS=0). In this example, subject 1 has two BLQ data at times 24h and 30h (the measured log-concentrations were below 0.588 at these times):

The plot of individual fits displays BLQ (red band) and non BLQ data (blue dots) together with the predicted log-concentrations (purple line) on the whole time interval:

Notice that the band goes from the LLOQ (0.588) to -Infinity as no lower bound has been specified (no LIMIT column was provided).
For diagnosis plots such as VPC, residuals or observations versus predictions, Monolix samples the BLQ data from the conditional distribution

$$p(y^{BLQ} | y^{non BLQ}, \hat{\psi}, \hat{\theta})$$

where \hat{\theta} and \hat{\psi} are the estimated population and individual parameters. This is done by adding a residual error on top of the prediction, using a truncated normal distribution to make sure that the simulated BLQ remains within the censored interval. This is the most efficient way to take into account the complete information provided by the data and the model for diagnosis plots such as VPCs:
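
The idea of drawing a censored value from a truncated normal distribution can be sketched as follows. This is only an illustration of inverse-CDF sampling under an assumed constant error model with standard deviation `sd`; it is not the Monolix implementation, and all names are hypothetical.

```python
import random
from statistics import NormalDist

def sample_censored(pred, sd, lower, upper, rng):
    """Draw one value from N(pred, sd^2) truncated to the interval (lower, upper)."""
    dist = NormalDist(mu=pred, sigma=sd)
    p_lo = 0.0 if lower == float("-inf") else dist.cdf(lower)
    p_hi = 1.0 if upper == float("inf") else dist.cdf(upper)
    # Sample a probability level restricted to the censoring interval.
    u = rng.uniform(p_lo, p_hi)
    # inv_cdf is only defined on the open interval (0, 1); this clamp is a
    # simplification that would need more care in the extreme tails.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return dist.inv_cdf(u)

rng = random.Random(2018)
# Prediction above the LLOQ of 0.588, yet the simulated BLQ value stays below it.
simulated = sample_censored(pred=1.0, sd=0.5, lower=float("-inf"), upper=0.588, rng=rng)
```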


A strong bias appears if the LLOQ is used instead for the BLQ data (if you choose LOQ instead of simulated in the display frame of the settings):

Notice that ignoring the BLQ data entails a loss of information as can be seen below (if you choose no in the “Use BLQ” toggle):

As can be seen below, imputed BLQ data are also used for residuals (IWRES on the left) and for observations versus predictions (on the right):

More on these diagnosis plots

Impact of the BLQ in residuals and observations versus predictions plots

A strong bias appears if LLOQ is used instead for the BLQ data for these two diagnosis plots:

while ignoring the BLQ data entails a loss of information:

BLQ predictive checks

The BLQ diagnosis plot displays the cumulative fraction of BLQ data (green line) together with a 90% prediction interval.

Interval censored data

  • censored1_project (data = ‘censored1_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

We use the original concentrations in this project. Then, BLQ data should be treated as interval censored data since a concentration is known to be positive. In other words, an observation reported as BLQ means that the (non reported) measured concentration is between 0 and 1.8 mg/l. The value 1.8 in the observation column is the limit of quantification, and the value in the CENSORING column indicates that it is the upper bound of the censoring interval. An additional column LIMIT reports the lower limit of the censored interval (0 in this example):


Remarks

  • If the LIMIT column is missing, BLQ data are assumed to be left-censored data that can take any value, positive or negative, below the LLOQ.
  • The value of the limit can vary between observations of the same subject.

Monolix will use this additional information for properly estimating the parameters of the model and imputing the BLQ data for the diagnosis plots.
The plot of individual fits now displays the LLOQ at 1.8 with a red band when a PK observation is censored. We see that the lower limit of the band is at 0, as defined in the LIMIT column.

PK data below a lower limit of quantification or below a limit of detection

  • censored2_project (data = ‘censored2_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

The plot of individual fits now displays the LLOQ or the LLOD with a red band when a PK observation is censored. We see that the lower limit of the band depends on the observation.

PK data below a lower limit of quantification and PD data above an upper limit of quantification

  • censored3_project (data = ‘censored3_data.txt’, model = ‘pkpd_model.txt’)

We work with PK and PD data in this project and assume that the PD data may be right censored and that the upper limit of quantification is ULOQ=90. We use CENS=-1 to indicate that an observation is right censored. In such a case, the PD data can take any value above the upper limit reported in column Y (here the YTYPE column of type OBSERVED ID defines the type of observation; YTYPE=1 and YTYPE=2 are used for PK and PD data respectively):

Plot of individual fits for the PD data now displays ULOQ and the predicted PD profile:

We can display the cumulative fraction of censored data both for the PK and the PD data (on the left and right respectively):

Combination of interval censored PK and PD data

  • censored4_project (data = ‘censored4_data.txt’, model = ‘pkpd_model.txt’)

We assume in this example

  • 2 different censoring intervals (0,1) and (1.2, 1.8) for the PK,
  • a censoring interval (80,90) and right censoring (>90) for the PD.

Combining the columns CENS, LIMIT and Y makes it possible to encode these different censoring processes efficiently:

This coding of the data means that, for subject 1,

  • PK data is between 0 and 1 at time 30h (second blue frame),
  • PK data is between 1.2 and 1.8 at times 0.5h and 24h (first blue frame for time .5h),
  • PD data is between 80 and 90 at times 12h and 16h (second green frame for time 12h),
  • PD data is above 90 at times 4h and 8h (first green frame for time 4h).

Plot of individual fits for the PK and the PD data displays the different limits of these censoring intervals (PK on the left and PD on the right):

Other diagnosis plots, such as the plot of observations versus predictions, adequately use imputed censored PK and PD data:

Case studies

  • 8.case_studies/hiv_project (data = ‘hiv_data.txt’, model = ‘hivLatent_model.txt’)
  • 8.case_studies/hcv_project (data = ‘hcv_data.txt’, model = ‘hcvNeumann98_model_latent.txt’)

2.3.3.Mixture of structural models


Objectives: learn how to implement between subject mixture models (BSMM) and within subject mixture models (WSMM).


Projects: bsmm1_project, bsmm2_project, wsmm_project


Introduction

There exist several types of mixture models useful in the context of mixed effects models. It may be necessary in some situations to introduce diversity into the structural models themselves:

  • Between-subject model mixtures (BSMM) assume that there exist subpopulations of individuals. Different structural models describe the response of each subpopulation, and each subject belongs to one of these subpopulations. One can imagine for example different structural models for responders, nonresponders and partial responders to a given treatment.

The easiest way to model a finite mixture model is to introduce a label sequence (z_i ; 1 \leq i \leq N) that takes its values in {1, 2, \ldots, M} such that z_i = m if subject i belongs to subpopulation m. \mathbb{P}(z_i = m) is the probability for subject i to belong to subpopulation m. A BSMM assumes that the structural model is a mixture of M different structural models:

$$f\left(t_{ij}; \psi_i, z_i \right) = \sum_{m=1}^M 1_{z_i = m} f_m\left( t_{ij}; \psi_i \right) $$

In other words, each subpopulation has its own structural model: f_m is the structural model for subpopulation m.

  • Within-subject model mixtures (WSMM) assume that there exist subpopulations (of cells, viruses, etc.) within each patient. In this case, different structural models can be used to describe the response of different subpopulations, but the proportion of each subpopulation depends on the patient.

Then, it makes sense to consider that the mixture of models happens within each individual. Such within-subject model mixtures require additional vectors of individual parameters \pi_i=(\pi_{1,i}, \ldots, \pi_{M,i}) representing the proportions of the M models within each individual i:

$$f\left( t_{ij}; \psi_i, \pi_i \right) = \sum_{m=1}^M \pi_{m,i} f_m\left( t_{ij}; \psi_i \right)$$

The proportions (\pi_{m,i}) are now individual parameters in the model and the problem is transformed into a standard mixed effects model. These proportions are assumed to be positive and to sum to 1 for each patient.
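
The difference between the two mixtures can be made concrete with a small numerical sketch. The two structural models `f1` and `f2` and their parameter values below are hypothetical, chosen only for illustration.

```python
import math

def bsmm_prediction(t, models, z):
    """Between-subject mixture: the subject follows only the model of its group z (1-based)."""
    return models[z - 1](t)

def wsmm_prediction(t, models, proportions):
    """Within-subject mixture: weighted sum of all models with individual proportions."""
    if min(proportions) < 0 or abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must be non-negative and sum to 1")
    return sum(p * f(t) for p, f in zip(proportions, models))

# Hypothetical structural models: a constant response and an exponential decay.
def f1(t):
    return 10.0

def f2(t):
    return 10.0 * math.exp(-0.1 * t)

group_1_value = bsmm_prediction(10.0, [f1, f2], z=1)        # uses f1 only
mixed_value = wsmm_prediction(10.0, [f1, f2], [0.3, 0.7])   # 0.3*f1 + 0.7*f2
```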

Between subject mixture models

Supervised learning

  • bsmm1_project (data = ‘pdmixt1_data.txt’, model = ‘bsmm1_model.txt’)

We consider a very simple example here with two subpopulations of individuals who receive a given treatment. The outcome of interest is the measured effect of the treatment (a viral load for instance). The two populations are non responders and responders. We assume here that the status of the patient is known. Then, the data file contains an additional column GROUP. This column is duplicated because Monolix uses it

  • i) as a regression variable (REGRESSOR): it is used in the model to distinguish responders and non responders,
  • ii) as a categorical covariate (CATEGORICAL COVARIATE): it is used to stratify the diagnosis plots.

We can then display the data

and use the categorical covariate GROUP_CAT to split the plot into responders and non responders:
We use different structural models for non responders and responders. The predicted effect for non responders is constant f(t) = A1 while the predicted effect for responders decreases exponentially f(t) = A2 exp(-kt).

The model is implemented in the model file bsmm1_model.txt (note that the names of the regression variable in the data file and in the model script do not need to match):

[LONGITUDINAL]
input = {A1, A2, k, g}
g = {use=regressor}

EQUATION:
if g==1
   f = A1
else
   f = A2*exp(-k*max(t,0))
end

OUTPUT:
output = f

The plot of individual fits exhibits the two different structural models:

VPCs should then be split according to GROUP_CAT

as well as the prediction distribution for non responders and responders:

Unsupervised learning

  • bsmm2_project (data = ‘pdmixt2_data.txt’, model = ‘bsmm2_model.txt’)

The status of the patient is unknown in this project (which means that the column GROUP is not available anymore). Let p be the proportion of non responders in the population. Then, the structural model for a given subject is f1 with probability p and f2 with probability 1-p. The structural model is therefore a BSMM:

[LONGITUDINAL]
input = {A1, A2, k, p}

EQUATION:
f1 = A1
f2 = A2*exp(-k*max(t,0))
f  = bsmm(f1, p, f2, 1-p)

OUTPUT:
output = f

Important: p is a population parameter of the model to estimate. There is no inter-patient variability on p: all the subjects have the same probability of being a non responder in this example. We use a logit-normal distribution for p in order to constrain it to be between 0 and 1, but without variability:

p is estimated with the other population parameters:
Then, the group to which a patient belongs is also estimated, as the group of highest conditional probability:

$$\begin{aligned}\hat{z}_i &= 1~~~~\textrm{if}~~~~ \mathbb{P}(z_i=1 | (y_{ij}), \hat{\psi}_i, \hat{\theta})> \mathbb{P}(z_i=2 | (y_{ij}),\hat{\psi}_i, \hat{\theta}),\\ &=2~~~~\textrm{otherwise}\end{aligned}$$

The estimated groups can be used as a stratifying variable to split some plots such as VPCs
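
The classification rule can be sketched with Bayes' formula. The example below is an illustration under stated assumptions, not the Monolix algorithm: residual errors are taken as independent and normal with a constant standard deviation `sd`, and the function name is hypothetical.

```python
import math

def group_posterior(times, obs, models, priors, sd):
    """P(z_i = m | y_i, psi_i) for each group m (assumed constant normal error model)."""
    log_weights = []
    for f, prior in zip(models, priors):
        # The normalising constant of the normal density is the same for all
        # groups, so only the quadratic term is needed.
        log_lik = sum(-0.5 * ((y - f(t)) / sd) ** 2 for t, y in zip(times, obs))
        log_weights.append(math.log(prior) + log_lik)
    top = max(log_weights)                      # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in log_weights]
    total = sum(weights)
    return [w / total for w in weights]

def estimated_group(posterior):
    """Group of highest conditional probability, numbered from 1."""
    return posterior.index(max(posterior)) + 1
```

With two groups, `estimated_group` returns 1 when the first probability is the larger one and 2 otherwise.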

Within subject mixture models

  • wsmm_project (data = ‘pdmixt2_data.txt’, model = ‘wsmm_model.txt’)

It may be too simplistic to assume that each individual is represented by only one well-defined model from the mixture. We consider here that the mixture of models happens within each individual and use a WSMM: f = p*f1 + (1-p)*f2

[LONGITUDINAL]
input = {A1, A2, k, p}

EQUATION:
f1 = A1
f2 = A2*exp(-k*max(t,0))
f = wsmm(f1, p, f2, 1-p)

OUTPUT:
output = f

Remark: Here, writing f = wsmm(f1, p, f2, 1-p) is equivalent to writing f = p*f1 + (1-p)*f2
Important: Here, p is an individual parameter: the subjects have different proportions of non responder cells. We use a probit-normal distribution for p in order to constrain it to be between 0 and 1, with variability:

There is no latent covariate when using WSMM: mixtures are continuous mixtures. We therefore can no longer split the VPC and the prediction distribution.

2.4.Models for non continuous outcomes

2.4.1.Time-to-event data models


Objectives: learn how to implement a model for (repeated) time-to-event data with different censoring processes.


Projects: tte1_project, tte2_project, tte3_project, tte4_project, rtteWeibull_project, rtteWeibullCount_project


Introduction

Here, observations are the “times at which events occur”. An event may be one-off (e.g., death, hardware failure) or repeated (e.g., epileptic seizures, mechanical incidents, strikes). Several functions play key roles in time-to-event analysis: the survival, hazard and cumulative hazard functions. We are still working under a population approach here so these functions, detailed below, are thus individual functions, i.e., each subject has its own. As we are using parametric models, this means that these functions depend on individual parameters (\psi_i).

  • The survival function S(t, \psi_i) gives the probability that the event happens to individual i after time t>t_{\text{start}}:

    $$S(t,\psi_i) = \mathbb{P}(T_i>t; \psi_i) $$

  • The hazard function h(t,\psi_i) is defined for individual i as the instantaneous rate of the event at time t, given that the event has not already occurred:

    $$h(t, \psi_i) = \lim_{dt \to 0} \frac{S(t, \psi_i) - S(t + dt, \psi_i)}{ S(t, \psi_i)  dt} $$

    This is equivalent to

    $$h(t, \psi_i) = -\frac{d}{dt} \left(\log{S(t, \psi_i)}\right)$$

  • Another useful quantity is the cumulative hazard function H(a,b; \psi_i), defined for individual i as

$$H(a,b; \psi_i) = \int_a^b h(t,\psi_i) dt $$

Note that S(t, \psi_i) = e^{-H(t_{\text{start}},t; \psi_i)}. Then, the hazard function h(t,\psi_i) characterizes the problem, because knowing it is the same as knowing the survival function S(t, \psi_i). The probability distribution of survival data is therefore completely defined by the hazard function.
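
The relation between hazard and survival can be checked numerically. The sketch below integrates a hazard with the trapezoidal rule and exponentiates the result; the function names and parameter values are hypothetical.

```python
import math

def cumulative_hazard(h, a, b, n=2000):
    """Trapezoidal approximation of H(a, b), the integral of h between a and b."""
    step = (b - a) / n
    total = 0.5 * (h(a) + h(b))
    for i in range(1, n):
        total += h(a + i * step)
    return total * step

def survival(h, t, t_start=0.0):
    """S(t) = exp(-H(t_start, t))."""
    return math.exp(-cumulative_hazard(h, t_start, t))

# Constant hazard 1/Te, with Te = 50 (hypothetical value).
s_constant = survival(lambda t: 1.0 / 50.0, 30.0)

# Weibull hazard with lambda = 40 and beta = 2 (hypothetical values).
s_weibull = survival(lambda t: (2.0 / 40.0) * (t / 40.0) ** (2.0 - 1.0), 30.0)
```

For these two hazards the survival function is also known in closed form, exp(-t/Te) and exp(-(t/lambda)^beta) respectively, which gives a direct check of the numerical values.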

Time-to-event (TTE) models are thus defined in Monolix via the hazard function. Monolix also holds a TTE library that contains typical hazard functions for time-to-event data. More details and modeling guidelines can be found on the TTE dedicated webpage, along with case studies.

 

Formatting of time-to-event data in the MonolixSuite

In the data set, exactly observed events, interval censored events and right censoring are recorded for each individual. Contrary to other software for survival analysis, the MonolixSuite requires specifying the time at which the observation period starts. This makes it possible to define the data set using absolute times, in addition to durations (if the start time is zero, the records represent durations between the start time and the event).

The column TIME also contains the end of the observation period or the time intervals for interval-censoring. The column OBSERVATION contains an integer that indicates how to interpret the associated time. The different values for each type of event and observation are summarized in the table below:

The figure below summarizes the different situations with examples:

For instance for single events, exactly observed (with or without right censoring), one must indicate the start time of the observation period (Y=0), and the time of event (Y=1) or the time of the end of the observation period if no event has occurred (Y=0). In the following example:

ID TIME Y
1   0   0
1  34   1
2   0   0
2  80   0

the observation period lasts from the starting time t=0 to the final time t=80. For individual 1, the event is observed at t=34, and for individual 2, no event is observed during the period. The record at the final time (t=80) therefore states that no event had occurred. Using absolute times instead of durations, we could equivalently write:

ID TIME Y
1  20   0
1  54   1
2  33   0
2  113  0

The durations between start time and event (or end of the observation period) are the same as before, but this time we record the day at which the patients enter the study and the days at which they have events or leave the study. Different patients may enter the study at different times.
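
The equivalence of the two data sets can be verified with a short sketch that converts the records of single, exactly observed or right censored events into durations. The function name is hypothetical.

```python
def summarize_single_event(rows):
    """rows: list of (id, time, y) records, in time order for each subject.

    Returns {id: (duration, event_observed)} where the duration is measured
    from the start of the observation period of that subject.
    """
    start = {}
    result = {}
    for subject, time, y in rows:
        if subject not in start:
            start[subject] = time                  # first record (Y=0): start of observation
        else:
            result[subject] = (time - start[subject], y == 1)
    return result

durations_data = [(1, 0, 0), (1, 34, 1), (2, 0, 0), (2, 80, 0)]
absolute_data = [(1, 20, 0), (1, 54, 1), (2, 33, 0), (2, 113, 0)]

same = summarize_single_event(durations_data) == summarize_single_event(absolute_data)
```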

Examples for repeated events, and interval censored events are available on the data set documentation page.

 

Single event

To begin with, we will consider a one-off event. Depending on the application, the length of time to this event may be called the survival time (until death, for instance), failure time (until hardware fails), and so on. In general, we simply say “time-to-event”. The random variable representing the time-to-event for subject i is typically written T_i.

 

Single event exactly observed or right censored

  • tte1_project (data = tte1_data.txt , model=lib:exponential_model_singleEvent.txt)

The event time may be exactly observed at time t_i, but if we assume that the trial ends at time t_{\text{stop}}, the event may happen after the end. This is “right censoring”. Here, Y=0 at time t means that the event happened after t and Y=1 means that the event happened at time t. The rows with t=0 are included to show the trial start time t_{\text{start}}=0:
By clicking on the button Observed data, it is possible to display the Kaplan Meier plot (i.e. the empirical survival function) before fitting any model:


A very basic model with constant hazard is used for this data:

[LONGITUDINAL]
input = Te
  
EQUATION:
h = 1/Te

DEFINITION:
Event = {type=event, maxEventNumber=1, hazard=h}

OUTPUT:
output = {Event}

Here, Te is the expected time to event. Specification of the maximum number of events is required both for the estimation procedure and for the diagnosis plots based on simulation, such as the prediction interval for the Kaplan Meier plot, which is obtained by Monte Carlo simulation:
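
To illustrate the principle, the sketch below simulates one data set under a constant hazard 1/Te with right censoring at `t_stop`, and computes the Kaplan Meier estimate. Repeating the simulation many times and taking percentiles of the curves would give a prediction interval. This is a simplified illustration with hypothetical names, not the Monolix procedure.

```python
import random

def simulate_single_events(Te, n_subjects, t_stop, rng):
    """Simulate event times under a constant hazard 1/Te, right censored at t_stop."""
    durations, observed = [], []
    for _ in range(n_subjects):
        t = rng.expovariate(1.0 / Te)     # exponential time with mean Te
        durations.append(min(t, t_stop))
        observed.append(t <= t_stop)
    return durations, observed

def kaplan_meier(durations, observed):
    """Return [(time, S(time))] at each distinct observed event time."""
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all records sharing the same time.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if events:
            surv *= 1.0 - events / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve
```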

 

Single event interval censored or right censored

  • tte2_project (data = tte2_data.txt , model=exponentialIntervalCensored_model.txt)

We may know the event has happened in an interval I_i but not know the exact time t_i. This is interval censoring. Here, Y=0 at time t means that the event happened after t and Y=1 means that the event happened before time t.
The event for individual 1 happened between t=10 and t=15. No event was observed until the end of the experiment (t=100) for individual 5. We use the same basic model, but we now need to specify that the events are interval censored:

[LONGITUDINAL]
input = Te  

EQUATION:
h = 1/Te

DEFINITION:
Event = {type=event, maxEventNumber=1, eventType=intervalCensored, hazard = h
         intervalLength=5     ; used for the plots (not mandatory)
}

OUTPUT:
output = Event
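
For this constant hazard model, the likelihood contribution of each type of record has a simple closed form. The following sketch (hypothetical function names) shows the three cases for a single event, with S(t) = exp(-t/Te) and t_start = 0.

```python
import math

def exp_survival(t, Te):
    """Survival function for a constant hazard 1/Te."""
    return math.exp(-t / Te)

def log_lik_exact(t, Te):
    """Event observed exactly at t: density h(t) * S(t)."""
    return -math.log(Te) - t / Te

def log_lik_right_censored(t, Te):
    """No event before t: probability S(t)."""
    return -t / Te

def log_lik_interval(a, b, Te):
    """Event known to lie between a and b: probability S(a) - S(b)."""
    return math.log(exp_survival(a, Te) - exp_survival(b, Te))

# Individual with an event between t=10 and t=15, for a hypothetical Te = 20.
contribution = log_lik_interval(10.0, 15.0, 20.0)
```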

Repeated events

Sometimes, an event can potentially happen again and again, e.g., epileptic seizures, heart attacks. For any given hazard function h, the survival function S for individual i now represents the survival since the previous event at t_{i,j-1}, given here in terms of the cumulative hazard from t_{i,j-1} to t_{i,j}:

$$S(t_{i,j} | t_{i,j-1}; \psi_i) = \mathbb{P}(T_{i,j} > t_{i,j} | T_{i,j-1} = t_{i,j-1}; \psi_i) = \exp(-\int_{t_{i,j-1}}^{t_{i,j}}h(t,\psi_i) dt)$$

 

Repeated events exactly observed or right censored

  • tte3_project (data = tte3_data.txt , model=lib:exponential_model_repeatedEvents.txt)

A sequence of n_i event times is precisely observed before t_{\text{stop}} = 200:

We can then display the Kaplan Meier plot for the first event and the mean number of events per individual:

After fitting the model, prediction intervals for these two curves can also be displayed on the same graph:

 

Repeated events interval censored or right censored

  • tte4_project (data = tte4_data.txt , model=exponentialIntervalCensored_repeated_model.txt)

We do not know the exact event times, but only the number of events that occurred for each individual in each interval of time.

 

User defined likelihood function for time-to-event data

  • weibullRTTE (data = weibull_data.txt , model=weibullRTTE_model.txt)

A Weibull model is used in this example:

[LONGITUDINAL]
input = {lambda, beta}  

EQUATION: 
h = (beta/lambda)*(t/lambda)^(beta-1)

DEFINITION:
Event = {type=event, hazard=h, eventType=intervalCensored,
         intervalLength=5}

OUTPUT:
output = Event
  • weibullCount (data = weibull_data.txt , model=weibullCount_model.txt)

Instead of defining the data as events, it is possible to consider the data as count data: indeed, we count the number of events per interval. An additional column with the start of the interval is added in the data file and defined as a regression variable. We then use a model for count data (see rtteWeibullCount_model.txt).
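
The link between the two views can be sketched as follows, under the assumption that the hazard depends only on time and not on the previous events: the number of events in an interval (a, b) then follows a Poisson distribution whose mean is the cumulative hazard H(a, b). For the Weibull hazard above, H(a, b) = (b/lambda)^beta - (a/lambda)^beta. The function names are hypothetical and this is not the content of the Monolix model file.

```python
import math

def weibull_cumulative_hazard(a, b, lam, beta):
    """Integral of (beta/lam) * (t/lam)^(beta-1) between a and b."""
    return (b / lam) ** beta - (a / lam) ** beta

def log_prob_count(k, a, b, lam, beta):
    """log P(k events in (a, b)) for a Poisson count with mean H(a, b) (assumption)."""
    mean = weibull_cumulative_hazard(a, b, lam, beta)
    return -mean + k * math.log(mean) - math.lgamma(k + 1)

# Probability of observing 2 events between t=5 and t=10, hypothetical parameters.
p_two_events = math.exp(log_prob_count(2, 5.0, 10.0, lam=20.0, beta=1.5))
```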

2.4.2.Count data model


Objectives: learn how to implement a model for count data.


Projects: count1a_project, count1b_project, count2_project


Introduction

Longitudinal count data is a special type of longitudinal data that can take only nonnegative integer values {0, 1, 2, …} that come from counting something, e.g., the number of seizures, hemorrhages or lesions in each given time period. In this context, data from individual i is the sequence y_i=(y_{ij},1\leq j \leq n_i) where y_{ij} is the number of events observed in the jth time interval I_{ij}.
Count data models can also be used for modeling other types of data such as the number of trials required for completing a given task or the number of successes (or failures) during some exercise. Here, y_{ij} is either the number of trials or successes (or failures) for subject i at time t_{ij}. For any of these data types we will then model y_i=(y_{ij},1 \leq j \leq n_i) as a sequence of random variables that take their values in {0, 1, 2, …}.  If we assume that they are independent, then the model is completely defined by the probability mass functions \mathbb{P}(y_{ij}=k) for k \geq 0 and 1 \leq j \leq n_i. Here, we will consider only parametric distributions for count data.

Formatting of count data in the MonolixSuite

Count data can take only non-negative integer values that come from counting something, e.g., the number of trials required for completing a given task. The task can for instance be repeated several times and the individual's performance followed. In the following data set:

ID TIME Y
1 0 10
1 24 6
1 48 5
1 72 2

10 trials are necessary the first day (t=0), 6 the second day (t=24), etc. Count data can also represent the number of events happening in regularly spaced intervals, e.g., the number of seizures every week. If the time intervals are not regular, the data may be considered as repeated time-to-event interval censored data, or the interval length can be given as a regressor and used to define the probability distribution in the model.
One can see the epilepsy attacks data set for a more practical example.

Count data with constant distribution over time

  • count1a_project (data = ‘count1_data.txt’, model = ‘count_library/poisson_mlxt.txt’)

A Poisson model is used for fitting the data:

[LONGITUDINAL]
input = lambda

DEFINITION:
Y = {type = count,  log(P(Y=k)) = -lambda + k*log(lambda) - factln(k) }

OUTPUT:
output = Y
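
The expression used in the DEFINITION block is the logarithm of the Poisson probability mass function, where factln(k) denotes log(k!). A minimal Python sketch of the same formula (hypothetical function name), using `math.lgamma(k + 1)` for log(k!):

```python
import math

def poisson_log_pmf(k, lam):
    """log P(Y = k) = -lambda + k*log(lambda) - log(k!)."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)

# The probabilities sum to 1 over k = 0, 1, 2, ... (truncated here at 100).
total = sum(math.exp(poisson_log_pmf(k, 3.5)) for k in range(100))
```

Working on the log scale avoids overflow of k! for large counts.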

Residuals for noncontinuous data reduce to NPDEs. We can compare the empirical distribution of the NPDEs with the distribution of a standardized normal distribution either with the pdf (top) or the cdf (bottom):

VPCs for count data compare the observed and predicted frequencies of the categorized data over time:

  • count1b_project (data = ‘count1_data.txt’, model = ‘count_library/poissonMixture_mlxt.txt’)

A mixture of two Poisson distributions is used to fit the same data. For that, we define the probability of k occurrences as the weighted sum of two Poisson distributions with expected numbers of occurrences lambda1 and lambda2. The structural model file reads

[LONGITUDINAL]
input = {lambda1, alpha, mp}

EQUATION:
lambda2 = (1+alpha)*lambda1

DEFINITION:
Y = { type = count,
       P(Y=k) = mp*exp(-lambda1 + k*log(lambda1) - factln(k)) + (1-mp)*exp(-lambda2 + k*log(lambda2) - factln(k)) 
}

OUTPUT:
output = Y

Thus, the parameter alpha has to be strictly positive to ensure that the expected numbers of occurrences differ between the two Poisson distributions, and mp has to be in [0, 1] for the probability to be correctly defined. These parameters should therefore be given a lognormal and a probit-normal distribution respectively, as shown in the following figure.
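
The mixture probability can be evaluated numerically to check that it defines a proper distribution. The sketch below mirrors the formula of the model file; the function names and the parameter values are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability mass function, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_mixture_pmf(k, lambda1, alpha, mp):
    """P(Y = k) for a two-component Poisson mixture with lambda2 = (1 + alpha) * lambda1."""
    lambda2 = (1.0 + alpha) * lambda1
    return mp * poisson_pmf(k, lambda1) + (1.0 - mp) * poisson_pmf(k, lambda2)

# Hypothetical values: lambda1 = 2, alpha = 1.5 (so lambda2 = 5), mp = 0.4.
probabilities = [poisson_mixture_pmf(k, 2.0, 1.5, 0.4) for k in range(200)]
mean_count = sum(k * p for k, p in enumerate(probabilities))
```

The expected count of the mixture is mp*lambda1 + (1-mp)*lambda2, which is what `mean_count` approximates.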

We see on the VPC below that the data set is well modeled using this mixture of Poisson distributions.

In addition, we can compute the prediction distribution of the modalities, as in the following figure:

Count data with time varying distribution

  • count2_project (data = ‘count2_data.txt’, model = ‘count_library/poissonTimeVarying_mlxt.txt’)

The distribution of the data changes with time in this example:

We then use a Poisson distribution with a time varying intensity:

[LONGITUDINAL]
input =  {a,b}
                           
EQUATION:
lambda= a*exp(-b*t)
                           
DEFINITION:
y = {type=count, P(y=k)=exp(-lambda)*(lambda^k)/factorial(k)}

OUTPUT:
output = y

This model seems to fit the data very well:

2.4.3.Categorical data model


Objectives: learn how to implement a model for categorical data, assuming either independence or a Markovian dependence between observations.


Projects: categorical1_project, categorical2_project, markov0_project, markov1a_project, markov1b_project, markov1c_project, markov2_project, markov3a_project, markov3b_project


Introduction

Assume now that the observed data takes its values in a fixed and finite set of nominal categories \{c_1, c_2,\ldots , c_K\}. Considering the observations (y_{ij},\, 1 \leq j \leq n_i) for any individual i as a sequence of conditionally independent random variables, the model is completely defined by the probability mass functions \mathbb{P}(y_{ij}=c_k | \psi_i) for k=1,\ldots, K and 1 \leq j \leq n_i. For a given (i,j), the sum of the K probabilities is 1, so in fact only K-1 of them need to be defined. In the most general way possible, any model can be considered so long as it defines a probability distribution, i.e., for each k, \mathbb{P}(y_{ij}=c_k | \psi_i) \in [0,1], and \sum_{k=1}^{K} \mathbb{P}(y_{ij}=c_k | \psi_i) =1. Ordinal data further assume that the categories are ordered, i.e., there exists an order \prec such that

$$c_1 \prec c_2 \prec \ldots \prec c_K .$$

We can think, for instance, of levels of pain (low \prec moderate \prec severe) or scores on a discrete scale, e.g., from 1 to 10. Instead of defining the probabilities of each category, it may be convenient to define the cumulative probabilities \mathbb{P}(y_{ij} \preceq c_k | \psi_i) for k=1,\ldots ,K-1, or in the other direction: \mathbb{P}(y_{ij} \succeq c_k | \psi_i) for k=2,\ldots, K. Any model is possible as long as it defines a probability distribution, i.e., it satisfies

$$0 \leq \mathbb{P}(y_{ij} \preceq c_1 | \psi_i) \leq \mathbb{P}(y_{ij} \preceq c_2 | \psi_i)\leq \ldots \leq \mathbb{P}(y_{ij} \preceq c_K | \psi_i) =1 .$$

It is possible to introduce dependence between observations from the same individual by assuming that (y_{ij},\,j=1,2,\ldots,n_i) forms a Markov chain. For instance, a Markov chain with memory 1 assumes that all that is required from the past to determine the distribution of y_{ij} is the value of the previous observation y_{i,j-1}, i.e., for all k=1,2,\ldots ,K,

$$\mathbb{P}(y_{ij} = c_k\,|\,y_{i,j-1}, y_{i,j-2}, y_{i,j-3},\ldots,\psi_i) = \mathbb{P}(y_{ij} = c_k | y_{i,j-1},\psi_i)$$

Formatting of categorical data in the MonolixSuite

In case of categorical data, the observations at each time point can only take values in a fixed and finite set of nominal categories. In the data set, the output categories must be coded as integers, as in the following example:

ID TIME Y
1 0.5 3
1 1 0
1 1.5 2
1 2 2
1 2.5 3

One can see the respiratory status data set and the warfarin data set for more practical examples of a categorical data set and of a joint continuous and categorical data set, respectively.

Ordered categorical data

  • categorical1_project (data = ‘categorical1_data.txt’, model = ‘categorical1_model.txt’)

In this example, observations are ordinal data that take their values in {0, 1, 2, 3}:

  • Cumulative odds ratios are used in this example to define the model

$$\textrm{logit}(\mathbb{P}(y_{ij} \leq k))= \log \left( \frac{\mathbb{P}(y_{ij} \leq k)}{1 - \mathbb{P}(y_{ij} \leq k )} \right)$$

where

$$\begin{array}{ccl} \text{logit}(\mathbb{P}(y_{ij} \leq 0)) &=& \theta_{i,1}\\ \text{logit}(\mathbb{P}(y_{ij} \leq 1)) &=& \theta_{i,1}+\theta_{i,2}\\ \text{logit}(\mathbb{P}(y_{ij} \leq 2)) &=& \theta_{i,1}+\theta_{i,2}+\theta_{i,3}\end{array}$$

This model is implemented in categorical1_model.txt:

[LONGITUDINAL]
input = {th1, th2, th3}

DEFINITION:
level = { type = categorical,  categories = {0, 1, 2, 3},
  logit(P(level<=0)) = th1
  logit(P(level<=1)) = th1 + th2
  logit(P(level<=2)) = th1 + th2 + th3
}
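
The probability of each category follows from the cumulative logits by differencing. A minimal sketch in Python (hypothetical function names and parameter values):

```python
import math

def inv_logit(x):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))

def category_probabilities(th1, th2, th3):
    """P(level = 0), ..., P(level = 3) from the three cumulative logits."""
    c0 = inv_logit(th1)                  # P(level <= 0)
    c1 = inv_logit(th1 + th2)            # P(level <= 1)
    c2 = inv_logit(th1 + th2 + th3)      # P(level <= 2)
    return [c0, c1 - c0, c2 - c1, 1.0 - c2]

probs = category_probabilities(0.0, 1.0, 2.0)
```

All four probabilities are positive as soon as th2 and th3 are positive, because the cumulative probabilities are then strictly increasing.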

A normal distribution is used for \theta_{1}, while log-normal distributions for \theta_{2} and \theta_{3} ensure that these parameters are positive (even without variability), so that the cumulative probabilities increase with the category. Residuals for noncontinuous data reduce to NPDEs. We can compare the empirical distribution of the NPDEs with the distribution of a standardized normal distribution:

VPCs for categorical data compare the observed and predicted frequencies of each category over time:

The prediction distribution can also be computed by Monte-Carlo:

Ordered categorical data with regression variables

  • categorical2_project (data = ‘categorical2_data.txt’, model = ‘categorical2_model.txt’)

A proportional odds model is used in this example, where PERIOD and DOSE are used as regression variables (i.e. time-varying covariates).

Discrete-time Markov chain

If observation times are regularly spaced (constant length of time between successive observations), we can consider the observations (y_{ij},j=1,2,\ldots,n_i) to be a discrete-time Markov chain.

  • markov0_project (data = ‘markov1a_data.txt’, model = ‘markov0_model.txt’)

In this project, states are assumed to be independent and identically distributed:

$$\mathbb{P}(y_{ij} = 1) = 1 - \mathbb{P}(y_{ij} = 2) = p_{i,1}$$

Observations in markov1a_data.txt take their values in {1, 2}.

  • markov1a_project (data = ‘markov1a_data.txt’, model = ‘markov1a_model.txt’)

Here,

$$\begin{aligned}\mathbb{P}(y_{i,j} = 1 | y_{i,j-1} = 1) &= 1 - \mathbb{P}(y_{i,j} = 2 | y_{i,j-1} = 1) = p_{i,11}\\ \mathbb{P}(y_{i,j} = 1 | y_{i,j-1} = 2) &= 1 - \mathbb{P}(y_{i,j} = 2 | y_{i,j-1} = 2) = p_{i,21} \end{aligned}$$

[LONGITUDINAL]
input = {p11, p21}
DEFINITION:
State = {type = categorical,  categories = {1,2},  dependence = Markov
  P(State=1|State_p=1) = p11
  P(State=1|State_p=2) = p21
}

The distribution of the initial state is not defined in the model, which means that, by default,

$$\mathbb{P}(y_{i,1} = 1) = \mathbb{P}(y_{i,1} = 2) = 0.5$$
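
With these definitions, the probability of a whole sequence of states is the product of the initial probability and of the successive transition probabilities. A minimal sketch (hypothetical function name and values):

```python
import math

def markov_log_likelihood(states, p11, p21, p_init=0.5):
    """Log-probability of a sequence of states in {1, 2} observed at regular times.

    p11 = P(state = 1 | previous = 1), p21 = P(state = 1 | previous = 2),
    p_init = P(first state = 1), 0.5 by default as in the model above.
    """
    first = states[0]
    log_lik = math.log(p_init if first == 1 else 1.0 - p_init)
    for previous, current in zip(states, states[1:]):
        p_one = p11 if previous == 1 else p21
        log_lik += math.log(p_one if current == 1 else 1.0 - p_one)
    return log_lik

# Sequence 1 -> 1 -> 2 -> 1 with hypothetical p11 = 0.8 and p21 = 0.3.
log_lik = markov_log_likelihood([1, 1, 2, 1], p11=0.8, p21=0.3)
```

Here the probability is 0.5 * 0.8 * (1 - 0.8) * 0.3.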

  • markov1b_project (data = ‘markov1b_data.txt’, model = ‘markov1b_model.txt’)

The distribution of the initial state, p = \mathbb{P}(y_{i,1} = 1), is estimated in this example:

DEFINITION:
State = {type = categorical,  categories = {1,2},  dependence = Markov
  P(State_1=1)= p
  P(State=1|State_p=1) = p11
  P(State=1|State_p=2) = p21
}
  • markov3a_project (data = ‘markov3a_data.txt’, model = ‘markov3a_model.txt’)

Transition probabilities change with time in this example. We then define time varying transition probabilities in the model:

[LONGITUDINAL]
input = {a1, b1, a2, b2}
EQUATION:
lp11 = a1 + b1*t/100
lp21 = a2 + b2*t/100
DEFINITION:
State = {type = categorical, categories = {1,2}, dependence = Markov
  logit(P(State=1|State_p=1)) = lp11
  logit(P(State=1|State_p=2)) = lp21
}
  • markov2_project (data = ‘markov2_data.txt’, model = ‘markov2_model.txt’)

Observations in markov2_data.txt take their values in {1, 2, 3}. Then, 6 transition probabilities need to be defined in the model.
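The count of 6 comes from the fact that each of the 3 rows of the transition matrix must sum to 1, leaving 2 free probabilities per row. The Python sketch below makes this explicit; the function name and the numerical values are hypothetical.

```python
def transition_matrix(p):
    """Build a 3x3 transition matrix from 6 free probabilities.

    p[prev] = (P(next=1 | prev), P(next=2 | prev)) for prev = 1, 2, 3.
    The probability of moving to state 3 is the complement of the row.
    """
    rows = []
    for prev in (1, 2, 3):
        p1, p2 = p[prev]
        p3 = 1.0 - p1 - p2
        if min(p1, p2, p3) < 0.0:
            raise ValueError("probabilities in row %d are not valid" % prev)
        rows.append([p1, p2, p3])
    return rows

# Hypothetical values for the 6 free probabilities
P = transition_matrix({1: (0.7, 0.2), 2: (0.1, 0.6), 3: (0.25, 0.25)})
```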

Continuous-time Markov chain

The previous situation can be extended to the case where time intervals between observations are irregular by modeling the sequence of states as a continuous-time Markov process. The difference is that rather than transitioning to a new (possibly the same) state at each time step, the system remains in the current state for some random amount of time before transitioning. This process is now characterized by transition rates instead of transition probabilities:

 \mathbb{P}(y_{i}(t+h) = k \,|\, y_{i}(t)=\ell , \psi_i) = h \rho_{\ell k}(t,\psi_i) + o(h),\qquad k \neq \ell .

The probability that no transition happens between t and t+h is

 \mathbb{P}(y_{i}(s) = \ell, \forall s\in(t, t+h) \,|\, y_{i}(t)=\ell , \psi_i) = e^{h \, \rho_{\ell \ell}(t,\psi_i)} .

Furthermore, for any individual i and time t, the transition rates (\rho_{\ell,k}(t, \psi_i)) satisfy for any 1\leq \ell \leq K,

 \sum_{k=1}^K \rho_{\ell k}(t, \psi_i) = 0

Constructing a model therefore means defining parametric functions of time (\rho_{\ell,k}) that satisfy this condition.
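For a two-state chain with constant rates, the condition and its consequences can be checked numerically. The Python sketch below uses the standard closed-form solution for a two-state continuous-time Markov chain; the names `q12`, `q21` and the values are assumptions chosen for the example.

```python
import math

def generator(q12, q21):
    """Matrix of transition rates; each row sums to zero."""
    return [[-q12, q12], [q21, -q21]]

def transition_probabilities(q12, q21, h):
    """P(state at t+h = k | state at t = l) for constant rates."""
    s = q12 + q21
    decay = math.exp(-s * h)
    return [
        [(q21 + q12 * decay) / s, q12 * (1.0 - decay) / s],
        [q21 * (1.0 - decay) / s, (q12 + q21 * decay) / s],
    ]

# Hypothetical rates
Q = generator(0.4, 0.1)
P_small = transition_probabilities(0.4, 0.1, 1e-4)  # very short interval
P_long = transition_probabilities(0.4, 0.1, 2.0)    # longer interval
```

For a very short interval h, the probability of moving from state 1 to state 2 is close to h times the rate, as stated in the definition of the transition rates.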

  • markov1c_project (data = ‘markov1c_data.txt’, model = ‘markov1c_model.txt’)

Observation times are irregular in this example. Then, a continuous time Markov chain should be used in order to take into account the Markovian dependence of the data:

DEFINITION:
State = { type = categorical,  categories = {1,2}, dependence = Markov
  transitionRate(1,2) = q12
  transitionRate(2,1) = q21
}
  • markov3b_project (data = ‘markov3b_data.txt’, model = ‘markov3b_model.txt’)

Time varying transition rates are used in this example.

2.5.Joint models for multivariate outcomes

2.5.1.Joint models for continuous outcomes


Objectives: learn how to implement a joint model for continuous PKPD data.


Projects: warfarinPK_project, warfarin_PKPDimmediate_project, warfarin_PKPDeffect_project, warfarin_PKPDturnover_project, warfarin_PKPDseq1_project, warfarin_PKPDseq2_project, warfarinPD_project





Introduction

A “joint model” describes two or more types of observation that typically depend on each other. A PKPD model is a “joint model” because the PD depends on the PK. Here we demonstrate how several observations can be modeled simultaneously. We also discuss the special case of sequential PK and PD modelling, using either the population PK parameters or the individual PK parameters as an input for the PD model.

Fitting first a PK model to the PK data

  • warfarinPK_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

The column DV of the data file contains both the PK and the PD measurements: in Monolix this column is tagged as an OBSERVATION column. The column DVID is a flag defining the type of observation (DVID=1 for PK data and DVID=2 for PD data); the keyword OBSERVATION ID is used for this column.

We will use the model oral1_1cpt_TlagkaVCl from the Monolix PK library:

[LONGITUDINAL]
input = {Tlag, ka, V, Cl}

EQUATION:
Cc = pkmodel(Tlag, ka, V, Cl)

OUTPUT:
output = Cc

Only the predicted concentration Cc is defined as an output of this model. This prediction will therefore be automatically associated with the outcome of type 1 (DVID=1), while the other observations (DVID=2) will be ignored.
Remark: any other ordered values could be used for the OBSERVATION ID column: the smallest one will always be associated with the first prediction defined in the model.

Simultaneous PKPD modeling

  • warfarin_PKPDimmediate_project (data = ‘warfarin_data.txt’, model = ‘immediateResponse_model.txt’)

It is also possible for users to write their own PKPD model. The PK model used previously and an immediate response model are defined in the model file immediateResponse_model.txt:

[LONGITUDINAL]
input = {Tlag, ka, V, Cl, Imax, IC50, S0}

EQUATION:
Cc = pkmodel(Tlag, ka, V, Cl)
E = S0 * (1 - Imax*Cc/(Cc+IC50))

OUTPUT:
output = {Cc, E}

Two predictions are now defined in the model: Cc for the PK (DVID=1) and E for the PD (DVID=2).
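To make the two predictions concrete, the Python sketch below evaluates Cc and E for a single oral dose. It assumes the usual closed-form solution of a one-compartment model with first-order absorption, lag time and ka different from Cl/V; this is an illustration, not the implementation of the pkmodel macro, and all numerical values are hypothetical.

```python
import math

def cc_oral_1cpt(t, dose, Tlag, ka, V, Cl):
    """Concentration after a single oral dose given at time 0
    (assumed closed form, valid when ka != Cl/V)."""
    k = Cl / V
    tau = t - Tlag
    if tau <= 0.0:
        return 0.0  # nothing absorbed before the lag time
    return dose / V * ka / (ka - k) * (math.exp(-k * tau) - math.exp(-ka * tau))

def immediate_response(cc, Imax, IC50, S0):
    """E = S0 * (1 - Imax*Cc/(Cc+IC50)), as in the model file."""
    return S0 * (1.0 - Imax * cc / (cc + IC50))

# Hypothetical dose and parameter values
cc_5h = cc_oral_1cpt(5.0, dose=100.0, Tlag=1.0, ka=1.5, V=8.0, Cl=0.13)
e_5h = immediate_response(cc_5h, Imax=1.0, IC50=1.2, S0=100.0)
```

Because the response is immediate, E depends only on the concentration at the same time point: E equals S0 when Cc is zero and decreases as Cc increases.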

  • warfarin_PKPDeffect_project (data = ‘warfarin_data.txt’, model = ‘effectCompartment_model.txt’)

An effect compartment is defined in the model file effectCompartment_model.txt

[LONGITUDINAL]
input = {Tlag, ka, V, Cl, ke0, Imax, IC50, S0}

EQUATION:
{Cc, Ce} = pkmodel(Tlag, ka, V, Cl, ke0)
E = S0 * (1 - Imax*Ce/(Ce+IC50))

OUTPUT:
output = {Cc, E}

Ce is the concentration in the effect compartment.

  • warfarin_PKPDturnover_project (data = ‘warfarin_data.txt’, model = ‘turnover1_model.txt’)

An indirect response (turnover) model is defined in the model file turnover1_model.txt

[LONGITUDINAL]
input =  {Tlag, ka, V, Cl, Imax, IC50, Rin, kout}

EQUATION:
Cc = pkmodel(Tlag, ka, V, Cl)
E_0 = Rin/kout
ddt_E = Rin*(1-Imax*Cc/(Cc+IC50)) - kout*E

OUTPUT:
output = {Cc, E}
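The behaviour of the turnover equation can be explored with a simple numerical integration. The Python sketch below uses an explicit Euler scheme, which is a choice made for this illustration only (it is not the solver used by Monolix); the concentration profile and parameter values are hypothetical.

```python
def simulate_turnover(cc_of_t, Imax, IC50, Rin, kout, t_end, dt=0.01):
    """Integrate ddt_E = Rin*(1 - Imax*Cc/(Cc+IC50)) - kout*E
    with explicit Euler steps, starting from E_0 = Rin/kout."""
    E = Rin / kout
    t = 0.0
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        cc = cc_of_t(t)
        dE = Rin * (1.0 - Imax * cc / (cc + IC50)) - kout * E
        E += dt * dE
        t += dt
    return E

# Without drug, the response stays at its baseline Rin/kout
E_no_drug = simulate_turnover(lambda t: 0.0, Imax=1.0, IC50=1.0,
                              Rin=5.0, kout=0.05, t_end=100.0)
# With a constant concentration equal to IC50 and Imax = 1, the
# production rate is halved and E tends to half of the baseline
E_inhibited = simulate_turnover(lambda t: 1.0, Imax=1.0, IC50=1.0,
                                Rin=5.0, kout=0.05, t_end=500.0)
```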

Sequential PKPD modelling

In the sequential approach, a PK model is developed and parameters estimated in the first step. For a given PD model, different strategies are then possible for the second step, i.e., for estimating the population PD parameters:

Using estimated population PK parameters

  • warfarin_PKPDseq1_project (data = ‘warfarin_data.txt’, model = ‘turnover1_model.txt’)

Population PK parameters are set to their estimated values, but individual PK parameters are not assumed to be known: they are sampled from their conditional distributions at each SAEM iteration. In Monolix, this simply means changing the status of the population PK parameter values so that they are no longer used as initial estimates for SAEM but are considered fixed, as shown in the figure below.

To fix parameters, click on the green option button (framed in green) and choose the Fixed method, as shown in the figure below.

The joint PKPD model defined in turnover1_model.txt is again used with this project.

Using estimated individual PK parameters

  • warfarin_PKPDseq2_project (data = ‘warfarinSeq_data.txt’, model = ‘turnoverSeq_model.txt’)

Individual PK parameters are set to their estimated values and used as constants in the PKPD model for fitting the PD data. In this example, individual PK parameters (\psi_i) were estimated as the modes of the conditional distributions (p(\psi_i | y_i, \hat{\theta})). An additional column IGNORED OBSERVATION is necessary in the data file in order to ignore the PK data. For that, we set MDV=1 on the lines where YTYPE=1 (PK data), and MDV=0 on the lines where YTYPE=2 (PD data).
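When preparing such a data file outside Monolix, the flag can be derived from the observation type. The following Python sketch works on rows represented as dictionaries; the column names YTYPE and MDV follow the text above, while the function name and the example rows are hypothetical.

```python
def add_mdv(rows, pk_type=1):
    """Return copies of the rows with MDV=1 for PK lines (YTYPE == pk_type)
    and MDV=0 for the other lines."""
    return [dict(row, MDV=1 if row["YTYPE"] == pk_type else 0) for row in rows]

# Hypothetical extract of a data set
rows = [
    {"ID": 1, "TIME": 24, "DV": 9.2, "YTYPE": 1},
    {"ID": 1, "TIME": 24, "DV": 49.0, "YTYPE": 2},
]
flagged = add_mdv(rows)
```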


In addition, the estimated individual PK parameters (blue frame) are defined as regression variables, using the reserved keyword REGRESSOR. The covariates used for defining the distribution of the individual PK parameters are not mandatory as all the information is already in the individual parameters.
We use the same turnover model for the PD data. Here, the PK parameters are defined as regression variables (i.e. regressors).

[LONGITUDINAL]
input =  {Imax, IC50, Rin, kout, Tlag, ka, V, Cl}
Tlag  = {use = regressor}
ka    = {use = regressor}
V     = {use = regressor}
Cl    = {use = regressor}

EQUATION:
Cc = pkmodel(Tlag,ka,V,Cl)
E_0 = Rin/kout
ddt_E= Rin*(1-Imax*Cc/(Cc+IC50)) - kout*E

OUTPUT:
output = E

As you can see, the names of the regressors do not match the parameter names. The regressors are matched by order (not by name) between the data set and the model input statement.

Fitting a PKPD model to the PD data only

  • warfarinPD_project (data = ‘warfarinPD_data.txt’, model = ‘turnoverPD_model.txt’)

In this example, only PD data are available. Nevertheless, a PKPD model can still be used to fit these data: only the effect E is defined as a prediction in the OUTPUT section.

[LONGITUDINAL]
input =  {Tlag, ka, V, Cl, Imax, IC50, Rin, kout}

EQUATION:
Cc = pkmodel(Tlag, ka, V, Cl)
E_0 = Rin/kout
ddt_E = Rin*(1-Imax*Cc/(Cc+IC50)) - kout*E

OUTPUT:
output = E

Case studies

  • 8.case_studies/PKVK_project (data = ‘PKVK_data.txt’, model = ‘PKVK_model.txt’)
  • 8.case_studies/hiv_project (data = ‘hiv_data.txt’, model = ‘hivLatent_model.txt’)

2.5.2.Joint models for non continuous outcomes


Objectives: learn how to implement a joint model for continuous and non continuous data.


Projects: warfarin_cat_project, PKcount_project, PKrtte_project


Joint model for continuous PK and categorical PD data

  • warfarin_cat_project (data = ‘warfarin_cat_data.txt’, model = ‘PKcategorical1_model.txt’)

In this example, the original continuous PD data has been recoded as 1 (Low), 2 (Medium) and 3 (High).

More details about the data

International Normalized Ratio (INR) values are commonly used in clinical practice to target optimal warfarin therapy. Low INR values (<2) are associated with high blood clot risk and high ones (>3) with high risk of bleeding, so the targeted value of INR, corresponding to optimal therapy, is between 2 and 3.

Prothrombin complex activity is inversely proportional to the INR. We can therefore associate the three ordered categories for the INR to three ordered categories for PCA: Low PCA values if PCA is less than 33% (corresponding to INR>3), medium if PCA is between 33% and 50% (INR between 2 and 3) and high if PCA is more than 50% (INR<2).

The column dv contains both the PK and the new categorized PD measurements. Instead of modeling the original continuous PD data, we can model the probabilities of each of these categories, which have direct clinical interpretations. The model is still a joint PKPD model since this probability distribution is expected to depend on exposure, i.e., the plasmatic concentration predicted by the PK model. We introduce an effect compartment to mimic the effect delay. Let y_{ij}^{(2)} be the PCA level for patient i at time t_{ij}^{(2)}. We can then use a proportional odds model for modeling this categorical data:

$$\begin{array}{ccl}\text{logit} \left(\mathbb{P}(y_{ij}^{(2)} \leq 1 | \psi_i)\right) &= &\alpha_{i} + \beta_{i} C_e(t_{ij}^{(2)},\phi_i^{(1)}) \\ \text{logit} \left(\mathbb{P}(y_{ij}^{(2)} \leq 2 | \psi_i)\right) &=& \alpha_{i} + \gamma_{i} + \beta_{i} C_e(t_{ij}^{(2)},\phi_i^{(1)}) \\ \mathbb{P}(y_{ij}^{(2)} \leq 3 | \psi_i) &= & 1,\end{array}$$

where C_e(t,\phi_i^{(1)}) is the predicted concentration of warfarin in the effect compartment at time t for patient i with PK parameters \phi_i^{(1)}. This model defines a probability distribution for y_{ij}^{(2)} if \gamma_i\geq 0.
If \beta_i>0, the probability of low PCA at time t_{ij}^{(2)} (y_{ij}^{(2)}=1) increases along with the predicted concentration C_e(t_{ij}^{(2)},\phi_i^{(1)}). The joint model is implemented in the model file PKcategorical1_model.txt

[LONGITUDINAL]
input = {Tlag, ka, V, Cl, ke0, alpha, beta, gamma}

EQUATION:
{Cc,Ce} = pkmodel(Tlag,ka,V,Cl,ke0)
lp1 = alpha + beta*Ce
lp2 = lp1+ gamma         ; gamma >= 0

DEFINITION:
Level = {type=categorical, categories={1,2,3}
    logit(P(Level<=1)) = lp1
    logit(P(Level<=2)) = lp2
}
OUTPUT:
output = {Cc, Level}

See Categorical data model for more details about categorical data models.
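The proportional odds model defines cumulative probabilities; the probability of each category is obtained by difference. The Python sketch below illustrates this step for a given effect-compartment concentration. The function names and the parameter values are assumptions made for the example.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def level_probabilities(ce, alpha, beta, gamma):
    """Return (P(Level=1), P(Level=2), P(Level=3)) for a concentration ce."""
    if gamma < 0.0:
        raise ValueError("gamma must be >= 0 for the model to be valid")
    lp1 = alpha + beta * ce
    lp2 = lp1 + gamma
    cum1 = inv_logit(lp1)  # P(Level <= 1)
    cum2 = inv_logit(lp2)  # P(Level <= 2)
    return cum1, cum2 - cum1, 1.0 - cum2

# Hypothetical parameter values, low and high concentration
probs_low = level_probabilities(0.5, alpha=-3.0, beta=1.0, gamma=2.0)
probs_high = level_probabilities(5.0, alpha=-3.0, beta=1.0, gamma=2.0)
```

The constraint gamma >= 0 ensures that P(Level<=2) is at least P(Level<=1), so that no category gets a negative probability.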

Joint model for continuous PK and count PD data

  • PKcount_project (data = ‘PKcount_data.txt’, model = ‘PKcount1_model.txt’)

The data file used for this project is PKcount_data.txt, where the PK and the count PD data are simulated data. We use a Poisson distribution for the count data, assuming that the Poisson parameter is a function of the predicted concentration. For any individual i, we have

$$\lambda_i(t) = \lambda_{0,i} \left( 1 - \frac{Cc_i(t)}{Cc_i(t) + IC50_i} \right)$$

where Cc_i(t) is the predicted concentration for individual i at time t and

$$ \log\left(P(y_{ij}^{(2)} = k)\right) = -\lambda_i(t_{ij}) + k\,\log(\lambda_i(t_{ij})) - \log(k!)$$

The joint model is implemented in the model file PKcount1_model.txt

[LONGITUDINAL]
input = {ka, V, Cl, lambda0, IC50} 

EQUATION:
Cc = pkmodel(ka,V,Cl)
lambda=lambda0*(1 - Cc/(IC50+Cc)) 

DEFINITION:
Seizure = {type = count, 
           log(P(Seizure=k)) = -lambda + k*log(lambda) - factln(k)
}

OUTPUT:
output = {Cc,Seizure}

See Count data model for more details about count data models.
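The two equations of the count model can be reproduced in a few lines of Python. In this sketch, `math.lgamma(k + 1)` plays the role of factln(k), i.e. log(k!); the function names and the numerical values are hypothetical.

```python
import math

def poisson_rate(cc, lambda0, IC50):
    """lambda = lambda0 * (1 - Cc/(IC50 + Cc))."""
    return lambda0 * (1.0 - cc / (IC50 + cc))

def log_pmf(k, lam):
    """log P(y = k) for a Poisson distribution with parameter lam > 0."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)

# Hypothetical values: when Cc equals IC50, the rate is half of lambda0
lam = poisson_rate(cc=2.0, lambda0=10.0, IC50=2.0)
total_probability = sum(math.exp(log_pmf(k, lam)) for k in range(100))
```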

Joint model for continuous PK and time-to-event data

  • PKrtte_project (data = ‘PKrtte_data.txt’, model = ‘PKrtteWeibull1_model.txt’)

The data file used for this project is PKrtte_data.txt, where the PK and the time-to-event data are simulated data. We use a Weibull model for the time-to-event data, assuming that the baseline hazard is a function of the predicted concentration. For any individual i, we define the hazard function as

$$h_i(t) = \gamma_{i}  Cc_i(t)  t^{\beta-1}$$

where Cc_i(t) is the predicted concentration for individual i at time t. The joint model is implemented in the model file PKrtteWeibull1_model.txt

[LONGITUDINAL]
input  = {ka, V, Cl, gamma, beta}  

EQUATION:
Cc = pkmodel(ka, V, Cl)
if t<0.1
  haz = 0
else
  haz = gamma*Cc*(t^(beta-1))
end

DEFINITION:
Hemorrhaging  = {type=event, hazard=haz}

OUTPUT:
output = {Cc, Hemorrhaging}

See Time-to-event data model for more details about time-to-event data models.
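To relate the hazard to the probability of remaining event-free, the Python sketch below evaluates the hazard defined in the model and integrates it numerically with the trapezoidal rule, using the standard relation S(t) = exp(-cumulative hazard). The concentration profile, the integration scheme and the values are assumptions made for the illustration.

```python
import math

def hazard(t, cc, gamma, beta):
    """Hazard as in the model file: zero before t = 0.1."""
    if t < 0.1:
        return 0.0
    return gamma * cc * t ** (beta - 1.0)

def survival(t_end, cc_of_t, gamma, beta, n=10000):
    """exp(-integral of the hazard from 0 to t_end), trapezoidal rule."""
    dt = t_end / n
    cumulative = 0.0
    for i in range(n):
        t0 = i * dt
        t1 = t0 + dt
        h0 = hazard(t0, cc_of_t(t0), gamma, beta)
        h1 = hazard(t1, cc_of_t(t1), gamma, beta)
        cumulative += 0.5 * dt * (h0 + h1)
    return math.exp(-cumulative)

# Hypothetical case: constant concentration and beta = 1 (constant hazard)
S_10 = survival(10.0, lambda t: 1.0, gamma=0.2, beta=1.0)
```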

2.6.Models for the individual parameters

2.6.1.Model for the individual parameters: introduction

A model for the observations depends on a vector of individual parameters \psi_i. As we want to work with a population approach, we now suppose that \psi_i comes from some probability distribution p_{{\psi_i}}.

In this section, we are interested in the implementation of individual parameter distributions (p_{{\psi_i}}, 1\leq i \leq N). Generally speaking, we assume that individuals are independent. This means that in the following analysis, it is sufficient to take a closer look at the distribution p_{{\psi_i}} of a unique individual i. The distribution p_{{\psi_i}} plays a fundamental role since it describes the inter-individual variability of the individual parameter \psi_i.
In Monolix, we consider that some transformation of the individual parameters is normally distributed and is a linear function of the covariates:

h(\psi_i) = h(\psi_{\rm pop})+ \beta \cdot ({c}_i - {c}_{\rm pop}) + \eta_i \,, \quad \eta_i \sim {\cal N}(0,\Omega).

This model gives a clear and easily interpreted decomposition of the variability of h(\psi_i) around h(\psi_{\rm pop}), i.e., of \psi_i around \psi_{\rm pop}:

  • The component \beta \cdot ({c}_i - {c}_{\rm pop}) describes part of this variability by way of covariates {c}_i that fluctuate around a typical value {c}_{\rm pop}.
  • The random component \eta_i describes the remaining variability, i.e., variability between subjects that have the same covariate values.

By definition, a mixed effects model combines these two components: fixed and random effects. In linear covariate models, these two effects combine additively. In the present context, the vector of population parameters to estimate is \theta = (\psi_{\rm pop},\beta,\Omega). Several extensions of this basic model are possible:

  • We can suppose for instance that the individual parameters of a given individual can fluctuate over time. Assuming that the parameter values remain constant over some periods of time called occasions, the model needs to be able to describe the inter-occasion variability (IOV) of the individual parameters.
  • If we assume that the population consists of several homogeneous subpopulations, a straightforward extension of mixed effects models is a finite mixture of mixed effects models, assuming for instance that the distribution p_{{\psi_i}} is a mixture of distributions.

2.6.2.Probability distribution of the individual parameters


Objectives: learn how to define the probability distribution and the correlation structure of the individual parameters.


Projects: warfarin_distribution1_project, warfarin_distribution2_project, warfarin_distribution3_project, warfarin_distribution4_project


Introduction

One way to extend the use of Gaussian distributions is to consider that some transformation of the parameters in which we are interested is Gaussian, i.e., assume the existence of a monotonic function \(h\) such that \(h(\psi)\) is normally distributed. Then, there exists some \(\omega\) such that, for each individual i:

\(h(\psi_i) \sim {\cal N}(h(\bar{\psi}_i), \omega^2)\)

where \(\bar{\psi}_i\) is the predicted value of \(\psi_i\). In this section, we consider models for the individual parameters without any covariate. Then, the predicted value of \(\psi_i\) is \(\bar{\psi}_i = \psi_{\rm pop}\) and

\(h(\psi_i) \sim {\cal N}(h(\psi_{pop}), \omega^2)\)

The transformation \(h\) defines the distribution of \(\psi_i\). Some predefined distributions/transformations are available in Monolix:

  • Normal distribution:

In that case, \(h(\psi_i) = \psi_i\).
Remark: the two mathematical representations for normal distributions are equivalent:

\( \psi_i \sim {\cal N}(\bar{\psi}_{i}, \omega^2) ~~\Leftrightarrow~~ \psi_i = \bar{\psi}_i + \eta_i, ~~\text{where}~~\eta_i \sim {\cal N}(0,\omega^2).\)

  • Log-normal distribution:

In that case, \(h(\psi_i) = \log(\psi_i)\). A log-normally distributed random variable takes positive values only. A log-normal distribution looks like a normal distribution for a small variance \(\omega^2\). On the other hand, the asymmetry of the distribution increases when \(\omega^2\) increases.
Remark: the two mathematical representations for log-normal distributions are equivalent:

\(\log(\psi_i) \sim {\cal N}(\log(\bar{\psi}_{i}), \omega^2) ~~\Leftrightarrow~~ \psi_i = \bar{\psi}_i e^{\eta_i}, ~~\text{where}~~\eta_i \sim {\cal N}(0,\omega^2).\)

  • Logit-normal distribution:

In that case, \(h(\psi_i) = \log\left(\frac{\psi_i}{1-\psi_i}\right)\). A random variable \(\psi_i\) with a logit-normal distribution takes its values in ]0,1[. The logit of \(\psi_i\) is normally distributed, i.e.,

\(\text{logit}(\psi_i) = \log \left(\frac{\psi_i}{1-\psi_i}\right) \ \sim \ \ {\cal N}( \text{logit}(\bar{\psi}_i), \omega^2).\)

  • Probit-normal distribution:

The probit function is the inverse cumulative distribution function (quantile function) \(\Phi^{-1}\) associated with the standard normal distribution \({\cal N}(0,1)\). A random variable \(\psi\) with a probit-normal distribution also takes its values in ]0,1[.

\(\text{probit}(\psi_i) = \Phi^{-1}(\psi_i) \ \sim \ {\cal N}( \Phi^{-1}(\bar{\psi}_i), \omega^2) .\)

To choose one of these distributions in the GUI, click on the distribution corresponding to the parameter you want to change in the individual model part and select the desired distribution.

Remarks:

  1. If you change your distribution and your population parameter is not valid, then an error message is thrown. For instance, when you want to change your distribution to a logit or a probit distribution, typically for a bioavailability, make sure the associated population parameter is strictly between 0 and 1.
  2. When creating a project, the default proposed distribution is lognormal.
  3. Logit and probit transformations can be generalized to any interval (a,b) by setting \( \psi_{(a,b)} = a + (b-a)\psi_{(0,1)}\) where \(\psi_{(0,1)}\) is a random variable that takes values in (0,1) with a logit-normal (or probit-normal) distribution. Thus, if you need to have bounds between a and b, you need to modify your structural model to reshape a parameter between 0 and 1 and use a logit or a probit distribution. Examples are shown on this page.
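The rescaling described in the last remark can be sketched in Python as follows. The function name `to_interval` and the numerical values are hypothetical; in a Monolix project the rescaling itself would be written in the structural model.

```python
import math

def to_interval(eta, psi_pop_01, a, b):
    """Map a random effect eta to a parameter in (a, b).

    psi_pop_01 is the typical value on the (0, 1) scale; the parameter
    on (0, 1) is logit-normally distributed around it.
    """
    logit_pop = math.log(psi_pop_01 / (1.0 - psi_pop_01))
    psi_01 = 1.0 / (1.0 + math.exp(-(logit_pop + eta)))
    return a + (b - a) * psi_01

# Hypothetical bounds (2, 10) and typical value 0.3 on the (0, 1) scale
typical = to_interval(0.0, psi_pop_01=0.3, a=2.0, b=10.0)
values = [to_interval(eta, 0.3, 2.0, 10.0) for eta in (-5.0, -1.0, 1.0, 5.0)]
```

Whatever the value of the random effect, the resulting parameter stays strictly between a and b.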

 

Marginal distributions of the individual parameters

  • warfarin_distribution1_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

We use the warfarin PK example here. The four PK parameters Tlag, ka, V and Cl are log-normally distributed, so the LOGNORMAL distribution is selected for each of them in the main Monolix graphical user interface:

The distribution of the 4 PK parameters defined in the MonolixGUI is automatically translated into Mlxtran in the project file:

[INDIVIDUAL]
input = {Tlag_pop, omega_Tlag, ka_pop, omega_ka, V_pop, omega_V, Cl_pop, omega_Cl}
DEFINITION:
Tlag = {distribution=lognormal, typical=Tlag_pop, sd=omega_Tlag}
ka = {distribution=lognormal, typical=ka_pop, sd=omega_ka}
V = {distribution=lognormal, typical=V_pop, sd=omega_V}
Cl = {distribution=lognormal, typical=Cl_pop, sd=omega_Cl}

Estimated parameters are the parameters of the 4 log-normal distributions and the parameters of the residual error model:


Here, \(V_{\rm pop} = 7.94\) and \(\omega_V=0.326\) means that the estimated population distribution for the volume is: \(\log(V_i) \sim {\cal N}(\log(7.94) , 0.326^2)\) or, equivalently, \(V_i = 7.94 e^{\eta_i}\) where \(\eta_i \sim {\cal N}(0,0.326^2)\).
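The summary statistics implied by these two estimates follow from the standard formulas for a log-normal distribution (median = V_pop, mean = V_pop·exp(ω²/2), CV = sqrt(exp(ω²) − 1)). A minimal Python check, with variable names chosen for the example:

```python
import math

V_pop, omega_V = 7.94, 0.326

# For log(V_i) ~ N(log(V_pop), omega_V^2):
median_V = V_pop                                  # exp of the mean of log(V_i)
mean_V = V_pop * math.exp(omega_V ** 2 / 2.0)     # larger than the median
cv_V = math.sqrt(math.exp(omega_V ** 2) - 1.0)    # coefficient of variation
```

The mean is about 8.37, and the coefficient of variation is close to omega_V, which is why the standard deviation of the random effect is often read as an approximate CV when it is small.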

Remarks:

  • \(V_{\rm pop} = 7.94\) is not the population mean of the distribution of \(V_i\), but the median of this distribution (in that case, the mean value is \(7.94 \, e^{0.326^2/2} \approx 8.37\)). The same property holds for the 3 other distributions, which are not Gaussian. The four probability distribution functions are displayed in the figure Parameter distributions.
  • Here, standard deviations \(\omega_{Tlag}\), \(\omega_{ka}\), \(\omega_V\) and \(\omega_{Cl}\) are approximately the coefficients of variation (CV) of Tlag, ka, V and Cl since these 4 parameters are log-normally distributed with variances < 1.
  • warfarin_distribution2_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

Other distributions for the PK parameters are used in this project:

  • NORMAL for Tlag (we fix the population value \(Tlag_{\text{pop}}\) to 1.5 and the standard deviation \(\omega_{\rm Tlag}\) to 1),
  • NORMAL for ka,
  • NORMAL for V,
  • and LOGNORMAL for Cl

Estimated parameters are the parameters of the 4 transformed normal distributions and the parameters of the residual error model:

Here, \( Tlag_{\rm pop} = 1.5\) and \(\omega_{Tlag}=1\) means that \(Tlag_i \sim {\cal N}(1.5, 1^2)\) while \(Cl_{\rm pop} = 0.133\) and \(\omega_{Cl}=0.29\) means that \(\log(Cl_i) \sim {\cal N}(\log(0.133), 0.29^2)\).
The four probability distribution functions are displayed in the figure Parameter distributions:

 

 

Correlation structure of the random effects






Dependency can be introduced between individual parameters by supposing that the random effects \(\eta_i\) are not independent. This means considering them to be linearly correlated.

  • warfarin_distribution3_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

 

Defining correlation between random effects in the interface

 

To introduce correlations between random effects in Monolix, one can define correlation groups. For example, two correlation groups are defined on the interface below, between \(\eta_{V,i}\) and \(\eta_{Cl,i}\) (#1 in that case) and between \(\eta_{Tlag,i}\) and \(\eta_{ka,i}\) in another group (#2 in that case):

To define a correlation between the random effects of V and Cl, you just have to click on the check boxes of the correlation for those two parameters. If you want to define a correlation between the random effects of ka and Tlag independently of the first correlation group, click on the + next to CORRELATION to define a second group and click on the check boxes corresponding to the parameters ka and Tlag under the correlation group #2. Notice that, as the random effects of Cl and V are already in the correlation group #1, these random effects cannot be used in another correlation group. When three or more parameters are included in a correlation group, all pairwise correlations will be estimated. It is for instance not possible to estimate the correlation between \(\eta_{ka,i}\) and \(\eta_{V,i}\) and between \(\eta_{Cl,i}\) and \(\eta_{V,i}\) but not between \(\eta_{Cl,i}\) and \(\eta_{ka,i}\).

It is important to mention that the estimated correlations are not the correlation between the individual parameters (between \(Tlag_i\) and \(ka_i\), and between \(V_i\) and \(Cl_i\)) but the (linear) correlation between the random effects (between \(\eta_{Tlag,i}\) and \(\eta_{ka,i}\), and between \(\eta_{V,i}\) and \(\eta_{Cl,i}\)  respectively).

Remarks

  • If the box is greyed, it means that the associated random effects cannot be used in a correlation group, as in the following cases:
    • when the parameter has no random effects
    • when the random effect of the parameter is already used in another correlation group
  • There is no limit on the number of parameters in a correlation group.
  • You can have a look in the FORMULA to have a recap of all correlations.
  • In case of inter-occasion variability, you can define the correlation group for each level of variability independently.
  • The initial value for the correlations is zero and cannot be changed.
  • The correlation value cannot be fixed.

Estimated population parameters now include these 2 correlations:

Notice that the high uncertainty on \(\text{corr_ka_Tlag}\) suggests that the correlation between \(\eta_{Tlag,i}\) and \(\eta_{ka,i}\) is not reliable.

 

How to decide to include correlations between random effects?

 

The scatterplots of the random effects can hint at correlations to include in the model. This plot represents the joint empirical distributions of each pair of random effects. The regression line (in pink below) and the correlation coefficient (“information” toggle in the settings) make it possible to visually detect trends. If “conditional distribution” (default) is chosen in the display settings, the displayed random effects are calculated using individual parameters sampled from the conditional distribution, which avoids spurious correlations (see the page on shrinkage for more details). If a large correlation is present between a pair of random effects, this correlation can be added to the model in order to be estimated as a population parameter.

Depending on the number of random effect values used to calculate the correlation coefficient, the same correlation value can be more or less significant. To help the user identify significant correlations, Pearson’s correlation tests are performed in the “Result” tab, “Tests” section. If no significant correlation is found, like for the pair \(\eta_{Tlag}\) and \(\eta_{Cl}\) below, the distributions can be assumed to be independent. However, if a significant correlation appears, like for the pair \(\eta_V\) and \(\eta_{Cl}\) below, it can be hypothesized that the distributions are not independent and that the correlation should be included in the model and estimated. Once the correlation is included in the model, the random effects for \(V\) and \(Cl\) are drawn from the joint distribution rather than from two independent distributions.
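The dependence of significance on the sample size can be illustrated with the usual t statistic associated with Pearson's correlation coefficient, t = r·sqrt((n − 2)/(1 − r²)). The Python sketch below is a generic illustration with hypothetical values; it does not reproduce the exact computations made by Monolix.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r = 0."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Same correlation coefficient, two hypothetical sample sizes
t_small = t_statistic(0.4, 20)
t_large = t_statistic(0.4, 100)
```

For the same value of r, the statistic grows with the number of values, so a correlation of 0.4 is stronger evidence with 100 values than with 20.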

 

How do the correlations between random effects affect the individual model?

 

In this example the model has four parameters Tlag, ka, V and Cl. Without correlation, the individual model is:
\(log(Tlag) = log(Tlag_{pop}) + \eta_{Tlag}\)
\(log(ka) = log(ka_{pop}) + \eta_{ka}\)
\(log(V) = log(V_{pop}) + \eta_V\)
\(log(Cl) = log(Cl_{pop}) + \eta_{Cl}\)

The random effects follow normal distributions: \((\eta_{Tlag,i},\eta_{ka,i},\eta_{V,i},\eta_{Cl,i}) \sim \mathcal{N}(0,\Omega)\)
\(\Omega\) is the variance-covariance matrix defining the distributions of the vectors of random effects, here:

\(\Omega = \begin{pmatrix} \omega_{Tlag}^2 & 0 & 0 & 0 \\ 0 & \omega_{ka}^2 & 0 & 0 \\ 0 & 0 & \omega_V^2 & 0 \\ 0 & 0 & 0 & \omega_{Cl}^2 \end{pmatrix}\)

In this example, two correlations between \(\eta_{Tlag}\) and \(\eta_{ka}\) and between \(\eta_{V}\) and \(\eta_{Cl}\) are added to the model. They are defined with two population parameters called \(\text{corr_Tlag_ka}\) and \(\text{corr_V_Cl}\) that appear in the variance-covariance matrix. So the only difference in the individual model is in \(\Omega\), that is now:

\(\Omega = \begin{pmatrix} \omega_{Tlag}^2 & \omega_{Tlag} \omega_{ka} \text{corr_Tlag_ka} & 0 & 0 \\ \omega_{Tlag} \omega_{ka} \text{corr_Tlag_ka} & \omega_{ka}^2 & 0 & 0 \\ 0 & 0 & \omega_V^2 & \omega_{V} \omega_{Cl} \text{corr_V_Cl} \\ 0 & 0 & \omega_{V} \omega_{Cl} \text{corr_V_Cl} & \omega_{Cl}^2 \end{pmatrix}\)

In the result folder, the estimated variance-covariance matrix \(\Omega\) is saved, as well as the corresponding correlation matrix, whose entries are computed as:

$$\text{corr}(\theta_i,\theta_j)=\frac{\text{covar}(\theta_i,\theta_j)}{\sqrt{\text{var}(\theta_i)}\sqrt{\text{var}(\theta_j)}}$$
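The construction of the block-structured matrix and the conversion back to correlations can be sketched in Python. The function names, the standard deviations and the correlation values below are hypothetical and only illustrate the formulas.

```python
import math

def build_omega(names, sd, corr):
    """Variance-covariance matrix from standard deviations and correlations.

    sd maps a parameter name to its omega; corr maps a pair of names to
    their correlation. Pairs that are not listed have zero covariance.
    """
    n = len(names)
    omega = [[0.0] * n for _ in range(n)]
    for i, name in enumerate(names):
        omega[i][i] = sd[name] ** 2
    for (a, b), r in corr.items():
        i, j = names.index(a), names.index(b)
        omega[i][j] = omega[j][i] = r * sd[a] * sd[b]
    return omega

def correlation(omega, i, j):
    """corr = covariance / (sd_i * sd_j)."""
    return omega[i][j] / (math.sqrt(omega[i][i]) * math.sqrt(omega[j][j]))

names = ["Tlag", "ka", "V", "Cl"]
sd = {"Tlag": 0.6, "ka": 0.8, "V": 0.3, "Cl": 0.3}   # hypothetical omegas
corr = {("Tlag", "ka"): 0.2, ("V", "Cl"): 0.4}       # two correlation groups
omega = build_omega(names, sd, corr)
```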

Why should the correlation be estimated as part of the population parameters?

The effect of correlations is especially important when simulating parameters from the model. This is the case in the VPC or when simulating new individuals in Simulx to assess the outcome of a different dosing scenario, for instance. If in reality individuals with a large distribution volume also have a large clearance (i.e., there is a positive correlation between the random effects of the volume and the clearance), but this correlation has not been included in the model, then the concentrations predicted by the model for a new cohort of individuals will display a larger variability than they would in reality.

How do the EBEs change after having included correlation in the model?

Before adding correlation in the model, the EBEs or the individual parameters sampled from the conditional distribution may already be correlated, as can be seen in the “correlation between random effects” plot. This is because the individual parameters (EBEs or sampled) are based on the individual conditional distributions, which take into account the information given by the data. Especially when the data is rich, the data can indicate that individuals with a large volume of distribution also have a large clearance, even if this correlation is not yet included in the model.

Including the correlation in the model as a population parameter to estimate allows its value to be estimated precisely. Usually, one can see a stronger correlation for the corresponding pair of random effects when the correlation is included in the model compared to when it is not. In this example, after including the correlations in the individual model, the joint distribution of \(\eta_{V}\) and \(\eta_{Cl}\) displays a higher correlation coefficient (0.439 compared to 0.375 previously):

  • warfarin_distribution4_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

In this example, \(Tlag_i\) does not vary in the population, which means that \(\eta_{Tlag,i}=0\) for all subjects i, while the three other random effects are correlated:

Estimated population parameters now include the 3 correlations between \(\eta_{ka,i}\), \(\eta_{V,i}\) and \(\eta_{Cl,i}\) :

2.6.3.Model for individual covariates

 


Objectives: learn how to implement a model for continuous and/or categorical covariates.


Projects: warfarin_covariate1_project, warfarin_covariate2_project, warfarin_covariate3_project, phenobarbital_project







Model with continuous covariates

  • warfarin_covariate1_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

The warfarin data contains 2 individual covariates: weight, which is a continuous covariate, and sex, which is a categorical covariate with 2 categories (1=Male, 0=Female). We can ignore these columns if we are sure not to use them, or declare them using the reserved keywords CONTINUOUS COVARIATE and CATEGORICAL COVARIATE, respectively.

Even if these 2 covariates are now available, we can choose to define a model without any covariate by not clicking on any check box in the covariate model.

Here, an unchecked box in the line of the parameter V and the column of the covariate wt means that there is no relationship between weight and volume in the model. A diagnostic plot Individual parameters vs covariates is generated, which displays possible relationships between covariates and individual parameters (even if these covariates are not used in the model):

On the figure, we can see a strong correlation between the volume V and both the weight wt and the sex. One can also see a correlation between the clearance and the weight wt. Therefore, the next step is to add covariates to our model.

  • warfarin_covariate2_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

We decide to use the weight in this project in order to explain part of the variability of \(V_i\) and \(Cl_i\). We will implement the following model for these two parameters:

$$\log(V_i) = \log(V_{\rm pop}) + \beta_V \log(w_i/70) + \eta_{V,i} ~~\text{and}~~\log(Cl_i) = \log(Cl_{\rm pop}) + \beta_{Cl} \log(w_i/70) + \eta_{Cl,i}$$

which means that the population values \(V_{\rm pop}\) and \(Cl_{\rm pop}\) are defined for a typical individual of the population with weight = 70kg.

More details about the model
The model for \(V_{i}\) and \(Cl_{i}\) can be equivalently written as follows:

$$ V_i = V_{\rm pop} ( w_i/70 )^{\beta_V} e^{ \eta_{V,i} } ~~\text{and}~~ Cl_i = Cl_{\rm pop} ( w_i/70 )^{\beta_{Cl}} e^{ \eta_{Cl,i} }$$

The individual predicted values for \(V_i\) and \(Cl_i\) are therefore

$$\bar{V}_i = V_{\rm pop} \left( w_i/70 \right)^{\beta_V} ~~\text{and}~~ \bar{Cl}_i = Cl_{\rm pop} \left( w_i/70 \right)^{\beta_{Cl}} $$

and the statistical model describes how \(V_i\) and \(Cl_i\) are distributed around these predicted values:

$$ \log(V_i)  \sim {\cal N}( \log(\bar{V}_i) , \omega^2_V) ~~\text{and}~~\log(Cl_i) \sim {\cal N}( \log(\bar{Cl}_i) , \omega^2_{Cl}) $$
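The equivalence between the log-linear form and the power form of this covariate model can be checked numerically. The following Python lines are only an illustrative sketch and are not part of Monolix: the function names and the numerical values of \(V_{\rm pop}\) and \(\beta_V\) are assumptions chosen for the example.

```python
import math

REF_WEIGHT = 70.0  # reference weight in kg, as in the model above

def log_form(pop_value, beta, weight, eta):
    """Log-linear form: log(psi_i) = log(psi_pop) + beta*log(w_i/70) + eta_i."""
    log_psi = math.log(pop_value) + beta * math.log(weight / REF_WEIGHT) + eta
    return math.exp(log_psi)

def power_form(pop_value, beta, weight, eta):
    """Equivalent power form: psi_i = psi_pop * (w_i/70)^beta * exp(eta_i)."""
    return pop_value * (weight / REF_WEIGHT) ** beta * math.exp(eta)

# Hypothetical values, chosen only for illustration.
V_pop, beta_V = 8.0, 0.9
for w in (50.0, 70.0, 90.0):
    # With eta = 0, both forms give the predicted value for this weight.
    print(w, log_form(V_pop, beta_V, w, 0.0), power_form(V_pop, beta_V, w, 0.0))
```

For a weight of 70 kg the covariate term vanishes, so the predicted volume is \(V_{\rm pop}\) itself.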

Here, \(\log(V_i)\) and \(\log(Cl_i)\) are linear functions of \(\log(w_i/70)\): we therefore first need to transform the original covariate \(w_i\) into \(\log(w_i/70)\) by clicking on the button CONTINUOUS next to ADD COVARIATE (blue button). Then, the following pop-up window appears

You have to

  • define the name of the covariate you want to add (the blue frame),
  • define the associated equation (the green frame),
  • click on the ACCEPT button.

Remarks

  • You can define any formula for your covariate as long as you use mathematical functions available in the Mlxtran language.
  • You can use any covariate available in the list of covariates proposed in the window. Thus, if you have a Height and Weight as covariates, you can directly compute the Body Mass Index.
  • If you hover over a covariate with your mouse, its summary information (min, mean, median, and max) is displayed as a tooltip.
  • If you click on the covariate name, it will be written in the formula.

We then define a new covariate model, where \(\log(V_i)\) and \(\log(Cl_i)\) are linear functions of the transformed weight \(lw70_i\) as shown on the following figure:

Notice that clicking on the button FORMULA displays all the individual model equations. Coefficients \(\beta_{V}\) and \(\beta_{Cl}\) are now estimated with their s.e., and the p-values of the Wald tests are derived to test whether these coefficients are different from 0.
Again, a diagnostic plot Individual parameters vs covariates is generated, which displays possible relationships between covariates and individual parameters (even if these covariates are not used in the model), as one can see on the figure below on the left. However, as there are covariates in the model, what is interesting is to see whether there is still a correlation between the covariates and the random effects, as one can see on the figure below on the right.

 

Model with categorical covariates

  • warfarin_covariate3_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

We use sex instead of weight in this project, assuming different population values of volume and clearance for males and females. More precisely, we consider the following model for \(V_i\) and \(Cl_i\):

$$\log(V_i) = \log(V_{\rm pop}) + \beta_V 1_{sex_i=F} + \eta_{V,i}~~\text{and}~~\log(Cl_i) = \log(Cl_{\rm pop}) + \beta_{Cl} 1_{sex_i=F} + \eta_{Cl,i}$$

where \(1_{sex_i=F} =1\) if individual i is a female and 0 otherwise. Then, \(V_{\rm pop}\) and \(Cl_{\rm pop}\) are the population volume and clearance for males while \(V_{\rm pop} e^{\beta_V}\) and \(Cl_{\rm pop} e^{\beta_{Cl}}\) are the population volume and clearance for females. By clicking on the purple button DISCRETE, the following window pops up

You have to

  • define the name of the covariate you want to add (the blue frame),
  • define the associated categories (the green frame),
  • click on the ALLOCATE button to define all the categories.

Then, you can

  • define the name of the categories (the blue frame),
  • define the reference category (the green frame),
  • click on ACCEPT.

Define then the covariate model in the main GUI:

Estimated population parameters, including the coefficients \(\beta_V\) and \(\beta_{Cl}\) are displayed with the results:

We can display the probability distribution functions of the 4 PK parameters using the Individual parameter graphic:

Notice that for the volume and the clearance, the theoretical curve is not a “pure” Gaussian law, due to the impact of the covariate sex.
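One way to see why the curve is not Gaussian: with the model above, \(\log(V_i)\) is normally distributed around \(\log(V_{\rm pop})\) for males and around \(\log(V_{\rm pop}) + \beta_V\) for females, so over the whole population it follows a mixture of two Gaussian laws. The Python sketch below illustrates this idea only. It is not the computation performed by Monolix, and the function names, the proportion of females and all numerical values are assumptions.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def log_v_density(x, v_pop, beta_v, omega_v, frac_female):
    """Density of log(V) over the population: a two-component Gaussian mixture."""
    male = normal_pdf(x, math.log(v_pop), omega_v)
    female = normal_pdf(x, math.log(v_pop) + beta_v, omega_v)
    return (1.0 - frac_female) * male + frac_female * female

# Hypothetical values, chosen only for illustration.
for x in (1.6, 1.8, 2.0, 2.2):
    print(x, log_v_density(x, v_pop=8.0, beta_v=-0.3, omega_v=0.15, frac_female=0.4))
```

When \(\beta_V = 0\) the two components coincide and the density reduces to a single Gaussian law.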

 

Transforming categorical covariates

  • phenobarbital_project (data = ‘phenobarbital_data.txt’, model = ‘lib:bolus_1cpt_Vk.txt’)

The phenobarbital data contains 2 covariates: the weight and the APGAR score which is considered as a categorical covariate. Instead of using the 10 original levels of the APGAR score, we will transform this categorical covariate and create 3 categories: Low = {1,2,3}, Medium = {4, 5, 6, 7} and High={8,9,10}.
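The grouping of the original levels can be written as a simple mapping. The Python sketch below only illustrates the transformation described above (in Monolix it is defined in the interface); the function name is an assumption.

```python
def apgar_group(score):
    """Map an original APGAR level (1 to 10) to one of the 3 new categories."""
    if 1 <= score <= 3:
        return "Low"
    if 4 <= score <= 7:
        return "Medium"
    if 8 <= score <= 10:
        return "High"
    raise ValueError("score outside the 10 original levels: %r" % (score,))

# Each of the 10 original levels falls into exactly one new category.
print({score: apgar_group(score) for score in range(1, 11)})
```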

If we assume, for instance, that the volume is related to the APGAR score, then \(\beta_{V,Low}\) and \(\beta_{V,High}\) are estimated (assuming that Medium is the reference level).

In that case, one can see that both p-values for the transformed APGAR covariate are above 0.05.

 

Complex parameter covariate relationships

Complex parameter covariate relationships such as Michaelis-Menten or Hill dependencies, time-dependent covariates, or covariate-dependent standard deviations of random effects, can be defined directly in the structural model.

Examples are shown on this page.

2.6.4.Inter occasion variability (IOV)


Objectives: learn how to take into account inter occasion variability (IOV).


Projects: iov1_project, iov1_Evid_project, iov2_project, iov3_project, iov4_project


Introduction

A simple model consists of splitting the study into K time periods or occasions and assuming that individual parameters can vary from occasion to occasion but remain constant within occasions. Then, we can try to explain part of the intra-individual variability of the individual parameters by piecewise-constant covariates, i.e., occasion-dependent or occasion-varying (varying from occasion to occasion and constant within an occasion) ones. The remaining part must then be described by random effects. We will need some additional notation for describing this new statistical model. Let

  • \(\psi_{ik}\) be the vector of individual parameters of individual i for occasion k, where \(1\leq i \leq N\) and \(1\leq k \leq K\).
  • \(c_{ik}\) be the vector of covariates of individual i for occasion k. Some of these covariates remain constant (gender, group treatment, ethnicity, etc.) and others can vary (weight, treatment, etc.).

Let \(\psi_i = (\psi_{i1}, \psi_{i2}, \ldots , \psi_{iK})\) be the sequence of K individual parameters for individual i. We also need to define:

  • \(\eta_i^{(0)}\), the vector of random effects which describes the random inter-individual variability of the individual parameters,
  • \(\eta_{ik}^{(1)}\), the vector of random effects which describes the random intra-individual variability of the individual parameters in occasion k, for each \(1\leq k \leq K\).

Here and in the following, the superscript (0) is used to represent inter-individual variability, i.e., variability at the individual level, while superscript (1) represents inter-occasion variability, i.e., variability at the occasion level for each individual. The model now combines these two sequences of random effects:

$$h(\psi_{ik}) = h(\psi_{\rm pop})+ \beta(c_{ik} - c_{\rm pop}) + \eta_i^{(0)} + \eta_{ik}^{(1)} .$$

Remark: Individuals do not need to share the same sequence of occasions: the number of occasions and the times defining the occasions can differ from one individual to another.
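To make the role of the two levels of random effects concrete, the following Python sketch simulates a single log-normally distributed parameter with both inter-individual and inter-occasion variability, without covariates (so that h is the logarithm and the \(\beta\) term is dropped). It is a minimal illustration under these assumptions, not Monolix code: the function name, the argument names omega and gamma (standard deviations of the two levels) and the numerical values are all hypothetical.

```python
import math
import random

def simulate_iov(pop_value, omega, gamma, n_subjects, n_occasions, seed=0):
    """Simulate psi_ik with log(psi_ik) = log(psi_pop) + eta0_i + eta1_ik."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_subjects):
        eta0 = rng.gauss(0.0, omega)       # drawn once per individual
        row = []
        for _ in range(n_occasions):
            eta1 = rng.gauss(0.0, gamma)   # drawn again at every occasion
            row.append(math.exp(math.log(pop_value) + eta0 + eta1))
        values.append(row)
    return values

# Hypothetical values: 3 subjects, 2 occasions each.
for row in simulate_iov(pop_value=8.0, omega=0.3, gamma=0.1, n_subjects=3, n_occasions=2):
    print(row)
```

Setting gamma to 0 in this sketch gives values that differ between individuals but are identical across the occasions of a given individual.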

Occasion definition in a data set

There are two ways to define occasions in a data set:

  • Explicitly, using an OCCASION column. A data set can contain one or several columns with the column-type OCCASION. The records correspond to the same subject (ID should remain the same) but under different circumstances (occasions). For example, if the same subject receives two successive different treatments, it should be considered as the same subject with two occasions. The OCCASION columns can contain only integers.
  • Implicitly, using an EVID column. If there is an EVID column with a value of 4, then Monolix defines a washout and creates an occasion. Thus, if EVID equals 4 at several times for a subject, the same number of occasions is created. Notice that if EVID equals 4 only once, at the beginning, only one occasion is defined and no inter-occasion variability is possible.
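The implicit rule can be pictured as a counter that is incremented each time EVID equals 4. The Python sketch below illustrates that rule for the records of one subject. It is not the Monolix implementation; the function name is hypothetical, and the sketch assumes that the first record of the subject has EVID=4.

```python
def occasions_from_evid(evid):
    """Return the occasion index of each record of one subject.

    Assumption of this sketch: the first record has EVID=4, so that
    every record belongs to an occasion opened by a washout.
    """
    if not evid or evid[0] != 4:
        raise ValueError("this sketch assumes the first record has EVID=4")
    occasions = []
    current = 0
    for value in evid:
        if value == 4:
            current += 1   # a washout starts a new occasion
        occasions.append(current)
    return occasions

# Two records with EVID=4 for this subject, hence two occasions.
print(occasions_from_evid([4, 0, 0, 4, 0, 0]))
```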

There are three kinds of occasions

  • Cross over study: In that case, data are collected for each patient during two independent treatment periods of time, and there is an overlap in the time definition of the periods. A column OCCASION can be used to identify the period. An alternative is to define an EVID column in which every occasion starts with EVID equal to 4. Both types of definition are presented in the iov1 example.
  • Occasions with washout: In that case, data are collected for each patient over successive periods and there is no overlap between the periods. The time is increasing but the dynamical system (i.e. the compartments) is reset when the second period starts. In particular, EVID=4 indicates that the system is reset (washout), for example when a new dose is administered.
  • Occasions without washout: In that case, data are collected for each patient over successive periods and there is no overlap between the periods. The time is increasing and we want to differentiate periods in terms of occasions without any reset of the dynamical system. Multiple doses are administered to each patient. Each period of time between successive doses is defined as a statistical occasion. A column OCCASION is therefore necessary in the data file to define it.

Cross over study

  • iov1_project (data = ‘iov1_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

In this example, PK data are collected for each patient during two independent treatment periods of time (each one starting at time 0). A column OCCASION is used to identify the study:
This column is defined using the reserved keyword OCCASION. The model for the individual parameters is then as presented below

First, to define the variability of each parameter at each level, select the corresponding level: the associated random effects are then displayed. On the figure above, we see that all parameters have variability at the ID level, which means that all parameters have inter-individual variability. On the figure below, we see the OCC level. In the presented case, only the volume V has inter-study variability and thus inter-occasion variability. It is therefore the only parameter with variability at the occasion level.

In terms of covariates, we then see two parts as displayed below. We see the covariates

  • associated to the level ID (in green). It corresponds to all the covariates that are constant for each subject.
  • associated to the level OCC (in blue). It corresponds to all the covariates that are constant for each occasion but not on each subject.

In the presented case, the treatment TRT varies from one occasion to another for each individual. It contains inter-occasion information and is thus displayed with the occasion level. On the other hand, SEX is constant for each subject. It therefore contains inter-individual information but no inter-occasion information, and is displayed with the ID level.

What is the impact?
A covariate can be associated with a parameter only if its level of variability is compatible with the levels of variability of that parameter.
In the presented case,

  • TRT has inter-occasion variability. It can only be used with the parameter V, which has inter-occasion variability. The two other parameters have only inter-individual variability and therefore cannot use this TRT information. The interface is greyed out and the user cannot add this covariate to the parameters ka and Cl.
  • SEX has only inter-individual variability. It can therefore be associated with any parameter.

The population parameters now include the standard deviations of the random effects for the 2 levels of variability (omega is used for IIV and gamma for IOV):
Two important features are proposed in the plots. Firstly, in the individual fits, you can split or merge the occasions. When the occasions are split, the name of each subject-occasion is the name of the subject, #, and the name of the occasion.

Secondly, you can use the occasion to split the plots

  • iov1_Evid_project (data = ‘iov1_Evid_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

Another way to describe this cross over study is to use EVID=4 as explained in the data set definition. In that example, the EVID creates a washout and another occasion.

Occasions with washout

  • iov2_project (data = ‘iov2_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

The time is increasing in this example, but the dynamical system (i.e. the compartments) is reset when the second period starts. Column EVID provides some information about events concerning dose administration. In particular, EVID=4 indicates that the system is reset (washout) when a new dose is administered.

Monolix automatically proposes to define the treatment periods (between successive resets) as statistical occasions and to introduce IOV, as we did in the previous example. We can display the individual fits by splitting each occasion for each individual:


Or by merging the different occasions in a unique plot for each individual:

Remark: If you are modeling a PK as in this example, the washout implies that the occasions are independent. Thus, the computation is much faster, as we do not have to compute predictions between occasions.

Occasions without washout

  • iov3_project (data = ‘iov3_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

Multiple doses are administered to each patient. We consider each period of time between successive doses as a statistical occasion. A column OCCASION is therefore necessary in the data file.

We can color the observed data by occasion to have a better representation:

The model for IIV and IOV can then be defined as usual. The plot of individual fits allows us to check that the predicted concentration is now continuous over the different occasions for each individual:

Multiple levels of occasions

  • iov4_project (data = ‘iov4_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

We can easily extend this approach to multiple levels of variability. In this example, columns P1 and P2 define embedded occasions. They are both defined as occasions:

We then define a statistical model for each level of variability.

2.6.5.Mixture of distributions





 


Objectives: learn how to implement a mixture of distributions for the individual parameters.


Projects: PKgroup_project, PKmixt_project


Introduction

Mixed effects models allow us to take into account between-subject variability.

One complicating factor arises when data is obtained from a population with some underlying heterogeneity. If we assume that the population consists of several homogeneous subpopulations, a straightforward extension of mixed effects models is a finite mixture of mixed effects models, assuming, for instance, that the probability distribution of some individual parameters varies from one subpopulation to another. The introduction of a categorical covariate (e.g., sex, genotype, treatment, status, etc.) into such a model already supposes that the whole population can be decomposed into subpopulations. The covariate then serves as a label for assigning each individual to a subpopulation.

In practice, the covariate can either be known or not. If it is unknown, the covariate is called a latent covariate and is defined as a random variable with a user-defined number of modalities in the statistical model. Differences in estimation and diagnosis methods appear to deal with this additional random variable: this difference represents a task of unsupervised classification.

Mixture models usually refer to models for which the categorical covariate is unknown and unsupervised classification is needed.

For the sake of simplicity, we will consider a basic model that involves individual parameters \((\psi_i, 1\leq i \leq N)\) and observations \((y_{ij}, 1\leq i \leq N, 1\leq j \leq n_i)\). Then, the easiest way to model a finite mixture model is to introduce a label sequence \((z_i , 1\leq i \leq N)\) that takes its values in \(\{1,2,\ldots,M\}\) such that \(z_i=m\) if subject i belongs to subpopulation m.
In some situations, the label sequence \((z_i , 1\leq i \leq N)\) is known and can be used as a categorical covariate in the model. If \((z_i)\) is unknown, it can be modeled as a set of independent random variables taking their values in \(\{1,2,\ldots,M\}\) where, for \(i=1,2,\ldots, N\), \(P(z_i = m)\) is the probability that individual i belongs to group m. We will assume furthermore that the \((z_i)\) are identically distributed, i.e., \(P(z_i = m)\) does not depend on i for \(m=1,\ldots,M\).

 

Mixture of distributions based on a categorical covariate

  • PKgroup_project (data = ‘PKmixt_data.txt’, model = ‘lib:oral1_1cpt_kaVCl.txt’)

The sequence of labels is known as GROUP in this project and comes from the dataset. It is therefore defined as a categorical covariate that classifies the individuals into groups. We can then assume, for instance, different population values for the volume in the two groups and estimate the population parameters using this covariate model.

Then, this covariate GROUP can be used as a stratification variable and is very important in the modeling.

 

Mixture of distributions based on unsupervised classification with a latent covariate

A latent covariate is defined as a random variable, and the probability of each modality is part of the statistical model and is estimated as well. Methods for estimation and diagnosis are different. After the estimation, for each individual the categorical covariate is not perfectly known, only the probabilities of each modality are estimated.

Note also that latent covariates can be useful to model statistical mixtures of populations, but they provide no biological interpretation for the cause of the heterogeneity in the population since they do not come from the dataset.

  • PKmixt_project (data = ‘PKmixt_data.txt’, model = ‘lib:oral1_1cpt_kaVCl.txt’)

We will use the same data with this project but ignore the column GROUP (which is equivalent to assuming that the label is unknown). If we suspect some heterogeneity in the population, we can introduce a “latent covariate” by clicking on the grey button MIXTURE.

It is possible to change the name and the number of modalities of this latent covariate.
Remark: several latent covariates can be introduced in the model, with different number of categories.

We can then use this latent covariate lcat as any observed categorical covariate. We can again assume different population values for the volume in the two groups by applying it on the volume, and estimate the population parameters using this covariate model. The proportion of each group is also estimated: plcat_1 is the probability of having modality 1:

Once the population parameters are estimated, the sequence of latent covariates, i.e. the group to which each subject belongs, can be estimated together with the individual parameters, as the modes of the conditional distributions.

The sequence of estimated latent covariates lcat can be used as a stratification variable. We can for example display the VPC in the 2 groups:

By plotting the distribution of the individual parameters, we see that V has a bimodal distribution

2.7.Pharmacokinetic models

2.7.1.PK model: single route of administration


Objectives: learn how to define and use a PK model for single route of administration.


Projects: bolusLinear_project, bolusMM_project, bolusMixed_project, infusion_project, oral1_project, oral0_project, sequentialOral0Oral1_project, simultaneousOral0Oral1_project, oralAlpha_project, oralTransitComp_project

Introduction

Once a drug is administered, we usually describe subsequent processes within the organism by the pharmacokinetics (PK) process known as ADME: absorption, distribution, metabolism, excretion. A PK model is a dynamical system mathematically represented by a system of ordinary differential equations (ODEs) which describes transfers between compartments and elimination from the central compartment.
See this web animation for more details.
Mlxtran is remarkably efficient for implementing simple and complex PK models:

  • The function pkmodel can be used for standard PK models. The model is defined according to the provided set of named arguments. The pkmodel function enables different parametrizations and different models of absorption, distribution and elimination, defined here and summarized below.
  • PK macros define the different components of a compartmental model. Combining such PK components provides a high degree of flexibility for complex PK models. They can also extend a custom ODE system.
  • A system of ordinary differential equations (ODEs) can be implemented very easily.

It is also important to highlight the fact that the data file used by Monolix for PK modelling only contains information about dosing, i.e. how and when the drug is administered. There is no need to integrate in the data file any information related to the PK model. This is an important remark since it means that any (complex) PK model can be used with the same data file. In particular, we make a clear distinction between administration (related to the data) and absorption (related to the model).

 The pkmodel function

The PK model is defined by the names of the input parameters of the pkmodel function. These names are reserved keywords.

Absorption

  • p: Fraction of dose which is absorbed
  • ka: absorption constant rate (first order absorption)
  • or, Tk0: absorption duration (zero order absorption)
  • Tlag: lag time before absorption
  • or, Mtt, Ktr: mean transit time & transit rate constant

Distribution

  • V: Volume of distribution of the central compartment
  • k12, k21: Transfer rate constants between compartments 1 (central) & 2 (peripheral)
  • or V2, Q2: Volume of compartment 2 (peripheral) & inter compartment clearance, between compartments 1 and 2,
  • k13, k31: Transfer rate constants between compartments 1 (central) & 3 (peripheral)
  • or V3, Q3: Volume of compartment 3 (peripheral) & inter compartment clearance, between compartments 1 and 3.

Elimination

  • k: Elimination rate constant
  • or Cl: Clearance
  • Vm, Km: Michaelis Menten elimination parameters

Effect compartment

  • ke0: Effect compartment transfer rate constant

Intravenous bolus injection

Linear elimination

  • bolusLinear_project

A single iv bolus is administered at time 0 to each patient. The data file bolus1_data.txt contains 4 columns: id, time, amt (the amount of drug in mg) and y (the measured concentration). The names of these columns are recognized as keywords by Monolix:
It is important to remark that, in this data file, a row contains either some information about the dose (in which case y = ".") or a measurement (in which case amt = "."). We could equivalently use the data file bolus2_data.txt which contains 2 additional columns: EVID (in the green frame) and IGNORED OBSERVATION (in the blue frame):

Here, the EVENT ID column  allows the identification of an event. It is an integer between 0 and 4. It helps to define the type of line. EVID=1 means that this record describes a dose while EVID=0 means that this record contains an observed value.
On the other hand, the IGNORED OBSERVATION column makes it possible to tag lines for which the information in the OBSERVATION column-type is missing. MDV=1 means that the observed value of this record should be ignored while MDV=0 means that this record contains an observed value. The two data files bolus1_data.txt and bolus2_data.txt contain exactly the same information and provide exactly the same results. A one compartment model with linear elimination is used with this project:

$$\begin{array}{ccl} \frac{dA_c}{dt} &=& - k A_c(t) \\ A_c(t) &= &0  ~~\text{for}~~ t<0 \end{array} $$

Here, \(A_c(t)\) and \(C_c(t)=A_c(t)/V\) are, respectively, the amount and the concentration of drug in the central compartment at time t. When a dose D arrives in the central compartment at time \(\tau\), an iv bolus administration assumes that

$$A_c(\tau^+) = A_c(\tau^-) + D$$

where \(A_c(\tau^-)\) (resp. \(A_c(\tau^+)\)) is the amount of drug in the central compartment just before (resp. after) \(\tau\). Parameters of this model are V and k. We therefore use the model bolus_1cpt_Vk from the Monolix PK library:

[LONGITUDINAL]
input = {V, k}

EQUATION:
Cc = pkmodel(V, k)

OUTPUT:
output = Cc

We could equivalently use the model bolusLinearMacro.txt (click on the button Model and select the new PK model in the library 6.PK_models/model)

[LONGITUDINAL]
input = {V, k}

PK:
compartment(cmt=1, amount=Ac)
iv(cmt=1)
elimination(cmt=1, k)
Cc = Ac/V

OUTPUT:
output = Cc

These two implementations generate exactly the same C++ code and therefore provide exactly the same results. Here, the ODE system is linear and Monolix uses its analytical solution. Of course, it is also possible (but not recommended with this model) to use the ODE based PK model bolusLinearODE.txt :

[LONGITUDINAL]
input = {V, k}

PK:
depot(target = Ac)

EQUATION:
ddt_Ac = - k*Ac
Cc = Ac/V

OUTPUT:
output = Cc

Results obtained with this model are slightly different from the ones obtained with the previous implementations since a numerical scheme is used here for solving the ODE. Moreover, the computation time is longer (between 3 and 4 times longer in that case) when using the ODE compared to the analytical solution.
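The origin of this small difference can be reproduced outside Monolix. The Python sketch below compares the analytical solution \(C_c(t) = (D/V) e^{-kt}\) of the one compartment bolus model with a fixed-step fourth-order Runge-Kutta approximation of the same ODE. The Runge-Kutta scheme is only an assumption made for the illustration and is not necessarily the solver used by Monolix; the function names and numerical values are hypothetical as well.

```python
import math

def cc_analytic(t, dose, V, k):
    """Analytical concentration after a single iv bolus at time 0."""
    return dose / V * math.exp(-k * t)

def cc_rk4(t_end, dose, V, k, n_steps=20):
    """Concentration at t_end from a fixed-step RK4 solution of dAc/dt = -k*Ac."""
    h = t_end / n_steps
    amount = dose  # the bolus puts the whole dose in the compartment at time 0
    for _ in range(n_steps):
        k1 = -k * amount
        k2 = -k * (amount + 0.5 * h * k1)
        k3 = -k * (amount + 0.5 * h * k2)
        k4 = -k * (amount + h * k3)
        amount += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return amount / V

# Hypothetical values: dose = 100, V = 10, k = 0.1, prediction at t = 10.
exact = cc_analytic(10.0, 100.0, 10.0, 0.1)
approx = cc_rk4(10.0, 100.0, 10.0, 0.1)
print(exact, approx, abs(exact - approx))
```

The two predictions agree to several digits but are not strictly equal, and the numerical version requires many evaluations of the right-hand side for each prediction.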
Individual fits obtained with this model look good

but the VPC shows some misspecification in the elimination process:

Michaelis Menten elimination

  • bolusMM_project

A non linear elimination is used with this project:

$$\frac{dA_c}{dt} = - \frac{ V_m \, A_c(t)}{V\, K_m + A_c(t) }$$

This model is available in the Monolix PK library as bolus_1cpt_VVmKm:

[LONGITUDINAL]
input = {V, Vm, Km}

PK:
Cc = pkmodel(V, Vm, Km)

OUTPUT:
output = Cc

Instead of this model, we could equivalently use PK macros with bolusNonLinearMacro.txt from the library 6.PK_models/model:

[LONGITUDINAL]
input = {V, Vm, Km}

PK:
compartment(cmt=1, amount=Ac, volume=V)
iv(cmt=1)
elimination(cmt=1, Vm, Km)
Cc = Ac/V

OUTPUT:
output = Cc

or an ODE with bolusNonLinearODE:

[LONGITUDINAL]
input = {V, Vm, Km}

PK:
depot(target = Ac)

EQUATION:
ddt_Ac = -Vm*Ac/(V*Km+Ac) 
Cc=Ac/V

OUTPUT:
output = Cc

Results obtained with these three implementations are identical: since no analytical solution is available for this nonlinear ODE, all three are solved numerically. We can then check that this PK model seems to describe the elimination process of the data much better:

Mixed elimination

  • bolusMixed_project

The Monolix PK library contains “standard” PK models. More complex models should be implemented by the user in a model file. For instance, we assume in this project that the elimination process is a combination of linear and nonlinear elimination processes:

$$ \frac{dA_c}{dt} = -\frac{ V_m A_c(t)}{V K_m + A_c(t) } - k  A_c(t) $$

This model is not available in the Monolix PK library. It is implemented in bolusMixed.txt:

[LONGITUDINAL]
input = {V, k, Vm, Km}

PK:
depot(target = Ac)

EQUATION:
ddt_Ac = -Vm*Ac/(V*Km+Ac) - k*Ac
Cc=Ac/V

OUTPUT:
output = Cc

This model, with a combined error model, seems to describe the data very well:

Intravenous infusion

  • infusion_project

Intravenous infusion assumes that the drug is administered intravenously with a constant rate (infusion rate), during a given time (infusion time). Since the amount is the product of infusion rate and infusion time, an additional column INFUSION RATE or INFUSION DURATION is required in the data file: Monolix can use either one. Data file infusion_rate_data.txt has an additional column rate:

It can be replaced by infusion_tinf_data.txt which contains exactly the same information:
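The two files carry the same information because the amount is the product of the infusion rate and the infusion time, so each quantity can be deduced from the other. The Python lines below are a minimal sketch of this conversion; the function names and the numerical values are assumptions made for the illustration.

```python
def infusion_duration(amount, rate):
    """Infusion time deduced from the amount and the infusion rate."""
    return amount / rate

def infusion_rate(amount, duration):
    """Infusion rate deduced from the amount and the infusion time."""
    return amount / duration

# Hypothetical dose: 100 units infused at a rate of 25 units per hour.
tinf = infusion_duration(100.0, 25.0)
print(tinf, infusion_rate(100.0, tinf))
```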

We use with this project a 2 compartment model with non linear elimination and parameters \(V_1\), \(Q\), \(V_2\), \(V_m\), \(K_m\):

$$\begin{aligned} k_{12} &= Q/V_1 \\ k_{21} &= Q/V_2 \\ \frac{dA_c}{dt} &= k_{21} \, A_p(t) - k_{12} \, A_c(t) - \frac{ V_m \, A_c(t)}{V_1\, K_m + A_c(t) } \\ \frac{dA_p}{dt} &= - k_{21} \, A_p(t) + k_{12} \, A_c(t) \\ C_c(t) &= \frac{A_c(t)}{V_1} \end{aligned}$$

This model is available in the Monolix PK library as infusion_2cpt_V1QV2VmKm:

[LONGITUDINAL]
input = {V1, Q, V2, Vm, Km}

PK:
V = V1
k12 = Q/V1
k21 = Q/V2
Cc = pkmodel(V, k12, k21, Vm, Km)

OUTPUT:
output = Cc

Oral administration

first-order absorption

  • oral1_project

This project uses the data file oral_data.txt. For each patient, information about dosing is the time of administration and the amount. A one compartment model with first order absorption and linear elimination is used with this project. Parameters of the model are ka, V and Cl. We will then use the model oral1_1cpt_kaVCl.txt from the Monolix PK library

[LONGITUDINAL]
input = {ka, V, Cl}

EQUATION:
Cc = pkmodel(ka, V, Cl)

OUTPUT:
output = Cc

Both the individual fits and the VPCs show that this model does not properly describe the absorption process:

There exist many options for implementing this PK model with Mlxtran:
– using PK macros: oralMacro.txt:

[LONGITUDINAL]
input = {ka, V, Cl}

PK:
compartment(cmt=1, amount=Ac)
oral(cmt=1, ka)
elimination(cmt=1, k=Cl/V) 
Cc=Ac/V

OUTPUT:
output = Cc

– using a system of two ODEs as in oralODEb.txt:

[LONGITUDINAL]
input = {ka, V, Cl}

PK:
depot(target=Ad)

EQUATION:
k = Cl/V
ddt_Ad = -ka*Ad
ddt_Ac =  ka*Ad - k*Ac
Cc = Ac/V

OUTPUT: 
output = Cc

– combining PK macros and ODE as in oralMacroODE.txt (macros are used for the absorption and ODE for the elimination):

[LONGITUDINAL]
input = {ka, V, Cl}

PK:
compartment(cmt=1, amount=Ac)
oral(cmt=1, ka)

EQUATION:
k = Cl/V
ddt_Ac = - k*Ac
Cc = Ac/V

OUTPUT: 
output = Cc

– or equivalently, as in oralODEa.txt:

[LONGITUDINAL]
input = {ka, V, Cl}

PK:
depot(target=Ac, ka)

EQUATION:
k = Cl/V
ddt_Ac = - k*Ac
Cc = Ac/V

OUTPUT: 
output = Cc

Remark: Models using the pkmodel function or PK macros only use an analytical solution of the ODE system.
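For reference, the analytical solution in question has a simple closed form for a single dose D at time 0 with first order absorption and linear elimination: \(C_c(t) = \frac{D\, ka}{V (ka - k)} \left( e^{-k t} - e^{-ka\, t} \right)\) with \(k = Cl/V\). The Python sketch below evaluates this expression. It is an illustration only, not the code generated by Monolix, and it assumes a single dose, no lag time, complete absorption and \(ka \neq k\); the function name and numerical values are hypothetical.

```python
import math

def cc_oral1(t, dose, ka, V, Cl):
    """Concentration for a one compartment model with first order absorption.

    Assumptions of this sketch: single dose at time 0, no lag time,
    fraction absorbed equal to 1, and ka different from k = Cl/V.
    """
    if t < 0.0:
        return 0.0
    k = Cl / V
    return dose * ka / (V * (ka - k)) * (math.exp(-k * t) - math.exp(-ka * t))

# Hypothetical values: dose = 100, ka = 1, V = 10, Cl = 1.
for t in (0.0, 1.0, 2.0, 4.0, 8.0, 24.0):
    print(t, cc_oral1(t, 100.0, 1.0, 10.0, 1.0))
```

The concentration starts at 0, reaches its maximum at \(t_{max} = \log(ka/k)/(ka-k)\) and then decreases.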

zero-order absorption

  • oral0_project

A one compartment model with zero order absorption and linear elimination is used to fit the same PK data with this project. Parameters of the model are Tk0, V and Cl. We will then use the model oral0_1cpt_Tk0VCl.txt from the Monolix PK library

[LONGITUDINAL]
input = {Tk0, V, Cl}

EQUATION:
Cc = pkmodel(Tk0, V, Cl)

OUTPUT:
output = Cc


Remark 1: Implementing a zero-order absorption process using ODEs is not easy. On the other hand, it is extremely easy to implement using either the pkmodel function or the PK macro oral(Tk0).
Remark 2: The duration of a zero-order absorption has nothing to do with an infusion time: it is a parameter of the PK model (exactly like the absorption rate constant ka, for instance) and is not part of the data.
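To illustrate how Tk0 acts as a model parameter, the Python sketch below evaluates the closed-form solution of a one compartment model with zero order absorption for a single dose at time 0: the drug enters the central compartment at the constant rate D/Tk0 during Tk0, then the amount simply decays at rate \(k = Cl/V\). This is a sketch under these assumptions (single dose, no lag time, complete absorption), not the Monolix implementation; the function name and numerical values are hypothetical.

```python
import math

def cc_oral0(t, dose, Tk0, V, Cl):
    """Concentration for a one compartment model with zero order absorption."""
    if t <= 0.0:
        return 0.0
    k = Cl / V
    input_rate = dose / Tk0            # constant input rate during absorption
    t_in = min(t, Tk0)                 # time spent absorbing so far
    amount = input_rate / k * (1.0 - math.exp(-k * t_in))
    # After Tk0 the input stops and the amount decays exponentially.
    return amount * math.exp(-k * (t - t_in)) / V

# Hypothetical values: dose = 100, Tk0 = 2, V = 10, Cl = 1.
for t in (0.0, 1.0, 2.0, 4.0, 8.0):
    print(t, cc_oral0(t, 100.0, 2.0, 10.0, 1.0))
```

In this sketch the concentration increases until t = Tk0 and decreases afterwards, whatever the dosing information contained in the data file.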

sequential zero-order first-order absorption

  • sequentialOral0Oral1_project

More complex PK models can be implemented using Mlxtran. A sequential zero-order first-order absorption process assumes that a fraction Fr of the dose is first absorbed during a time Tk0 with a zero-order process, then, the remaining fraction is absorbed with a first-order process. This model is implemented in sequentialOral0Oral1.txt using PK macros:

[LONGITUDINAL]
input = {Fr, Tk0, ka, V, Cl}

PK:
compartment(amount=Ac)
absorption(Tk0, p=Fr)
absorption(ka, Tlag=Tk0, p=1-Fr)
elimination(k=Cl/V)
Cc=Ac/V

OUTPUT:
output = Cc

Both the individual fits and the VPCs show that this PK model describes very well the whole ADME process for the same PK data:

simultaneous zero-order first-order absorption

  • simultaneousOral0Oral1_project

A simultaneous zero-order first-order absorption process assumes that a fraction Fr of the dose is absorbed with a zero-order process while the remaining fraction is absorbed simultaneously with a first-order process. This model is implemented in simultaneousOral0Oral1.txt using PK macros:

[LONGITUDINAL]
input = {Fr, Tk0, ka, V, Cl}

PK:
compartment(amount=Ac)
absorption(Tk0, p=Fr)
absorption(ka, p=1-Fr)
elimination(k=Cl/V)
Cc=Ac/V

OUTPUT:
output = Cc

alpha-order absorption

  • oralAlpha_project

An \(\alpha\)-order absorption process assumes that the rate of absorption is proportional to some power of the amount of drug in the depot compartment:

$$\frac{dA_d}{dt} = - r \left(A_d(t)\right)^\alpha$$

This model is implemented in oralAlpha.txt using ODEs:

[LONGITUDINAL]
input = {r, alpha, V, Cl}

PK:
depot(target = Ad)

EQUATION:
dAd = Ad^alpha
ddt_Ad = -r*dAd
ddt_Ac = r*dAd - (Cl/V)*Ac
Cc = Ac/V

OUTPUT:
output = Cc
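For reference, the depot equation of the alpha-order model can be solved in closed form by separation of variables. The derivation below is a sketch for a single dose with \(r > 0\), writing \(A_0\) for the amount in the depot just after the dose (a notation introduced here, not used in the model file):

```latex
% Solution of dA_d/dt = -r A_d^alpha with A_d(0) = A_0
\alpha \neq 1: \quad A_d(t) = \left( A_0^{\,1-\alpha} - (1-\alpha)\, r\, t \right)^{\frac{1}{1-\alpha}}
\qquad\qquad
\alpha = 1: \quad A_d(t) = A_0\, e^{-r t}
% For 0 <= alpha < 1 the depot is empty after the finite time
t^\star = \frac{A_0^{\,1-\alpha}}{(1-\alpha)\, r}
```

The case \(\alpha = 1\) is the usual first-order absorption with rate constant r, and \(\alpha = 0\) is a zero-order absorption of duration \(A_0/r\). For \(\alpha < 1\) the first formula holds up to \(t^\star\), after which the depot stays empty.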

transit compartment model

  • oralTransitComp_project

A PK model with transit compartment of transit rate Ktr and mean transit time Mtt can be implemented using the PK macro oral(ka, Mtt, Ktr), or using the pkmodel function, as in oralTransitComp.txt:

[LONGITUDINAL]
input = {Mtt, Ktr, ka, V, Cl}

EQUATION:
Cc = pkmodel(Mtt, Ktr, ka, V, Cl)

OUTPUT:
output = Cc

Using different parametrizations

The PK macros and the function pkmodel use some preferred parametrizations and some reserved names as input arguments: Tlag, ka, Tk0, V, Cl, k12, k21… It is however possible to use another parametrization and/or other parameter names. As an example, consider a two-compartment model for oral administration with first-order absorption and linear elimination. We can use the pkmodel function with, for instance, parameters ka, V, k, k12 and k21:

[LONGITUDINAL]
input = {ka, V, k, k12, k21}

PK:
Cc = pkmodel(ka, V, k, k12, k21)

OUTPUT:
output = Cc

Imagine now that we want i) to use the clearance Cl instead of the elimination rate constant k, ii) to use capital letters for the parameter names. We can still use the pkmodel function as follows:

[LONGITUDINAL]
input = {KA, V, CL, K12, K21}

PK:
Cc = pkmodel(ka=KA, V, k=CL/V, k12=K12, k21=K21)

OUTPUT:
output = Cc

2.7.2.PK model: multiple routes of administration


Objectives: learn how to define and use a PK model for multiple routes of administration.


Projects: ivOral1_project, ivOral2_project





Some drugs can display complex absorption kinetics. Common examples are mixed first-order and zero-order absorptions, either sequentially or simultaneously, and fast and slow parallel first-order absorptions. A few examples of those kinds of absorption kinetics are proposed below.

Combining iv and oral administrations – Example 1

  • ivOral1_project (data = ‘ivOral1_data.txt’ , model = ‘ivOral1Macro_model.txt’)

In this example, we combine oral and iv administrations of the same drug. The data file ivOral1_data.txt contains an additional column ADMINISTRATION ID which indicates the route of administration (1=iv, 2=oral).

We assume here a one compartment model with first-order absorption process from the depot compartment (oral administration) and a linear elimination process from the central compartment. We further assume that only a fraction F (bioavailability) of the drug orally administered is absorbed. This model is implemented in ivOral1Macro_model.txt using PK macros:

[LONGITUDINAL]
input = {F, ka, V, k}

PK:
compartment(cmt=1, amount=Ac)
iv(adm=1, cmt=1)
oral(adm=2, cmt=1, ka, p=F)
elimination(cmt=1, k)
Cc = Ac/V

OUTPUT:
output = Cc

A logit-normal distribution is used for the bioavailability F, which takes its values in (0,1). The model properly fits the data, as can be seen on the individual fits of the first 6 individuals.
Remark: the same PK model could be implemented using ODEs instead of PK macros.
Let \(A_d\) and \(A_c\) be, respectively, the amounts in the depot compartment (gut) and the central compartment (bloodstream). The kinetics of \(A_d\) and \(A_c\) are described by the following system of ODEs:

$$\dot{A}_d(t)  = - k_a A_d(t)~~\text{and}~~ \dot{A}_c(t) = k_a A_d(t) - k A_c(t)$$

The target compartment is the depot compartment (\(A_d\)) for oral administrations and the central compartment (\(A_c\)) for iv administrations. This model is implemented in ivOral1ODE_model.txt using a system of ODEs:

[LONGITUDINAL]
input = {F, ka, V, k}

PK:
depot(type=1, target=Ac)
depot(type=2, target=Ad, p=F)

EQUATION:
ddt_Ad = -ka*Ad
ddt_Ac =  ka*Ad - k*Ac
Cc = Ac/V

OUTPUT:
output = Cc

Solving this system of ODEs numerically is less efficient than using the PK macros, which rely on the analytical solution of the linear system.

Combining iv and oral administrations – Example 2

  • ivOral2_project (data = ‘ivOral2_data.txt’ , model = ‘ivOral2Macro_model.txt’)

In this example (based on simulated PK data), we combine intravenous injection with 3 different types of oral administration of the same drug. The data file ivOral2_data.txt contains a column ADM which indicates the route of administration (1,2,3=oral, 4=iv). We assume that one type of oral dose (adm=1) is absorbed into a latent compartment following a zero-order absorption process. The 2 other types of oral doses (adm=2,3) are absorbed into the central compartment following first-order absorption processes with different rates. Bioavailabilities are assumed to be different for the 3 types of oral doses. There is a linear transfer from the latent compartment to the central compartment. A peripheral compartment is linked to the central compartment. The drug is eliminated by a linear process from the central compartment.

This model is implemented in ivOral2Macro_model.txt using PK macros:

[LONGITUDINAL]
input = {F1, F2, F3, Tk01, ka2, ka3, kl, k23, k32, V, Cl}

PK:
compartment(cmt=1, amount=Al)
compartment(cmt=2, amount=Ac)
peripheral(k23,k32)
oral(type=1, cmt=1, Tk0=Tk01, p=F1)
oral(type=2, cmt=2, ka=ka2,   p=F2)
oral(type=3, cmt=2, ka=ka3,   p=F3)
iv(type=4, cmt=2)
transfer(from=1, to=2, kt=kl)
elimination(cmt=2, k=Cl/V)
Cc = Ac/V

OUTPUT:
output = Cc

Here, logit-normal distributions are used for the bioavailabilities \(F_1\), \(F_2\) and \(F_3\). The model properly fits the data:

Remark: the number and type of doses vary from one patient to another one in this example.
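The logit-normal distribution used for the bioavailabilities in both examples keeps every individual value strictly between 0 and 1. The following Python sketch illustrates the idea under the assumption that the individual value is obtained by adding a normal random effect to the logit of the population value. The function names and numerical values are hypothetical.

```python
import math
import random

def logit(p):
    """Map a value in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

def sample_logit_normal(f_pop, omega, rng):
    """Draw F_i with logit(F_i) = logit(f_pop) + eta_i, where eta_i ~ N(0, omega^2)."""
    eta = rng.gauss(0.0, omega)
    return 1.0 / (1.0 + math.exp(-(logit(f_pop) + eta)))

rng = random.Random(0)  # fixed seed so that the draws are reproducible
draws = [sample_logit_normal(0.6, 0.8, rng) for _ in range(1000)]
```

Whatever the size of the random effect, the back-transformed value stays in (0,1), which a normal or lognormal distribution would not guarantee for a fraction such as F.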

2.7.3.From multiple doses to steady-state


Objectives: learn how to define and use a PK model with multiple doses or assuming steady-state.


Projects: multidose_project, addl_project, ss1_project, ss2_project, ss3_project


Multiple doses

  • multidose_project (data = ‘multidose_data.txt’ , model = ‘lib:bolus_1cpt_Vk.txt’)

In this project, each patient receives several iv bolus injections. Each dose is represented by a row in the data file multidose_data.txt:

The PK model and the statistical model used in this project properly fit the observed data of each individual. Even though there are no observations between 12h and 72h, the predicted concentrations computed on this time interval exhibit the multiple doses received by each patient:

VPCs, which are a diagnostic tool, are based on the design of the observations and therefore “ignore” what may happen between 12h and 72h:

On the other hand, the prediction distribution, which is not a diagnostic tool, computes the distribution of the predicted concentration at any time point:

Additional doses (ADDL)

  • addl_project (data = ‘addl_data.txt’ , model = ‘lib:bolus_1cpt_Vk.txt’)

Note that in the previous project, the time interval between two successive doses is the same for every patient (12 hours) and the amount of drug administered is always the same as well (40mg). We can take advantage of this design to simplify the data file by defining, for each patient, a single amount (AMT), the number of additional doses administered after the first one (ADDITIONAL DOSES) and the time interval between successive doses (INTERDOSE INTERVAL):

The keywords ADDL and II are automatically recognized by Monolix.

Remarks:

  • Results obtained with this project, i.e. with this data file, are identical to the ones obtained with the previous project.
  • It is possible to combine single doses (using ADDL=0) and repeated doses in the same data file.
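The meaning of the ADDL and II columns can be sketched in a few lines of Python: one dose line stands for the first dose plus ADDL further doses spaced by II. The helper name expand_addl is hypothetical and the number of additional doses used below is illustrative.

```python
def expand_addl(time, amt, addl, ii):
    """Return the explicit (time, amount) dose records encoded by one ADDL/II dose line."""
    return [(time + n * ii, amt) for n in range(addl + 1)]

# One line with AMT=40, ADDL=6 and II=12 stands for 7 doses given 12 h apart
doses = expand_addl(time=0.0, amt=40.0, addl=6, ii=12.0)

# ADDL=0 encodes a single dose
single = expand_addl(time=5.0, amt=10.0, addl=0, ii=0.0)
```

This is the sense in which the two data files are equivalent: the compact line and the expanded list of dose lines describe the same dosing history.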

Steady-state

  • ss1_project (data = ‘ss1_data.txt’ , model = ‘lib:oral0_1cpt_Tk0VCl.txt’)

The dose orally administered at time 0 to each patient is assumed to be a “steady-state dose”, which means that a “large” number of doses have been administered before time 0, with a constant amount and a constant dosing interval, such that steady-state, i.e. equilibrium, is reached at time 0. The data file ss1_data.txt contains a column STEADY STATE which indicates whether the dose is a steady-state dose or not and a column INTERDOSE INTERVAL for the inter-dose interval:

Click on Check the initial fixed effects to display the predicted concentration following the last dose, administered at time 0. One can see that the initial concentration is not 0 but the result of the steady-state calculation.

We see on this plot that Monolix adds 5 doses before the last dose to reach steady-state. Individual fits display the predicted concentrations computed with these additional doses:

If the dynamics are slow, adding 5 doses before the last dose might not be sufficient. You can adapt the number of doses in the Data frame and thus define it for all individuals, as in the following figure,

leading to the following check of the initial fixed effects:
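Whether 5 added doses are enough depends on how fast the drug is eliminated relative to the dosing interval. The Python sketch below illustrates this with a deliberately simplified setting, a one-compartment model with iv bolus doses, which is an assumption made for the illustration and not the oral model of ss1_project. Function names and values are hypothetical.

```python
import math

def trough_before_dose(n_previous, dose, V, k, tau):
    """Concentration just before a dose, after n_previous earlier bolus doses given every tau."""
    return sum(dose / V * math.exp(-k * tau * j) for j in range(1, n_previous + 1))

def trough_steady_state(dose, V, k, tau):
    """Limit of the sum above when the number of earlier doses tends to infinity."""
    r = math.exp(-k * tau)
    return dose / V * r / (1.0 - r)

def fraction_of_steady_state(n_previous, k, tau):
    """Ratio of the two troughs above; it simplifies to 1 - exp(-n_previous*k*tau)."""
    return 1.0 - math.exp(-n_previous * k * tau)

fast = fraction_of_steady_state(5, k=0.2, tau=12.0)    # half-life of about 3.5 h
slow = fraction_of_steady_state(5, k=0.01, tau=12.0)   # half-life of about 69 h
```

With the fast elimination, 5 earlier doses already reproduce the steady-state trough almost exactly, whereas with the slow elimination they reach less than half of it, so more doses would be needed.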

  • ss2_project (data = ‘ss2_data.txt’ , model = ‘lib:oral0_1cpt_Tk0VCl.txt’)

Steady-state and non-steady-state doses are combined in this project:

Individual fits display the predicted concentrations computed with this combination of doses:

2.8.Extensions

2.8.1.Using regression variables







Objectives: learn how to define and use regression variables (time varying covariates).


Projects: reg1_project, reg2_project


Introduction

A regression variable is a variable \(x\) which is a given function of time: it is not defined in the model but it is used in the model. \(x\) is only provided at some time points \(t_1, t_2, \ldots, t_m\) (possibly different from the observation time points), but it should be defined for any \(t\) (if it is used in an ODE for instance, or if a prediction is computed on a fine grid). Mlxtran therefore defines the function \(x\) by interpolating the given values \((x_1, x_2, \ldots, x_m)\). In the current version of Mlxtran, interpolation is performed by using the last given value:

$$x(t) = x_j \quad~~\text{for}~~t_j \leq t < t_{j+1}$$

The way to introduce it in the Mlxtran longitudinal model is defined here.
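The last-value interpolation rule can be sketched in Python as follows. The function name locf is hypothetical, and raising an error for times before the first given value is a choice made for this sketch, not a statement about how Monolix handles that case.

```python
from bisect import bisect_right

def locf(times, values, t):
    """Return x(t) = x_j for t_j <= t < t_{j+1}; times must be sorted in increasing order."""
    j = bisect_right(times, t) - 1
    if j < 0:
        raise ValueError("t is before the first regressor time")
    return values[j]

# Illustrative regressor values given at three time points
reg_times = [0.0, 1.0, 4.0]
reg_values = [2.0, 3.5, 1.0]
x_mid = locf(reg_times, reg_values, 2.5)   # the value given at t=1.0 is carried forward
```

The resulting function is piecewise constant and jumps exactly at the time points where a new value is provided.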

Regressor definition in a data set

A data set can contain one or several columns with column-type REGRESSOR. Within a given subject-occasion, a “.” string on an observation or dose line is replaced by interpolation (last value carried forward). Lines with no observation and no dose but with regressor values are also taken into account by Monolix for regressor interpolation.

Several points should be noted:

  • There is no name correspondence between the name of the regressor in the data set and the name of the regressor used in the longitudinal model.
  • If there are several regressors, the mapping will be done by order of definition.
  • Regressors can only be used in the longitudinal model.

Continuous regression variables

  • reg1_project (data = reg1_data.txt , model=reg1_model.txt)

We consider a basic PD model in this example, where some concentration values are used as a regression variable. The concentration values are provided in a REGRESSOR column of the data set, and the model is defined as follows:

[LONGITUDINAL]
input = {Emax, EC50, Cc}
Cc    = {use=regressor}

EQUATION:
E = Emax*Cc/(EC50 + Cc)

OUTPUT:
output = E

As explained in the previous subsection, there is no name correspondence between the regressor in the data set and the regressor in the model file. Thus, in this case, the values of Cc with respect to time are taken from the y1 column.
In addition, the predicted effect is piecewise constant in this case because

  • the regressor interpolation is performed by using the last given value, so Cc is piecewise constant;
  • the effect model is direct with respect to the concentration.

Thus, it changes at the time points where concentration values are provided:

Categorical regression variables

  • reg2_project (data = reg2_data.txt , model=reg2_model.txt)

The variable \(z_{ij}\) takes its values in {1, 2} in this example and represents the state of individual \(i\) at time \(t_{ij}\). We then assume that the observed data \(y_{ij}\) has a Poisson distribution with parameter lambda1 if \(z_{ij}=1\) and parameter lambda2 if \(z_{ij}=2\). z is known in this example: it is then defined as a regression variable in the model:

[LONGITUDINAL]
input = {lambda1, lambda2, z}
z = {use=regressor}
                           
EQUATION:
if z==1
   lambda=lambda1
else
   lambda=lambda2
end

DEFINITION:
y = {type=count, 
     log(P(y=k)) = -lambda + k*log(lambda) - factln(k)
}

OUTPUT:
output = y
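The log-probability written in the DEFINITION block is the usual Poisson one, with factln(k) standing for log(k!). A minimal Python sketch of the same computation, with the rate selected by the categorical regressor as described in the text, is given below. The function names and the rate values are illustrative assumptions.

```python
import math

def log_poisson(k, lam):
    """log P(y=k) for a Poisson distribution with parameter lam."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)  # lgamma(k+1) equals log(k!)

def rate(z, lambda1, lambda2):
    """Poisson parameter selected by the regressor z, which takes its values in {1, 2}."""
    return lambda1 if z == 1 else lambda2

# The probabilities sum to 1 (the sum is truncated far in the tail)
total = sum(math.exp(log_poisson(k, rate(1, 2.0, 5.0))) for k in range(60))
```

Working on the log scale, as the model file does, avoids overflow in k! for large counts.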

2.8.2.Bayesian estimation


Objectives: learn how to combine maximum likelihood estimation and Bayesian estimation of the population parameters.


Projects: theobayes1_project, theobayes2_project


Introduction

The Bayesian approach considers the vector of population parameters \(\theta\) as a random vector with a prior distribution \(\pi_\theta\). We can then define the *posterior distribution* of \(\theta\):

\(\begin{aligned} p(\theta | y ) &= \frac{\pi_\theta( \theta )p(y | \theta )}{p(y)} \\ &= \frac{\pi_\theta( \theta ) \int p(y,\psi |\theta) \, d \psi}{p(y)} . \end{aligned} \)

We can estimate this conditional distribution and derive statistics (posterior mean, standard deviation, quantiles, etc.) and the so-called maximum a posteriori (MAP) estimate of \(\theta\):

\(\begin{aligned} \hat{\theta}^{\rm MAP} &=\text{arg~max}_{\theta} p(\theta | y ) \\ &=\text{arg~max}_{\theta} \left\{ {\cal LL}_y(\theta) + \log( \pi_\theta( \theta ) ) \right\} . \end{aligned} \)

The MAP estimate maximizes a penalized version of the observed likelihood. In other words, MAP estimation is the same as penalized maximum likelihood estimation. Suppose for instance that \(\theta\) is a scalar parameter and the prior is a normal distribution with mean \(\theta_0\) and variance \(\gamma^2\). Then, the MAP estimate is the solution of the following minimization problem:

\(\hat{\theta}^{\rm MAP} =\text{arg~min}_{\theta} \left\{ -2{\cal LL}_y(\theta) + \frac{1}{\gamma^2}(\theta - \theta_0)^2 \right\} .\)

This is a trade-off between the MLE which minimizes the deviance, \(-2{\cal LL}_y(\theta)\), and \(\theta_0\) which minimizes \((\theta - \theta_0)^2\). The weight given to the prior directly depends on the variance of the prior distribution: the smaller \(\gamma^2\) is, the closer to \(\theta_0\) the MAP is. In the limiting case \(\gamma^2=0\), \(\theta\) is fixed at \(\theta_0\) and no longer needs to be estimated. Both the Bayesian and frequentist approaches have their supporters and detractors. But rather than being dogmatic and following the same rule-book every time, we need to be pragmatic and ask the right methodological questions when confronted with a new problem.
All things considered, the problem comes down to knowing whether the data contains sufficient information to answer a given question, and whether some other information may be available to help answer it. This is the essence of the art of modeling: find the right compromise between the confidence we have in the data and our prior knowledge of the problem. Each problem is different and requires a specific approach. For instance, if all the patients in a clinical trial have essentially the same weight, it is pointless to estimate a relationship between weight and the model’s PK parameters using the trial data. A modeler would be better served trying to use prior information based on physiological knowledge rather than just some statistical criterion.
Generally speaking, if prior information is available it should be used, on the condition of course that it is relevant. For continuous data for example, what does putting a prior on the residual error model’s parameters mean in reality? A reasoned statistical approach consists of including prior information only for certain parameters (those for which we have real prior information) and having confidence in the data for the others. Monolix allows this hybrid approach which reconciles the Bayesian and frequentist approaches. A given parameter can be

  • a fixed constant if we have absolute confidence in its value or the data does not allow it to be estimated, essentially due to lack of identifiability.
  • estimated by maximum likelihood, either because we have great confidence in the data or no information on the parameter.
  • estimated by introducing a prior and calculating the MAP estimate or estimating the posterior distribution.
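The trade-off expressed by the penalized criterion for a scalar parameter with a normal prior can be made concrete with a toy case. The Python sketch below assumes, purely for illustration, that the observations are independent and normally distributed around \(\theta\) with a known standard deviation sigma, so that the deviance is a sum of squares and the MAP estimate has a closed form. This toy model, the function name and the data are assumptions and have nothing to do with the mixed effects models handled by Monolix.

```python
def map_normal_mean(y, sigma, theta0, gamma):
    """Minimizer of sum((y_i - theta)^2)/sigma^2 + (theta - theta0)^2/gamma^2."""
    w_data = len(y) / sigma**2      # precision brought by the data
    w_prior = 1.0 / gamma**2        # precision brought by the prior
    y_bar = sum(y) / len(y)         # maximum likelihood estimate of theta
    return (w_data * y_bar + w_prior * theta0) / (w_data + w_prior)

y = [1.8, 2.4, 2.1, 2.6, 1.9]       # made-up observations, with mean 2.16
tight = map_normal_mean(y, sigma=0.5, theta0=1.0, gamma=0.05)   # informative prior
vague = map_normal_mean(y, sigma=0.5, theta0=1.0, gamma=10.0)   # weak prior
```

With the small prior standard deviation the estimate stays close to the prior mean 1.0, while with the large one it is almost equal to the MLE 2.16.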

Computing the Maximum a posteriori (MAP) estimate

  • theobayes1_project (data = ‘theophylline_data.txt’ , model = ‘lib:oral1_1cpt_kaVCl.txt’)

We want to introduce a prior distribution for \(ka_{\rm pop}\) in this example. Click on the option button

and select Maximum A Posteriori Estimation

We set a typical value (here 2) and a standard deviation (here 0.1) for the prior on \(ka_{\rm pop}\) and compute the MAP estimate of \(ka_{\rm pop}\). The distribution of the prior is necessarily the same as the one used for the parameter.
The parameter is then colored in purple.

The MAP estimate of \(ka_{\rm pop}\) is a penalized maximum likelihood estimate:

Fixing the value of a parameter

  • theobayes2_project (data = ‘theophylline_data.txt’ , model = ‘lib:oral1_1cpt_kaVCl.txt’)

We can combine different strategies for the population parameters: Bayesian estimation for \(ka_{\rm pop}\), fixed value for \(V_{\rm pop}\) and maximum likelihood estimation for \(Cl_{\rm pop}\), for instance.

Remark:

  • The parameter \(V_{\rm pop}\) is fixed and then colored in red.
  • \(V_{\rm pop}\) is not estimated (its s.e. is not computed) but the standard deviation \(\omega_{V}\) is estimated as usual.

2.8.3.Delayed differential equations


Objectives: learn how to implement a model with ordinary differential equations (ODE) and delayed differential equations (DDE).


Projects: tgi_project, seir_project


Ordinary differential equations based model

  • tgi_project (data = tgi_data.txt , model = tgi_model.txt)

We consider here the tumor growth inhibition (TGI) model proposed by Ribba et al. (Ribba, B., Kaloshi, G., Peyre, M., Ricard, D., Calvez, V., Tod, M., . & Ducray, F., *A tumor growth inhibition model for low-grade glioma treated with chemotherapy or radiotherapy*. Clinical Cancer Research, 18(18), 5071-5080, 2012.). This model is defined by a set of ordinary differential equations

$$\begin{aligned}\frac{dC}{dt} &= - k_{de} C(t) \\ \frac{dP_T}{dt} &= \lambda_P P_T(t)(1- P^{\star}(t)/K) + k_{Q_PP}Q_P(t) -k_{PQ} P_T(t) -\gamma \, k_{de} P_T(t)C(t) \\ \frac{dQ_T}{dt} &= k_{PQ} P_T(t) -\gamma \, k_{de} Q_T(t)C(t) \\ \frac{dQ_P}{dt} &= \gamma \, k_{de} Q_T(t)C(t) - k_{Q_PP} Q_P(t) -\delta_{QP} Q_P(t)\end{aligned}$$

where \(P^\star(t) = P_T(t) + Q_T(t) + Q_P(t)\) is the total tumor size. This set of ODEs is valid for \(t\) greater than 0, while

$$\begin{aligned} C(0) &= 0 \\ P_T(0) &= P_{T0} \\ Q_T(0) &= Q_{T0} \\ Q_P(0) &= 0 \end{aligned}$$

This model (derivatives and initial conditions) can easily be implemented with Mlxtran:

DESCRIPTION: Tumor Growth Inhibition (TGI) model proposed by Ribba et al
A tumor growth inhibition model for low-grade glioma treated with chemotherapy or radiotherapy. 
Clinical Cancer Research, 18(18), 5071-5080, 2012.

Variables 
- PT: proliferative tissue
- QT: nonproliferative quiescent tissue
- QP: damaged quiescent cells 
- C:  concentration of a virtual drug encompassing the 3 chemotherapeutic components of the PCV regimen

Parameters
- K      : maximal tumor size (should be fixed a priori)
- KDE    : the rate constant for the decay of the PCV concentration in plasma
- kPQ    : the rate constant for transition from proliferation to quiescence
- kQpP   : the rate constant for transfer from damaged quiescent tissue to proliferative tissue 
- lambdaP: the rate constant of growth for the proliferative tissue
- gamma  : the rate of damages in proliferative and quiescent tissue
- deltaQP: the rate constant for elimination of the damaged quiescent tissue
- PT0    : initial proliferative tissue
- QT0    : initial nonproliferative quiescent tissue

[LONGITUDINAL]
input = {K, KDE, kPQ, kQpP, lambdaP, gamma, deltaQP, PT0, QT0}

PK:
depot(target=C)

EQUATION:
; Initial conditions
t0    = 0
C_0   = 0
PT_0  = PT0
QT_0  = QT0
QP_0  = 0

; Dynamical model
PSTAR   = PT + QT + QP
ddt_C   = -KDE*C
ddt_PT  = lambdaP*PT*(1-PSTAR/K) + kQpP*QP - kPQ*PT - gamma*KDE*PT*C
ddt_QT  = kPQ*PT - gamma*KDE*QT*C
ddt_QP  = gamma*KDE*QT*C - kQpP*QP - deltaQP*QP

OUTPUT:
output = PSTAR

Remark: t0 and the names ending with _0 (C_0, PT_0, QT_0 and QP_0) are reserved keywords that define the initial conditions.

Then, the graphic of individual fits clearly shows that the tumor size is constant until t=0 and starts changing according to the model at t=0.
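To explore the behavior of this system outside Monolix, the same equations can be integrated with a few lines of Python. The sketch below is a plain Runge-Kutta translation of the EQUATION block. The parameter values are arbitrary illustrative numbers, not estimates from tgi_project, and the handling of the dose (a unit amount added to C at a chosen time) is a simplifying assumption.

```python
def tgi_rhs(state, p):
    """Right-hand side of the TGI system; state = (C, PT, QT, QP)."""
    C, PT, QT, QP = state
    pstar = PT + QT + QP
    dC = -p["KDE"] * C
    dPT = (p["lambdaP"] * PT * (1 - pstar / p["K"]) + p["kQpP"] * QP
           - p["kPQ"] * PT - p["gamma"] * p["KDE"] * PT * C)
    dQT = p["kPQ"] * PT - p["gamma"] * p["KDE"] * QT * C
    dQP = p["gamma"] * p["KDE"] * QT * C - p["kQpP"] * QP - p["deltaQP"] * QP
    return (dC, dPT, dQT, dQP)

def simulate(p, t_end, dose_time=None, h=0.01):
    """RK4 integration starting at t0=0; an optional unit dose is added to C at dose_time."""
    state = (0.0, p["PT0"], p["QT0"], 0.0)   # C_0, PT_0, QT_0, QP_0
    dose_step = None if dose_time is None else round(dose_time / h)
    for i in range(round(t_end / h)):
        if i == dose_step:
            state = (state[0] + 1.0,) + state[1:]
        k1 = tgi_rhs(state, p)
        k2 = tgi_rhs(tuple(s + h / 2 * d for s, d in zip(state, k1)), p)
        k3 = tgi_rhs(tuple(s + h / 2 * d for s, d in zip(state, k2)), p)
        k4 = tgi_rhs(tuple(s + h * d for s, d in zip(state, k3)), p)
        state = tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

# Arbitrary illustrative values, not estimates from tgi_project
p = dict(K=100.0, KDE=0.3, kPQ=0.03, kQpP=0.003, lambdaP=0.12,
         gamma=1.0, deltaQP=0.01, PT0=5.0, QT0=40.0)
untreated = simulate(p, t_end=20.0)
treated = simulate(p, t_end=20.0, dose_time=5.0)
```

Without any dose, C and QP remain equal to 0 and the tumor only grows. With a dose, the total size PT + QT + QP at the end of the simulation is smaller than in the untreated case.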

Don’t forget the initial conditions!

  • tgiNoT0_project (data = tgi_data.txt , model = tgiNoT0_model.txt)

The initial time t0 is not specified in this example. Since t0 is missing, Monolix uses the first time value encountered for each individual. If, for instance, the tumor size has not been computed before time 5 in the individual fits, then t0=5 will be used for defining the initial conditions for this individual, which introduces a shift in the plot:

As defined here, the following rule applies:

  • When no starting time t0 is defined in the Mlxtran model for Monolix, then by default t0 is set to the time of the first dose or of the first observation, whichever comes first.
  • If t0 is defined, a differential equation needs to be defined.

Conclusion: don’t forget to specify properly the initial conditions of a system of ODEs!

Delayed differential equations based model

A system of delay differential equations (DDEs) can be implemented in a block EQUATION of the section [LONGITUDINAL] of an Mlxtran script. Mlxtran provides the command delay(x,T) where x is a one-dimensional component and T is the explicit delay. Therefore, DDEs with a nonconstant past of the form

$$ \begin{array}{ccl} \frac{dx}{dt} &=& f(x(t),x(t-T_1), x(t-T_2), \ldots), ~~\text{for}~~t \geq 0 \\ x(t) &=& x_0(t) ~~~~\text{for}~~-\max_k(T_k) \leq t \leq 0 \end{array} $$

can be solved. The syntax and rules are explained here.
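The role of the past function \(x_0\) and of the delayed term can be illustrated with a very simple fixed-step scheme. The Python sketch below handles a single delay with an explicit Euler method and a stored trajectory. It only illustrates what a term such as delay(x,T) refers to and is not the solver used by Monolix. All names are hypothetical.

```python
def solve_dde_euler(f, history, T, t_end, h=0.001):
    """Fixed-step Euler for dx/dt = f(x(t), x(t-T)), with x(t) = history(t) for t <= 0."""
    n_steps = round(t_end / h)
    lag = round(T / h)              # delay expressed in steps (T is a multiple of h here)
    xs = [history(0.0)]             # xs[i] approximates x(i*h)
    for i in range(n_steps):
        j = i - lag
        # before time T the delayed value comes from the past function
        x_delayed = xs[j] if j >= 0 else history(j * h)
        xs.append(xs[i] + h * f(xs[i], x_delayed))
    return xs

# Test equation dx/dt = -x(t-1) with the constant past x(t) = 1 for t <= 0
xs = solve_dde_euler(lambda x, x_del: -x_del, lambda t: 1.0, T=1.0, t_end=2.0)
x_at_1, x_at_2 = xs[1000], xs[2000]
```

For this test equation the exact solution is x(t) = 1 - t on [0,1], so x(1) = 0, and x(2) = -0.5. The sketch reproduces both values up to the discretization error.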

  • seir_project (data = seir_data.txt , model = seir_model.txt)

The model is a system of 4 DDEs, defined with the following model file:

DESCRIPTION: SEIR model, using delayed differential equations.
"An Epidemic Model with Recruitment-Death Demographics and Discrete Delays", Genik & van den Driessche (1999)

Decomposition of the total  population into four epidemiological classes 
S (susceptibles), E (exposed), I (infectious), and  R (recovered)

The parameters correspond to 
- birthRate: the birth rate,
- deathRate: the natural death rate,
- infectionRate: the contact rate of infective individuals,
- recoveryRate: the rate of recovery,
- excessDeathRate: the excess death rate for infective individuals

There are two time delays in the model:
- tauImmunity: a temporary immunity delay
- tauLatency: a latency delay

[LONGITUDINAL]
input = {birthRate, deathRate, infectionRate, recoveryRate, excessDeathRate, tauImmunity, tauLatency}

EQUATION:
; Initial conditions
t0    = 0
S_0 = 15
E_0 = 0
I_0  = 2
R_0 = 3

; Dynamical model
N = S + E + I + R

ddt_S = birthRate - deathRate*S - infectionRate*S*I/N + recoveryRate*delay(I,tauImmunity)*exp(-deathRate*tauImmunity)
ddt_E = infectionRate*S*I/N - deathRate*E - infectionRate*delay(S,tauLatency)*delay(I,tauLatency)*exp(-deathRate*tauLatency)/(delay(I,tauLatency)+delay(S,tauLatency)+delay(E,tauLatency)+delay(R,tauLatency))
ddt_I = -(recoveryRate+excessDeathRate+deathRate)*I + infectionRate*delay(S,tauLatency)*delay(I,tauLatency)*exp(-deathRate*tauLatency)/(delay(I,tauLatency)+delay(S,tauLatency)+delay(E,tauLatency)+delay(R,tauLatency))
ddt_R = recoveryRate*I - deathRate*R - recoveryRate*delay(I,tauImmunity)*exp(-deathRate*tauImmunity)

OUTPUT:
output = {S, E, I, R}

Introducing these delays makes it possible to obtain nice fits for the 4 outcomes, including \(R_{ij}\) (corresponding to the output y4):

Case studies

  • 8.case_studies/arthritis_project

3.Tasks






Monolix allows a workflow with several tasks.

On the interface, one can see six different tasks:

  • POP. PARAM.: it corresponds to the estimation of the population parameters.
  • EBEs: it corresponds to the estimation of the individual parameters using the conditional mode, i.e. the most probable individual parameters.
  • CONDITIONAL DISTRIBUTION: it corresponds to drawing individual parameters from the conditional distribution. It allows computing the mean value of the conditional distribution.
  • STD. ERRORS: it corresponds to the calculation of the Fisher information matrix and standard errors. Two methods are available: the linearization method or the stochastic approximation. The choice between those methods is made with the “Use linearization method” toggle under the tasks.
  • LIKELIHOOD: it corresponds to the explicit calculation of the log-likelihood. A specificity of the SAEM algorithm is that it does not explicitly compute the objective function. Thus, a dedicated task is proposed. Two methods are available: the linearization method or importance sampling. The choice between those methods is made with the “Use linearization method” toggle under the tasks. This toggle applies to both the STD. ERRORS and LIKELIHOOD tasks, so that the two tasks rely on consistent methods.
  • PLOTS: it corresponds to the generation of the plots.

Also, different types of results are available in the form of plots and tables. The tasks can be run individually by clicking on the associated button, or you can define a workflow by selecting the tasks to run (using the small light blue checks) and clicking on the play button (in green), as shown in the figure below.
Notice that you can initialize all the parameters and the associated methods in the “Initial Estimates” frame as described here.
Moreover, Monolix includes a convergence assessment tool. It is possible to execute a workflow as defined above but for different, randomly generated, initial values of fixed effects.
All the output files are detailed here.

Monolix API

Monolix now comes with an API which gives access to the project in exactly the same way as the interface does. All the functions are described here.

3.1.Initialization

Parameter initial estimates and associated methods






Initial values are specified for the fixed effects, for the standard deviations of the random effects and for the residual error parameters. These initial values are available through the frame “Initial estimates” of the interface as can be seen on the following figure. It is recommended to initialize the estimation to have faster convergence.

Initialization of the estimates

Initialization of the “Fixed effects”

The user can modify all the initial values of the fixed effects. When initializing the project, the values are set to 1 by default. To change a value, click on the parameter and enter the new value.

Notice that when you click on the parameter, a message indicates which values are allowed. The constraint depends on the distribution chosen for the parameter. For example, if the volume parameter V is defined as lognormal, its initial value should be strictly greater than 0. In that case, if you set a negative value, an error is thrown and the previous value is displayed.

When a parameter depends on a covariate, initial values for the dependency (named with \beta prefix, for instance beta_V_SEX_M to add the dependency of SEX, on parameter V) are displayed. The default initial value is 0. In case of a continuous covariate, the covariate is added linearly to the transformed parameter, with a coefficient \beta. For categorical covariates, the initial value for the reference category will be the one of the fixed effect, while for all other categories it will be the initial value for the fixed effect plus the initial value of the \beta, in the transformed parameter space. It is possible to define different initial values for the non-reference categories. The equations for the parameters can be visualized by clicking on button formula in the “Statistical model & Tasks” frame
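The effect of the initial value of a \beta on a categorical covariate can be sketched in Python. The example assumes a lognormal parameter V, so that the transformed parameter is log(V), and a covariate SEX with one reference and one non-reference category. The function name and the numerical values are hypothetical.

```python
import math

def typical_value_lognormal(v_pop, beta, is_non_reference):
    """Typical value of V when log(V) = log(V_pop) + beta for the non-reference category."""
    return v_pop * math.exp(beta if is_non_reference else 0.0)

# With the default beta = 0, both categories start from the same typical value
v_ref = typical_value_lognormal(10.0, 0.0, is_non_reference=False)
v_male_default = typical_value_lognormal(10.0, 0.0, is_non_reference=True)

# A beta of log(1.2) starts the non-reference category 20% higher
v_male_shifted = typical_value_lognormal(10.0, math.log(1.2), is_non_reference=True)
```

Because the \beta is added in the transformed space, it acts multiplicatively on a lognormal parameter: an initial value of 0 means no initial covariate effect.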

Initialization of the “Standard deviation of the random effects”

The user can modify all the initial values of the standard deviations of the random effects. The default value is set to 1. We recommend to keep these values high in order for SAEM to have the possibility to explore the domain.

Initialization of the “Residual error parameters”

The user can modify all the initial values of the residual error parameters. There are as many lines as continuous outputs of the model. The default value depends on the parameter (1 for “a”, 0.3 for “b” and 1 for “c”).

What method can I use for the parameters estimations?

For all the parameters, there are several methods for the estimation

  • “Fixed”: the parameter is kept at its initial value and is therefore not estimated. In that case, the parameter name is colored in orange.
  • “Maximum Likelihood Estimation”: the parameter is estimated using maximum likelihood. In that case, the parameter name remains grey. This is the default option.
  • “Maximum A Posteriori Estimation”: the parameter is estimated using maximum a posteriori estimation. In that case, the user has to define both a typical value and a standard deviation. For more about this, see here. In that case, the parameter name is colored in purple.

To change the method, click on the right of the parameter, as in the following figure.

A window pops up to choose the method, as in the following figure.


Notice that you have buttons to fix all the parameters or estimate all on the top right of the window as can be seen on the following figure

How to initialize your parameters?

Check initial fixed effects

When clicking on “Check the initial fixed effects”, the simulations obtained with the initial population fixed effects values are displayed for each individual together with the data points, in case of continuous observations. This feature is very useful to find some “good” initial values. Although Monolix is quite robust with respect to initial parameter values, good initial estimates speed up the estimation. You can change the values of the parameters and see how the agreement with the data changes. In addition, you can change the axes to log-scale and choose the same limits on all axes to better compare the individuals.

In addition, if you think that there are not enough points for the prediction (if there are a lot of doses for example), you can change the discretization and increase the number of points.

On the use of last estimates





If you have already estimated the population parameters for this project, you can use the “Use the last estimates” buttons to use the previous estimates as initial values. It is possible to use all the last estimates or only the fixed effects. The interest of using only the fixed effects is to avoid starting with too low initial standard deviations of the random effects, and thus to let SAEM explore a larger domain during the next run.

3.2.Population parameter estimation using SAEM

Purpose

The estimation of the population parameters is the key task in non-linear mixed effect modeling. In Monolix, it is performed using the Stochastic Approximation Expectation-Maximization (SAEM) algorithm [1]. SAEM has been shown to be extremely efficient for both simple and a wide variety of complex models: categorical data [2], count data [3], time-to-event data [4], mixture models [5, 6], differential equation based models, censored data [7], etc. The convergence of SAEM has been rigorously proven [1] and its implementation in Monolix is particularly efficient. No other algorithms are available in Monolix.

Calculations: the SAEM algorithm

[under construction]

Running the population parameter estimation task

The pop-up window that lets you follow the progress of the task is shown below. The algorithm starts with a small number (5 by default) of burn-in iterations for initialization, which are displayed in the following way (note that this step can be so fast that the user does not see it):

Afterwards, the evolution of the value of each population parameter over the iterations of the algorithm is displayed. The red line marks the switch from the exploratory phase to the smoothing phase. The exact value at each iteration can be followed by hovering over the curve (as for Cl_pop below). The convergence indicator (in purple) is a measure related to the likelihood (but not directly the likelihood) which helps to detect that convergence has been reached. It is used, among other measures, in the auto-stop criteria.

Dependencies between tasks:

The “Population parameter” estimation task must be launched before running any other task. To skip this task, the user can fix all population parameters. If all population parameters have been set to “fixed”, the estimation will stop after a single iteration and allow the user to continue with the other tasks.

Outputs

In the graphical user interface

The estimated population parameters are displayed in the Pop. Param section of the Results tab. Fixed effects are named “*_pop”, the standard deviation of the random effects “omega_*”, parameters of the error model “a”, “b”, “c”, the correlation between random effects “corr_*_*” and parameters associated to covariates “beta_*_*”.

When the “Standard errors” task has also been run, the standard error (s.e), relative standard error (r.s.e) and p-values for covariate betas are also displayed in this result section. The total elapsed time for this task is shown at the bottom.

In the output folder

After having run the estimation of the population parameters, the following files are available:

  • summary.txt: contains the estimated population parameters, in a format easily readable by a human (but not easy to parse for a computer)
  • populationParameters.txt: contains the estimated population parameters (by default in csv format)
  • predictions.txt: contains for each individual and each observation time, the observed data (y), the prediction using the population parameters and population median covariates value from the data set (popPred_medianCOV), the prediction using the population parameters and individual covariates value (popPred), the prediction using the individual approximate conditional mean calculated from the last iterations of SAEM (indivPred_SAEM) and the corresponding weighted residual (indWRes_SAEM).
  • IndividualParameters/estimatedIndividualParameters.txt: individual parameters corresponding to the approximate conditional mean, calculated using the last estimations of SAEM (*_SAEM)
  • IndividualParameters/estimatedRandomEffects.txt: individual random effects corresponding to the approximate conditional mean, calculated using the last estimations of SAEM (*_SAEM)

More details about the content of the output files can be found here.

Settings

The settings are accessible through the interface via the button next to the parameter estimation task.

Burn-in phase:

The burn-in phase corresponds to an “initialization” of SAEM. Note: the meaning of the burn-in phase in Monolix is different from what is called burn-in in Nonmem algorithms.

  • Number of iterations (default: 5): number of iterations of the burn-in phase

Exploratory phase

  • Auto-stop criteria (default: yes): if ticked, auto-stop criteria are used to automatically detect convergence during the exploratory phase. If convergence is detected, the algorithm switches to the smoothing phase before the maximum number of iterations.
  • Maximum number of iterations (default: 500, if auto-stop ticked): maximum number of iterations for the exploratory phase. Even if the auto-stop criteria are not fulfilled, the algorithm switches to the smoothing phase after this maximum number of iterations.
  • Minimum number of iterations (default: 150, if auto-stop ticked): minimum number of iterations for the exploratory phase. This value also corresponds to the interval length over which the auto-stop criteria are tested. A larger minimum number of iterations means that the auto-stop criteria are harder to reach.
  • Number of iterations (default: 500, if auto-stop unticked): fixed number of iterations for the exploratory phase.
  • Step-size exponent (default: 0): The value, between 0 and 1, represents the memory of the stochastic process, i.e. how much weight is given at iteration k to the value of the previous iteration compared to the new information collected. A value of 0 means no memory, i.e. the parameter value at iteration k is built from the information collected at that iteration only, and does not take into account the value of the parameter at the previous iteration.
  • Simulated annealing (default: enabled): the Simulated Annealing version of SAEM permits to better explore the parameter space by constraining the standard deviation of the random effects to decrease slowly. For more details on this setting, see here.
  • Decreasing rate for the variance of the residual errors (default: 0.95, if simulated annealing enabled): the residual error variance (parameter “a” for a constant error model for instance) at iteration k is constrained to be larger than the decreasing rate times the variance at the previous iteration.
  • Decreasing rate for the variance of the individual parameters (default: 0.95, if simulated annealing enabled): the variance of the random effects at iteration k is constrained to be larger than the decreasing rate times the variance at the previous iteration.
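
The two decreasing-rate settings above can be illustrated with a minimal sketch. It only encodes the constraint as stated (the variance kept at iteration k cannot fall below the decreasing rate times the previous variance). The function name and the sequence of candidate values are hypothetical and are not taken from Monolix.

```python
def constrained_variance(candidate, previous, rate=0.95):
    """Variance kept at iteration k: never below rate * previous variance."""
    return max(candidate, rate * previous)


# Hypothetical unconstrained updates that would collapse quickly toward 0.1.
previous = 1.0
trajectory = [previous]
for candidate in [0.4, 0.2, 0.1, 0.1, 0.1]:
    previous = constrained_variance(candidate, previous)
    trajectory.append(previous)

# Each kept value is 0.95 times the previous one: the decrease is slowed down.
print([round(v, 4) for v in trajectory])
```

With a rate of 0.95 the variance can lose at most 5% per iteration, which is what keeps the exploration of the parameter space wide during the first iterations.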

Smoothing phase

  • Auto-stop criteria (default: yes): if ticked, auto-stop criteria are used to automatically detect convergence during the smoothing phase. If convergence is detected, the algorithm stops before the maximum number of iterations.
  • Maximum number of iterations (default: 200, if auto-stop ticked): maximum number of iterations for the smoothing phase. Even if the auto-stop criteria are not fulfilled, the algorithm stops after this maximum number of iterations.
  • Minimum number of iterations (default: 50, if auto-stop ticked): minimum number of iterations for the smoothing phase. This value also corresponds to the interval length over which the auto-stop criteria are tested. A larger minimum number of iterations means that the auto-stop criteria are harder to reach.
  • Number of iterations (default: 200, if auto-stop unticked): fixed number of iterations for the smoothing phase.
  • Step-size exponent (default: 0.7): The value, comprised between 0 and 1, represents memory of the stochastic process, i.e how much weight is given at iteration k to the value of the previous iteration compared to the new information collected.  The value must be strictly larger than 0.5 for the smoothing phase to converge. Large values (close to 1) will result in a smoother parameter trajectory during the smoothing phase, but may take longer to converge to the maximum likelihood estimate.
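
To make the role of the step-size exponent more concrete, here is a small sketch of a stochastic approximation update. It assumes the step size takes the usual form gamma_k = 1/k^a, with a the step-size exponent and k counted from the start of the phase. This page does not spell out the formula, so treat that form, the function names and the input values as assumptions.

```python
def step_size(k, exponent):
    """Assumed step size at iteration k (k starts at 1): gamma_k = 1 / k**exponent."""
    return 1.0 / k ** exponent


def stochastic_approximation(values, exponent):
    """Combine successive noisy values; gamma_k weights the new information."""
    estimate = values[0]  # gamma_1 = 1: the first value is taken as is
    for k, value in enumerate(values[1:], start=2):
        gamma = step_size(k, exponent)
        estimate = estimate + gamma * (value - estimate)
    return estimate


noisy_values = [1.0, 3.0, 2.0, 4.0]  # hypothetical values collected at each iteration
no_memory = stochastic_approximation(noisy_values, exponent=0.0)    # last value only
full_memory = stochastic_approximation(noisy_values, exponent=1.0)  # running average
```

With an exponent of 0 the estimate is simply the latest value (no memory), which suits the exploratory phase. With an exponent of 1 it is the average of all values so far. Intermediate values such as 0.7 give progressively more weight to the past, which smooths the trajectory.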

Methodology for parameters without variability (if parameters without variability are present in the model):

The SAEM algorithm requires drawing parameter values from their marginal distribution, which does not exist for parameters without variability. These parameters are thus estimated via another method, which can be chosen among:

  • No variability (default choice): After each SAEM iteration, the parameters without variability are optimized using the Nelder-Mead simplex algorithm. The absolute tolerance (stopping criterion) is 1e-4 and the maximum number of iterations is 20 times the number of parameters to calculate via this algorithm.
  • Add decreasing variability: an artificial variability is added for these parameters, allowing estimation via SAEM. The variability is progressively decreased such that at the end of the estimation process, the parameter has a variability of 1e-5.
  • Variability in the first stage: during the exploratory phase, an artificial variability is added and progressively forced to 1e-5 (same as above). In the smoothing phase, the Nelder-Mead simplex optimization algorithm is used.

Handling parameters without variability is also discussed here.

Good practice and tips

[under construction]

3.3.Conditional distribution







Purpose

The conditional distribution represents the uncertainty of the individual parameter values. The conditional distribution estimation task samples from this distribution. The samples are used to calculate the conditional mean, or directly as estimators of the individual parameters in the plots to improve their informativeness [1]. They are also used to compute the statistical tests.

Calculation of the conditional distribution

Conditional distribution

The conditional distribution is \(p(\psi_i|y_i;\hat{\theta})\) with \(\psi_i\) the individual parameters for individual i, \(\hat{\theta}\) the estimated population parameters, and \(y_i\) the data (observations) for individual i. The conditional distribution represents the uncertainty of the individual’s parameter value, taking into account the information at hand for this individual:

  • the observed data for that individual,
  • the covariate values for that individual,
  • and the fact that the individual belongs to the population for which we have already estimated the typical parameter value (fixed effects) and the variability (standard deviation of the random effects).

It is not possible to directly calculate the probability for a given \(\psi_i\) (no closed form), but it is possible to obtain samples from the distribution using a Markov-Chain Monte-Carlo procedure (MCMC).

MCMC algorithm

MCMC methods are a class of algorithms for sampling from probability distributions for which direct sampling is difficult. They consist of constructing a stochastic procedure which, in its stationary state, yields draws from the probability distribution of interest. Among the MCMC class, we use the Metropolis-Hastings (MH) algorithm, which has the property of being able to sample probability distributions which can be computed up to a constant. This is the case for our conditional distribution, which can be rewritten as:

$$p(\psi_i|y_i)=\frac{p(y_i|\psi_i)p(\psi_i)}{p(y_i)}$$

\(p(y_i|\psi_i)\) is the conditional density function of the data when knowing the individual parameter values and can be computed (closed form solution). \(p(\psi_i)\) is the density function for the individual parameters and can also be computed. The likelihood \(p(y_i)\) has no closed form solution but it is constant.

In brief, the MH algorithm works in the following way: at each iteration k, a new individual parameter value is drawn from a proposal distribution for each individual. The new value is accepted with a probability that depends on \(p(\psi_i)\) and \(p(y_i|\psi_i)\). After a transition period, the algorithm reaches a stationary state where the accepted values follow the conditional distribution probability \(p(\psi_i|y_i)\). For the proposal distribution, three different distributions are used in turn with a (2,2,2) pattern (setting “Number of iterations of kernel 1/2/3” in Settings > Project Settings): the population distribution, a unidimensional Gaussian random walk, or a multidimensional Gaussian random walk. For the random walks, the variance of the Gaussian is automatically adapted to reach an optimal acceptance ratio (“target acceptance ratio” setting in Settings > Project Settings).
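
The procedure can be sketched for a toy case. The example below is only an illustration under strong assumptions, not the Monolix implementation: a single individual, a single normally distributed parameter, a constant error model, and a random walk with a fixed (non-adapted) standard deviation. With one parameter, the unidimensional and multidimensional random walks coincide, so only two kernels are alternated, two iterations each. All names and numerical values are hypothetical.

```python
import math
import random


def log_normal_pdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd ** 2) - 0.5 * ((x - mean) / sd) ** 2


def log_prior(psi, psi_pop, omega):
    """log p(psi_i): population distribution of the individual parameter."""
    return log_normal_pdf(psi, psi_pop, omega)


def log_lik(y, psi, a):
    """log p(y_i | psi_i) for the toy model y_ij = psi_i + a * eps_ij."""
    return sum(log_normal_pdf(obs, psi, a) for obs in y)


def metropolis_hastings(y, psi_pop, omega, a, n_iter=20000, rw_sd=0.5, seed=0):
    rng = random.Random(seed)
    psi = psi_pop
    draws = []
    for k in range(n_iter):
        if (k // 2) % 2 == 0:
            # Kernel 1: proposal drawn from the population distribution.
            # The prior cancels out, only the data density remains in the ratio.
            cand = rng.gauss(psi_pop, omega)
            log_ratio = log_lik(y, cand, a) - log_lik(y, psi, a)
        else:
            # Kernel 2: symmetric Gaussian random walk around the current value.
            cand = rng.gauss(psi, rw_sd)
            log_ratio = (log_lik(y, cand, a) + log_prior(cand, psi_pop, omega)
                         - log_lik(y, psi, a) - log_prior(psi, psi_pop, omega))
        # Accept the candidate with probability min(1, ratio).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            psi = cand
        draws.append(psi)
    return draws


y_i = [1.2, 0.8, 1.1]  # hypothetical observations for one individual
draws = metropolis_hastings(y_i, psi_pop=0.0, omega=1.0, a=0.5)
kept = draws[1000:]  # discard the transition period
conditional_mean = sum(kept) / len(kept)
```

Note that \(p(y_i)\) never appears in the acceptance ratio, which is why the conditional distribution only needs to be known up to a constant. In this toy model the conditional distribution is Gaussian and known exactly, so the sampled mean can be checked against the analytical value (12.4/13, about 0.954).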

Conditional mean

The draws from the conditional distribution generated by the MCMC algorithm can be used to estimate any summary statistics of the distribution (mean, standard deviation, quantiles, etc). In particular we calculate the conditional mean by averaging over all draws:

$$ \hat{\psi}_i^{mean} = \frac{1}{K}\sum_{k=1}^{K}\psi_i^{k}$$

The standard deviation of the conditional distribution is also calculated.

Samples from the conditional distribution

Among all samples from the conditional distribution, a small number (between 1 and 10, see “Simulated parameters per individual” setting) is kept to be used in the plots. These samples are unbiased estimators and they present the advantage of not being affected by shrinkage, as shown for example in the documentation of the plot “distribution of the individual parameters”.

Shrinkage and the use of random samples from the conditional distribution are explained in more detail here.

Stopping criteria

At iteration k, the conditional mean is calculated for each individual by averaging over all k previous iterations. The average conditional means over all individuals (noted E(X|y)), and the standard deviation of the conditional means over all individuals (noted sd(X|y)) are calculated and displayed in the pop-up window. The algorithm stops when, for all parameters, the average conditional means and standard deviations of the last 50 iterations (“Interval length” setting) do not deviate by more than 5% (2.5% in each direction, “relative interval” setting) from the average and standard deviation values at iteration k.
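
The stopping rule can be written down as a short check. This is a sketch of the rule as described on this page, not the code used by Monolix. Each trajectory is the sequence of E(X|y) or sd(X|y) values over the iterations for one parameter, and the function and argument names are hypothetical.

```python
def within_tube(trajectory, interval_length=50, relative_interval=0.05):
    """True if the last `interval_length` values stay within the tube
    centered on the last value (half-width: relative_interval / 2)."""
    if len(trajectory) < interval_length:
        return False
    last = trajectory[-1]
    half_width = 0.5 * relative_interval * abs(last)
    return all(abs(v - last) <= half_width for v in trajectory[-interval_length:])


def has_converged(trajectories, interval_length=50, relative_interval=0.05):
    """The task stops only when every monitored trajectory is in its tube."""
    return all(within_tube(t, interval_length, relative_interval)
               for t in trajectories)


# Hypothetical trajectory that settles progressively around 1.
settling = [1.0 + 0.5 / (k + 1) for k in range(200)]
print(within_tube(settling[:60]), within_tube(settling))
```

In this example the check fails after 60 iterations, because the values from iteration 11 onwards are still too far from the current one, and succeeds once the trajectory has flattened.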

Running the conditional distribution estimation task

During the evaluation of the conditional distribution, the following plot pops up, displaying the average conditional means over all individuals (noted E(X|y)) and the standard deviation of the conditional means over all individuals (noted sd(X|y)) for each iteration of the MCMC algorithm.

The convergence criterion described above means that the blue line, which represents the average over all individuals of the conditional mean, must be within the tube. The tube is centered around the last value of the blue line and spans 5% of that last value. The algorithm stops when all blue lines are in their tube.

Dependencies between tasks:

  • The “Population parameters” task must be run before launching the conditional distribution task.
  • Running the conditional distribution task is recommended before running the log-likelihood task without the linearization method (i.e. log-likelihood via importance sampling).
  • The conditional distribution task is necessary for the statistical tests.
  • The samples generated during the conditional distribution task will be reused for the Standard errors task (without linearization).

Outputs

In the graphical user interface

In the Indiv.Param section of the Results tab, a summary of the estimated conditional mean is given (min, max and quartiles) as shown in the figure below.

In the output folder

After having run the conditional distribution task, the following files are available:

  • summary.txt: contains the summary statistics (as displayed in the GUI)
  • IndividualParameters/estimatedIndividualParameters.txt: the individual parameters for each subject-occasion are displayed. The conditional mean (*_mean) and the standard deviation (*_sd) of the conditional distribution are added to the file.
  • IndividualParameters/estimatedRandomEffects.txt: the individual random effects for each subject-occasion are displayed. Those corresponding to the conditional mean (*_mean) are added to the file, together with the standard deviation (*_sd).
  • IndividualParameters/simulatedIndividualParameters.txt: several simulated individual parameters (draws from the conditional distribution) are recorded for each individual. The rep column permits to distinguish the several simulated parameters for each individual.
  • IndividualParameters/simulatedRandomEffects.txt: the random effects corresponding to the simulated individual parameters are recorded.

More details about the content of the output files can be found here.

Settings

To change the settings, you can click on the settings button next to the conditional distribution task.

  • Interval length (default: 50): number of iterations over which the convergence criterion is checked.
  • Relative interval (default: 0.05): size of the interval (relative to the current average or standard deviation) in which the last “interval length” iterations must be for the stopping criteria to be met. A value at 0.05 means that over the last “interval length” iterations, the value should not vary by more than 5% (2.5% in each direction).
  • Simulated parameters per individual (default: via calculation): number of draws from the conditional distribution that will be used in the plots. The number is calculated as min(10, idealNb) with idealNb = max(500 / number of subjects, 5000 / number of observations). This means that the maximum number is 10 (which is usually the case for small data sets). For large data sets, the number may be reduced, but the number of individuals times the number of simulated parameters should be at least 500, and the number of observations times the number of simulated parameters should be at least 5000. This ensures a sufficiently large, but not unnecessarily large, number of dots in plots such as Observations versus predictions or Correlation between random effects.
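
As an illustration, the default number of simulated parameters can be computed as below. The formula is the one given above. Rounding up to the next integer and the lower bound of 1 are assumptions made for this sketch (the page only states that the number lies between 1 and 10), and the function name is hypothetical.

```python
import math


def simulated_parameters_per_individual(n_subjects, n_observations):
    """min(10, idealNb) with idealNb = max(500 / subjects, 5000 / observations)."""
    ideal_nb = max(500 / n_subjects, 5000 / n_observations)
    # Assumption: round up so that both "at least" conditions hold, keep >= 1.
    return min(10, max(1, math.ceil(ideal_nb)))


print(simulated_parameters_per_individual(20, 200))      # small data set
print(simulated_parameters_per_individual(200, 2000))    # larger data set
print(simulated_parameters_per_individual(1000, 20000))  # very large data set
```

For example, 200 subjects with 2000 observations in total give idealNb = 2.5, so 3 draws per individual are enough to reach at least 500 simulated individuals and 5000 simulated observations.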

3.4.EBEs

Purpose

EBEs stands for Empirical Bayes Estimates. The EBEs are the most probable values of the individual parameters (parameters for each individual), given the estimated population parameters and the data of each individual. In more mathematical terms, they are the mode of the conditional parameter distribution for each individual.

These values are useful to compute the most probable prediction for each individual, for comparison with the data (for instance in the Individual Fits plot).

Calculation of the EBEs (conditional mode)

When launching the “EBEs” task, the mode of the conditional parameter distribution is calculated.

Conditional distribution

The conditional distribution is \( p(\psi_i|y_i;\hat{\theta})\) with \(\psi_i\) the individual parameters for individual i, \(\hat{\theta}\) the estimated population parameters, and \(y_i\) the data (observations) for individual i. The conditional distribution represents the uncertainty of the individual’s parameter value, taking into account the information at hand for this individual: the observed data for that individual, the covariate values for that individual and the fact that the individual belongs to the population for which we have already estimated the typical parameter value (fixed effects) and the variability (standard deviation of the random effects). It is not possible to directly calculate the probability for a given \(\psi_i\) (no closed form), but it is possible to obtain samples from the distribution using a Markov-Chain Monte-Carlo procedure (MCMC). More details are given on the Conditional Distribution page.

Mode of the conditional distribution

The mode is the parameter value with the highest probability:

$$ \hat{\psi}_i^{mode} = \underset{\psi_i}{\textrm{arg max }}p(\psi_i|y_i;\hat{\theta})$$

To find the mode, we thus need to maximize the conditional probability with respect to the individual parameter value \(\psi_i\).

Individual random effects

Once the individual parameters values \(\psi_i\) are known, the corresponding individual random effects can be calculated using the population parameters and covariates. Taking the example of a parameter \(\psi\) having a normal distribution within the population and that depends on the covariate \(c\), we can write for individual \(i\):

$$ \psi_i = \psi_{pop} + \beta \times c_i + \eta_i$$

As \(\psi_i\) (estimated conditional mode), \(\psi_{pop}\) and \(\beta\) (population parameters) and \(c_i\) (individual covariate value) are known, the individual random effect \(\eta_i\) can easily be calculated.
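
Rearranging the equation above gives the random effect explicitly:

$$ \eta_i = \psi_i - \psi_{pop} - \beta \times c_i $$

As a purely illustrative example with made-up values, \(\psi_i = 1.3\), \(\psi_{pop} = 1\), \(\beta = 0.5\) and \(c_i = 0.2\) give \(\eta_i = 1.3 - 1 - 0.5 \times 0.2 = 0.2\). For a parameter with a log-normal distribution, the same calculation would be done on the log scale, i.e. with \(\log(\psi_i)\) and \(\log(\psi_{pop})\) in place of \(\psi_i\) and \(\psi_{pop}\).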

Algorithm

For each individual, to find the \(\psi_i\) values that maximize the conditional distribution, we use the Nelder-Mead Simplex algorithm [1].

As the conditional distribution does not have a closed form solution (i.e \(p(\psi_i|y_i;\hat{\theta})\) cannot be directly or easily calculated for a given \(\psi_i\)), we use the Bayes law to rewrite it in the following way (leaving the population parameters \(\hat{\theta}\) out for clarity):

$$p(\psi_i|y_i)=\frac{p(y_i|\psi_i)p(\psi_i)}{p(y_i)}$$

The conditional density function of the data when knowing the individual parameter values (i.e. \(p(y_i|\psi_i)\)) is easy to calculate, as well as the density function for the individual parameters (i.e. \(p(\psi_i)\)), because they have closed form solutions. In contrast, the likelihood \(p(y_i)\) has no closed form solution. But as it does not depend on \(\psi_i\), we can leave it out of the optimization procedure and only optimize \(p(y_i|\psi_i)p(\psi_i)\).
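
The optimization can be sketched on a toy model. The code below is a minimal, self-written Nelder-Mead search with the standard coefficients, applied to \(-\log\left(p(y_i|\psi_i)p(\psi_i)\right)\) for one individual, one normally distributed parameter and a constant error model. It is not the Monolix implementation: the stopping rule (simplex size and objective spread both below the tolerance), the model and all numerical values are assumptions chosen for illustration.

```python
def nelder_mead(f, x0, step=0.5, tol=1e-6, max_iter=200):
    """Minimize f with a basic Nelder-Mead simplex search."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x0[j] + (step if j == i else 0.0) for j in range(n)] for i in range(n)
    ]
    values = [f(p) for p in simplex]
    for _ in range(max_iter):
        # Order the vertices from best (lowest f) to worst.
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        size = max(abs(p[j] - simplex[0][j]) for p in simplex for j in range(n))
        if size < tol and values[-1] - values[0] < tol:
            break
        worst = simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        f_reflected = f(reflected)
        if f_reflected < values[0]:
            # Reflection is the new best point: try to go further (expansion).
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_expanded = f(expanded)
            if f_expanded < f_reflected:
                simplex[-1], values[-1] = expanded, f_expanded
            else:
                simplex[-1], values[-1] = reflected, f_reflected
        elif f_reflected < values[-2]:
            simplex[-1], values[-1] = reflected, f_reflected
        else:
            # Contraction of the worst point toward the centroid.
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_contracted = f(contracted)
            if f_contracted < values[-1]:
                simplex[-1], values[-1] = contracted, f_contracted
            else:
                # Shrink all vertices toward the best one.
                best = simplex[0]
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)] for p in simplex[1:]
                ]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    return simplex[min(range(n + 1), key=lambda i: values[i])]


# Toy model (hypothetical values): y_ij = psi_i + a * eps_ij, psi_i ~ N(psi_pop, omega^2)
y_i, psi_pop, omega, a = [1.2, 0.8, 1.1], 0.0, 1.0, 0.5


def neg_log_posterior(psi):
    """-log(p(y_i|psi_i) p(psi_i)) up to an additive constant; p(y_i) is left out."""
    data_term = sum((obs - psi[0]) ** 2 for obs in y_i) / (2 * a ** 2)
    prior_term = (psi[0] - psi_pop) ** 2 / (2 * omega ** 2)
    return data_term + prior_term


psi_mode = nelder_mead(neg_log_posterior, [psi_pop])[0]
```

Starting the search from the population value is a natural choice here. In this linear Gaussian toy model the mode is known analytically (12.4/13, about 0.954), which makes it easy to verify that the search ends up at the right place.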

Running the EBEs task

When running the EBEs task, the progress is displayed in the pop-up window:

Dependencies between tasks:

Outputs

In the graphical user interface

In the Indiv.Param section of the Results tab, a summary of the individual parameters is proposed (min, max, median and quartiles) as shown in the figure below. The elapsed time for this task is also shown. To see the estimated parameter value for each individual, the user is invited to open the output files, which can be accessed via the folder icon at the bottom left.

In the output folder

After having run the EBEs task, the following files are available:

  • summary.txt: contains the summary statistics (as displayed in the GUI)
  • IndividualParameters/estimatedIndividualParameters.txt: the individual parameters for each subject-occasion are displayed. In addition to the already present approximation conditional mean from SAEM (*_SAEM), the conditional mode (*_mode) is added to the file.
  • IndividualParameters/estimatedRandomEffects.txt: the individual random effects for each subject-occasion are displayed (*_mode), in addition to the already present value based on the approximate conditional mean from SAEM (*_SAEM).

More details about the content of the output files can be found here.

Settings

The settings are accessible through the interface via the button next to the EBEs task.

 

  • Maximum number of iterations (default: 200): maximum number of iterations for the Nelder-Mead Simplex algorithm, for each individual. Even if the tolerance criterion is not met, the algorithm stops after that number of iterations.
  • Tolerance (default: 1e-6): absolute tolerance criterion. The algorithm stops when the change of the conditional probability value between two iterations is less than the tolerance.

3.5.Standard error using the Fisher Information Matrix

Purpose

The standard errors represent the uncertainty of the estimated population parameters. In Monolix, they are calculated via the estimation of the Fisher Information Matrix. They can for instance be used to calculate confidence intervals or detect model overparametrization.

 

Calculation of the standard errors

Several methods have been proposed to estimate the standard errors, such as bootstrapping or via the Fisher Information Matrix (FIM). In the Monolix GUI, the standard errors are estimated via the FIM. Bootstrapping will be available soon via an R package.

The Fisher Information Matrix (FIM)

The Fisher information matrix (FIM) \(I \) is minus the matrix of second derivatives of the observed log-likelihood:

$$ I(\hat{\theta}) = -\frac{\partial^2}{\partial\theta^2}\log({\cal L}_y(\hat{\theta})) $$

The log-likelihood cannot be calculated in closed form and the same applies to the Fisher Information Matrix. Two different methods are available in Monolix for the calculation of the Fisher Information Matrix: by linearization or by stochastic approximation.

Via linearization

This method can be applied for continuous data only. A continuous model can be written as:

$$\begin{array}{cl} y_{ij} &= f(t_{ij},z_i)+g(t_{ij},z_i)\epsilon_{ij} \\ z_i &= z_{pop}+\eta_i \end{array}$$

with \( y_{ij} \) the observations, f the prediction, g the error model, \( z_i\) the individual parameter value for individual i, \( z_{pop}\) the typical parameter value within the population and \(\eta_i\) the random effect.
Linearizing the model means using a Taylor expansion in order to approximate the observations \( y_{ij} \) by a normal distribution. In the formulation above, the appearance of the random variable \(\eta_i\) in the prediction f in a nonlinear way leads to a complex (non-normal) distribution for the observations \( y_{ij} \).
The Taylor expansion is done around the EBEs value, that we note \( z_i^{\textrm{mode}} \).
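
As a sketch of this idea (a standard first-order expansion written with the notation above, not necessarily the exact expression used internally), the prediction is approximated around \( z_i^{\textrm{mode}} \) by:

$$ f(t_{ij},z_i) \simeq f(t_{ij},z_i^{\textrm{mode}}) + \frac{\partial f}{\partial z}(t_{ij},z_i^{\textrm{mode}})\,(z_i - z_i^{\textrm{mode}}) $$

Replacing \( z_i \) by \( z_{pop}+\eta_i \) and evaluating the error model at \( z_i^{\textrm{mode}} \) gives:

$$ y_{ij} \simeq f(t_{ij},z_i^{\textrm{mode}}) + \frac{\partial f}{\partial z}(t_{ij},z_i^{\textrm{mode}})\,(z_{pop} + \eta_i - z_i^{\textrm{mode}}) + g(t_{ij},z_i^{\textrm{mode}})\epsilon_{ij} $$

In this approximation \( y_{ij} \) is a linear combination of the normally distributed variables \(\eta_i\) and \(\epsilon_{ij}\), so it is itself normally distributed, and the likelihood and its derivatives can be written in closed form.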

Standard errors

Once the Fisher Information Matrix has been obtained, the standard errors can be calculated as the square root of the diagonal elements of the inverse of the Fisher Information Matrix. The inverse of the FIM \(I(\hat{\theta})\) is the variance-covariance matrix \(C(\hat{\theta})\):

$$C(\hat{\theta})=I(\hat{\theta})^{-1}$$

The standard error for parameter \( \hat{\theta}_k \) can be calculated as:

$$\textrm{s.e}(\hat{\theta}_k)=\sqrt{\tilde{C}_{kk}(\hat{\theta})}$$

Note that in Monolix, the Fisher Information Matrix and variance-covariance matrix are calculated on the transformed normally distributed parameters. The variance-covariance matrix \( \tilde{C} \) for the untransformed parameters can be obtained using the jacobian \(J\):

$$\tilde{C}=J^TC J$$

Correlation matrix

The correlation matrix is calculated from the variance-covariance matrix as:

$$\text{corr}(\theta_i,\theta_j)=\frac{\tilde{C}_{ij}}{\textrm{s.e}(\theta_i)\textrm{  s.e}(\theta_j)}$$
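
The chain from the FIM to the standard errors and the correlation matrix can be followed on a small numerical example. The sketch below uses a made-up 2x2 FIM and hypothetical parameter estimates, and, to keep it short, treats the inverse of the FIM directly as \(\tilde{C}\), i.e. it ignores the Jacobian step.

```python
import math


def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix: standard errors cannot be computed")
    return [[d / det, -b / det], [-c / det, a / det]]


def standard_errors(cov):
    """Square roots of the diagonal of the variance-covariance matrix."""
    return [math.sqrt(cov[k][k]) for k in range(len(cov))]


def correlation_matrix(cov):
    se = standard_errors(cov)
    n = len(cov)
    return [[cov[i][j] / (se[i] * se[j]) for j in range(n)] for i in range(n)]


fim = [[400.0, 100.0], [100.0, 100.0]]  # hypothetical Fisher Information Matrix
estimates = [1.0, 0.5]                  # hypothetical parameter estimates

cov = invert_2x2(fim)
se = standard_errors(cov)
rse_percent = [100 * s / abs(est) for s, est in zip(se, estimates)]
corr = correlation_matrix(cov)
```

Here the two estimates have a correlation of -0.5, and the second parameter, about which the FIM carries less information, has the larger standard error.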

Wald test

For the beta parameters characterizing the influence of the covariates, the relative standard error can be used to perform a Wald test, testing if the estimated beta value is significantly different from zero.
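
A Wald test of this kind can be sketched as follows. The sketch assumes the usual form of the test: the statistic is the estimate divided by its standard error (the inverse of the relative standard error) and is compared to a standard normal distribution, two-sided. The reference distribution is an assumption of this example, and the function name and values are hypothetical.

```python
import math


def wald_p_value(beta, se):
    """Two-sided p-value for H0: beta = 0, using a standard normal reference."""
    z = beta / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical covariate effect with a relative standard error of 40%.
p_value = wald_p_value(beta=0.5, se=0.2)
```

With a relative standard error of 40%, the statistic is 2.5 and the p-value is about 0.012. A relative standard error above roughly 51% (statistic below 1.96) would give a p-value above 0.05.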

 

Running the standard errors task

When running the standard error task, the progress is displayed in the pop-up window. At the end of the task, the correlation matrix is also shown:

Dependencies between tasks:

The “Population parameters” task must be run before launching the Standard errors task. If the Conditional distribution task has already been run, the first iterations of the Standard errors (without linearization) will be very fast, as they will reuse the same draws as those obtained in the Conditional distribution task.

 

Output

In the graphical user interface

In the Pop.Param section of the Results tab, three additional columns appear in addition to the estimated population parameters:

  • S.E: the estimated standard errors
  • R.S.E: the relative standard error (standard error divided by the estimated parameter value)
  • P-VALUE (in case of covariates): p-values obtained from a Wald test on the beta parameters associated to covariates. The Wald test tests if the estimated beta value is significantly different from zero.

To help the user in the interpretation, a color code is used for the p-value and the RSE:

  • For the p-value: between .01 and .05, between .001 and .01, and less than .001.
  • For the RSE: between 50% and 100%, between 100% and 200%, and more than 200%.

When the standard errors were estimated both with and without linearization, the S.E, R.S.E and P-VALUE are displayed for both methods.

In the STD.ERRORS section of the Results tab, we display:

  • R.S.E: the relative standard errors
  • Correlation matrix: the correlation matrix of the population parameters
  • Eigen values: the smallest and largest eigen values, as well as the condition number (max/min)

To help the user in the interpretation, a color code is used:

  • For the correlation: between .5 and .8, between .8 and .9, and higher than .9.
  • For the RSE: between 50% and 100%, between 100% and 200%, and more than 200%.

When the standard errors were estimated both with and without linearization, both results appear in different subtabs.

If you hover on a specific value with the mouse, both parameters are highlighted to know easily which parameter you are looking at:

In the output folder

After having run the Standard errors task, the following files are available:

  • summary.txt: contains the s.e, r.s.e, p-values, correlation matrix and eigenvalues in an easily readable format
  • populationParameters.txt: contains the s.e, r.s.e and p-values in csv format, for the method with (*_lin) or without (*_sa) linearization
  • FisherInformation/correlationEstimatesSA.txt: correlation matrix of the population parameter estimates, method without linearization (stochastic approximation)
  • FisherInformation/correlationEstimatesLin.txt: correlation matrix of the population parameter estimates, method with linearization
  • FisherInformation/covarianceEstimatesSA.txt: variance-covariance matrix of the transformed normally distributed population parameter, method without linearization (stochastic approximation)
  • FisherInformation/covarianceEstimatesLin.txt: variance-covariance matrix of the transformed normally distributed population parameter, method with linearization

 

Interpreting the correlation matrix of the estimates

The color code of Monolix’s results makes it possible to quickly identify population parameter estimates that are strongly correlated. This often reflects model overparameterization and can be further investigated using Mlxplore and the convergence assessment. This is explained in detail in this video:





 

Settings

The settings are accessible through the interface via the button next to the Standard errors task:

  • Minimum number of iterations: minimum number of iterations of the stochastic approximation algorithm to calculate the Fisher Information Matrix.
  • Maximum number of iterations: maximum number of iterations of the stochastic approximation algorithm to calculate the Fisher Information Matrix. The algorithm stops even if the stopping criteria are not met.

 

Good practices and tips

When to use “use linearization method”?

Firstly, it is only possible to use the linearization method for continuous data. When linearization is available, this method is generally much faster than the method without linearization (i.e. stochastic approximation) but less precise. The Fisher Information Matrix by model linearization will generally be able to identify the main features of the model. More precise, and more time-consuming, estimation procedures such as stochastic approximation will have very limited impact in terms of decisions for these most obvious features. Precise results are required for the final runs, where it becomes more important to rigorously defend the decisions made to choose the final model and to provide precise estimates and diagnostic plots.

I have NaNs as results for standard errors for parameter estimates. What should I do? Does it impact the likelihood?

NaNs as standard errors often appear when the model is too complex and some parameters are unidentifiable. They can be seen as an infinitely large standard error.
The likelihood is not affected by NaNs in the standard errors. The estimated population parameters having a NaN as standard error are only very uncertain (infinitely large standard error and thus infinitely large confidence intervals).

3.6.Log Likelihood estimation

Purpose

The log-likelihood is the objective function and a key piece of information. The log-likelihood cannot be computed in closed form for nonlinear mixed effects models. It can however be estimated.

Log-likelihood estimation

Performing likelihood ratio tests and computing information criteria for a given model requires computation of the log-likelihood

$$ {\cal L}{\cal L}_y(\hat{\theta}) = \log({\cal L}_y(\hat{\theta})) \triangleq \log(p(y;\hat{\theta})) $$

where \(\hat{\theta}\) is the vector of population parameter estimates for the model being considered. The log-likelihood cannot be computed in closed form for nonlinear mixed effects models. It can however be estimated in a general framework for all kinds of data and models using the importance sampling Monte Carlo method. This method has the advantage of providing an unbiased estimate of the log-likelihood – even for nonlinear models – whose variance can be controlled by the Monte Carlo size.

Two different algorithms are proposed to estimate the log-likelihood: by linearization and by Importance sampling. The estimated log-likelihoods are computed and stored in the LogLikelihood folder in the result folder. In this folder, two files are stored:

  • logLikelihood.txt containing the OFV (objective function value), AIC, and BIC.
  • individualLL.txt containing the -2LL for each individual.

Log-likelihood by importance sampling

The observed log-likelihood \({\cal LL}(\theta;y)=\log({\cal L}(\theta;y))\) can be estimated without requiring approximation of the model, using a Monte Carlo approach. Since

$${\cal LL}(\theta;y) = \log(p(y;\theta)) = \sum_{i=1}^{N} \log(p (y_i;\theta))$$

we can estimate \(\log(p(y_i;\theta))\) for each individual and derive an estimate of the log-likelihood as the sum of these individual log-likelihoods. We will now explain how to estimate \(\log(p(y_i;\theta))\) for any individual i. Using the \(\phi\)-representation of the model (the individual parameters are transformed to be Gaussian), notice first that \(p(y_i;\theta)\) can be decomposed as follows:

$$p(y_i;\theta) = \int p(y_i,\phi_i;\theta)d\phi_i = \int p(y_i|\phi_i;\theta)p(\phi_i;\theta)d\phi_i = \mathbb{E}_{p_{\phi_i}}\left(p(y_i|\phi_i;\theta)\right)$$

Thus, \(p(y_i;\theta)\) is expressed as a mean. It can therefore be approximated by an empirical mean using a Monte Carlo procedure:

  1. Draw M independent values \(\phi_i^{(1)}\), \(\phi_i^{(2)}\), …, \(\phi_i^{(M)}\) from the marginal distribution \(p_{\phi_i}(.;\theta)\).
  2. Estimate \(p(y_i;\theta)\) with \(\hat{p}_{i,M}=\frac{1}{M}\sum_{m=1}^{M}p(y_i | \phi_i^{(m)};\theta)\)

By construction, this estimator is unbiased, and consistent since its variance decreases as 1/M:

$$\mathbb{E}\left(\hat{p}_{i,M}\right)=\mathbb{E}_{p_{\phi_i}}\left(p(y_i|\phi_i^{(m)};\theta)\right) = p(y_i;\theta) ~~~~\mbox{Var}\left(\hat{p}_{i,M}\right) = \frac{1}{M} \mbox{Var}_{p_{\phi_i}}\left(p(y_i|\phi_i^{(m)};\theta)\right)$$
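The two steps above can be illustrated on a toy model for which the exact value is known. This is only a minimal sketch: the model (a single observation with \(y_i|\phi_i \sim {\cal N}(\phi_i, a^2)\) and \(\phi_i \sim {\cal N}(\mu, \omega^2)\)), the numerical values and the function names are assumptions made for illustration and are not part of Monolix.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def naive_mc_likelihood(y_i, mu, omega, a, M, rng):
    """Estimate p(y_i; theta) as the empirical mean of p(y_i | phi)
    over M draws of phi from its marginal distribution N(mu, omega^2)."""
    total = 0.0
    for _ in range(M):
        phi = rng.gauss(mu, omega)        # step 1: draw from the marginal distribution
        total += normal_pdf(y_i, phi, a)  # conditional density p(y_i | phi)
    return total / M                      # step 2: empirical mean

rng = random.Random(12345)
y_i, mu, omega, a = 1.3, 0.0, 1.0, 0.5   # hypothetical values
estimate = naive_mc_likelihood(y_i, mu, omega, a, M=20000, rng=rng)
# For this toy model the marginal density of y_i is N(mu, omega^2 + a^2).
exact = normal_pdf(y_i, mu, math.sqrt(omega ** 2 + a ** 2))
print(estimate, exact)
```

With this toy model the estimate can be compared directly with the closed-form value, which is not possible for a general nonlinear mixed effects model.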

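We could consider ourselves satisfied with this estimator since we “only” have to select M large enough to get an estimator with a small variance. Nevertheless, it is possible to improve the statistical properties of this estimator by importance sampling: instead of drawing the \(\phi_i^{(m)}\) from the marginal distribution \(p_{\phi_i}\), they are drawn from a proposal distribution \(\tilde{p}_{\phi_i}\), and each term of the empirical mean is reweighted accordingly:

$$\hat{p}_{i,M}=\frac{1}{M}\sum_{m=1}^{M}p(y_i | \phi_i^{(m)};\theta)\frac{p(\phi_i^{(m)};\theta)}{\tilde{p}(\phi_i^{(m)};\theta)}$$

This estimator is still unbiased, and its variance is smallest when the proposal is the conditional distribution \(p_{\phi_i|y_i}\): in that case every term of the sum is equal to \(p(y_i;\theta)\) and the variance is zero.

The problem is that it is not possible to generate the \(\phi_i^{(m)}\) with this conditional distribution, since that would require computing a normalizing constant, which here is precisely \(p(y_i;\theta)\).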

Nevertheless, this conditional distribution can be estimated using the Metropolis-Hastings algorithm and a practical proposal “close” to the optimal proposal \(p_{\phi_i|y_i}\) can be derived. We can then expect to get a very accurate estimate with a relatively small Monte Carlo size M.

The mean and variance of the conditional distribution \(p_{\phi_i|y_i}\) are estimated by Metropolis-Hastings for each individual i. Then, the \(\phi_i^{(m)}\) are drawn from a shifted and scaled Student t-distribution:

$$ \phi_i^{(m)} = \mu_i + \sigma_i \times T_{i,m}$$

where \(\mu_i\) and \(\sigma^2_i\) are estimates of \(\mathbb{E}\left(\phi_i|y_i;\theta\right)\) and \(\mbox{Var}\left(\phi_i|y_i;\theta\right)\), and \((T_{i,m})\) is a sequence of i.i.d. random variables distributed with a Student’s t-distribution with \(\nu\) degrees of freedom.
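The following minimal sketch illustrates importance sampling with such a shifted and scaled t proposal. It relies on assumptions that are not part of Monolix: a toy model with a single observation (\(y_i|\phi_i \sim {\cal N}(\phi_i, a^2)\), \(\phi_i \sim {\cal N}(\mu, \omega^2)\)), hypothetical numerical values, and conditional moments \(\mu_i\) and \(\sigma_i\) computed in closed form instead of being estimated by Metropolis-Hastings.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def student_pdf(t, nu):
    """Density of the standard Student t-distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2.0 * math.log1p(t * t / nu))

def student_draw(nu, rng):
    """Draw T = Z / sqrt(V / nu), with Z ~ N(0, 1) and V ~ chi-square(nu)."""
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(nu / 2.0, 2.0)  # chi-square(nu) = Gamma(shape nu/2, scale 2)
    return z / math.sqrt(v / nu)

def is_likelihood(y_i, mu, omega, a, mu_i, sigma_i, nu, M, rng):
    """Importance sampling estimate of p(y_i; theta) with draws phi = mu_i + sigma_i * T."""
    total = 0.0
    for _ in range(M):
        t = student_draw(nu, rng)
        phi = mu_i + sigma_i * t
        proposal = student_pdf(t, nu) / sigma_i  # density of phi under the proposal
        total += normal_pdf(y_i, phi, a) * normal_pdf(phi, mu, omega) / proposal
    return total / M

rng = random.Random(2018)
y_i, mu, omega, a = 1.3, 0.0, 1.0, 0.5           # hypothetical values
# Conditional mean and variance of phi_i given y_i (closed form for this toy model only).
var_i = 1.0 / (1.0 / omega ** 2 + 1.0 / a ** 2)
mu_i = var_i * (mu / omega ** 2 + y_i / a ** 2)
estimate = is_likelihood(y_i, mu, omega, a, mu_i, math.sqrt(var_i), nu=5, M=2000, rng=rng)
exact = normal_pdf(y_i, mu, math.sqrt(omega ** 2 + a ** 2))
print(estimate, exact)
```

Because the proposal is close to the conditional distribution and has heavier tails, the importance weights stay bounded and a small Monte Carlo size is enough in this example.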

Remark: The standard error of the estimate over all the draws is also reported. It reflects the variability due to the random draws, not the uncertainty of the model.

Remark: Even if \(\hat{\cal L}_y(\theta)=\prod_{i=1}^{N}\hat{p}_{i,M}\) is an unbiased estimator of \({\cal L}_y(\theta)\), \(\hat{\cal LL}_y(\theta)=\log(\hat{\cal L}_y(\theta))\) is a biased estimator of \({\cal LL}_y(\theta)\). Indeed, by Jensen’s inequality, we have:

$$\mathbb{E}\left(\log(\hat{\cal L}_y(\theta))\right) \leq \log \left(\mathbb{E}\left(\hat{\cal L}_y(\theta)\right)\right)=\log\left({\cal L}_y(\theta)\right)$$

Best practice: the bias decreases as M increases and also if \(\hat{\cal L}_y(\theta)\) is close to \({\cal L}_y(\theta)\). It is therefore highly recommended to use a proposal as close as possible to the conditional distribution \(p_{\phi_i|y_i}\), which means estimating this conditional distribution before estimating the log-likelihood (i.e., running the task “Conditional distribution” first).
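As a rough guide to how this bias behaves, a second-order Taylor expansion of the logarithm around \(p(y_i;\theta)\) can be written for each individual. This is only a heuristic sketch, valid when \(\hat{p}_{i,M}\) is concentrated around \(p(y_i;\theta)\):

$$\mathbb{E}\left(\log(\hat{p}_{i,M})\right) \approx \log(p(y_i;\theta)) - \frac{\mbox{Var}\left(\hat{p}_{i,M}\right)}{2\, p(y_i;\theta)^2}$$

Since \(\mbox{Var}\left(\hat{p}_{i,M}\right)\) decreases as 1/M, the bias of each individual term is negative and of order 1/M, and it is smaller when the draws lead to an estimator with a smaller variance.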

Display and outputs

When the importance sampling method is used, a plot shows the evolution of the estimated mean value over the Monte Carlo iterations, as in the following:

The final estimations are displayed in the result frame as below.

In terms of output, a folder called LogLikelihood is created in the result folder, where the following files are created:

  • logLikelihood.txt: containing, for each computed method, the -2 x log-likelihood, the Akaike Information Criterion (AIC), and the Bayesian Information Criterion (BIC).
  • individualLL.txt: containing the -2 x log-likelihood for each individual for each computed method.
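The information criteria are penalized versions of the -2 x log-likelihood. The sketch below uses the usual textbook definitions; the function name and the numerical values are hypothetical, and the use of the number of subjects as sample size in the BIC penalty is an assumption that should be checked against the values reported in logLikelihood.txt.

```python
import math

def information_criteria(minus2ll, n_params, n_subjects):
    """Return (AIC, BIC) from the -2 x log-likelihood.

    AIC = -2LL + 2 * P
    BIC = -2LL + log(N) * P   (N taken here as the number of subjects: an assumption)
    """
    aic = minus2ll + 2.0 * n_params
    bic = minus2ll + math.log(n_subjects) * n_params
    return aic, bic

# Hypothetical example: -2LL = 339.387, 8 estimated parameters, 12 subjects.
aic, bic = information_criteria(minus2ll=339.387, n_params=8, n_subjects=12)
print(aic, bic)
```

Both criteria are compared between models fitted to the same data: the lower value is preferred.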

Advanced settings for the log-likelihood

A t-distribution is used as proposal. The number of degrees of freedom of this distribution can be either fixed or optimized. When it is optimized, the default candidate values are 1, 2, 5, 10 and 20 degrees of freedom. A distribution with a small number of degrees of freedom (i.e., heavy tails) should be avoided for models defined by stiff ODEs. We recommend setting the number of degrees of freedom to 5.

Log-likelihood by linearization

The likelihood of the nonlinear mixed effects model cannot be computed in closed form. An alternative is to approximate this likelihood by the likelihood of the Gaussian model deduced from the nonlinear mixed effects model after linearization of the function f (defining the structural model) around the predictions of the individual parameters \((\phi_i; 1 \leq i \leq N)\).
Notice that the log-likelihood cannot be computed by linearization for discrete outputs (categorical, count, etc.) or for mixture models.

Best practice: We strongly recommend computing the conditional mode before computing the log-likelihood by linearization. Indeed, the linearization should be made around the most probable values, as they are the same for both the linear and the nonlinear model.
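The linearization can be sketched with the usual first-order expansion. The notation \(\hat{\phi}_i\) (point of linearization), \(\phi_{{\rm pop},i}\) (mean of \(\phi_i\) under the population model), \(\Omega\) (variance of \(\phi_i\)), \(F_i\), \(\Sigma_i\) and \(g\) (function defining the residual error model) is introduced here for illustration only; the details of the actual implementation may differ:

$$y_i \approx f(t_i,\hat{\phi}_i) + F_i\,(\phi_i - \hat{\phi}_i) + g(t_i,\hat{\phi}_i)\,\varepsilon_i, ~~~~ F_i = \frac{\partial f}{\partial \phi}(t_i,\hat{\phi}_i)$$

Since \(\phi_i\) is Gaussian, the linearized model is Gaussian too, with

$$\mathbb{E}(y_i) \approx f(t_i,\hat{\phi}_i) + F_i\,(\phi_{{\rm pop},i} - \hat{\phi}_i) ~~~~ \mbox{Var}(y_i) \approx F_i\,\Omega\,F_i^T + \Sigma_i$$

where \(\Sigma_i\) is the diagonal matrix of the residual variances \(g(t_{ij},\hat{\phi}_i)^2\). The log-likelihood of this Gaussian model is available in closed form, which explains why the method is fast.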

Best practices: When should I use the linearization and when should I use the importance sampling?

Firstly, the linearization algorithm can only be used for continuous data. In that case, this method is generally much faster than the importance sampling method and also gives good estimates of the LL. The LL calculation by model linearization will generally be able to identify the main features of the model. More precise (and more time-consuming) estimation procedures such as stochastic approximation and importance sampling will have very limited impact in terms of decisions for these most obvious features. Selection of the final model should instead use the unbiased estimator obtained by Monte Carlo.

3.7.Algorithms convergence assessment

Monolix includes a convergence assessment tool. It allows the user to execute a workflow of estimation tasks several times, with different, randomly generated initial values of the fixed effects, as well as different seeds. The goal is to assess the robustness of the convergence.

Running the convergence assessment

For that, click on the button in the “Tasks” part.

A dedicated window pops up as in the figure below:

The user can define

    • the number of runs, or replicates
    • the type of assessment:
    • the initial parameters. By default, initial values are uniformly drawn from intervals defined around the estimated values if population parameters have been estimated, and around the initial estimates otherwise. Notice that it is possible to keep one initial parameter constant while generating the others. The minimum and maximum of the generated parameters can be modified by the user.

Notice that

  • When the standard errors and the log-likelihood are estimated by linearization, the individual parameters are also computed with the conditional mode method, so that the linearization is more relevant.
  • When the standard errors and the log-likelihood are estimated without linearization, the conditional distribution task is also run, so that the estimation is more relevant.
  • The workflow is the same for all the runs and is not the one defined in the interface.

Click on Run to execute the tool. You can thus estimate the population parameters using several seeds and/or several initial conditions.

Display and outputs

Several kinds of plots are given as a summary of the results.
First of all, the SAEM convergence assessment is displayed: the convergence of each parameter is shown for each run, which makes it possible to check whether each run has converged properly.

Then, a plot shows the estimated values for each replicate. If the estimation of the standard errors was included in the scenario, the estimated standard errors are also displayed as horizontal bars. This makes it possible to see whether all parameters converge statistically to the same values.

Finally, if the log-likelihood without linearization is used, the convergence curves of the importance sampling are displayed.

In addition, a result folder is generated for each set of initial parameters. Along with all the runs, there is a summary of all the runs providing all the population parameter estimates along with the -2LL, as in the following:

Parameters,Run_1,Run_2,Run_3,Run_4,Run_5
Cl_pop,0.03994527,0.04017999,0.04016216,0.04012077,0.0400175
V_pop,0.4575748,0.4556463,0.4560732,0.4557009,0.4569431
a,0.4239969,0.42482,0.4227559,0.4294611,0.435585
b,0.05653124,0.05684357,0.05700663,0.054965,0.05450724
ka_pop,1.527947,1.521184,1.5226,1.519333,1.519678
omega_Cl,0.2653109,0.2643172,0.268475,0.266199,0.2693083
omega_V,0.1293328,0.1274441,0.122951,0.1301242,0.1261098
omega_ka,0.6530206,0.6655251,0.643456,0.6425528,0.6424614
-2LL,339.387,339.417,339.429,339.444,339.462
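Such a summary can be post-processed to quantify how much the estimates vary between runs. The sketch below is only an illustration: it parses the table above from a string (in practice it would be read from the summary file, whose name is not given here) and computes, for each row, the range of the values divided by their mean. The function name and the choice of this spread measure are assumptions.

```python
import csv
import io

# Content of the summary shown above, copied as text.
SUMMARY = """Parameters,Run_1,Run_2,Run_3,Run_4,Run_5
Cl_pop,0.03994527,0.04017999,0.04016216,0.04012077,0.0400175
V_pop,0.4575748,0.4556463,0.4560732,0.4557009,0.4569431
a,0.4239969,0.42482,0.4227559,0.4294611,0.435585
b,0.05653124,0.05684357,0.05700663,0.054965,0.05450724
ka_pop,1.527947,1.521184,1.5226,1.519333,1.519678
omega_Cl,0.2653109,0.2643172,0.268475,0.266199,0.2693083
omega_V,0.1293328,0.1274441,0.122951,0.1301242,0.1261098
omega_ka,0.6530206,0.6655251,0.643456,0.6425528,0.6424614
-2LL,339.387,339.417,339.429,339.444,339.462"""

def relative_spread(summary_text):
    """Return {row name: (max - min) / |mean|} over the runs."""
    rows = csv.reader(io.StringIO(summary_text.strip()))
    next(rows)  # skip the header line
    spread = {}
    for name, *values in rows:
        values = [float(v) for v in values]
        mean = sum(values) / len(values)
        spread[name] = (max(values) - min(values)) / abs(mean)
    return spread

spread = relative_spread(SUMMARY)
worst = max(spread, key=spread.get)  # row that varies the most between runs
for name, value in spread.items():
    print(f"{name}: {100 * value:.2f}%")
```

A small spread for all parameters and for the -2LL suggests that the runs reached the same solution.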

Best practices: what is the use of the convergence assessment tool?

We cannot claim that SAEM always converges (i.e., with probability 1) to the global maximum of the likelihood. We can only say that it converges under quite general hypotheses to a maximum (global or perhaps local) of the likelihood. A large number of simulation studies have shown that SAEM converges with high probability to a “good” solution (hopefully the global maximum) after a small number of iterations. The purpose of this tool is to evaluate the SAEM algorithm with several initial conditions and see whether the estimated parameters correspond to the “global” maximum of the likelihood.
The trajectory of the outputs of SAEM depends on the sequence of random numbers used by the algorithm. This sequence is entirely determined by the “seed.” In this way, two runs of SAEM using the same seed will produce exactly the same results. If different seeds are used, the trajectories will be different but convergence occurs to the same solution under quite general hypotheses. However, if the trajectories converge to different solutions, that does not mean that any of these results are “false”. It just means that the model is sensitive to the seed or to the initial conditions. The purpose of this tool is to evaluate the SAEM algorithm with several seeds to see the robustness of the convergence.

3.8.What result files are generated by Monolix?

Monolix generates a lot of different output files depending on the tasks done by the user. Here is a complete listing of the files, along with the condition for creation and their content.

Population parameter estimation

summary.txt

Description: summary file.
Outputs:

  • Header: project file name, date and time of run, Monolix version
  • Estimation of the population parameters: Estimated population parameters & computation time

populationParameters.txt

Description: estimated population parameters (with SAEM).
Outputs:

  • First column (no name): contains the parameter names (e.g., ‘V_pop’ and ‘omega_V’).
  • value: contains the estimated parameter values.

Individual parameters estimation

All the files are in the IndividualParameters folder of the result folder

estimatedIndividualParameters.txt

Description: Individual parameters (from SAEM, mode, and mean of the conditional distribution)
Outputs:

  • ID: subject name and occasion (if applicable). If there are one or more types of occasions, additional column(s) define the occasions.
  • parameterName_SAEM: individual parameter estimated by SAEM; it corresponds to the last iteration of SAEM.
  • parameterName_mode (if conditional mode was computed): individual parameter estimated by the conditional mode task, i.e., the mode of the conditional distribution \(p(\psi_i|y_i;\hat{\theta})\).
  • parameterName_mean (if conditional distribution was computed): individual parameter estimated by the conditional distribution task, i.e., the mean of the conditional distribution \(p(\psi_i|y_i;\hat{\theta})\).
  • parameterName_sd (if conditional distribution was computed): standard deviation of the conditional distribution \(p(\psi_i|y_i;\hat{\theta})\) calculated during the conditional distribution task.
  • COVname: continuous covariates values corresponding to all data set columns tagged as “Continuous covariate” and all the associated transformed covariates.
  • CATname: modalities associated to the categorical covariates (including latent covariates and the bsmm covariates) and all the associated transformed covariates.

estimatedRandomEffects.txt

Description: individual random effects, calculated using the population parameters, the covariates and the conditional mode or conditional mean. For instance, if we have a parameter defined as \(k_i=k_{pop}+\beta_{k,WT}WT_i+\eta_i\), we calculate \(\eta_i=k_i - k_{pop}-\beta_{k,WT}WT_i\) with \(k_i\) the estimated individual parameter (mode or mean of the conditional distribution), \(WT_i\) the individual’s covariate, and \(k_{pop}\) and \(\beta_{k,WT}\) the estimated population parameters.
Outputs:

  • ID: subject name and occasion (if applicable). If there are one or more types of occasions, additional column(s) define the occasions.
  • eta_parameterName_SAEM: individual random effect estimated by SAEM; it corresponds to the last iteration of SAEM.
  • eta_parameterName_mode (if conditional mode was computed): individual random effect estimated by the conditional mode task, i.e., the mode of the conditional distribution \(p(\psi_i|y_i;\hat{\theta})\).
  • eta_parameterName_mean (if conditional distribution was computed): individual random effect estimated by the conditional distribution task, i.e., the mean of the conditional distribution \(p(\psi_i|y_i;\hat{\theta})\).
  • eta_parameterName_sd (if conditional distribution was computed): standard deviation of the conditional distribution \(p(\psi_i|y_i;\hat{\theta})\) calculated during the conditional distribution task.
  • COVname: continuous covariates values corresponding to all data set columns tagged as “Continuous covariate” and all the associated transformed covariates.
  • CATname: modalities associated to the categorical covariates (including latent covariates and the bsmm covariates) and all the associated transformed covariates.
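The calculation of a random effect from an individual parameter can be sketched as follows. The first function implements the additive example given in the description; the second one assumes, for illustration only, a lognormally distributed volume with a power effect of weight, \(V_i=V_{pop}\left(\frac{WT_i}{70}\right)^{\beta}e^{\eta_i}\). The function names and numerical values are hypothetical.

```python
import math

def additive_random_effect(k_i, k_pop, beta_k_wt, wt_i):
    """eta_i = k_i - k_pop - beta_{k,WT} * WT_i (example from the description)."""
    return k_i - k_pop - beta_k_wt * wt_i

def lognormal_random_effect(v_i, v_pop, beta, wt_i, wt_ref=70.0):
    """eta_i = log(V_i) - log(V_pop) - beta * log(WT_i / wt_ref)
    (assumed lognormal model with a power covariate effect)."""
    return math.log(v_i) - math.log(v_pop) - beta * math.log(wt_i / wt_ref)

# Hypothetical values for the additive case: 1.5 - 1.0 - 0.005 * 60
eta_k = additive_random_effect(k_i=1.5, k_pop=1.0, beta_k_wt=0.005, wt_i=60.0)

# Hypothetical values for the lognormal case: build V_i with a known eta, then recover it.
v_i = 0.5 * (80.0 / 70.0) ** 0.9 * math.exp(0.1)
eta_v = lognormal_random_effect(v_i, v_pop=0.5, beta=0.9, wt_i=80.0)
print(eta_k, eta_v)
```

In both cases the random effect is what remains of the (transformed) individual parameter once the population value and the covariate effect have been removed.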

simulatedIndividualParameters.txt

Description: Simulated individual parameter (by the conditional distribution)
Outputs:

  • rep: replicate of the simulation
  • ID: subject name and occasion (if applicable). If there are one or more types of occasions, additional column(s) define the occasions.
  • parameterName: simulated individual parameter corresponding to the draw rep.
  • COVname: continuous covariates values corresponding to all data set columns tagged as “Continuous covariate” and all the associated transformed covariates.
  • CATname: modalities associated to the categorical covariates (including latent covariates and the bsmm covariates) and all the associated transformed covariates.

simulatedRandomEffects.txt

Description: Simulated individual random effect (by the conditional distribution)
Outputs:

  • rep: replicate of the simulation
  • ID: subject name and occasion (if applicable). If there are one or more types of occasions, additional column(s) define the occasions.
  • eta_parameterName: simulated individual random effect corresponding to the draw rep.
  • COVname: continuous covariates values corresponding to all data set columns tagged as “Continuous covariate” and all the associated transformed covariates.
  • CATname: modalities associated to the categorical covariates (including latent covariates and the bsmm covariates) and all the associated transformed covariates.

Fisher Information Matrix calculation

summary.txt

Description: summary file.
Outputs:

  • Header: project file name, date and time of run, Monolix version (from the population parameter estimation task)
  • Estimation of the population parameters: Estimated population parameters & computation time (from the population parameter estimation task). Standard errors and relative standard errors are added.
  • Correlation matrix of the estimates: correlation matrix by block, eigenvalues and computation time

populationParameters.txt

Description: estimated population parameters, associated standard errors and p-values.
Outputs:

  • First column (no name): contains the parameter names (from the population parameter estimation task)
  • Column ‘parameter’: contains the estimated parameter values (from the population parameter estimation task)
  • se_lin / se_sa: contains the standard errors (s.e.) for the (untransformed) parameter, obtained by linearization of the system (lin) or stochastic approximation (sa).
  • rse_lin / rse_sa: contains the parameter relative standard errors (r.s.e.) in % (param_r.s.e. = 100*param_s.e./param), obtained by linearization of the system (lin) or stochastic approximation (sa).
  • pvalues_lin / pvalues_sa: for beta parameters associated to covariates, the line contains the p-value obtained from a Wald test of whether beta=0. If the parameter is not a beta parameter, ‘NaN’ is displayed.

Notice that if the Fisher Information Matrix is difficult to invert, the standard errors of some parameters may not be computable, leading to NaN in the corresponding columns.
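The relative standard error and the Wald test can be sketched as follows. The r.s.e. formula is the one given above; for the Wald test, the sketch assumes a two-sided test with a standard normal reference distribution, which may differ from the exact statistic used by Monolix. Function names and numerical values are hypothetical.

```python
import math

def relative_standard_error(value, se):
    """r.s.e. in %, i.e. 100 * s.e. / parameter value."""
    return 100.0 * se / value

def wald_p_value(beta, se):
    """Two-sided p-value for H0: beta = 0, assuming beta / se ~ N(0, 1) under H0."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))

rse = relative_standard_error(value=0.04, se=0.002)  # hypothetical estimate and s.e.
p_value = wald_p_value(beta=1.96, se=1.0)            # hypothetical covariate effect
print(rse, p_value)
```

A small p-value indicates that the covariate effect is significantly different from 0, so the covariate should not be removed from the model on the basis of this test.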

All the more detailed files are in the FisherInformation folder of the result folder.

jacobian.txt

Description: jacobian (i.e., the derivatives of the untransformed parameters \(\theta\) with respect to the transformed and normally distributed parameters \(\zeta\))
Outputs: matrix with the project parameters as lines and columns. First column and first row contain the parameter names.

The elements of the jacobian J are defined by:

$$J_{ij}=\frac{\partial\theta_i}{\partial\zeta_j}$$

with \(\theta\) the untransformed parameters and \(\zeta\) the transformed and normally distributed parameters. Note that only the fixed effects get transformed by Monolix, while the standard deviations ‘omega’ are not (the diagonal elements are therefore 1 for those parameters).

covarianceEstimatesSA.txt and/or covarianceEstimatesLin.txt

Description: inverse of the Fisher Information Matrix (i.e the variance-covariance matrix) for the transformed normally distributed parameters
Outputs: matrix with the project parameters as lines and columns. First column contains the parameter names.

The variance-covariance matrix \(\Gamma\) for the transformed normally distributed parameters \(\zeta\) can be combined with the jacobian J (whose elements are defined by \(J_{ij}=\frac{\partial\theta_i}{\partial\zeta_j}\), see jacobian.txt) to obtain the variance-covariance matrix \(\tilde{\Gamma}\) for the untransformed parameters \(\theta\):

$$\tilde{\Gamma}=J\,\Gamma J^T$$
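As an illustration, consider two log-transformed parameters, \(\theta_i = e^{\zeta_i}\), so that the jacobian is diagonal with \(J_{ii}=\theta_i\). The sketch below uses hypothetical values and small helper functions written for this example only. Because J is diagonal here, the order of the transposition does not matter.

```python
import math

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

theta = [0.04, 0.45]                  # hypothetical untransformed estimates
gamma = [[0.010, 0.002],              # hypothetical covariance of zeta = log(theta)
         [0.002, 0.004]]
jac = [[theta[0], 0.0],               # d theta_i / d zeta_i = theta_i for a log transform
       [0.0, theta[1]]]

gamma_tilde = matmul(matmul(jac, gamma), transpose(jac))
se = [math.sqrt(gamma_tilde[i][i]) for i in range(len(theta))]
print(gamma_tilde, se)
```

The square roots of the diagonal of \(\tilde{\Gamma}\) are the standard errors of the untransformed parameters.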

correlationEstimatesSA.txt and/or correlationEstimatesLin.txt

Description: correlation matrix for the (untransformed) parameters
Outputs: matrix with the project parameters as lines and columns. First column contains the parameter names.

The correlation matrix is calculated as:

$$\text{corr}(\theta_i,\theta_j)=\frac{\text{covar}(\theta_i,\theta_j)}{\sqrt{\text{var}(\theta_i)}\sqrt{\text{var}(\theta_j)}}$$

This implies that the diagonal is unitary. The variance-covariance matrix for the untransformed parameters \(\theta\) is obtained from the inverse of the Fisher Information Matrix and the jacobian. See above for the formula.
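The formula above can be applied to any variance-covariance matrix, as in this minimal sketch (hypothetical values, function name chosen for the example):

```python
import math

def correlation_matrix(cov):
    """corr_ij = cov_ij / (sqrt(cov_ii) * sqrt(cov_jj))."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
            for i in range(len(cov))]

cov = [[4.0, 1.0],   # hypothetical variance-covariance matrix
       [1.0, 9.0]]
corr = correlation_matrix(cov)
print(corr)
```

The diagonal elements are equal to 1 and the off-diagonal elements lie between -1 and 1.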

Log-Likelihood calculation

summary.txt

Description: summary file.
Outputs:

  • Header: project file name, date and time of run, Monolix version (from the population parameter estimation task)
  • Estimation of the population parameters: Estimated population parameters & computation time (from the population parameter estimation task). Standard errors and relative standard errors are added.
  • Correlation matrix of the estimates: correlation matrix by block, eigenvalues and computation time
  • Log-likelihood Estimation: -2*log-likelihood, AIC and BIC values, together with the computation time

All the more detailed files are in the LogLikelihood folder of the result folder

logLikelihood.txt

Description: Summary of the log-likelihood calculation with the two methods.
Outputs:

  • criteria: OFV (Objective Function Value), AIC (Akaike Information Criterion), and BIC (Bayesian Information Criterion)
  • method: ImportanceSampling and/or linearization

individualLL.txt

Description: -2LL for each individual. Notice that there is only one value per individual, even if there are occasions.
Outputs:

  • ID: subject name
  • method: ImportanceSampling and/or linearization

Tests

Tables

predictions.txt

Description: predictions at the observation times
Outputs:

  • ID: subject name. If there are occasions, additional columns will be added to describe the occasions.
  • time: Time from the data set.
  • MeasurementName: Measurement from the data set.
  • RegressorName: Regressor value.
  • popPred_medianCOV: prediction using the population parameters and the median covariates.
  • popPred: prediction using the population parameters and the covariates, e.g., \(V_i=V_{pop}\left(\frac{WT_i}{70}\right)^{\beta}\) (without random effects).
  • indivPred_SAEM: prediction using the mean of the conditional distribution, calculated using the last iterations of the SAEM algorithm.
  • indPred_mean (if conditional distribution was computed): prediction using the mean of the conditional distribution, calculated in the Conditional distribution task.
  • indPred_mode (if conditional mode was computed): prediction using the mode of the conditional distribution, calculated in the EBEs task.
  • indWRes_SAEM: weighted residuals \(IWRES_{ij}=\frac{y_{ij}-f(t_{ij}, \psi_i)}{g(t_{ij}, \psi_i)}\) with \(\psi_i\) the mean of the conditional distribution, calculated using the last iterations of the SAEM algorithm.
  • indWRes_mean (if conditional distribution was computed): weighted residuals \(IWRES_{ij}=\frac{y_{ij}-f(t_{ij}, \psi_i)}{g(t_{ij}, \psi_i)}\) with \(\psi_i\) the mean of the conditional distribution, calculated in the Conditional distribution task.
  • indWRes_mode (if conditional mode was computed): weighted residuals \(IWRES_{ij}=\frac{y_{ij}-f(t_{ij}, \psi_i)}{g(t_{ij}, \psi_i)}\) with \(\psi_i\) the mode of the conditional distribution, calculated in the EBEs task.
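The weighted residuals can be sketched as follows. The form of the function g depends on the residual error model; the sketch assumes a combined error model \(g = a + b\,f\), which is only one possible choice, and uses hypothetical values.

```python
def iwres(y, f_pred, a, b):
    """Individual weighted residual (y - f) / g, with g = a + b * f
    (combined error model, assumed here for illustration)."""
    g = a + b * f_pred
    return (y - f_pred) / g

# Hypothetical observation, individual prediction and error model parameters.
residual = iwres(y=10.0, f_pred=8.0, a=0.5, b=0.1875)
print(residual)
```

The same function applies to the SAEM, mean and mode versions of the residuals: only the individual parameters used to compute the prediction change.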

Notice that in case of several outputs, Monolix generates predictions1.txt, predictions2.txt, …

Charts data

All plots generated by Monolix can be exported as figures or as text files, so that they can be replotted in another way or with other software for more flexibility. All generated text files are described here.

3.9.Tests

Several statistical tests may be automatically performed to test the different components of the model. These tests use individual parameters drawn from the conditional distribution, which means that you need to run the task “Conditional distribution” in order to get these results. The tests for the residuals require the residuals plots (scatter plot or distribution) to have been generated first.

Results of the tests are available in the tab “Results”, by selecting “Tests” in the left menu.

The model for the individual parameters

Consider a PK example with the following model for the individual PK parameters (ka, V, Cl):

In this example, the different assumptions we make about the model are:

  • The 3 parameters are lognormally distributed
  • ka is a function of sex only
  • V is a function of sex and weight. More precisely, the log-volume log(V) is a linear function of the log-weight \({\rm lw70 }= \log({\rm wt}/70)\).
  • Cl is not a function of any of the covariates.
  • The random effects \(\eta_V\) and \(\eta_{Cl}\) are linearly correlated
  • \(\eta_{ka}\) is not correlated with \(\eta_V\) and \(\eta_{Cl}\)

Let’s see how each of these assumptions is tested:

The covariate model

Testing if covariates should be removed from the model

If an individual parameter is a function of a continuous covariate, the linear correlation between the transformed parameter and the covariate is not 0 and the associated \(\beta\) coefficient is not 0 either. Pearson’s correlation tests and Wald tests are therefore used to test whether continuous covariates should be removed from the model. ANOVA and Wald tests are performed for categorical covariates in the same way.
In our example, we may want to test if the absorption rate constant ka is a function of sex and if the volume V is a function of sex and weight. The individual model looks like the figure below

If we look at the test results for the covariates, the ANOVA for ka clearly shows that our hypothesis that ka is a function of sex should be rejected:

Remark: The two covariates weight and sex are strongly dependent. Therefore, the fact that both lw70 and sex are significant on the parameter V does not mean that both covariates should be kept in the model.

The Wald tests using the standard errors estimated either by linearization or by stochastic approximation lead to the same conclusion:

Testing if covariates should be added to the model

Pearson’s correlation tests and ANOVA are performed to test if some relationships between random effects and covariates have not been taken into account in the model. In our example, only a relationship between weight and clearance could possibly be worthy of investigation.

The model for the random effects

Testing the normality of the random effects

Shapiro-Wilk tests are performed to test if the random effects are normally distributed.

Testing the correlation between random effects

Pearson’s correlation tests are performed to test if the random effects are linearly correlated. In our example, the assumption that \(\eta_V\) and \(\eta_{Cl}\) are correlated should be rejected.
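The statistic behind a Pearson correlation test can be sketched as follows. The sketch computes the sample correlation coefficient r and the usual statistic \(t = r\sqrt{(n-2)/(1-r^2)}\), which follows a Student t-distribution with n-2 degrees of freedom under the hypothesis of no correlation (for normally distributed data). The data are hypothetical and the exact implementation used by Monolix is not described here.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two sequences of equal length."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pearson_t_statistic(r, n):
    """Test statistic for H0: no linear correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical random effects for 5 individuals.
eta_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
eta_2 = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(eta_1, eta_2)
t_stat = pearson_t_statistic(r, len(eta_1))
print(r, t_stat)
```

The larger the absolute value of the statistic, the smaller the p-value and the stronger the evidence of a linear correlation.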

The distribution of the individual parameters

Individual parameters not dependent on covariates

When an individual parameter doesn’t depend on covariates, its distribution is a transformation of the normal distribution. A Shapiro-Wilk test can then be used for testing the normality of the transformed parameter. In our example, Cl does not depend on any covariate, and the hypothesis of lognormality should not be rejected:

Remark: testing the normality of a transformed individual parameter that does not depend on covariates is equivalent to testing the normality of the associated random effect. We can check in our example that the Shapiro-Wilk tests for \(\log(Cl)\) and \(\eta_{Cl}\) are equivalent.

Individual parameters dependent on covariates

Individual parameters that depend on covariates are no longer identically distributed. Each transformed individual parameter is normally distributed, with its own mean that depends on the value of the individual covariate. In other words, the distribution of an individual parameter is a mixture of (transformed) normal distributions. A Kolmogorov-Smirnov test is used for testing the distributional adequacy of these individual parameters.

The model for the observations

A combined error model is assumed in our example with normal residual errors.

The distribution of the residuals

Different tests are performed for the individual residuals, the NPDEs and the population residuals.

Testing the symmetry of the residual distribution

A Van der Waerden test is used for testing the symmetry of the residuals. Indeed, symmetry of the residuals around 0 is an important property that deserves to be tested, in order to decide, for instance, whether some transformation of the observations should be made.

Testing the normality of the residuals

A Shapiro-Wilk test is used for testing the normality of the residuals.

Remark: the Shapiro-Wilk test is known to be very powerful. A small deviation of the empirical distribution from the normal distribution may therefore lead to a very significant test (i.e., a very small p-value), which does not necessarily mean that the model should be rejected!

3.10.Monolix API

On the use of an R-usable API

Monolix can now be used via an API. The project can be accessed exactly in the same way as with the interface. All the functions are described below.

Installation and initialization

Installation

The R package is located in the installation directory as a tar.gz archive. It must be installed with the following R command:

install.packages('<installDirectory>/mlxConnectors/R/MlxConnectors.tar.gz', repos = NULL, type="source")

with <installDirectory> the MonolixSuite2018R1 installation directory. By default, it is

  • “C:/ProgramData/Lixoft/MonolixSuite2018R1” for Windows OS
  • “/Applications/MonolixSuite2018R1.app/Contents/Resources/mlxsuite/” for MAC OS

Initializing

When starting a new R session, you need to load the library and initialize the connectors with the following commands

library(MlxConnectors) 
initializeMlxConnectors(software = "monolix")

In some cases, it may be necessary to specify the path to the installation directory of the Lixoft suite. If no path is given, the one written in the lixoft.ini file is used (usually “C:/ProgramData/Lixoft/MonolixSuite2018R1” for Windows).

library(MlxConnectors) 
initializeMlxConnectors(software = "monolix", mlxDirectory = "/path/to/MonolixSuite2018R1/")

 

Making sure the installation is ok

To test if the installation is ok, you can load and run a project from the demos as follows:

# Folder containing the demo projects (replace <userFolder> with your home folder)
demoPath = '<userFolder>/lixoft/monolix/monolix2018R1/demos/1.creating_and_using_models/'
# Load the theophylline demo project, run its scenario, then display the estimates
loadProject(paste0(demoPath, '1.1.libraries_of_models/theophylline_project.mlxtran'))
runScenario()
getEstimatedPopulationParameters()

where <userFolder> is the user’s home folder (on Windows, C:/Users/toto if toto is your username). These three commands should output the estimated population parameters (ka_pop, V_pop, Cl_pop, omega_ka, omega_V, omega_Cl, a, and b).

Notes

  • Due to possible conflicts, the package mlxR, whose function simulx can be used to perform simulations with Monolix, should not be loaded at the same time as MlxConnectors.
  • Running the plots task with the API saves the charts data in the result folder, if “Export charts data” is selected in Monolix’s preferences. The plots can only be generated with the Monolix GUI.

 

Description of the functions concerning the covariate model

Description of the functions concerning the individual model

Description of the functions concerning the observation model

  • getContinuousObservationModel: Get a summary of the information concerning the continuous observation models in the project.
  • getObservationInformation: Get the name, the type and the values of the observations present in the project.
  • setAutocorrelation: Add or remove auto-correlation from the error model used on some of the observation models.
  • setErrorModel: Set the error model type to be used with some of the observation models.
  • setObservationDistribution: Set the distribution in the Gaussian space of some of the observation models.
  • setObservationLimits: Set the minimum and the maximum values between which some of the observations can be found.

Description of the functions concerning the population parameters

  • getPopulationParameterInformation: Get the name, the initial value, the estimation method and, if relevant, MAP parameters value of the population parameters present in the project.
  • setInitialEstimatesToLastEstimates: Set the initial value of all the population parameters present within the current project (fixed effects + individual variances + error model parameters) to the ones previously estimated.
  • setPopulationParameterInformation: Set the initial value, the estimation method and, if relevant, the MAP parameters of one or several of the population parameters present within the current project (fixed effects + individual variances + error model parameters).

Description of the functions concerning the project management

  • getData: Get a description of the data used in the current project.
  • getStructuralModel: Get the model file for the structural model used in the current project.
  • loadProject: Load a project by parsing the Mlxtran-formatted file whose path has been given as an input.
  • newProject: Create a new empty project providing model and data specification.
  • saveProject: Save the current project as an Mlxtran-formatted file.
  • setData: Set project data giving a data file and specifying headers and observations types.
  • setStructuralModel: Set the structural model.

Description of the functions concerning the results

  • getCorrelationOfEstimates: Get the inverse of the last estimated Fisher matrix computed either by all the Fisher methods used during the last scenario run or by the specific one passed in argument.
  • getEstimatedIndividualParameters: Get the last estimated values for each subject of some of the individual parameters present within the current project.
  • getEstimatedLogLikelihood: Get the values computed by using a log-likelihood algorithm during the last scenario run, with or without a method-based filter.
  • getEstimatedPopulationParameters: Get the last estimated value of some of the population parameters present within the current project (fixed effects + individual variances + correlations + latent probabilities + error model parameters).
  • getEstimatedRandomEffects: Get the random effects for each subject of some of the individual parameters present within the current project.
  • getEstimatedStandardErrors: Get the last estimated standard errors of population parameters computed either by all the Fisher methods used during the last scenario run or by the specific one passed in argument.
  • getLaunchedTasks: Get a list of the tasks which have results to provide.
  • getSAEMiterations: Retrieve the successive values of some of the population parameters present within the current project (fixed effects + individual variances + correlations + latent probabilities + error model parameters) during the previous run of the SAEM algorithm.
  • getSimulatedIndividualParameters: Get the simulated values for each replicate of each subject of some of the individual parameters present within the current project.
  • getSimulatedRandomEffects: Get the simulated values for each replicate of each subject of some of the individual random effects present within the current project.

Description of the functions concerning the scenario

Description of the functions concerning the settings

3.10.1. API concerning the covariate models

Description of the functions of the API

  • addCategoricalTransformedCovariate: Create a new categorical covariate by transforming an existing one.
  • addContinuousTransformedCovariate: Create a new continuous covariate by transforming an existing one.
  • addMixture: Add a new latent covariate to the current model giving its name and its modality number.
  • getCovariateInformation: Get the name, the type and the values of the covariates present in the project.
  • removeCovariate: Remove some of the transformed covariates (discrete and continuous) and/or latent covariates.

Add categorical transformed covariate

Description

Create a new categorical covariate by transforming an existing one. Transformed covariates cannot be used to produce new covariates.
Call getCovariateInformation to know which covariates can be transformed.

Usage

addCategoricalTransformedCovariate(...)

Arguments

...
A list of comma-separated pairs {transformedCovariateName = { from = (array<(string)>)["basicCovariateNames"], transformed = (array<array<string>>)"transformation"} }

See Also

getCovariateInformation removeCovariate

Examples

## Not run:
addCategoricalTransformedCovariate( Country2 = list( reference = "A1", from = "Country",
                                    transformed = list( A1 = c("A","B"), A2 = c("C") ) ) )
## End(Not run)


Add continuous transformed covariate

Description

Create a new continuous covariate by transforming an existing one. Transformed covariates cannot be used to produce new covariates.
Call getCovariateInformation to know which covariates can be transformed.

Usage

addContinuousTransformedCovariate(...)

Arguments

...
A list of comma-separated pairs {transformedCovariateName = (string)"transformation"}

See Also

getCovariateInformation removeCovariate

Examples

## Not run:
addContinuousTransformedCovariate( tWt2 = "3*exp(Wt)" )
## End(Not run)


Add mixture to the covariate model

Description

Add a new latent covariate to the current model giving its name and its modality number.

Usage

addMixture(...)

Arguments

...
A list of comma-separated pairs {latentCovariateName = (int)modalityNumber}

See Also

getCovariateInformation removeCovariate

Examples

## Not run:
addMixture(lcat = 2)
## End(Not run)


Get covariates information

Description

Get the name, the type and the values of the covariates present in the project.

Usage

getCovariateInformation()

Value

A list containing the following fields:

  • name: (vector<string>) covariate names
  • type: (vector<string>) covariate types. Existing types are "continuous", "continuoustransformed", "categorical", "categoricaltransformed" and "latent".
  • modalityNumber: (vector<int>) number of modalities (for latent covariates only)
  • covariate: a data frame giving the values of continuous and categorical covariates for each subject.
    Latent covariate values exist only if they have been estimated, i.e. if the covariate is used and if the population parameters have been estimated.
    Call getEstimatedIndividualParameters to retrieve them.

Examples

## Not run:
info = getCovariateInformation()
info
-> $name
   c("sex","wt","lcat")
-> $type
   c(sex = "categorical", wt = "continuous", lcat = "latent")
-> $modalityNumber
   c(lcat = 2)
-> $covariate
   id  sex  wt
   1   M    66.7
   .   .    .
   N   F    59.0
## End(Not run)


Remove covariate

Description

Remove some of the transformed covariates (discrete and continuous) and/or latent covariates.
Call getCovariateInformation to know which covariates can be removed.

Usage

removeCovariate(...)

Arguments

...
A list of covariate names.

See Also

getCovariateInformation addContinuousTransformedCovariate addCategoricalTransformedCovariate
addMixture

Examples

## Not run:
removeCovariate("tWt","lcat1")
## End(Not run)

3.10.2. API concerning the observation models

Description of the functions of the API

  • getContinuousObservationModel: Get a summary of the information concerning the continuous observation models in the project.
  • getObservationInformation: Get the name, the type and the values of the observations present in the project.
  • setAutocorrelation: Add or remove auto-correlation from the error model used on some of the observation models.
  • setErrorModel: Set the error model type to be used with some of the observation models.
  • setObservationDistribution: Set the distribution in the Gaussian space of some of the observation models.
  • setObservationLimits: Set the minimum and the maximum values between which some of the observations can be found.

Get continuous observation models information

Description

Get a summary of the information concerning the continuous observation models in the project. The following information is provided:

  • prediction: (vector<string>) name of the associated prediction
  • formula: (vector<string>) formula applied on the observation
  • distribution: (vector<string>) distribution of the observation in the Gaussian space. The distribution type can be "normal", "logNormal", or "logitNormal".
  • limits: (vector< pair<double,double> >) lower and upper limits imposed on the observation.
    Used only if the distribution is logitNormal. If there is no logitNormal distribution, this field is empty.
  • errormodel: (vector<string>) type of the associated error model
  • autocorrelation: (vector<bool>) whether auto-correlation is present

Call getObservationInformation to get a list of the continuous observations present in the current project.

Usage

getContinuousObservationModel()

Value

A list associating each continuous observation to its model properties.

See Also

getObservationInformation setObservationDistribution setObservationLimits
setErrorModel setAutocorrelation

Examples

## Not run:
obsModels = getContinuousObservationModel()
obsModels
-> $prediction
   c(Conc = "Cc")
   $formula
   c(Conc = "Conc = Cc + (a+b*Cc)*e")
   $distribution
   c(Conc = "logitNormal")
   $limits
   list(Conc = c(0,11.5))
   $errormodel
   c(Conc = "combined1")
   $autocorrelation
   c(Conc = TRUE)
## End(Not run)


Get observations information

Description

Get the name, the type and the values of the observations present in the project.

Usage

getObservationInformation()

Value

A list containing the name of the observations, their type and their values (id, time and observationName (and occasion if present in the data set)).

Examples

## Not run:
info = getObservationInformation()
info
-> $name
   c("concentration")
-> $type
   c(concentration = "continuous")
-> $concentration
   id  time  concentration
   1   0.5   0.0
   .   .     .
   N   9.0   10.8
## End(Not run)


Set auto-correlation

Description

Add or remove auto-correlation from the error model used on some of the observation models.
Call getObservationInformation to get a list of the observation models present in the current project.

Usage

setAutocorrelation(...)

Arguments

...
Sequence of comma-separated pairs {(string)"observationModel",(boolean)hasAutoCorrelation}.

See Also

getContinuousObservationModel

Examples

## Not run:
setAutocorrelation(Conc = TRUE)
setAutocorrelation(Conc = TRUE, Effect = FALSE)
## End(Not run)


Set error model

Description

Set the error model type to be used with some of the observation models.
Call getObservationInformation to get a list of the observation models present in the current project.

Usage

setErrorModel(...)

Arguments

...
A list of comma-separated pairs {observationModel = (string)errorModelType}.

Details

Available error model types are:

  • "constant": obs = pred + a*err
  • "proportional": obs = pred + (b*pred)*err
  • "combined1": obs = pred + (b*pred^c + a)*err
  • "combined2": obs = pred + sqrt(a^2 + (b^2)*pred^(2*c))*err

Error model parameters are initialized to 1 by default.
Call setPopulationParameterInformation to modify their initial value.
The value of the exponent parameter c is fixed by default when using the "combined1" and "combined2" models.
Use setPopulationParameterInformation to enable its estimation.
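
The four formulas above can be written out in plain R to see how the error standard deviation depends on the prediction. This is only an illustrative sketch: error_sd and simulate_obs are hypothetical helper names introduced here, they are not part of MlxConnectors, and the parameter values are arbitrary.

```r
# Standard deviation of the residual error for each error model type,
# following the formulas listed above (obs = pred + sd * err).
# `exponent` plays the role of c in the formulas.
error_sd <- function(pred, type, a = 1, b = 1, exponent = 1) {
  switch(type,
    constant     = rep(a, length(pred)),
    proportional = b * pred,
    combined1    = b * pred^exponent + a,
    combined2    = sqrt(a^2 + b^2 * pred^(2 * exponent)),
    stop("unknown error model type: ", type)
  )
}

# Simulate observations around a prediction; err is standard normal by default.
simulate_obs <- function(pred, type, ..., err = rnorm(length(pred))) {
  pred + error_sd(pred, type, ...) * err
}

pred <- c(1, 2, 4)                                           # arbitrary predictions
sd_combined1 <- error_sd(pred, "combined1", a = 0.5, b = 0.1)
sd_combined2 <- error_sd(pred, "combined2", a = 3, b = 2)
```

With the default value of 1 for every parameter, "combined1" reduces to a standard deviation of pred + 1, which corresponds to the default initialization described above.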

See Also

getContinuousObservationModel setPopulationParameterInformation

Examples

## Not run:
setErrorModel(Conc = "constant", Effect = "combined1")
## End(Not run)


Set observation model distribution

Description

Set the distribution in the Gaussian space of some of the observation models.
Available distribution types are "normal", "logNormal", or "logitNormal".
Call getObservationInformation to get a list of the available observation models within the current project.

Usage

setObservationDistribution(...)

Arguments

...
A list of comma-separated pairs {observationModel = (string)"distribution"}.

See Also

getContinuousObservationModel

Examples

## Not run:
setObservationDistribution(Conc = "normal")
setObservationDistribution(Conc = "normal", Effect = "logNormal")
## End(Not run)


Set observation model distribution limits

Description

Set the minimum and the maximum values between which some of the observations can be found.
The limits are used only if the distribution of the observation model is "logitNormal"; otherwise they are ignored.
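
To see why limits matter only for a logit-normal distribution, the sketch below maps a bounded observation to the real line and back. It assumes the usual logit transform on an interval (lower, upper); the function names are hypothetical and the exact transform applied internally by Monolix is not restated in this section. The limits 0 and 11.5 are taken from the getContinuousObservationModel example.

```r
# Map observations from (lower, upper) to the real line (the "Gaussian space").
to_gaussian_space <- function(y, lower, upper) {
  stopifnot(all(y > lower), all(y < upper))
  log((y - lower) / (upper - y))
}

# Inverse map: from the real line back to (lower, upper).
from_gaussian_space <- function(z, lower, upper) {
  lower + (upper - lower) / (1 + exp(-z))
}

y <- c(2, 5.75, 10)                                   # arbitrary values in (0, 11.5)
z <- to_gaussian_space(y, lower = 0, upper = 11.5)
y_back <- from_gaussian_space(z, lower = 0, upper = 11.5)
```

The midpoint of the interval maps to 0, and values close to either limit map to large negative or positive numbers.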

Usage

setObservationLimits(...)

Arguments

...
A list of comma-separated pairs {observationModel = [(double)min,(double)max] }

See Also

getContinuousObservationModel getObservationInformation

Examples

## Not run:
setObservationLimits( Conc = c(-Inf,Inf), Effect = c(0,Inf) )
## End(Not run)

3.10.3. API concerning the population parameters

Description of the functions of the API

  • getPopulationParameterInformation: Get the name, the initial value, the estimation method and, if relevant, MAP parameters value of the population parameters present in the project.
  • setInitialEstimatesToLastEstimates: Set the initial value of all the population parameters present within the current project (fixed effects + individual variances + error model parameters) to the ones previously estimated.
  • setPopulationParameterInformation: Set the initial value, the estimation method and, if relevant, the MAP parameters of one or several of the population parameters present within the current project (fixed effects + individual variances + error model parameters).

Get population parameters information

Description

Get the name, the initial value, the estimation method and, if relevant, MAP parameters value of the population parameters present in the project.
It is available for fixed effects, random effects, error model parameters, and latent covariates probabilities.

Usage

getPopulationParameterInformation()

Value

A data frame giving, for each population parameter, the corresponding:

  • initialValue: (double) initial value
  • method: (string) estimation method
  • priorValue: (double) [MAP] typical value
  • priorSD: (double) [MAP] standard deviation

See Also

setPopulationParameterInformation

Examples

## Not run:
info = getPopulationParameterInformation()
info
   name      initialValue  method  typicalValue  stdDeviation
   ka_pop    1.0           MLE     NA            NA
   V_pop     10.0          MAP     10.0          0.5
   omega_ka  1.0           FIXED   NA            NA
## End(Not run)


Initialize population parameters with the last estimated ones

Description

Set the initial value of all the population parameters present within the current project (fixed effects + individual variances + error model parameters) to the ones previously estimated.
These values will be used in the population parameter estimation algorithm during the next scenario run.
WARNING: If any "set" function is called after a run, it will no longer be possible to set the initial values this way, because the structure of the project has changed since the last results.

Usage

setInitialEstimatesToLastEstimates()

See Also

getEstimatedPopulationParameters getPopulationParameterInformation

Examples

## Not run:
setInitialEstimatesToLastEstimates()
## End(Not run)


Population parameters initialization and estimation method

Description

Set the initial value, the estimation method and, if relevant, the MAP parameters of one or several of the population parameters present within the current project (fixed effects + individual variances + error model parameters).
Available methods are:

  • "FIXED": Fixed
  • "MLE": Maximum Likelihood Estimation
  • "MAP": Maximum A Posteriori

Call getPopulationParameterInformation to get a list of the initializable population parameters present within the current project.

Usage

setPopulationParameterInformation(...)

Arguments

...
A list of comma-separated pairs {paramName = list( initialValue = (double), method = (string)"method" )}.
In case of the "MAP" method, the user can specify the associated typical value and standard deviation by using additional list elements {paramName = list( priorValue = (double)1, priorSD = (double)2 )}.
By default, the prior value corresponds to the population parameter and the prior standard deviation is set to 1.

See Also

getPopulationParameterInformation

Examples

## Not run:
setPopulationParameterInformation(Cl_pop = list(initialValue = 0.5, method = "FIXED"),
                                  V_pop = list(initialValue = 1),
                                  ka_pop = list( method = "MAP", priorValue = 1.5, priorSD = 0.25 ) )
## End(Not run)

3.10.4. API concerning the individual parameter models

Description of the functions of the API

  • getIndividualParameterModel: Get a summary of the information concerning the individual parameter model.
  • getVariabilityLevels: Get a summary of the variability levels (inter-individual and/or intra-individual variability) present in the current project.
  • setCorrelationBlocks: Define the correlation block structure associated to some of the variability levels of the current project.
  • setCovariateModel: Set which are the covariates influencing individual parameters present in the project.
  • setIndividualParameterDistribution: Set the distribution of the estimated parameters.
  • setIndividualParameterVariability: Add or remove inter-individual and/or intra-individual variability from some of the individual parameters present in the project.

Get individual parameter model

Description

Get a summary of the information concerning the individual parameter model. The available information is:

  • name: (string) name of the individual parameter
  • distribution: (string) distribution of the parameter values. The distribution type can be "normal", "logNormal", or "logitNormal".
  • formula: (string) formula applied on individual parameters distribution
  • variability: a list giving, for each variability level, if individual parameters have variability or not
  • covariateModel: a list giving, for each individual parameter, if the related covariates are used or not.
    If no covariate is used, this field is empty.

  • correlationBlocks : a list giving, for each variability level, the blocks of the correlation matrix of the random effects.
    A block is represented by a vector of individual parameter names. If there is no block, this field is empty.

Usage

getIndividualParameterModel()

Value

A list of individual parameter model properties.

See Also

setIndividualParameterDistribution setIndividualParameterVariability setCovariateModel

Examples

## Not run:
indivModel = getIndividualParameterModel()
indivModel
-> $name
   c("ka","V","Cl")
   $distribution
   c(ka = "logNormal", V = "normal", Cl = "logNormal")
   $formula
   "\\tlog(ka) = log(ka_pop) + eta_ka\\n\\n\\tV = V_pop + eta_V\\n\\n\\tlog(Cl) = log(Cl_pop) + eta_Cl\\n\\n"
   $variability
   list( id = c(ka = TRUE, V = FALSE, Cl = TRUE) )
   $covariateModel
   list( ka = c(age = TRUE, sex = FALSE, wt = TRUE),
         V = c(age = FALSE, sex = FALSE, wt = FALSE),
         Cl = c(age = FALSE, sex = FALSE, wt = FALSE) )
   $correlationBlocks
   list( id = c("ka","V","Tlag") )
## End(Not run)


Get variability levels

Description

Get a summary of the variability levels (inter-individual and/or intra-individual variability) present in the current project.

Usage

getVariabilityLevels()

Value

A collection of the variability levels present in the currently loaded project.

Examples

## Not run:
getVariabilityLevels()
## End(Not run)


Set correlation block structure

Description

Define the correlation block structure associated to some of the variability levels of the current project.
Call getVariabilityLevels to get a list of the variability levels and getIndividualParameterModel to get a list of the available individual parameters within the current project.

Usage

setCorrelationBlocks(...)

Arguments

...
A list of comma-separated pairs {variabilityLevel = vector< (array<string>)parameterNames > }.

See Also

getVariabilityLevels getIndividualParameterModel

Examples

## Not run:
setCorrelationBlocks(id = list( c("ka","V","Tlag") ), iov1 = list( c("ka","Cl"), c("Tlag","V") ) )
## End(Not run)


Set covariate model

Description

Set which covariates influence the individual parameters present in the project.
Call getIndividualParameterModel to get a list of the individual parameters present within the current project, and getCovariateInformation to know which covariates are available for a given level of variability and a given individual parameter.

Usage

setCovariateModel(...)

Arguments

...
A list of comma-separated pairs {parameterName = { covariateName = (bool)isInfluent, …} }

See Also

getCovariateInformation

Examples

## Not run:
setCovariateModel( ka = c( Wt = FALSE, tWt = TRUE, lcat2 = TRUE),
                   Cl = c( SEX = TRUE )
)
## End(Not run)


Set individual parameter distribution

Description

Set the distribution of the estimated parameters.
Available distributions are "normal", "logNormal" and "logitNormal".
Call getIndividualParameterModel to get a list of the available individual parameters within the current project.

Usage

setIndividualParameterDistribution(...)

Arguments

...
A list of comma-separated pairs {parameterName = (string)"distribution"}.

See Also

getIndividualParameterModel

Examples

## Not run:
setIndividualParameterDistribution(V = "logNormal")
setIndividualParameterDistribution(Cl = "normal", V = "logNormal")
## End(Not run)


Individual variability management

Description

Add or remove inter-individual and/or intra-individual variability from some of the individual parameters present in the project.
Call getIndividualParameterModel to get a list of the available parameters within the current project.

Usage

setIndividualParameterVariability(...)

Arguments

...
A list of comma-separated pairs {variabilityLevel = {individualParameterName = (bool)hasVariability} }.

See Also

getIndividualParameterModel

Examples

## Not run:
setIndividualParameterVariability(ka = TRUE, V = FALSE)
setIndividualParameterVariability(id = list(ka = TRUE), iov1 = list(ka = FALSE))
## End(Not run)

3.10.5. API concerning the scenario

Description of the functions of the API

  • abort: Stop the current task run.
  • getLastRunStatus: Return an execution report about the last run with a summary of the error which could have occurred.
  • getScenario: Get the list of tasks that will be run at the next call to runScenario, the associated method (linearization true or false), and the associated list of plots.
  • isRunning: Check if a scenario is currently running.
  • runConditionalDistributionSampling: Estimate the individual parameters using conditional distribution sampling algorithm.
  • runConditionalModeEstimation: Estimate the individual parameters using the conditional mode estimation algorithm (EBEs).
  • runLogLikelihoodEstimation: Run the log-Likelihood estimation algorithm.
  • runPopulationParameterEstimation: Estimate the population parameters with the SAEM method.
  • runScenario: Run the current scenario.
  • runStandardErrorEstimation: Estimate the Fisher Information Matrix and the standard errors of the population parameters.
  • setScenario: Clear the current scenario and build a new one from a given list of tasks, the linearization option and the list of plots.

Stop the current task run

Description

Stop the current task run.

Usage

abort()

See Also

runScenario

Examples

## Not run:
abort()
## End(Not run)


Get last run status

Description

Return an execution report about the last run, with a summary of any errors that occurred.

Usage

getLastRunStatus()

Value

A structure containing

  1. a boolean which equals TRUE if the last run has successfully completed,
  2. a summary of the errors which could have occurred.

See Also

runScenario abort isRunning

Examples

## Not run:
lastRunInfo = getLastRunStatus()
lastRunInfo$status
-> TRUE
lastRunInfo$report
-> ""
## End(Not run)


Get current scenario

Description

Get the list of tasks that will be run at the next call to runScenario, the associated method (linearization true or false), and the associated list of plots.
The list of tasks consists of the following tasks: populationParameterEstimation, conditionalDistributionSampling, conditionalModeEstimation, standardErrorEstimation, logLikelihoodEstimation, and plots.

Usage

getScenario()

Value

The list of tasks that corresponds to the current scenario, indexed by algorithm names.

See Also

setScenario

Examples

## Not run:
scenario = getScenario()
scenario
-> $tasks
   populationParameterEstimation conditionalDistributionSampling conditionalModeEstimation standardErrorEstimation logLikelihoodEstimation plots
   TRUE                          TRUE                            TRUE                      FALSE                   FALSE                   FALSE
   $linearization = T
   $plotList = "outputplot", "vpc"
## End(Not run)


Get current scenario state

Description

Check if a scenario is currently running. If so, information about the currently running task is displayed.

Usage

isRunning(verbose = FALSE)

Arguments

verbose
(bool) Whether information about the currently running task should be displayed in the console. FALSE by default.

Value

A boolean which equals TRUE if a scenario is currently running.

See Also

runScenario abort

Examples

## Not run:
isRunning()
## End(Not run)


Sampling from the conditional distribution

Description

Estimate the individual parameters using the conditional distribution sampling algorithm. The associated method keyword is “conditionalMean”.
By default, this task is not processed in the background of the R session.
Note that it does not affect the current scenario. Call

  1. isRunning to check if the scenario is still running and get information about the current task,
  2. abort to stop the execution.

To launch the function in the background, so that functions which do not modify the project (“get” functions for example) remain available, set the input argument “wait” to FALSE.

Usage

runConditionalDistributionSampling(wait = TRUE)

Arguments

wait
(bool) Whether R should wait for the run to complete before returning control to the user. TRUE by default.

See Also

isRunning abort

Examples

## Not run:
runConditionalDistributionSampling()
## End(Not run)


Estimation of the conditional modes (EBEs)

Description

Estimate the individual parameters using the conditional mode estimation algorithm (EBEs). The associated method keyword is “conditionalMode”.
By default, this task is not processed in the background of the R session.
Note that it does not affect the current scenario. Call

  1. isRunning to check if the scenario is still running and get information about the current task,
  2. abort to stop the execution.

To launch the function in the background, so that functions which do not modify the project (“get” functions for example) remain available, set the input argument “wait” to FALSE.

Usage

runConditionalModeEstimation(wait = TRUE)

Arguments

wait
(bool) Whether R should wait for the run to complete before returning control to the user. TRUE by default.

See Also

isRunning abort

Examples

## Not run:
runConditionalModeEstimation()
## End(Not run)


Log-Likelihood estimation

Description

Run the log-likelihood estimation algorithm. By default, this task is not processed in the background of the R session.
Note that it does not affect the current scenario. Call

  1. isRunning to check if the scenario is still running and get information about the current task,
  2. abort to stop the execution.

To launch the function in the background, so that functions which do not modify the project (“get” functions for example) remain available, set the input argument “wait” to FALSE.
Existing methods:

  • Log-likelihood estimation by linearization: linearization = T
  • Log-likelihood estimation by Importance Sampling (default): linearization = F

The log-likelihood outputs (-2LL, AIC, BIC) are available using the getEstimatedLogLikelihood function.
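
As a reminder of how the three outputs relate, the sketch below computes AIC and BIC from -2LL using their textbook definitions. The helper name and the numbers are made up for illustration; which sample size enters the BIC penalty (number of subjects or number of observations) depends on the software's convention and is left as an explicit argument here.

```r
# Textbook information criteria from -2 * log-likelihood.
# n_params: number of estimated population parameters.
# n: sample size used in the BIC penalty (convention-dependent, see above).
information_criteria <- function(minus2LL, n_params, n) {
  c(AIC = minus2LL + 2 * n_params,
    BIC = minus2LL + log(n) * n_params)
}

# Made-up values, for illustration only.
ic <- information_criteria(minus2LL = 340.2, n_params = 8, n = 12)
```

When comparing models fitted to the same data, lower values of either criterion indicate a better trade-off between fit and number of parameters.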

Usage

runLogLikelihoodEstimation(linearization = FALSE, wait = TRUE)

Arguments

linearization
(bool) [optional] Method to be used: TRUE for linearization, FALSE for importance sampling. When no method is given, importance sampling is used by default.

wait
(bool) Whether R should wait for the run to complete before returning control to the user. TRUE by default.

See Also

isRunning abort

Examples

## Not run:
runLogLikelihoodEstimation(linearization = T)
## End(Not run)


Population parameter estimation

Description

Estimate the population parameters with the SAEM method. The associated method keyword is “saem”.
By default, this task is not processed in the background of the R session.
Note that it does not affect the current scenario. Call

  1. isRunning to check if the scenario is still running and get information about the current task,
  2. abort to stop the execution.

To launch the function in the background, so that functions which do not modify the project (“get” functions for example) remain available, set the input argument “wait” to FALSE.
The initial values of the population parameters can be accessed by calling getPopulationParameterInformation and customized with setPopulationParameterInformation.
The estimated population parameters are available using the getEstimatedPopulationParameters function.

Usage

runPopulationParameterEstimation(wait = TRUE)

Arguments

wait
(bool) Whether R should wait for the run to complete before returning control to the user. TRUE by default.

See Also

isRunning abort

Examples

## Not run:
runPopulationParameterEstimation()
## End(Not run)


Run current scenario

Description

Run the current scenario. By default, this task is processed sequentially.
Call

  1. isRunning to check if the scenario is still running and get information about the current task,
  2. abort to stop the execution.

To launch the function in the background, so that functions which do not modify the project (“get” functions for example) remain available, set the input argument “wait” to FALSE.

Note: if the plots task is selected in the scenario, and if “Export charts data” is selected in Monolix’s preferences, the charts data are saved in the result folder. Generating the interactive plots requires opening the project in the GUI.
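
When a run is launched in the background, the isRunning check described above can be wrapped in a small polling loop. The following is a minimal sketch, not part of MlxConnectors: wait_for_scenario, its arguments and the stand-in predicate fake_is_running are hypothetical names introduced here. Only isRunning, runScenario and abort come from the API, and the dry run at the end does not need Monolix at all.

```r
# Poll a "still running?" predicate until it returns FALSE or a timeout expires.
# By default the predicate calls the connector's isRunning(); any zero-argument
# function returning TRUE/FALSE can be passed instead (useful for dry runs).
wait_for_scenario <- function(is_running = function() isRunning(),
                              poll_seconds = 1, timeout_seconds = 3600) {
  started <- Sys.time()
  while (isTRUE(is_running())) {
    elapsed <- as.numeric(difftime(Sys.time(), started, units = "secs"))
    if (elapsed > timeout_seconds) {
      return(FALSE)  # gave up; the caller may decide to call abort()
    }
    Sys.sleep(poll_seconds)
  }
  TRUE  # the predicate reported that nothing is running any more
}

# Dry run with a stand-in predicate that reports "running" three times.
remaining <- 3
fake_is_running <- function() {
  remaining <<- remaining - 1
  remaining >= 0
}
finished <- wait_for_scenario(fake_is_running, poll_seconds = 0)

# Intended use with a loaded project (requires Monolix, not executed here):
#   runScenario(wait = FALSE)
#   if (!wait_for_scenario(poll_seconds = 5)) abort()
```

Passing the predicate as an argument keeps the loop independent of the connectors, so the timeout logic can be tried out before pointing it at a real run.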

Usage

runScenario(wait = TRUE)

Arguments

wait
(bool) Whether R should wait for the run to complete before returning control to the user. TRUE by default.

See Also

setScenario getScenario abort isRunning

Examples

## Not run:

runScenario() # sequential run

runScenario(wait = FALSE) # background run

## End(Not run)


Standard error estimation

Description

Estimate the Fisher Information Matrix and the standard errors of the population parameters. By default, this task is not processed in the background of the R session.
Notice that it does not impact the current scenario. Call

  1. isRunning to check if the scenario is still running and get information about the current task,
  2. abort to stop the execution.

To launch the function in the background, so that functions which do not modify the project (“get” functions for example) remain available, set the input argument “wait” to FALSE.

Usage

runStandardErrorEstimation(linearization = FALSE, wait = TRUE)

Arguments

linearization
(bool) [optional] Whether to estimate the FIM by linearization. If FALSE (the default), the stochastic approximation is used.

wait
(bool) Whether R should wait for the run to complete before returning control to the user. TRUE by default.

Details

Existing methods:

  • Estimate the FIM by Stochastic Approximation: linearization = FALSE (default)
  • Estimate the FIM by Linearization: linearization = TRUE

The inverse of the Fisher Information Matrix is available using the getCorrelationOfEstimates function, while the standard errors are available using the getEstimatedStandardErrors function.

See Also

isRunning abort

Examples

## Not run:

runStandardErrorEstimation(linearization = TRUE)

## End(Not run)


Set scenario

Description

Clear the current scenario and build a new one from a given list of tasks, the linearization option and the list of plots.

The scenario is a list of 3 objects:

  • tasks: named vector of boolean, defining for each task if it should run or not
  • linearization: boolean, defining if linearization method should be used or not for standard errors and log-likelihood estimation
  • plotList: vector of strings, defining the list of graphics to generate

NOTE

by default the boolean is false.

Usage

setScenario(...)

Details

NOTE

Within a MONOLIX scenario, the order in which the different algorithms are run is fixed.

Options for the “tasks” object of the list (algorithm in the GUI: keyword in the connector):

  • Population Parameter Estimation: “populationParameterEstimation”
  • Conditional Mode Estimation (EBEs): “conditionalModeEstimation”
  • Sampling from the Conditional Distribution: “conditionalDistributionSampling”
  • Standard Error and Fisher Information Matrix Estimation: “standardErrorEstimation”
  • LogLikelihood Estimation: “logLikelihoodEstimation”
  • Plots: “plots”

Options for the “linearization” object of the list: TRUE or FALSE

Options for the “plotList” object of the list (name in the GUI: keyword in the connector):

  • Observed data: “outputplot”
  • Individual fits: “indfits”
  • Observations vs predictions: “obspred”
  • Scatter plot of the residuals: “residualsscatter”
  • Distribution of the residuals: “residualsdistribution”
  • Distribution of the individual parameters: “parameterdistribution”
  • Distribution of the random effects: “randomeffects”
  • Correlation between random effects: “covariancemodeldiagnosis”
  • Individual parameters vs covariates: “covariatemodeldiagnosis”
  • Visual predictive check: “vpc”
  • Visual predictive check (discrete data): “categorizedoutput”
  • Numerical predictive check: “npc”
  • BLQ predictive check: “blq”
  • Prediction distribution: “predictiondistribution”
  • Likelihood contribution: “likelihoodcontribution”
  • Standard errors of the estimates: “fisher”
  • SAEM: “saemresults”
  • MCMC: “condmeanresults”
  • Importance sampling: “likelihoodresults”
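
As a rough illustration of how these keywords fit together, the sketch below assembles a scenario list in plain R and rejects misspelled keywords before anything is sent to Monolix. The helper buildScenario is hypothetical (it is not part of the Monolix API), the keyword vectors are copied from the keyword lists above, and filling unmentioned tasks with FALSE is an assumption based on the note that the boolean is false by default.

```r
# Keywords copied from the keyword lists above.
validTasks <- c("populationParameterEstimation", "conditionalModeEstimation",
                "conditionalDistributionSampling", "standardErrorEstimation",
                "logLikelihoodEstimation", "plots")
validPlots <- c("outputplot", "indfits", "obspred", "residualsscatter",
                "residualsdistribution", "parameterdistribution", "randomeffects",
                "covariancemodeldiagnosis", "covariatemodeldiagnosis", "vpc",
                "categorizedoutput", "npc", "blq", "predictiondistribution",
                "likelihoodcontribution", "fisher", "saemresults",
                "condmeanresults", "likelihoodresults")

# Hypothetical helper, not part of the Monolix API.
buildScenario <- function(tasks, linearization = FALSE, plotList = character(0)) {
  badTasks <- setdiff(names(tasks), validTasks)
  badPlots <- setdiff(plotList, validPlots)
  if (length(badTasks) > 0) stop("Unknown task keyword(s): ", paste(badTasks, collapse = ", "))
  if (length(badPlots) > 0) stop("Unknown plot keyword(s): ", paste(badPlots, collapse = ", "))
  # Assumption: tasks that are not mentioned are switched off.
  allTasks <- stats::setNames(rep(FALSE, length(validTasks)), validTasks)
  allTasks[names(tasks)] <- tasks
  list(tasks = allTasks, linearization = linearization, plotList = plotList)
}

scenario <- buildScenario(
  tasks = c(populationParameterEstimation = TRUE, standardErrorEstimation = TRUE),
  linearization = TRUE,
  plotList = c("outputplot", "fisher")
)
# setScenario(scenario)  # requires a loaded Monolix project
```

Checking the keywords locally is only a convenience; the connector itself remains the authority on what a valid scenario is.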

 

See Also

getScenario

Examples

## Not run:

scenario = getScenario()

scenario$tasks = c(populationParameterEstimation = TRUE, conditionalModeEstimation = TRUE, conditionalDistributionSampling = TRUE)

scenario$linearization = TRUE

scenario$plotList = c("outputplot", "fisher")

setScenario(scenario)

## End(Not run)

3.10.6. API concerning the results

Description of the functions of the API

getCorrelationOfEstimates Get the inverse of the last estimated Fisher matrix computed either by all the Fisher methods used during the last scenario run or by the specific one passed in argument.
getEstimatedIndividualParameters Get the last estimated values for each subject of some of the individual parameters present within the current project.
getEstimatedLogLikelihood Get the values computed by using a log-likelihood algorithm during the last scenario run, with or without a method-based filter.
getEstimatedPopulationParameters Get the last estimated value of some of the population parameters present within the current project (fixed effects + individual variances + correlations + latent probabilities + error model parameters).
getEstimatedRandomEffects Get the random effects for each subject of some of the individual parameters present within the current project.
getEstimatedStandardErrors Get the last estimated standard errors of population parameters computed either by all the Fisher methods used during the last scenario run or by the specific one passed in argument.
getLaunchedTasks Get a list of the tasks which have results to provide.
getSAEMiterations Retrieve the successive values of some of the population parameters present within the current project (fixed effects + individual variances + correlations + latent probabilities + error model parameters) during the previous run of the SAEM algorithm.
getSimulatedIndividualParameters Get the simulated values for each replicate of each subject of some of the individual parameters present within the current project.
getSimulatedRandomEffects Get the simulated values for each replicate of each subject of some of the individual random effects present within the current project.

Get the inverse of the Fisher Matrix

Description

Get the inverse of the last estimated Fisher matrix computed either by all the Fisher methods used during the last scenario run or by the specific one passed in argument.
WARNING: The Fisher matrix is not accessible until the Fisher algorithm has been launched once.
The user can choose to display only the Fisher matrix estimated with a specific method.
Existing Fisher methods:

Fisher by Linearization “linearization”
Fisher by Stochastic Approximation “stochasticApproximation”

WARNING: Only the methods which have been used during the last scenario run can provide results.

Usage

getCorrelationOfEstimates(method = "")

Arguments

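method
[optional] (string) Fisher method whose results should be displayed. If no method is given, the results of all the Fisher methods used during the last scenario run are returned.

Value

The matrix, described by the following fields:

  • rownames: list of row names
  • columnnames: list of column names
  • rownumber: number of rows
  • data: vector<…> containing the raw matrix values (column major)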
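
The fields listed above are enough to rebuild a regular R matrix. A minimal sketch, assuming the result is available as a list with exactly these field names; the object fim below is mocked with made-up numbers and is not real Monolix output.

```r
# Mocked result with made-up values; not real Monolix output.
fim <- list(
  rownames    = c("V_pop", "Cl_pop", "ka_pop"),
  columnnames = c("V_pop", "Cl_pop", "ka_pop"),
  rownumber   = 3,
  data        = c( 1.0, 0.2, -0.1,
                   0.2, 1.0,  0.4,
                  -0.1, 0.4,  1.0)
)

# matrix() fills column by column by default, which matches a column-major vector.
corMat <- matrix(fim$data, nrow = fim$rownumber,
                 dimnames = list(fim$rownames, fim$columnnames))
corMat["ka_pop", "V_pop"]
```

Because R stores matrices column by column, no transposition is needed when the raw values are column major.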



Get last estimated individual parameter values

Description

Get the last estimated values for each subject of some of the individual parameters present within the current project.
WARNING: Estimated individual parameter values are not accessible until the individual estimation algorithm has been launched once.
NOTE: The user can choose to display only the individual parameter values estimated with a specific method.
Existing individual estimation methods:

Conditional Mean SAEM “saem”
Conditional Mean “conditionalMean”
Conditional Mode “conditionalMode”

WARNING: Only the methods which have been used during the last scenario run can provide estimation results.

Usage

getEstimatedIndividualParameters(..., method = "")

Arguments


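...
(string) Names of the individual parameters whose values must be displayed. Call getIndividualParameterModel to get a list of the individual parameters present within the current project.

method
[optional] (string) Individual estimation method whose results should be displayed. If no method is given, the results of all the methods used during the last scenario run are returned.

See Also

getEstimatedRandomEffects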

Examples

## Not run:

indivParams = getEstimatedIndividualParameters() # retrieve the values of all the available individual parameters for all methods

# -> $saem
# id Cl V ka
# 1 0.28 7.71 0.29
# . … … …
# N 0.1047.62 1.51

indivParams = getEstimatedIndividualParameters("Cl", "V", method = "conditionalMean") # retrieve the values of the individual parameters "Cl" and "V" estimated by the conditional mean method

## End(Not run)


Get Log-Likelihood values

Description

Get the values computed by using a log-likelihood algorithm during the last scenario run, with or without a method-based filter.
WARNING: The log-likelihood values are not accessible until the log-likelihood algorithm has been launched once.
The user can choose to display only the log-likelihood values computed with a specific method.
Existing log-likelihood methods:

Log-likelihood by Linearization “linearization”
Log-likelihood by Importance Sampling “importanceSampling”

WARNING: Only the methods which have been used during the last scenario run can provide results.

Usage

getEstimatedLogLikelihood(method = "")

Arguments

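method
[optional] (string) Log-likelihood method whose values should be displayed. If no method is given, the values of all the methods used during the last scenario run are returned.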


Get last estimated population parameter value

Description

Get the last estimated value of some of the population parameters present within the current project (fixed effects + individual variances + correlations + latent probabilities + error model parameters).
WARNING: Estimated population parameter values are not accessible until the SAEM algorithm has been launched once.

Usage

getEstimatedPopulationParameters(...)

Arguments


...
[optional] (array<string>) Names of the population parameters whose values must be displayed. Call getPopulationParameterInformation to get a list of the population parameters present within the current project.
If this field is not specified, the function will retrieve the values of all the available population parameters.

Value

A named vector containing the last estimated value of each one of the population parameters passed in argument.

Examples

## Not run:

getEstimatedPopulationParameters("V_pop") # -> [V_pop = 0.5]

getEstimatedPopulationParameters("V_pop", "Cl_pop") # -> [V_pop = 0.5, Cl_pop = 0.25]

getEstimatedPopulationParameters() # -> [V_pop = 0.5, Cl_pop = 0.25, ka_pop = 0.05]

## End(Not run)


Get the estimated random effects

Description

Get the random effects for each subject of some of the individual parameters present within the current project.
WARNING: Estimated random effects are not accessible until the individual estimation algorithm has been launched once.
The user can choose to display only the random effects estimated with a specific method.
NOTE: The random effects are defined on the Gaussian scale, e.g. if ka is lognormally distributed around ka_pop, eta_i = log(ka_i) - log(ka_pop).
Existing individual estimation methods:

Conditional Mean SAEM “saem”
Conditional Mean “conditionalMean”
Conditional Mode “conditionalMode”

WARNING: Only the methods which have been used during the last scenario run can provide estimation results. Please call getLaunchedTasks to get a list of the methods whose results are available.
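
The relation in the NOTE above can be checked numerically. A small sketch in plain R for a lognormally distributed ka; the values of ka_pop and ka_i are made up for illustration and do not come from any Monolix run.

```r
# Made-up values for illustration only.
ka_pop <- 1.5
ka_i   <- c(1.2, 1.5, 2.1)

# Random effects on the Gaussian scale for a lognormal parameter.
eta_i <- log(ka_i) - log(ka_pop)

# Going back from random effects to individual parameters.
ka_back <- ka_pop * exp(eta_i)
```

A subject whose ka equals ka_pop has a random effect of zero, and subjects below or above the population value have negative or positive random effects respectively.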

Usage

getEstimatedRandomEffects(..., method = "")

Arguments


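...
(string) Names of the individual parameters whose random effects must be displayed. Call getIndividualParameterModel to get a list of the individual parameters present within the current project.

method
[optional] (string) Individual estimation method whose results should be displayed. If no method is given, the results of all the methods used during the last scenario run are returned.

See Also

getEstimatedIndividualParameters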

Examples

## Not run:

etaParams = getEstimatedRandomEffects() # retrieve the values of all the available random effects for all methods, without the associated standard deviations

# -> $saem
# id Cl V ka
# 1 0.28 7.71 0.29
# . … … …
# N 0.1047.62 1.51

etaParams = getEstimatedRandomEffects("Cl", "V", method = "conditionalMode") # retrieve the random effects of the individual parameters "Cl" and "V" estimated by the conditional mode method

## End(Not run)


Get standard errors of population parameters

Description

Get the last estimated standard errors of population parameters computed either by all the Fisher methods used during the last scenario run or by the specific one passed in argument.
WARNING: The standard errors are not accessible until the Fisher algorithm has been launched once.
Existing Fisher methods:

Fisher by Linearization “linearization”
Fisher by Stochastic Approximation “stochasticApproximation”

WARNING: Only the methods which have been used during the last scenario run can provide results.

Usage

getEstimatedStandardErrors(method = "")

Arguments

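method
[optional] (string) Fisher method whose results should be displayed. If no method is given, the results of all the Fisher methods used during the last scenario run are returned.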


Get tasks with results

Description

Get a list of the tasks which have results to provide. A task is the association of:

  • an algorithm (string)
  • a vector of methods (string) relative to this algorithm for standardErrorEstimation and logLikelihoodEstimation, or TRUE or FALSE for the other algorithms.

Usage

getLaunchedTasks()

Value

The list of tasks with results, indexed by algorithm names.
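
Since several “get” functions only return results for methods that were actually run, it can be handy to test the task list first. The sketch below works on a mocked list shaped like the example output in this section; hasResults is a hypothetical helper, not part of the Monolix API.

```r
# Mocked return value shaped like the example output of getLaunchedTasks().
tasks <- list(
  populationParameterEstimation = TRUE,
  conditionalModeEstimation     = TRUE,
  standardErrorEstimation       = "linearization"
)

# Hypothetical helper, not part of the Monolix API.
hasResults <- function(tasks, algorithm, method = NULL) {
  entry <- tasks[[algorithm]]
  # A missing entry or FALSE means the algorithm has no results.
  if (is.null(entry) || identical(entry, FALSE)) return(FALSE)
  if (is.null(method)) return(TRUE)
  method %in% entry
}

hasResults(tasks, "standardErrorEstimation", "linearization")
hasResults(tasks, "standardErrorEstimation", "stochasticApproximation")
hasResults(tasks, "logLikelihoodEstimation")
```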

Examples

## Not run:

tasks = getLaunchedTasks()

tasks

# -> $populationParameterEstimation = TRUE
#    $conditionalModeEstimation = TRUE
#    $standardErrorEstimation = "linearization"

## End(Not run)


Get SAEM algorithm iterations

Description

Retrieve the successive values of some of the population parameters present within the current project (fixed effects + individual variances + correlations + latent probabilities + error model parameters) during the previous run of the SAEM algorithm.
WARNING: The convergence history of the population parameter values is not accessible until the SAEM algorithm has been launched once.

Usage

getSAEMiterations(...)

Arguments


...
[optional] (array<string>) Names of the population parameters whose convergence history must be displayed. Call getPopulationParameterInformation to get a list of the population parameters present within the current project.
If this field is not specified, the function will retrieve the values of all the available population parameters.

Value

A list containing the numbers of exploratory and smoothing iterations (as a pair) and a data frame that associates each requested population parameter with its successive values over the SAEM iterations.

Examples

## Not run:

report = getSAEMiterations()

report

# -> $iterationNumbers
#    c(50,25)
#    $estimates
#    V    Cl
#    0.25 0
#    0.3  0.5
#    .    .
#    0.35 0.25

## End(Not run)


Get simulated individual parameters

Description

Get the simulated values for each replicate of each subject of some of the individual parameters present within the current project.
WARNING: Simulated individual parameter values are not accessible until the individual estimation with the conditional mean algorithm has been launched once.

Usage

getSimulatedIndividualParameters(...)

Arguments


...
(string) Names of the individual parameters whose values must be displayed. Call getIndividualParameterModel to get a list of the individual parameters present within the current project.

Value

A list giving the last simulated values of the individual parameters of interest for each replicate of each subject.
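
Replicates are usually summarised per subject. A minimal sketch in plain R, assuming the result can be treated as a data frame with “rep” and “id” columns as in the example output of this section; the data frame below is mocked with made-up numbers.

```r
# Mocked result with made-up values; not real Monolix output.
simParams <- data.frame(
  rep = c(1, 1, 2, 2),
  id  = c(1, 2, 1, 2),
  Cl  = c(0.022, 0.033, 0.021, 0.030),
  V   = c(0.37, 0.42, 0.33, 0.40)
)

# Average over replicates: one row per subject.
perSubject <- aggregate(cbind(Cl, V) ~ id, data = simParams, FUN = mean)
perSubject
```

Replacing mean with sd or quantile in the call gives other summaries of the same replicates.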

See Also

getSimulatedRandomEffects

Examples

## Not run:

simParams = getSimulatedIndividualParameters() # retrieve the values of all the available individual parameters

simParams

# rep id Cl    V    ka
# 1   1  0.022 0.37 1.79
# 1   2  0.033 0.42 -0.92
# .   .  …     …    …
# 2   1  0.021 0.33 1.47
# .   .  …     …    …

## End(Not run)


Get simulated random effects

Description

Get the simulated values for each replicate of each subject of some of the individual random effects present within the current project.
WARNING: Simulated individual random effect values are not accessible until the individual estimation algorithm with conditional mean has been launched once.

Usage

getSimulatedRandomEffects(...)

Arguments


...
(string) Names of the individual parameters whose random effects must be displayed. Call getIndividualParameterModel to get a list of the individual parameters present within the current project.

Value

A list giving the last simulated values of the individual random effects of interest for each replicate of each subject.

See Also

getIndividualParameterModel

Examples

## Not run:

simEtas = getSimulatedRandomEffects() # retrieve the values of all the available individual random effects

simEtas

# rep id Cl    V    ka
# 1   1  0.022 0.37 1.79
# 1   2  0.033 0.42 -0.92
# .   .  …     …    …
# 2   1  0.021 0.33 1.47
# .   .  …     …    …

## End(Not run)

3.10.7. API concerning the project management

Description of the functions of the API

getData Get a description of the data used in the current project.
getStructuralModel Get the model file for the structural model used in the current project.
loadProject Load a project by parsing the Mlxtran-formatted file whose path has been given as an input.
newProject Create a new empty project providing model and data specification.
saveProject Save the current project as an Mlxtran-formatted file.
setData Set project data giving a data file and specifying headers and observations types.
setStructuralModel Set the structural model.

Get project data

Description

Get a description of the data used in the current project. The available information is:

  • dataFile (string): path to the data file
  • header (array<character>): vector of header names
  • headerTypes (array<character>): vector of header types
  • observationNames (vector<string>): vector of observation names
  • observationTypes (vector<string>): vector of observation types
  • nbSSDoses (int): number of doses (if there is a SS column)

Usage

getData()

Value

A list describing project data.

See Also

setData

Examples

## Not run:

data = getData()

data

# -> $dataFile
#    "/path/to/data/file.txt"
#    $header
#    c("ID","TIME","CONC","SEX","OCC")
#    $headerTypes
#    c("ID","TIME","OBSERVATION","CATEGORICAL COVARIATE","IGNORE")
#    $observationNames
#    c("concentration")
#    $observationTypes
#    c(concentration = "continuous")

## End(Not run)


Get structural model file

Description

Get the model file for the structural model used in the current project.

Usage

getStructuralModel()

Value

A string corresponding to the path to the structural model file.

See Also

setStructuralModel

Examples

## Not run:

getStructuralModel() # => "/path/to/model/inclusion/modelFile.txt"

## End(Not run)


Load project from file

Description

Load a project by parsing the Mlxtran-formatted file whose path has been given as an input.
WARNING: R distinguishes between ‘\’ and ‘/’ in paths; only ‘/’ can be used.

Usage

loadProject(projectFile)

Arguments

projectFile
(character) Path to the project file. Can be absolute or relative to the current working directory.

See Also

saveProject

Examples

## Not run:

loadProject("/path/to/project/file.mlxtran") # for Linux platform

loadProject("C:/Users/path/to/project/file.mlxtran") # for Windows platform

## End(Not run)


Create new project

Description

Create a new empty project providing model and data specification. The data specification is:

  • dataFile (string): path to the data file
  • headerTypes (array<character>): A vector of header types.
    The possible header types are: “id”, “time”, “observation”, “amount”, “contcov”, “catcov”, “occ”, “evid”, “mdv”, “obsid”, “cens”, “limit”, “regressor”, “admid”, “rate”, “tinf”, “ss”, “ii”, “addl”, “date”, “ignore”.
    Note that these are not the types displayed in the interface; they are shortcuts. They are not case-sensitive.
  • observationTypes (list): A list giving the type of each observation present in the data file. If there is only one y-type, the corresponding observation name can be omitted.
    The possible observation types are “continuous”, “discrete”, and “event”.
  • nbSSDoses [optional](int): Number of doses (if there is a SS column).
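
A data specification can be sanity-checked in plain R before it is handed to newProject. The sketch below is only illustrative: checkDataSpec is a hypothetical helper (not part of the Monolix API), the allowed values are copied from the list above, and comparing observation types exactly (case-sensitively) is an assumption.

```r
validHeaderTypes <- c("id", "time", "observation", "amount", "contcov", "catcov",
                      "occ", "evid", "mdv", "obsid", "cens", "limit", "regressor",
                      "admid", "rate", "tinf", "ss", "ii", "addl", "date", "ignore")
validObservationTypes <- c("continuous", "discrete", "event")

# Hypothetical helper, not part of the Monolix API.
checkDataSpec <- function(data) {
  # Header types are not case-sensitive, so compare in lower case.
  badHeaders <- setdiff(tolower(data$headerTypes), validHeaderTypes)
  badObs <- setdiff(unlist(data$observationTypes), validObservationTypes)
  if (length(badHeaders) > 0) stop("Unknown header type(s): ", paste(badHeaders, collapse = ", "))
  if (length(badObs) > 0) stop("Unknown observation type(s): ", paste(badObs, collapse = ", "))
  invisible(TRUE)
}

data <- list(
  dataFile = "./warfarin_data.txt",
  headerTypes = c("ID", "time", "amount", "observation", "obsid",
                  "contcov", "catcov", "ignore"),
  observationTypes = list(y1 = "continuous", y2 = "continuous")
)
checkDataSpec(data)
# newProject(modelFile = "./oral1_1cpt_kaVCl.txt", data = data)  # requires Monolix
```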

Usage

newProject(modelFile, data)

Arguments

modelFile
(character) Path to the model file. Can be absolute or relative to the current working directory.

data
(list) Structure describing the data.

See Also

newProject saveProject

Examples

## Not run:

newProject(data = list(dataFile = "/path/to/data/file.txt",
           headerTypes = c("IGNORE","OBSERVATION"),
           observationTypes = "continuous"),
           modelFile = "/path/to/model/file.txt")

## End(Not run)

## Example with warfarin_data.txt from demos and oral1_1cpt_kaVCl.txt from libraries in the current directory
data = list(dataFile = "./warfarin_data.txt",
            headerTypes = c("id", "time", "amount", "observation", "obsid", "contcov", "catcov", "ignore"),
            observationTypes = list(y1 = "continuous", y2 = "continuous"))
modelFile <- "./oral1_1cpt_kaVCl.txt"
newProject(modelFile = modelFile, data = data)


Save current project

Description

Save the current project as an Mlxtran-formatted file.

Usage

saveProject(projectFile = "")

Arguments

projectFile
[optional] (character) Path where to save a copy of the current mlxtran model. Can be absolute or relative to the current working directory.
If no path is given, the file used to build the current configuration is updated.

See Also

newProject loadProject

Examples

## Not run:

saveProject("/path/to/project/file.mlxtran") # save a copy of the model

saveProject() # update current model

## End(Not run)



Set project data

Description

Set project data giving a data file and specifying headers and observations types.

Usage

setData(dataFile, headerTypes, observationTypes, nbSSDoses = NULL)

Arguments

dataFile
(character): Path to the data file. Can be absolute or relative to the current working directory.

headerTypes
(array<character>): A collection of header types.
The possible header types are: “id”, “time”, “observation”, “amount”, “contcov”, “catcov”, “occ”, “evid”, “mdv”, “obsid”, “cens”, “limit”, “regressor”, “admid”, “rate”, “tinf”, “ss”, “ii”, “addl”, “date”, “ignore”.
Note that these are not the types displayed in the interface; they are shortcuts. They are not case-sensitive.

observationTypes
(list): A list giving the type of each observation present in the data file. If there is only one y-type, the corresponding observation name can be omitted.
The possible observation types are “continuous”, “discrete”, and “event”.

nbSSDoses
[optional] (int): Number of doses (if there is a SS column).

See Also

getData

Examples

## Not run:

setData(dataFile = "/path/to/data/file.txt", headerTypes = c("IGNORE","OBSERVATION"),
        observationTypes = "continuous")

setData(dataFile = "/path/to/data/file.txt", headerTypes = c("IGNORE","OBSERVATION","YTYPE"),
        observationTypes = list(Concentration = "continuous", Level = "discrete"))

## End(Not run)


Set structural model file

Description

Set the structural model.

Usage

setStructuralModel(modelFile)

Arguments

modelFile
(character) Path to the model file. Can be absolute or relative to the current working directory.

See Also

getStructuralModel

Examples

## Not run:

setStructuralModel("/path/to/model/file.txt")

## End(Not run)


3.10.8. API concerning the settings

Description of the functions of the API

getConditionalDistributionSamplingSettings Get the conditional distribution sampling settings.
getConditionalModeEstimationSettings Get the conditional mode estimation settings.
getGeneralSettings Get a summary of the common settings for Monolix algorithms.
getLogLikelihoodEstimationSettings Get the loglikelihood estimation settings.
getMCMCSettings Get the MCMC algorithm settings of the current project.
getPopulationParameterEstimationSettings Get the population parameter estimation settings.
getPreferences Get a summary of the project preferences.
getProjectSettings Get a summary of the project settings.
getStandardErrorEstimationSettings Get the standard error estimation settings.
setConditionalDistributionSamplingSettings Set the value of one or several of the conditional distribution sampling settings.
setConditionalModeEstimationSettings Set the value of one or several of the conditional mode estimation settings.
setGeneralSettings Set the value of one or several of the common settings for Monolix algorithms.
setLogLikelihoodEstimationSettings Set the value of the loglikelihood estimation settings.
setMCMCSettings Set the value of one or several of the MCMC algorithm specific settings of the current project.
setPopulationParameterEstimationSettings Set the value of one or several of the population parameter estimation settings.
setPreferences Set the value of one or several of the project preferences.
setProjectSettings Set the value of one or several of the settings of the project.
setStandardErrorEstimationSettings Set the value of one or several of the standard error estimation settings.

Get conditional distribution sampling settings

Description

Get the conditional distribution sampling settings. Associated settings are:

 

“ratio” (0< double <1) Width of the confidence interval.
“nbMinIterations” (int >=1) Minimum number of iterations.
“nbSimulatedParameters” (int >=1) Number of replicates.

Usage

getConditionalDistributionSamplingSettings(...)

Arguments


...
[optional] (string) Names of the settings whose values should be displayed. If no argument is provided, all the settings are returned.

Value

An array which associates each setting name to its current value.

See Also

setConditionalDistributionSamplingSettings

Examples

## Not run:

getConditionalDistributionSamplingSettings() # retrieve a list of all the conditional distribution sampling settings

getConditionalDistributionSamplingSettings("ratio","nbMinIterations") # retrieve a list containing only the value of the settings whose name has been passed in argument

## End(Not run)




Get conditional mode estimation settings

Description

Get the conditional mode estimation settings. Associated settings are:

“nbOptimizationIterationsMode” (int >=1) Maximum number of iterations.
“optimizationToleranceMode” (double >0) Optimization tolerance.

Usage

getConditionalModeEstimationSettings(...)

Arguments


...
[optional] (string) Names of the settings whose values should be displayed. If no argument is provided, all the settings are returned.

Value

An array which associates each setting name to its current value.

See Also

setConditionalModeEstimationSettings

Examples

## Not run:

getConditionalModeEstimationSettings() # retrieve a list of all the conditional mode estimation settings

getConditionalModeEstimationSettings("nbOptimizationIterationsMode") # retrieve a list containing only the value of the settings whose name has been passed in argument

## End(Not run)




Get project general settings

Description

Get a summary of the common settings for Monolix algorithms. Associated settings are:

“autoChains” (bool) Automatically adjust the number of chains to have at least a minimum number of subjects.
“nbChains” (int >0) Number of chains. Used only if “autoChains” is set to FALSE.
“minIndivForChains” (int >0) Minimum number of individuals by chain.

Usage

getGeneralSettings(...)

Arguments


...
[optional] (string) Names of the settings whose values should be displayed. If no argument is provided, all the settings are returned.

Value

An array which associates each setting name to its current value.

See Also

setGeneralSettings

Examples

## Not run:

getGeneralSettings() # retrieve a list of all the general settings

getGeneralSettings("nbChains","autoChains") # retrieve a list containing only the value of the settings whose name has been passed in argument

## End(Not run)




Get LogLikelihood algorithm settings

Description

Get the loglikelihood estimation settings. Associated settings are:

“nbFixedIterations” (int >0) Monte Carlo size for the loglikelihood evaluation.
“samplingMethod” (string) Should the loglikelihood estimation use a given number of degrees of freedom (“fixed”) or test a sequence of degrees of freedom before choosing the best one (“optimized”).
“nbFreedomDegrees” (int >0) Degrees of freedom of the Student t-distribution. Used only if “samplingMethod” is “fixed”.
“freedomDegreesSampling” (vector<int(>0)>) Sequence of degrees of freedom to be tested. Used only if “samplingMethod” is “optimized”.

Usage

getLogLikelihoodEstimationSettings(...)

Arguments


...
[optional] (string) Names of the settings whose values should be displayed. If no argument is provided, all the settings are returned.

Value

An array which associates each setting name to its current value.

See Also

setLogLikelihoodEstimationSettings

Examples

## Not run:

getLogLikelihoodEstimationSettings() # retrieve a list of all the loglikelihood estimation settings

getLogLikelihoodEstimationSettings("nbFixedIterations","samplingMethod") # retrieve a list containing only the value of the settings whose name has been passed in argument (here, the number of fixed iterations and the method)

## End(Not run)




Get MCMC algorithm settings

Description

Get the MCMC algorithm settings of the current project. Associated settings are:

“strategy” (vector<int>[3]) Number of calls for each one of the three MCMC kernels.
“acceptanceRatio” (double) Target acceptance ratio.

Usage

getMCMCSettings(...)

Arguments


...
[optional] (string) Names of the settings whose values should be displayed. If no argument is provided, all the settings are returned.

Value

An array which associates each setting name to its current value.

See Also

setMCMCSettings

Examples

## Not run:

getMCMCSettings() # retrieve a list of all the MCMC settings

getMCMCSettings("strategy") # retrieve a list containing only the value of the settings whose name has been passed in argument (here, the strategy)

## End(Not run)




Get population parameter estimation settings

Description

Get the population parameter estimation settings. Associated settings are:

“nbBurningIterations” (int >=0) Number of iterations in the burn-in phase.
“nbExploratoryIterations” (int >=0) If “exploratoryAutoStop” is set to FALSE, it is the number of iterations in the exploratory phase. Otherwise, if “exploratoryAutoStop” is set to TRUE, it is the maximum number of iterations in the exploratory phase.
“exploratoryAutoStop” (bool) Should the exploratory step automatically stop.
“exploratoryInterval” (int >0) Minimum number of iterations in the exploratory phase. Used only if “exploratoryAutoStop” is TRUE.
“exploratoryAlpha” (0<= double <=1) Convergence memory in the exploratory phase. Used only if “exploratoryAutoStop” is TRUE.
“nbSmoothingIterations” (int >=0) If “smoothingAutoStop” is set to FALSE, it is the number of iterations in the smoothing phase. Otherwise, if “smoothingAutoStop” is set to TRUE, it is the maximum number of iterations in the smoothing phase.
“smoothingAutoStop” (bool) Should the smoothing step automatically stop.
“smoothingInterval” (int >0) Minimum number of iterations in the smoothing phase. Used only if “smoothingAutoStop” is TRUE.
“smoothingAlpha” (0.5< double <=1) Convergence memory in the smoothing phase. Used only if “smoothingAutoStop” is TRUE.
“smoothingRatio” (0< double <1) Width of the confidence interval. Used only if “smoothingAutoStop” is TRUE.
“simulatedAnnealing” (bool) Should annealing be simulated.
“tauOmega” (double >0) Proportional rate on variance. Used only if “simulatedAnnealing” is TRUE.
“tauErrorModel” (double >0) Proportional rate on error model. Used only if “simulatedAnnealing” is TRUE.
“variability” (string) Estimation method for parameters without variability: “firstStage” | “decreasing” | “none”. Used only if parameters without variability are used in the project.
“nbOptimizationIterations” (int >=1) Number of optimization iterations.
“optimizationTolerance” (double >0) Tolerance for optimization.

Usage

getPopulationParameterEstimationSettings(...)

Arguments


[optional] (string) Name of the settings whose value should be displayed. If no argument is provided, all the settings are returned.

Value

An array which associates each setting name to its current value.

See Also

setPopulationParameterEstimationSettings

Examples

## Not run:

getPopulationParameterEstimationSettings() # retrieve a list of all the population parameter estimation settings

getPopulationParameterEstimationSettings("nbBurningIterations","smoothingInterval") # retrieve a list containing only the value of the settings whose name has been passed in argument (here, the number of burning iterations and the smoothing interval)

## End(Not run)




Get project preferences

Description

Get a summary of the project preferences. Preferences are:

“relativePath” (bool) Use relative path for save/load operations.
“threads” (int >0) Number of threads.
“timeStamping” (bool) Create an archive containing result files after each run.
“dpi” (bool) Apply high density pixel correction.
“imageFormat” (string) Image format used to save Monolix plots.
“delimiter” (string) Character used as delimiter in exported result files (“comma”, “,”, “semicolon”, “;”, “space”, “ ”, “tab”, “\t”).
“exportGraphics” (bool) Should plot images be exported.
“exportGraphicsData” (bool) Should chart data be exported.

Usage

getPreferences(...)

Arguments


[optional] (string) Name of the preference whose value should be displayed. If no argument is provided, all the preferences are returned.

Value

An array which associates each preference name to its current value.

See Also

setPreferences

Examples

## Not run:

getPreferences() # retrieve a list of all the preferences

getPreferences("imageFormat","exportGraphics") # retrieve a list containing only the value of the preferences whose name has been passed in argument

## End(Not run)




Get project settings

Description

Get a summary of the project settings. Associated settings are:

“directory” (string) Path to the folder where simulation results will be saved. It should be a writable directory.
“exportResults” (bool) Should results be exported.
“seed” (0< int <2147483647) Seed used by random generators.
“grid” (int) Number of points for the continuous simulation grid.
“nbSimulations” (int) Number of simulations for the plots (in VPC, NPC, …).
“dataAndModelNextToProject” (bool) Should data and model files be saved next to project.

Usage

getProjectSettings(...)

Arguments


[optional] (string) Name of the settings whose value should be displayed. If no argument is provided, all the settings are returned.

Value

An array which associates each setting name to its current value.

See Also

setProjectSettings

Examples

## Not run:

getProjectSettings() # retrieve a list of all the project settings

getProjectSettings("directory","seed") # retrieve a list containing only the value of the settings whose name has been passed in argument

## End(Not run)




Get standard error estimation settings

Description

Get the standard error estimation settings. Associated settings are:

“minIterations” (int >=1) Minimum number of iterations.
“maxIterations” (int >=1) Maximum number of iterations.

Usage

getStandardErrorEstimationSettings(...)

Arguments


[optional] (string) Name of the settings whose value should be displayed. If no argument is provided, all the settings are returned.

Value

An array which associates each setting name to its current value.

See Also

setStandardErrorEstimationSettings

Examples

## Not run:

getStandardErrorEstimationSettings() # retrieve a list of all the standard error estimation settings

getStandardErrorEstimationSettings("minIterations","maxIterations") # retrieve a list containing only the value of the settings whose name has been passed in argument

## End(Not run)




Set conditional distribution sampling settings

Description

Set the value of one or several of the conditional distribution sampling settings. Associated settings are:

“ratio” (0< double <1) Width of the confidence interval.
“nbMinIterations” (int >=1) Minimum number of iterations.
“nbSimulatedParameters” (int >=1) Number of replicates.

Usage

setConditionalDistributionSamplingSettings(...)

Arguments


A collection of comma-separated pairs {settingName = settingValue}.

See Also

getConditionalDistributionSamplingSettings

Examples

## Not run:

setConditionalDistributionSamplingSettings(ratio = 0.05, nbMinIterations = 50)

## End(Not run)




Set conditional mode estimation settings

Description

Set the value of one or several of the conditional mode estimation settings. Associated settings are:

“nbOptimizationIterationsMode” (int >=1) Maximum number of iterations.
“optimizationToleranceMode” (double >0) Optimization tolerance.

Usage

setConditionalModeEstimationSettings(...)

Arguments


A collection of comma-separated pairs {settingName = settingValue}.

See Also

getConditionalModeEstimationSettings

Examples

## Not run:

setConditionalModeEstimationSettings(nbOptimizationIterationsMode = 20, optimizationToleranceMode = 0.1)

## End(Not run)




Set common settings for algorithms

Description

Set the value of one or several of the common settings for Monolix algorithms. Associated settings are:

“autoChains” (bool) Automatically adjust the number of chains to have at least a minimum number of subjects.
“nbChains” (int >0) Number of chains to be used if “autoChains” is set to FALSE.
“minIndivForChains” (int >0) Minimum number of individuals per chain.

Usage

setGeneralSettings(...)

Arguments


A collection of comma-separated pairs {settingName = settingValue}.

See Also

getGeneralSettings

Examples

## Not run:

setGeneralSettings(autoChains = FALSE, nbChains = 10)

## End(Not run)




Set loglikelihood estimation settings

Description

Set the value of the loglikelihood estimation settings. Associated settings are:

“nbFixedIterations” (int >0) Monte Carlo size for the loglikelihood evaluation.
“samplingMethod” (string) Should the loglikelihood estimation use a given number of degrees of freedom (“fixed”) or test a sequence of degrees of freedom before choosing the best one (“optimized”).
“nbFreedomDegrees” (int >0) Degrees of freedom of the Student t-distribution. Used only if “samplingMethod” is “fixed”.
“freedomDegreesSampling” (vector<int(>0)>) Sequence of degrees of freedom to be tested. Used only if “samplingMethod” is “optimized”.

Usage

setLogLikelihoodEstimationSettings(...)

Arguments


A collection of comma-separated pairs {settingName = settingValue}.

See Also

getLogLikelihoodEstimationSettings

Examples

## Not run:

setLogLikelihoodEstimationSettings(nbFixedIterations = 20000)

## End(Not run)




Set settings associated to the MCMC algorithm

Description

Set the value of one or several of the MCMC algorithm specific settings of the current project. Associated settings are:

“strategy” (vector<int>[3]) Number of calls for each one of the three MCMC kernels.
“acceptanceRatio” (double) Target acceptance ratio.

Usage

setMCMCSettings(...)

Arguments


A collection of comma-separated pairs {settingName = settingValue}.

See Also

getMCMCSettings

Examples

## Not run:

setMCMCSettings(strategy = c(2,1,2))

## End(Not run)




Set population parameter estimation settings

Description

Set the value of one or several of the population parameter estimation settings. Associated settings are:

“nbBurningIterations” (int >=0) Number of iterations in the burn-in phase.
“nbExploratoryIterations” (int >=0) If “exploratoryAutoStop” is set to FALSE, it is the number of iterations in the exploratory phase. Otherwise, if “exploratoryAutoStop” is set to TRUE, it is the maximum number of iterations in the exploratory phase.
“exploratoryAutoStop” (bool) Should the exploratory step automatically stop.
“exploratoryInterval” (int >0) Minimum number of iterations in the exploratory phase. Used only if “exploratoryAutoStop” is TRUE.
“exploratoryAlpha” (0<= double <=1) Convergence memory in the exploratory phase. Used only if “exploratoryAutoStop” is TRUE.
“nbSmoothingIterations” (int >=0) If “smoothingAutoStop” is set to FALSE, it is the number of iterations in the smoothing phase. Otherwise, if “smoothingAutoStop” is set to TRUE, it is the maximum number of iterations in the smoothing phase.
“smoothingAutoStop” (bool) Should the smoothing step automatically stop.
“smoothingInterval” (int >0) Minimum number of iterations in the smoothing phase. Used only if “smoothingAutoStop” is TRUE.
“smoothingAlpha” (0.5< double <=1) Convergence memory in the smoothing phase. Used only if “smoothingAutoStop” is TRUE.
“smoothingRatio” (0< double <1) Width of the confidence interval. Used only if “smoothingAutoStop” is TRUE.
“simulatedAnnealing” (bool) Should annealing be simulated.
“tauOmega” (double >0) Proportional rate on variance. Used only if “simulatedAnnealing” is TRUE.
“tauErrorModel” (double >0) Proportional rate on error model. Used only if “simulatedAnnealing” is TRUE.
“variability” (string) Estimation method for parameters without variability: “firstStage” | “decreasing” | “none”. Used only if parameters without variability are used in the project.
“nbOptimizationIterations” (int >=1) Number of optimization iterations.
“optimizationTolerance” (double >0) Tolerance for optimization.

Usage

setPopulationParameterEstimationSettings(...)

Arguments


A collection of comma-separated pairs {settingName = settingValue}.

See Also

getPopulationParameterEstimationSettings

Examples

## Not run:

setPopulationParameterEstimationSettings(exploratoryAutoStop = TRUE, tauOmega = 0.95)

## End(Not run)




Set preferences

Description

Set the value of one or several of the project preferences. Preferences are:

“relativePath” (bool) Use relative path for save/load operations.
“threads” (int >0) Number of threads.
“timeStamping” (bool) Create an archive containing result files after each run.
“dpi” (bool) Apply high density pixel correction.
“imageFormat” (string) Image format used to save Monolix plots.
“delimiter” (string) Character used as delimiter in exported result files (“comma”, “,”, “semicolon”, “;”, “space”, “ ”, “tab”, “\t”).
“exportGraphics” (bool) Should plot images be exported.
“exportGraphicsData” (bool) Should chart data be exported.

Usage

setPreferences(...)

Arguments


A collection of comma-separated pairs {preferenceName = settingValue}.

See Also

getPreferences

Examples

## Not run:

setPreferences("exportGraphics" = FALSE, "delimiter" = ",")

## End(Not run)




Set project settings

Description

Set the value of one or several of the settings of the project. Associated settings are:

“directory” (string) Path to the folder where simulation results will be saved. It should be a writable directory.
“exportResults” (bool) Should results be exported.
“seed” (0< int <2147483647) Seed used by random generators.
“grid” (int) Number of points for the continuous simulation grid.
“nbSimulations” (int) Number of simulations for the plots (in VPC, NPC, …).
“dataAndModelNextToProject” (bool) Should data and model files be saved next to project.

Usage

setProjectSettings(...)

Arguments


A collection of comma-separated pairs {settingName = settingValue}.

See Also

getProjectSettings

Examples

## Not run:

setProjectSettings(directory = "/path/to/export/directory", seed = 12345)

## End(Not run)




Set standard error estimation settings

Description

Set the value of one or several of the standard error estimation settings. Associated settings are:

“minIterations” (int >=1) Minimum number of iterations.
“maxIterations” (int >=1) Maximum number of iterations.

Usage

setStandardErrorEstimationSettings(...)

Arguments


A collection of comma-separated pairs {settingName = settingValue}.

See Also

getStandardErrorEstimationSettings

Examples

## Not run:

setStandardErrorEstimationSettings(minIterations = 20, maxIterations = 250)

## End(Not run)



4.Plots

What kind of plots can be generated by Monolix?

The list of plots below corresponds to all the plots that Monolix can generate. They are computed with the task “Plots”, and the list of plots to compute can be selected by clicking on the button next to the task as shown below, prior to running the task.

By default, only a subset of the plots is selected, as one can see on this figure. Plots can be selected or unselected one by one, by group, or all at once.

In addition to selecting plots, this menu can be used to directly generate one particular plot, by clicking on the green arrow next to it, as can be seen below. The green arrow is not visible if the information required for the chosen plot has not been computed yet. For example, generating the plot “likelihood contribution” requires running the “Likelihood” task first.

Data

  • Observed data: This plot displays the original data w.r.t. time as a spaghetti plot, along with some additional information.

Model for the observations

  • Individual fits: This plot displays the individual fits: individual predictions using the individual parameters and the individual covariates w.r.t. time on a continuous grid, with the observed data overlaid.
  • Observations vs predictions: This plot displays observations w.r.t. the predictions computed using the population parameters or the individual parameters.
  • Scatter plot of the residuals: This plot displays the PWRES (population weighted residuals), the IWRES (individual weighted residuals), and the NPDE (Normalized Prediction Distribution Errors) w.r.t. the time and the prediction.
  • Distribution of the residuals: This plot displays the distributions of PWRES, IWRES and NPDE as histograms for the probability density function (PDF) or as cumulative distribution functions (CDF).

Diagnosis plots based on individual parameters

Predictive checks and predictions

Convergence diagnosis

Tasks results

Interacting with the plots

Within the frame “Plots”, the right part of the interface holds a panel with several tabs to interact with the plots:

  • The tab “Settings” provides options specific to each plot, such as hiding or displaying elements of the plot, modifying some elements, or changing axes scales and limits.
  • The tab “Stratify” can be used to select one or several covariates for splitting, filtering or coloring the points of the plot. See below for more details.
  • The tab “Preferences” allows customizing graphical aspects such as colors, font size, dot radius, line width, …

These tabs are marked in purple on the following figure, which is the panel shown for the observed data:

Stratification: split, color, filter

The stratification panel is used to create and use covariates for stratification purposes. It is possible to select one or several covariates for splitting, filtering or coloring the data set or the diagnosis plots, as shown in the following video.



 

The following figure shows a plot of the observed data from the warfarin dataset, stratified by coloring individuals according to the continuous covariate wt: the observed data is divided into three groups, which were set to equal size with the button “rescale”. It is also possible to set groups of equal width, or to personalize dividing values.

Moreover, clicking on a group highlights only the individuals belonging to this group, as can be seen below:

Values of categorical covariates can also be assigned to new groups, which can then be used for stratification.
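The two automatic grouping rules mentioned above for a continuous covariate (groups of equal size and groups of equal width) can be illustrated outside Monolix. The following Python sketch is not Monolix code: the weights, the function names and the tie-handling convention are assumptions made for illustration only.

```python
def equal_width_bounds(values, n_groups):
    """Dividing values that split [min, max] into intervals of equal width."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_groups
    return [lo + k * step for k in range(1, n_groups)]


def equal_size_bounds(values, n_groups):
    """Dividing values chosen so that each group holds about the same number of individuals."""
    ordered = sorted(values)
    n = len(ordered)
    # The first value of each new group is used as its lower bound.
    return [ordered[(k * n) // n_groups] for k in range(1, n_groups)]


def assign_group(value, bounds):
    """Index of the group a value falls into (0 is the lowest group)."""
    return sum(value >= b for b in bounds)


# Hypothetical body weights (kg) for nine individuals.
wt = [52.0, 58.5, 60.0, 66.7, 70.0, 74.2, 80.0, 95.0, 102.0]

width_bounds = equal_width_bounds(wt, 3)
size_bounds = equal_size_bounds(wt, 3)

width_groups = [assign_group(w, width_bounds) for w in wt]
size_groups = [assign_group(w, size_bounds) for w in wt]

print("equal width:", width_bounds, width_groups)
print("equal size: ", size_bounds, size_groups)
```

With these made-up weights, the equal-size rule puts three individuals in each group, whereas the equal-width rule produces groups of four, three and two individuals.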

 

Saving plots

The user can choose to export each plot as an image with the icon on top of it, or all plots at once with the menu Export. It is also possible to export the plot data as tables, for example to build new plots with external tools.

Note that:

  • the export starts after the display of the plots,
  • the plots are exported in the result folder,
  • only the plots selected in the Plots task are exported,
  • legends and information frames are not exported.

Automatic exporting can be chosen in the project Preferences (in Settings), as well as the exporting format (png or svg):

 

Layout




The layout can be modified with buttons on top of each plot.

The first button can be used to select a set of subplots to display in the page. For example, as shown below, it is possible to display 9 individual fits per page instead of 12 (default number). The layout is then automatically adapted to balance the number of rows and columns.

The second button can be used to choose a custom layout (number of rows and columns). On the example figures below, the default layout with 3 subplots (left) is modified to arrange them on a single column (right).

 

4.1.Data

4.1.1.Observed data

Purpose

The purpose of this plot, also called a spaghetti plot, is to display the original data w.r.t. time.

Example

In the example below, the concentration of warfarin from the warfarin data set is displayed. A subject is highlighted in yellow by hovering over its line.

The output can be plotted on a log scale, for example to better evaluate the elimination phase, as in the figure below.

Another useful feature is the possibility to display the dosing times, as on the figure below. In this example (PKVK_project of the demos), the dosing times of an individual are displayed when the user hovers over that individual.

Summary information is also provided:

  • The total number of subjects
  • The average number of doses per subject
  • The total, average, minimum and maximum number of observations per individual.

In addition, if the plot is split by a covariate, all this information is recomputed for each group, as in the following plot.
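The summary information listed above is simple to reproduce from a data set in long format. The Python sketch below is only an illustration: the record layout (id, time, dose amount, observation), the values and the covariate `sex` are hypothetical and do not come from a Monolix demo.

```python
from collections import defaultdict

# Hypothetical records: (id, time, dose amount or None, observation or None).
records = [
    (1, 0.0, 100.0, None), (1, 1.0, None, 4.2), (1, 4.0, None, 2.9),
    (1, 12.0, 100.0, None), (1, 13.0, None, 5.1),
    (2, 0.0, 100.0, None), (2, 2.0, None, 3.8), (2, 8.0, None, 1.7),
    (3, 0.0, 150.0, None), (3, 1.0, None, 6.0), (3, 3.0, None, 4.4),
    (3, 6.0, None, 2.5), (3, 24.0, None, 0.3),
]


def summarize(rows):
    """Number of subjects, average doses per subject, and observation counts."""
    n_doses = defaultdict(int)
    n_obs = defaultdict(int)
    for subject, _time, amount, obs in rows:
        n_doses[subject] += amount is not None
        n_obs[subject] += obs is not None
    subjects = sorted(n_obs)
    obs_counts = [n_obs[s] for s in subjects]
    return {
        "subjects": len(subjects),
        "avg_doses": sum(n_doses[s] for s in subjects) / len(subjects),
        "total_obs": sum(obs_counts),
        "avg_obs": sum(obs_counts) / len(subjects),
        "min_obs": min(obs_counts),
        "max_obs": max(obs_counts),
    }


overall = summarize(records)
print("all subjects:", overall)

# Splitting by a (hypothetical) categorical covariate recomputes the summary per group.
sex = {1: "F", 2: "M", 3: "F"}
by_group = {
    level: summarize([r for r in records if sex[r[0]] == level])
    for level in sorted(set(sex.values()))
}
for level, summary in by_group.items():
    print(level, summary)
```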

Settings

  • General: Add/remove the legend or the grid,
  • Axes: Add/remove log-scale, modify labels,
  • Stratify: Split, color and filter by covariates,
  • Preferences: Add/remove elements or change colors and sizes for axes, observations, censored (BLQ) observations, highlighting.

Best practices

  • It is always good to look at the spaghetti plot before running the parameter estimation. It is a convenient way to check that the data are consistent and to spot outliers. Moreover, looking at the plot can help to form hypotheses about the model, such as covariate effects.
  • It is possible to generate the spaghetti plot just after loading the data. To do so, click on “Show dataviewer” next to the data file choice.
  • For a better understanding and/or exploration of the data set, it is also possible to export the data set to Datxplore.

4.2.Model for the observations

4.2.1.Individual fits

Purpose

The figure displays the observed data for each subject, as well as two curves from simulations using the design and the covariates of each subject:

  • the predicted profile given by the estimated population model (Population fits),
  • the predicted profile given by the estimated individual model (Individual fits). If the EBEs and/or the conditional distribution tasks were performed, the user can choose either the conditional means or the conditional modes, estimated by MCMC, as estimators. Otherwise, approximations of the conditional means from SAEM are used.

This is a good way to assess, subject by subject, the validity of the model and the quality of the fit, as well as the inter-individual variability in the kinetics. It is possible to show the computed individual parameters on the figure. Moreover, it is also possible to display an individual predictive check: the median and a prediction interval for \(y_{ij}\) estimated with a Monte Carlo procedure.

Examples

  • Individual fits and population fits

In the example below, the concentration for the theophylline data set is shown with simulations of a one-compartment model with first-order absorption and linear elimination. For each subject, the data are displayed with blue points along with the individual fit and population fit (the prediction using the estimated individual and population parameters respectively).

  • Individual parameters

Information on individual parameters can be used in two ways, as shown below. By clicking on Information (marked in green on the figure) in the General panel, individual parameter values can be displayed on each individual plot. Moreover, the plots can be sorted according to the values for a given parameter, in ascending or descending order (Sorting panel marked in orange). By default, the individual plots are sorted by subject id, with the same order as in the data set.

  • Individual predictive check

Individual predictive checks can be added to the plots: for each individual, a prediction interval is computed based on multiple simulations with the population parameters and the design structure of this individual. The median line of the interval is also drawn. The interval makes it possible to check whether the observed data are compatible with the population prediction, taking into account the inter-individual variability. The example below shows that the first subject in the theophylline data set deviates too much from the rest of the population to be correctly described by the population model.

  • Dosing times

Dosing times can also be overlaid, which is useful to visualize the effect of doses on the prediction. As an example, the following figure shows the observations of an individual from the tobramycin data set along with the corresponding individual fit and multiple dosing times.

  • Special zoom

User-defined constraints for the zoom are available. They make it possible to zoom along one axis only instead of both axes. Moreover, a link between plots can be set in order to perform a linked zoom on all individual plots at once. This is shown on the figure below with observations from the remifentanil example, and individual fits from a two-compartment model. It is thus possible to focus on the same time range or observation values for all individuals. In this example, it is used to zoom in on the time range of the elimination phase for all individuals, while keeping the Y axis in log scale unchanged for each plot.

  • Censored data

A censored data point differs from a “classical” observation and is therefore represented differently: it is drawn as a bar extending from the censored value specified in the data set to the associated limit.

If there is no limit column, the bar extends to infinity, as in the following example. In any case, the user can choose the limits of the plot.

Settings

  • Grid arrange. The user can define the number of subjects that are displayed, as well as the number of rows and the number of columns. Moreover, a slider is present to be able to change the subjects under consideration.
  • General
    • Legend: hide/show the legend. The legend adapts automatically to the elements displayed on the plot. The same legend box applies to all subplots, and the legend can be dragged and dropped to the desired place.
    • Grid: hide/show the grid in the background of the plots.
    • Information: hide/show the individual parameter values for each subject (conditional mode or conditional mean, depending on the “Individual estimates” choice in the “Display” settings section).
    • Dosing times: hide/show dosing times as vertical lines for each subject.
    • Link between plots: activate the linked zoom for all subplots. The same zooming region can be applied to all individuals on the x-axis only, on the y-axis only, or on both (option “none”).
  • Display
    • Observed data: hide/show the observed data.
    • Censored intervals [if censored data present]: hide/show the data marked as censored (BLQ), shown as a rectangle representing the censoring interval (for instance [0, LOQ]).
    • Split occasions [if IOV present]: Split the individual subplots by occasions in case of IOV.
    • Individual fits: Model prediction for each individual using the subject’s design and the individual parameters. The individual parameters can be the conditional mode or the conditional mean depending on the choice in the “Individual estimates” section.
    • Population fits [if no covariates in the model]: Model prediction for each individual using the subject’s design and the population parameters.
    • Population fits (individual covariates) [if covariates present in the model]: Model prediction for each individual using the subject’s design, the population parameters and the individual covariates values.
    • Population fits (population covariates) [if covariates present in the model]: Model prediction for each individual using the subject’s design, the population parameters and the median covariates values (median from all individuals of the data set).
    • Individual estimates [if EBEs task has run]: depending on the tasks that have been run, choice between conditional mode (given by the EBEs task), conditional mean (approximation given by the population parameter estimation task) or conditional mean (given by the conditional distribution task).
    • Individual predictive check: For each individual, 500 (see “number of simulations” in the PLOTS task settings) data sets are simulated using the individual’s design (dose and regressor values). The parameter values used for the simulation include the population parameter values, the individual covariate values and random effects sampled from the population distribution. The simulated data sets include residual errors. The prediction interval represents the interval containing 90% (see “level” setting) of the simulated data points. The predicted median is the median of all simulated data points. The individual predictive check makes it possible to visualize the inter-individual variability (unexplained by covariates) and to compare the population prediction to the individual observations.
  • Sorting: Sort the subjects by ID or individual parameter values in ascending or descending order.

By default, only the observed data and the individual fits are displayed.
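The individual predictive check described in the settings above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Monolix implementation: the one-compartment model with first-order absorption, the log-normal parameter distributions, the proportional residual error, the absence of covariates and all numerical values (`pop`, `omega`, `b`, the dose, the observations) are hypothetical.

```python
import math
import random


def concentration(t, dose, ka, V, Cl):
    """One-compartment model with first-order absorption and linear elimination."""
    k = Cl / V
    return dose * ka / (V * (ka - k)) * (math.exp(-k * t) - math.exp(-ka * t))


def percentile(sorted_values, p):
    """Linear-interpolation percentile (0 <= p <= 1) of an already sorted list."""
    pos = p * (len(sorted_values) - 1)
    lower, upper = math.floor(pos), math.ceil(pos)
    weight = pos - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def individual_predictive_check(times, dose, pop, omega, b, n_sim=500, level=0.90, seed=1):
    """Return (time, lower bound, median, upper bound) for one individual's design."""
    rng = random.Random(seed)
    simulated = [[] for _ in times]
    for _ in range(n_sim):
        # Random effects are drawn from the population distribution (log-normal here).
        psi = {name: value * math.exp(rng.gauss(0.0, omega[name]))
               for name, value in pop.items()}
        for values, t in zip(simulated, times):
            f = concentration(t, dose, **psi)
            # Proportional residual error: y = f + b * f * eps, eps ~ N(0, 1).
            values.append(f + b * f * rng.gauss(0.0, 1.0))
    tail = (1.0 - level) / 2.0
    band = []
    for t, values in zip(times, simulated):
        values.sort()
        band.append((t, percentile(values, tail), percentile(values, 0.5),
                     percentile(values, 1.0 - tail)))
    return band


# Hypothetical population values, random-effect standard deviations, and design.
pop = {"ka": 1.5, "V": 32.0, "Cl": 2.8}
omega = {"ka": 0.4, "V": 0.15, "Cl": 0.25}
times = [0.5, 1.0, 2.0, 4.0, 8.0, 12.0, 24.0]
observed = [3.9, 6.1, 8.0, 7.2, 5.3, 3.9, 1.4]

band = individual_predictive_check(times, dose=320.0, pop=pop, omega=omega, b=0.1)
for (t, low, med, high), y in zip(band, observed):
    flag = "" if low <= y <= high else "  <- outside the interval"
    print(f"t={t:5.1f}  y={y:4.1f}  interval=[{low:5.2f}, {high:5.2f}]  median={med:5.2f}{flag}")
```

Observations flagged as outside the interval are those that the population model, including inter-individual variability and residual error, would rarely produce for this design.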

4.2.2.Observation versus Prediction

Purpose

This figure displays observations (\(y_{ij}\)) versus the corresponding predictions (\(\hat{y}_{ij}\)) computed using either the population parameters or the individual parameters. This figure is useful to detect misspecifications in the structural model. The 90% prediction interval, which depends on the residual error model, can be overlaid. Points that fall outside of the interval are denoted as outliers. A high proportion of outliers suggests misspecifications in the model. Moreover, the distribution of the observations should be symmetrical around the corresponding predicted values.

Population and individual predictions vs observations

The following example corresponds to the observations and predicted concentrations for the PK of warfarin, modeled by a one-compartment model with a first-order absorption and a linear elimination. On the left, predictions are made using the population parameters while on the right they correspond to the individual parameters. More points appear with the individual predictions: for each observation point, ten predictions are displayed, corresponding to ten simulated individual parameters.

Visual guides

In addition to the line y = x, it is possible to display the 90% prediction interval, as well as a spline interpolation.
The 90% prediction interval represents the uncertainty of predictions due to the residual error model defined in the observation model. In the figure below, the shape of this interval can be seen for the four existing residual error models (constant, proportional, combined1, combined2) when the observation model is defined with a normal distribution:

The next figure corresponds to data that follow a log-normal distribution:

Choosing an observation model with a logit-normal distribution for the data is useful to take into account bounded data. The figure below shows the shape of the prediction intervals for the different error models associated with data that follow a logit-normal distribution in [0.1, 10]:

 

The prediction interval for the same example as above on the PK of warfarin characterizes a residual error model that combines a constant and a proportional term:

On the figure above, it can be noted that several zero observations measured at early times correspond to nonzero predictions and fall outside the 90% prediction interval, and thus cannot be explained by the residual error. This could be explained by a delay between the administration and the absorption of warfarin; a model with delayed absorption might therefore fit the data better.

Outliers proportion

The proportion of outliers can be displayed: it is the proportion of points that fall outside the 90% prediction interval.
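For an observation model with a normal distribution, the 90% prediction interval and the proportion of outliers can be sketched as follows. The standard deviations used for the four error models (a for constant, b·f for proportional, a + b·f for combined1, and the square root of a² + b²f² for combined2) are the usual definitions and are stated here as an assumption; the section on the residual error model remains the reference. The observations, predictions and values of a and b are made up.

```python
import math
from statistics import NormalDist


def error_sd(f, model, a=0.0, b=0.0):
    """Standard deviation of the residual error around a prediction f (assumed forms)."""
    if model == "constant":
        return a
    if model == "proportional":
        return b * abs(f)
    if model == "combined1":
        return a + b * abs(f)
    if model == "combined2":
        return math.sqrt(a ** 2 + (b * f) ** 2)
    raise ValueError(f"unknown error model: {model}")


def prediction_interval(f, model, a=0.0, b=0.0, level=0.90):
    """Symmetric interval around f that contains `level` of the observations."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.645 for level = 0.90
    sd = error_sd(f, model, a, b)
    return f - z * sd, f + z * sd


def outlier_proportion(observations, predictions, model, a=0.0, b=0.0, level=0.90):
    """Fraction of (observation, prediction) pairs outside the prediction interval."""
    outside = 0
    for y, f in zip(observations, predictions):
        low, high = prediction_interval(f, model, a, b, level)
        outside += not (low <= y <= high)
    return outside / len(observations)


# Made-up predictions and observations; the first observation is a zero at an early time.
predictions = [1.0, 2.0, 4.0, 8.0]
observations = [0.0, 2.1, 4.3, 7.0]

for f in predictions:
    low, high = prediction_interval(f, "combined1", a=0.2, b=0.1)
    print(f"prediction {f:4.1f}: 90% interval [{low:5.2f}, {high:5.2f}]")

proportion = outlier_proportion(observations, predictions, "combined1", a=0.2, b=0.1)
print("outlier proportion:", proportion)
```

If the residual error model is adequate, about 10% of the points are expected to fall outside a 90% interval, so a markedly higher proportion is a warning sign.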

Individual estimates

As for all diagnosis plots based on individual parameters, it is possible to choose the individual estimates that are used to compute the plot of observations vs individual predictions, among the different estimates computed during the individual parameter estimation: conditional modes (EBEs) or means of the conditional distributions, or simulated individual parameters drawn from the conditional distributions (by default). In the latter case, each observation is associated with a set of individual predictions derived from a set of individual parameters simulated from the same individual conditional distribution. On the two figures below, one can compare the plot based on simulated parameters from the conditional distribution (top) and the same plot based on conditional modes (bottom).

Highlight

As shown on the figures below, hovering on a point of observed data reveals the subject id and time corresponding to this point. All the points corresponding to this subject are highlighted in yellow. On the left, there are several predictions per observation, and the ten points corresponding to the hovered observation are indicated with a bigger diameter. On the right, there is only one prediction per observation, and all points corresponding to the same individual are linked with segments to visualize the time chronology.

Log scale

A log scale is useful to focus on low observation values. It can be set for each axis separately or both together.
A second example below displays the predicted concentrations of remifentanil, modeled by a two-compartment model with a linear elimination. In this example, the log-log scale reveals a clear misspecification of the model: the small observations are under-predicted. These observations correspond to late times: this means that the elimination is not properly captured by the two-compartment model. A three-compartment model might give better results.
Note that here 10 predictions are displayed for each observation, corresponding to different simulated parameters drawn from the conditional distribution during the individual parameter estimation task.

Settings

  • General
    • Legend and grid : add/remove the legend or the grid. There is only one legend for both plots.
    • Outliers proportion: display/hide the proportion of points outside the 90% prediction interval.
  • Subplots
    • Population prediction: add/remove the figure with the comparison between the population predictions and the observations.
    • Individual prediction: add/remove the figure with the comparison between the individual predictions and the observations.
  •  Display
    • Observed data: Add/remove the points corresponding to pairs of observations and predictions.
    • BLQ data : show and put in a different color the data that are BLQ (Below the Limit of Quantification)
    • Individual estimates: select the conditional mean, the conditional mode, or simulated estimates from the conditional distribution (by default).
    • Visual cues: add/remove visual guidelines such as the line y = x, a spline interpolation, and the 90% prediction interval indicated with dotted lines.

By default, only the individual predictions are displayed.

4.2.3.Scatter plot of the residuals

Purpose

These plots display the PWRES (population weighted residuals), the IWRES (individual weighted residuals), and the NPDEs (normalized prediction distribution errors) as scatter plots with respect to the time or the prediction.
The PWRES and NPDEs are computed using the population parameters and the IWRES are computed using the individual parameters. For discrete outputs, only NPDEs are used.
These plots are useful to detect misspecifications in the structural and residual error models: if the model is true, residuals should be randomly scattered around the horizontal zero-line.

Definition

 

Population Weighted Residuals \(\text{PWRES}_{ij}\)

\(\text{PWRES}_{ij}\) are defined as the normalized difference between the observations and their mean. Let \(y_i = (y_{ij}, 1 \leq j \leq n_i)\) be the vector of observations for subject i. The mean of \(y_i\) is the vector \(\mathbb{E}(y_i)=(\mathbb{E}(f(t_{ij};\psi_i)), 1 \leq j \leq n_i)\). Let \( \textrm{V}_i\) be the \(n_i \times n_i\) variance-covariance matrix of \(y_i\). Then, the ith vector of the population weighted residuals \( \text{PWRES}_i = \{\text{PWRES}_{ij}, 1\leq j \leq n_i\} \) is defined by

$$\text{PWRES}_i = \text{V}_i^{-1/2}(y_i-\mathbb{E}(y_i))$$

\(\mathbb{E}(y_i) \) and \(V_i\) are not known in practice but are estimated empirically by Monte-Carlo simulation without any approximation of the model.
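
The Monte Carlo estimation described above can be sketched as follows. This is a minimal illustration, not the Monolix implementation: the structural model, the parameter values and the function names are hypothetical, and the inverse Cholesky factor is used as one possible choice of \(\text{V}_i^{-1/2}\) (an assumption; other matrix square roots give residuals that differ by a rotation).

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = sum(low[r][k] * low[c][k] for k in range(c))
            if r == c:
                low[r][c] = math.sqrt(a[r][r] - s)
            else:
                low[r][c] = (a[r][c] - s) / low[c][c]
    return low

def pwres(y_obs, simulated):
    """Whiten y_obs with the Monte Carlo mean and covariance of `simulated`.

    simulated: list of K replicates, each a list of n_i simulated observations
    for the same subject and the same design.
    """
    k, n = len(simulated), len(y_obs)
    mean = [sum(rep[j] for rep in simulated) / k for j in range(n)]
    cov = [[sum((rep[p] - mean[p]) * (rep[q] - mean[q]) for rep in simulated) / (k - 1)
            for q in range(n)] for p in range(n)]
    low = cholesky(cov)
    # Forward substitution: solve L z = (y - mean), i.e. z = L^{-1} (y - mean).
    z = []
    for r in range(n):
        s = sum(low[r][c] * z[c] for c in range(r))
        z.append((y_obs[r] - mean[r] - s) / low[r][r])
    return z

# Hypothetical model: f(t; psi) = psi * exp(-0.3 t), log-normal psi, additive error.
random.seed(0)
times = [1.0, 2.0, 4.0]

def simulate_subject():
    psi = 10.0 * math.exp(random.gauss(0.0, 0.3))
    return [psi * math.exp(-0.3 * t) + random.gauss(0.0, 0.5) for t in times]

replicates = [simulate_subject() for _ in range(2000)]
print(pwres(simulate_subject(), replicates))
```

With enough replicates, residuals computed this way for data simulated from the same model should look like independent standard normal draws.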

 

Individual weighted residuals \(\text{IWRES}_{ij}\)

\(\text{IWRES}_{ij}\) are estimates of the standardized residual (\(\epsilon_{ij}\)) based on individual predictions, with \(g\) the function defining the residual error model:

$$\text{IWRES}_{ij} = \dfrac{ y_{ij}-f(t_{ij};\hat{\psi}_i)}{g(t_{ij};\hat{\psi}_i)}$$

If the residual errors are assumed to be correlated, the individual weighted residuals can be decorrelated by multiplying each individual vector \(\text{IWRES}_i = (\text{IWRES}_{ij} ; 1\leq j\leq n_i)\) by \(\hat{\text{R}}_i^{-1/2}\), where \(\hat{\text{R}}_i\) is the estimated correlation matrix of the vector of residuals \((\epsilon_{ij}; 1\leq j \leq n_i)\).
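
As a small worked illustration of the IWRES formula, the sketch below assumes a combined error model \(g = a + b\,f\). The function name, the choice of error model and the numerical values are assumptions made for the example only.

```python
def iwres(y, pred, a, b):
    """IWRES for a hypothetical combined error model g = a + b * f.

    y: observations, pred: individual predictions f(t_ij; psi_hat_i).
    """
    return [(yj - fj) / (a + b * fj) for yj, fj in zip(y, pred)]

# With a = 0.5 and b = 0.05, a prediction of 10 has a residual standard deviation of 1.
print(iwres([11.0, 4.2], [10.0, 5.0], a=0.5, b=0.05))
```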

 

Normalized prediction distribution errors \(\text{NPDE}_{ij}\)

\(\text{NPDE}_{ij}\) are a nonparametric version of \(\text{PWRES}_{ij}\) based on a rank statistic. For any (i,j), let \(\text{F}_{ij} = \text{F}_{\text{PWRES}_{ij}}(\text{PWRES}_{ij})\) where \(\text{F}_{\text{PWRES}_{ij}}\) is the cumulative distribution function (cdf) of \(\text{PWRES}_{ij}\). NPDEs are then obtained from \(\text{F}_{ij}\) by applying the inverse of the standard normal cdf \(\Phi\).

In practice, one simulates a large number \(K\) of data sets \(y^{(k)}\) using the model, and estimates \(\text{F}_{ij}\) as the fraction of simulated data less than or equal to the original observation, i.e.:

$$\hat{\text{F}}_{ij}=\frac{1}{K}\sum_{k=1}^K 1_{y_{ij}^{(k)}\leq y_{ij}^{\text{obs}}}$$

By construction, \(\text{F}_{ij}\) is uniformly distributed on [0,1]. We therefore use \(\Phi^{-1}(\text{F}_{ij})\), which follows a standard normal distribution (with \(\Phi\) the cdf of the standard normal distribution). NPDEs are defined as an empirical estimate of \(\Phi^{-1}(\text{F}_{ij})\), i.e. \(\text{NPDE}_{ij}=\Phi^{-1}(\hat{\text{F}}_{ij})\).
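
The empirical estimate above can be sketched in a few lines. The function name is hypothetical, and the clamping of \(\hat{\text{F}}_{ij}\) away from 0 and 1 is an assumption added so that the inverse cdf stays finite; the guide does not state how these edge cases are handled.

```python
from statistics import NormalDist

def npde(y_obs, y_sim):
    """NPDE for one observation given K simulated values of the same observation."""
    k = len(y_sim)
    # Fraction of simulated values less than or equal to the observation.
    frac = sum(1 for v in y_sim if v <= y_obs) / k
    # Assumption: keep the fraction strictly inside (0, 1).
    frac = min(max(frac, 1 / (2 * k)), 1 - 1 / (2 * k))
    return NormalDist().inv_cdf(frac)

print(npde(0.5, [0.0, 1.0]))   # observation at the median of the simulations
```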

 

 

Examples

In the following example, the parameters of a two-compartment model with iv infusion and linear elimination are estimated on the remifentanil data set. One can see the PWRES, the IWRES and the NPDE w.r.t. the time (on top), and the prediction (at the bottom).
Since the points are clearly scattered unevenly around the horizontal zero-line, these plots suggest a misspecification of the structural model.
The corresponding distributions can be seen on this page.

It is possible to select some of the subplots to focus on, with the panel Subplots in Settings:

 

 

 

Presets

A number of elements can be overlaid or hidden from the plots in the panel Display. Only the horizontal zero-line, representing the theoretical mean, is always displayed. Two presets with predefined selections of displayed elements are available: the first one, called “Scatter”, hides all elements except the points for residuals, while the second, called “VPC”, displays instead empirical and predicted percentiles for the residuals as lines, as well as prediction intervals as colored areas. This figure is detailed below.

 

Predictive checks

The preset “VPC” displays prediction intervals for the median, 10th and 90th percentiles, obtained with simulations of the residuals, as well as the empirical percentiles to compare the behavior of the model to the data. Residual points are hidden, but the trend is represented with a spline interpolation.

Misspecification in the structural model, the error model, and the covariate model can be detected by discrepancies between the observed percentiles and their prediction intervals, as can be seen for example on the plots of IWRES vs time and NPDE vs time below, with log-scale on the x-axis. Population residuals greatly depart from the data at all time points, while individual residuals show better predictions for low times only.

Outliers (empirical percentiles outside the prediction intervals) can be marked with red points or red areas:

 

Comparing PWRES and NPDEs

NPDEs are quite similar to PWRES, but are simulation-based, and therefore account for the heterogeneity in study design by comparing the observations with their own distribution. NPDEs are thus displayed by default rather than PWRES.

 

Comparing IWRES and NPDEs

The IWRES are based on individual predictions, therefore the values on the X axis with respect to predictions are not the same as for NPDEs and PWRES, as can be seen on the plots below. If the tasks EBEs and Conditional distribution have been run, several different individual estimates are available to be used for the individual predictions. The next section shows how to choose the estimates.

 

 

Preventing shrinkage in IWRES

The individual estimates used to compute the IWRES can be chosen in the Display panel:

By default, the individual estimates are drawn from the conditional distributions rather than coming from usual estimators such as conditional modes (EBEs) or conditional means. This choice is recommended in order to prevent shrinkage, a phenomenon that occurs when the individual data are not sufficiently informative with respect to one or more parameters. If overfitting occurs, IWRES computed from biased estimators might thus shrink toward 0.

 

 

Highlight

Hovering on a point highlights all the points from the same individual in yellow on all plots, and reveals the corresponding subject id and time. If the individual estimates selected in Display are simulated from the conditional distribution, each observation corresponds to a set of IWRES computed from a set of simulated individual parameters. When the observation is hovered, the points from this set are indicated with a bigger diameter.

If the individual estimates selected in Display are conditional modes (EBEs) or conditional means, there is only one residual per observation, and all points corresponding to the same individual are linked with segments to visualize the time chronology.

 

Binning

As for VPC, the data binning used to compute percentiles can be changed. Several strategies exist to segment the data: equal-width binning, equal-size binning, and a least-squares criterion. The number of bins can either be set by the user or selected automatically to obtain a good trade-off.
On the three figures below, where NPDEs are displayed with respect to log-scaled time, 5 bins are selected with equal width on the left, equal size in the center, and the least-squares criterion on the right. Observations are overlaid in light purple to visualize the data density in each bin. Equal width in particular shows low density for some bins, and results in a less informative plot for low times where data density is high.

On the figure below, the number of bins for the least-squares criterion is automatically set, allowing a more precise display.
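
To make the difference between the first two strategies concrete, here is a minimal sketch of equal-width and equal-size bin edges. The function names and the sampling times are hypothetical, and the least-squares criterion is not covered.

```python
def equal_width_edges(x, n_bins):
    """Bin edges splitting [min(x), max(x)] into intervals of equal length."""
    lo, hi = min(x), max(x)
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins + 1)]

def equal_size_edges(x, n_bins):
    """Bin edges chosen so that each bin holds about the same number of points."""
    xs = sorted(x)
    n = len(xs)
    return [xs[min(n - 1, (i * n) // n_bins)] for i in range(n_bins + 1)]

# Hypothetical sampling times, dense at early times and sparse at late times.
times = [0.1, 0.2, 0.3, 0.4, 0.5, 1, 2, 4, 8, 16]
print(equal_width_edges(times, 2))   # the first bin holds most of the points
print(equal_size_edges(times, 2))    # the edge moves to where the data are dense
```

With densely sampled early times, equal-width bins leave the late bins almost empty, which is the low-density effect described above.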

 

 

Censored data

The residuals for censored data appear in a different color. They are by default based on simulated observations that take into account the censoring interval.

An option available in the panel “Display” can be used to select the method of calculation for the residuals corresponding to censored data: either based on simulated observations (by default), or based on LOQ (values from the observation column in the dataset).

Discrete data

For categorical or count data, only NPDEs are used. Here again, NPDEs correspond to the rank of each observation among a set of simulations based on the model. However, to prevent problems with discrete values, both observations and simulations are slightly perturbed with a uniform distribution before computing the ranks.
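
The perturbation idea can be sketched as below. The guide does not specify the width of the uniform perturbation, so the interval (-0.5, 0.5) used here is an assumption, as are the function name and the clamping of the empirical fraction.

```python
import random
from statistics import NormalDist

def npde_discrete(y_obs, y_sim, rng):
    """NPDE for a discrete observation, breaking ties with uniform jitter."""
    def jitter(v):
        # Assumption: uniform perturbation on (-0.5, 0.5) around each integer value.
        return v + rng.uniform(-0.5, 0.5)

    obs = jitter(y_obs)
    sims = [jitter(v) for v in y_sim]
    k = len(sims)
    frac = sum(1 for v in sims if v <= obs) / k
    frac = min(max(frac, 1 / (2 * k)), 1 - 1 / (2 * k))  # keep inverse cdf finite
    return NormalDist().inv_cdf(frac)

rng = random.Random(1)
# Many simulated counts equal the observed count; jitter spreads the ranks.
print(npde_discrete(2, [1, 2, 2, 2, 3, 2, 2, 1, 2, 3], rng))
```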

 

 

Settings

  • Subplots
    • Residuals
      • Population residuals: Add/remove scatterplots for PWRES. Hidden by default.
      • Individual residuals: Add/remove scatterplots for IWRES, using the individual parameter estimated using the conditional mode or the conditional mean. By default, individual parameters come from the conditional mode estimation.
      • NPDE: Add/remove scatterplots for NPDE.
    • X-axis
      • time: Add/remove the scatterplots w.r.t. the time.
      • prediction: Add/remove the scatterplots w.r.t. the prediction.
  • Display
    • Presets: apply the preselections of elements for scatter plots or VPC
    • Residuals: Add/remove observed data.
    • Censored data: Add/remove BLQ data (with a different color) if present.
    • Empirical percentiles: Add/remove empirical percentiles for the 10%, 50% and 90% quantiles.
    • Predicted percentiles: Add/remove theoretical percentiles for the 10% and 90% quantiles.
    • Prediction interval: Add/remove prediction intervals given by the model for the 10% and 90% quantiles (in blue) and the 50% quantile (in pink), with user-defined level (by default, 90).
    • Outliers: Add/remove dots or areas to mark outliers.
    • Individual estimates: Choose the individual estimates among conditional modes (EBEs), conditional means (computed with SAEM), or simulated parameters from the conditional distributions.
    • Calculations – linear interpolation: Choose the display for prediction intervals: by default linear interpolation is used, otherwise the display is piecewise.
    • Calculations – Use censored data: Choose the display for censored data: by default simulated BLQ observations are used, otherwise the LOQ from the observation column in the data set can be used.
    •  Visual cues: Add/remove spline interpolation.
    • Bins
      • Bin values: Add/remove vertical lines on the scatterplots to indicate the bins.
      • Binning criteria: Choose the binning criterion among equal width (default), equal size or least-squares.
      • Number of bins: Choose a fixed number of bins or a range, with the range for the number of data points per bin.

4.2.4.Distribution of the residuals

Purpose

These plots display the empirical distributions of the residuals: the PWRES (population weighted residuals), the IWRES (individual weighted residuals), and the NPDEs (normalized prediction distribution errors) for continuous outputs, together with the standard Gaussian probability density function and cumulative distribution function.
If the model is true, the PWRES, IWRES and NPDEs should behave as independent standardized normal random variables. These plots are thus useful to detect misspecifications in the structural and residual error models.

Example

In the following example, the parameters of a two-compartment model with iv infusion and linear elimination are estimated on the remifentanil data set.
Below, one can see on top the comparison between the empirical and theoretical probability density function (PDF) of the PWRES, IWRES and NPDE, and at the bottom the comparison between the empirical and theoretical cumulative distribution function (CDF).

The corresponding scatter plots can be seen on this page.

Settings

By default, all the residuals and all the plots are displayed.

  • Subplots
    • Residuals: choose the plots to display according to residuals

      • Population residuals: PWRES
      • Individual residuals: IWRES, using the individual parameter estimated using the conditional distribution, the conditional mode, or the conditional mean.
      • NPDE
    • X-axis
      • PDF: Probability density function of residuals and empirical distribution as histograms.
      • CDF: Theoretical and empirical cumulative distribution functions.
  • Display
    • Individual estimates: choose the estimator for individual parameters: parameters drawn from the conditional distributions, or the modes or means of the conditional posterior distributions.

4.3.Model for the individual parameters

4.3.1.Distribution of the individual parameters

Purpose

This figure can be used to compare:

  • the empirical distribution of the individual parameters, estimated with the conditional means, the conditional modes, or simulated from the conditional distributions,
  • the theoretical distribution defined in the statistical model, with the estimated population parameter.

Further analysis such as stratification by covariate or shrinkage assessment can be performed and will be detailed below.

 

PDF and CDF

In the warfarinPK_project, several parameters are estimated. It is possible to display the theoretical distribution and the histogram of the empirical distribution, as shown below.

The distributions are represented as histograms for the probability density function (PDF). Hovering on the histogram also reveals the density value of each bin, as shown on the figure below.

Notice that the theoretical pdf is a pure log-normal distribution. However, when covariates are used on a parameter, the distribution is no longer a pure log-normal but rather a combination of log-normal distributions. If, for example, one sets the SEX covariate on the parameter V, a parameter beta_V_SEX_1 is created and the individual parameter distribution becomes the following.

The cumulative distribution function (CDF) can be displayed too.

Again, hovering over the plot displays the parameter value and its empirical and theoretical cdf.

 

Getting away with shrinkage using simulated individual parameters

If the data does not contain enough information to estimate correctly some individual parameters, individual estimates that come from the means or the modes of the individual conditional distributions are shrunk towards the same population value, which is respectively the mean and the mode of the population distribution of the parameter. For a parameter \(\psi_i\) which is a function of a random effect \(\eta_i\), this phenomenon can be quantified by defining the \(\eta\)-shrinkage as:

$$\eta\text{-shrinkage} = 1 -\frac{\text{Var}\left(\hat{\eta}_i\right)}{\hat{\omega}^2} $$

where \(\text{Var}\left(\hat{\eta}_i\right)\) is the empirical variance of the estimated random effects \(\hat{\eta}_i\)’s and \(\hat{\omega}^2\) is the estimated variance of the random effect in the population. It is possible to display the shrinkage value on top of the histograms, as can be seen below:

The “simulated individual parameters” option uses instead individual parameters drawn from the conditional distribution, simulated by the MCMC procedure. This method is recommended as it permits to obtain unbiased estimators that are not affected by possible shrinkage, and leads to more reliable results. For more details see the page Understanding shrinkage and how to circumvent it.

In the same example, the simulated individual parameters lead to a much lower shrinkage, as can be seen below.


The following table compiles the shrinkage calculation (in %) for all methods

| Method \ Parameters | Tlag | ka | V | Cl |
|---|---|---|---|---|
| Conditional mean | 71.5 | 69.8 | 8.87 | 0.23 |
| Conditional mode | 74.2 | 74.7 | 10.3 | -0.2 |
| Simulated individual parameters | -17.1 | 3.66 | 2.63 | 1.01 |
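
The \(\eta\)-shrinkage defined above can be computed as in the following sketch. The function name and the numerical values are hypothetical, and the use of the unbiased sample variance (denominator n-1) is an assumption, since the guide only refers to the empirical variance.

```python
from statistics import variance

def eta_shrinkage(eta_hat, omega_hat):
    """1 - Var(eta_hat) / omega_hat^2, with eta_hat the estimated random effects.

    omega_hat is the estimated standard deviation of the random effect.
    """
    return 1.0 - variance(eta_hat) / omega_hat ** 2

# Estimates that vary less than the population standard deviation: positive shrinkage.
print(eta_shrinkage([-1.0, 0.0, 1.0], omega_hat=2.0))   # 0.75
# Estimates that vary more than omega_hat give a negative value.
print(eta_shrinkage([-1.0, 0.0, 1.0], omega_hat=0.5))   # -3.0
```

Negative values, as in the table above, simply mean that the empirical variance of the estimates exceeds \(\hat{\omega}^2\).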

Example of stratification

It is possible to stratify the population by some covariate values and obtain the distributions of the individual parameters in each group. This can be useful to check covariate effects, in particular when the distribution of a parameter exhibits two or more peaks for the whole population. In the following example, the distribution of the parameter k from the same example as above has been split into two groups of individuals according to the value of the continuous covariate AGE, which makes it possible to visualize two clearly different distributions.

Settings

  • General: add/remove the legend, the grid, and the shrinkage in %.
  • Display
    • Empirical: add/remove histogram of empirical distribution.
    • Theoretical: add/remove curve of theoretical distribution.
    • Distribution function: The user can choose to display either the probability density function (PDF) as histogram or the cumulative distribution function (CDF).
    • Individual estimates: The user can define which estimator is used for the definition of the individual parameters.

Simulated individual parameters are used by default, otherwise the conditional mode is the default estimation if it has been computed with the “Individual parameters estimation” task.

4.3.2.Distribution of the random effects

Purpose

This plot displays the distribution of the standardized random effects with boxplots or with histograms. Since random effects should follow normal probability laws, it is useful to compare the distributions to standard Gaussian distributions.

Example

In the following example, one can see the distributions of two parameters of a two-compartment bolus model with linear elimination, estimated on the tobramycin example. On the left, the distributions are represented as boxplots, in the middle as histograms for the probability density function (PDF), and on the right as cumulative distribution functions (CDF). In each case, marks to compare the results to standard Gaussian distributions are overlaid: dotted horizontal lines indicate the interquartile interval of a standard Gaussian distribution for the boxplots, and black curves represent the PDF and CDF of a standard Gaussian distribution.

On the figure below, the individuals have been split into two groups according to the value of the continuous covariate CLCR. One can notice differences on the boxplots for the distributions of random effects between both groups, in particular for the parameter k.

Settings

  • Display
    • Distribution function. The user can choose which type of plot is used to represent the distributions of the random effects: boxplots, pdf (probability density function) or cdf (cumulative distribution function).
    • Individual estimates. The user can define which estimator is used for the definition of the individual parameters and thus for the random effects (conditional mean, conditional mode, or simulated random effects)
    • Visual cues: If boxplot has been selected, the user can choose to add or hide dotted lines to mark the median or quartiles of a standard Gaussian distribution.

By default, the distributions of simulated random effects are displayed as boxplots.

4.3.3.Correlation between the random effects

Purpose

This plot displays scatter plots for each pair of random effects. It helps identify correlations between random effects, which can then be introduced in the models for the probability distributions of the individual parameters.

Example

In the following example, one can see pairs of random effects estimated for all parameters of a one-compartment model with delayed first-order absorption and linear elimination estimated on the PK of warfarin data set. The estimators for random effects have been simulated from conditional distributions of individual random effects.

Visual guidelines

In addition to regression lines, Pearson correlation coefficients can be added to see the correlation between random effects, as well as spline interpolations.
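
For reference, the Pearson correlation coefficient mentioned above can be computed as in this short sketch. The function name and the sample values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical random effects for two parameters across five individuals.
eta_1 = [-0.4, -0.1, 0.0, 0.2, 0.5]
eta_2 = [-0.3, -0.2, 0.1, 0.1, 0.6]
print(pearson(eta_1, eta_2))
```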

Selection

It is possible to select a subset of random effects, whose pairs of correlations are then displayed, as shown below. In the selection panel, a set of contiguous rows can be selected with a single extended click, or a set of non-contiguous rows can be selected with several clicks while holding the Ctrl key.

Highlight

Similarly to other plots, hovering on a point provides information on the corresponding subject id, and highlights other points corresponding to the same individual, if multiple individual parameters have been simulated by sampling the conditional distribution.
On the figure below, we can see the same plot with only one parameter value per individual (left) or ten values (right). Notice that the correlation coefficient is more reliable when multiple parameters are available, as they take into account the uncertainty of individual parameters.

Stratification: coloring and filtering

Stratification can be applied by creating groups of covariate values. As can be seen below, these groups can then be colored (left) or filtered (right), allowing to check the effect of the covariate on the correlation between two parameters. The correlation coefficient is updated according to the filtering.

Settings

  • General
    • Legend and grid : add/remove the legend or the grid. There is only one legend for all plots.
    • Correlation: display/hide the correlation coefficient associated with each scatter plot.
  • Display
    • Selection. The user can select some of the parameters to display only the corresponding scatter plots. A simple click selects one parameter, whereas multiple clicks while holding the Ctrl key selects a set of parameters.
    • Individual estimates. The user can define which estimates are used for the definition of the individual parameters and thus for the random effects (conditional mean, conditional mode, simulated parameters)
    • Visual cues. Add/remove the regression line or the spline interpolation.

By default, all scatter plots are proposed with simulated individual parameters.

4.3.4.Individual parameters versus covariates

Purpose

The figure displays the estimators of the individual parameters, and those of the random effects, as a function of the covariates. It helps identify correlations between the individual parameters and the covariates.

The estimators can be:

  • simulated parameters: individual parameters and individual random effects are sampled from the distributions \(p(\psi_i|y_i;\hat{\theta})\) and \(p(\eta_i|y_i;\hat{\theta})\). These estimators lead to more reliable results, especially when individual data are sparse and the distributions of conditional modes and means of individual parameters are affected by shrinkage.
  • the conditional means \(E(\psi_i|y_i;\hat{\theta})\) for parameters \(\psi_i\) and \(E(\eta_i|y_i;\hat{\theta})\) for random effects \(\eta_i\),
  • the conditional modes of the same distributions.

Identifying correlation effects

In the example below, we can see the parameters of a one-compartment PK model with delayed first-order absorption and linear elimination estimated on the warfarin data set. The simulated individual parameters for the 4 parameters of the PK model are displayed with respect to the covariates: the weight wt, a transformed version of the weight (lw70=log(wt/70)) and the sex category.

Visual guidelines

In order to help identifying correlations, regression lines, spline interpolations and Pearson correlation coefficients can be overlaid on the plots for continuous covariates. Here we can see a strong correlation between the parameter V and the covariate lw70.

Highlight

Hovering on a point reveals the corresponding individual and, if multiple individual parameters have been simulated from the conditional distribution for each individual, highlights all the points from the same individual. This is useful to identify possible outliers and subsequently check their behavior in the observed data.

Selection

It is possible to select a subset of covariates or parameters, as shown below. In the selection panel, a set of contiguous rows can be selected with a single extended click, or a set of non-contiguous rows can be selected with several clicks while holding the Ctrl key. This is useful when there are many parameters or covariates. In particular, transformed covariates are frequently introduced, and the selection makes it possible to focus on the transformed versions rather than the original ones.

Comparing individual parameters and random effects

By default, the values on the Y-axis are computed with the individual parameters. One can choose to display the random effects instead. If some individual parameters are already modelled with covariates, this is taken into account in the random effects values, which makes it possible to focus on the remaining correlations.
The figures below show the diagnostic plots with individual parameters or random effects when the model for the parameter V includes the covariate lw70. On the top, one can identify the correlations between individual parameters and covariates: the log-volume (log(V)) clearly increases with the log-transformed weight, as well as with sex. On the other hand, the random effects on the bottom make it possible to focus on correlations that are not yet taken into account in the covariate model. Because the model already includes a linear relationship between the log-volume and the log-transformed weight, \(\eta_{V}\) shows no correlation with lw70. There is no correlation either between \(\eta_{V}\) and sex, because of an existing correlation between lw70 and sex.

Stratification

Stratification can be applied by creating groups of covariate values. As can be seen below, these groups can then be split, colored or filtered, allowing to check the effect of the covariate on the correlation between two parameters. The correlation coefficient is updated according to the split or filtering.

Settings

  • General
    • Legend and grid : add/remove the legend or the grid. There is only one legend for all plots.
    • Correlation: display/hide the correlation coefficient associated with each scatter plot.
  • Display
    • Y-axis. The user can choose to see either the individual parameters or the random effects.
    • Selection. The user can select some of the parameters or covariates to display only the corresponding plots. A simple click selects one parameter (or covariate), whereas multiple clicks while holding the Ctrl key selects a set of parameters.
    • Individual estimates. The user can define which estimators are used for the definition of the individual parameters and thus for the random effects (conditional mean, conditional mode, simulated parameters)
    • Visual cues. Add/remove a regression line or a spline interpolation.

By default, all plots are proposed with simulated individual parameters.

4.4.Predictive checks and predictions

4.4.1.Visual predictive checks

Purpose

The VPC (Visual Predictive Check) offers an intuitive assessment of misspecification in structural, variability, and covariate models. The principle is to assess graphically whether simulations from a model of interest are able to reproduce both the central trend and variability in the observed data, when plotted versus an independent variable (typically time). It summarizes in the same graphic the structural and statistical models by computing several quantiles of the empirical distribution of the data after having regrouped them into bins over successive intervals.
More precisely, the goal is to compare the two following elements:

  • Empirical percentiles: percentiles of the observed data, calculated either for each unique value of time, or pooled by adjacent time intervals (bins). By default, the 10th, 50th and 90th percentiles are displayed as green lines. These quantiles summarize the distribution of the observations.
  • Theoretical percentiles: percentiles of simulated data are computed from multiple Monte Carlo simulations with the model of interest and the design structure of the original dataset (i.e., dosing, timing, and number of samples). For each simulation, the same percentiles are computed across the same bins as for empirical percentiles. Prediction intervals for each percentile are then estimated across all simulated data and displayed as colored areas (pink for the 50th percentile, blue for the 10th and 90th percentiles). By default, prediction intervals are computed with a level of 90%.

If the model is correct, the observed percentiles should be close to the predicted percentiles and remain within the corresponding prediction intervals.
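
The comparison described above can be sketched for a single bin as follows. This is an illustration under simplifying assumptions, not the Monolix implementation: the function names are hypothetical, percentiles use plain linear interpolation, and the data are assumed to be already assigned to the bin.

```python
def percentile(values, p):
    """Linear-interpolation percentile, with p in [0, 100]."""
    xs = sorted(values)
    pos = (len(xs) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def vpc_bin(observed, simulated, level=90):
    """Empirical percentiles and prediction intervals for one bin.

    observed: observations falling in the bin.
    simulated: list of replicates, each a list of simulated observations
               for the same bin and the same design.
    """
    tail = (100 - level) / 2
    out = {}
    for p in (10, 50, 90):
        # The same percentile is computed in every simulated replicate...
        sim_p = [percentile(rep, p) for rep in simulated]
        interval = (percentile(sim_p, tail), percentile(sim_p, 100 - tail))
        emp = percentile(observed, p)
        # ...and the empirical percentile is flagged when it leaves the interval.
        out[p] = {"empirical": emp, "interval": interval,
                  "outlier": not (interval[0] <= emp <= interval[1])}
    return out

obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sims = [[1.2, 2.1, 2.9, 4.3, 5.1], [0.8, 1.9, 3.2, 3.8, 4.9], [1.1, 2.2, 3.0, 4.1, 5.3]]
print(vpc_bin(obs, sims))
```

Repeating this over all bins and joining the results along the time axis gives the percentile lines and colored areas of the plot.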

 

 

Examples

VPCs vary slightly for different types of data. For joint models for multivariate outcomes, VPCs are available for each outcome.

 

  • Continuous outcomes

warfarinPK_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

In the following example, the parameters of a one-compartment model with delayed first-order absorption and linear elimination are estimated on the warfarin dataset. A constant residual error model was used. The figure presents the VPC with the prediction intervals for the 10th, 50th and 90th percentiles. Outliers are highlighted with red dots and areas. Here the three quantiles appear closer together than the model would suggest, therefore the VPC suggests that a proportional component should be added to the error model.

For joint models for continuous PK and time-to-event data, VPCs are available for each type of data. However, it is important to note that dropout events are not taken into account in the VPC corresponding to the continuous data. Therefore, in the case of non-random dropout events in the dataset, this can result in discrepancies between observed and simulated data and thus hamper the diagnostic value of the VPC. Correcting this bias would require including the simulated dropout in the VPC, as well as adapting the design structure to compensate for observed dropouts, an approach that is problematic when the design structure is complex.

 

  • Non-continuous outcomes: count data and categorical data

VPCs for count data and categorical data compare the observed and predicted frequencies of the categorized data over time. The predicted frequency is associated with a blue prediction interval.

The following figure shows the VPC for a project with a continuous time Markov chain model and time varying transition rates.

  • markov3b_project (data = ‘markov3b_data.txt’, model = ‘markov3b_model.txt’)

In addition to the categorization over time (binning on X), count data are also binned into groups of count values on the VPC (binning on Y). The number of bins and binning method can be set in Settings under “Y Bins”.
As an example, the VPC below corresponds to a project where a Poisson model is used for fitting the data. Observations are binned in 3 groups on the Y axis and 20 bins on the X axis.

  • count1a_project (data = ‘count1_data.txt’, model = ‘count_library/poisson_mlxt.txt’)

 

  • Time-to-event data

In case of time-to-event data, two visual predictive checks are available, based on the Kaplan-Meier plot (survival function) and the mean number of events per individual.

The example below shows these two figures, computed with a model for the survival of patients with advanced lung cancer from the Veterans’ Administration Lung Cancer study. Censored data has been selected and displayed on the Kaplan-Meier plot. Note that censored data also cause an overprediction bias in the VPC based on the mean number of events per individual, because censored individuals contribute to the prediction interval but not to the empirical curve.
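
For readers unfamiliar with the Kaplan-Meier estimate underlying the first of these plots, here is a minimal sketch for single-event data with right censoring. The function name and the data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: event or censoring time of each individual.
    events: 1 if the event was observed at that time, 0 if censored.
    Returns a list of (time, survival) pairs at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        removed = sum(1 for tt, _ in data if tt == t)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Censored individuals leave the risk set without changing the estimate.
        at_risk -= removed
        i += removed
    return curve

print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```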

 

Details

 

  • Binning criteria

Correctly defining the intervals (or bins) into which the data are grouped is crucial to construct a VPC that avoids distortion between the original and approximated distributions. Several strategies exist to segment the data: equal-width binning, equal-size binning, and a least-squares criterion. The number of bins can also be either set by the user, or automatically selected to obtain a good tradeoff. Indeed, a small number of bins leads to a poor approximation but a good estimation of the data’s distribution, while a large number of bins leads to a good approximation but poor estimation.

As an example, the VPCs below are computed on the PK model built for remifentanil pharmacokinetics, a dataset that involves a large variability in doses. The bins are delimited with vertical lines. The first VPC on the left is computed with 5 bins, the number automatically selected for this dataset. The second VPC on the right is computed with 15 bins. We notice that in this case the heterogeneity of the data results in a poor estimation of the data’s distribution. To keep a good estimation, a small number of bins is required, but the approximation then prevents visualizing the kinetics in detail. For example, the absorption phase is not visible.
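The difference between the equal-width and equal-size criteria can be sketched as follows. This is an illustrative example only, not the Monolix implementation: the function names and the sampling times are hypothetical, and the least-squares criterion is not covered.

```python
def equal_width_edges(times, n_bins):
    """Bin edges that split [min, max] into n_bins intervals of equal width."""
    lo, hi = min(times), max(times)
    width = (hi - lo) / n_bins
    return [lo + k * width for k in range(n_bins)] + [hi]


def equal_size_edges(times, n_bins):
    """Bin edges chosen so that each bin holds roughly the same number of points."""
    ordered = sorted(times)
    n = len(ordered)
    # Inner edge k is the sorted value at position k * n / n_bins.
    inner = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    return [ordered[0]] + inner + [ordered[-1]]


def count_per_bin(times, edges):
    """Number of points per bin; the last bin includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for t in times:
        for k in range(len(counts)):
            is_last = k == len(counts) - 1
            if edges[k] <= t < edges[k + 1] or (is_last and t == edges[-1]):
                counts[k] += 1
                break
    return counts


# Hypothetical sampling times: dense early, sparse late.
times = [0.25, 0.5, 0.75, 1, 1.5, 2, 3, 4, 6, 8, 12, 24]
width_counts = count_per_bin(times, equal_width_edges(times, 3))
size_edges = equal_size_edges(times, 3)
size_counts = count_per_bin(times, size_edges)
```

With these times, equal-width binning puts 10 of the 12 points in the first bin, while equal-size binning yields 4 points per bin at the price of bins of very different widths.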

 

  • Corrected predictions

As shown above, VPCs can be misleading if applied to data that include a large variability in dose and/or influential covariates, or that follow adaptive designs such as dose adjustments. The prediction-corrected VPC (pcVPC) was developed to maintain the diagnostic value of a VPC in these cases. In each bin, the observed and simulated data are normalized based on the typical population prediction for the median time in the bin. This removes the variability coming from binning across independent variables.
The example below shows the pcVPC computed on the PK model built for remifentanil pharmacokinetics with 15 bins: the figure now gives a good estimation of the data’s distribution, including the absorption phase.
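The normalization can be sketched as below, using the formulation commonly given for prediction correction, where each value is multiplied by the ratio of the bin reference prediction to its own population prediction. This is an assumption for illustration (with strictly positive predictions); the function name and the numbers are hypothetical, and the exact computation done by Monolix may differ in its details.

```python
def prediction_correct(values, pop_preds, bin_ref_pred):
    """Scale each value by bin_ref_pred / pop_pred (positive predictions assumed)."""
    return [y * bin_ref_pred / p for y, p in zip(values, pop_preds)]


# One hypothetical bin with two subjects whose doses differ tenfold.
obs = [2.0, 22.0]
pop_preds = [2.5, 25.0]  # typical population prediction for each observation
bin_ref_pred = 10.0      # typical population prediction at the bin's median time
corrected = prediction_correct(obs, pop_preds, bin_ref_pred)
```

After correction the two observations (about 8.0 and 8.8) are on the same scale, so the spread left in the bin no longer reflects the dose difference. The same transformation is applied to the simulated data.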

 

  • Stratification

When possible, another useful approach to deal with heterogeneous data can be to split the VPC into groups of subjects that are more homogeneous. As an example, the VPCs below are computed again on the PK model built for remifentanil pharmacokinetics, with 15 bins, but the data was first split by a categorical covariate that characterizes groups of similar doses.

Settings

  • General: Add/remove legend or grid
  • Subplots (for TTE data)
    • Add/remove plot for survival function (Kaplan-Meier plot) or plot for mean number of events per individual
  • Display
    • Observed data
      • Observed data: Add/remove observed data.
      • BLQ: Add/remove BLQ data if present.
      • Use BLQ: Choose to use BLQ data or to ignore it to compute the VPC. BLQ data can be simulated, or can be equal to the limit of quantification (LOQ). The latter case induces strong bias.
    • Empirical percentiles: Add/remove empirical percentiles for the 10%, 50% and 90% quantiles.
    • Predicted percentiles: Add/remove theoretical percentiles for the 10% and 90% quantiles.
    • Prediction interval: Add/remove prediction intervals given by the model for the 10% and 90% quantiles (in blue) and the 50% quantile (in pink).
      • Set interpercentile level and higher percentile for prediction intervals (for continuous data by default the level is 90 and the higher percentile is 90%), or number of bands for TTE data
    • Outliers
      • Dots: Add/remove red dots indicating empirical percentiles that are outside prediction intervals
      • Areas: Add/remove red areas indicating empirical percentiles that are outside prediction intervals
    • Calculations

      • Corrected predictions: compute the pcVPC using Uppsala prediction correction (see details above)
      • Set piecewise display for prediction intervals (by default the display is linear)
  • Bins – for categorical data, X Bins and DV Bins (for Y axis) can be specified

    • Bin limits: Add/remove vertical lines on the scatterplots to indicate the bins.
    • Binning criteria: Choose the binning criteria among equal width (default), equal size or least-squares
    • Number of bins: Choose a fixed number of bins or a range for automatic selection, and a range for the number of data points per bin.

All colors, points and lines can be modified by the user.

4.4.2.Numerical predictive checks

Purpose

This plot displays the numerical predictive check (NPC). The NPC is a model diagnosis tool for continuous data which is closely related to the VPC procedure: the principle is similar, with a different way to visualize the resulting information. The VPC maintains the time dimension and can be used to point out at which time points the model overpredicts or underpredicts the data. The NPC instead compares the empirical cumulative distribution function (CDF) of the observations, computed on the original data set, with the theoretical CDF, computed from data simulated with the model of interest and the design structure of the original data set.
Note that since the NPC compares each observation with its own simulated distribution, data binning is not a concern, unlike for the VPC.
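The comparison of an empirical CDF with a simulated one can be sketched as follows. This is a deliberately simplified illustration, not the Monolix algorithm: it pools all observations, evaluates the CDFs at a single point, and uses a nearest-rank percentile; all names and values are hypothetical.

```python
import math
import statistics


def ecdf(sample, x):
    """Empirical CDF: fraction of values in sample that are <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def nearest_rank(values, p):
    """Nearest-rank percentile of values, with p between 0 and 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def npc_at(observations, replicates, x, level=90):
    """Empirical CDF at x, plus median and interval of the simulated CDFs at x."""
    simulated = [ecdf(rep, x) for rep in replicates]
    lower = nearest_rank(simulated, 50 - level / 2)
    upper = nearest_rank(simulated, 50 + level / 2)
    return ecdf(observations, x), statistics.median(simulated), lower, upper


# Hypothetical observations and three simulated replicates of the same design.
observations = [1.0, 2.0, 4.0, 8.0]
replicates = [[1.0, 2.0, 3.0, 9.0], [0.5, 2.5, 5.0, 7.0], [2.0, 3.0, 4.0, 6.0]]
empirical, median, lower, upper = npc_at(observations, replicates, x=3.0)
outside = not (lower <= empirical <= upper)  # True would be flagged as an outlier
```

Repeating the evaluation over a grid of x values gives the curves and the band of the plot; points where the empirical value leaves the band correspond to the areas marked as outliers.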

Examples

In the following example, the parameters of a two-compartment model with iv infusion and linear elimination are estimated on the remifentanil data set.

One can see the empirical CDF of remifentanil concentration in blue, compared to the theoretical CDF based on simulated data in black. The 90% prediction interval corresponding to the theoretical CDF is visualized as a light blue area. Discrepancies between the empirical CDF and this area are marked in red. The log scale on the x-axis makes it possible to focus on small observations. The plot shows that the model underpredicts small observations, and tends to overpredict some observations between 20 and 40 units.

Settings

  • General: Add/remove legend or grid.
  • Display
    • Empirical distribution: Add/remove empirical CDF.
    • Predicted median: Add/remove theoretical CDF.
    • Prediction interval: Add/remove the prediction interval for the theoretical CDF, and set its interpercentile level (90 by default).
    • Outliers (area): Add/remove red areas indicating where the empirical CDF is outside the prediction interval.
    • Calculations:
      • Set the number of evaluation points in the NPC.
      • Use BLQ: Choose to use BLQ data or to ignore it to compute the NPC. BLQ data can be simulated, or can be equal to the limit of quantification (LOQ). The latter case induces strong bias.

4.4.3.BLQ predictive checks

Purpose

This plot displays the proportion of censored data w.r.t. time. It is possible to choose the censoring interval. This plot is only available for projects with censored data.

Example of graphic

The figure presents the simulated and empirical BLQ frequencies w.r.t. time (example taken from the censored1_project of the demos).

Settings

  • General: Add/remove legend or grid
  • Display
    • BLQ cumulated freq.

      • Empirical: Add/remove empirical percentiles for the 10% and 90% quantiles.
      • Theoretical : Add/remove median frequency of BLQ observations calculated by simulation
      • Prediction interval: Add/remove prediction intervals given by the model for the 10% and 90% quantiles (in blue) and its level
      • Outliers (area): Add/remove red areas indicating empirical percentiles that are outside prediction intervals
    • Calculations

      • Censored interval: min and max for censored data. By default, the limit and the censored values are used. However, one can look at a smaller censored interval, for example.
      • The number of points for the discretization

By default, the censored area corresponds to the data set description and the BLQ frequency observation, the prediction interval, the outliers and the legend are displayed.

4.4.4.Prediction distribution

Purpose

This plot displays the prediction distribution. It allows comparing the observations with the theoretical distribution of the predictions. It is based on multiple simulations of all individuals from the dataset, without the residual error.

Example

Prediction distribution plots vary slightly for different types of data. For joint models for multivariate outcomes, a separate plot is available for each type of data.

  • Continuous outcomes

In the following example, the parameters of a one-compartment model with first-order absorption and linear elimination are estimated on the theophylline data set. One can see the prediction distribution of the concentration overlaid with the data set.

  • Non-continuous outcomes: count data and categorical data

count2_project (data = ‘count2_data.txt’, model = ‘count_library/poissonTimeVarying_mlxt.txt’)

Prediction distribution plots for count data and categorical data show the predicted frequencies of the categorized data over time, computed by Monte-Carlo. In the following example, predictions come from a Poisson distribution with a time-varying intensity. Note that hovering on a band reveals the corresponding modality.

Settings

  • General: Add/remove legend or grid
  • Display (for continuous data)

    • Observed data
    • BLQ: Add/remove BLQ data if present.
    • Median: Add/remove the median of predictions.
    • Level: set the level (90 by default). The distribution corresponds to the percentile interval [50 - level/2, 50 + level/2].
    • Number of bands: set the number of bands (9 by default) and the associated percentile in case of a discrete representation
  • X Bins (for discrete data)
    • Bin values: Add/remove vertical lines on the plot to indicate the bins.
    • Binning criteria: Choose the binning criteria among equal width (default), equal size or least-squares
    • Number of bins: Choose a fixed number of bins or a range for automatic selection, and a range for the number of data points per bin.

By default, only the prediction distribution and the median are displayed (for continuous data).
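The relation between the level and the displayed percentiles can be checked with a short sketch. The interval bounds follow the formula given in the settings; splitting the interval into bands of equal percentile width is an assumption made here for illustration, and the function names are hypothetical.

```python
def interval_bounds(level):
    """Lower and upper percentiles of the interval for a given level."""
    return 50 - level / 2, 50 + level / 2


def band_edges(level, n_bands):
    """Percentile edges of n_bands bands of equal width (assumed layout)."""
    lower, upper = interval_bounds(level)
    step = (upper - lower) / n_bands
    return [lower + k * step for k in range(n_bands + 1)]


bounds = interval_bounds(90)  # default level
edges = band_edges(90, 9)     # default number of bands
```

With the default settings, the distribution spans the 5th to the 95th percentile and, under the equal-width assumption, each of the 9 bands covers 10 percentile points.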

4.5.Tasks results

4.5.1.Likelihood contribution

Purpose

This plot displays the contribution of each individual to the log-likelihood. It is only available if the log-likelihood was previously computed. It can be sorted by index or by log-likelihood value (computed either via linearization or importance sampling, depending on which has been computed).

Example

In the following example, the parameters of a one-compartment model with first-order absorption, linear elimination and a delay are estimated on the warfarin data set. The figure shows the top ten contributions from individuals to the log-likelihood, computed via linearization or importance sampling methods. All the contributions appear in small size on the mini-plot on top, as well as a window indicating the selection of individuals for the main histogram.

Here we notice that the subject with id 8 has a much higher contribution to the log-likelihood than all other subjects, meaning that its response is less well captured by the model than others. It is worth checking this individual in the plots of individual fits, and removing it from the data set if its observations look abnormal.

 

Settings

  • General. Add/remove the legend, grid, or a mini-plot that allows selecting a range of ranks to display.
  • Methods. Add/remove histograms bins for values computed by linearization or importance sampling, if they have been computed.
  • Sorting. The user can choose to sort the histogram by index, or by contribution value computed with linearization or importance sampling, if they have been computed.
  • Label. The user can choose to display labels on top of the bins to indicate subject indices or names (ids).

4.5.2.Standard errors of the estimates

Purpose

This plot displays a histogram with the relative standard error of each parameter estimate, computed with the Fisher Information Matrix. It is only available if the standard errors were previously computed, and it provides a graphical representation of the information already available in the column “R.S.E (%)” in the Pop.Param tab of the Results frame.

Example

In the following example, the parameters of a one-compartment model with first-order absorption, linear elimination and a delay are estimated on the warfarin data set. The figure shows the relative standard errors of all population estimates, computed via linearization and stochastic approximation methods, with the exact values written in front of each bin.
This figure highlights beta_Cl_tsex_F as an estimate associated with a high standard error. This suggests checking the relevance of this covariate effect, by looking at the estimate for this parameter and the result of the Wald test in the statistical tests.
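For reference, the relative standard error is the standard error expressed as a percentage of the absolute value of the estimate. The sketch below applies this definition to made-up numbers: the estimates, standard errors and the 50% threshold are hypothetical and are not taken from the warfarin project.

```python
def relative_standard_error(estimate, standard_error):
    """Relative standard error in percent: 100 * s.e. / |estimate|."""
    return 100.0 * standard_error / abs(estimate)


# Hypothetical (estimate, standard error) pairs for two population parameters.
results = {
    "Cl_pop": (0.13, 0.0065),
    "beta_Cl_tsex_F": (0.05, 0.04),
}
rse = {name: relative_standard_error(est, se) for name, (est, se) in results.items()}

# Hypothetical threshold used only to flag poorly estimated parameters.
flagged = [name for name, value in rse.items() if value > 50.0]
```

A covariate coefficient close to zero with a standard error of the same order, as in the second made-up entry, gives a large R.S.E. and is a natural candidate for the Wald test.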

Settings

  • General. Add/remove the legend or grid.
  • Methods. Add/remove histograms bins for values computed by linearization or stochastic approximation, if they have been computed.

4.6.Convergence diagnosis

4.6.1.SAEM

Purpose

This plot displays the sequence of estimates for population parameters computed after each iteration of the SAEM algorithm. The purpose is to check the convergence of the algorithm. In addition, a convergence indicator gives the estimation for -2 x log-likelihood along the iterations.

Example

In the following example, the parameters of a one-compartment model with first-order absorption and linear elimination are estimated on the theophylline data set. The vertical line indicates where the algorithm switches from the first phase to the second. Notice also that the convergence indicator is displayed.

Settings

  • Select plots and arrange layout. It is convenient when there are many parameters and the user wants to focus on some particular parameter convergence for example.

By default, 12 parameters are displayed.

4.6.2.MCMC

Purpose

This plot displays the sequence of estimates for the conditional means and the conditional standard deviations along the iterations of the MH algorithm during individual parameters estimation by MCMC. The purpose is to check the convergence of the algorithm. The algorithm stops when these sequences remain in an interval of a given amplitude for a certain number of iterations: this interval is visualized on the figure with horizontal lines. The plot is shown and interactively updated while the task is running, and can be found after the end of the task in the Plots frame.
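The stopping rule described above can be sketched as a simple check on the tail of a sequence. This is an illustration under stated assumptions: the amplitude is taken as an absolute width, the function name and the values are hypothetical, and the actual criterion used by Monolix may be defined differently (for example relative to the current estimate).

```python
def has_stabilized(sequence, window, amplitude):
    """True if the last `window` values all lie in an interval of width <= amplitude."""
    if len(sequence) < window:
        return False
    tail = sequence[-window:]
    return max(tail) - min(tail) <= amplitude


# Hypothetical sequence of conditional mean estimates along the iterations.
cond_mean = [1.0, 1.4, 1.2, 1.21, 1.19, 1.2]
stable_short = has_stabilized(cond_mean, window=4, amplitude=0.05)
stable_long = has_stabilized(cond_mean, window=5, amplitude=0.05)
```

Here the last four values stay within the band, but a window of five still includes an early excursion, so the second check fails.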

Example

In the following example, the parameters of a one-compartment model with first-order absorption and linear elimination are estimated on the theophylline data set.

Settings

  • Select plots and arrange layout. It is convenient when there are many parameters and the user wants to focus on some particular parameter convergence for example.

By default, 12 parameters are displayed.

4.6.3.Importance sampling

Purpose

This plot displays the sequence of estimates for the observed log-likelihood computed by a Monte Carlo approach. The purpose is to check the convergence of the algorithm.

Example

In the following example, the parameters of a three-compartment infusion model with linear elimination are estimated on the remifentanil data set. As explained on this page, the bias of the log-likelihood estimator decreases with the number of iterations, before the estimation value stabilizes. The number of points in the plot is usually smaller than the number of iterations, and depends on the total number of observations.

Settings

  • Add/remove grid.

4.7.Export charts

All plots generated by Monolix can be exported as a figure or as text files, so that they can be replotted in another way or with other software for more flexibility.
The exported files can be read in R, for example, using the following command:

 read.table("/path/to/file.txt", sep = ",", comment.char = "")

Remarks

  • The separator is the one defined in the user preferences. We set “,” in this example as it is the one by default.
  • The argument comment.char = "" is needed for some files because the character #, which is used to define groups or colors, can be interpreted as a comment character by R.
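For users working outside R, an equivalent read can be sketched in Python with the standard csv module. This is an illustration under stated assumptions: the helper name is hypothetical, the file is assumed to start with a header line, and the sample content (including the color code) is made up, reusing the column names of xxx_observations.txt described below.

```python
import csv
import io


def read_chart_export(handle, sep=","):
    """Parse an exported chart file into a list of dicts keyed by column name."""
    # The csv module has no comment character, so values containing '#' are kept.
    return list(csv.DictReader(handle, delimiter=sep))


# Hypothetical file content; with a real file, pass open("/path/to/file.txt", newline="").
sample = io.StringIO(
    "id,time,y,censored,split,color,filter\n"
    "1,0.5,4.2,0,All,#4682b4,0\n"
    "1,2.0,3.1,1,All,#4682b4,0\n"
)
rows = read_chart_export(sample)
censored_times = [float(r["time"]) for r in rows if r["censored"] == "1"]
```

All values are read as strings, so numeric columns need an explicit conversion, as done for the time column above. The separator must match the one defined in the user preferences.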

The list of plots below corresponds to all the plots that Monolix can generate. They are computed with the task “Plots”, and the list of plots to compute can be selected by clicking on the button next to the task as shown below, prior to running the task.
Exporting the charts data can be made through the Export menu or through the preferences as described here.

In the following, we describe all the files generated by the export function

Charts concerning the Data

Observed data (continuous, categorical, and count)

xxx_observations.txt

Description: observation values

Full output file description

| Column | Description | Comment |
|---|---|---|
| id | Subject identifier | |
| OCC | Occasion value (optional) | If there is IOV in the data set |
| time | Observation time | |
| y | Observation value (loq) | The name of the column is the observation name |
| censored | 1 if the observation is censored, 0 otherwise | |
| split | Name of the split the subject belongs to | |
| color | Name of the color the observation is colored with | |
| filter | 1 if the subject is filtered, 0 otherwise | |

Observed data (event)

xxx_curves.txt

Description: observation values

Full output file description

Column Description Needed Task
time Observation times
survivalFunction Survival of first event
averageEventNumber Average number of events at that time
split Name of the split the subject occasion belongs to

xxx_censored.txt

Description: censored values

Full output file description

Column Description Needed Task
time Observation times
values Survival of first event
split Name of the split the subject occasion belongs to

Model for the observations

Individual Fits

xxx_observations.txt

Description: observation values

Full output file description

| Column | Description | Comment |
|---|---|---|
| id | Subject identifier | |
| OCC | Occasion value (optional) | If there is IOV in the data set |
| time | Observation time | |
| y | Observation value (loq) | |
| median | Prediction interval median | |
| piLower | Lower percentile of the individual prediction interval | |
| piUpper | Upper percentile of the individual prediction interval | |
| censored | 1 if the observation is censored, 0 otherwise | |

xxx_fits.txt

Description: individual fits based on population parameters and individual parameters

Full output file description

| Column | Description | Needed Task |
|---|---|---|
| id | Subject identifier | |
| OCC | Occasion value (optional) | If there is IOV in the data set |
| time | Continuous time grid used to compute fits | |
| pop | Individual fit based on population parameter values without taking the covariates into account | |
| popPred | Individual fit based on population parameter values | |
| indivPredMean | Individual prediction based on the individual parameter values estimated by conditional mean | Conditional distribution needs to be computed |
| indivPredMode | Individual prediction based on the individual parameter values estimated by conditional mode | EBEs need to be computed |

Observation vs Prediction

xxx_obsVsPred.txt

Description: observation and prediction (pop & indiv) values

Full output file description

| Column | Description | Needed Task |
|---|---|---|
| id | Subject identifier | |
| OCC | Occasion value (optional) | If there is IOV in the data set |
| time | Time of the observation | |
| y | Observation value (loq) | |
| y_simBlq | Observation value (simulated blq) | |
| popPred | Predictions based on population parameter values | |
| indivPredMean | Predictions based on the individual parameter values estimated by conditional mean | Conditional distribution needs to be computed |
| indivPredMode | Predictions based on the individual parameter values estimated by conditional mode | EBEs need to be computed |
| censored | 1 if the observation is censored, 0 otherwise | |
| split | Name of the split the subject occasion belongs to | |
| color | Name of the color the observation is colored with | |
| filter | 1 if the subject is filtered, 0 otherwise | |

xxx_obsVsSimulatedPred.txt

Description: observation and simulated prediction values

Full output file description

| Column | Description | Needed Task |
|---|---|---|
| rep | Replicate id | |
| id | Subject identifier | |
| OCC | Occasion value (optional) | If there is IOV in the data set |
| time | Time of the observation | |
| y | Observation value (loq) | |
| y_simBlq | Observation value (simulated blq) | |
| indivPredSimulated | Predictions based on the simulated individual parameter values estimated by conditional distribution | Conditional distribution needs to be computed |
| censored | 1 if the observation is censored, 0 otherwise | |
| split | Name of the split the subject occasion belongs to | |
| color | Name of the color the observation is colored with | |
| filter | 1 if the subject is filtered, 0 otherwise | |

xxx_visualGuides.txt

Description: splines and confidence intervals for predictions

Full output file description

Column Description Needed Task
popPred Continuous grid over population prediction values
popPred_spline Spline ordinates for population predictions
popPred_piLower Lower percentile of prediction interval for population predictions
popPred_piUpper Upper percentile of prediction interval for population predictions
indivPred Continuous grid over individual prediction values
indivPred_spline Spline ordinates for individual predictions
indivPred_piLower Lower percentile of prediction interval for individual predictions
indivPred_piUpper Upper percentile of prediction interval for individual predictions
split Name of the split the visual guides belong to

Distribution of the residuals

xxx_pdf.txt

Description: probability density function of each residual type (pwres, iwres, npde)

Full output file description

Column Description Needed Task
pwRes_abscissa Abscissa for pwres pdf
pwRes_pdf Pdf of pwres
iwRes_abscissa Abscissa for iwres pdf
iwRes_pdf Pdf of the iwres
npde_abscissa Abscissa for npde pdf
npde_pdf Pdf of the npde
split

xxx_cdf.txt

Description: cumulative distribution function of each residual type (pwres, iwres, npde)

Full output file description

Column Description Needed Task
pwRes_abscissa Abscissa for pwres cdf
pwRes_cdf Cdf of pwres
iwRes_abscissa Abscissa for iwres cdf
iwRes_cdf Cdf of the iwres
npde_abscissa Abscissa for npde cdf
npde_cdf Cdf of the npde
split

theoreticalGuides.txt

Description: theoretical guides for the pdf and the cdf

Full output file description

Column Description Needed Task
abscissa Abscissa for the theoretical curves
pdf Theoretical value of the pdf
cdf Theoretical value of the cdf

Scatter plot of the residuals

xxx_prediction_percentiles_iwRes.txt

Description: prediction percentiles of the iwRes, used to plot the iwRes w.r.t. the prediction. The same files exist for the pwres and the npde.

Full output file description

Column Description Needed Task
prediction Value of the prediction
empirical_median Empirical median of the iwRes
empirical_lower Empirical lower percentile of the iwRes
empirical_upper Empirical upper percentile of the iwRes
theoretical_median Theoretical median of the iwRes
theoretical_lower Theoretical lower percentile of the iwRes
theoretical_upper Theoretical upper percentile of the iwRes
theoretical_median_piLower Lower bound of the theoretical median prediction interval
theoretical_median_piUpper Upper bound of the theoretical median prediction interval
theoretical_lower_piLower Lower bound of the theoretical lower prediction interval
theoretical_lower_piUpper Upper bound of the theoretical lower prediction interval
theoretical_upper_piLower Lower bound of the theoretical upper prediction interval
theoretical_upper_piUpper Upper bound of the theoretical upper prediction interval
split Name of the split the subject occasion belongs to

xxx_time_percentiles_iwRes.txt

Description: time percentiles of the iwRes, used to plot the iwRes w.r.t. the time. The same files exist for the pwres and the npde.

Full output file description

Column Description Needed Task
time Value of the time
empirical_median Empirical median of the iwRes
empirical_lower Empirical lower percentile of the iwRes
empirical_upper Empirical upper percentile of the iwRes
theoretical_median Theoretical median of the iwRes
theoretical_lower Theoretical lower percentile of the iwRes
theoretical_upper Theoretical upper percentile of the iwRes
theoretical_median_piLower Lower bound of the theoretical median prediction interval
theoretical_median_piUpper Upper bound of the theoretical median prediction interval
theoretical_lower_piLower Lower bound of the theoretical lower prediction interval
theoretical_lower_piUpper Upper bound of the theoretical lower prediction interval
theoretical_upper_piLower Lower bound of the theoretical upper prediction interval
theoretical_upper_piUpper Upper bound of the theoretical upper prediction interval
split Name of the split the subject occasion belongs to

xxx_residuals.txt

Description: residuals values (pwres, iwres, npde)

Full output file description

| Column | Description | Needed Task |
|---|---|---|
| id | Subject identifier | |
| OCC | Occasion value (optional) | If there is IOV in the data set |
| time | Observation times | |
| prediction_pwRes | Predictions based on population parameter values | SAEM |
| pwRes | PwRes (computed with observations) | SAEM |
| pwRes_blq | PwRes (computed with simulated blq) | |
| prediction_iwRes_mean | Predictions based on the individual parameter values estimated by conditional mean (INDIVESTIM) if available, by SAEM otherwise | |
| iwRes_mean | IwRes (computed with observations and individual parameter values estimated by conditional mean (INDIVESTIM) if available, by SAEM otherwise) | |
| iwRes_mean_simBlq | IwRes (computed with simulated blq and individual parameter values estimated by conditional mean (INDIVESTIM) if available, by SAEM otherwise) | |
| prediction_iwRes_mode | Predictions based on the individual parameter values estimated by conditional mode (INDIVESTIM) | |
| iwRes_mode | IwRes (computed with observations and the individual parameter values estimated by conditional mode (INDIVESTIM)) | |
| iwRes_mode_simBlq | IwRes (computed with simulated blq and the individual parameter values estimated by conditional mode (INDIVESTIM)) | |
| prediction_npde | Predictions based on population parameter values | |
| npde | Npde (computed with observations) | |
| npde_simBlq | Npde (computed with simulated blq) | SAEM, if there are censored data in the data set |
| censored | 1 if the observation is censored, 0 otherwise | |
| split | Name of the split the subject occasion belongs to | |
| color | Name of the color the observation is colored with | |
| filter | 1 if the subject is filtered, 0 otherwise | |

xxx_simulatedResiduals.txt

Description: simulated residuals values

Full output file description

Column Description Needed Task
rep replicate
id Subject identifier
OCC Occasion value (optional) if there is IOV in the data set
time Observation times
prediction_iwRes Predictions based on the simulated individual parameter values based on the conditional distribution
iwRes_simulated IwRes (computed with observations and the simulated individual parameter values)
iwRes_simulated_simBlq IwRes (computed with simulated blq and the simulated individual parameter values)
censored 1 if the observation is censored, 0 otherwise
split Name of the split the subject occasion belongs to
color Name of the color the observation is colored with
filter 1 if the subject is filtered, 0 otherwise

xxx_spline.txt

Description: splines (residuals values against time and prediction)

Full output file description

| Column | Description | Needed Task |
|---|---|---|
| time_pwRes | Time grid for pwRes spline | SAEM |
| time_pwRes_spline | pwRes against time spline | SAEM |
| time_iwRes | Time grid for iwRes spline | At least SAEM |
| time_iwRes_spline | iwRes against time spline | At least SAEM |
| time_npde | Time grid for npde spline | SAEM |
| time_npde_spline | npde against time spline | SAEM |
| prediction_pwRes | Prediction grid for pwRes spline | SAEM |
| prediction_pwRes_spline | pwRes against population prediction spline | SAEM |
| prediction_iwRes | Prediction grid for iwRes spline | At least SAEM |
| prediction_iwRes_spline | iwRes against individual prediction spline | At least SAEM |
| prediction_npde | Prediction grid for npde spline | SAEM |
| prediction_npde_npde | npde against population prediction spline | SAEM |
| split | Name of the split the visual guides belong to | If the chart is split |

xxx_{time,population,individual}Bins.txt

Description: bins values for the corresponding axis.

Full output file description

| Column | Description | Needed Task |
|---|---|---|
| binsValues | Abscissa bins values | |
| split | Name of the split the bins refer to | If the chart is split |

Model for the individual parameters

Distribution of the individual parameters

cdf.txt

Full output file description

Column Description Needed Task
param_abscissa Abscissa of the cdf of the individual parameter param
param_cdf Empirical cdf of the individual parameter param
split Name of the split the subject occasion belongs to

pdf.txt

Full output file description

Column Description Needed Task
param_abscissa Abscissa of the pdf of the individual parameter param
param_pdf Empirical pdf of the individual parameter param
split Name of the split the subject occasion belongs to

visualGuides.txt

Full output file description

Column Description Needed Task
param_abscissa Abscissa of the pdf and the cdf of the individual parameter param
param_pdf Theoretical pdf of the individual parameter param
param_cdf Theoretical cdf of the individual parameter param

Distribution of the random effects

cdf.txt

Full output file description

Column Description Needed Task
param_abscissa Abscissa of the cdf of the standardized random effect of  param
param_cdf Empirical cdf of the standardized random effect of  param
split Name of the split the subject occasion belongs to

pdf.txt

Full output file description

Column Description Needed Task
param_abscissa Abscissa of the pdf of the standardized random effect of  param
param_pdf Empirical pdf of the standardized random effect of  param
split Name of the split the subject occasion belongs to

StandardizedEta.txt

Description: standardized random effects of the individual parameters

Full output file description

| Column | Description | Needed Task |
|---|---|---|
| id | Subject identifier | |
| OCC | Occasion value (optional) | If there is IOV in the data set |
| standEta_X_method | Standardized random effects values on each individual parameter with variability. It can be StandEta_SAEM, StandEta_Mean, or StandEta_Mode | |
| filter | 1 if the subject is filtered, 0 otherwise | |

SimulatedStandardizedEta.txt

Description: simulated standardized random effects of the individual parameters

Full output file description

Column Description Needed Task
rep Replicate id
id Subject identifier
OCC Occasion value (optional) if there is IOV in the data set
standEta_X_simulated Standardized random effects values on each individual parameter with variability.
filter 1 if the subject is filtered, 0 otherwise

Correlation between Random Effects

eta.txt

Description: random effects of the individual parameters

Full output file description

| Column | Description | Needed Task |
|---|---|---|
| id | Subject identifier | |
| OCC | Occasion value (optional) | If there is IOV in the data set |
| eta_X_method | Random effects values on each individual parameter with variability. It can be eta_SAEM, eta_Mean, or eta_Mode | |
| color | Name of the color the ID is colored with | |
| filter | 1 if the subject is filtered, 0 otherwise | |

simulatedEta.txt

Description: simulated random effects of the individual parameters

Full output file description

Column Description Needed Task
rep Replicate number
id Subject identifier
OCC Occasion value (optional) if there is IOV in the data set
eta_X_simulated Simulated random effects values on each individual parameter
color Name of the color the ID is colored with
filter 1 if the subject is filtered, 0 otherwise

visualGuides.txt

Description: spline and linear regression for each couple of individual parameters plotted one against the other

This is done for each combination of parameter p1 and p2 to have p1 w.r.t. p2

Full output file description

Column Description Needed Task
p1_vs_p2_abscissa Abscissa
p1_vs_p2_spline Spline ordinates
p1_vs_p2_regression Linear regression ordinates

Individual Parameters Vs Covariates

covariates.txt

Description: individual parameters, random effects, and covariate values for each subject

Full output file description

Column Description Needed Task
id Subject identifier
OCC Occasion value (optional) if there is IOV in the data set
X_method Individual parameter value. It can be X_SAEM, X_mean, or X_mode
eta_X_method Random effects values. It can be eta_X_SAEM, eta_X_mean, or eta_X_mode
covariate Covariate values
split
color Name of the color the observation is colored with
filter 1 if the subject is filtered, 0 otherwise

simulatedCovariates.txt

Description: simulated individual parameters, random effects, and covariate values for each subject

Full output file description

Column Description Needed Task
rep Replicate of the simulation
id Subject identifier
OCC Occasion value (optional) if there is IOV in the data set
X_method Individual parameter value. It can be X_SAEM, X_mean, or X_mode
eta_X_method Random effects values. It can be eta_X_SAEM, eta_X_mean, or eta_X_mode
covariate Covariate values
split
color Name of the color the observation is colored with
filter 1 if the subject is filtered, 0 otherwise

Predictive checks and prediction

Visual Predictive Checks (continuous)

xxx_observations.txt

Description: observation values

Full output file description

Column Description Needed Task
id Subject identifier
OCC Occasion value (optional) if there is IOV in the data set
time Observation times
y Observation values (loq)
y_simBlq Observation values (simulated blq)
censored 1 if the observation is censored, 0 otherwise
split Name of the split the subject occasion belongs to
color Name of the color the observation is colored with
filter 1 if the subject is filtered, 0 otherwise

xxx_bins.txt

Description: bins values for the corresponding axis.

Full output file description

Column Description Needed Task
binsValues Abscissa bins values
split Name of the split the bins refer to, if the chart is split

xxx_percentiles.txt

Description: empirical and theoretical percentile values (lower, median, and upper), with the prediction interval on each theoretical percentile (lower and upper bounds)

Full output file description

Column Description Needed Task
bins_middles Abscissa bin middles
empirical_median Empirical median
empirical_lowerPercentile Empirical lower percentile
empirical_upperPercentile Empirical upper percentile
theoretical_median Theoretical median
theoretical_median_piLower Lower percentile of the prediction interval for the theoretical median
theoretical_median_piUpper Upper percentile of the prediction interval for the theoretical median
theoretical_lowerPercentile Median of the prediction interval for the lower percentile
theoretical_lower_piLower Lower percentile of the prediction interval for the lower percentile
theoretical_lower_piUpper Upper percentile of the prediction interval for the lower percentile
theoretical_upperPercentile Median of the prediction interval for the upper percentile
theoretical_upper_piLower Lower percentile of the prediction interval for the upper percentile
theoretical_upper_piUpper Upper percentile of the prediction interval for the upper percentile
split Name of the split the visual guides belong to, if the chart is split

Visual Predictive Checks (discrete)

xxx_distribution.txt

Description: theoretical distribution of the discrete observation modalities (over a continuous time grid)

Full output file description

Column Description Needed Task
binsTimeBefore Abscissa bins time value before this time
binsTimeAfter Abscissa bins time value after this time
propCategory_empirical Empirical proportion of the modality set represented by the subchart
propCategory_median Median of the prediction interval for the proportion of the modality set represented by the subchart
propCategory_piLower Lower percentile of the prediction interval for proportion of the modality set represented by the subchart
propCategory_piUpper Upper percentile of the prediction interval for proportion of the modality set represented by the subchart
category Label for modality sets
split Name of the split the distributions refer to

xxx_xBins.txt

Description: bins values for the x-axis.

Full output file description

Column Description Needed Task
binsValues Abscissa bins values
split Name of the split the bins refer to, if the chart is split

Visual Predictive Checks (event)

xxx_curves.txt

Description: empirical and simulated survival function and mean number of events

Full output file description

Column Description Needed Task
time Observation times
survivalFunction_empirical Empirical survival of first event
survivalFunction_median Survival median
survivalFunction_pXX Survival percentile XX
averageEventNumber_empirical Empirical mean number of events
averageEventNumber_median Median mean number of events
averageEventNumber_pXX Percentile XX mean number of events
split Name of the split the subject occasion belongs to, if the chart is split

xxx_censored.txt

Description: censored values

Full output file description

Column Description Needed Task
time Observation times
values Survival of first event
split Name of the split the subject occasion belongs to

BLQ Predictive Checks

xxx_cumulatedBLQfrequencies.txt

Description: cumulated frequency of censored observations, empirical and simulated

Full output file description

Column Description Needed Task
time Time
empiricalCumulatedFrequencies Empirical fraction of data that is BLQ between time 0 and time t
median Prediction interval median
piLower Lower bound of prediction interval
piUpper Upper bound of prediction interval
split Name of the split the visual guides belong to
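
As an illustration of the empiricalCumulatedFrequencies column, the following Python sketch computes, for each time t of a grid, the fraction of observations recorded between time 0 and t that are censored. It is not Monolix's implementation: the function name, the sample values, and the choice of denominator (observations up to t rather than all observations) are assumptions made for this example.

```python
def cumulated_blq_fraction(times, censored, grid):
    """Fraction of observations in [0, t] that are censored, for each t in grid.

    Assumption: the denominator is the number of observations up to t.
    """
    fractions = []
    for t in grid:
        flags = [c for ti, c in zip(times, censored) if 0 <= ti <= t]
        # No observation up to t: the fraction is undefined, reported as None.
        fractions.append(sum(flags) / len(flags) if flags else None)
    return fractions


# Made-up data: observation times and censoring flags (1 = censored).
times = [0.5, 1.0, 2.0, 4.0, 8.0]
censored = [1, 0, 0, 1, 1]
print(cumulated_blq_fraction(times, censored, [0.25, 1.0, 4.0, 8.0]))
```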

Numerical Predictive Check

xxx_cdf.txt

Description: empirical and theoretical cumulative distribution function of observations

Full output file description

Column Description Needed Task
time Cdf continuous grid time
empiricalCdf Empirical cdf based on observations
theoreticalCdf Median of prediction interval for cdf
piLower Lower percentile of prediction interval for cdf
piUpper Upper percentile of prediction interval for cdf
split Name of the split the visual guides belong to

Prediction distribution (continuous)

xxx_observations.txt

Description: observation values

Full output file description

Column Description Needed Task
id Subject identifier
OCC Occasion value (optional) if there is IOV in the data set
time Observation times
y Observation values (loq)
censored 1 if the observation is censored, 0 otherwise
split Name of the split the subject occasion belongs to
color Name of the color the observation is colored with
filter 1 if the subject occasion is filtered, 0 otherwise

xxx_percentiles.txt

Description: theoretical percentiles computed on continuous grid

Full output file description

Column Description Needed Task
time Continuous time grid used for simulation
median Median
pPercentile Percentile
split Name of the split the visual guides belong to, if the chart is split

Prediction distribution (discrete)

xxx_distribution.txt

Description: theoretical distribution of the discrete observation modalities (over a continuous time grid)

Full output file description

Column Description Needed Task
binsTimeBefore Abscissa bins time value before this time
binsTimeAfter Abscissa bins time value after this time
propCategory_empirical Empirical proportion of the modality set represented by the subchart
propCategory_median Median of the prediction interval for the proportion of the modality set represented by the subchart
propCategory_piLower Lower percentile of the prediction interval for proportion of the modality set represented by the subchart
propCategory_piUpper Upper percentile of the prediction interval for proportion of the modality set represented by the subchart
category Label for modality sets
split Name of the split the distributions refer to

xxx_xBins.txt

Description: bins values for the x-axis.

Full output file description

Column Description Needed Task
binsValues Abscissa bins values
split Name of the split the bins refer to, if the chart is split

Convergence diagnosis

SAEM

CvParam.txt

Description: evolution of the parameters during SAEM iterations

Full output file description

Column Description Needed Task
iteration Iteration number
convergenceIndicator Convergence Indicator
phase 1 for exploratory and 2 for smoothing
X Parameter X

MCMC

convergences.txt

Description: evolution of the convergence with respect to the MCMC iterations

Full output file description

Column Description Needed Task
iteration Iteration number
E_X Conditional expectation of the parameter X
sd_X Conditional standard error of the parameter X

bounds.txt

Description: bounds for each parameter corresponding to the bounds on the graph. The first line corresponds to the minimum and the second one corresponds to the maximum.

Full output file description

Column Description Needed Task
E_X Conditional expectation of the parameter X
sd_X Conditional standard error of the parameter X

Standard errors of the estimates

rse.txt

Description: Standard errors for each parameter

Full output file description

Column Description Needed Task
X Parameter
rse_lin Relative standard error with the linearization method
rse_sa Relative standard error with the stochastic approximation method
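
A relative standard error is commonly reported as the standard error divided by the absolute value of the estimate, expressed in percent. The sketch below assumes this convention; the function name is hypothetical and the numbers are made up for illustration.

```python
def relative_standard_error(estimate, standard_error):
    """Relative standard error in percent (assumed convention: 100 * s.e. / |estimate|)."""
    if estimate == 0:
        raise ValueError("the relative standard error is undefined for a zero estimate")
    return 100.0 * standard_error / abs(estimate)


# Made-up example: an estimate of 2.0 with a standard error of 0.1.
print(relative_standard_error(2.0, 0.1))
```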

5.FAQ

This page summarizes the frequently asked questions about Monolix.

Monolix2018R1: Evolution from Monolix2016R1

Resolution and display

  • OpenGL technology impact on remote access: The Monolix and Datxplore interfaces were updated with OpenGL technology. Unfortunately, remote access using direct rendering is not compatible with OpenGL, because the OpenGL application sends instructions directly to the local hardware, bypassing the target X server. As a consequence, MonolixSuite cannot be used with X11 forwarding. Instead, indirect rendering should be used, where the remote application sends instructions to the X server, which transfers them to the graphics card. This is possible with an ssh application, but it requires a dedicated configuration that depends on the machine and the operating system. Other applications such as VNC or Remmina can also be used for indirect rendering.
  • If the graphical user interface appears with too high or too low resolution, follow these steps:
    • open and close Datxplore
    • open Monolix
    • load any project from the demos
    • in the menu, go to Settings > Preferences and disable the “High dpi scaling” in the Options.
    • close Monolix
    • restart Monolix

 

Regulatory

  • What is needed for a regulatory submission using Monolix2018? Monolix is used for regulatory submissions of population PK and PK/PD analyses (including to the FDA and the EMA). The summary of elements needed for submission can be found here.
  • How to cite Monolix2018R1? To cite Monolix, please reference it as follows:
    Monolix version 2018R1. Antony, France: Lixoft SAS, 2018.
    http://lixoft.com/products/monolix/

Running Monolix

  • On what operating systems does Monolix run? MonolixSuite runs on Windows, Linux, and macOS.
  • Is it possible to run Monolix using a simple command line? Yes, see here. In addition, there is a complete R API providing full flexibility for running and modifying a Monolix project, as can be seen here.

Initialization

  • How to initialize my parameters? There are several ways to initialize your parameters and visualize the impact. See here the different possibilities.

Results

  • Can I define the result folder myself? By default, the result folder corresponds to the project name. However, you can define it yourself. See here how to define it in the user interface.
  • What result files are generated by Monolix? Monolix generates many different output files depending on the tasks run by the user. Here is a complete listing of the files along with the conditions for their creation. See here for more information.
  • Can I replot the plots using another plotting software? Yes, if you go to the menu Export and click on “Export charts data”, all the data needed to reproduce the plots are stored in text files. See here for the description of all the files generated along with the plots.
  • When I open a project, my results are not loaded (message “Results have not been loaded due to an old inconsistent project”). Why? When loading a project, Monolix checks that the project being loaded (i.e. all the information saved in the .mlxtran file) and the project that was used to generate the results are the same. If not, the error message is shown. For instance, if one runs a project, then does “use last estimates”, saves, and tries to reload the project, the saved project has the “last estimates” as initial values, which are different from the initial values used to run the project and generate the results. In that case the results will not be loaded because they are inconsistent with the loaded project.
    It is possible to check what is preventing the results from loading by comparing the content of the .mlxtran file to load with the .mlxtran file located in the hidden .Internals folder in the result folder. To see the .Internals folder, the “show hidden files/folders” option must be activated on the machine.
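
As an illustration of reusing the exported charts data outside Monolix, the sketch below parses a small table with the column names described for xxx_percentiles.txt and lists the bins where the empirical median falls outside the prediction interval of the theoretical median. The comma delimiter, the sample values, and the function names are assumptions made for this example; the actual delimiter of the exported files should be checked before use.

```python
import csv
import io

# Made-up sample using column names from the xxx_percentiles.txt description.
# A comma delimiter is assumed here.
SAMPLE = """bins_middles,empirical_median,theoretical_median_piLower,theoretical_median_piUpper
0.5,1.9,1.5,2.4
2.0,4.1,3.2,4.0
6.0,2.2,1.8,2.9
"""


def load_table(text, delimiter=","):
    """Parse a delimited table whose columns are all numeric."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [{name: float(value) for name, value in row.items()} for row in reader]


def bins_outside_interval(rows):
    """Bin middles where the empirical median leaves the prediction interval."""
    return [
        row["bins_middles"]
        for row in rows
        if not (row["theoretical_median_piLower"]
                <= row["empirical_median"]
                <= row["theoretical_median_piUpper"])
    ]


rows = load_table(SAMPLE)
print(bins_outside_interval(rows))
```

A real export may also contain a non-numeric split column, which would need to be handled separately from the numeric conversion used here.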

Tasks

  • How are the censored data handled? The handling of censored data is described here.
  • How are the parameters without variability handled? The different methods for parameters without variability are explained here.
  • What is the convergence indicator, displayed during SAEM? The convergence indicator is the complete log-likelihood. It can help to follow convergence. Note that the complete log-likelihood is not the same as the log-likelihood computed as a separate task. Indeed, the log-likelihood is defined as \(\sum_{i=1}^{N_{\text{ind}}}\log\left(p(y_i; \theta)\right)\). It is the relevant quantity to compare models, but unfortunately it cannot be computed in closed form because the individual parameters \(\phi_i\) are unobserved. Thus, to estimate the log-likelihood, an importance sampling Monte Carlo method is used in a separate task (or an approximation is calculated via linearization of the model). To know more on the log-likelihood calculation using linearization or importance sampling, see here. In contrast, the complete log-likelihood refers to the joint distribution: \(\sum_{i=1}^{N_{\text{ind}}}\log\left(p(y_i, \phi_i; \theta)\right)\). By decomposing the terms as \(p(y_i, \phi_i; \theta)=p(y_i| \phi_i; \theta)p(\phi_i; \theta)\), we see that this quantity can easily be computed, using as \(\phi_i\) the individual parameters drawn by MCMC for the current iteration of SAEM. This quantity is calculated at each SAEM step and is useful to assess the convergence of the SAEM algorithm.
  • When estimating the log-likelihood via importance sampling, the log-likelihood does not seem to stabilize. What can I do? The log-likelihood estimator obtained by importance sampling is biased by construction (see here for details). To reduce the bias, the conditional distribution \(p_{\phi_i|y_i}\) should be known as well as possible. For this, run the “conditional distribution” task before estimating the log-likelihood.
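
The idea of estimating the log-likelihood by importance sampling can be sketched on a toy model where the exact answer is known. The example below assumes \(y_i | \phi_i \sim N(\phi_i, \sigma^2)\) and \(\phi_i \sim N(\mu, \omega^2)\), so that \(y_i \sim N(\mu, \sigma^2 + \omega^2)\) marginally. The proposal is a normal distribution centered on the conditional mean of \(\phi_i\) and deliberately wider than the conditional distribution. The model, the proposal, and all names are assumptions chosen for illustration; this is not the algorithm implemented in Monolix.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log-density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mean) / sd) ** 2


def exact_loglik(y, mu, omega, sigma):
    """Closed-form log-likelihood of the toy model: y_i ~ N(mu, omega^2 + sigma^2)."""
    sd = math.sqrt(omega ** 2 + sigma ** 2)
    return sum(normal_logpdf(yi, mu, sd) for yi in y)


def importance_sampling_loglik(y, mu, omega, sigma, n_draws=2000, seed=0):
    """Estimate sum_i log p(y_i) by averaging p(y_i, phi) / q(phi) over draws phi ~ q."""
    rng = random.Random(seed)
    post_var = 1.0 / (1.0 / omega ** 2 + 1.0 / sigma ** 2)
    prop_sd = 1.5 * math.sqrt(post_var)  # proposal wider than the conditional distribution
    total = 0.0
    for yi in y:
        post_mean = post_var * (yi / sigma ** 2 + mu / omega ** 2)
        log_w = []
        for _ in range(n_draws):
            phi = rng.gauss(post_mean, prop_sd)
            log_w.append(
                normal_logpdf(yi, phi, sigma)             # p(y_i | phi)
                + normal_logpdf(phi, mu, omega)           # p(phi)
                - normal_logpdf(phi, post_mean, prop_sd)  # q(phi)
            )
        # Log of the mean weight, computed stably (log-sum-exp).
        m = max(log_w)
        total += m + math.log(sum(math.exp(lw - m) for lw in log_w) / n_draws)
    return total


y = [0.8, 1.5, 2.3, 1.1]
print(exact_loglik(y, 1.0, 0.5, 0.3))
print(importance_sampling_loglik(y, 1.0, 0.5, 0.3))
```

In this toy model, the closer the proposal is to the conditional distribution of \(\phi_i\) given \(y_i\), the less the importance weights vary and the more stable the estimate becomes.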

Model definition

  • Is it possible to use time-varying covariates? Yes, however the covariate relationship must be defined in the model instead of the GUI. See here how to do that.
  • Is it possible to define complex covariate-parameter relationships such as Michaelis-Menten for instance? Yes, this can be done directly in the model file. See here how to do it.
  • Is it possible to define a categorical covariate effect on the standard deviation of a parameter? Yes, this can be done directly in the model file. See here how to do it.
  • Is it possible to define mixtures of structural models? Yes, it may be necessary in some situations to introduce diversity into the structural models themselves using between-subject model mixtures (BSMM) or within-subject model mixtures (WSMM). The handling of mixtures of structural models is defined here. Notice that in the case of a BSMM, the proportion p between groups is a population parameter of the model to estimate. There is no inter-patient variability on p: all the subjects have the same probability, and a logit-normal distribution for p must be used to constrain it to be between 0 and 1 without any variability.
  • Is it possible to define mixtures of distributions? Yes, the handling of mixtures of distributions is defined here.
  • Can I set bounds on the population parameters for example between a and b? It is not possible to set bounds for the estimated population parameters. However it is possible to define bounded parameter distributions, which as a consequence also bound the estimated fixed effect parameter. See here how to do it.
  • Can I put any distribution on the population parameters? Not directly through the interface. Using the interface, you can only choose the normal, lognormal, logitnormal, and probitnormal distributions. However, you can apply any transformation to your parameter in the EQUATION: part of the structural model.
  • Can I set a custom error model? No, this is not possible. It may however be possible to transform the data such that the error model can be picked from the list. For an example with a model-based meta-analysis project, see here.
  • What are the units of estimated parameters? Can I define a scale factor? The units of the estimated parameters depend on the units of the data set and are implicit. Check here to learn how to include a scale factor.
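
One common way to keep a parameter between two bounds a and b is to map an unbounded, normally distributed variable into (a, b) with a scaled logistic function. The sketch below shows this mapping and its inverse. The function names are hypothetical, and whether Monolix uses exactly this parameterization for bounded distributions is an assumption to check against the dedicated documentation page.

```python
import math


def to_bounded(phi, lower, upper):
    """Map an unbounded value phi to the open interval (lower, upper)."""
    return lower + (upper - lower) / (1.0 + math.exp(-phi))


def from_bounded(psi, lower, upper):
    """Inverse mapping: scaled logit of a value psi in (lower, upper)."""
    if not lower < psi < upper:
        raise ValueError("psi must lie strictly between the bounds")
    return math.log((psi - lower) / (upper - psi))


# phi = 0 maps to the middle of the interval; large |phi| approaches a bound.
print(to_bounded(0.0, 2.0, 10.0))
print(to_bounded(5.0, 2.0, 10.0))
print(from_bounded(6.0, 2.0, 10.0))
```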

Tricks

  • How to compute AUC, time interval AUC, … using Mlxtran in a Monolix project?  See here.
  • How can I calculate the coefficient of variation? The coefficient of variation is not output by Monolix but can easily be calculated manually. The coefficient of variation is defined as the ratio of the standard deviation to the mean. It is often reported for log-normally distributed parameters, where it can be calculated as: \( \textrm{CV}=\sqrt{e^{\omega^2}-1} \) with \( \omega \) the estimated standard deviation of the random effect.
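
The formula for the coefficient of variation of a log-normally distributed parameter can be evaluated with a few lines of Python. The function name and the value of \( \omega \) below are made up for illustration.

```python
import math


def cv_lognormal(omega):
    """Coefficient of variation of a log-normal parameter: sqrt(exp(omega^2) - 1)."""
    return math.sqrt(math.exp(omega ** 2) - 1.0)


omega = 0.3  # made-up standard deviation of the random effect
print(f"CV = {100 * cv_lognormal(omega):.1f}%")
```

For small values of \( \omega \), the coefficient of variation is close to \( \omega \) itself, since \( e^{\omega^2}-1 \approx \omega^2 \).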

Export

5.1.Evolutions from Monolix2016R1 to Monolix2018R1

Monolix was completely redesigned to offer a better interface and plots, better performance, and greater ease of use.

Monolix project definition, settings, and outputs

Monolix Connectors

Monolix Interface

The Monolix user interface is brand new, built with a new JavaScript technology. It is no longer a single frame; there are now several frames, described below.

Welcome frame

In this frame, it is possible to
– create a new project
– load a project
– load a recent project
– load a demo
– look at Monolix web documentation

Data frame

In this frame, the user defines the data set and tags each of its columns. The possible column types are the same, but many names were changed to be more intuitive. Notice that, when defining the observation column, the user should also define its type (continuous/discrete/event).
When clicking on OK, Monolix validates the data set and shows how it can be used. Once the data set is validated, a DATA VIEWER button appears, making it possible to explore the data set alongside the project.
Data frame enhancements
– error messages pop up when there is an error in the data set or in the consistency between the data set and the header definition.
– warning messages pop up when there is a warning in the interpretation of the data set.
– it is possible to scroll down the data set while keeping the header visible
– it is possible to sort the data set by any column
– loading a large data set is much more efficient
– it is possible to visualize the whole data
– number of doses can be chosen if there is steady-state

Structural model frame

In this frame, the user defines the structural model. The user can
– browse the file from any folder
– load a file from the library
– open in MlxEditor
– reload it (if it has been changed in the editor for example)
– error messages pop up when there is an error in the model or in the consistency between the data set and the proposed model.
– a custom Lixoft library browser makes it easy to choose the model

Initial estimates frame

There are two possibilities

CHECK INITIAL ESTIMATES to see how the structural model fits each individual.

Enhancements
– it is possible to define the number of individuals shown and the associated layout
– it is possible to use the same x-axis and/or y-axis
– in case of BSMM, the two models are plotted as solid and dotted red lines respectively
– the calculation is much faster and dedicated to the considered frame.
Evolution
– it is not possible to check the initial values of the beta’s anymore
– the grid for the prediction takes doses into account

Set values of the INITIAL ESTIMATES.

Enhancements
– there is a new link to fix all parameters.
– there is a new link to estimate all parameters (error model parameter ‘c’ is not affected by the ‘estimate all’ feature if its value is 1).
– there is a new link to use the last estimated values as initial estimates. Notice that this link is usable only if there has been no modification of the project.
– there is a new link to use only the last estimated fixed effects values as initial estimates.
– defining the estimation method no longer requires a right click; the user has to click on the wheel next to the parameter value.
– when the user clicks on the value, the associated constraint (typically: “Value must be >0”) is displayed to indicate the domain of definition of the parameter.
– there are error messages when the initial values are not set to a correct value (due to the associated distribution).
– in case of IOV, all the random effects are on the same frame.
– when a categorical covariate with several modalities is used in the statistical model, the user can initialize all associated betas independently
– when a categorical covariate with several modalities is used in the statistical model, the user can define the estimation method on all associated betas independently.
Evolution
– it is not possible to use the last estimates if the project has been modified
– for Bayesian estimation, only the MAP option is available.
– the method colors changed: black for MLE, orange for fixed, and purple for MAP

Statistical model and tasks frame

Tasks

– The task for the calculation of the individual parameters was split into two tasks: EBEs, which computes the conditional mode, and conditional distribution, which gives access to the conditional mean.
– The task for the calculation of the individual parameters is displayed before the other one to be consistent with a scenario usage.
– The use of the linearization method is now shared between the standard error calculation and the log-likelihood calculation.
– The convergence assessment now uses a user-defined scenario rather than the current one. Three options are proposed (computing the standard errors, computing the log-likelihood, and whether the linearization method is used). Notice that the plots are not run.
– Assessment: a new plot shows the last convergence values (dots) for each run.
– Assessment: the ‘Stop’ button stops the current run and keeps only the previous ones.
– Assessment: real-time interactivity with graphs (zoom, layout, selected subplots).
– Assessment: a summary is provided in the Assessment folder in the project result folder.
– Assessment: the scenario of the assessment is now independent of the scenario of the project. The user can choose between three scenarios.
– The settings for each task are now available through a button next to the task.
– It is not possible to reload a previous convergence assessment using the interface. However, all the results are in the result folder, in an Assessment folder.
– The list of plots is now arranged in categories to increase readability
– Plots can be selected (all, none) by category or for all the plots.

Observation model

– A formula button was added to show, in real time, the formula associated with the error model for continuous error models
– Additional and customizable error models are proposed. The user can now choose from a list both the distribution (normal/lognormal/logitnormal) and the error model (constant/proportional/combined1/combined2)
– Generalization of error models: parameter c is always a parameter of the proportional and combined1/combined2 models (fixed to 1 by default)
– it is possible to choose the minimum and the maximum of the logit function when the logitnormal distribution is chosen.
– there are error messages when the minimum and maximum values are not set to correct values
– there is an error message if the user tries to set the distribution to lognormal when it is not possible (in case of negative observations for example)
– Display of the type of discrete model
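
The sketch below writes out the standard deviation of the residual error as a function of the prediction f for the four error models listed above, including the exponent c. The formulas are the ones usually given for these models (constant: a; proportional: b f^c; combined1: a + b f^c; combined2: sqrt(a^2 + b^2 f^(2c))). They are stated here as an assumption, not taken from this page, and the function name is hypothetical.

```python
import math


def error_sd(model, f, a=0.0, b=0.0, c=1.0):
    """Standard deviation of the residual error for a positive prediction f.

    Assumed formulas; c defaults to 1, matching the default fixed value.
    """
    if model == "constant":
        return a
    if model == "proportional":
        return b * f ** c
    if model == "combined1":
        return a + b * f ** c
    if model == "combined2":
        return math.sqrt(a ** 2 + b ** 2 * f ** (2 * c))
    raise ValueError(f"unknown error model: {model}")


# Made-up parameter values, evaluated for a prediction f = 10.
for name in ("constant", "proportional", "combined1", "combined2"):
    print(name, error_sd(name, 10.0, a=0.5, b=0.2))
```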

Individual model

– The display is very different and more concise
– A formula button was added to show, in real time, the formula associated with the individual model
– The names of the parameters and the covariates are displayed
– In case of IOV, all levels are displayed in the same frame
– The choice of the parameter distribution is done by choosing in a list
– The choice of adding or removing variability is performed by clicking in the column “Random Effects”
– There are two buttons to add and remove variability on all parameters at the same time
– The correlation is not defined as a matrix anymore, the user must define groups and add parameters on those groups
– Adding a covariate on an individual parameter is performed by clicking in the covariate name column
– In case of IOV, the covariates are arranged by level of variability
– There are dedicated buttons to add a transformed continuous covariate, a transformed categorical covariate, and a mixture
– To add a transformed continuous covariate, the user clicks on the button CONTINUOUS and can then define a Name and a Formula.
— a Name is proposed
— the list of available covariates is proposed
— clicking on an available covariate writes it in the Formula
— hovering over an available covariate shows the min, mean, median, and max of the covariate
— the Formula can be any Mlxtran-compatible expression
— the Formula can contain several covariates
– To add a transformed categorical covariate, the user clicks on the button CATEGORICAL and can then define a Name and groups.
— a Name is proposed
— the list of modalities is proposed
— one can allocate, reset an allocation, and modify the allocation
— the user can choose the reference category
— the user can choose the name of the groups
– To add a mixture, the user clicks on the button MIXTURE and can then define the name and the number of modalities
– a magnifying glass icon is available to locate a covariate when there are several covariates
– Each transformed covariate can be edited or removed
– if an action is not possible, an error is displayed explaining the reason

Results

– New section to see all the task results
– better representation
– It contains a section for Population parameter estimates
– It contains a section for Individual parameter estimates with the conditional mode and conditional mean
– It contains a section for Correlation matrix of the estimates (and RSE) with the linearization method or the stochastic approximation
– The values of the elements of the correlation matrix and the RSE are colored to improve readability and speed up diagnosis
– Selecting a correlation in the matrix sets a focus on both associated population parameters
– It contains a section for the Estimated log-likelihood and information criteria with the linearization method or the importance sampling method
– It contains a section for all the statistical tests
– The values of the elements of the tests are colored to improve readability and speed up diagnosis
– It is possible to open the output folder directly from here
– Results display is loaded if the project has results

Monolix calculation engine

Better performance thanks to parallelization

It is now possible to parallelize Monolix calculations over several machines using Open MPI.

Better performance in structural model evaluation

– Faster analytical solutions
– Faster calculation for ODEs
– No restrictions to use analytical solutions if regressors are constant over the subject time. Sequential models (using a PK model and its associated analytical solution) will be much faster.
– Less restrictive conditions to use analytical solutions when IOV occurs
Bugs fixed:
– A time-varying initial condition (for DDE models) is now correctly taken into account
– A regressor as initial condition is now correctly taken into account

Algorithms settings

– Constraints for settings
– Names were modified and settings reorganized for better comprehension
– all the settings are now available through a button next to the task.

SAEM algorithm

– Addition of new error models. The user can now define both the distribution and the error model.
– Optimization of the SAEM strategy when the error model has several parameters (typically for the combined1 and combined2 models).
– Strategy with simulated annealing for combined1 and combined2 (improves convergence)
– Evolution of the SAEM strategy when the error model is proportional (there were issues when the prediction was very close to zero).
– CPU time optimization of the SAEM strategy.
– When latent covariates are used, the methodology to estimate the probabilities and the associated betas is now based on the mixing law and not on an individual probability draw. It allows a better evaluation of the log-likelihood and better convergence properties.
– When there are parameters without variability,
— With the no-variability method, the maximum number of iterations depends on the number of parameters without variability
— With the no-variability method, the optimization is much faster.
— With the decreasing variability methods, the artificial variance decreases more slowly
— For the normal law, better strategy to initialize the variance (more consistent)
— when there is a latent covariate on the parameter, all methods can be used.
– When no parameter has variability and the no-variability method is used, only one iteration of SAEM is done.
– Two settings of SAEM were updated to provide better convergence
— The minimum number of iterations in the exploratory phase is now 150 (it was 100 previously)
— The step size exponent in the smoothing phase is now 0.7 (it was 1 previously)
– Constraints for settings
– If SAEM reaches the maximum number of iterations during the exploratory phase, a warning is raised.
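
To see what the change of step size exponent means in practice, the sketch below assumes the usual SAEM convention: the step size is 1 during the exploratory phase and 1/k^a at the k-th iteration of the smoothing phase, where a is the exponent. This convention and the function name are assumptions for illustration; they are not taken from this page.

```python
def step_sizes(n_exploratory, n_smoothing, exponent):
    """Step-size sequence: 1 in the exploratory phase, 1 / k**exponent in the smoothing phase."""
    exploratory = [1.0] * n_exploratory
    smoothing = [1.0 / (k ** exponent) for k in range(1, n_smoothing + 1)]
    return exploratory + smoothing


# Compare the last smoothing step sizes for the previous (1) and new (0.7) exponents.
for exponent in (1.0, 0.7):
    sequence = step_sizes(n_exploratory=150, n_smoothing=50, exponent=exponent)
    print(exponent, [round(s, 4) for s in sequence[-3:]])
```

Under this assumption, an exponent of 0.7 makes the step sizes decrease more slowly than an exponent of 1, so recent iterations keep more weight during the smoothing phase.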

Removed features
– We removed the possibility to add different variances depending on the modality of a categorical covariate
– We removed the possibility to choose to work with either standard deviations or variances. Only standard deviations are proposed. However, projects defined with variances can still be loaded.
– We removed the possibility to have a Bayesian posterior distribution
– We removed the possibility to have a custom distribution of the individual parameters
– Autocorrelation cannot be added in the graphical interface anymore. However, it can be loaded or added through the connectors

Conditional distribution

– Conditional distribution can now be computed for discrete and event models.
– New setting: number of simulations by individual parameters
– Adaptive number of simulations according to the data size
– If the Fisher Information Matrix by stochastic approximation has already been computed, all the MCMC draws are reused, providing a much faster calculation.

Conditional mode

– The calculation is now tremendously faster (between 20 and 100 times faster).

Standard error calculation

– Fisher information matrix can now be computed with discrete and event models and IOV
– Improvement of the calculation for the linearization
– Improvement of the calculations for S.A. if there are NaNs
– Faster calculation for the linearization
– Decrease of the maximum number of iterations to 200 (it was 300 previously)
– Settings are modified for S.A.: min and max iterations
– Warnings if there are numerical issues for linearization
– If the conditional distribution has already been computed, all the MCMC draws are reused, providing a much faster calculation.

Log-likelihood calculation

– Improvement of the calculation for the linearization
– Faster calculation in case of censored data
– Faster calculation in case of importance sampling
– When the calculation by linearization has issues, a warning is displayed to the user.
– The Monte Carlo size in the importance sampling is now 10000 (it was previously 20000)

Simulation computation for plots

The simulations are much faster than in the previous version. This greatly reduces the time needed to generate the VPCs and the prediction intervals, for example.
In addition, considerable effort was put into the discrete and event models, where the simulation is now much faster. A progress bar is also displayed.
For the simulation on a grid, the doses and the regressors were added.

Plots during algorithms

– Extensive interactivity (zoom, layout, coordinates)
– Possibility to switch between different frames while the algorithms are running
– List of elements to compute for plots (the ‘Stop’ button keeps the computations already done)

Tests computation

Tests are computed when the conditional distribution task is performed and the plots are launched. The following tests are computed:
– Pearson’s correlation test on the individual parameters and the covariates used in the statistical model
– Pearson’s correlation test on the individual random effects and the covariates
– Fisher test for discrete covariates
– Shapiro-Wilk test on the random effects
– Pearson’s correlation test on the random effects
– Shapiro-Wilk test on the individual parameters that have no covariate
– Kolmogorov-Smirnov adequacy test on the individual parameters that have covariates
– Van der Waerden test on the residuals
– Shapiro-Wilk test on the residuals
– For all tests associated with individuals (parameters, random effects, NPDEs), the Benjamini-Hochberg procedure is applied to avoid bias
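
As an illustration of the Benjamini-Hochberg procedure mentioned in the last item, here is a minimal, generic sketch of the adjustment of a set of p-values. It shows the standard textbook procedure only; how Monolix applies it internally is not described here, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(example)
```

Each p-value is multiplied by m/rank, where rank is its position in the sorted list, and the result is capped so that the adjusted values remain non-decreasing with the raw p-values.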

Monolix plots

All the plots were updated with a new technology and new features. In addition, all the colors and graphical settings can be changed in the Preferences frame.
Notice that
– When you save a project, your current graphical settings are saved with it
– You can export your settings as your defaults in the Export menu.

Stratify

The user can now easily define all the stratifications needed in a Stratify frame, and can split, color and filter by any defined covariate.
Enhancements
– Large simplification of the usage
– For a continuous covariate, possibility to define groups with either equal number of individuals, or equal size
– Possibility to change all the colors
– Possibility to highlight a full group when clicking on the covariate category
– Buttons to add and remove categories
– Better performance
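
The two ways of grouping a continuous covariate can be sketched as follows. This is only an illustration, assuming that "equal size" means intervals of equal width; the function names and the covariate values are hypothetical and are not part of Monolix.

```python
def equal_width_edges(values, n_groups):
    """Edges of n_groups intervals of equal width covering the covariate range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_groups
    return [lo + i * width for i in range(n_groups + 1)]

def equal_count_groups(values, n_groups):
    """Split the sorted values into n_groups groups of (nearly) equal count."""
    ordered = sorted(values)
    base, extra = divmod(len(ordered), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)  # spread the remainder
        groups.append(ordered[start:start + size])
        start += size
    return groups

weights = [1, 2, 3, 4, 10, 20, 30, 100]  # hypothetical covariate values
print(equal_width_edges(weights, 2))
print(equal_count_groups(weights, 2))
```

With a skewed covariate such as this one, equal-width intervals put most individuals in the first group, while equal-count grouping balances the group sizes.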

Observed data enhancements

This plot contains all the observations and can be used with all types of observations. It produces
– the spaghetti plot for continuous observations.
– the spaghetti plot or histogram for discrete observations (the user has the possibility to switch).
– the Kaplan-Meier plot for event observations along with the mean number of events per individual
Enhancements
– When hovering over a curve, the ID is displayed and all the points of the subject are highlighted.
– When splitting, the information for each group is computed.
– When splitting, the user can choose to adapt the x-axis and/or y-axis to each group or to have the same for all groups.
– Possibility to display the dosing times when hovering over an individual.

Individual fits enhancements

– It is possible to sort the individuals by individual parameters values.
– When there are censored data, the full interval is displayed
– The y-scale is better managed.
– Possibility to display the dosing times.
– The user can choose to share the x-axis and/or y-axis.
– Possibility to zoom on all the individuals at the same time with a linked zoom.
– Population fits (population covariate)
– Grid takes doses (and regressors) into account
– Color is added for a better representation of IOV when occasions are joined (according to the presence of a washout or not)

Observation vs Prediction enhancements

– The conditional distribution can be used for this plot.
– A 90% prediction interval is now available.
– Information on the proportion of outliers.
– Hovering over a point will display both the ID and the time of the point (and its replicates if the conditional distribution is chosen). In addition, the other points corresponding to the same ID are also highlighted.
– The log-log scale is managed more efficiently.

Scatter plots of the residuals enhancements

– Possibility to have a scatter plot for event data.
– IWRES can be computed with the conditional distribution.
– Hovering over a point will display both the ID and the time of the point (and its replicates if the conditional distribution is chosen). In addition,
— the other points corresponding to the same ID are also highlighted.
— the same points are overlaid on the other plots.
– 2 predefined configurations (VPC and scatter).
– For discrete models, the scatter plot with respect to time was added.

Distribution of the residuals enhancements

– By hovering over a bar in the pdf plots, we have the percentage of individuals in this bar.
– By hovering over the cdf plot, the theoretical and empirical cdf are displayed along with the x-axis value.
– The qqplot representation was replaced by a cdf representation.
– The empirical pdf is not computed anymore.

Distribution of the individual parameters enhancements

– The non parametric pdf is not proposed anymore.
– By hovering over a bar in the pdf plots, we have the percentage of individuals in this bar.
– The empirical and theoretical cdf of the individual parameters are now computed.
– By hovering over the cdf plot, the theoretical and empirical cdf are displayed along with the x-axis value.
– When splitting, the shrinkage information is computed.

Distribution of the random effects enhancements

– The empirical and theoretical pdf of the random effects are now computed.
– By hovering over a bar in the pdf plots, we have the percentage of individuals in this bar.
– The empirical and theoretical cdf of the random effects are now computed.
– By hovering over the cdf plot, the theoretical and empirical cdf are displayed along with the x-axis value.

Correlation between random effects enhancements

– Correlation information is proposed.
– Hovering over a point will display the ID of the point (and its replicates if the conditional distribution is chosen). In addition, the same ID is overlaid in the other figures.
– Possibility to select the parameters to look at.
– Possibility to split the graphic.
– Optimized layout. At the beginning, a maximum of 6 random effects is displayed. However, the user can choose any number afterward.

Individual parameters vs covariates

– Hovering over a point will display the ID of the point (and its replicates if the conditional distribution is chosen). In addition, the same ID is highlighted in the other figures.
– Possibility to split and have all figures
– Possibility to select the parameters to look at.
– Possibility to select the covariates to look at.
– Possibility to split and color at the same time.

Visual Predictive Checks enhancements

– This plot contains all the observations and can be used with all types of observations.
– For categorical projects, there is no bin management on the y-axis. All the categories are displayed with the correct y-label
– For count projects, the y-label is properly defined

Prediction distribution

– Possibility to color the observations
– Possibility to differentiate the censored data and the non censored data
– Hovering over a point will display the ID of the point. In addition, the other points of the same ID are also highlighted.
– Hovering over a band will display the range of the band.

Loglikelihood contribution

– Possibility to zoom on part of the individuals

New plots

– Standard errors of the estimates
– MCMC convergence plot
– Importance Sampling convergence plot

Monolix project definition, settings, and outputs

Project evolution

In terms of project definition, there are only a few modifications:
– The definition of the number of doses in the STEADY STATE definition is now in the project and not in the user configuration
– When the data set contains several outputs, the output type described in the observation definition is now named yname instead of ytype. Backward compatibility is ensured.
– When the data set contains a single output, the output type described in the observation definition was ytype=1. It has been removed because it is unnecessary. Backward compatibility is ensured.
– The list of plots is now defined in the project file and no longer in the associated .xmlx file.
– The names of the tasks in the Mlxtran project changed slightly to be more consistent with the user interface.

Settings

In terms of project settings, there are only a few modifications:
– The choice to use the analytical solutions or not is now defined in the Mlxtran structural model and not in the user configuration
– The project settings are now available via the menu Settings/Project settings
– The user has the possibility to save the data and the model next to the project
– The preferences interface is now implemented in JavaScript
– The working directory cannot be set through the interface, only in the user configuration file
– Changing the number of threads no longer requires a restart of Monolix
– All the charts data can be exported automatically after the run
– The chart export formats are now .svg and .png
– The timestamping option is now called ‘Save History’. The project and its results are now saved after each run.

Configuration of the plots

The configuration is now saved in a .properties file associated with the project. It is no longer an .xmlx file, although .xmlx files are still readable. Backward compatibility is only ensured for the list of plots. This .properties file
– overrides the default settings (default.settings in the user/lixoft folder)
– contains all the information for the display of the plots in terms of what is displayed
– contains all the information for the display of the plots in terms of the covariate stratification in the plot
– contains all the information for the display of the plots in terms of the colors and preferences for each plot
When saving a project, a .properties file is generated so that exactly the same figures can be replotted after a reload.
It is possible to export all the settings to define them as the global settings.

Outputs

In terms of outputs, all the files and folders are reorganized. We now have
– summary.txt: providing a summary of the run
– populationparameter.txt with all the estimated population parameters
– the output predictions
– all the files concerning the Fisher Information Matrix are in a folder FisherInformation
– all the files concerning the individual parameters and the random effects are in a folder IndividualParameters
– all the files concerning the logLikelihood are in a folder logLikelihood
– all the files concerning the results of the Tests are in a folder Tests
– a part of the Lixoft files needed to reload is in a private folder .Internals
– when the charts data are exported, the data are exported in a folder ChartsData
– when the figures are exported, they are exported in a folder ChartsFigures
– all figures can be exported independently

In terms of export, we can
– export all the charts data in Settings/Export charts data
– export all the figures in Settings/Export plots
– export the project to Mlxplore in Settings/Export in Mlxplore

Monolix Connectors

There is an R package associated with Monolix which gives the user access to all the functions available through the interface. The following functions are available:
– abort: Stop the current task run
– addCategoricalTransformedCovariate: Add Categorical Transformed Covariate
– addContinuousTransformedCovariate: Add Continuous Transformed Covariate
– addMixture: Add a new latent covariate to the current model giving its name and its modality number
– computePredictions: Compute predictions from the structural model
– getConditionalDistributionSamplingSettings: Get conditional distribution sampling settings
– getConditionalModeEstimationSettings: Get conditional mode estimation settings
– getContinuousObservationModel: Get continuous observation models information
– getCorrelationOfEstimates: Get the inverse of the Fisher Matrix
– getCovariateInformation: Get Covariates Information
– getData: Get project data
– getEstimatedIndividualParameters: Get last estimated individual parameter values
– getEstimatedLogLikelihood: Get Log-Likelihood Values
– getEstimatedPopulationParameters: Get last estimated population parameter value
– getEstimatedRandomEffects: Get the estimated random effects
– getEstimatedStandardErrors: Get standard errors of population parameters
– getGeneralSettings: Get project general settings
– getIndividualParameterModel: Get Individual Parameter Model
– getLastRunStatus: Get last run status
– getLaunchedTasks: Get tasks with results
– getLogLikelihoodEstimationSettings: Get LogLikelihood algorithm settings
– getMCMCSettings: Get MCMC algorithm settings
– getMlxEnvInfo: Get information about MlxEnvironment object
– getObservationInformation: Get observations information
– getPopulationParameterEstimationSettings: Get population parameter estimation settings
– getPopulationParameterInformation: Get Population Parameters Information
– getPreferences: Get project preferences
– getProjectSettings: Get project settings
– getSAEMiterations: Get SAEM algorithm iterations
– getScenario: Get current scenario
– getSimulatedIndividualParameters: Get simulated individual parameters
– getSimulatedRandomEffects: Get simulated random effects
– getStandardErrorEstimationSettings: Get standard error estimation settings
– getStructuralModel: Get structural model file
– getVariabilityLevels: Get Variability Levels
– initializeMlxConnectors: Initialize MlxConnectors API
– isRunning: Get current scenario state
– loadProject: Load project from file
– mlxDisplay: Display Mlx API Structures
– newProject: Create new project
– removeCovariate: Remove Covariate
– runConditionalDistributionSampling: Sampling from the conditional distribution
– runConditionalModeEstimation: Estimation of the conditional modes (EBEs)
– runLogLikelihoodEstimation: Log-Likelihood estimation
– runPopulationParameterEstimation: Population parameter estimation
– runScenario: Run Current Scenario
– runStandardErrorEstimation: Standard error estimation
– saveProject: Save current project
– setAutocorrelation: Set auto-correlation
– setConditionalDistributionSamplingSettings: Set conditional distribution sampling settings
– setConditionalModeEstimationSettings: Set conditional mode estimation settings
– setCorrelationBlocks: Set Correlation Block Structure
– setCovariateModel: Set Covariate Model
– setData: Set project data
– setErrorModel: Set error model
– setGeneralSettings: Set common settings for algorithms
– setIndividualParameterDistribution: Set Individual Parameter Distribution
– setIndividualParameterVariability: Individual Variability Management
– setInitialEstimatesToLastEstimates: Initialize population parameters with the last estimated ones
– setLogLikelihoodEstimationSettings: Set loglikelihood estimation settings
– setMCMCSettings: Set settings associated to the MCMC algorithm
– setObservationDistribution: Set observation model distribution
– setObservationLimits: Set observation model distribution limits
– setPopulationParameterEstimationSettings: Set population parameter estimation settings
– setPopulationParameterInformation: Population Parameters Initialization and Estimation Method
– setPreferences: Set preferences
– setProjectSettings: Set project settings
– setScenario: Set scenario
– setStandardErrorEstimationSettings: Set standard error estimation settings
– setStructuralModel: Set structural model file

5.2.Submission of Monolix analysis to regulatory agencies

Monolix is used for regulatory submissions (including to the FDA and the EMA) of population PK and PK/PD analyses. Monolix analyses for first-in-human dose estimation, dose-finding studies and registration studies have been routinely and successfully submitted to the FDA, EMA [3] and other agencies. Both the FDA and the EMA have Monolix and modelling experts able to understand, review and run Monolix analyses. Regulators also take part in publishing research articles with Monolix [2].
Regulatory guidelines [1] provide little information on the electronic files required for submission. Based on exchanges with regulatory agencies and confirmed through past regulatory submissions using Monolix, the files listed in Table 1 are required for a complete Monolix analysis submission package. This package contains all the files needed to run Monolix without any implementation work and to reproduce the results. Attention must be paid to using relative file paths to facilitate the transfer of the project from one computer system to another (see the Monolix documentation). Further, it is important to consider that Table 1 represents the Monolix submission package for a typical analysis. Some modelling cases may require the submission of additional material. Each submission should be treated separately and carefully reviewed for completeness, case by case.

Table 1 Monolix analysis submission package

File | Explanation | File type, format
Report (.pdf) | Report detailing all of the modelling according to the EMA or FDA guidelines [1] | PDF
Data (.txt or .csv) | Data file containing all the observations, dosing history, patient specific information and other information provided via the data file | Text file, CSV
Project file (.mlxtran) | Defines what data file, model file, algorithm settings, graphical settings and parameters have been used to generate the results | Text file, Mlxtran

In addition, some Monolix files define project-independent parameters that are applied to all projects run on the same computer account. In all but the most special cases, users will have these parameters set to the default values. If these parameters are modified, they should also be communicated. They can be found in the following files:

Table 2 Optional but recommended files 

File | Explanation | File type, format
Model file (.txt) | The structural model in Mlxtran syntax, if it is not one of the MonolixSuite library models | Text file, Mlxtran
Properties file (.properties) | Properties of the plots. It allows the project to reload all the graphical properties to be able to exactly reproduce the same plots (in terms of color, split by categories, …) | Text file, .properties

A Monolix run will automatically produce a large number of additional files. However, the files listed in Table 1 and Table 2 are sufficient to entirely reproduce the results. Note that all files are in a human readable format. Thus, the information contained in these files can also be included into the appendices of the report creating one single document that contains everything to reproduce the results.

References

[1] Applicable regulatory guidelines

  • EMA “Guideline on reporting the results of population pharmacokinetic analyses” (CHMP/EWP/185990/06): details what should be contained in a PK or PK/PD analysis report.
  • FDA “Guidance for Industry Population Pharmacokinetics”: takes a larger scope also addressing “Data Integrity and Computer Software” and “Electronic Files”.
  • FDA “Exposure-Response Relationships – Study Design, Data Analysis, and Regulatory Applications”.

[2] FDA publications using MonolixSuite

  • “Plasma pharmacokinetics of ceftiofur metabolite desfuroylceftiofur cysteine disulfide in holstein steers: application of nonlinear mixed-effects modeling.”, J Vet Pharmacol Ther 2016 Apr;39(2):149-56, DOI: 10.1111/jvp.12245. O. A. Chiesa, S. Feng, P. Kijak, E. A. Smith, H. Li and J. Qiu.
  • “Quantification of disease progression and dropout for Alzheimer’s disease.”, Int. Journal of Clinical Pharmacology and Therapeutics, Volume 51 – February (120 – 131), DOI: 10.5414/CP201787. D. William-Faltaos, Y. Chen, Y. Wang, J. Gobburu and H. Zhu.
  • “Estimation of Population Pharmacokinetic Parameters Using MLXTRAN Interpreter in MONOLIX 2.4”, D. William-Faltaos, ACoP 2009.

[3] Submission with Monolix analyses to the EMA

  • Procedure No. EMEA/H/C/000402/II/0110/G – EMA/CHMP/186699/2015 – Tamiflu – International non-proprietary name: OSELTAMIVIR.
  • Procedure No. EMEA/H/C/000401/II/0066 – EMA/168487/2015 – Tracleer – International non-proprietary name: BOSENTAN.

5.3.Running Monolix using a command line

Monolix can be run from a simple command line:

monolixSuiteInstallationFolder/bin/Monolix.sh --no-gui  -p fullPathProjectName

(replace .sh with .bat on Windows)

Notice that the project name should be defined using a full path and not a relative path. The program options are

  • --no-gui: run without opening a window; mandatory in environments without a desktop.
  • -p: project to run. It must be the full path of the project.
  • --thread: number of threads allowed for this run.
  • --mode: verbosity of the run information logged to the console. It can be “none”, “basic” (default value), or “complete”.

Note: if the plots task is selected in the scenario, and if “Export charts data” is selected in Monolix’s preferences, the charts data are saved in the result folder. Generating the interactive plots requires opening the project in the GUI.
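
When several projects must be run in batch, the command line above can be assembled from a script. The following is a minimal sketch in Python; the installation folder and project path are hypothetical, and passing the values of --thread and --mode as separate arguments is an assumption (the exact syntax is not specified above). The function only builds the command; it does not launch Monolix.

```python
import os

def build_monolix_command(install_dir, project_path, threads=None, mode=None,
                          windows=False):
    """Build the argument list for a non-GUI Monolix run (hypothetical helper)."""
    # The project must be given with a full path, not a relative one.
    if not os.path.isabs(project_path):
        raise ValueError("the project must be given with a full path")
    launcher = "Monolix.bat" if windows else "Monolix.sh"
    command = [install_dir + "/bin/" + launcher, "--no-gui", "-p", project_path]
    if threads is not None:
        # Assumption: the value follows the option as a separate argument.
        command += ["--thread", str(threads)]
    if mode is not None:
        if mode not in ("none", "basic", "complete"):
            raise ValueError("mode must be none, basic or complete")
        command += ["--mode", mode]
    return command

cmd = build_monolix_command("/opt/MonolixSuite2018R1",
                            "/home/user/run/project.mlxtran",
                            threads=4, mode="complete")
print(" ".join(cmd))
```

The resulting list can then be passed to subprocess.run(cmd) to launch the run.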

5.4.How to compute AUC using Monolix and Mlxtran

The area under the PK curve (AUC) is often needed as an important PK metric to link with the pharmacodynamic effect. We show here how the AUC can be computed within the Mlxtran model file and output for later analysis.

In the EQUATION section of the Mlxtran model you can use the following code. It is the basic implementation of the AUC for a one-compartment model with absorption rate ka. It computes the AUC from the start to the end of the integration.

Notice that the AUC can either be used as an output (to be matched to observations of the data set), in which case it is listed in the output={} definition, or simply be computed and made available for post-processing using table={}, as shown here.

AUC for t=0 to t=tend and fast computation with linear ODE

INPUT:
parameter = {ka, V, Cl}

PK:
compartment(cmt=1, amount=Ac, volume=V, concentration=Cc)
elimination(cmt=1, Cl)
oral(ka, cmt=1)

EQUATION:
odeType = linear
; AUC is the time integral of the concentration Cc = Ac/V
ddt_AUC = 1/V * Ac

OUTPUT:
output = {Cc}
table = {AUC}
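
As a sanity check of what this model computes, the same system can be integrated outside Monolix. The sketch below is written in Python with a hand-written Runge-Kutta scheme (not the Monolix solver) and assumes a single oral dose at t=0 with full bioavailability; the parameter values are hypothetical. In that setting the AUC accumulated over a long enough time should approach Dose/Cl.

```python
def simulate_auc(dose, ka, V, Cl, t_end, dt=0.01):
    """Integrate depot amount, central amount Ac and AUC with a fixed-step RK4."""
    k = Cl / V  # elimination rate constant

    def rhs(state):
        ad, ac, _auc = state
        # depot -> central (rate ka), elimination from central, dAUC/dt = Ac/V
        return (-ka * ad, ka * ad - k * ac, ac / V)

    state = (dose, 0.0, 0.0)  # the whole dose starts in the depot
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * dt * d for s, d in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * dt * d for s, d in zip(state, k2)))
        k4 = rhs(tuple(s + dt * d for s, d in zip(state, k3)))
        state = tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state[2]

# Hypothetical values: dose=100, ka=1, V=10, Cl=2
auc = simulate_auc(dose=100.0, ka=1.0, V=10.0, Cl=2.0, t_end=100.0)
print(auc, 100.0 / 2.0)  # numerical AUC versus Dose/Cl
```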

Time interval AUC

If you want to limit the computation of the AUC to a time period you can use the following code:

INPUT:
parameter = {ka, V, Cl}

PK:
depot(ka, target=Ac)

EQUATION:
odeType=stiff

Ac_0 = 0
ddt_Ac = -Cl/V*Ac
Cc = Ac/V

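; AUC from t=0 to t=50: the integrand is set to zero after t=50
AUC50_0 = 0
if(t < 50)
   dAUC50 = 1/V * Ac
else
   dAUC50 = 0
end
ddt_AUC50 = dAUC50

; AUC from t=0 to t=100: the integrand is set to zero after t=100
AUC100_0 = 0
if(t < 100)
   dAUC100 = 1/V * Ac
else
   dAUC100 = 0
end
ddt_AUC100 = dAUC100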

AUC50_100 = AUC100 - AUC50

OUTPUT:
output = {Cc}
table = {AUC50_100}

Note that a test such as t==tDose would not work, because the integrator does not necessarily evaluate the model exactly at the dose times. The condition t==tDose might therefore never be true during the computation.

Fast computation with linODE and dose interval AUC (AUCtau)

In many cases the Mlxtran code above will be sufficient. When fast execution is needed, you can use a special solver for linear ODE systems by specifying odeType = linear in the EQUATION section. This solver does not work with if statements, so you will need to use the code below instead. In this code the AUC is computed for each dose interval: at each dose the AUC is reset to zero and the concentration is integrated until the next administration.

INPUT:
parameter = {ka, V, Cl}

PK:
compartment(cmt=1, amount=Ac, volume=V, concentration=Cc)
elimination(cmt=1, Cl)
oral(ka, cmt=1)

; Create a separate dummy compartment for the AUC
compartment(cmt=2, amount=AUCtau)
iv(cmt=2, p= - AUCtau/amtDose)

EQUATION:
odeType=linear
ddt_AUCtau = 1/V * Ac

OUTPUT:
output = {Cc}
table = {AUCtau}
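
The reset mechanism relies on the iv macro applied to the dummy compartment: a dose of amount amtDose scaled by p = -AUCtau/amtDose adds -AUCtau to the compartment, which brings AUCtau back to zero at each dose. The sketch below mimics this behaviour in Python with a hand-written Runge-Kutta scheme (not the Monolix solver); the parameter values and the regular dosing schedule are hypothetical.

```python
def rk4_step(rhs, state, dt):
    """One fixed-step fourth-order Runge-Kutta step."""
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5 * dt * d for s, d in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5 * dt * d for s, d in zip(state, k2)))
    k4 = rhs(tuple(s + dt * d for s, d in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate_auc_tau(dose, ka, V, Cl, tau, n_doses, dt=0.01):
    """Return the AUC accumulated over each dosing interval of length tau."""
    k = Cl / V

    def rhs(state):
        ad, ac, _auc_tau = state
        return (-ka * ad, ka * ad - k * ac, ac / V)

    state = (0.0, 0.0, 0.0)  # depot amount, central amount Ac, AUCtau
    auc_per_interval = []
    for _ in range(n_doses):
        # Dose event: add the dose to the depot and reset AUCtau to zero,
        # as a bolus of -AUCtau into the dummy compartment would do.
        state = (state[0] + dose, state[1], 0.0)
        for _ in range(int(round(tau / dt))):
            state = rk4_step(rhs, state, dt)
        auc_per_interval.append(state[2])
    return auc_per_interval

aucs = simulate_auc_tau(dose=100.0, ka=1.0, V=10.0, Cl=2.0, tau=12.0, n_doses=10)
print(aucs[0], aucs[-1])  # first interval versus near steady state
```

With repeated identical doses, the value over an interval approaches Dose/Cl once steady state is reached.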

5.5.How to export to Datxplore, Mlxplore and Simulx?

Using the Monolix interface, it is possible to export your data set, your model and your project to other Lixoft software.

Export to Datxplore

Datxplore is a graphical and interactive software for the exploration and visualization of data. Datxplore provides various plots (box plots, histograms, survival curves…) to study the statistical properties of discrete and continuous data, and to analyze the behavior depending on covariates, individuals, etc. It is a great application to take a closer look at your data set.
In Monolix, you can export your data set by clicking on the “Data” frame (in green) and on the DATAVIEWER button (in blue) as shown in the following figure. The data viewer will open and has the same functionalities as Datxplore (the data viewer is Datxplore embedded in Monolix).

Export to Mlxplore

Mlxplore is an application for the visual and interactive exploration of your models. It is designed for intuitive and easy use. It is a powerful solution in your daily modeling work as well as for sharing and teaching of PK/PD principles or for real time dose-regimen exploration in front of an audience. Mlxplore also allows you to visualize the statistical components of the model, such as the impact of covariates and inter-individual variability.

In Monolix, you can export your model (longitudinal and statistical) and the data set by clicking on Export/Export to Mlxplore, as shown in the following figure. A window then pops up, where the user can define

  • The prediction/simulation grid
    • minimum
    • step
    • maximum
  • The considered individual (with its administration design)
    • Only the first one
    • All individuals
    • User choice

After clicking on Export/Export to Mlxplore, Mlxplore launches. Thus, you will be able to explore your model and associate it to the data set.

Notice that some cases are not managed by Mlxplore:

  • Models with a non-continuous output
  • Models with regressors
  • Models with bsmm functions

Another important point is that Mlxplore only manages predictions. Thus, even if a continuous error model is considered in the project, it will not be taken into account and the associated sliders have no impact on the results.

Moreover, when categorical covariates are considered, the following message appears in the export window:
[Figure: message displayed in the Mlxplore export window]

This implies that the categorical covariates are no longer treated as categories but as continuous covariates, with 0 for the reference category.

Export to Simulx

Simulx is a powerful and flexible simulator for clinical trial pharmacometrics that runs on top of the Lixoft simulation engine. It covers a comprehensive range of data types for simulation and links to a ready to use PK and PD library. It directly connects with Monolix for seamless Modeling and Simulation, and is the statistically most rigorous solution available today. Simulx is currently available via a comprehensive R package thanks to its combination with mlxR, a DDMoRe-sponsored library developed by Inria which contains several R functions for advanced clinical trial simulation.

The simulx function can either directly take the path to a Monolix project as input, or use a model file which can be generated from the Monolix project. This is done using the monolix2simulx function.

Several restrictions apply:

  • Projects with IOV cannot be exported.
  • Projects with custom distributions for individual parameters cannot be exported.
  • Projects with bsmm function in the longitudinal model cannot be exported.