Select Page

# Version 2016R1

This documentation is for Monolix Suite 2016R1.

## Monolix

Monolix (MOdèles NOn LInéaires à effets miXtes) is a platform of reference for model based drug development. It combines the most advanced algorithms with unique ease of use. Pharmacometricians of preclinical and clinical groups can rely on Monolix for population analysis and to model PK/PD and other complex biochemical and physiological processes. Monolix is an easy, fast and powerful tool for parameter estimation in non-linear mixed effect models, model diagnosis and assessment, and advanced graphical representation. Monolix is the result of a ten years research program in statistics and modeling, led by Inria (Institut National de la Recherche en Informatique et Automatique) on non-linear mixed effect models for advanced population analysis, PK/PD, pre-clinical and clinical trial modeling & simulation.

## Objectives

The objectives of Monolix are to perform:

1. Parameter estimation for nonlinear mixed effects models
• computing the maximum likelihood estimator of the population parameters, without any approximation of the model (linearization, quadrature approximation, …), using the Stochastic Approximation Expectation Maximization (SAEM) algorithm,
• computing standard errors for the maximum likelihood estimator
• computing the conditional modes, the conditional means and the conditional standard deviations of the individual parameters, using the Hastings-Metropolis algorithm
2. Model selection and diagnosis
• comparing several models using some information criteria (AIC, BIC)
• testing hypotheses using the Likelihood Ratio Test
• testing parameters using the Wald Test
3. Easy description of pharmacometric models (PK, PK-PD, discrete data) with the Mlxtran language
4. Goodness of fit plots

## An interface for ease of use

Monolix can be used either via a graphical interface or command lines for powerful scripting. This means less programming and more focus on exploring models and pharmacology to deliver in time. The interface can be seen here

One can see several blocks in the interface

• Data and model
• Initialization
• Results

Each of this block has a specific section on this website. An advanced description of available graphics is also provided.

### 2.Data and models

#### Creating and using models

• Libraries of models      learn how to use the Monolix libraries of PKPD models and create your own libraries
• Model implementation      learn how to create a project by defining the model using the Graphical User Interface (GUI) and/or an external model file
• Outputs and Tables     learn how to define outputs and create tables with selected outputs of the model
• Residual error model      learn how to use the predefined residual error models
• Handling censored data      learn how to handle easily and properly censored data, i.e. data below (resp. above) a lower (resp.upper) limit of quantification (LOQ) or detection (LOD).
• Mixture of structural models      learn how to implement between subject mixture models (BSMM) and within subject mixture models (WSMM).
• Time-to-event data model      learn how to implement a model for (repeated) time-to-event data
• Count data model      learn how to implement a model for count data, including hidden Markov model.
• Categorical data model      learn how to implement a model for categorical data, assuming either independence or a Markovian dependence between observations.

### 2.1.1.Libraries of models

Objectives: learn how to use the Monolix libraries of PKPD models and create your own libraries.

Projects: warfarinPKlibrary_project, PDsim_project, warfarinPKPDlibrary_project, warfarinPKPDelibrary_project, hcv_project

## The PK library

• theophylline_project (data = ‘theophylline_data.txt’ , model=’lib:oral1_1cpt_kaVCl.txt’)

A one compartment PK model for oral administration with first order absorption and linear elimination is used for fitting the theophylline data. The model oral1_1cpt_TlagkaVCl from the PK library is used:

Click on the button Model file to select or modify a model. Then, select the PK library from the menu Monolix

Click on the Edit button to edit the model:

[LONGITUDINAL]
input = {ka, V, Cl}
EQUATION:
Cc = pkmodel(ka, V, Cl)
OUTPUT:
output = Cc


The name of the PK model appears in the section [LONGITUDINAL] of the project file

<MODEL>
[INDIVIDUAL]
input = {ka_pop, omega_ka, V_pop, omega_V, Cl_pop, omega_Cl}
DEFINITION:
ka = {distribution=lognormal, typical=ka_pop, sd=omega_ka}
V = {distribution=lognormal, typical=V_pop, sd=omega_V}
Cl = {distribution=lognormal, typical=Cl_pop, sd=omega_Cl}
[LONGITUDINAL]
input = a
file = 'lib:oral1_1cpt_kaVCl.txt'
DEFINITION:
concentration = {distribution=normal, prediction=Cc, errorModel=constant(a)}


## The PD library

• PDsim_project (data = ‘PDsim_data.txt’ , model=’lib:immed_Emax_const.txt’)

An Emax PD model with constant baseline is used for this project.

## Link a PK model and a PD model

• warfarinPKlibrary_project (data = ‘warfarin_data.txt’ , model=’lib:oral1_1cpt_TlagkaVCl.txt’)

We first fit the PK data of the warfarin data using a model from the PK library

• warfarinPKPDlibrary_project (data = ‘warfarin_data.txt’ , model= {‘lib:oral1_1cpt_TlagkaVCl.txt’, ‘lib:immed_Imax_const.txt’})

To fit simultaneously the PK and the PD warfarin data, we need to link the PK model with a PD model. Click on the +

and select a model from the PD library. We will use here the immediate response model immed_Imax_const from the Monolix PD library

[LONGITUDINAL]
input = {Cc, Imax, IC50, S0}
Cc = {use=regressor}
EQUATION:
E = S0*(1 - Imax*Cc/(Cc+IC50))
OUTPUT:
output = E


Two models then appear in the list of model files:

## Link a PK model and a PD model with an effect compartment

• warfarinPKPDelibrary_project (data = ‘warfarin_data.txt’ , model= {‘lib:oral1_1cpt_TlagkaVClke0.txt’, ‘lib:immed_Imax_const_Ce.txt’})

Inserting an effect compartment is made possible by using the PKe library

and the PDe library

Then, these 2 models are linked for fitting simultaneously the PK and the PD warfarin data.

## Create and use your own libraries

• 8.case_studies/hcv_project (data = ‘hcv_data.txt’ , model=’VKlibrary/hcvNeumann98_model.txt’)

You can use the mlxEditor, or any other text editor, for creating new models and new libraries of models. A basic library which includes several viral kinetics model is available in 8.case_studies/model. Click on Other list for selecting this library or any other existing library of models.

### 2.1.2.Model implementation

Objectives: learn how to create a project by defining the model using the Graphical User Interface (GUI) and/or an external model file.

Projects: theo1_project, theo2_project, theo3_project, theo4_project

## Use an external file for the structural model

• theo1_project (data = ‘theophylline_data.txt’ , model= {‘theo1a_model.txt’, ‘theo1b_model.txt’})

In this example, the structural model is implemented in a external file.
You can edit this model by clicking on the mlxEditor button

theo1a_model.txt is a text file which contains an OUTPUT section where Cc is defined as the prediction for the observations

[LONGITUDINAL]
input = {ka, V, Cl}
EQUATION:
Cc=pkmodel(ka, V, Cl)
OUTPUT:
output = Cc


The other components of the model (the residual error model and the model for the individual PK parameters) are defined in the graphical user interface (GUI).

Instead of theo1a_model.txt, we can use theo1b_model.txt where no output is defined.

[LONGITUDINAL]
input = {ka, V, Cl}
EQUATION:
Cc=pkmodel(ka, V, Cl)


For changing the structural model, click on the button Model file and select the new model in the list

Since no output is defined in this model, the name of the variable used as a prediction for the observations should be provided in a GUI:

The project can be saved and used equivalently either with theo1a_model.txt or theo1b_model.txt.

## Use an external file for the structural model and the residual error model

• theo2_project (data = ‘theophylline_data.txt’ , model= {‘theo2a_model.txt’, ‘theo2b_model.txt’})

Both the structural model and the residual error model are defined in theo2a_model.txt

[LONGITUDINAL]
input = {ka, V, Cl, a}
EQUATION:
Cc=pkmodel(ka, V, Cl)
DEFINITION:
y1 = {distribution=normal, prediction=Cc, errorModel=constant(a) }
OUTPUT:
output = y1


Then, the residual error model cannot be modified anymore in the main GUI

The same model for the observations is implemented in theo2b_model.txt, but where no output is defined (y_1 should defined as the output in the GUI).

## Use an external file for the complete model

• theo3_project (data = ‘theophylline_data.txt’ , model= ‘theo3a_model.txt’)

The complete model is now defined in theo3a_model.txt

[COVARIATE]
input = {WEIGHT}
EQUATION:
lw70 = log(WEIGHT/70)
[INDIVIDUAL]
input = {Cl_pop, omega_Cl, V_pop, omega_V, ka_pop, beta_V_lw70, lw70, omega_ka}
DEFINITION:
Cl = {distribution=lognormal, typical=Cl_pop, sd=omega_Cl}
V = {distribution=lognormal, typical=V_pop, covariate=lw70, coefficient=beta_V_lw70, sd=omega_V}
ka = {distribution=lognormal, typical=ka_pop, sd=omega_ka}
[LONGITUDINAL]
input = {ka, V, Cl, a}
EQUATION:
Cc=pkmodel(ka, V, Cl)
DEFINITION:
y1 = { distribution=normal, prediction=Cc, errorModel=constant(a) }
OUTPUT:
output = y1


It is therefore not possible anymore to modify any component of the model from the GUI (all the graphical components are disabled).

## Define the complete model in the project file

• theo4_project (data = ‘theophylline_data.txt’, no model file)

The complete model is defined in the project itself. You can open the mlxEditor to edit the project itself

Section <MODEL> of the project contains the complete model.
Remark: Exactly the same syntax is used when the model – or a component of the model – is defined in the project or in an external file

### 2.1.3.Outputs and Tables

Objectives: learn how to define outputs and create tables from the outputs of the model.

Projects: tgi1_project, tgi2_project

• tgi1_project (data = ‘tgi_data.txt’ , model=’tgi1_model.txt’)

We use the Tumor Growth Inhibition (TGI) model proposed by Ribba *et al.* in this example (Ribba, B., Kaloshi, G., Peyre, M., Ricard, D., Calvez, V., Tod, M., . & Ducray, F., *A tumor growth inhibition model for low-grade glioma treated with chemotherapy or radiotherapy*. Clinical Cancer Research, 18(18), 5071-5080, 2012.)

[LONGITUDINAL]
input = {K, KDE, KPQ, KQPP, LAMBDAP, GAMMA, DELTAQP,PT0,Q0}
PK:
depot(target=C)
EQUATION:
t0=0
PT_0=PT0
Q_0 = Q0
PSTAR = PT+Q+QP
ddt_C = -KDE*C
ddt_PT = LAMBDAP*PT*(1-PSTAR/K) + KQPP*QP - KPQ*PT - GAMMA*KDE*PT*C
ddt_Q = KPQ*PT - GAMMA*KDE*Q*C
ddt_QP = GAMMA*KDE*Q*C - KQPP*QP - DELTAQP*QP


PSTAR is the tumor size predicted by the model. It is therefore used as a prediction for the observations in the project:

[LONGITUDINAL]
input = b
file = 'model/tgi1_model.txt'
DEFINITION:
y1 = {distribution=normal, prediction=PSTAR, errorModel=proportional(b)}


We can select which tables and graphics to save with the button Outputs to save

We select in this example the tables with predicted values computed at the observation times and on a finer grid (used for the graphics of individual fits)

The sequence of tasks selected in this project includes the estimation of the individual parameters (using the conditional models) and several diagnostic plots (including the individual fits).

Thus, individual predictions of the tumor size PSTAR are computed using both the conditional modes (indPred_mode) and the conditional means estimated during the last iterations of SAEM (indPred_mean*) and saved in table predictions.txt

and finegrid.txt

Remark: the same model file tgi1_model.txt can be used with different tools, including Mlxplore or Simulx (see this Shiny application for instance).

• tgi2_project (data = ‘tgi_data.txt’ , model=’tgi2_model.txt’)

We can save in the tables additional variables defined in the model, such as PT, Q and QP, for instance, by adding a block OUTPUT: in the model file:

OUTPUT:
output = PSTAR
table   = {PT, Q, QP}


Tables predictions.txt and finegrid.txt now include the predicted values of these variables for each individual (columns PT_mean*, Q_mean*, QP_mean*, PT_mode, Q_mode, QP_mode, PT_pop, Q_pop, QP_pop).

### 2.2.1.Residual error model

Objectives: learn how to use the predefined residual error models.

Projects: warfarinPKlibrary_project, bandModel_project, autocorrelation_project, errorGroup_project

## Introduction

For continuous data, we are going to consider scalar outcomes ($y_{ij}\in \mathbb{R}$) and assume the following general model:

$y_{ij}=f(t_{ij},\psi_i)+ g(t_{ij},\psi_i,\xi_i)\varepsilon_{ij}$

for i from 1 to N, and j from 1 to $\text{n}_{i}$, where $\psi_i$ is the parameter vector of the structural model f for individual i. The residual error model is defined by the function g which depend on some additional vector of parameters $\xi$. The residual errors $(\varepsilon_{ij})$ are standardized Gaussian random variables (mean 0 and standard deviation 1). In this case, it is clear that $f(t_{ij},\psi_i)$ and $g(t_{ij},\psi_i,\xi)$ are the conditional mean and standard deviation of $y_{ij}$, i.e.,

\begin{aligned}\mathbb{E}(y_{ij} | \psi_i) &= f(t_{ij},\psi_i) \\\textrm{sd}(y_{ij} | \psi_i) &= g(t_{ij},\psi_i,\xi)\end{aligned}

The following error models are available in Monolix:

• constant : $y = f + a\, \varepsilon$ and $\xi=a$
• proportional : $y = f + b\,f\, \varepsilon$ and $\xi=b$
• combined1 : $y = f + (a+ b\,f) \varepsilon$ and $\xi=(a,b)$
• combined2 : $y = f + \sqrt{a^2+ b^2\,f^2} \varepsilon$ and $\xi=(a,b)$
• proportionalc : $y = f + b\,f^c\, \varepsilon$ and $\xi=(b,c)$
• combined1c : $y = f + (a+ b\,f^c) \varepsilon$ and $\xi=(a,b,c)$
• combined2c : $y = f + \sqrt{a^2+ b^2\,f^{2c}} \varepsilon$ and $\xi=(a,b,c)$

The assumption that the distribution of any observation $y_{ij}$ is symmetrical around its predicted value is a very strong one. If this assumption does not hold, we may want to transform the data to make it more symmetric around its (transformed) predicted value. In other cases, constraints on the values that observations can take may also lead us to transform the data.
Model can be extended to include a transformation of the data:

$u(y_{ij})=u(f(t_{ij},\psi_i))+ a\varepsilon_{ij}$

where is a monotonic transformation (a strictly increasing or decreasing function). As we can see, both the data $y_{ij}$ and the structural model are transformed by the function u so that $f(t_{ij},\psi_i)$ remains the prediction of $y_{ij}$. Several error models based on such transformation are available in Monolix:

• exponential : $u(y) = \log(y)$ (assuming that $y >0$) and $\xi=a$. This is equivalent to assume that $y_{ij}=f(t_{ij},\psi_i))e^{ a\varepsilon_{ij}}$.
• logit : $u(y) = \textrm{logit}(y)=\log(y/(1-y))$ (assuming that $0\leq y\leq 1$).
• band(0,10) : $u(y) = \textrm{logit}(y/10)=\log(y/(10-y))$ (assuming that $0).
• band(0,100) : $u(y) = \textrm{logit}(y/100)=\log(y/(100-y))$ (assuming that $0).

## Defining the residual error model from the Monolix GUI

• A menu in the frame Data and model of the main GUI allows one to select the residual error model:

• a summary of the statistical model which includes the residual error model can be displayed:

• when a combined 1 error model is used, we can force parameter b to be positive by ticking the box b>0:

• Autocorrelation of the residual errors is estimated when the checkbox r is ticked:

## Some basic residual error models

• warfarinPKlibrary_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

The residual error model used with this project for fitting the PK of warfarin is a constant error model

$y_{ij} = f(t_{ij},\psi_i))+ a\varepsilon_{ij}$

Several diagnostic plots can then be used for evaluating the error model. Here, the VPC’s suggests that a proportional component perhaps should be included in the error model:

We can modify the residual error model and select, for instance, a combined1 error model directly from the menu in the GUI:

Estimated population parameters now include a and b

VPCs obtained with this error model do not show any mispecification

Remarks:

• When the residual error model is defined in the GUI, a bloc DEFINITION: is then automatically added to the project file in the section [LONGITUDINAL] of <MODEL> when the project is saved:
DEFINITION:
y1 = {distribution=normal, prediction=Cc, errorModel=combined(a,b)}

• the statistical summary includes the residual error model

## Residual error models for bounded data

• bandModel_project (data = ‘bandModel_data.txt’, model = ‘lib:immed_Emax_null.txt’)

In this example, data are known to take their values between 0 and 100. We can use a band(0,100) error model if we want to take this constraint into account.

VPCs obtained with this error model do not show any mispecification

The statistical summary includes this residual error model:

This residual error model is implemented in Mlxtran as follows:

DEFINITION:
effect = {distribution=logitnormal, min=0, max=100, prediction=E, errorModel=constant(a)}


## Autocorrelated residuals

For any subject i, the residual errors $(\varepsilon_{ij},1\leq j \leq n_i)$ are usually assumed to be independent random variables. The extension to autocorrelated errors is possible by assuming, that $(\varepsilon_{ij})$ is a stationary autoregressive process of order 1, AR(1), which autocorrelation decreases exponentially:

$\textrm{corr}(\varepsilon_{ij},\varepsilon_{i,{j+1}}) = r_i^{(t_{i,j+1}-t_{ij})}$

where $0\leq r_i \leq 1$ for each individual i. If $t_{ij}=j$ for any (i,j), then $t_{i,j+1}-t_{i,j}=1$ and the autocorrelation function $\gamma_i$ for individual i is given by

\begin{aligned}\gamma_i(\tau) &= {\rm corr}(\varepsilon_{ij},\varepsilon_{i,j+\tau}) \\&= r_i^{\tau} .\end{aligned}

The residual errors are uncorrelated when $r_i=0$.

• autocorrelation_project (data = ‘autocorrelation_data.txt’, model = ‘lib:infusion_1cpt_Vk.txt’)

Autocorrelation is estimated since the checkbox r is ticked in this project:

Estimated population parameters now include the autocorrelation r:

Remarks:

• Monolix accepts both regular and irregular time grids.
• Rich data are required (i.e. a large number of time points per individual) for estimating properly the autocorrelation structure of the residual errors.

## Using different error models per group/study

• errorGroup_project (data = ‘errorGroup_data.txt’, model = ‘errorGroup_model.txt’)

Data comes from 3 different studies in this example:

We want to use different error models for the 3 studies. A solution consists in defining the column STUDY with the reserved keyword YTYPE. It will be then possible to define one error model per outcome:

We use here the same PK model for the 3 studies:

[LONGITUDINAL]
input = {V, k}
PK:
Cc1 = pkmodel(V, k)
Cc2 = Cc1
Cc3 = Cc1
OUTPUT:
output = {Cc1, Cc2, Cc3}


Since 3 outputs are defined in the structural model, we can now define 3 error models in the GUI:

Different residual error parameters are estimated for the 3 studies. We can remark than, even if 2 proportional error models are used for the 2 first studies, different parameters $b_1$ and $b_2$ are estimated:

### 2.2.2.Handling censored (BLQ) data

Objectives: learn how to handle easily and properly censored data, i.e. data below (resp. above) a lower (resp.upper) limit of quantification (LOQ) or below a limit of detection (LOD).

Projects: censored1log_project, censored1_project, censored2_project, censored3_project, censored4_project

## Introduction

Censoring occurs when the value of a measurement or observation is only partially known. For continuous data measurements in the longitudinal context, censoring refers to the values of the measurements, not the times at which they were taken. For example, the lower limit of detection (LLOD) is the lowest quantity of a substance that can be distinguished from its absence. Therefore, any time the quantity is below the LLOD, the “observation” is not a measurement but the information that the measured quantity is less than the LLOD. Similarly, in longitudinal studies of viral kinetics, measurements of the viral load below a certain limit, referred to as the lower limit of quantification (LLOQ), are so low that their reliability is considered suspect. A measuring device can also have an upper limit of quantification (ULOQ) such that any value above this limit cannot be measured and reported.
As hinted above, censored values are not typically reported as a number, but their existence is known, as well as the type of censoring. Thus, the observation $y^{(r)}_{ij}$ (i.e., what is reported) is the measurement $y_{ij}$ if not censored, and the type of censoring otherwise.
We usually distinguish between three types of censoring: left, right and interval. In each case, the SAEM algorithm implemented in Monolix properly computes the maximum likelihood estimate of the population parameters, combining all the infomation provided by censored and non censored data.

## Theory

In the presence of censored data, the conditional density function needs to be computed carefully. To cover all three types of censoring (left, right, interval), let $I_{ij}$ be the (finite or infinite) censoring interval existing for individual i at time $t_{ij}$. Then,

$\displaystyle p(y^{(r)}|\psi)=\prod_{i=1}^{N}\prod_{j=1}^{n_i}p(y_{ij}|\psi_i)^{1_{y_{ij}\notin I_{ij}}}\mathbb{P}(y_{ij}\in I_{ij}|\psi_i)^{1_{y_{ij}\in I_{ij}}}$

where

$\displaystyle \mathbb{P}(y_{ij}\in I_{ij}|\psi_i)=\int_{I_{ij}} p_{y_{ij}|\psi_i} (u|\psi_i)du$

We see that if $y_{ij}$ is not censored (i.e. $1_{y_{ij}\notin I_{ij}}=1$), its contribution to the likelihood is the usual $p(y_{ij}|\psi_i)$, whereas if it is censored, the contribution is $\mathbb{P}(y_{ij}\in I_{ij}|\psi_i)$.

For the calculation of the likelihood, this is equivalent to the M3 method in NONMEM when only the CENS column is given, and to the M4 method when both a CENS column and a LIMIT column are given.

## PK data below a lower limit of quantification

### Left censored data

• censored1log_project (data = ‘censored1log_data.txt’, model = ‘pklog_model.txt’)

PK data are log-concentration in this example. The limit of quantification of 1.8 mg/l for concentrations becomes log(1.8)=0.588 for log-concentrations. Column of observations (Y) contains either the LLOQ for data below the limit of quantification (BLQ data) or the measured log-concentrations for non BLQ data. Furthermore, Monolix uses an additional column CENS to indicate if an observation is left censored (CENS=1) or not (CENS=0). In this example, subject 1 has two BLQ data at time 24h and 30h (the measured log-concentrations were below 0.588 at these times):

Monolix then recognized this keyword CENS:

The plot of individual fits displays BLQ and non BLQ data together with the predicted log-concentrations on the whole time interval:

For diagnostic plots such as VPC, residuals of observations versus predictions, Monolix proposes to sample the BLQ data from the conditional distribution

$p(y^{BLQ} | y^{non BLQ}, \hat{psi}, \hat{\theta})$

where $\hat{\theta}$ and $\hat{\psi}$ are the estimated population and individual parameters. This is the most efficient way to take into account the complete information provided by the data and the model for diagnostic plots such as VPCs:

A strong bias appears if LLOQ is used instead for the BLQ data:

Ignoring the BLQ data entails a loss of information:

Imputed BLQ data is also used for residuals:

and for observations versus predictions:

A strong bias appears if LLOQ is used instead for the BLQ data for these two diagnostic plots:

while ignoring the BLQ data entails a loss of information:

Diagnostic plot BLQ plots the cumulative fraction of BLQ data (green line) with a 90

### Interval censored data

• censored1_project (data = ‘censored1_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

We use the original concentrations in this project. Then, BLQ data should be treated as interval censored data since a concentration is know to be positive. In other word, a data reported as BLQ data means that the (non reported) measured concentration is between 0 and 1.8mg/l. An additional column LIMIT reports the lower limit of the censored interval (0 in this example):

Remark: if this column is missing, then BLQ data is assumed to be left-censored data that can take any positive and negative value below LLOQ.

Monolix recognized this keyword LIMIT:

Monolix will use this additional information for properly estimating the parameters of the model and imputing the BLQ data for the diagnostic plots.

## PK data below a lower limit of quantification or below a limit of detection

• censored2_project (data = ‘censored2_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

Several censoring processes can be combined in the same data set, assuming different limits. We may know, for instance that a PK data is either below a limit of detection ($y_{ij} <$LLOD=1.2) or between the limit of detection and the limit of quantification (LLOD=1.2$y_{ij} <$LLOQ=1.8). Column LIMIT and Y report respectively the lower and upper limits of the censoring intervals:

Plot of individual fits now displays LLOQ or LLOD with a red star when a PK data is censored:

## PK data below a lower limit of quantification and PD data above an upper limit of quantification

• censored3_project (data = ‘censored3_data.txt’, model = ‘pkpd_model.txt’)

We work with PK and PD data in this project. We assume that the PD data may be right censored and that the upper limit of quantification is ULOQ=90. We use CENS=-1 to indicate that an observation is right censored. In such case, the PD data can take any value above the upper limit reported in column Y (here YTYPE=1 and YTYPE=2 are used respectively for PK and PD data):

Plot of individual fits for the PD data now displays ULOQ and the predicted PD profile:

We can display the cumulated fraction of censored data both for the PK and the PD data:

## Combination of interval censored PK and PD data

• censored4_project (data = ‘censored4_data.txt’, model = ‘pkpd_model.txt’)

We assume in this example

• 2 different censoring intervals(0,1) and (1.2, 1.8) for the PK,
• a censoring interval (80,90) and right censoring (>90) for the PD.

Combining columns CENS, LIMIT and Y allow to combine efficiently these different censoring processes:

This coding of the data means that, for subject 1,

• PK data is between 0 and 1 at time 30h,
• PK data is between 1.2 and 1.8 at times 0.5h and 24h,
• PD data is between 80 and 90 at times 12 and 16,
• PD data is above 90 at times 4 and 8.

Plot of individual fits for the PK and the PD data displays the different limits of these censoring intervals:

Other diagnostic plots, such as the plot of observations versus predictions, adequately use imputed censored PK and PD data:

## Case studies

• 8.case_studies/hiv_project (data = ‘hiv_data.txt’, model = ‘hivLatent_model.txt’)
• 8.case_studies/hcv_project (data = ‘hcv_data.txt’, model = ‘hcvNeumann98_model_latent.txt’)

### 2.2.3.Mixture of structural models

Objectives: learn how to implement between subject mixture models (BSMM) and within subject mixture models (WSMM).

Projects: bsmm1_project, bsmm2_project, wsmm_project

## Introduction

There exist several types of mixture models useful in the context of mixed effects models. It may be necessary in some situations to introduce diversity into the structural models themselves:

• Between-subject model mixtures (BSMM) assume that there exist subpopulations of individuals. Different structural models describe the response of each subpopulation, and each subject belongs to one of these subpopulations. One can imagine for example different structural models for responders, nonresponders and partial responders to a given treatment.

The easiest way to model a finite mixture model is to introduce a label sequence $(z_i ; 1\leq i \leq N)$ that takes its values in $\{1,2,\ldots,M\}$ such that $z_i = m$ if subject i belongs to subpopulation m$\mathbb{P}(z_i=m)$ is the probability for subject i to belong to subpopulation m. A BSMM assumes that the structural model is a mixture of M different structural models:

$$f\left( t_{ij};\psi_i,z_i \right) = \sum_{m=1}^M 1_{z_i = m} f_m\left( t_{ij};\psi_i \right)$$

In other word, each subpopulation has its own structural model: $f_m$ is the structural model for subpopulation m.

• Within-subject model mixtures (WSMM) assume that there exist subpopulations (of cells, viruses, etc.) within each patient. In this case, different structural models can be used to describe the response of different subpopulations, but the proportion of each subpopulation depends on the patient.

Then, it makes sense to consider that the mixture of models happens within each individual. Such within-subject model mixtures require additional vectors of individual parameters $\pi_i=(\pi_{1,i},\ldots, \pi_{M,i})$ representing the proportions of the M models within each individual i:

$f\left( t_{ij};\psi_i,z_i \right) = \sum_{m=1}^M \pi_{m,i} f_m\left( t_{ij};\psi_i \right) .$

The proportions $(\pi_{m,i})$ are now individual parameters in the model and the problem is transformed into a standard mixed effects model. These proportions are assumed to be positive and to sum to 1 for each patient.

## Between subject mixture models

### Supervised learning

• bsmm1_project (data = ‘pdmixt1_data.txt’, model = ‘bsmm1_model.txt’)

We consider a very simple example here with two subpopulations of individuals who receive a given treatment. The outcome of interest is the measured effect of the treatment (a viral load for instance). The two populations are non responders and responders. We assume here that the status of the patient is known. Then, the data file contains an additional column GROUP. This column is duplicated because Monolix uses it

• i) as a regression variable (X): it is used in the model to distinguish responders and non responders,
• ii) as a categorical covariate (CAT): it is used to stratify the diagnostic plots.

We can then display the data

and use the categorical covariate GROUP_CAT to split the plot into responders and non responders:

We use different structural models for non responders and responders. The predicted effect for non responders is constant:

$f_1(t) = A_1$

while the predicted effect for responders decreases exponentially:

$f_2(t) = A_2 e^{-k\, t}$

The model is implemented in the model file bsmm1_model.txt (remark that the names of the regression variable in the data file and in the model script do not need to match):

[LONGITUDINAL]
input = {A1, A2, k, g}
g = {use=regressor}
EQUATION:
if g==1
f = A1
else
f = A2*exp(-k*max(t,0))
end
OUTPUT:
output = f


The plot of individual fits exhibit the two different structural models:

VPCs should then be splitted according to the GROUP_CAT

as well as the prediction distribution for non responders and responders:

### Unsupervised learning

• bsmm2_project (data = ‘pdmixt2_data.txt’, model = ‘bsmm2_model.txt’)

The status of the patient is unknown in this project (which means that the column GROUP is not available anymore):

Let p be the proportion of non responders in the population. Then, the structural model for a given subject is $f_1$ with probability p and $f_2$ with probability 1-p. The structural model is therefore a BSMM:

[LONGITUDINAL]
input = {A1, A2, k, p}
EQUATION:
f1 = A1
f2 = A2*exp(-k*max(t,0))
f=bsmm(f1, p, f2, 1-p)
OUTPUT:
output = f


Important: p is a population parameter of the model to estimate. There is no inter-patient variability on p: all the subjects have the same probability to be a non responder in this example. We use a logit-normal distribution for p  in order to constrain it to be between 0 and 1, but without variability:

p is estimated with the other population parameters:

The individual parameters are estimated using the conditional modes in this project:

Then, the group to which a patient belong is also estimated as the group of highest conditional probability:

\begin{aligned}\hat{z}_i &= 1~~~~\textrm{if}~~~~ \mathbb{P}(z_i=1 | (y_{ij}),\hat{\psi}_i,\hat{\theta})> \mathbb{P}(z_i=2 | (y_{ij}),\hat{\psi}_i,\hat{\theta})\\&=0~~~~\textrm{otherwise}\end{aligned}

The estimated groups can be displayed together with the individual parameters:

and can be used as a stratifying variable to split some plots such as VPC’s

## Within subject mixture models

• wsmm_project (data = ‘pdmixt2_data.txt’, model = ‘wsmm_model.txt’)

It may be too simplistic to assume that each individual is represented by only one well-defined model from the mixture. We consider here that the mixture of models happens within each individual and use a WSMM:

$f = pf_1+(1-p)f_2$

[LONGITUDINAL]
input = {A1, A2, k, p}
EQUATION:
f1 = A1
f2 = A2*exp(-k*max(t,0))
f=wsmm(f1, p, f2, 1-p)
OUTPUT:
output = f


Remark: Here, writing f=wsmm(f1, p, f2, 1-p) is equivalent to write f=p*f1 + (1-p)*f2
Important: Here, p is an individual parameter: the subjects have different proportions of non responder cells. We use a probit-normal distribution for p in order to constrain it to be between 0 and 1, with variability:

There is no latent covariate when using WSMM: mixtures are continuous mixtures. We therefore cannot split anymore the VPCs and the prediction distributions.

### 2.3.1.Time-to-event data models

Objectives: learn how to implement a model for (repeated) time-to-event data with different censoring processes.

Projects: tte1_project, tte2_project, tte3_project, tte4_project, rtteWeibull_project, rtteWeibullCount_project

## Introduction

Here, observations are the “times at which events occur”. An event may be one-off (e.g., death, hardware failure) or repeated (e.g., epileptic seizures, mechanical incidents, strikes). Several functions play key roles in time-to-event analysis: the survival, hazard and cumulative hazard functions. We are still working under a population approach here so these functions, detailed below, are thus individual functions, i.e., each subject has its own. As we are using parametric models, this means that these functions depend on individual parameters $(\psi_i)$.

• The survival function $S(t, \psi_i)$ gives the probability that the event happens to individual i after time $t>t_{\text{start}}$:

$S(t,\psi_i) \ \ = \ \ \mathbb{P}(T_i>t;\psi_i) .$

• The hazard function $h(t,\psi_i)$ is defined for individual i as the instantaneous rate of the event at time t, given that the event has not already occurred:

$h(t,\psi_i) \ \ = \ \ \lim_{dt\to 0} \frac{S(t,\psi_i) - S(t + dt,\psi_i)}{ S(t,\psi_i) \, dt} .$

This is equivalent to

$h(t,\psi_i) \ \ = \ \ -\frac{d}{dt} \log{S(t,\psi_i)} .$

• Another useful quantity is the cumulative hazard function $H(a,b;\psi_i)$, defined for individual i as

$H(a,b;\psi_i) \ \ = \ \ \int_a^b h(t,\psi_i) \, dt .$

Note that $S(t,\psi_i) \ \ = \ \ e^{-H(t_{\text{start}},t;\psi_i)}$. Then, the hazard function $h(t,\psi_i)$ characterizes the problem, because knowing it is the same as knowing the survival function $S(t,\psi_i)$. The probability distribution of survival data is therefore completely defined by the hazard function.

## Single event

To begin with, we will consider a one-off event. Depending on the application, the length of time to this event may be called the survival time (until death, for instance), failure time (until hardware fails), and so on. In general, we simply say “time-to-event”. The random variable representing the time-to-event for subject i is typically written Ti.

### Single event exactly observed or right censored

• tte1_project (data = tte1_data.txt , model=tte1_model.txt)

The event time may be exactly observed at time $t_i$, but if we assume that the trial ends at time $t_{\text{stop}}$, the event may happen after the end. This is “right censoring”. Here, Y=0 at time t means that the event happened after t and Y=1 means that the event happened at time t. The rows with t=0 are included to show the trial start time $t_{\text{start}}=0$:

By clicking on the button plot the data, it is possible to display the Kaplan Meier plot (i.e. the empirical survival function) before fitting any model:

A very basic model with constant hazard is used for this data:

[LONGITUDINAL]
input = Te
DEFINITION:
Event = { type = event,  hazard = 1/Te,  maxEventNumber = 1 }


Here, Te is the expected time to event. Specification of the maximum number of events is required both for the estimation procedure and for the diagnostic plots based on simulation, such as the predicted interval for the Kaplan Meier plot which is obtained by Monte Carlo:

### Single event interval censored or right censored

• tte2_project (data = tte2_data.txt , model=tte2_model.txt)

We may know the event has happened in an interval $I_i$ but not know the exact time $t_i$. This is interval censoring. Here, Y=0 at time t means that the event happened after t and Y=1 means that the event happened before time t.

Event for individual 1 happened between t=10 and t=15. No event was observed until the end of the experiment (t=100) for individual 5. We use the same basic model, but we need now to specify that the events are interval censored:

[LONGITUDINAL]
input = Te
DEFINITION:
Event = {type=event,
hazard=1/Te,
eventType=intervalCensored,
maxEventNumber=1,
intervalLength=5,       ; used for the graphics (not mandatory)
rightCensoringTime=200  ; used for the graphics (not mandatory)
}


## Repeated events

Sometimes, an event can potentially happen again and again, e.g., epileptic seizures, heart attacks. For any given hazard function h, the survival function S for individual i now represents the survival since the previous event at $t_{i,j-1}$, given here in terms of the cumulative hazard from $t_{i,j-1}$ to $t_{i,j}$:

$\begin{array}{ccl}S(t_{i,j} | t_{i,j-1};\psi_i) &=& \mathbb{P}(T_{i,j} > t_{i,j}\,|\,T_{i,j-1} = t_{i,j-1};\psi_i)\\&=&\exp(-H(t_{i,j-1},t_{i,j};\psi_i)) \\ &=&\exp(-\int_{t_{i,j-1}}^{t_{i,j}}h(t,\psi_i) dt) \end{array}$

### Repeated events exactly observed or right censored

• tte3_project (data = tte3_data.txt , model=tte3_model.txt)

A sequence of $n_i$ event times is precisely observed before $t_{\text{stop}}=200$:

We can then display the Kaplan Meier plot for the first event and the mean number of events per individual:

After fitting the model, prediction intervals for these two curves can also be displayed on the same graph.

### Repeated events interval censored or right censored

• tte4_project (data = tte4_data.txt , model=tte4_model.txt)

We do not know the exact event times, but the number of events that occurred for each individual in each interval of time:

## User defined likelihood function for time-to-event data

• rtteWeibull_project (data = rtte_data.txt , model=rtteWeibull_model.txt)

A Weibull model is used in this example:

EQUATION:
h = (beta/lambda)*(t/lambda)^(beta-1)
DEFINITION:
Event = {type=event,  hazard=h}

• rtteWeibullCount_project (data = rtteCount_data.txt , model=rtteWeibullCount_model.txt)

Instead of defining the data as events, it is possible to consider the data as count data: indeed, we count the number of events per interval. An additional column with the start of the interval is added in the data file and defined as a regression variable:

We then use a model for count data (see rtteWeibullCount_model.txt).

### 2.3.2.Count data model

Objectives: learn how to implement a model for count data.

Projects: count1a_project, count1a_project, count1a_project, count2_project, hmm0_project, hmm1_project

## Introduction

Longitudinal count data is a special type of longitudinal data that can take only nonnegative integer values {0, 1, 2, …} that come from counting something, e.g., the number of seizures, hemorrhages or lesions in each given time period . In this context, data from individual j is the sequence $y_i=(y_{ij},1\leq j \leq n_i)$ where $y_{ij}$ is the number of events observed in the jth time interval $I_{ij}$.
Count data models can also be used for modeling other types of data such as the number of trials required for completing a given task or the number of successes (or failures) during some exercise. Here, $y_{ij}$ is either the number of trials or successes (or failures) for subject i at time $t_{ij}$. For any of these data types we will then model $y_i=(y_{ij},1\leq j \leq n_i)$ as a sequence of random variables that take their values in {0, 1, 2, …}.  If we assume that they are independent, then the model is completely defined by the probability mass functions $\mathbb{P}(y_{ij}=k)$ for $k \geq 0$ and $1 \leq j \leq n_i$. Here, we will consider only parametric distributions for count data.

## Count data with constant distribution over time

• count1a_project (data = ‘count1_data.txt’, model = ‘count_library/poisson_mlxt.txt’)

A Poisson model is used for fitting the data:

[LONGITUDINAL]
input = lambda
DEFINITION:
Y = { type = count,  log(P(Y=k)) = -lambda + k*log(lambda) - factln(k) }


Residuals for noncontinuous data reduce to NPDE’s. We can compare the empirical distribution of the NPDE’s with the distribution of a standardized normal distribution:

VPC’s for count data compare the observed and predicted frequencies of the categorized data over time:

• count1b_project (data = ‘count1_data.txt’, model = ‘count_library/poissonMixture_mlxt.txt’)

A mixture of 2 Poisson distributions is used to fit the same data.

## Count data with time varying distribution

• count2_project (data = ‘count2_data.txt’, model = ‘count_library/poissonTimeVarying_mlxt.txt’)

The distribution of the data changes with time in this example:

We then use a Poisson distribution with a time varying intensity:

[LONGITUDINAL]
input =  {a,b}

EQUATION:
lambda= a*exp(-b*t)

DEFINITION:
y = {type=count, P(y=k)=exp(-lambda)*(lambda^k)/factorial(k)}


This model seems to fit the data very well:

## Hidden Markov model for count data

Markov chains are a useful tool for analyzing categorical longitudinal data. However, sometimes the Markov process cannot be directly observed and only some output, dependent on the (hidden) state, is seen. More precisely, we assume that the distribution of this observable output depends on the underlying hidden state. Such models are called hidden Markov models. An HMM is thus defined for individual $i$ as a pair of processes $(z_{ij},y_{ij},\, j=1,2,\ldots)$, where the latent sequence $(z_{ij})$ is a Markov chain and the distribution of observation $y_{ij}$ at time $t_{ij}$ depends on state $z_{ij}$.
In the following example, the latent sequence $(z_{ij})$ take its values in $\{1, 2\}$. When $z_{ij}=1$ (resp. $z_{ij}=2$), $y_{ij}$ is a Poisson random variable with parameter $\lambda_1$ (resp. $\lambda_2$).
Remark: Models in the 2 following examples are implemented in Matlab (implementation with Mlxtran is not possible with the current version of Monolix). These models can be used with both the StandAlone and Matlab versions of Monolix, but they can only be modified with the Matlab version.

• hmm0_project (data = ‘hmm_data.txt’, model = ‘count_library/hmm0_mlx.m’)

The latent sequence $(z_{ij})$ is a sequence of independent and identically distributed random variables.

• hmm1_project (data = ‘hmm_data.txt’, model = ‘count_library/hmm1_mlx.m’)

The latent sequence $(z_{ij})$ is a (hidden) Markov chain.

### 2.3.3.Categorical data model

Objectives: learn how to implement a model for categorical data, assuming either independence or a Markovian dependence between observations.

Projects: categorical1_project, categorical2_project, markov0_project, markov1a_project, markov1b_project, markov1c_project, markov2_project, markov3a_project, markov3b_project

## Introduction

Assume now that the observed data takes its values in a fixed and finite set of nominal categories $\{c_1, c_2,\ldots , c_K\}$. Considering the observations $(y_{ij},\, 1 \leq j \leq n_i)$ for any individual $i$ as a sequence of conditionally independent random variables, the model is completely defined by the probability mass functions $\mathbb{P}(y_{ij}=c_k | \psi_i)$ for $k=1,\ldots, K$ and $1 \leq j \leq n_i$. For a given $(i,j)$, the sum of the $K$ probabilities is 1, so in fact only $K-1$ of them need to be defined. In the most general way possible, any model can be considered so long as it defines a probability distribution, i.e., for each $k$, $\mathbb{P}(y_{ij}=c_k | \psi_i) \in [0,1]$, and $\sum_{k=1}^{K} \mathbb{P}(y_{ij}=c_k | \psi_i) =1$. Ordinal data further assume that the categories are ordered, i.e., there exists an order $\prec$ such that

$c_1 \prec c_2,\prec \ldots \prec c_K .$

We can think, for instance, of levels of pain (low $\prec$ moderate $\prec$ severe) or scores on a discrete scale, e.g., from 1 to 10. Instead of defining the probabilities of each category, it may be convenient to define the cumulative probabilities $\mathbb{P}(y_{ij} \preceq c_k | \psi_i)$ for $k=1,\ldots ,K-1$, or in the other direction: $\mathbb{P}(y_{ij} \succeq c_k | \psi_i)$ for $k=2,\ldots, K$. Any model is possible as long as it defines a probability distribution, i.e., it satisfies

$0 \leq \mathbb{P}(y_{ij} \preceq c_1 | \psi_i) \leq \mathbb{P}(y_{ij} \preceq c_2 | \psi_i)\leq \ldots \leq \mathbb{P}(y_{ij} \preceq c_K | \psi_i) =1 .$

It is possible to introduce dependence between observations from the same individual by assuming that $(y_{ij},\,j=1,2,\ldots,n_i)$ forms a Markov chain. For instance, a Markov chain with memory 1 assumes that all that is required from the past to determine the distribution of $y_{ij}$ is the value of the previous observation $y_{i,j-1}$., i.e., for all $k=1,2,\ldots ,K$,

$\mathbb{P}(y_{ij} = c_k\,|\,y_{i,j-1}, y_{i,j-2}, y_{i,j-3},\ldots,\psi_i) = \mathbb{P}(y_{ij} = c_k | y_{i,j-1},\psi_i).$

## Ordered categorical data

• categorical1_project (data = ‘categorical1_data.txt’, model = ‘categorical1_model.txt’)

In this example, observations are ordinal data that take their values in {0, 1, 2, 3}:

• Cumulative odds ratio are used in this example to define the model

$\textrm{logit}(\mathbb{P}(y_{ij} \leq k))= \log \left( \frac{\mathbb{P}(y_{ij} \leq k)}{1 - \mathbb{P}(y_{ij} \leq k )} \right)$

where

$\begin{array}{ccl} \text{logit}(\mathbb{P}(y_{ij} \leq 0)) &=& \theta_{i,1}\\\text{logit}(\mathbb{P}(y_{ij} \leq 1)) &=& \theta_{i,1}+\theta_{i,2}\\\text{logit}(\mathbb{P}(y_{ij} \leq 2)) &=& \theta_{i,1}+\theta_{i,2}+\theta_{i,3}\\\end{array}$

This model is implemented in categorical1_model.txt:

[LONGITUDINAL]
input = {th1, th2, th3}
EQUATION:
lgp0 = th1
lgp1 = lgp0 + th2
lgp2 = lgp1 + th3
DEFINITION:
level = { type = categorical,  categories = {0, 1, 2, 3},
logit(P(level<=0)) = th1
logit(P(level<=1)) = th1 + th2
logit(P(level<=2)) = th1 + th2 + th3
}


A normal distribution is used for $\theta_{i,1}$, while log-normal distributions for $\theta_{2}$ and $\theta_{3}$ ensure that these parameters are positive (even without variability)
Residuals for noncontinuous data reduce to NPDE’s. We can compare the empirical distribution of the NPDE’s with the distribution of a standardized normal distribution:

VPC’s for categorical data compare the observed and predicted frequencies of each category over time:

The prediction distribution can also be computed by Monte-Carlo:

## Ordered categorical data with regression variables

• categorical2_project (data = ‘categorical2_data.txt’, model = ‘categorical2_model.txt’)

A proportional odds model is used in this example, where PERIOD and DOSE are used as regression variables (i.e. time-varying covariates)

## Discrete-time Markov chain

If observation times are regularly spaced (constant length of time between successive observations), we can consider the observations $(y_{ij},\,j=1,2,\ldots,n_i)$ to be a discrete-time Markov chain.

• markov0_project (data = ‘markov1a_data.txt’, model = ‘markov0_model.txt’)

In this project, states are assumed to be independent and identically distributed:

$\mathbb{P}(y_{ij} = 1) = 1 - \mathbb{P}(y_{ij} = 2) = p_{i,1}$

Observations in markov1a_data.txt take their values in {1, 2}. Before fitting any model to this data, it is possible to display the empirical transitions between states by selecting Transition probabilities in the list of selected plots.

We can see that the probability to be in state 1 or 2 clearly depends on the previous state. The proposed model should therefore be discarted.

• markov1a_project (data = ‘markov1a_data.txt’, model = ‘markov1a_model.txt’)

Here,

\begin{aligned}\mathbb{P}(y_{i,j} = 1 | y_{i,j-1} = 1) = 1 - \mathbb{P}(y_{i,j} = 2 | y_{i,j-1} = 1) = p_{i,11} \\\mathbb{P}(y_{i,j} = 1 | y_{i,j-1} = 2) = 1 - \mathbb{P}(y_{i,j} = 2 | y_{i,j-1} = 2) = p_{i,12} \\\end{aligned}

[LONGITUDINAL]
input = {p11, p21}
DEFINITION:
State = {type = categorical,  categories = {1,2},  dependence = Markov
P(State=1|State_p=1) = p11
P(State=1|State_p=2) = p21
}


The distribution of the initial state is not defined in the model, which means that, by default,

$\mathbb{P}(y_{i,1} = 1) = \mathbb{P}(y_{i,1} = 2) = 0.5$

• markov1b_project (data = ‘markov1b_data.txt’, model = ‘markov1b_model.txt’)

The distribution of the initial state, $p = \mathbb{P}(y_{i,1} = 1)$, is estimated in this example

DEFINITION:
State = {type = categorical,  categories = {1,2},  dependence = Markov
P(State_1=1)= p
P(State=1|State_p=1) = p11
P(State=1|State_p=2) = p21
}

• markov3a_project (data = ‘markov3a_data.txt’, model = ‘markov3a_model.txt’)

Transition probabilities change with time in this example:

We then define time varying transition probabilities in the model:

[LONGITUDINAL]
input = {a1, b1, a2, b2}
EQUATION:
lp11 = a1 + b1*t/100
lp21 = a2 + b2*t/100
DEFINITION:
State = { type = categorical,  categories = {1,2},  dependence = Markov
logit(P(State=1|State_p=1)) = lp11
logit(P(State=1|State_p=2)) = lp21
}

• markov2_project (data = ‘markov2_data.txt’, model = ‘markov2_model.txt’)

Observations in markov2_data.txt take their values in {1, 2, 3}. Then, 6 transition probabilities need to be defined in the model.

## Continuous-time Markov chain

The previous situation can be extended to the case where time intervals between observations are irregular by modeling the sequence of states as a continuous-time Markov process. The difference is that rather than transitioning to a new (possibly the same) state at each time step, the system remains in the current state for some random amount of time before transitioning. This process is now characterized by transition rates instead of transition probabilities:

$\mathbb{P}(y_{i}(t+h) = k\,|\,y_{i}(t)=\ell , \psi_i) = h \, \rho_{\ell k}(t,\psi_i) + o(h),\qquad k \neq \ell .$

The probability that no transition happens between $t$ and $t+h$ is

$\mathbb{P}(y_{i}(s) = \ell, \forall s\in(t, t+h) \ | \ y_{i}(t)=\ell , \psi_i) = e^{h \, \rho_{\ell \ell}(t,\psi_i)} .$

Furthermore, for any individual $i$ and time $t$, the transition rates $(\rho_{\ell,k}(t,\psi_i))$ satisfy for any $1\leq \ell \leq K$,

$\sum_{k=1}^K \rho_{\ell k}(t,\psi_i) = 0.$

Constructing a model therefore means defining parametric functions of time $(\rho_{\ell,k})$ that satisfy this condition.

• markov1c_project (data = ‘markov1c_data.txt’, model = ‘markov1c_model.txt’)

Observation times are irregular in this example:

Then, a continuous time Markov chain should be used in order to take into account the Markovian dependence of the data:

DEFINITION:
State = { type = categorical,  categories = {1,2}, dependence = Markov
transitionRate(1,2) = q12
transitionRate(2,1) = q21
}

• markov3b_project (data = ‘markov3b_data.txt’, model = ‘markov3b_model.txt’)

Time varying transition rates are used in this example.

### 2.4.1.Joint models for continuous outcomes

Objectives: learn how to implement a joint model for continuous PKPD data.

Projects: warfarinPK_project, warfarinPKPDlibrary_project, warfarinPKPDelibrary_project, warfarin_PKPDimmediate_project, warfarin_PKPDeffect_project, warfarin_PKPDturnover_project, warfarin_PKPDseq1_project, warfarin_PKPDseq2_project, warfarinPD_project

## Introduction

A “joint model” describes two or more types of observation that typically depend on each other. A PKPD model is a “joint model” because the PD depends on the PK. Here we demonstrate how several observations can be modeled simultaneously. We also discuss the special case of sequential PK and PD modelling, using either the population PK parameters or the individual PK parameters as an input for the PD model.

## Fitting first a PK model to the PK data

• warfarinPK_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

The column DV of the data file contains both the PK and the PD measurements: the key-word Y is used by Monolix for this column. The column DVID is a flag defining the type of observation: DVID=1 for PK data and DVID=2 for PD data: the keyword YTYPE is then used for this column.

We will use the model oral1_1cpt_TlagkaVCl from the Monolix PK library

[LONGITUDINAL]
input = {Tlag, ka, V, Cl}
PK:
Cc = pkmodel(Tlag, ka, V, Cl)
OUTPUT:
output = Cc


Only the predicted concentration Cc is defined as an output of this model. Then, this prediction will be automatically associated to the outcome of type 1 (DVID=1) while the other observations (DVID=2) will be ignored.
Remark: any other ordered values could be used for YTYPE: the smallest one will always be associated to the first prediction defined in the model.

## Simultaneous PKPD modeling using the PK and PD libraries

See project warfarinPKPDlibrary_project from demo folder 1.creating_and_using_models/libraries_of_models to see how to link the PK model to an immediate response model from the Monolix PD library for modelling simultaneously the PK and the PD warfarin data.

See project warfarinPKPDelibrary_project from demo folder 1.creating_and_using_models/libraries_of_models to see how to add an effect compartement by using the PKe and PDe Monolix libraries.

## Simultaneous PKPD modeling using a user defined model

• warfarin_PKPDimmediate_project (data = ‘warfarin_data.txt’, model = ‘immediateResponse_model.txt’)

Is is also possible for the user to write his own PKPD model. The same PK model used previously and an immediate response model are defined in the model file immediateResponse_model.txt

[LONGITUDINAL]
input = {Tlag, ka, V, Cl, Imax, IC50, S0}
EQUATION:
Cc = pkmodel(Tlag, ka, V, Cl)
E = S0 * (1 - Imax*Cc/(Cc+IC50) )
OUTPUT:
output = {Cc, E}


Two predictions are now defined in the model: Cc for the PK (DVID=1) and E for the PD (DVID=2).

• warfarin_PKPDeffect_project (data = ‘warfarin_data.txt’, model = ‘effectCompartment_model.txt’)

An effect compartment is defined in the model file effectCompartment_model.txt

[LONGITUDINAL]
input = {Tlag, ka, V, Cl, ke0, Imax, IC50, S0}
EQUATION:
{Cc, Ce} = pkmodel(Tlag, ka, V, Cl, ke0)
E = S0 * (1 - Imax*Ce/(Ce+IC50) )
OUTPUT:
output = {Cc, E}


Ce is the concentration in the effect compartment

• warfarin_PKPDturnover_project (data = ‘warfarin_data.txt’, model = ‘turnover1_model.txt’)

An indirect response (turnover) model is defined in the model file turnover1_model.txt

[LONGITUDINAL]
input =  {Tlag, ka, V, Cl, Imax, IC50, Rin, kout}
EQUATION:
Cc = pkmodel(Tlag, ka, V, Cl)
E_0 = Rin/kout
ddt_E = Rin*(1-Imax*Cc/(Cc+IC50)) - kout*E
OUTPUT:
output = {Cc, E}


Several implementations of the same pkmodel are possible. For instance, the predicted concentration Cc is defined using equations in turnover2_model.txt instead of using the pkmodel function. The residual error models for the PK and PD measurements are both defined in turnover3_model.txt. Any of these two models can be used instead of turnover1_model.txt by right clicking on the Model file button and selecting Change model.

## Sequential PKPD modelling

In the sequential approach, a PK model is developed and parameters estimated in the first step. For a given PD model, different strategies are then possible for the second step, i.e., for estimating the population PD parameters:

### Using estimated population PK parameters

• warfarin_PKPDseq1_project (data = ‘warfarin_data.txt’, model = ‘turnover1_model.txt’)

Population PK parameters are set to their estimated values but individual PK parameters are not assumed to be known and sampled from their conditional distributions at each SAEM iteration. In Monolix, this simply means changing the status of the population PK parameter values so that they are no longer used as initial estimates for SAEM but considered fixed.

The joint PKPD model defined in turnover1_model.txt is again used with this project.

### Using estimated individual PK parameters

• warfarin_PKPDseq2_project (data = ‘warfarinSeq_data.txt’, model = ‘turnoverSeq_model.txt’)

Individual PK parameters are set to their estimated values and used as constants in the PKPD model for the fitting the PD data. In this example, individual PK parameters $(\psi_i)$ were estimated as the modes of the conditional distributions $(p(\psi_i | y_i, \hat{\theta}))$. An additional column MDV is necessary in the datafile in order to ignore the PK data

The estimated individual PK parameters are defined as regression variables, using the reserved keyword X. The covariates used for defining the distribution of the individual PK parameters are ignored.

We use the same turnover model for the PD data. Here, the PK parameters are defined as regression variables (i.e. regressors).

[LONGITUDINAL]
input =  {Imax, IC50, Rin, kout, Tlag, ka, V, Cl}
Tlag  = {use = regressor}
ka    = {use = regressor}
V     = {use = regressor}
Cl    = {use = regressor}
EQUATION:
Cc = pkmodel(Tlag,ka,V,Cl)
E_0 = Rin/kout
ddt_E= Rin*(1-Imax*Cc/(Cc+IC50)) - kout*E
OUTPUT:
output = E


## Fitting a PKPD model to the PD data only

• warfarinPD_project (data = ‘warfarinPD_data.txt’, model = ‘turnoverPD_model.txt’)

In this example, only PD data are available:

Nevertheless, a PKPD model – where only the effect is defined as a prediction – can be used for fitting this data:

[LONGITUDINAL]
input =  {Tlag, ka, V, Cl, Imax, IC50, Rin, kout}
EQUATION:
Cc = pkmodel(Tlag, ka, V, Cl)
E_0 = Rin/kout
ddt_E = Rin*(1-Imax*Cc/(Cc+IC50)) - kout*E
OUTPUT:
output = E


## Case studies

• 8.case_studies/PKVK_project (data = ‘PKVK_data.txt’, model = ‘PKVK_model.txt’)
• 8.case_studies/hiv_project (data = ‘hiv_data.txt’, model = ‘hivLatent_model.txt’)

### 2.4.2.Joint models for non continuous outcomes

Objectives: learn how to implement a joint model for continuous and non continuous data.

Projects: warfarin_cat_project, PKcount_project, PKrtte_project

## Joint model for continuous PK and categorical PD data

• warfarin_cat_project (data = ‘warfarin_cat_data.txt’, model = ‘PKcategorical1_model.txt’)

In this example, the original continuous PD data has been recoded as 1 (Low), 2 (Medium) and 3 (High).

International Normalized Ratio (INR) values are commonly used in clinical practice to target optimal warfarin therapy. Low INR values (<2) are associated with high blood clot risk and high ones (>3) with high risk of bleeding, so the targeted value of INR, corresponding to optimal therapy, is between 2 and 3.

Prothrombin complex activity is inversely proportional to the INR. We can therefore associate the three ordered categories for the INR to three ordered categories for PCA: Low PCA values if PCA is less than 33% (corresponding to INR>3), medium if PCA is between 33% and 50% ($2 \leq\text{INR}\leq 3$) and high if PCA is more than 50% (INR<2).

The column dv contains both the PK and the new categorized PD measurements.

Instead of modeling the original continuous PD data, we can model the probabilities of each of these categories, which have direct clinical interpretations. The model is still a joint PKPD model since this probability distribution is expected to depend on exposure, i.e., the plasmatic concentration predicted by the PK model. We introduce an effect compartment to mimic the effect delay. Let $y_{ij}^{(2)}$ be the PCA level for patient i at time $t_{ij}^{(2)}$. We can then use a proportional odds model for modeling this categorical data:

$\begin{array}{ccl}\text{logit} \left(\mathbb{P}(y_{ij}^{(2)} \leq 1 | \psi_i)\right) &= &\alpha_{i} + \beta_{i} Ce(t_{ij}^{(2)},\phi_i^{(1)}) \\ \text{logit} \left(\mathbb{P}(y_{ij}^{(2)} \leq 2 | \psi_i)\right) &=& \alpha_{i} + \gamma_{i} + \beta_{i}Ce(t_{ij}^{(2)},\phi_i^{(1)}) \\ \text{logit} \left(\mathbb{P}(y_{ij}^{(2)} \leq 3 | \psi_i)\right) &= & 1,\end{array}$

where $Ce(t,\phi_i^{(1)})$ is the predicted concentration of warfarin in the effect compartment at time $t$ for patient $i$ with PK parameters $\phi_i^{(1)}$. This model defines a probability distribution for $y_{ij}$ if $\gamma_i\geq 0$.
If $\beta_i>0$, the probability of low PCA at time $t_{ij}^{(2)}$ ($y_{ij}^{(2)}=1$) increases along with the predicted concentration $Ce(t_{ij}^{(2)},\phi_i^{(1)})$. The joint model is implemented in the model file PKcategorical1_model.txt

[LONGITUDINAL]
input = {Tlag, ka, V, Cl, ke0, alpha, beta, gamma}
EQUATION:
{Cc,Ce}= pkmodel(Tlag,ka,V,Cl,ke0)
lp1 = alpha + beta*Ce
lp2 = lp1+ gamma         ; gamma >= 0
DEFINITION:
Level = {
type=categorical
categories={1,2,3}
logit(P(Level<=1)) = lp1
logit(P(Level<=2)) = lp2
}
OUTPUT:
output = {Cc, Level}


In this example, the residual error model for the PK data is defined in the Monolix GUI. We can equivalently use the model file PKcategorical2_model.txt where the the residual error model is defined in the model file:

[LONGITUDINAL]
input = {Tlag, ka, V, Cl, ke0, alpha, beta, gamma, a, b}
EQUATION:
{Cc,Ce}= pkmodel(Tlag,ka,V,Cl,ke0)
lp1 = alpha + beta*Ce
lp2 = lp1+ gamma         ; gamma >= 0
DEFINITION:
Concentration = {
distribution=normal,
prediction=Cc,
errorModel=combined1(a,b)
}
Level = {
type=categorical
categories={1,2,3}
logit(P(Level<=1)) = lp1
logit(P(Level<=2)) = lp2
}
OUTPUT:
output = {Concentration, Level}


To use this model with the Monolix project, right click on the button Model file and select Change model. See Categorical data model for more details about categorical data models.

## Joint model for continuous PK and count PD data

• PKcount_project (data = ‘PKcount_data.txt’, model = ‘PKcount1_model.txt’)

The data file used for this project is PKcount_data.txt where the PK and the count PD data are simulated data.

We use a Poisson distribution for the count data, assuming that the Poisson parameter is function of the predicted concentration. For any individual i, we have

$\lambda_i(t) = \lambda_{0,i} \left( 1 - \frac{Cc_i(t)}{Cc_i(t) + IC50_i} \right)$

where $Cc_i(t)$ is the predicted concentration for individual $i$ at time $t$ and

$\log\left(P(y_{ij}^{(2)} = k)\right) = -\lambda_i(t_{ij}) + k\,\log(\lambda_i(t_{ij})) - \log(k!)$

The joint model is implemented in the model file PKcount1_model.txt

[LONGITUDINAL]
input = {ka, V, Cl, lambda0, IC50}
EQUATION:
Cc = pkmodel(ka,V,Cl)
lambda=lambda0*(1 - Cc/(IC50+Cc))
DEFINITION:
Seizure = { type = count,  log(P(Seizure=k)) = -lambda + k*log(lambda) - factln(k) }
OUTPUT:
output={Cc,Seizure}


Model file PKcount1_model.txt can be replaced by PKcount2_model.txt where the residual error model for the PK data is defined. See Count data model for more details about count data models.

## Joint model for continuous PK and time-to-event data

• PKrtte_project (data = ‘PKrtte_data.txt’, model = ‘PKrtteWeibull1_model.txt’)

The data file used for this project is PKrtte_data.txt where the PK and the time-to-event data are simulated data.

We use a Weibull model for the events count data, assuming that the baseline is function of the predicted concentration. For any individual i, we define the hazard function as

$h_i(t) = \gamma_{i} \, Cc_i(t) \, t^{\beta-1}$

where $Cc_i(t)$ is the predicted concentration for individual i at time t. The joint model is implemented in the model file PKrtteWeibull1_model.txt

[LONGITUDINAL]
input  = {ka, V, Cl, gamma, beta}
EQUATION:
Cc = pkmodel(ka, V, Cl)
if t<0.1
aux=0
else
aux = gamma*Cc*(t^(beta-1))
end
DEFINITION:
Hemorrhaging  = {type=event, hazard=aux}
OUTPUT:
output = {Cc, Hemorrhaging}


Model file PKrtteWeibull1_model.txt can be replaced by PKrtteWeibull2_model.txt where the residual error model for the PK data is defined. See Time-to-event data model for more details about time-to-event data models.

### 2.5.1.Model for the individual parameters: introduction

A model for observations depend on a vector of individual parameters ψ $\psi_i$. As we want to work with a population approach, we now suppose that $\psi_i$ comes from some probability distribution $p_{{\psi_i}}$.

In this section, we are interested in the implementation of individual parameter distributions $(p_{{\psi_i}}, 1\leq i \leq N)$. Generally speaking, we assume that individuals are independent. This means that in the following analysis, it is sufficient to take a closer look at the distribution $p_{{\psi_i}}$ of a unique individual i. The distribution $p_{{\psi_i}}$ plays a fundamental role since it describes the inter-individual variability of the individual parameter $\psi_i$.
In Monolix, we consider that some transformation of the individual parameters is normally distributed and is a linear function of the covariates:

$h(\psi_i) = h(\psi_{\rm pop})+ \beta \cdot ({c}_i - {c}_{\rm pop}) + \eta_i \,, \quad \eta_i \sim {\cal N}(0,\Omega).$

This model gives a clear and easily interpreted decomposition of the variability of $h(\psi_i)$ around $h(\psi_{\rm pop})$, i.e., of $\psi_i$ around $\psi_{\rm pop}$:

The component $\beta \cdot ({c}_i - {c}_{\rm pop})$ describes part of this variability by way of covariates ${c}_i$ that fluctuate around a typical value ${c}_{\rm pop}$.
The random component $\eta_i$ describes the remaining variability, i.e., variability between subjects that have the same covariate values.
By definition, a mixed effects model combines these two components: fixed and random effects. In linear covariate models, these two effects combine additively. In the present context, the vector of population parameters to estimate is $\theta = (\psi_{\rm pop},\beta,\Omega)$. Several extensions of this basic model are possible:

We can suppose for instance that the individual parameters of a given individual can fluctuate over time. Assuming that the parameter values remain constant over some periods of time called \emph{occasions}, the model needs to be able to describe the inter-occasion variability (IOV) of the individual parameters.
If we assume that the population consists of several homogeneous subpopulations, a straightforward extension of mixed effects models is a finite mixture of mixed effects models, assuming for instance that the distribution $p_{{\psi_i}}$ is a mixture of distributions.

### 2.5.2.Probability distribution of the individual parameters

Objectives: learn how to define the probability distribution and the correlation structure of the individual parameters.

Projects: warfarin_distribution1_project, warfarin_distribution2_project, warfarin_distribution3_project, warfarin_distribution4_project

## Introduction

One way to extend the use of Gaussian distributions is to consider that some transformation of the parameters in which we are interested is Gaussian, i.e., assume the existence of a monotonic function $h$ such that $h(\psi)$ is normally distributed. Then, there exists some $\mu$ and $\omega$ such that, for each individual $i$:

$h(\psi_i) \sim {\cal N}(h(\bar{\psi}_i), \omega^2)$

where $\tilde{\psi}_i$ is the predicted value of $\psi_i$. In this section, we consider models for the individual parameters without any covariate. Then, the predicted value of $\psi_i$ is the $\tilde{\psi}_i = \psi_{\rm pop}$ and

$h(\psi_i) \sim {\cal N}(h(\psi_{pop}), \omega^2)$

Transformation $h$ defines the distribution of $\psi_i$. Some predifined distributions/transformations are available in Monolix:

• Normal distribution:

The two mathematical representations for normal distributions are equivalent:

$\psi_i \sim {\cal N}(\bar{\psi}_{i}, \omega^2) ~~\Leftrightarrow~~ \psi_i = \bar{\psi}_i + \eta_i, ~~\text{where}~~\eta_i \sim {\cal N}(0,\omega^2).$

• Log-normal distribution:

A log-normally random variable takes positive values only. A log-normal distribution looks like a normal distribution for a small variance $\omega^2$. On the other hand the asymmetry of the distribution increases when $\omega^2$ increases.
Remark: the two mathematical representations for log-normal distributions are equivalent:

$\log(\psi_i) \sim {\cal N}(\log(\bar{\psi}_{i}), \omega^2) ~~\Leftrightarrow~~ \psi_i = \bar{\psi}_i e^{\eta_i}, ~~\text{where}~~\eta_i \sim {\cal N}(0,\omega^2).$

• Power-normal (or Box-Cox) distribution:

Here, $h(\psi_i) = \frac{\psi_i^\lambda -1}{\lambda}$ (with $\lambda > 0$) follows a normal distribution truncated so that $h(\psi_i)>0$. It therefore takes positive values. This distribution converges to a log-normal one when $\lambda \to 0$ and a truncated normal one when $\lambda \to 1$.

• Logit-normal distribution:

A random variable $\psi_i$ with a logit-normal distribution takes its values in $(0,1)$. The logit of $\psi_i$ is normally distributed, i.e.,

$\text{logit}(\psi_i) = \log \left(\frac{\psi_i}{1-\psi_i}\right) \ \sim \ \ {\cal N}( \text{logit}(\tilde{\psi}_i), \omega^2).$

• Probit-normal distribution:

A random variable $\psi$ with a probit-normal distribution also takes its values in (0,1).The probit function is the inverse cumulative distribution function (quantile function) $\Phi^{-1}$ associated with the standard normal distribution ${\cal N}(0,1)$:

$\text{probit}(\psi_i) = \Phi^{-1}(\psi_i) \ \sim \ {\cal N}( \Phi^{-1}(\tilde{\psi}_i), \omega^2) .$

The probit-normal distribution with $m=0.5$ and $\omega=1$ is the uniform distribution on $[0,1]$. Logit and probit transformations can be generalized to any interval (a,b) by setting $\psi_{(a,b)} = a + (b-a)\psi_{(0,1)} ,$ where $\psi_{(0,1)}$ is a random variable that takes values in (0,1) with a logit-normal (or probit-normal) distribution.

## Marginal distributions of the individual parameters

• warfarin_distribution1_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

We use the warfarin PK example here. The four PK parameters Tlag, ka, V and Cl are log-normally distributed:

Letter L is then used for these four log-normal distributions in the main Monolix graphical user interface:

The distribution of the 4 PK parameters defined in the MonolixGUI is automatically translated into Mlxtran in the project file:

[INDIVIDUAL]
input = {Tlag_pop, omega_Tlag, ka_pop, omega_ka, V_pop, omega_V, Cl_pop, omega_Cl}
DEFINITION:
Tlag = {distribution=lognormal, typical=Tlag_pop, sd=omega_Tlag}
ka = {distribution=lognormal, typical=ka_pop, sd=omega_ka}
V = {distribution=lognormal, typical=V_pop, sd=omega_V}
Cl = {distribution=lognormal, typical=Cl_pop, sd=omega_Cl}


Estimated parameters are the parameters of the 4 log-normal distributions and the parameters of the residual error model:

Here, $V_{\rm pop} = 7.94$ and $\omega_V=0.22$ means that the estimated population distribution for the volume is:$\log(V_i) \sim {\cal N}(\log(7.94) , 0.22^2)$ or, equivalently, $V_i = 7.94\, e^{\eta_i}$where $\eta_i \sim {\cal N}(0,0.22^2)$.
Remark: $V_{\rm pop} = 7.94$ is not the population mean of the distribution of $V_i$, but the median of this distribution.
The four probability distribution functions are displayed Figure Parameter distributions:

Click on Settings and select the checkbox Informations to display a summary of these distributions:

Remark 1: $V_{\rm pop}$ is not the population mean of the distribution of $V_i$, but the median of this distribution. The same property holds for the 3 other distributions which are not Gaussian.
Remark 2: here, standard deviations $\omega_{Tlag}$, $\omega_{ka}$, $\omega_V$ and $\omega_{Cl}$ are approximatively the coefficients of variation (CV) of Tlag, ka, V and Cl since these 4 parameters are log-normally distributed with variances < 1.

• warfarin_distribution2_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

Other distributions for the PK parameters are used in this project:

• we use a uniform distribution on (0,3) for Tlag. This is equivalent to rescale a probit-normal distribution on (0,3)

and fix the population value $Tlag_{\rm pop}$ to 1.5 (=(0 + 3)/2) and the standard deviation $\omega_{Tlag}$ to 1:

• we use a logit-normal distribution rescaled on (0.25, 4.25) for ka:

• a normal distribution for $V$
• a log-normal distribution for $Cl$

Distributions of Tlag and ka are therefore defined as Custom:

Estimated parameters are the parameters of the 4 transformed normal distributions and the parameters of the residual error model:

Here, $Tlag_{\rm pop} = 1.5$ and $\omega_{Tlag}=1$ means that $Tlag_i \sim Unif([0,3])$ while $ka_{\rm pop} = 1.91$ and $\omega_{ka}=1.9$ means that

$\textrm{logit}(\frac{ka_i-0.25}{4}) \sim {\cal N}(\textrm{logit}(\frac{1.91-0.25}{4}) , 1.9^2)$

The four probability distribution functions are displayed Figure Parameter distributions:

Remark 1: population values $Tlag_{\rm pop}$, $ka_{\rm pop}$, $V_{\rm pop}$ and $Cl_{\rm pop}$ are the medians of the distributions of the 4 PK parameters.
Remark 2: here, standard deviations $\omega_{Tlag}$, $\omega_{ka}$ and $\omega_{V}$ cannot be interpreted as coefficients of variation (CV) since Tlag, ka and V[ are not log-normally distributed.

## Correlation structure of the random effects

• warfarin_distribution3_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

Correlation between the random effects can be introduced in the model. A block structure is used in this project, assuming linear correlations between $\eta_{Tlag,i}$ and $\eta_{ka,i}$ and between $\eta_{V,i}$ and $\eta_{Cl,i}$:

Estimated population parameters now include these 2 correlations:

Remark: corr_Tlag_ka is not the correlation between the individual parameters $Tlag_i$ and $ka_i$ but the (linear) correlation between the random effects $\eta_{Tlag,i}$ and $\eta_{ka,i}$.

• warfarin_distribution4_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

In this example, $Tlag_i$ does not vary in the population, which means that $\eta_{Tlag,i}=0$ for all $i$, while the three other random effects are correlated:

Estimated population parameters now include the 3 correlations between $\eta_{ka,i}$, $\eta_{V,i}$ and $\eta_{Cl,i}$ :

### 2.5.3.Model for individual covariates

Objectives: learn how to implement a model for continuous and/or categorical covariates.

Projects: warfarin_covariate1_project, warfarin_covariate2_project, warfarin_covariate3_project, phenobarbital_project

## Model with continuous covariates

• warfarin_covariate1_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

The warfarin data contains 2 individual covariates: weight which is a continuous covariate and sex which is a categorical covariate with 2 categories (1=Male, 0=Female). We can ignore these columns if are sure not to use them, or declare them using respectively the reserved keywords COV (for continuous covariate) and CAT (for categorical covariate):

Even if these 2 covariates are now available, we can choose to define a model without any covariate:

Here, a 0 on the first row (weight) and the third column (V), for instance, means that there is no relationship between weight and volume in the model. A diagnostic plot Covariates is generated which displays possible relationships between covariates and individual parameters (even if these covariates are not used in the model):

• warfarin_covariate2_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

We decide to use the weight in this project in order to explain part of the variability of $V_i$ and $Cl_i$. We will implement the following model for these two parameters:

\begin{aligned}\log(V_i) &= \log(V_{\rm pop}) + \beta_V \, \log(w_i/70) + \eta_{V,i} \\\log(Cl_i) &= \log(Cl_{\rm pop}) + \beta_{Cl} \, \log(w_i/70) + \eta_{Cl,i} \\\end{aligned}

which means that population parameters of the PK parameters are defined for a typical individual of the population with weight=70kg.

The model for $V_i$ and $Cl_i$ can be equivalently written as follows:

\begin{aligned} V_i &= V_{\rm pop} ( w_i/70 )^{\beta_V} \, e^{ \eta_{V,i} } \\ Cl_i &= Cl_{\rm pop} ( w_i/70 )^{\beta_{Cl}} \, e^{ \eta_{Cl,i} } \end{aligned}

The individual predicted values for $V_i$ and $Cl_i$ are therefore

\begin{aligned} \bar{V}_i &= V_{\rm pop} \left( w_i/70 \right)^{\beta_V} \\ \bar{Cl}_i &= Cl_{\rm pop} \left( w_i/70 \right)^{\beta_{Cl}} \end{aligned}

and the statistical model describes how $V_i$ and $Cl_i$ are distributed around these predicted values:

\begin{aligned} \log(V_i) &\sim {\cal N}( \log(\bar{V}_i) , \omega^2_V) \\ \log(Cl_i) &\sim {\cal N}( \log(\bar{Cl}_i) , \omega^2_{Cl}) \end{aligned}

Here, $\log(V_i)$ and $\log(Cl_i)$ are linear functions of $\log(w_i/70)$: we then need to transform first the original covariate $w_i$ into $\log(w_i/70)$ by clicking on the button Transform of the main GUI. We can then transform and rename the original covariates of the dataset:

We then define a new covariate model, where $\log(V_i)$ and $\log(Cl_i)$ are linear functions of the transformed weight $lw70_i$:

Coefficients $\beta_{V}$ and $\beta_{Cl}$ are now estimated with their s.e. and the p-values of the Wald tests are derived to test if these coefficients are different from 0:

## Model with categorical covariates

• warfarin_covariate3_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)

We use sex instead of weight in this project, assuming different population values of volume and clearance for males and females.
More precisely, we consider the following model for $V_i$ and $Cl_i$:

\begin{aligned}\log(V_i) &= \log(V_{\rm pop}) + \beta_V \, 1_{sex_i=F} + \eta_{V,i} \\\log(Cl_i) &= \log(Cl_{\rm pop}) + \beta_{Cl} \,1_{sex_i=F} + \eta_{Cl,i}\end{aligned}

where $1_{sex_i=F} =1$ if individual $i$ is a female and 0 otherwise. Then, $V_{\rm pop}$ and $Cl_{\rm pop}$ are the population volume and clearance for males while $V_{\rm pop}\, e^{\beta_V}$ and $Cl_{\rm pop}\, e^{\beta_{Cl}}$ are the population volume and clearance for females. By clicking on the button Transform, we can modify the name of the categories (click on the name of each category and enter a new name) and the reference category (Male is the reference category in this project):

Define then the covariate model in the main GUI:

We can furthermore assume different variances of the random effects $\eta_{V,i}$ for males and females by clicking on the button Cat. Var. and selecting which variances depend on the categorical covariate:

Estimated population parameters, including the coefficients $\beta_V$ and $\beta_{Cl}$ and the 2 standard deviations $\omega_{V,F}$ and $\omega_{V,M}$ are displayed with the results:

Estimated population parameters are also computed and displayed per category:

We can display the probability distribution functions of the 4 PK parameters:

We can then distinguish the distributions for males and females by clicking on Settings and ticking the box By group:

## Transforming categorical covariates

• phenobarbital_project (data = ‘phenobarbital_data.txt’, model = ‘lib:bolus_1cpt_Vk.txt’)

The phenobarbital data contains 2 covariates: the weight and the Apgar score which is considered as a categorical covariate:

Instead of using the 10 original levels of the Apgar score, we will transform this categorical covariate and create 3 categories: Low = {1,2,3}, Medium = {4, 5, 6, 7} and High={8,9,10}.

If we assume, for instance that the volume is related to the Apgar score, then $\beta_{V,Low}$ and $\beta_{V,High}$ are estimated (assuming that Medium is the reference level).

### 2.5.4.Inter occasion variability (IOV)

Objectives: learn how to take into account inter occasion variability (IOV).

Projects: iov1_project, iov2_project, iov3_project, iov4_project

## Introduction

A simple model consists of splitting the study into K time periods or occasions and assuming that individual parameters can vary from occasion to occasion but remain constant within occasions. Then, we can try to explain part of the intra-individual variability of the individual parameters by piecewise-constant covariates, i.e., occasion-dependent or occasion-varying (varying from occasion to occasion and constant within an occasion) ones. The remaining part must then be described by random effects. We will need some additional notation for describing this new statistical model. Let

• $\psi_{ik}$ be the vector of individual parameters of individual i for occasion k, where $1\leq i \leq N$ and $1\leq k \leq K$.
• ${c}_{ik}$ be the vector of covariates of individual i for occasion k. Some of these covariates remain constant (gender, group treatment, ethnicity, etc.) and others can vary (weight, treatment, etc.).

Let $\psi_i = (\psi_{i1}, \psi_{i2}, \ldots , \psi_{iK})$ be the sequence of K individual parameters for individual i. We also need to define:

• $\eta_i^{(0)}$, the vector of random effects which describes the random inter-individual variability of the individual parameters,
• $\eta_{ik}^{(1)}$, the vector of random effects which describes the random intra-individual variability of the individual parameters in occasion k, for each $1\leq k \leq K$.

Here and in the following, the superscript (0) is used to represent inter-individual variability, i.e., variability at the individual level, while superscript (1) represents inter-occasion variability, i.e., variability at the occasion level for each individual. The model now combines these two sequences of random effects:

$h(\psi_{ik}) = h(\psi_{\rm pop})+ \beta(c_{ik} - c_{\rm pop}) + \eta_i^{(0)} + \eta_{ik}^{(1)} .$

Remark: Individuals do not need to share the same sequence of occasions: the number of occasions and the times defining the occasions can differ from one individual to another.

## Cross over study

• iov1_project (data = ‘iov1_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

In this example, PK data are collected for each patient during two independent treatment periods of time (each one starting at time 0). A column OCC is used to identify the period:

This column is defined using the reserved keyword OCC:

Here, treatment (column TREAT) changes with the occasion. Then, Monolix automatically creates a categorical covariate sTREAT which is the sequence of treatments (AB or BA). This new categorical covariates doesn’t change with time and can be used at the individual level” with SEX. We first define the distribution of the individual parameters at the “individual level” as usual:

By clicking on the button IOV, we then define the distribution of the individual parameters at the occasion level”. In this example, we assume that the volume $V_i$ depends on the treatment we assume inter-occasion variability for $V_i$ and $Cl_i$:

The population parameters now include the standard deviations of the random effects for the 2 levels of variability (omega is used fo IIV and gamma for IOV):

By clicking on Stratify in the menu bar of the graphics, you can use the occasion to split the plots

VPC per occasion:

Prediction distribution per plot:

## Occasions with washout

• iov2_project (data = ‘iov2_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

The time is increasing in this example, but the dynamical system (i.e. the compartments) is reset when the second period starts. Column EVID provides some information about events concerning dose administration. In particular, EVID=4 indicates that the system is reset (washout) when a new dose is administrated

This column is defined using the reserved keyword EVID:

Monolix automatically proposes to define the treatment periods (between successive resetting) as statistical occasions and introduce IOV, as we did in the previous example. We can display the individual fit by splitting each occasion for each individual

Or by merging the different occasions in a unique plot for each individual:

## Occasions without washout

• iov3_project (data = ‘iov3_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

Multiple doses are administrated to each patient. We consider each period of time between successive doses as an statistical occasion. A column OCC is therefore necessary in the data file:

The model for IIV and IOV can then be defined as usual. The plot of individual fits allows us to check that the predicted concentration is now continuous over the different occasions for each individual:

## Multiple levels of occasions

• iov4_project (data = ‘iov4_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)

We can easily extend such approach to multiple levels of variability. In this example, columns P1 and P2 define embedded occasions. The are both defined as occasions:

We then define a statistical model for each level of variability. We can switch to the next/previous level by clicking on the arrow button

### 2.5.5.Mixture of distributions

Objectives: learn how to implement a mixture of distributions for the individual parameters.

Projects: PKgroup_project, PKmixt_project

## Introduction

Mixed effects models allow us to take into account between-subject variability. One complicating factor arises when data is obtained from a population with some underlying heterogeneity. If we assume that the population consists of several homogeneous subpopulations, a straightforward extension of mixed effects models is a finite mixture of mixed effects models, assuming, for instance, that the probability distribution of some individual parameters vary from one subpopulation to another one. The introduction of a categorical covariate (e.g., sex, genotype, treatment, status, etc.) into such a model already supposes that the whole population can be decomposed into subpopulations. The covariate then serves as a \emph{label} for assigning each individual to a subpopulation.
In practice, the covariate can either be known or not. Mixture models usually refer to models for which the categorical covariate is unknown, but whichever the case, the model is the same. The difference appears when having to perform certain tasks and in the methods needed to implement them. For instance, model building is different depending on whether the labels are known or unknown: we have supervised learning if the labels are known and unsupervised otherwise. For the sake of simplicity, we will consider a basic model that involves individual parameters $(\psi_i,1\leq i \leq N)$ and observations $(y_{ij}, i \leq N, 1\leq j \leq n_i)$. Then, the easiest way to model a finite mixture model is to introduce a label sequence $(z_i , 1\leq i \leq N)$ that takes its values in $\{1,2,\ldots,M\}$ such that $z_i=m$ if subject i belongs to subpopulation m.
In some situations, the label sequence $(z_i , 1\leq i \leq N)$ is known and can be used as a categorical covariate in the model. If $(z_i)$ is unknown, it can modeled as a set of independent random variables taking its values in $\{1,2,\ldots,M\}$ where for $i=1,2,\ldots, N$, $P(z_i = m)$ is the probability that individual belongs to group m. We will assume furthermore that the $(z_i)$ are identically distributed, i.e., $P(z_i = m)$ does not depend on i for $m=1,\ldots,M$.

## Mixture of distributions – supervised learning

• PKgroup_project (data = ‘PKmixt_data.txt’, model = ‘lib:oral1_1cpt_kaVCl.txt’)

The sequence of labels is known as GROUP in this project and is therefore defined as a categorical covariate:

We can then assume, for instance different population values for the volume in the 2 groups:

and estimate the population parameters using this covariate model

Then, this covariate GROUP can be used as a stratification variable.

## Mixture of distributions – unsupervised learning

• PKmixt_project (data = ‘PKmixt_data.txt’, model = ‘lib:oral1_1cpt_kaVCl.txt’)

We will use the same data with this project but ignoring the column GROUP (which is equivalent to assume that the label is unknown)

If we suspect some heterogeneity in the population, we can introduce a “latent covariate” and select the number of categories for this latent covariate.

Remark: several latent covariates can be introduced in the model, with different number of categories.

We can then use this latent covariate lcat as any observed categorical covariate. We can assume again different population values for the volume in the 2 groups:

and estimate the population parameters using this covariate model. Proportions of each group are also estimated:

Once the population parameters are estimated, the sequence of latent covariates $(ZM_i)$, i.e. the group to which belongs each subject, can be estimated together with the individual parameters, as the modes of the conditional distributions.

The sequence of estimated latent covariates $(ZM_i)$ can be used as a stratification variable:

We can for example display the VPC in the 2 groups:

By plotting the distribution of the individual parameters, we see that $V_i$ has a bimodal distribution

This distribution can be decomposed into 2 distributions by ticking the box By group:

### 2.6.1.PK model: single route of administration

Objectives: learn how to define and use a PK model for single route of administration.

Projects: bolusLinear_project, bolusMM_project, bolusMixed_project, infusion_project, oral1_project, oral0_project, sequentialOral0Oral1_project, simultaneousOral0Oral1_project, oralAlpha_project, oralTransitComp_project

## Introduction

Once a drug is administered, we usually describe subsequent processes within the organism by the pharmacokinetics (PK) process known as ADME: absorption, distribution, metabolism, excretion. A PK model is a dynamical system mathematically represented by a system of ordinary differential equations (ODEs) which describes transfers between compartments and elimination from the central compartment.
See this web animation for more details.
Mlxtran is remarkably efficient for implementing simple and complex PK models:

• The function pkmodel can be used for standard PK models. The model is defined according to the provided set of named arguments. The pkmodel function enables different parametrizations, different models of absorption, distribution and elimination.
• PK macros define the different components of a compartmental model. Combining such PK components provide a high degree of flexibility for complex PK models. They can also extend a custom ODE system.
• A system of ordinary differential equations (ODEs) can be implemented very easily.

It is also important to highlight the fact that the data file use by Monolix for PK modelling only contains information about dosing, i.e. how and when the drug is administrated. There is no need to integrate in the data file any information related to the PK model. This is an important remark since it means that any (complex) PK model can be used with the same data file. In particular, we make a clear distinction between administration (related to the data) and absorption (related to the model).

### The pkmodel function

The PK model is defined by the names of the input parameters of the pkmodel function. These names are reserved keywords.

Absorption

• p: Fraction of dose which is absorbed
• ka: absorption constant rate (first order absorption)
• or, Tk0: absorption duration (zero order absorption)
• Tlag: lag time before absorption
• or, Mtt, Ktr: mean transit time & transit rate constant

Distribution

• V: Volume of distribution of the central compartment
• k12, k21: Transfer rate constants between compartments 1 (central) & 2 (peripheral)
• or V2, Q2: Volume of compartment 2 (peripheral) & inter compartment clearance, between compartments 1 and 2,
• k13, k31: Transfer rate constants between compartments 1 (central) & 3 (peripheral)
• or V3, Q3: Volume of compartment 3 (peripheral) & inter compartment clearance, between compartments 1 and 3.

Elimination

• k: Elimination rate constant
• or Cl: Clearance
• Vm, Km: Michaelis Menten elimination parameters

Effect compartment

• ke0: Effect compartment transfer rate constant

## Intravenous bolus injection

### Linear elimination

• bolusLinear_project

A single iv bolus is administered at time 0 to each patient. The data file bolus1_data.txt contains 4 columns: id, time, amt (the amount of drug in mg) and y (the measured concentration):

The names of these columns are recognized as keywords by Monolix:

It is important to remark that, in this data file, a row contains either some information about the dose (in which case y = ".") or a measurement (in which case amt = "."). We could equivalently use the data file bolus2_data.txt which contains 2 additional columns: EVID and MDV (these two names are also recognized by Monolix):

Here, EVID=1 means that this record describes a dose while EVID=0 means that this record contains an observed value. On the other hand, MDV=1 means that the observed value of this record should be ignored while MDV=0 means that this record contains an observed value. The two data files bolus1_data.txt and bolus2_data.txt contain exactly the same information and provide exactly the same results. A one compartment model with linear elimination is used with this project:

$\begin{array}{ccl} \frac{dA_c}{dt} &=& - k A_c(t) \\ A_c(t) &= &0 ~~\text{for}~~ t<0 \end{array}$

Here, $A_c(t)$ and $C_c(t)=A_c(t)/V$ are, respectively, the amount and the concentration of drug in the central compartment at time $t$. When a dose $D$ arrives in the central compartment at time $\tau$, an iv bolus administration assumes that

$A_c(\tau^+) = A_c(\tau^-) + D$

where $A_c(\tau^-)$ (resp. $A_c(\tau^+)$) is the amount of drug in the central compartment just before (resp. after) $\tau$. Parameters of this model are $V$ and $k$. We therefore use the model bolus_1cpt_Vk from the Monolix PK library:

[LONGITUDINAL]
input = {V, k}
EQUATION:
Cc = pkmodel(V, k)
OUTPUT:
output = Cc


We could equivalently use the model bolusLinearMacro.txt (click on the button Model and select the new PK model in the library 6.PK_models/model)

[LONGITUDINAL]
input = {V, k}
PK:
compartment(cmt=1, amount=Ac)
iv(cmt=1)
elimination(cmt=1, k)
Cc = Ac/V


These two implementations generate exactly the same C++ code and then provide exactly the same results. Here, the ODE system is linear and Monolix uses its analytical solution. Of course, it is also possible (but not recommended with this model) to use the ODE based PK model bolusLinearODE.txt :

[LONGITUDINAL]
input = {V, k}
PK:
depot(target = Ac)
EQUATION:
ddt_Ac = - k*Ac
Cc = Ac/V


Results obtained with this model are slightly different from the ones obtained with the previous implementations since a numeric scheme is used here for solving the ODE. Individual fits obtained with model look nice

but the VPC show some misspecification in the elimination process:

### Michaelis Menten elimination

• bolusMM_project

A non linear elimination is used with this project:

$\frac{dA_c}{dt} = - \frac{ V_m \, A_c(t)}{V\, K_m + A_c(t) }$

This model is available in the Monolix PK library as bolus_1cpt_VVmKm:

[LONGITUDINAL]
input = {V, Vm, Km}
PK:
Cc = pkmodel(V, Vm, Km)
OUTPUT:
output = Cc


Instead of this model, we could equivalently use PK macros with bolusNonLinearMacro.txt from the library 6.PK_models/model:

[LONGITUDINAL]
input = {V, Vm, Km}
PK:
compartment(cmt=1, amount=Ac, volume=V)
iv(cmt=1)
elimination(cmt=1, Vm, Km)
Cc = Ac/V
OUTPUT:
output = Cc


or an ODE with bolusNonLinearODE:

[LONGITUDINAL]
input = {V, Vm, Km}
PK:
depot(target = Ac)
EQUATION:
ddt_Ac = -Vm*Ac/(V*Km+Ac)
Cc=Ac/V
OUTPUT:
output = Cc


Results obtained with these three implementations are identical since no analytical solution is available for this non linear ODE. We can then check that this PK model seems to describe much better the elimination process of the data:

### Mixed elimination

• bolusMixed_project

THe Monolix PK library contains standard” PK models. More complex models should be implemented by the user in a model file. For instance, we assume in this project that the elimination process is a combination of linear and nonlinear elimination processes:

$\frac{dA_c}{dt} = -\frac{ V_m A_c(t)}{V K_m + A_c(t) } - k A_c(t)$

This model is not available in the Monolix PK library. It is implemented in bolusMixed.txt:

[LONGITUDINAL]
input = {V, k, Vm, Km}
PK:
depot(target = Ac)
EQUATION:
ddt_Ac = -Vm*Ac/(V*Km+Ac) - k*Ac
Cc=Ac/V
OUTPUT:
output = Cc


This model seems to describe very well the data:

## Intravenous infusion

• infusion_project

Intravenous infusion assumes that the drug is administrated intravenously with a constant rate (infusion rate), during a given time (infusion time). Since the amount is the product of infusion rate and infusion time, an additional column rate or time is required in the data file: Monolix can use both indifferently. Data file infusion_rate_data.txt has an additional column rate:

It can be replaced by infusion_tinf_data.txt which contains exactly the same information:

We use with this project a 2 compartment model with non linear elimination and parameters $V_1$, $Q$, $V_2$, $V_m$, $K_m$:

\begin{aligned} k_{12} &= Q/V_1 \\ k_{21} &= Q/V_2 \\\frac{dA_c}{dt} & = k_{21} \, Ap(t) - k_{12} \, Ac(t)- \frac{ V_m \, A_c(t)}{V_1\, K_m + A_c(t) } \\ \frac{dA_p}{dt} & = - k_{21} \, Ap(t) + k_{12} \, Ac(t) \\ Cc(t) &= \frac{Ac(t)}{V_1} \end{aligned}

This model is available in the Monolix PK library as infusion_2cpt_V1QV2VmKm:

[LONGITUDINAL]
input = {V1, Q, V2, Vm, Km}
PK:
V = V1
k12 = Q/V1
k21 = Q/V2
Cc = pkmodel(V, k12, k21, Vm, Km)
OUTPUT:
output = Cc


### first-order absorption

• oral1_project

This project uses the data file oral_data.txt. For each patient, information about dosing is the time of administration and the amount:

A one compartment model with first order absorption and linear elimination is used with this project. Parameters of the model are $ka$, $V$ and $Cl$. we will then use model oral1_kaVCl.txt from the Monolix PK library

[LONGITUDINAL]
input = {ka, V, Cl}
EQUATION:
Cc = pkmodel(ka, V, Cl)
OUTPUT:
output = Cc


Both the individual fits and the VPCs show that this model doesn’t describe properly the absorption process:

There exist many options for implementing this PK model with Mlxtran:
– using PK macros: oralMacro.txt:

[LONGITUDINAL]
input = {ka, V, Cl}

PK:
compartment(cmt=1, amount=Ac)
oral(cmt=1, ka)
elimination(cmt=1, k=Cl/V)
Cc=Ac/V

OUTPUT:
output = Cc


– using a system of two ODEs as in oralODEb.txt:

[LONGITUDINAL]
input = {ka, V, Cl}
PK:
EQUATION:
k = Cl/V
Cc = Ac/V


– combining PK macros and ODE as in oralMacroODE.txt (macros are used for the absorption and ODE for the elimination):

[LONGITUDINAL]
input = {ka, V, Cl}
PK:
compartment(cmt=1, amount=Ac)
oral(cmt=1, ka)
EQUATION:
k = Cl/V
ddt_Ac = - k*Ac
Cc = Ac/V


– or equivalently, as in oralODEa.txt:

[LONGITUDINAL]
input = {ka, V, Cl}
PK:
depot(target=Ac, ka)
EQUATION:
k = Cl/V
ddt_Ac = - k*Ac
Cc = Ac/V


Remark: Models using the pkmodel function or PK macros only use an analytical solution of the ODE system.

### zero-order absorption

• oral0_project

A one compartment model with zero order absorption and linear elimination is used to fit the same PK data with this project. Parameters of the model are Tk0, V and Cl. We will then use model oral1_Tk0VCl.txt from the Monolix PK library

[LONGITUDINAL]
input = {Tk0, V, Cl}
EQUATION:
Cc = pkmodel(Tk0, V, Cl)
OUTPUT:
output = Cc


Both the individual fits and the VPCs show that a zero absorption process better fit the data than a first order one, but the absorption process is not properly described yet…

Remark 1: implementing a zero-order absorption process using ODEs is not easy… on the other hand, it becomes extremely easy to implement using either the pkmodel function or the PK macro oral(Tk0).
Remark 2: The duration of a zero-order absorption has nothing to do with an infusion time: it is a parameter of the PK model (exactly as the absorption rate constant ka for instance), it is not part of the data.

### sequential zero-order first-order absorption

• sequentialOral0Oral1_project

More complex PK models can be implemented using Mlxtran. A sequential zero-order first-order absorption process assumes that a fraction Fr of the dose is first absorbed during a time Tk0 with a zero-order process, then, the remaining fraction is absorbed with a first-order process. This model is implemented in sequentialOral0Oral1.txt using PK macros:

[LONGITUDINAL]
input = {Fr, Tk0, ka, V, Cl}
PK:
compartment(amount=Ac)
absorption(Tk0, p=Fr)
absorption(ka , Tlag=Tk0 , p=1-Fr)
elimination(k=Cl/V)
Cc=Ac/V


Both the individual fits and the VPCs show that this PK model perfectly describes the whole ADME process for the same PK data:

### simultaneous zero-order first-order absorption

• simultaneousOral0Oral1_project

A simultaneous zero-order first-order absorption process assumes that a fraction $Fr$ of the dose is absorbed with a zero-order process while the remaining fraction is absorbed simultaneously with a first-order process. This model is implemented in simultaneousOral0Oral1.txt using PK macros:

[LONGITUDINAL]
input = {Fr, Tk0, ka, V, Cl}
PK:
compartment(amount=Ac)
absorption(Tk0, p=Fr)
absorption(ka, p=1-Fr)
elimination(k=Cl/V)
Cc=Ac/V


### alpha-order absorption

• oralAlpha_project

An $\alpha$-order absorption process assumes that the rate of absorption is proportional to some power of the amount of drug in the depot compartment:

$\frac{dA_d}{dt} = - r \, \left(A_d(t)\right)^\alpha$

This model is implemented in oralAlpha.txt using ODEs:

[LONGITUDINAL]
input = {r, alpha, V, Cl}
PK:
EQUATION:
Cc = Ac/V


### transit compartment model

• oralTransitComp_project

A PK model with transit compartment of transit rate $Ktr$ and mean transit time $Mtt$ can be implemented using the PK macro oral(ka, Mtt, Ktr), or using the pkmodel function, as in oralTransitComp.txt:

[LONGITUDINAL]
input = {Mtt, Ktr, ka, V, Cl}
EQUATION:
Cc = pkmodel(Mtt, Ktr, ka, V, Cl)


## Using different parametrizations

The PK macros and the function pkmodel use some preferred parametrizations and some reserved names as input arguments: Tlag, ka, Tk0, V, Cl, k12, k21… It is however possible to use another parametrization and/or other parameter names. As an example, consider a 2 compartments model for oral administration with a lag, a first order absorption and a linear elimination. We can use the pkmodel function with, for instance, parameters ka, V, k, $k_{12}$ and $k_{21}$:

[LONGITUDINAL]
input = {ka, V, k, k12, k21}
PK:
Cc = pkmodel(ka, V, k, k12, k21)


Imagine now that we want i) to use the clearance $Cl$ instead of the elimination rate constant $k$, ii) to use capital letters for the parameter names. We can still use the pkmodel function as follows:

[LONGITUDINAL]
input = {KA, V, CL, K12, K21}
PK:
Cc = pkmodel(ka=KA, V, k=CL/V, k12=K12, k21=K21)


### 2.6.2.PK model: multiple routes of administration

Objectives: learn how to define and use a PK model for multiple routes of administration..

Projects: ivOral1_project, ivOral2_project

## Combining iv and oral administrations – Example 1

• ivOral1_project (data = ‘ivOral1_data.txt’ , model = ‘ivOral1Macro_model.txt’)

In this example, we combine oral and iv administrations of the same drug. The data file ivOral1_data.txt contains an additional column ADM which indicates the route of administration (1=oral, 2=iv):

We assume here a one compartment model with first-order absorption process from the depot compartment (oral administration) and a linear elimination process from the central compartment. We further assume that only a fraction F (bioavailability) of the drug orally administered is absorbed. This model is implemented in ivOral1Macro_model.txt using PK macros:

[LONGITUDINAL]
input = {F, ka, V, k}
PK:
compartment(cmt=1, amount=Ac)
oral(type=1, cmt=1, ka, p=F)
iv(type=2, cmt=1)
elimination(cmt=1, k)
Cc = Ac/V


A logit-normal distribution is used for bioavability F that takes it values in (0,1):

The model properly fits the data:

Remark: the same PK model could be implemented using ODEs instead of PK macros.
Let $A_d$ and $A_c$ be, respectively, the amounts in the depot compartment (gut) and the central compartment (bloodtsream). Kinetics of $A_d$ and $A_c$ are described by the following system of ODEs

\begin{aligned}\dot{A}_d(t) & = - ka \, A_d(t) \\ \dot{A}_c(t) & = ka \, A_d(t) - k \, A_c(t) \end{aligned}

The target compartment is the depot compartment ($A_d$) for oral administrations and the central compartment ($A_c$) for iv administrations. This model is implemented in ivOral1ODE_model.txt using a system of ODEs:

[LONGITUDINAL]
input = {F, ka, V, k}
PK:
depot(type=2, target=Ac)
EQUATION:
Cc = Ac/V


Solving this ODEs system is less efficient than using the PK macros which uses the analytical solution of the linear system.

## Combining iv and oral administrations – Example 2

• ivOral2_project (data = ‘ivOral2_data.txt’ , model = ‘ivOral2Macro_model.txt’)

In this example (based on simulated PK data), we combine intraveinous injection with 3 different types of oral administrations of the same drug. The datafile ivOral2_data.txt contains column ADM which indicates the route of administration (1,2,3=oral, 4=iv):

We assume that one type of oral dose (adm=1) is absorbed into a latent compartment following a zero-order absorption process. the 2 oral doses (adm=2,3) are absorbed into the central compartment following first-order absorption processes with different rates. Bioavailabilities are supposed to be different for the 3 oral doses. There is linear transfer from the latent to the central compartment. A peripheral compartment is linked to the central compartment. The drug is eliminated by a linear process from the central compartment:

This model is implemented in ivOral2Macro_model.txt using PK macros:

[LONGITUDINAL]
input = {F1, F2, F3, Tk01, ka2, ka3, kl, k23, k32, V, Cl}
PK:
compartment(cmt=1, amount=Al)
compartment(cmt=2, amount=Ac)
peripheral(k23,k32)
oral(type=1, cmt=1, Tk0=Tk01, p=F1)
oral(type=2, cmt=2, ka=ka2,   p=F2)
oral(type=3, cmt=2, ka=ka3,   p=F3)
iv(type=4, cmt=2)
transfer(from=1, to=2, kt=kl)
elimination(cmt=2, k=Cl/V)
Cc = Ac/V
OUTPUT:
output = Cc


Here, logit-normal distributions are used for bioavabilities $F_1$, $F_2$ and $F_3$. The model properly fits the data:

Remark: the number and type of doses vary from one patient to another one in this example.

### 2.6.3.From multiple doses to steady-state

Objectives: learn how to define and use a PK model with multiple doses or assuming steady-state.

Projects: multidose_project, addl_project, ss1_project, ss2_project, ss3_project

## Multiple doses

• multidose_project (data = ‘multidose_data.txt’ , model = ‘lib:bolus_1cpt_Vk.txt’)

In this project, each patient receives several iv bolus injections. Each dose is represented by a row in the data file multidose_data.txt:

The PK model and the statistical model used in this project properly fit the observed data of each individual. Even if there is no observations between 12h and 72h, predicted concentrations computed on this time interval exhibit the multiple doses received by each patient:

VPCs, which is a diagnostic tool, are based on the design of the observations and therefore ignore” what may happen between 12h and 72h:

On the other hand, the prediction distribution, which is not a diagnostic tool, computes the distribution of the predicted concentration at any time point:

We can remark in the previous project, that, for each patient, the interval time between two successive doses is the same (12 hours for each patient) and the amount of drug which is administrated is always the same as well (40mg for each patient). We can take advantage of this design in order to simplify the data file by defining, for each patient, a unique amount (AMT), the number of additional doses which are administrated after the first one (ADDL) and the time interval between successive doses (II):

The keywords ADDL and II are automatically recognized by Monolix:

Remarks:

• Results obtained with this project, i.e. with this data file, are identical to the ones obtained with the previous project.
• It is possible to combine single doses (using ADDL=0) and repeated doses in a same data file.

• ss1_project (data = ‘ss1_data.txt’ , model = ‘lib:oral0_1cpt_Tk0VCl.txt’)

The dose orally administrated at time 0 to each patient is assumed to be a *steady-state dose* which means that a large” number of doses before time 0 have been administrated, with a constant amount and a constant interval dosing, such that steady-state, i.e. equilibrium, is reached at time 0. The data file ss1_data contains a column SS which indicates if the dose is a staedy-state dose or not and a column II for the inter-dose interval:

The keywords SS and II are automatically recognized by Monolix:

Click on Check the initial fixed effects and set the minimum value of the X-axis (min X) to display the predicted concentration between the last dose administrated at time 0:

We see on this plot that Monolix adds 4 doses before the last dose to reach steady-state. Individual fits display the predicted concentrations computed with these additional doses:

• ss2_project (data = ‘ss2_data.txt’ , model = ‘lib:oral0_1cpt_Tk0VCl.txt’)

Individual fits display the predicted concentrations computed with this combination of doses:

• ss3_project (data = ‘ss3_data.txt’ , model = ‘lib:oral0_1cpt_Tk0VCl.txt’)

Two different steady-state doses are combined in this project:

Monolix automatically creates occasions between successive steady-state doses in such situation. It is therefore possible to introduce inter-occasion variability (IOV) in the model:

### 2.7.1.Using regression variables

Objectives: learn how to define and use regression variables (time varying covariates).

Projects: reg1_project, reg2_project

## Introduction

A regression variable is a variable $x$ which is a given function of time, which is not defined in the model but which is used in the model. $x$ is only defined at some time points $t_1, t_2, \ldots, t_m$ (possibly different from the observation time points), but $x$ is a function of time that should be defined for any $t$ (if is used in an ODE for instance, or if a prediction is computed on a fine grid). Then, Mlxtran defines the function $x$ by intepolating the given values $(x_1, x_2, \ldots, x_m)$. In the current version of Mlxtran, interpolation is performed by using the last given value:

$x(t) = x_j \quad \text{for} \ \ t_j \leq t < t_{j+1}$

## Continuous regression variables

• reg1_project (data = reg1_data.txt , model=reg1_model.txt)

We consider a basic PD model in this example, where some concentration values are used as a regression variable:

[LONGITUDINAL]
input = {Emax, EC50, Cc}
Cc = {use=regressor}
EQUATION:
E = Emax*Cc/(EC50 + Cc)


The predicted effect is therefore piecewise constant: it changes at the time points where concentration values are provided:

## Categorical regression variables

• reg2_project (data = reg2_data.txt , model=reg2_model.txt)

The variable $z_{ij}$ takes its values in {1, 2} in this example and represents the state of individual $i$ at time $t_{ij}$. We then assume that the observed data $y_{ij}$ has a Poisson distribution with parameter $\lambda_1$ if $z_{ij}=1$ and parameter $\lambda_2$ if $z_{ij}=2$$z$ is known in this example: it is then defined as a regression variable in the model:

[LONGITUDINAL]
input = {lambda1, lambda2, z}
z = {use=regressor}

EQUATION:
if z==0
lambda=lambda1
else
lambda=lambda2
end
DEFINITION:
y = {type=count, log(P(y=k)) = -lambda + k*log(lambda) - factln(k) }


### 2.7.2.Bayesian estimation

Objectives: learn how to combine maximum likelihood estimation and Bayesian estimation of the population parameters.

Projects: theobayes1_project, theobayes2_project, theobayes3_project

## Introduction

The Bayesian approach considers the vector of population parameters $\theta$ as a random vector with a prior distribution $\pi_\theta$. We can then define the *posterior distribution* of $\theta$:

\begin{aligned} p(\theta | y ) &= \frac{\pi_\theta( \theta )p(y | \theta )}{p(y)} \\ &= \frac{\pi_\theta( \theta ) \int p(y,\psi |\theta) \, d \psi}{p(y)} . \end{aligned}

We can estimate this conditional distribution and derive statistics (posterior mean, standard deviation, quantiles, etc.) and the so-called maximum a posteriori (MAP) estimate of $\theta$:

\begin{aligned} \hat{\theta}^{\rm MAP} &=\text{arg~max}_{\theta} p(\theta | y ) \\ &=\text{arg~max}_{\theta} \left\{ {\cal LL}_y(\theta) + \log( \pi_\theta( \theta ) ) \right\} . \end{aligned}

The MAP estimate maximizes a penalized version of the observed likelihood. In other words, MAP estimation is the same as penalized maximum likelihood estimation. Suppose for instance that $\theta$ is a scalar parameter and the prior is a normal distribution with mean $\theta_0$ and variance $\gamma^2$. Then, the MAP estimate is the solution of the following minimization problem:

$\hat{\theta}^{\rm MAP} =\text{arg~min}_{\theta} \left\{ -2{\cal LL}_y(\theta) + \frac{1}{\gamma^2}(\theta - \theta_0)^2 \right\} .$

This is a trade-off between the MLE which minimizes the deviance, $-2{\cal LL}_y(\theta)$, and $\theta_0$ which minimizes $(\theta - \theta_0)^2$. The weight given to the prior directly depends on the variance of the prior distribution: the smaller $\gamma^2$ is, the closer to $\theta_0$ the MAP is. In the limiting case, $\gamma^2=0$; this means that $\theta$ is fixed at $\theta_0$ and no longer needs to be estimated. Both the Bayesian and frequentist approaches have their supporters and detractors. But rather than being dogmatic and following the same rule-book every time, we need to be pragmatic and ask the right methodological questions when confronted with a new problem.
All things considered, the problem comes down to knowing whether the data contains sufficient information to answer a given question, and whether some other information may be available to help answer it. This is the essence of the art of modeling: find the right compromise between the confidence we have in the data and our prior knowledge of the problem. Each problem is different and requires a specific approach. For instance, if all the patients in a clinical trial have essentially the same weight, it is pointless to estimate a relationship between weight and the model’s PK parameters using the trial data. A modeler would be better served trying to use prior information based on physiological knowledge rather than just some statistical criterion.
Generally speaking, if prior information is available it should be used, on the condition of course that it is relevant. For continuous data for example, what does putting a prior on the residual error model’s parameters mean in reality? A reasoned statistical approach consists of including prior information only for certain parameters (those for which we have real prior information) and having confidence in the data for the others. Monolix allows this hybrid approach which reconciles the Bayesian and frequentist approaches. A given parameter can be

• a fixed constant if we have absolute confidence in its value or the data does not allow it to be estimated, essentially due to lack of identifiability.
• estimated by maximum likelihood, either because we have great confidence in the data or no information on the parameter.
• estimated by introducing a prior and calculating the MAP estimate or estimating the posterior distribution.

## Computing the Maximum a posteriori (MAP) estimate

• theobayes1_project (data = ‘theophylline_amt_data.txt’ , model = ‘lib:oral1_1cpt_kaVCl.txt’)

We want to introduce a prior distribution for $ka_{\rm pop}$ in this example. Right click on the initial value for $ka_{\rm pop}$ and select Prior

We propose to use a log-normal distribution with mean $\log(2)$ and standard deviation $0.1$ for $ka_{\rm pop}$ and to compute the MAP estimate for $ka_{\rm pop}$.

The MAP estimate of $ka_{\rm pop}$ is a penalized maximum likelihood estimate:

## Estimating the posterior distribution

• theobayes2_project (data = ‘theophylline_amt_data.txt’ , model = ‘lib:oral1_1cpt_kaVCl.txt’)

Using the same prior distribution, we can choose to estimate the posterior distribution of $ka_{\rm pop}$ instead of its mode only:

Estimation of the individual parameters using the conditional mean and sd is checked in this example. Then, the MCMC algorithm used for estimating the conditional distributions of the individual parameters is also used for estimating the posterior distribution of $ka_{\rm pop}$:

Once the (population and individual) parameters have been estimated, the posterior and prior distributions of $ka_{\rm pop}$ is displayed with the graphics

## Fixing the value of a parameter

• theobayes3_project (data = ‘theophylline_amt_data.txt’ , model = ‘lib:oral1_1cpt_kaVCl.txt’)

We can combine different strategies for the population parameters: Bayesian estimation for $ka_{\rm pop}$, maximum likelihood estimation for $V_{\rm pop}$ and fixed value for $Cl_{\rm pop}$, for instance.

Remark: $Cl_{\rm pop}$ is not estimated (it’s s.e. is not computed) but the standard deviation $\omega_{Cl}$ is estimated as usual in this example:

## Bayesian estimation of several population parameters

• theobayes4_project (data = ‘theophylline_amt_data.txt’ , model = ‘lib:oral1_1cpt_kaVCl.txt’)

Prior distributions can be used for several parameters:

Then, the MCMC algorithm is used for estimating these 3 posterior distributions:

### 2.7.3.Delayed differential equations

Objectives: learn how to implement a model with ordinary differential equations (ODE) and delayed differential equations (DDE).

Projects: tgi_project, seir_project, seir_noDelay_project

## Ordinary differential equations based model

• tgi_project (data = tgi_data.txt , model = tgi_model.txt)

We consider here the tumor growth inhibition (TGI) model proposed by Ribba *et al.* (Ribba, B., Kaloshi, G., Peyre, M., Ricard, D., Calvez, V., Tod, M., . & Ducray, F., *A tumor growth inhibition model for low-grade glioma treated with chemotherapy or radiotherapy*. Clinical Cancer Research, 18(18), 5071-5080, 2012.). This model is defined by a set of ordinary differential equations

\begin{aligned}\frac{dC}{dt} &= - k_{de} C(t) \\\frac{dP_T}{dt} &= \lambda P_T(t)(1- P^{\star}(t)/K) + k_{QPP}Q_P(t) -k_{PQ} P_T(t) -\gamma \, k_{de} P_T(t)C(t) \\ \frac{dQ}{dt} &= k_{PK} P_T(t) -\gamma \, k_{de} Q(t)C(t) \\\frac{dQ_P}{dt} &= \gamma \, k_{de} Q(t)C(t) - k_{QPP} Q_P(t) -\delta_{QP} Q_P(t)\end{aligned}

where $P^\star(t) = P_T(t) + Q(t) + Q_P(t)$ is the total tumor size. This set of ODEs is valid for $t>0$, while

\begin{aligned} C(t) &= 0 \\ P_T(t) &= P_{T0} \\ Q(t) &= Q_0 \\ Q_P(t) &= 0 \end{aligned}

for $t \leq 0$.This model (derivatives and initial conditions) can easily be implemented with Mlxtran:

[LONGITUDINAL]
input = {K, KDE, KPQ, KQPP, LAMBDAP, GAMMA, DELTAQP, PT0, Q0}
PK:
depot(target=C)
EQUATION:
t0=0
PT_0=PT0
Q_0 = Q0
PSTAR = PT+Q+QP
ddt_C = -KDE*C
ddt_PT = LAMBDAP*PT*(1-PSTAR/K) + KQPP*QP - KPQ*PT - GAMMA*KDE*PT*C
ddt_Q = KPQ*PT - GAMMA*KDE*Q*C
ddt_QP = GAMMA*KDE*Q*C - KQPP*QP - DELTAQP*QP


Remark: t0, PT_0 and Q_0 are reserved keywords that define the initial conditions.

Then, the graphic of individual fits clearly shows that the tumor size is constant until $t=0$ and starts changing according to the model at $t=0$.

## Don’t forget the initial conditions!

• tgiNoT0_project (data = tgi_data.txt , model = tgi1NoT0_model.txt)

The initial time $t_0$ is not specified in this example

[LONGITUDINAL]
input = {K, KDE, KPQ, KQPP, LAMBDAP, GAMMA, DELTAQP, PT0, Q0}
PK:
depot(target=C)
EQUATION:
PT_0=PT0
Q_0 = Q0
PSTAR = PT+Q+QP
ddt_C = -KDE*C
ddt_PT = LAMBDAP*PT*(1-PSTAR/K) + KQPP*QP - KPQ*PT - GAMMA*KDE*PT*C
ddt_Q = KPQ*PT - GAMMA*KDE*Q*C
ddt_QP = GAMMA*KDE*Q*C - KQPP*QP - DELTAQP*QP


Since $t_0$ is missing, Monolix uses the first time value encountered for each individual. If, for instance, the tumor size is computed for $t\geq -5$ for the individual fits, then $t_0=-5$ will be used for defining the initial conditions, which introduces a shift in the plot:

Conclusion: don’t forget to specify properly the initial conditions of a system of ODEs!

## Delayed differential equations based model

A system of delay differential equations (DDEs) can be implemented in a block EQUATION of the section [LONGITUDINAL] of a script Mlxtran. Mlxtran provides the command delay(x,T) where x is a one-dimensional component and T is the explicit delay. Therefore, DDEs with a nonconstant past of the form

$\begin{array}{ccl} \frac{dx}{dt} &=& f(x(t),x(t-T_1), x(t-T_2), ...), ~~\text{for}~~t \geq 0\\ x(t) &=& x_0(t) ~~~~\text{for}~~\text{min}(T_k) \leq t \leq 0 \end{array}$

can be solved.

• seir_project (data = seir_data.txt , model = seir_model.txt)

The model is a system of 4 DDEs:

[LONGITUDINAL]
input = {A, lambda, gamma, epsilon, d, tau, omega}
EQUATION:
t0 = 0
S_0 = 15
E_0 = 0
I_0 = 2
R_0 = 3
N = S + E + I + R
ddt_S = A - d*S - lambda*S*I/N + gamma*delay(I,tau)*exp(-d*tau)
ddt_E = lambda*S*I/N - d*E - lambda*delay(S,omega)*delay(I,omega)*exp(-d*omega)/(delay(I,omega) + delay(S,omega) + delay(E,omega) + delay(R,omega))
ddt_I = -(gamma+epsilon+d)*I + lambda*delay(S,omega)*delay(I,omega)*exp(-d*omega)/(delay(I,omega) + delay(S,omega) + delay(E,omega) + delay(R,omega))
ddt_R = gamma*I - d*R - gamma*delay(I,tau)*exp(-d*tau)


Introducing these delays allows to obtain nice fits for the 4 outcomes, including $(R_{ij})$:

• seir_noDelay_project (data = seir_data.txt , model = seir_model.txt)

On the other hand, if ODEs are used, i.e. without any delay,

then, the model is not able anymore to fit correctly the observed data, such as $(R_{ij})$:

## Case studies

• 8.case_studies/arthritis_project

Monolix includes several estimation algorithms:

Also, different types of results are available in the form of graphics and tables. The tasks can run individually, or you can define a workflow (see here). Moreover, Monolix includes a convergence assessment tool. It is possible to execute a workflow as defined above but for different, randomly generated, initial values of fixed effects.

### Purpose

Maximum likelihood estimation of the population parameters is performed with the button. The sequence of estimated parameters ($\theta_k$) is displayed in a figure which is automatically saved as saem.png in the results folder at the end of the estimation. Also, the final estimations are displayed in the command window.
This algorithm produces the file pop_parameters.txt in the result folder, with all the estimated parameters: population parameters, variances (or standard deviations), correlations, error model parameters, etc. The estimation of the population parameters by groups defined by the continuous and categorical covariates used in the model is also given.

For example, in the theophylline_project, where one has a continuous covariate based on a transformation, the population parameters estimation file writes

******************************************************************
* theophylline_project.mlxtran
* November 23, 2015 at 13:09:49
* Monolix version: 4.4.0
******************************************************************

Estimation of the population parameters

parameter
ka_pop : 1.57
V_pop : 0.454
beta_V_t_WEIGHT : -0.471
Cl_pop : 0.0399

omega_ka : 0.641
omega_V : 0.113
omega_Cl : 0.272

a : 0.735

Numerical covariates
t_WEIGHT = log(WEIGHT/70)


and the following figure is shown.

In another example, in the iov1_project, where we have both categorical covariates (SEX, TREAT, OCC, sTREAT) and IOV (the estimated parameters are called gamma), the population parameters estimation file writes

******************************************************************
*      iov1_project.mlxtran
*      November 24, 2015 at 08:44:55
*      Monolix version: 4.4.0
******************************************************************

Estimation of the population parameters

parameter
ka_pop           :     1.42
beta_ka_sTREATBA :    0.153
V_pop            :    0.469
beta_V_TREATB    :   -0.208
beta_V_sTREATBA  :   0.0415
beta_V_SEXM      :   0.0287
beta_V_OCC2      :  -0.0261
Cl_pop           :   0.0371
beta_Cl_SEXM     :  -0.0412
beta_Cl_OCC2     :   0.0188

omega_ka         :     0.26
omega_V          :   0.0957
omega_Cl         :    0.124

gamma_ka         :        0
gamma_V          :   0.0275
gamma_Cl         :    0.187

a                :    0.108
b                :    0.107


### Main population parameter settings

Seed: The default value of the seed used for the random numbers generator is 123456. This seed can be randomly generated with the “generate” button. It is accessible in Menu/Settings/Seed.

Number of iterations: There are two stages in the population parameters estimation algorithm. Use K1 and K2 to define the number of iterations for each one. If auto is selected, then algorithm will determine automatically if it can jump from one stage to the other or if it can finish the estimation process. In this case, the numbers $K_1$ and $K_2$ will represent the maximum number of iterations for each stage. Here, these numbers will not exceed 500 and 200. By default, the automatic determination of $(K_1,K_2)$ is proposed. It is accessible in Menu/Settings/Pop. parameters.

Number of chains: When the data set is small, just a few of subjects for instance, replicating the data set can improve the precision of the estimation. The number of (Markov) chains specifies how many times the data set need to be replicated. You can specify this number yourself, or you can let Monolix to compute how many replicates are needed to ensure a minimum number of subjects. In this case, you should specify this minimum size and any time you change the data set, the number of chains will be updated according to the number of subjects present in the new data set. For instance, in the theophylline example there are only 12 subjects in the data set. Setting to 50 the minimum number of subjects, Monolix choose to use 5 Markov chains.  It is accessible in Menu/Settings/Pop. parameters.

Simulated annealing: A Simulated Annealing version of SAEM is used to estimate the population parameters (the variances are constrained to decrease slowly during the first iterations of SAEM).  It is accessible in Menu/Settings/Pop. parameters.

### Very advance population parameters settings

In this paragraph, we define all the settings used for the population parameter estimation. However, we strongly encourage the user to get help from Lixoft’s support when considering using these parameters.

SAEM algorithm: Additionally, $K_0$ is the number of burning iterations. The sequence of step sizes $(\gamma_k)$ decreases as $\frac{1}{k^{a_j}}$ for j equals 1 or 1. By default, $a_1 = 0$ and $a_2 = 1$, then during the first $K_1$ iterations $\gamma_k$ is constant and is equal to 1 and $\gamma_k = \frac{1}{(k-K_1)^{a_2}}$ during the next $K_2$ iterations. $l_{K_1}$, $l_{K_2}$ and $r_{K_2}$ define the stopping rules when auto is checked. It is accessible in Menu/Settings/Pop. parameters and by clicking on “Advanced settings” .

MCMC: The default number of Markov chains is L = 1. $m_1$, $m_2$, $m_3$ and $m_4$ are the numbers of transitions of the four different kernels used in the MCMC algorithm. The default value are $m_1=2$, $m_2=0$, $m_3=2$, and $m_4 = 2$. The $m_2$ transition is recommended only in some specific cases, when the observed kinetics are very different (i.e. viral kinetics of responders, non responders, rebounders, …).  It is accessible in Menu/Settings/Pop. parameters and by clicking on “Advanced settings” .

Simulated Annealing: When Simulated Annealing is checked, the temperature decreases as $\tau^{1}$ and $\tau^{2}$ in the two phases of the SAEM algorithm. Notice that you can use $\tau > 1$ to force the estimated variance(s) to increase from a small initial value to the estimated value. Then, $\tau^{2} >1$ can be used if you want to obtain fits as good as possible and $\tau^{1} >1$ if you want to obtain inter-subject variability as small as possible.  It is accessible in Menu/Settings/Pop. parameters and by clicking on “Advanced settings” .

### What is and what is not SAEM? And the related implications.

The objective of this paragraph is to explain what is the SAEM, what the user can wait for and its limitation.

The SAEM algorithm is the stochastic approximation expectation-maximization algorithm. It has been shown to be extremely efficient for a wide variety of complex models: categorical data, count data, time-to-event data, mixture models, hidden Markov models, differential equation based models, censored data, … Furthermore, convergence of SAEM has been rigorously proved and the implementation in Monolix is efficient.

SAEM is an iterative algorithm: SAEM as implemented in Monolix has two phases. The goal of the first is to get to a neighborhood of the solution in only a few iterations. A simulated annealing version of SAEM accelerates this process when the initial value is far from the actual solution. The second phase consists of convergence to the located maximum with behavior that becomes increasingly deterministic, like a gradient algorithm.

SAEM is a stochastic algorithm: We cannot claim that SAEM always converges (i.e., with probability 1) to the global maximum of the likelihood. We can only say that it converges under quite general hypotheses to a maximum – global or perhaps local – of the likelihood. A large number of simulation studies have shown that SAEM converges with high probability to a “good” solution – it is hoped the global maximum – after a small number of iterations. Therefore, we encourage the user to use the convergence assessment tool to look at the sensitivity of the results w.r.t. the initial parameters

SAEM is a stochastic algorithm (2)The trajectory of the outputs of SAEM depends on the sequence of random numbers used by the algorithm. This sequence is entirely determined by the “seed.” In this way, two runs of SAEM using the same seed will produce exactly the same results. If different seeds are used, the trajectories will be different but convergence occurs to the same solution under quite general hypotheses. However, if the trajectories converge to different solutions, that does not mean that any of these results are “false”. It just means that the model is case sensitive to the seed or to the initial conditions.

### Purpose

The variance of the maximum likelihood estimate (MLE) $\hat{\theta}$, and thus confidence intervals, can be derived from the observed Fisher information matrix (FIM), itself derived from the observed likelihood (i.e., the pdf of observations y):

$I_{y}(\hat{\theta})\triangleq -\frac{\partial^2}{\partial\theta^2}\log({\cal L}_y(\hat{\theta}))$

There are two different algorithms: by linearization or by stochastic approximation. When “linearization” is used, the structural model is linearized, and the full statistical model is approximated by a Gaussian model. When “stochastic approximation” is used, the exact model is used, and the Fisher information matrix (F.I.M) is approximated stochastically. The final estimations are displayed in the command window (Matlab or DOS) together with the population parameters:

• the estimated fixed effects, their standard-errors, the absolute and relative p-values obtained from the Wald test (only for the coefficients of the covariates),
• the estimated variances (or standard deviations) and their standard-errors,
• the estimated residual error parameters and their standard-errors,
• the estimated correlation matrix of the random effects if the covariance matrix is not diagonal,
• the correlation matrix of the fixed effect estimates, with the smallest and largest eigenvalues,
• the correlation matrix of the variance components estimates, with the smallest and largest eigenvalues,

All that information is appended to the file pop_parameters.txt. The following file presents an example on the theophylline_project.

******************************************************************
*      theophylline_project.mlxtran
*      November 26, 2015 at 16:17:01
*      Monolix version: 4.4.0
******************************************************************

Estimation of the population parameters

parameter     s.e. (lin)   r.s.e.(%)   p-value
ka_pop          :     1.57         0.31          20
V_pop           :    0.454        0.019           4
beta_V_t_WEIGHT :   -0.471          0.3          63        0.11
Cl_pop          :   0.0399       0.0034           9

omega_ka        :    0.641         0.14          22
omega_V         :    0.113        0.035          31
omega_Cl        :    0.272        0.066          24

a               :    0.735        0.056           8

______________________________________________
correlation matrix of the estimates(linearization)

ka_pop               1
V_pop             0.15       1
beta_V_t_WEIGHT      0    0.09       1
Cl_pop           -0.06   -0.15    0.01       1

Eigenvalues (min, max, max/min): 0.79  1.3  1.6

omega_ka      1
omega_V   -0.02       1
omega_Cl      0   -0.01       1
a         -0.02   -0.11   -0.05       1

Eigenvalues (min, max, max/min): 0.87  1.1  1.3

Numerical covariates
t_WEIGHT = log(WEIGHT/70)


Notice that the chosen method is displayed in the file.

### Best practices: when to use “linearization” and when to use “stochastic approximation”

Firstly, it is only possible to use the linearization algorithm for the continuous data. In that case, this method is generally much faster than stochastic approximation and also gives good estimates of the FIM. The FIM by model linearization will generally be able to identify the main features of the model. More precise– and time-consuming – estimation procedures such as stochastic approximation and importance sampling will have very limited impact in terms of decisions for these most obvious features. Precise results are required for the final runs where it becomes more important to rigorously defend decisions made to choose the final model and provide precise estimates and diagnostic plots. In the theophylline example, the evaluation of the FIM is presented with the CPU time.

Method s.e. $(ka_{pop}, V_{pop}, \beta_{V,t_{WEIGHT}}, Cl_{pop})$ s.e. $(\omega_{ka}, \omega_{V}, \omega_{Cl})$ e.v. correlation matrix of the estimates CPU time [s]
Linearization (0.31, 0.019, .3, .0035) (0.14, 0.035, .066) (0.8, 1.3, 1.6, 0.87, 1.1, 1.3) .1
Stochastic Approximation (0.31, 0.019, .3, .0035) (0.15, 0.035, .068) (0.8, 1.2, 1.5, 0.79, 1.2, 1.5) 3.4

One can see that it the stochastic approximation is much more costly in computation time but the result is more precise.

### Purpose

Although the population parameter estimation algorithm gives a rough estimation of the individual parameters, it can be estimated two more precise estimators: the conditional mode and the conditional mean.

If the option ”Estimate the conditional modes” is selected, the individual parameters are estimated by maximizing the conditional probabilities $p(\psi_i|y_i;\hat{\theta})$, i.e.

$$\hat{\psi}_i^{mode} = \underset{\psi_i}{\textrm{arg max }}p(\psi_i|y_i;\hat{\theta})$$

It corresponds to the “optimal” value for the fit. If the option “Estimate the cond. means and s.d.” is selected, the conditional means and standard deviations are estimated by MCMC. The conditional distribution $p(\psi_i|y_i;\hat{\theta})$ of the vector of individual parameters $\psi_i$ can be estimated for each individual $i$ using the Metropolis-Hastings (MH) algorithm. For each $i$, this algorithm generates a sequence $(\psi_i^{(k)}, i\leq k \leq K)$ which converges in distribution to the conditional distribution $p(\psi_i|y_i;\hat{\theta})$ and can be used for estimating any summary statistic of it (mean, standard deviation, quantiles, etc.).

The MH algorithm therefore allows us to define an initial estimator of the individual parameter $\psi_i$ that approximates the conditional mean:

$\hat{\psi}_i^{mean} = \frac{1}{K}\sum_{k=1}^{K}\psi^{k}$

For each parameter, the mean of these quantities over all the subjects is displayed together with a $r_{mcmc}$ interval. Two files indiv_parameters.txt and indiv_eta.txt are created with the estimated individual parameters and random effects in table format. Also, if there were defined priors on some fixed effects, and it was selected prior distribution method for some of them, a new file called simulatedPopParams.txt is created simulations using their posterior distribution laws.

### Advanced settings for individual parameter estimation:

MCMC: $m_1$, $m_2$, $m_3$ and $m_4$ are the numbers of transitions of the four different kernels used in the MCMC algorithm. The default value are $m_1=2$, $m_2=0$, $m_3=2$, and $m_4 = 2$. $L_{mcmc}$ and $r_{mcmc}$ define the stopping rule of the MCMC algorithm. The number of iterations of MCMC increases with $L_{mcmc}$ and decreases with $r_{mcmc}$. It is accessible in Menu/Settings/Indiv. parameters.

### Best practices : When do we look at the conditional mode and when do we look at the conditional mean?

The choice of using the conditional mean $\psi^{mean}$ or conditional mode $\psi^{mode}$ is arbitrary. By default, Monolix uses the conditional mode for computing predictions, taking the philosophy that the “most likely” values of the individual parameters are the most suited for computing the “most likely” predictions.

### Purpose

Performing likelihood ratio tests and computing information criteria for a given model requires computation of the log-likelihood

${\cal L}{\cal L}_y(\hat{\theta}) = \log({\cal L}_y(\hat{\theta})) \triangleq \log(p(y;\hat{\theta}))$

where $\hat{\theta}$ is the vector of population parameter estimates for the model being considered. The log-likelihood cannot be computed in closed form for nonlinear mixed effects models. It can however be estimated in a general framework for all kinds of data and models using the importance sampling Monte Carlo method. This method has the advantage of providing an unbiased estimate of the log-likelihood – even for nonlinear models – whose variance can be controlled by the Monte Carlo size.

The estimation of the log-likelihood is performed. Two different algorithms are proposed to estimate the log-likelihood: by linearization and by Importance sampling. The estimated log-likelihoods are appended to pop_parameters.txt.

### Log-likelihood by importance sampling

The observed log-likelihood ${\cal LL}(\theta;y)=\log({\cal L}(\theta;y))$ can be estimated without requiring approximation of the model, using a Monte Carlo approach. Since

${\cal LL}(\theta;y) = \log(p(y;\theta)) = \sum_{i=1}^{N} \log(p (y_i;\theta))$

we can estimate $\log(p(y_i;\theta))$ for each individual and derive an estimate of the log-likelihood as the sum of these individual log-likelihoods. We will now explain how to estimate $\log(p(y_i;\theta))$ for any individual i. Using the $\phi$-representation of the model (the individual parameters are transformed to be Gaussian), notice first that $p(y_i;\theta)$ can be decomposed as follows:

$p(y_i;\theta) = \int p(y_i,\phi_i;\theta)d\phi_i = \int p(y_i|\phi_i;\theta)p(\phi_i;\theta)d\phi_i = \mathbb{E}_{p_{\phi_i}}\left(p(y_i|\phi_i;\theta)\right)$

Thus, $p(y_i;\theta)$ is expressed as a mean. It can therefore be approximated by an empirical mean using a Monte Carlo procedure:

1. Draw M independent values $\phi_i^{(1)}$, $\phi_i^{(2)}$, …, $\phi_i^{(M)}$ from the marginal distribution $p_{\phi_i}(.;\theta)$.
2. Estimate $p(y_i;\theta)$ with $\hat{p}_{i,M}=\frac{1}{M}\sum_{m=1}^{M}p(y_i | \phi_i^{(m)};\theta)$

By construction, this estimator is unbiased, and consistent since its variance decreases as 1/M:

$\mathbb{E}\left(\hat{p}_{i,M}\right)=\mathbb{E}_{p_{\phi_i}}\left(p(y_i|\phi_i^{(m)};\theta)\right) = p(y_i;\theta) ~~~~\mbox{Var}\left(\hat{p}_{i,M}\right) = \frac{1}{M} \mbox{Var}_{p_{\phi_i}}\left(p(y_i|\phi_i^{(m)};\theta)\right)$

We could consider ourselves satisfied with this estimator since we “only” have to select M large enough to get an estimator with a small variance. Nevertheless, it is possible to improve the statistical properties of this estimator.

The problem is that it is not possible to generate the $\phi_i^{(m)}$ with this conditional distribution, since that would require to compute a normalizing constant, which here is precisely $p(y_i;\theta)$.

Nevertheless, this conditional distribution can be estimated using the Metropolis-Hastings algorithm described in the Metropolis-Hastings algorithm for simulating the individual parameters and a practical proposal “close” to the optimal proposal $p_{\phi_i|y_i}$ can be derived. We can then expect to get a very accurate estimate with a relatively small Monte Carlo size M.

The mean and variance of the conditional distribution $p_{\phi_i|y_i}$ are estimated by Metropolis-Hastings for each individual i. Then, the $\phi_i^{(m)}$ are drawn with a noncentral student t-distribution:

$\phi_i^{(m)} = \mu_i + \sigma_i \times T_{i,m}$

where $\mu_i$ and $\sigma^2_i$ are estimates of $\mathbb{E}\left(\phi_i|y_i;\theta\right)$ and $\mbox{Var}\left(\phi_i|y_i;\theta\right)$, and $(T_{i,m})$ is a sequence of i.i.d. random variables distributed with a Student’s t-distribution with $\nu$ degrees of freedom.

Remark: Even if $\hat{\cal L}_y(\theta)=\prod_{i=1}^{N}\hat{p}_{i,M}$ is an unbiased estimator of ${\cal L}_y(\theta)$$\hat{\cal LL}_y(\theta)$ is a biased estimator of ${\cal LL}_y(\theta)$. Indeed, by Jensen’s inequality, we have :

$\mathbb{E}\left(\log(\hat{\cal L}_y(\theta))\right) \leq \log \left(\mathbb{E}\left(\hat{\cal L}_y(\theta)\right)\right)=\log\left({\cal L}_y(\theta)\right)$

However, the bias decreases as M increases and also if $\hat{\cal L}_y(\theta)$ is close to ${\cal L}_y(\theta)$. It is therefore highly recommended to use a proposal as close as possible to the conditional distribution $p_{\phi_i|y_i}$, which means having to estimate this conditional distribution before estimating the log-likelihood (i.e run task “individual parameter” with “Cond. means and s.d.” option).

#### Advance settings for the log-likelihood

A t-distribution is used as proposal. The number of degrees of freedom of this distribution can be either fixed or optimized. In such a case, the default possible values are 2, 5, 10 and 20 degree of freedom. A distribution with a small number of degree of freedom (i.e. heavy tails) should be avoided in case of stiff ODE’s defined models. We recommend to set a degree of freedom at 5. It is accessible in Menu/Settings/Log-likelihood.

### Log-likelihood by linearization

The likelihood of the nonlinear mixed effects model  cannot be computed in a closed-form. An alternative is to approximate this likelihood by the likelihood of the Gaussian model deduced from the nonlinear mixed effects model after linearization of the function f (defining the structural model) around the predictions of the individual parameters $(\phi_i; 1 \leq i \leq N)$.
Notice that the log-likelihood can not be computed by linearization for discrete outputs (categorical, count, etc.) nor for mixture models or when the posterior distribution have been estimated for some parameters with priors.

### Best practices: When should I use the linearization and when should I use the importance sampling?

Firstly, it is only possible to use the linearization algorithm for the continuous data. In that case, this method is generally much faster than importance sampling method and also gives good estimates of the LL. The LL calculation by model linearization will generally be able to identify the main features of the model. More precise– and time-consuming – estimation procedures such as stochastic approximation and importance sampling will have very limited impact in terms of decisions for these most obvious features. Selection of the final model should instead use the unbiased estimator obtained by Monte Carlo. In the warfarin example, the evaluation of the log-likelihood (along with the AIC and BIC) is presented with the CPU time.

Method -2LL AIC BIC CPU time [s]
Linearization 2178.78 2220.78 2251.56 1.5
Important sampling 2119.76 2161.76 2192.54 27.1

One can see that it the importance sampling is much more costly in computation time but the result is more precise.

### 3.5.How to run several tasks ?

With the button Run, you can execute successively several tasks. The list of tasks to run is called “scenario” or “workflow”. You can define your own scenario by checking the corresponding checkboxes at the left of the Run button.
For example, define the following workflow in order to:

1. estimate the population parameters,
2. estimate the standard errors,
3. estimate the individual parameters,
4. display graphical results using the individual parameters estimated previously.

If you just want to compute the population parameters with their standard errors, define the workflow as follows:

Notice that you can load and save a workflow using the menu Workflow.

### 3.6.Algorithms convergence assessment

Monolix includes a convergence assessment tool. It is possible to execute a workflow as defined above but for different, randomly generated, initial values of fixed effects. Click in the “Assessment” button to open the Assessment graphical interface.

You can enter the number of runs, or replicates, the parameters to generate initial values and the intervals where those values should be (uniformly) simulated. Click on Run to execute the tool. Thus you are able to estimate the population parameters using several initial seeds and/or several initial conditions. Notice that you are able to set one initial parameter constant while generating the others. The minimum and maximum of the generated parameters can be modified by the user.
The workflow is the same between the runs and is the one defined in the interface (population parameters estimation must be selected). A result folder is generated for each set of initial parameters. Two kinds of graphics are given as a summary of the results

• one showing the estimated values (blue) with the estimated standard errors (red bars) for each replicate (if the Fisher information matrix estimation was included in the scenario)
• the superimposed convergence graphics for each parameter. In case of F.I.M estimation is included, then the black lines represents the mean of the estimated s.e..

Also, a .mat file is produced with all the results used for those graphics. Press Last Results button to reproduce the last result graphics if you already have ran the tool for the current project, and Cancel to close the interface.

### Best practices: what is the use the convergence assessment tool?

We cannot claim that SAEM always converges (i.e., with probability 1) to the global maximum of the likelihood. We can only say that it converges under quite general hypotheses to a maximum – global or perhaps local – of the likelihood. A large number of simulation studies have shown that SAEM converges with high probability to a “good” solution – it is hoped the global maximum – after a small number of iterations. The purpose of this tool is to evaluate the SAEM algorithm with initial conditions and see if the estimated parameters are the “global” minimum.
The trajectory of the outputs of SAEM depends on the sequence of random numbers used by the algorithm. This sequence is entirely determined by the “seed.” In this way, two runs of SAEM using the same seed will produce exactly the same results. If different seeds are used, the trajectories will be different but convergence occurs to the same solution under quite general hypotheses. However, if the trajectories converge to different solutions, that does not mean that any of these results are “false”. It just means that the model is case sensitive to the seed or to the initial conditions. The purpose of this tool is to evaluate the SAEM algorithm with several seeds to see the robustness of the convergence.

### What kind of graphics can be generated by Monolix?

The list of graphics corresponds to all the graphics that Monolix can generate

#### Model summary and convergence

• Project summary
• Spaghetti plot:  This graphic displays the original data w.r.t. time along with some additional informations.
• Convergence SAEM: This graphic displays the convergence of the estimated parameters with respect to the iteration number.

#### Diagnostic plot based on observed data

• Individual fits: This graphics displays the individual fits, using the population parameters with the individual covariates (red) and the individual parameters (green) w.r.t. time on a continuous grid.
• Obs. vs Pred: This graphic displays observations w.r.t. the predictions computed using the population parameters (on the left), and w.r.t. the individual parameters (on the right).
• Residuals: This graphic displays the PWRES (population weighted residuals), the IWRES (individual weighted residuals), and the NPDE (Normalized Prediction Distribution Errors) w.r.t. the time (on the top) and the prediction (bottom).
• VPC: This graphic displays the Visual Predictive Check.
• NPC: This graphic displays the numerical predictive check.

#### Diagnostic plots based on individual parameters

• Covariates: This graphic displays displays the estimators of the individual parameters in the Gaussian space (and those for random effects) w.r.t. the covariates.
• Parameters Distribution: This graphic displays the estimated population distributions of the individual parameters.
• Random Effects (boxplot): This graphic displays the distribution of the random effects with boxplots.
• Random Effects (joint dist.): This graphic displays scatter plots for each pairs of random effects.

#### Extensions

• BLQ: This graphic displays the proportion of censored data w.r.t. time.
• Prediction distribution: This graphic displays the prediction distribution.
• Categorized Data
• Time to Event data: This graphic displays the empiric (from observations) and theoretical (from simulations) survival function and average number of events for time to event data.
• Transition Probabilities: This figures displays the proportion of each category knowing the previous one for discrete models.
• Posterior distribution
• Contribution to likelihood: This graphic displays the contribution of each individual to the log-likelihood.

### What can be done with the graphics ?

The graphics menu bar proposes different optionsIt is possible to

• “Save as”, “Print”, “Close”, “CopyFigure” save, print, an close the figure and, in Windows, copy and paste it to any other document.
• “Zoom” menu activates or de-activates the zoom
• “Axes” menu proposes to use semi-log or log-log scale plots. Settings option gives the opportunity to choose labels and to change the scale of graphics. It is also possible to use the same axis limits for the different plots of the same graphic.
• “Stratify” menu opens the stratify window, see below.
• “Settings” menu opens a new window with the settings of the current graphic

### How to stratify graphics ?

When clicking on the stratify menu, the following window pops up.

The Stratify menu allows to create and use covariates for stratification purposes. It is possible to select a categorical covariate for splitting, filtering or coloring the data set. Also, in the bottom section, it can be defined a time filter.

### How to save the graphics ?

When clicking on the button “Outputs to save” in the interface, the following window pops up

The user can choose which graphics to save. Notice that

• the save starts after the display of the graphics.
• the graphics are saved in the result folder.
• if the graphic is not defined by the user using the “Select plots” button, it will not be saved.

### Purpose

The purpose of this plot is to display the original data w.r.t. time along with some additional informations.

### Example of graphic

In the proposed example, the concentration for the warfarin data set is presented.

### Settings

• Segments : Add/remove the segments between the observations,
• BLQ (with the color): Add/remove the segments between the observations
• Legend: Add/remove the legend (available only if there are several groups on the figure),
• Informations : Add/remove the informations on the measurements (total number of subjects, mean number of doses per subject, total, average, min and max number of observations)

By default, only Segments and Informations are activated.

In the following example, we defined stratification by the category SEX for the effect on the warfarin data set. Additionally, the following settings are the default settings and the legend.

### Best practices

• It is always good to have a look first at the Spaghetti plot before running the parameter estimation. Indeed, it is very convenient to see if all your data are ok, if something weird happens on a subject, …
• It is possible to see the Spaghetti plot just after loading the datas. For that, click on the icon next to the data file choice.
• For a better understanding and/or exploration of the data set, it is also possible to export the data set in Datxplore.

### Purpose

This graphic displays the convergence of the estimated parameters with respect to the iteration number.

### Example of graphic

In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see for distribution of the random effects associated to each parameter.

### Settings

• Display: number of graphics the user want to see at the same time. It is convenient when there are a lot of parameters. Indeed, using this setting, the user can focus on particular parameter convergence for example.

By default, all parameters are displayed.

### Purpose

The figure displays the individual fits, using the population parameters with the individual covariates (red) and the individual parameters (green) on a continuous grid. This is a good way to see on each subject the validity of the model, and the actual fit proposed. It is possible to show the computed individual parameters on the figure. Moreover, it is also possible to display the median and a confidence interval for ($y_{ij}$) estimated with a Monte Carlo procedure. The design and the covariates of each subject are used for the simulation.

### Example of graphic

In the proposed example, the concentration for the warfarin data set is presented. For each subject, the data are displayed with blue crosses along with the individual fit and population fit (the prediction using the estimated individual and population parameters respectively).

### Settings

• Grid arrange. Define the number of subjects that are displayed. The user can define the number of rows and the number of columns. Moreover, a slider is present to be able to change the subjects under consideration
• Display
• Filter : filter the observations when stratify
• Split occasions: Split the individual by occasions in case of IOV.
• Population fits : Model estimation on the subject’s design based on the population parameters.
• Individual fits : Model estimation on the subject’s design and covariate based on the individual parameters. The individual parameters come from the conditional mode or the conditional mean estimation.
• BLQ data : show and put in a different color the datas that are BLQ (Below the Limit of Quantification)
• Legend : add/remove the legend. There is only one legend for all the individual fits
• Informations : add/remove the individual parameter estimation for each individual fit.
• Prediction distribution
• The median value obtained with the predicted distribution coming from the population parameter.
• The prediction interval (associated with a level): allows to see the variability driven by the inter-individual variability.

By default, only the observed datas and the individual fit are proposed.

### Purpose

This figure displays observations ($y_{ij}$) versus the predictions ($\hat{y}_{ij}$) computed using the population parameters (on the left), and with the individual parameters (on the right).

### Example of graphic

In the proposed example, the concentration for the warfarin example is presented. It is modeled by a one compartment first order absorption model with a linear elimination. On the left, the scatter plot of the observed concentration with respect to the predicted concentration using the estimated population parameters is displayed. On the right, the scatter plot of the observed concentration with respect to the predicted concentration using the individual parameters is displayed.

### Settings

• Prediction
• Population prediction: add/remove the figure with the comparison between the population prediction and the observations.
• Individual prediction: add/remove the figure with the comparison between the individual prediction (based on the chosen conditional estimation) and the observations.
• Observed datas,
• Segments by individuals: Add/remove segments between the observation of each individual
• BLQ data : show and put in a different color the datas that are BLQ (Below the Limit of Quantification)
• Splie: add/remove a spline interpolation on the comparison between the prediction and the observation.
• Legend : add/remove the legend. There is only one legend for all the individual fits

By default, both population and individual prediction are proposed.

### Purpose

This graphic displays the PWRES (population weighted residuals), the IWRES (individual weighted residuals), and the NPDE (Normalized Prediction Distribution Errors) w.r.t. the time (on the top) and the prediction (bottom). The PWRES are computed using the population parameters and the IWRES are computed using the individual parameters. For discrete outputs, only NPDE and individual NPDE are used.

Population Weighted Residuals ($\text{PWRES}_{ij}$) are defined as the normalized difference between the observations and their mean. Let $y_i = (y_{ij}, 1 \leq j \leq n_i)$ be the vector of observations for subject i. The mean of $y_i$ is the vector $\mathbb{E}(y_i)=(\mathbb{E}(f(t_{ij};\psi_i), 1 \leq j \leq n_i)\$. Let $\text{V}_i$ be the $n_i \times n_i$ variance-covariance matrix of $y_i$. Then, the ith vector of the population weighed residuals $\text{PWRES}_i=(\text{PWRES}_{ij}, 1\leq j \leq n_i)$ is defined by

$\text{PWRES}_i = \text{V}_i^{-1/2}(y_i-\mathbb{E}(y_i))$

$\mathbb{E}(y_i)$ and $V_i$ are not known in practice but can be estimated by Monte-Carlo simulation.

Individual weighted residuals ($\text{IWRES}_{ij}$) are estimates of the standardized residual ($\epsilon_{ij}$) based on individual predictions

$\text{IWRES}_{ij} = \dfrac{ y_{ij}-f(t_{ij};\hat{\psi}_i)}{g(t_{ij};\hat{\psi}_i)}$

If the residual errors are assumed to be correlated, the individual weighted residuals can be decorrelated by multiplying each individual vector $\text{IWRES}_i = (\text{IWRES}_{ij} ; 1\leq j\leq n_i)$ by $\hat{\text{R}}_i^{-1/2}$, where $\hat{\text{R}}_i$ is the estimated correlation matrix of the vector of residuals $(\epsilon_{ij}; 1\leq j \leq n_i)$.

Normalized prediction distribution errors ($\text{NPDE}_{ij}$) are a nonparametric version of $\text{PWRES}_{ij}$ based on a rank statistic. For any (i,j), let $\text{F}_{ij} = \text{F}_{\text{PWRES}_{ij}}(\text{PWRES}_{ij})$ where $\text{F}_{\text{PWRES}_{ij}}$ is the cumulative distribution function (cdf) of $\text{PWRES}_{ij}$. NPDEs are defined as an empirical estimation of $\Phi^{-1}(\text{F}_{ij})$. In practice, one simulates a large number $K$ of simulated data set $y^{(k)}$ using the model, and estimate $\text{F}_{ij}$ as the fraction of simulated data below the original data, i.e:

$\hat{\text{F}}_{ij}=\frac{1}{K}\sum_{k=1}^K 1_{y_{ij}^{(k)}\leq y_{ij}^{\text{obs}}}$

By definition, the distribution of $\text{F}_{ij}$ is uniform on [0,1], we thus rather use $\Phi^{-1}(\text{F}_{ij})$, which follows a standard normal distribution (with $\Phi$ the cdf of the standard normal distribution). NPDEs are defined as an empirical estimation of $\Phi^{-1}(\text{F}_{ij})$, i.e $\text{NPDE}_{ij}=\Phi^{-1}(\hat{\text{F}}_{ij})$.

For continuous outputs, this figure shows the empirical distributions of the weighted residuals and the NPDE together with the standard Gaussian pdf and qqplots to check if the residuals are Gaussian.

### Example of graphic

In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see the PWRES, the IWRES and the NPDE w.r.t. the time (on the top), to the prediction (in the middle) and the comparison between the empirical and theoretical pdf of the PWRES, IWRES and NPDE are presented on the bottom. Notice, that in that example, the prediction intervals were added.

### Settings

• Residuals: The user can choose which residual will be displayed
• PWRES
• IWRES (using the individual parameter estimated using the conditional mode or the conditional mean)
• NPDE
• Select Graphics: The user can define which graphics will be represented
• Scatter Plot (TIME): the representation w.r.t. the time
• Scatter Plot (Predictions): the representation w.r.t. the prediction
• Histogram: the pdf of the associated residual
• QQ-plot for the residuals.
• Scatter Plot
• Observed data can be added
• BLQ data (and a possibility to add a different color) can be added if present
• Spline interpolation can be added
• Empirical percentiles for the 10%, 50% and 90% quantiles
• Theoretical percentiles for the 10%, 50% and 90% quantiles
• Prediction interval for the 10%, 50% and 90% quantiles
• Outliers (in dots or in area) to show the residuals that are out of the prediction interval
• By default, both displays are proposed with the individual parameters coming from the conditional mode estimation.
• Residual Distribution
• Empirical pdf
• Theoretical pdf
• Histogram

By default, all the residuals and all the graphics are displayed.

### Purpose

The principle of the VPC (Visual Predictive Check) is to assess graphically whether simulations from a model of interest are able to reproduce both the central trend and variability in the observed data, when plotted versus an independent variable (typically time). A VPC is based on multiple simulations with the model of interest and the design structure of the observed data (i.e., dosing, timing, and number of samples). Percentiles of the simulated data are compared to the corresponding percentiles of the observed data. The percentiles are calculated either for each unique value of the independent (x-axis) variable or for a bin across the independent variable. It allows us to summarize in the same graphic the structural and statistical models by computing several quantiles of the empirical distribution of the data after having regrouped them into bins over successive intervals.

By default, the median and 10th and 90th percentiles are presented. The interpercentile range between the outer percentiles of all the simulated data, i.e. the prediction interval, is displayed with a level of 90%.

### Example of graphic

In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. The figure presents the VPC with the prediction interval for quantile 10%, 50% and 90%. Additionally,

### Settings

• Display:
• Observed data
• BLQ data (with a possibility to color them specifically)
• Empirical percentile
• Theoretical percentile
• Prediction interval
• Outliers (in dots or in area)
• Bins
• pcVPC using Uppsala prediction correction
• Legend

### Purpose

This graphic displays the numerical predictive check. It allows to compare empirical cumulative distribution function (CDF) of the observations with theoretical’s (computed from simulations).

### Example of graphic

In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see the numerical predictive check of the concentration along with the data set along with the prediction interval and the theoretical cdf.

### Settings

• Display
• Observed CDF
• Theoretical CDF
• Prediction interval (and its level)
• Outliers area
• Grid size for the sampling of the output
• Legend

### Purpose

This figure displays the estimated population distributions of the individual parameters. Settings allow you to show a summary of the theoretical distributions (mean, median, mode, percentiles), histograms, and a non-parametric estimation of the distribution. It is also possible to display the empirical distribution of the individual parameters and the shrinkages computed as follows for each random effect: $Shrinkage = 1 -\frac{Var(\hat{\eta})}{\hat{\omega}^2}$

### Example of graphic

In the following example, the parameter distribution of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) is proposed. One can see for each parameter (ka,V,Cl) the estimated law, the histogram of the realizations and the calculation of the shrinkage.

### Settings

• Display
• Population distribution: parameter distribution given by the user with the estimated population parameter,
• Histogram: represents the histogram of the realization of the considered values,
• Non param. pdf: add/remove distribution estimated by the estimated realization of the random effects.
• Legend : add/remove the legend.
• Informations : add/remove the shrinkage in %
• Estimator. The user can define which estimator is used for the definition of the individual parameters.

By default, only the population distribution is proposed with the individual parameters coming from the conditional mode estimation.

### Purpose

The figure displays the estimators of the individual parameters in the Gaussian space, and those for random effects, (e.g. the conditional expectations $E(\phi_{i,l}/y;\hat{\theta})$ and $E(\eta_{i,l}/y;\hat{\theta})$ for i from 1 to N and the conditional modes) v.s. the covariates.

### Example of graphic

In the proposed example, the parameters estimation for a PKPD model on the warfarin data set is presented. The random effects of 3 parameters of the PD model are displayed w.r.t. a transformed version of the weight (t_wt=log(wt/70)), the weight, and the sex category.

### Settings

• Grid arrange. Define the number of individual parameters (or random effects) and covariate that are displayed. The user can define the number of rows and the number of columns.
• Display
• The user can choose to see either the individual parameters or the Random effects.
• Data
• Splines: add/remove the a spline interpolation
• Regression lines: add/remove the affine fit
• Informations : add/remove the correlation information for each continuous covariate and each random effect.
• Legend : add/remove the legend.
• Estimator. The user can define which estimator is used for the definition of the individual parameters.

### Purpose

This graphic displays the distribution of the random effects with boxplots. The horizontal dotted lines show the interquartile interval of a standard Gaussian distribution.

### Example of graphic

In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see for distribution of the random effects associated to each parameter.

### Settings

• Display
• Theoretical median
• Theoretical quartiles
• Estimator. The user can define which estimator is used for the definition of the individual parameters and thus for the random effects.

By default, both displays are proposed with the individual parameters coming from the conditional mode estimation.

### Purpose

This graphic displays scatter plots for each pairs of random effects.

### Example of graphic

In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see for distribution of the random effects associated to each other. Regressions and spline interpolation are added to see the correlation between random effects.

### Settings

• Display
• Data
• Spline
• Regression
• Estimator. The user can define which estimator is used for the definition of the individual parameters and thus for the random effects.

By default, both displays are proposed with the individual parameters coming from the conditional mode estimation.

### Purpose

This graphic displays the proportion of censored data w.r.t. time. It is possible to chose the censoring interval. This graphic is only available for projects with censored data.

### Example of graphic

In the following example, the parameters of a PKPD model (on the censored warfarin data set) are estimated. The figure presents the BLQ w.r.t. with the prediction interval.

### Settings

• Display:
• Censored data min and max. By default, the limit and the censored values are used. However, one can look at smaller censored interval for example.
• BLQ Frequency observation calculated by simulation
• Median
• prediction interval with the associated level
• Outliers (area)
• Grid time size
• Legend

By default, the censored area corresponds to the data set description and the BLQ frequency observation, the prediction interval, the outliers and the legend are displayed.

### Purpose

This graphic displays the prediction distribution. It allows to compare the observations with the theoretical distribution of the predictions.

### Example of graphic

In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see the prediction distribution of the concentration along with the data set.

### Settings

• Display
• Observed data
• Median
• Continuous or discrete representation of the distribution
• Level (meaning that the distribution corresponds to $[50-\frac{level}{2}, 50+\frac{level}{2}]$)
• Number of band and the associated percentile in case of a discrete representation
• Legend

By default, only the prediction distribution and the median are displayed.

### Purpose

This graphic displays the proportion of each category knowing the previous one. It is only available for discrete observations and any discrete model.

### Example of graphic

In the following example, the parameters of a three-categories markov model are estimated. The figure presents the proportion of each category knowing the previous one. Moreover, there are no dependency on the previous one, only on figure representing the modalities proportion is displayed.

### Settings

• Proportion
• Cumulated by modalities
• Separated
• Display
• Bins
• Legend

### Purpose

This graphic displays the empiric (from observations) and theoretical (from simulations) survival function and average number of events for time to event data. It is only available for outputs of type event.

### Example of graphic

In the following example, the parameter Te of a event model with a constant hazard function (=1/Te) is estimated. The figure presents the Kaplan-Meier plot w.r.t. the time (top) and the mean number of events (bottom).

### Settings

• Selection:Kaplan-Meier plot which represent the survival function.
• mean number of events
• Display
• Empirical
• Predicted median
• Prediction interval
• Censored data
• Accuracy
• Legend
• prediction distribution
• Level
• Number of bands

By default, the empirical of the survival function and of the mean number of events is displayed.

### Purpose

This graphic displays the contribution of each individual to the log-likelihood. This graphic is only available if the log-likelihood was previously computed.

### Example of graphic

In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see the contribution of each subject.

### Settings

• Label
• Index
• Name of the subject
• Legend

### 5.FAQ

#### Regulatory

• What is needed for a regulatory submission using Monolix? Monolix is used for regulatory submissions (including the FDA and the EMA) of population PK and PK/PD analyses. The summary of elements needed for submission can be found here.
• How to cite Monolix2016R1? To cite Monolix, please reference it as here
Monolix version 2016R1. Antony, France: Lixoft SAS, 2016.
http://lixoft.com/products/monolix/

#### Running Monolix

• Do I need Matlab to run Monolix? No, there is a stand alone version.
• On what operating system Monolix run? MonolixSuite runs on both Windows, Linux and MacOS platform.
• Is it possible to run Monolix using a simple command line? Yes, see here.
• Can I run a batch function to run a list of projects and/or all the projects in a directory? Yes, please see here R and Matlab functions to do that.

#### Initialization

• How to initialize my parameters? There are several ways to initialize your parameters and visualize the impact. See here the different possibilities.

#### Results

• Can I define my self the result folder? By default, the result folder corresponds to the project name. However, you can define it by yourself. See here to see how to define it on the user interface.
• Can I define the output to save (graphics, tables, …)? Yes, see here.
• What result files are generated by Monolix? Monolix proposes a lot of different output files depending on the tasks done by the user. here is a complete listing of the files along with the condition for creation. See here for more information.
• Can I replot the graphics using another plotting software? Yes, if you go to the menu Graphics and click on “Export graphics data”, all the data needed to reproduce the graphics are stored in text files.

• How are the censored data handled? The handling of censored data is described here.
• Is it possible to define mixture of structural models? Yes, it may be necessary in some situations to introduce diversity into the structural models themselves using between-subject model mixtures (BSMM) or within-subject model mixtures (WSMM). The handling of mixture of structural models is defined here. Notice that in the case of a BSMM, the proportionbetween groups is a population parameter of the model to estimate. There is no inter-patient variability on p: all the subjects have the same probability and a logit-normal distribution for p  must be used to constrain it to be between 0 and 1 without any variability.
• Is it possible to define mixture of distributions? Mixture models usually refer to models for which the categorical covariate is unknown, but whichever the case, the structural model is the same. The handling of mixture of structural models is defined here.
• What is the difference between the complete log-likelihood computed and displayed during SAEM and the log-likelihood computed as separate task (by linearization and/or importance sampling)? The log-likelihood is defined as $\sum_{i=1}^{N_{\text{ind}}}\log\left(p(y_i;\theta)\right)$. It is the relevant quantity to compare model, but unfortunately it cannot be computed in closed form because the individual parameters $\phi_i$ are unobserved. Thus, to estimate the log-likelihood an importance sampling Monte Carlo method is used in a separate task (or an approximation is calculated via linearization). To know more on the log-likelihood calculation using linearization or importance sampling, see here. On the contrary, the complete likelihood refers to the joint distribution $\sum_{i=1}^{N_{\text{ind}}}\log\left(p(y_i,\phi_i;\theta)\right)$. By decomposing the terms as $p(y_i,\psi_i;\theta)=p(y_i|\psi_i;\theta)p(\psi_i;\theta)$, we see that this quantity can easily be computed using as $\phi_i$ the individual parameters drawn by MCMC for the current iteration of SAEM. This quantity is calculated at each SAEM step and is useful to assess the convergence of the SAEM algorithm.
• When estimating the log-likelihood via importance sampling, the estimates does not seem to stabilize. What can do? The log-likelihood estimator obtained by importance sampling is biased by construction (see here for details). To reduce the bias, the conditional distribution $p_{\phi_i|y_i}$ should be known as well as possible. For this, estimate the individual parameters with the “Cond. means and s.d” option before estimating the log-likelihood.

#### Tricks

• How to compute AUC, time interval AUC, … using Mlxtran in a Monolix project?  See here.

### 5.1.Evolutions from Monolix433 to Monolix2016R1

Evolutions of the Mlxtran language now allow to share a model in the workflow from model exploration (using Mlxplore), to parameter estimation (using Monolix), and simulation (using Simulx). Therefore, the proposed evolutions of the language merge the needs of all these applications and the model written with this new syntax will be compatible with all Lixoft software. See here to see the evolution and the eventual impact on your model.

### Main improvements and new features:

– The interface was redesigned for clarity and simplicity.
– A button providing the statistical model summary was added
– A button providing a direct browse to the demo was added
– All the libraries are now using the Mlxtran syntax. Therefore, if you were using a model from the Matlab library, it will automatically be transformed into its Mlxtran equivalent model when saving. The Matlab models are still available although their use is no longer recommended.

#### Enhancements:

Numerical
– Improved LL computation when the sensitivity analysis is verified
– Improved simulated annealing initialization
System
Naming
– Latent covariate by default are now called lcat, lcat1, etc
– Error model parameters are by default now called (a,b,c) if there is only one continuous error model
– The autocorrelation parameter is by default now called (r) if there is only one output with autocorrelation
– IOV variability parameters are by default now called (gamma_) if there is only one level, (gamma1_, gamma2_) if there are more
– Latent covariate probabilities are by default now called (plcat_1, plcat_2) for a latent covariable with three categories if there is only one latent covariate, (plcat1_1, plcat1_2, ) if there are more.

#### Bugs fixed:

– bug fix: fixed effects were not correctly initialized when they were not simulated
– bug fix: fixed problems estimating autocorrelation (SAEM) and the corresponding Fisher information matrix
– bug fix: limits of the confidence intervals in VPC
– bug fix: dde solver precision in interpolation of delays
– bug fix: computation of the log-likelihood of a multivariate gaussian vector on an interval during FIM computation by linearization for censored data was corrected
– bug fix: simulated annealing on standard error parameters for mixture projects was not well used
– bug fix: when SS and ADDL were used at the same time in the data set, it was not well interpreted.
– bug fix: occasion time management was not correct when evid=1 and the first occasion of the subject had no observations
– bug fix: corrections when several covariates are created with the same initial covariate
– bug fix: corrections when IOV occurs with discrete models
– bug fix: correction on individual parameter estimate in conditional mode with no IIV

### 5.2.How to update your model to be compatible with the new Mlxtran V2 syntax ?

You have a Mlxtran model you used with Monolix 4.3.3 and older. In this case your model is using Mlxtran V1 syntax which is not compatible anymore with the current version of Monolix 2016R1. In order to continue using your model with Monolix 2016R1 you need to make some changes in the model syntax. These are very minor changes and, in the following, we show you how to convert your model from Mlxtran V1 to V2.

## Why an evolution of Mlxtran ?

With the transition to Monolix 2016R1 we are introducing the new version (V2) Mlxtran syntax. We made sure that Monolix 2016 R1 is backwards compatible with the old V1 Mlxtran syntax. This means that all your previous models can be run with Monolix 2016R1 if this is needed. However, there are two critical limitations with V1 Mlxtran and we strongly advice you to adopt V2 Mlxtran syntax.

• The first limitation is that you cannot edit V1 Mxltran models with Monolix 2016 R1. This means that if you want to change a model you must convert it to V2 syntax anyway.
• The second limitation concerns the use of Mlxplore and Simulx after parameter estimation with Monolix 2016R1 (workflow). This will not be possible if you do not use V2 syntax in Monolix

## What changed in the longitudinal model?

To define the file as a model, the block [LONGITUDINAL] must be added so that Monolix understands that the file contains a structural model. [LONGITUDINAL] should be placed at the beginning of the file or after the optional DESCRIPTION:.

### Modification of the input definition

The declaration of the inputs is changed.

Modification of the parameter definition.

In the V1 syntax, the parameters were defined by “parameter=”. Now, in V2, they must be defined with “input=”.

Modification of the regressor definition

In the V1 syntax, the regressors were defined by “regressor=”. Now, in V2, the following syntax must be used:

1/ The name of the regressors have to be added in the inputs.
2/ Each regressor has to be declared in the following way “NameRegressor = {use=regressor}”
Note that the order of definition of the regressors in the input list must agree with the order of definition in the data set (X columns).

### Modification of the observation definition

In the V1 syntax, the observations were defined by “OBSERVATION:”. Now, in V2, they must be defined with “DEFINITION:”.

## Two examples

Here are two examples of models implemented with V1 for Monolix 4.3.3 and the correct translation to V2 syntax for 2016R1.

### Turnover model for multiple doses.

In the warfarin demos, a turnover model for multiple doses is used. It writes

; Model using old V1 syntax
DESCRIPTION: Turnover model for multiple doses.

INPUT:
parameter =  {Tlag, ka, V, Cl, I_max, C50, Rin, kout}

EQUATION:
Cc = pkmodel(Tlag, ka, V, Cl)
E_0 = Rin/kout
ddt_E = Rin*(1-I_max*Cc/(Cc+C50)) - kout*E

OBSERVATION:
Concentration = {type=continuous, prediction=Cc, errorModel=combined1}
PCA = {type=continuous, prediction=E, errorModel=constant}

OUTPUT:
output = {Concentration, PCA}


In this case, one has to modify the input and observation definition. Thus, the model writes in the new V2 syntax

; Model using new V2 syntax
DESCRIPTION: Turnover model for multiple doses.

[LONGITUDINAL]
input =  {Tlag, ka, V, Cl, I_max, C50, Rin, kout}

EQUATION:
Cc = pkmodel(Tlag, ka, V, Cl)
E_0 = Rin/kout
ddt_E = Rin*(1-I_max*Cc/(Cc+C50)) - kout*E

DEFINITION:
Concentration = {type=continuous, prediction=Cc, errorModel=combined1}
PCA = {type=continuous, prediction=E, errorModel=constant}

OUTPUT:
output = {Concentration, PCA}


### Sequential PK/PD model – Individual PK parameters have been estimated first, they are defined as regression variables.

In the warfarin demos, a turnover model for multiple doses is used. It writes

; Model using old V1 syntax
DESCRIPTION: Sequential PK/PD model.
Individual PK parameters have been estimated first, they are defined as regression variables.

INPUT:
parameter =  {Imax, C50, Rin, kout}
regressor =  {Tlag, ka, V, Cl}

EQUATION:
Cc = pkmodel(Tlag,ka,V,Cl)
E_0 = Rin/kout
ddt_E= Rin*(1-Imax*Cc/(Cc+C50)) - kout*E

OUTPUT:
output = E


In this case, one has to modify the input definition. Thus, the model writes in the new V2 syntax

; Model using new syntax
DESCRIPTION: Sequential PK/PD model.
Individual PK parameters have been estimated first, they are defined as regression variables.

[LONGITUDINAL]
input = {Imax, C50, Rin, kout, Tlag, ka, V, Cl}
Tlag = {use = regressor}
ka = {use = regressor}
V = {use = regressor}
Cl = {use = regressor}

EQUATION:
Cc = pkmodel(Tlag,ka,V,Cl)
E_0 = Rin/kout
ddt_E= Rin*(1-Imax*Cc/(Cc+C50)) - kout*E

OUTPUT:
output = E


### 5.3.Monolix2016R1known bug list

Bugs have been reported for which there are workarounds

#### Structural model

• When you have a structural model using a delay differential equation, if the initial condition of the delayed variable is time varying, then the information is not taken into account. Workaround: It is possible to code the initial condition in a different way as can be seen here.
• If the initial value of a differential equation is defined by a regressor, it is not well taken into account. Workaround: In the Mlxtran model file, remove the regressor definition, and use instead a “PK:” block with a “depot” macro to add the initial value of your ODE at the desired time. If you do not have a AMT column, you just tag your regressor column X into AMT. Else wise, you need to modify your AMT column to include the regressor value and affect it to another administration route (using the ADM column).
• For the indirect response model from the PD/PDe libraries (all the turn_input and turn_output model files), the initialization is incorrect. Workaround: You have to add the initial condition E_0 = Rin/kout to your model. You have to change it directly in your library (in monolixSuiteInstallationDir/factory/library/pd and monolixSuiteInstallationDir/factory/library/pde).

#### Project definition

• If the data set contains header or categorical covariate with modalities containing special characters “.”, “-“, “*”,… (see here for the full list), the project runs fine and can be saved. However, there is an issue when reloading the project. Workaround: To be able to reload the saved project, change the special character by a letter (“X” for example). Then, the modified project will be loaded.  The categories will be automatically recomputed.
• In the covariate transform window, if you change the name of a categorical covariate without any transformation. There will be problems in the save/load. Workaround: Use the button “ADD”. Using that, you will define any transformed covariates, possibly just rename it if no transformation is applied.
• When selecting the option “mixed method” for parameter without variability in Settings > Pop. parameters > Advanced settings, the choice of the method is not properly saved when saving the project. As a consequense, when reloading a saved project, the  default method “no variability” will be used. Workaround: manually set the method to “mixed method” after having loaded the project

#### Output files

• When using “Export Graphics Data” for prediction-corrected (uppsala correction) VPCs, the outputted observation data points are not prediction-corrected. As a workaround, an R script permits to generate the prediction-corrected output file for most cases (no BLQ data, no stratification). Download here.

### 5.4.Running Monolix using a command line

Yes, it is possible to run Monolix using a simple command line:

• with the standalone version of Monolix
monolixSuiteInstallationFolder/bin/Monolix.sh [–nowin]  -p fullPathProjectName
-f [run|saem|fim|ll|graphics]

(replace .sh into .bat for windows operating system)

• with the Matlab version of Monolix (the command has to be launched from the matlab directory of the installation of Monolix , i.e. /matlab)
matlab -wait -nosplash -nodesktop -r "monolix(’-nowin’, ’-p’,
’fullpathProjectname’,’-f’,’[run|saem|fim|ll|graphics]’, ’-destroy’ ),exit"

Notice that the project name should be defined using a full path and not a partial path.
The program options are

•  -nowin : without opening a window, mandatory in no-desktop environments.
•  -p : project to run. It should be the full path name of the project
•  -f run|saem|indivfim|ll|graphics to run a task
• -f run: run the workflow defined in the project file.
• -f saem: estimate population parameter.
• -f indiv: estimation of the individual parameters
• -f fim: estimate the standard errors of the estimates and Fisher information matrix.
• -f LL: estimate the log-likelihood.
• -f graphics: generate the result graphics.
• To run several tasks, you just have to set them successively, e.g. -f saem -f LL

### 5.5.Monolix batch run

It is possible to run calculation of several Monolix projects through a single command line from R or Matlab. For that, you can use the following functions (here for R and here for Matlab). To use it

source('runMonolix.R')
runMonolix(name, mlxInstallDir, display, saveGraphics, scenario)


where

•  name: Define the Mlxtran files to run in absolute path
• If it is a list of Mlxtran files, run them
• If it is a list of directory, run all the Mlxtran files in the folder
• It can be a combination of both
• mlxInstallDir : Define where MonolixSuite is installed
• Typically for windows mlxInstallDir <- ‘C:/ProgramData/Lixoft/MonolixSuite2016R1/bin/’
• display : Boolean to define if Monolix calculation is displayed or not (FALSE by default)
• saveGraphics : Boolean to define if the graphics are saved or not (FALSE by default). It is not done by default in the Monolix project configuration.
• scenario : Monolix scenario could be a list of
• saem: estimate population parameters
• indiv: estimate individual parameters
• fim: estimate the standard errors of the estimates and Fisher information matrix
• ll: estimate the log-likelihood
• graphics: generate the result graphics
• By default, the run option is used to get the scenario from the project

Several examples,

• source('runMonolix.R')
runMonolix(name='C:/Users/jonathan/mydemos/1.creating_and_using_models/libraries_of_models',
mlxInstallDir='C:/ProgramData/Lixoft/MonolixSuite2016R1/bin/')

run all the projects in the defined folder. The scenario will be the one defined in each projects

• source('runMonolix.R')
runMonolix(name='C:/Users/jonathan/mydemos/1.creating_and_using_models/libraries_of_models',
mlxInstallDir='C:/ProgramData/Lixoft/MonolixSuite2016R1/bin/',scenario=c('saem','ll'))

run all the projects in the defined folder. The scenario will be the calculation of SAEM and the the Log-Likelihood for each projects

• source('runMonolix.R')
runMonolix(name=c('C:/Users/jonathan/mydemos/1.creating_and_using_models/libraries_of_models',
'C:/Users/jonathan/mydemos/1.creating_and_using_models/libraries_of_models/2.2.censored_data/censoring1_project.mlxtran'),
mlxInstallDir='C:/ProgramData/Lixoft/MonolixSuite2016R1/bin/', scenario=c('graphics'), saveGraphics = TRUE)

run all the projects in the defined folder and the other Mlxtran project. The scenario will only be the calculation of the graphics and they will be saved.

### 5.6.Submission of Monolix analysis to regulatory agencies

Monolix is used for regulatory submissions (including the FDA and the EMA) of population PK and PK/PD analyses. Monolix analyses for first in human dose estimation, dose-finding studies and registration studies have been routinely and successfully submitted to the FDA, EMA [3] and other agencies. Some of the agencies like the FDA or the EMA do have Monolix and the modelling experts to understand, review and run Monolix. Regulators are also taking part in publishing research articles with Monolix [2].
Regulatory guidelines [1] provide only little information on the required electronic files for submission. Based on exchanges with regulatory agencies and confirmed through past regulatory submissions using Monolix, the following listed files (in Table 1) are required for a complete Monolix analysis submission package. The Monolix analysis submission package listed in Table 1 has all the files needed to run Monolix without any implementation work and to reproduce the results. Attention must be payed to use relative file path definitions to facilitate the project transfer from one computer system to another (Monolix documentation). Further, it is important to consider that Table 1 represents the Monolix submission package for a typical analysis. There might be modelling cases that require the submission of additional material. Each submission should be treated separately and carefully reviewed for completeness cases by cases.

Table 1 Monolix analysis submission package

 File Explanation File type, format Report (.pdf) Report detailing all of the modelling according to the EMA or FDA guidelines [1] PDF Data (.csv) Data file containing all the observations, dosing history, patient specific information and other information provided via the data file Text file, CSV Project file (.mlxtran) Defines what data file, model file, algorithm settings, graphics settings and parameters have been used to generate the results Text file, Mlxtran Model file (.txt) The structural model in Mlxtran syntax Text file, Mlxtran Graphics file (.xmlx) Definition of the diagnostic plot settings Ext file, XMLX

In addition, some Monolix files define project independent parameters that are applied to all projects run on the same computer account. In all but the most special cases users will have these parameters set to the default values. In case these parameters are modified, they should also be communicated. They can be found in the following files:

Table 2 Computer account dependent settings

 File Parameter Explanation monolix/config/config.ini useAnalyticalFunctions=true Setting defines if analytical functions are used (default is ‘true’). The rest of the file are path configurations for temporary files. monolix/config/preferences.xmlx This defines the number of doses (default is 4) added before the dose entered with the SS flag set to true (so 5 doses in total, by default). It is only used when a SS column is used in the data set.

A Monolix run will automatically produce a large number of additional files. However, the files listed in Table 1 and Table 2 are sufficient to entirely reproduce the results. Note that all files in Table 1 are in a human readable format. Thus, the information contained in these files can also be included into the appendices of the report creating one single document that contains everything to reproduce the results.

#### References

[1] Applicable regulatory guidelines

• EMA “Guideline on reporting the results of population pharmacokinetic analyses” (CHMP/EWP/185990/06): details what should be contained in a PK or PK/PD analysis report.
• FDA “Guidance for Industry Population Pharmacokinetics”: takes a larger scope also addressing “Data Integrity and Computer Software’ and ‘Electronic Files”.
• FDA “Exposure-Response Relationships – Study Design, Data Analysis, and Regulatory Applications”.

[2] FDA publications using MonolixSuite

• “Plasma pharmacokinetics of ceftiofur metabolite desfuroylceftiofur cysteine disulfide in holstein steers: application of nonlinear mixed-effects modeling.”, J Vet Pharmacol Ther 2016 Apr;39(2):149-56, DOI: 10.1111/jvp.12245. O. A. Chiesa, S. Feng, P. Kijak,E. A. Smith, H. Li and J. Qiu.
• “Quantification of disease progression and dropout for Alzheimer’s disease.”, Int. Journal of Clinical Pharmacology and Therapeutics, Volume 51 – February (120 – 131),(Doi: 10.5414/CP201787). D. William-Faltaos, Y. Chen, Y. Wang, J. Gobburu,, and H. Zhu.
• “Estimation of Population Pharmacokinetic Parameters Using MLXTRAN Interpreter in MONOLIX 2.4”, D. William Faltaos, Acop 2009.

[3] Submission with Monolix analyses to the EMA

• Procedure No. EMEA/H/C/000402/II/0110/G – EMA/CHMP/186699/2015 –  Tamiflu – International non-proprietary name: OSELTAMIVIR. (link)
• Procedure No. EMEA/H/C/000401/II/0066 – EMA/168487/2015 – Tracleer –  International non-proprietary name: BOSENTAN. (link)

### Introduction

Initial values are specified for the Fixed effects, for the Variances of the prandom effects and for the Residual error parameters. It is usually recommended to run SAEM with different initializations and to compare the results (this can be done via a convergence assessment, accessible through the “Assessment” button in the GUI).

### Initialization of the “Fixed effects”

In this frame in the middle of the “Initialization” frame, the user can modify all the initial values of the fixed effects. There is at least on initial values by individual parameters. If no value is pre-defined (from a previously saved project for example), a default value at 1 is proposed. A right click on an initial value displays a list with three options.

• “Maximum likelihood estimation” (default): The parameter is estimated using maximum likelihood. In that case, the background remains white.
• “Fixed”: the parameter is kept to its initial value and so, it will not be estimated. In that case, the background is set to pink.
• “Bayesian estimation”: a pop-up windows opens to specify a prior distribution for this parameter. In that case, the background of the case is set to purple. To see more on the prior distribution, click here.

When an individual parameters depends on covariate, initial values for the dependency (named with $\beta$ prefix, for instance $\beta\_ka\_sex$ to add the depedency of sex, on parameter ka) are displayed. The default initial value is 0. The three same options are also valid for these parameters.

In case of a continuous covariate, the covariate is added linearly to the transformed parameter, with a coefficient $\beta$. For categorical covariates, the initial value for the reference category will be the one of the fixed effect, while for all other categories it will be the initial value for the fixed effect plus the initial value of the $\beta$, in the transformed parameter space. It is not possible to define different initial values for the non-reference categories. The equations for the parameters can be visualized by clicking on the white paper clip button next to the model file.
In case of IOV, the same initialization is performed but in the IOV window when clicking on “IOV” on the right of the covariance model.

### Initialization of the “Stand. dev. of the random effects”

In this frame in the middle of the “Initialization” frame, the user can modify all the initial values of the standard deviation of the random effects. The default value is set to 1. Each individual parameter has its initial value for the corresponding random effect. However, if the user specifies in the covariance model that a parameter has no variance (put 0 in the corresponding diagonal term of the covariance model), the corresponding standard deviation is set to 0 and the box is grayed to show the user that no modification can be done. A right click on an initial value displays a list with two options: “Maximum likelihood estimation” (default choice), and “Fixed”.
A button is added to offer the possibility to work with either the variance or the standard deviation. Notice that the transformations are done automatically by Monolix.

### Initialization of the “Residual error parameters”

In this frame on the right of the “Initialization” frame, the user can modify all the initial values of the residual error parameters. There are as many lines as outputs of the model. The non-grayed values depend on the considered observation model. The default value depends on the parameter (1 for “a”, 0.3 for “b” and 1 for “c”). A right click on an initial value displays a list with two options: “Maximum likelihood estimation” (default choice), and “Fixed”.

### How to initialize your parameters?

#### Check initial fixed effects

When clicking on the “Check the initial fixed effects” button, a pop-up window opens. The simulations obtained with the initial population fixed effects values are displayed for each individual together with the data points, in case of continuous observations. This feature is very useful to find some “good” initial values. Although Monolix is quite robust with respect to initial parameter values, good initial estimates speed up the estimation. You can change the values of the parameters with the buttons + and – and see how the agreement with the data change.

#### On the use of Mlxplore

Another convenient method is to use Mlxplore to have an initial guess. For that, you can export your model to Mlxplore (blue button next to the modle file in the Monolix GUI) and play with the initial values to see your model response. In the “Data” tab of the Mlxplore GUI, you can choose to overlay your data on top of the simulation. The main interest, compared to the “check the initial fixed effects” feature, is that you can also modify your structural model.

#### On the use of last estimates

If you have already estimated the population parameters for this project, then you can use the “Use the last estimates” button to use the previous estimates as initial values.

### 5.8.How to compute AUC using in Monolix and Mlxtran

Often the Area under the PK curve (AUC) is needed as an important PK metric to link with the pharamcodymanic effect. We show here how the AUC can be computed within the mlxtran model file and be outputted for later analysis.

In the EQUATION section of the Mlxtran model you can use the following code. It is the basic implementation of the AUC for a 1-compartmental model with absorption rate ka. It will compute the AUC from the start to the end of the integration.

Notice that the calculation can be used as an output and thus in the output={} definition, or just computed and available for post-treatment using table={} as can be seen here.

#### AUC for t=0 to t=tend and fast computation with linear ODE

INPUT:
parameter = {ka, V, Cl}

PK:
compartment(cmt=1, amount=Ac, volume=V, concentration=Cc)
elimination(cmt=1, Cl)
oral(ka, cmt=1)

EQUATION:
odeType = linear
ddt_AUC = 1/V * Ac

OUTPUT:
output = {Cc, AUC}


#### Time interval AUC

If you want to limit the computation of the AUC to a time period you can use the following code:

INPUT:
parameter = {ka, V, Cl}

PK:
compartment(cmt=1, amount=Ac, volume=V, concentration=Cc)
elimination(cmt=1, Cl)
oral(ka, cmt=1)

EQUATION:
ddt_AUC = 1/V * Ac
if(t < 50)
AUC50 = AUC
end
if(t < 100)
AUC100 = AUC
end
AUC50_100 =AUC100-AUC50

OUTPUT:
output = {Cc, AUC50_100}


Note that the t==tDose would not work because the integrator does not necessarily evaluate the time exactly at the times of doses. Thus the test t==tDose might not be tested at all during the computation.

#### Fast computation with linODE and dose interval AUC (AUCtau)

In many cases the mlxtran code above will be sufficient. For cases where you need fast execution you use a special solver for linear ODE systems by specifying odeType = linear in the EQUATION section. This does not work with if statements and you will need to use the code below. In this code the AUC is computed for each dose interval. Thus, at each dose the AUC is set to zero and the concentration is integrated until the next administration.

INPUT:
parameter = {ka, V, Cl}

PK:
compartment(cmt=1, amount=Ac, volume=V, concentration=Cc)
elimination(cmt=1, Cl)
oral(ka, cmt=1)

; Create a separate dummy compartment for the AUC
compartment(cmt=2, amount=AUCtau)
iv(cmt=2, p= - AUCtau/amtDose)

EQUATION:
odeType=linear
ddt_AUCtau = 1/V * Ac

OUTPUT:
output = {Cc, AUC}


### Purpose

When clicking of the “Output to save” button, the user can define the outputs (graphics and tables) that will be saved in the result folder.

### Graphics

The list of graphics corresponds to all the graphics that Monolix can generate

• Project summary
• Spaghetti plot:  This graphic displays the original data w.r.t. time along with some additional informations.
• Individual fits: This graphics displays the individual fits, using the population parameters with the individual covariates (red) and the individual parameters (green) w.r.t. time on a continuous grid.
• Obs. vs Pred: This graphic displays observations w.r.t. the predictions computed using the population parameters (on the left), and w.r.t. the individual parameters (on the right).
• Covariates: This graphic displays displays the estimators of the individual parameters in the Gaussian space (and those for random effects) w.r.t. the covariates.
• Parameters Distribution: This graphic displays the estimated population distributions of the individual parameters.
• Random Effects (boxplot): This graphic displays the distribution of the random effects with boxplots.
• Random Effects (joint dist.): This graphic displays scatter plots for each pairs of random effects.
• Convergence SAEM
• Residuals
• VPC
• NPC
• BLQ
• Prediction distribution
• Categorized Data
• Time to Event data
• Transition Probabilities
• Posterior distribution
• Contribution to likelihood

To know more about these graphics, see the Graphic section. Moreover, if these graphics are not generated (if not feasible nor requested by the user in the Graphic selection), they will not be saved.

### Tables

The list of tables corresponds to all the tables that Monolix can generate. A more detailed description of the content of the tables can be found here.

• Observation times:  It generates a file predictions.txt containing Observation times table. It contains the regression variables, the individual predictions, the population predictions, the weighted population residuals, the weighted individual residuals and the NPDE, for all the observation times in the data set.
• Fine-grid times: It generates a file finegrid.txt containing Fine-grid times table. It contains the regression variables, the  individual predictions, the population predictions in a fine (regular) grid.
• All times: It generates a file fulltimes.txt containing All times table. It contains the regression variables, the individual predictions, the population predictions for a possibly larger set of times in the data set, including observation times, lines where MDV column is set to 2.
• LL individuals contribution: it generates a file individualContributionToLL.txt containing LL individuals contribution table. It contains subject contribution to the total log-likelihood.
• Covariates summary: It generates a file covariatesSummary.txt containing Covariates summary table. It contains a summary of all the covariates defined in the project.

### 5.10.Bayesian feature : on the use of priors on a fixed effect

When you select to define a prior distribution on a fixed effect, a new window will open to define its law as in the following

As for individual parameters, you can choose from some given distributions (normal, log-normal, logit-normal and probit-normal) or you can define your own as a transform T of a Gaussian distributed variable.
Assuming $\theta$ to be the chosen fixed effect with the transformation $\theta = T(Z)$ where $Z \sim {\cal N} (\mu_Z; \sigma^2_Z)$ is a Gaussian distributed variable. You must specify $m_{\theta}$ the typical value of $\theta$ ($\mu_Z = T^{-1}(m_{\theta})$ and the variance or standard deviation of Z. By default, the current initial value will be used as typical value. Also, selecting a different typical value will set it automatically as initial value for the corresponding parameter. The user can also choose between two estimation methods: M.A.P. and posterior distribution.
Notice that Monolix can estimate the M.A.P only for

• gaussian priors if the parameter is a covariate coefficient (a “$\beta$”)
• priors with same distribution than the corresponding individual parameter if $\theta$ is an intercept. It means that if V is set as log-normal distributed, then the M.A.P of $\theta$ ($\theta =\textrm{intercept of V}$) can only be computed for log-normal priors on $\theta$.

### 5.11.How to export to Datxplore, Mlxpore and Simulx ?

Using Monolix interface, it is possible to export your data set, your model, your project to other Lixoft software.

## Export to Datxplore

Datxplore is a graphical and interactive software for the exploration and visualization of data. Datxplore provides various plots and graphics (box plots, histograms, survival curves…) to study the statistical properties of discrete and continuous data, and to analyze the behavior depending on covariates, individuals, etc. It is a great application to have a better look to your data set.

In Monolix, you can export your data set by clicking on the Datxplore icon in the Data and model frame as on the following figure. Datxplore will be directly launched with the defined data set.

Notice that, due to limitation in Datxplore, it is not possible to export data sets with IOV, or washout with an EVID=4.

## Export to Mlxplore

Mlxplore is an application for the visual and interactive exploration of your models. It is designed for intuitive and easy use. It is a powerful solution in your daily modeling work as well as for sharing and teaching of PK/PD principles or for real time dose-regimen exploration in front of an audience. Mlxplore also allows you to visualize the statistical components of the model, such as the impact of covariates and inter-individual variability.

In Monolix, you can export your model (longitudinal and statistical) and the data set by clicking on the Mlxplore icon in the Data and model frame as on the following figure. When clicking on the Mlxplore icon, a window will pop up, where the user can define

• The prediction/simulation grid
• minimum
• step
• maximum
• The considered individual (with its administration design)
• Only the first one
• All individuals
• User choice

After clicking on OK, Mlxplore launches. Thus, you will be able to explore your model and associate it to the data set.

Notice that some cases are not managed by Mlxplore:

• Models with a non-continuous output
• Models with regressors
• Models with bsmm functions
• Data set with IOV or EVID=4

Another important point is that Mlxplore is only managing predictions. Thus, even if a continuous error model is considered in the project, it will not be taken into account and the associated sliders have no impact on the results.

Moreover, when categorical covariate are considered, the following message appears in the export window

This implies that the categorical covariates are not anymore considered as a category but as a continuous covariate with 0 for the reference category.

## Export to Simulx

Simulx is a powerful and flexible simulator for clinical trial pharmacometrics that runs on top of the Lixoft simulation engine. It covers a comprehensive range of data types for simulation and links to a ready to use PK and PD library. It directly connects with Monolix for seamless Modeling and Simulation, and is the statistically most rigorous solution available today. Simulx is currently available via a comprehensive R package thanks to its combination with mlxR, a DDMoRe-sponsored library developed by Inria which contains several R functions for advanced clinical trial simulation.

The simulx function can either directly take the path to a Monolix project as input, or use a model file which can be generated from the Monolix project. This is done using the monolix2simulx function.

Several restrictions:

• Projects with IOV can not be exported.
• Projects with Matlab longitudinal projects can not be exported.
• Projects with several longitudinal models can not be exported.
• Projects with custom distributions for individual parameters can not be exported.
• Projects with bsmm function in the longitudinal model can not be exported.
Suggest Edit