1.Monolix Documentation
Version 2016R1
This documentation is for Monolix Suite 2016R1.
©Lixoft
Monolix
Monolix (MOdèles NOn LInéaires à effets miXtes) is a platform of reference for model based drug development. It combines the most advanced algorithms with unique ease of use. Pharmacometricians of preclinical and clinical groups can rely on Monolix for population analysis and to model PK/PD and other complex biochemical and physiological processes. Monolix is an easy, fast and powerful tool for parameter estimation in nonlinear mixed effect models, model diagnosis and assessment, and advanced graphical representation. Monolix is the result of a ten years research program in statistics and modeling, led by Inria (Institut National de la Recherche en Informatique et Automatique) on nonlinear mixed effect models for advanced population analysis, PK/PD, preclinical and clinical trial modeling & simulation.
Objectives
The objectives of Monolix are to perform:
 Parameter estimation for nonlinear mixed effects models
 computing the maximum likelihood estimator of the population parameters, without any approximation of the model (linearization, quadrature approximation, …), using the Stochastic Approximation Expectation Maximization (SAEM) algorithm,
 computing standard errors for the maximum likelihood estimator
 computing the conditional modes, the conditional means and the conditional standard deviations of the individual parameters, using the HastingsMetropolis algorithm
 Model selection and diagnosis
 comparing several models using some information criteria (AIC, BIC)
 testing hypotheses using the Likelihood Ratio Test
 testing parameters using the Wald Test
 Easy description of pharmacometric models (PK, PKPD, discrete data) with the Mlxtran language
 Goodness of fit plots
An interface for ease of use
Monolix can be used either via a graphical interface or command lines for powerful scripting. This means less programming and more focus on exploring models and pharmacology to deliver in time. The interface can be seen here
One can see several blocks in the interface
 Tasks
 Data and model
 Initialization
 Results
Each of this block has a specific section on this website. An advanced description of available graphics is also provided.
2.Data and models
Creating and using models
 Libraries of models learn how to use the Monolix libraries of PKPD models and create your own libraries
 Model implementation learn how to create a project by defining the model using the Graphical User Interface (GUI) and/or an external model file
 Outputs and Tables learn how to define outputs and create tables with selected outputs of the model
Models for continuous outcomes
 Residual error model learn how to use the predefined residual error models
 Handling censored data learn how to handle easily and properly censored data, i.e. data below (resp. above) a lower (resp.upper) limit of quantification (LOQ) or detection (LOD).
 Mixture of structural models learn how to implement between subject mixture models (BSMM) and within subject mixture models (WSMM).
Models for non continuous outcomes
 Timetoevent data model learn how to implement a model for (repeated) timetoevent data
 Count data model learn how to implement a model for count data, including hidden Markov model.
 Categorical data model learn how to implement a model for categorical data, assuming either independence or a Markovian dependence between observations.
Joint models for multivariate outcomes
 Continuous PKPD model learn how to implement a joint model for continuous pharmacokineticspharmacodynamics (PKPD) data
 Joint continuous and non continuous data model learn how to implement a joint model for continuous and non continuous data, including count, categorical and timetoevent data.
Models for the individual parameters
 Introduction
 Probability distribution of the individual parameters learn how to define the probability distribution and the correlation structure of the individual parameters.
 Model for individual covariates learn how to implement a model for continuous and/or categorical covariates.
 Inter occasion variability learn how to take into account inter occasion variability (IOV).
 Mixture of distributions learn how to implement a mixture of distributions for the individual parameters.
Pharmacokinetic models
 Single route of administration learn how to define and use a PK model for single route of administration.
 Multiple routes of administration learn how to define and use a PK model for multiple routes of administration.
 From multiple doses to steadystate learn how to define and use a PK model with multiple doses or assuming steadystate.
Some extensions
 Using regression variables learn how to define and use regression variables (time varying covariates).
 Bayesian estimation learn how to combine maximum likelihood estimation and Bayesian estimation of the population parameters.
 Delayed differential equations learn how to implement a model with delayed differential equations (DDE).
2.1.Creating and using models
2.1.1.Libraries of models
 The PK library
 The PD library
 Link a PK model and a PD model
 Link a PK model and a PD model with an effect compartment
 Create and use your own libraries
Objectives: learn how to use the Monolix libraries of PKPD models and create your own libraries.
Projects: warfarinPKlibrary_project, PDsim_project, warfarinPKPDlibrary_project, warfarinPKPDelibrary_project, hcv_project
The PK library
 theophylline_project (data = ‘theophylline_data.txt’ , model=’lib:oral1_1cpt_kaVCl.txt’)
A one compartment PK model for oral administration with first order absorption and linear elimination is used for fitting the theophylline data. The model oral1_1cpt_TlagkaVCl
from the PK library is used:
Click on the button Model file
to select or modify a model. Then, select the PK library from the menu
Monolix
Click on the Edit
button to edit the model:
[LONGITUDINAL] input = {ka, V, Cl} EQUATION: Cc = pkmodel(ka, V, Cl) OUTPUT: output = Cc
The name of the PK model appears in the section [LONGITUDINAL]
of the project file
<MODEL> [INDIVIDUAL] input = {ka_pop, omega_ka, V_pop, omega_V, Cl_pop, omega_Cl} DEFINITION: ka = {distribution=lognormal, typical=ka_pop, sd=omega_ka} V = {distribution=lognormal, typical=V_pop, sd=omega_V} Cl = {distribution=lognormal, typical=Cl_pop, sd=omega_Cl} [LONGITUDINAL] input = a file = 'lib:oral1_1cpt_kaVCl.txt' DEFINITION: concentration = {distribution=normal, prediction=Cc, errorModel=constant(a)}
The PD library
 PDsim_project (data = ‘PDsim_data.txt’ , model=’lib:immed_Emax_const.txt’)
An Emax PD model with constant baseline is used for this project.
Link a PK model and a PD model
 warfarinPKlibrary_project (data = ‘warfarin_data.txt’ , model=’lib:oral1_1cpt_TlagkaVCl.txt’)
We first fit the PK data of the warfarin data using a model from the PK library
 warfarinPKPDlibrary_project (data = ‘warfarin_data.txt’ , model= {‘lib:oral1_1cpt_TlagkaVCl.txt’, ‘lib:immed_Imax_const.txt’})
To fit simultaneously the PK and the PD warfarin data, we need to link the PK model with a PD model. Click on the +
and select a model from the PD library. We will use here the immediate response model immed_Imax_const from the Monolix
PD library
[LONGITUDINAL] input = {Cc, Imax, IC50, S0} Cc = {use=regressor} EQUATION: E = S0*(1  Imax*Cc/(Cc+IC50)) OUTPUT: output = E
Two models then appear in the list of model files:
Link a PK model and a PD model with an effect compartment
 warfarinPKPDelibrary_project (data = ‘warfarin_data.txt’ , model= {‘lib:oral1_1cpt_TlagkaVClke0.txt’, ‘lib:immed_Imax_const_Ce.txt’})
Inserting an effect compartment is made possible by using the PKe library
and the PDe library
Then, these 2 models are linked for fitting simultaneously the PK and the PD warfarin data.
Create and use your own libraries
 8.case_studies/hcv_project (data = ‘hcv_data.txt’ , model=’VKlibrary/hcvNeumann98_model.txt’)
You can use the mlxEditor, or any other text editor, for creating new models and new libraries of models. A basic library which includes several viral kinetics model is available in 8.case_studies/model. Click on Other list
for selecting this library or any other existing library of models.
2.1.2.Model implementation
Objectives: learn how to create a project by defining the model using the Graphical User Interface (GUI) and/or an external model file.
Projects: theo1_project, theo2_project
Use an external file for the structural model
 theo1_project (data = ‘theophylline_data.txt’ , model= {‘theo1a_model.txt’, ‘theo1b_model.txt’})
In this example, the structural model is implemented in a external file.
You can edit this model by clicking on the mlxEditor button
theo1a_model.txt is a text file which contains an OUTPUT section where Cc
is defined as the prediction for the observations
[LONGITUDINAL] input = {ka, V, Cl} EQUATION: Cc=pkmodel(ka, V, Cl) OUTPUT: output = Cc
The other components of the model (the residual error model and the model for the individual PK parameters) are defined in the graphical user interface (GUI).
Instead of theo1a_model.txt, we can use theo1b_model.txt where no output is defined.
[LONGITUDINAL] input = {ka, V, Cl} EQUATION: Cc=pkmodel(ka, V, Cl)
For changing the structural model, click on the button Model file
and select the new model in the list
Since no output is defined in this model, the name of the variable used as a prediction for the observations should be provided in a GUI:
The project can be saved and used equivalently either with theo1a_model.txt or theo1b_model.txt.
Use an external file for the structural model and the residual error model
 theo2_project (data = ‘theophylline_data.txt’ , model= {‘theo2a_model.txt’, ‘theo2b_model.txt’})
Both the structural model and the residual error model are defined in theo2a_model.txt
[LONGITUDINAL] input = {ka, V, Cl, a} EQUATION: Cc=pkmodel(ka, V, Cl) DEFINITION: y1 = {distribution=normal, prediction=Cc, errorModel=constant(a) } OUTPUT: output = y1
Then, the residual error model cannot be modified anymore in the main GUI
The same model for the observations is implemented in theo2b_model.txt, but where no output is defined (y_1
should defined as the output in the GUI).
2.1.3.Outputs and Tables
Objectives: learn how to define outputs and create tables from the outputs of the model.
Projects: tgi1_project, tgi2_project
About the OUTPUT block
 tgi1_project (data = ‘tgi_data.txt’ , model=’tgi1_model.txt’)
We use the Tumor Growth Inhibition (TGI) model proposed by Ribba et al. in this example (Ribba, B., Kaloshi, G., Peyre, M., Ricard, D., Calvez, V., Tod, M., . & Ducray, F., A tumor growth inhibition model for lowgrade glioma treated with chemotherapy or radiotherapy. Clinical Cancer Research, 18(18), 50715080, 2012.)
[LONGITUDINAL] input = {K, KDE, KPQ, KQPP, LAMBDAP, GAMMA, DELTAQP,PT0,Q0} PK: depot(target=C) EQUATION: t0=0 PT_0=PT0 Q_0 = Q0 PSTAR = PT+Q+QP ddt_C = KDE*C ddt_PT = LAMBDAP*PT*(1PSTAR/K) + KQPP*QP  KPQ*PT  GAMMA*KDE*PT*C ddt_Q = KPQ*PT  GAMMA*KDE*Q*C ddt_QP = GAMMA*KDE*Q*C  KQPP*QP  DELTAQP*QP
PSTAR is the tumor size predicted by the model. It is therefore used as a prediction for the observations in the project:
[LONGITUDINAL] input = b file = 'model/tgi1_model.txt' DEFINITION: y1 = {distribution=normal, prediction=PSTAR, errorModel=proportional(b)}
We can select which tables and graphics to save with the button Outputs to save
We select in this example the tables with predicted values computed at the observation times and on a finer grid (used for the graphics of individual fits)
The sequence of tasks selected in this project includes the estimation of the individual parameters (using the conditional models) and several diagnostic plots (including the individual fits).
Thus, individual predictions of the tumor size PSTAR are computed using both the conditional modes (indPred_mode) and the conditional means estimated during the last iterations of SAEM (indPred_mean*) and saved in table predictions.txt
and finegrid.txt
Remark: the same model file tgi1_model.txt can be used with different tools, including Mlxplore or Simulx (see this Shiny application for instance).
Add additional outputs in tables
 tgi2_project (data = ‘tgi_data.txt’ , model=’tgi2_model.txt’)
We can save in the tables additional variables defined in the model, such as PT, Q and QP, for instance, by adding a block OUTPUT: in the model file:
OUTPUT: output = PSTAR table = {PT, Q, QP}
Tables predictions.txt and finegrid.txt now include the predicted values of these variables for each individual (columns PT_mean*, Q_mean*, QP_mean*, PT_mode, Q_mode, QP_mode, PT_pop, Q_pop, QP_pop).
2.2.Models for continuous outcomes
2.2.1.Residual error model
 Introduction
 Defining the residual error model from the Monolix GUI
 Some basic residual error models
 Residual error models for bounded data
 Autocorrelated residuals
 Using different error models per group/study
Objectives: learn how to use the predefined residual error models.
Projects: warfarinPKlibrary_project, bandModel_project, autocorrelation_project, errorGroup_project
Introduction
For continuous data, we are going to consider scalar outcomes () and assume the following general model:
for i from 1 to N, and j from 1 to , where is the parameter vector of the structural model f for individual i. The residual error model is defined by the function g which depend on some additional vector of parameters . The residual errors are standardized Gaussian random variables (mean 0 and standard deviation 1). In this case, it is clear that and are the conditional mean and standard deviation of , i.e.,
The following error models are available in Monolix
:
 constant : and
 proportional : and
 combined1 : and
 combined2 : and
 proportionalc : and
 combined1c : and
 combined2c : and
The assumption that the distribution of any observation is symmetrical around its predicted value is a very strong one. If this assumption does not hold, we may want to transform the data to make it more symmetric around its (transformed) predicted value. In other cases, constraints on the values that observations can take may also lead us to transform the data.
Model can be extended to include a transformation of the data:
where u is a monotonic transformation (a strictly increasing or decreasing function). As we can see, both the data and the structural model f are transformed by the function u so that remains the prediction of . Several error models based on such transformation are available in Monolix
:
 exponential : (assuming that ) and . This is equivalent to assume that .
 logit : (assuming that ).
 band(0,10) : (assuming that ).
 band(0,100) : (assuming that ).
Defining the residual error model from the Monolix GUI
 A menu in the frame Data and model of the main GUI allows one to select the residual error model:
 a summary of the statistical model which includes the residual error model can be displayed:
 when a combined 1 error model is used, we can force parameter b to be positive by ticking the box b>0:
 Autocorrelation of the residual errors is estimated when the checkbox r is ticked:
Some basic residual error models
 warfarinPKlibrary_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)
The residual error model used with this project for fitting the PK of warfarin is a constant error model
Several diagnostic plots can then be used for evaluating the error model. Here, the VPC’s suggests that a proportional component perhaps should be included in the error model:
We can modify the residual error model and select, for instance, a combined1 error model directly from the menu in the GUI:
Estimated population parameters now include a and b
VPCs obtained with this error model do not show any mispecification
Remarks:
 When the residual error model is defined in the GUI, a bloc DEFINITION: is then automatically added to the project file in the section [LONGITUDINAL] of <MODEL> when the project is saved:
DEFINITION: y1 = {distribution=normal, prediction=Cc, errorModel=combined(a,b)}
 the statistical summary includes the residual error model
Residual error models for bounded data
 bandModel_project (data = ‘bandModel_data.txt’, model = ‘lib:immed_Emax_null.txt’)
In this example, data are known to take their values between 0 and 100. We can use a band(0,100) error model if we want to take this constraint into account.
VPCs obtained with this error model do not show any mispecification
The statistical summary includes this residual error model:
This residual error model is implemented in Mlxtran
as follows:
DEFINITION: effect = {distribution=logitnormal, min=0, max=100, prediction=E, errorModel=constant(a)}
Autocorrelated residuals
For any subject i, the residual errors are usually assumed to be independent random variables. The extension to autocorrelated errors is possible by assuming, that is a stationary autoregressive process of order 1, AR(1), which autocorrelation decreases exponentially:
where for each individual i. If for any (i,j), then and the autocorrelation function for individual i is given by
The residual errors are uncorrelated when .
 autocorrelation_project (data = ‘autocorrelation_data.txt’, model = ‘lib:infusion_1cpt_Vk.txt’)
Autocorrelation is estimated since the checkbox r is ticked in this project:
Estimated population parameters now include the autocorrelation r:
Remarks:
Monolix
accepts both regular and irregular time grids. Rich data are required (i.e. a large number of time points per individual) for estimating properly the autocorrelation structure of the residual errors.
Using different error models per group/study
 errorGroup_project (data = ‘errorGroup_data.txt’, model = ‘errorGroup_model.txt’)
Data comes from 3 different studies in this example:
We want to use different error models for the 3 studies. A solution consists in defining the column STUDY with the reserved keyword YTYPE. It will be then possible to define one error model per outcome:
We use here the same PK model for the 3 studies:
[LONGITUDINAL] input = {V, k} PK: Cc1 = pkmodel(V, k) Cc2 = Cc1 Cc3 = Cc1 OUTPUT: output = {Cc1, Cc2, Cc3}
Since 3 outputs are defined in the structural model, we can now define 3 error models in the GUI:
Different residual error parameters are estimated for the 3 studies. We can remark than, even if 2 proportional error models are used for the 2 first studies, different parameters and are estimated:
2.2.2.Handling censored (BLQ) data
 Introduction
 Theory
 PK data below a lower limit of quantification
 PK data below a lower limit of quantification or below a limit of detection
 PK data below a lower limit of quantification and PD data above an upper limit of quantification
 Combination of interval censored PK and PD data
 Case studies
Objectives: learn how to handle easily and properly censored data, i.e. data below (resp. above) a lower (resp.upper) limit of quantification (LOQ) or below a limit of detection (LOD).
Projects: censored1log_project, censored1_project, censored2_project, censored3_project, censored4_project
Introduction
Censoring occurs when the value of a measurement or observation is only partially known. For continuous data measurements in the longitudinal context, censoring refers to the values of the measurements, not the times at which they were taken. For example, the lower limit of detection (LLOD) is the lowest quantity of a substance that can be distinguished from its absence. Therefore, any time the quantity is below the LLOD, the “observation” is not a measurement but the information that the measured quantity is less than the LLOD. Similarly, in longitudinal studies of viral kinetics, measurements of the viral load below a certain limit, referred to as the lower limit of quantification (LLOQ), are so low that their reliability is considered suspect. A measuring device can also have an upper limit of quantification (ULOQ) such that any value above this limit cannot be measured and reported.
As hinted above, censored values are not typically reported as a number, but their existence is known, as well as the type of censoring. Thus, the observation (i.e., what is reported) is the measurement if not censored, and the type of censoring otherwise.
We usually distinguish between three types of censoring: left, right and interval. In each case, the SAEM algorithm implemented in Monolix
properly computes the maximum likelihood estimate of the population parameters, combining all the infomation provided by censored and non censored data.
Theory
In the presence of censored data, the conditional density function needs to be computed carefully. To cover all three types of censoring (left, right, interval), let be the (finite or infinite) censoring interval existing for individual i at time . Then,
where
We see that if is not censored (i.e. ), its contribution to the likelihood is the usual , whereas if it is censored, the contribution is .
For the calculation of the likelihood, this is equivalent to the M3 method in NONMEM when only the CENS column is given, and to the M4 method when both a CENS column and a LIMIT column are given.
PK data below a lower limit of quantification
Left censored data
 censored1log_project (data = ‘censored1log_data.txt’, model = ‘pklog_model.txt’)
PK data are logconcentration in this example. The limit of quantification of 1.8 mg/l for concentrations becomes log(1.8)=0.588 for logconcentrations. Column of observations (Y) contains either the LLOQ for data below the limit of quantification (BLQ data) or the measured logconcentrations for non BLQ data. Furthermore, Monolix
uses an additional column CENS to indicate if an observation is left censored (CENS=1) or not (CENS=0). In this example, subject 1 has two BLQ data at time 24h and 30h (the measured logconcentrations were below 0.588 at these times):
Monolix
then recognized this keyword CENS:
The plot of individual fits displays BLQ and non BLQ data together with the predicted logconcentrations on the whole time interval:
For diagnostic plots such as VPC, residuals of observations versus predictions, Monolix
proposes to sample the BLQ data from the conditional distribution
where and are the estimated population and individual parameters. This is the most efficient way to take into account the complete information provided by the data and the model for diagnostic plots such as VPCs:
A strong bias appears if LLOQ is used instead for the BLQ data:
Ignoring the BLQ data entails a loss of information:
Imputed BLQ data is also used for residuals:
and for observations versus predictions:
More about these diagnostic plots
A strong bias appears if LLOQ is used instead for the BLQ data for these two diagnostic plots:
while ignoring the BLQ data entails a loss of information:
Diagnostic plot BLQ plots the cumulative fraction of BLQ data (green line) with a 90
Interval censored data
 censored1_project (data = ‘censored1_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)
We use the original concentrations in this project. Then, BLQ data should be treated as interval censored data since a concentration is know to be positive. In other word, a data reported as BLQ data means that the (non reported) measured concentration is between 0 and 1.8mg/l. An additional column LIMIT reports the lower limit of the censored interval (0 in this example):
Remark: if this column is missing, then BLQ data is assumed to be leftcensored data that can take any positive and negative value below LLOQ.
Monolix
recognized this keyword LIMIT:
Monolix
will use this additional information for properly estimating the parameters of the model and imputing the BLQ data for the diagnostic plots.
PK data below a lower limit of quantification or below a limit of detection
 censored2_project (data = ‘censored2_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)
Several censoring processes can be combined in the same data set, assuming different limits. We may know, for instance that a PK data is either below a limit of detection (LLOD=1.2) or between the limit of detection and the limit of quantification (LLOD=1.2LLOQ=1.8). Column LIMIT and Y report respectively the lower and upper limits of the censoring intervals:
Plot of individual fits now displays LLOQ or LLOD with a red star when a PK data is censored:
PK data below a lower limit of quantification and PD data above an upper limit of quantification
 censored3_project (data = ‘censored3_data.txt’, model = ‘pkpd_model.txt’)
We work with PK and PD data in this project. We assume that the PD data may be right censored and that the upper limit of quantification is ULOQ=90. We use CENS=1 to indicate that an observation is right censored. In such case, the PD data can take any value above the upper limit reported in column Y (here YTYPE=1 and YTYPE=2 are used respectively for PK and PD data):
Plot of individual fits for the PD data now displays ULOQ and the predicted PD profile:
We can display the cumulated fraction of censored data both for the PK and the PD data:
Combination of interval censored PK and PD data
 censored4_project (data = ‘censored4_data.txt’, model = ‘pkpd_model.txt’)
We assume in this example
 2 different censoring intervals(0,1) and (1.2, 1.8) for the PK,
 a censoring interval (80,90) and right censoring (>90) for the PD.
Combining columns CENS, LIMIT and Y allow to combine efficiently these different censoring processes:
This coding of the data means that, for subject 1,
 PK data is between 0 and 1 at time 30h,
 PK data is between 1.2 and 1.8 at times 0.5h and 24h,
 PD data is between 80 and 90 at times 12 and 16,
 PD data is above 90 at times 4 and 8.
Plot of individual fits for the PK and the PD data displays the different limits of these censoring intervals:
Other diagnostic plots, such as the plot of observations versus predictions, adequately use imputed censored PK and PD data:
Case studies
 8.case_studies/hiv_project (data = ‘hiv_data.txt’, model = ‘hivLatent_model.txt’)
 8.case_studies/hcv_project (data = ‘hcv_data.txt’, model = ‘hcvNeumann98_model_latent.txt’)
2.2.3.Mixture of structural models
Objectives: learn how to implement between subject mixture models (BSMM) and within subject mixture models (WSMM).
Projects: bsmm1_project, bsmm2_project, wsmm_project
Introduction
There exist several types of mixture models useful in the context of mixed effects models. It may be necessary in some situations to introduce diversity into the structural models themselves:
 Betweensubject model mixtures (BSMM) assume that there exist subpopulations of individuals. Different structural models describe the response of each subpopulation, and each subject belongs to one of these subpopulations. One can imagine for example different structural models for responders, nonresponders and partial responders to a given treatment.
The easiest way to model a finite mixture model is to introduce a label sequence that takes its values in such that if subject i belongs to subpopulation m. is the probability for subject i to belong to subpopulation m. A BSMM assumes that the structural model is a mixture of M different structural models:
$$f\left( t_{ij};\psi_i,z_i \right) = \sum_{m=1}^M 1_{z_i = m} f_m\left( t_{ij};\psi_i \right) $$
In other word, each subpopulation has its own structural model: is the structural model for subpopulation m.
 Withinsubject model mixtures (WSMM) assume that there exist subpopulations (of cells, viruses, etc.) within each patient. In this case, different structural models can be used to describe the response of different subpopulations, but the proportion of each subpopulation depends on the patient.
Then, it makes sense to consider that the mixture of models happens within each individual. Such withinsubject model mixtures require additional vectors of individual parameters representing the proportions of the M models within each individual i:
The proportions are now individual parameters in the model and the problem is transformed into a standard mixed effects model. These proportions are assumed to be positive and to sum to 1 for each patient.
Between subject mixture models
Supervised learning
 bsmm1_project (data = ‘pdmixt1_data.txt’, model = ‘bsmm1_model.txt’)
We consider a very simple example here with two subpopulations of individuals who receive a given treatment. The outcome of interest is the measured effect of the treatment (a viral load for instance). The two populations are non responders and responders. We assume here that the status of the patient is known. Then, the data file contains an additional column GROUP. This column is duplicated because Monolix
uses it
 i) as a regression variable (X): it is used in the model to distinguish responders and non responders,
 ii) as a categorical covariate (CAT): it is used to stratify the diagnostic plots.
We can then display the data
and use the categorical covariate GROUP_CAT to split the plot into responders and non responders:
We use different structural models for non responders and responders. The predicted effect for non responders is constant:
while the predicted effect for responders decreases exponentially:
The model is implemented in the model file bsmm1_model.txt
(remark that the names of the regression variable in the data file and in the model script do not need to match):
[LONGITUDINAL] input = {A1, A2, k, g} g = {use=regressor} EQUATION: if g==1 f = A1 else f = A2*exp(k*max(t,0)) end OUTPUT: output = f
The plot of individual fits exhibit the two different structural models:
VPCs should then be splitted according to the GROUP_CAT
as well as the prediction distribution for non responders and responders:
Unsupervised learning
 bsmm2_project (data = ‘pdmixt2_data.txt’, model = ‘bsmm2_model.txt’)
The status of the patient is unknown in this project (which means that the column GROUP is not available anymore):
Let p be the proportion of non responders in the population. Then, the structural model for a given subject is with probability p and with probability 1p. The structural model is therefore a BSMM:
[LONGITUDINAL] input = {A1, A2, k, p} EQUATION: f1 = A1 f2 = A2*exp(k*max(t,0)) f=bsmm(f1, p, f2, 1p) OUTPUT: output = f
Important: p is a population parameter of the model to estimate. There is no interpatient variability on p: all the subjects have the same probability to be a non responder in this example. We use a logitnormal distribution for p in order to constrain it to be between 0 and 1, but without variability:
p is estimated with the other population parameters:
The individual parameters are estimated using the conditional modes in this project:
Then, the group to which a patient belong is also estimated as the group of highest conditional probability:
The estimated groups can be displayed together with the individual parameters:
and can be used as a stratifying variable to split some plots such as VPC’s
Within subject mixture models
 wsmm_project (data = ‘pdmixt2_data.txt’, model = ‘wsmm_model.txt’)
It may be too simplistic to assume that each individual is represented by only one welldefined model from the mixture. We consider here that the mixture of models happens within each individual and use a WSMM:
[LONGITUDINAL] input = {A1, A2, k, p} EQUATION: f1 = A1 f2 = A2*exp(k*max(t,0)) f=wsmm(f1, p, f2, 1p) OUTPUT: output = f
Remark: Here, writing f=wsmm(f1, p, f2, 1p) is equivalent to write f=p*f1 + (1p)*f2
Important: Here, p is an individual parameter: the subjects have different proportions of non responder cells. We use a probitnormal distribution for p in order to constrain it to be between 0 and 1, with variability:
There is no latent covariate when using WSMM: mixtures are continuous mixtures. We therefore cannot split anymore the VPCs and the prediction distributions.
2.3.Models for non continuous outcomes
2.3.1.Timetoevent data models
Objectives: learn how to implement a model for (repeated) timetoevent data with different censoring processes.
Projects: tte1_project, tte2_project, tte3_project, tte4_project, rtteWeibull_project, rtteWeibullCount_project
Introduction
Here, observations are the “times at which events occur”. An event may be oneoff (e.g., death, hardware failure) or repeated (e.g., epileptic seizures, mechanical incidents, strikes). Several functions play key roles in timetoevent analysis: the survival, hazard and cumulative hazard functions. We are still working under a population approach here so these functions, detailed below, are thus individual functions, i.e., each subject has its own. As we are using parametric models, this means that these functions depend on individual parameters .
 The survival function gives the probability that the event happens to individual i after time :
 The hazard function is defined for individual i as the instantaneous rate of the event at time t, given that the event has not already occurred:
This is equivalent to
 Another useful quantity is the cumulative hazard function , defined for individual i as
Note that . Then, the hazard function characterizes the problem, because knowing it is the same as knowing the survival function . The probability distribution of survival data is therefore completely defined by the hazard function.
Single event
To begin with, we will consider a oneoff event. Depending on the application, the length of time to this event may be called the survival time (until death, for instance), failure time (until hardware fails), and so on. In general, we simply say “timetoevent”. The random variable representing the timetoevent for subject i is typically written Ti.
Single event exactly observed or right censored
 tte1_project (data = tte1_data.txt , model=tte1_model.txt)
The event time may be exactly observed at time , but if we assume that the trial ends at time , the event may happen after the end. This is “right censoring”. Here, Y=0 at time t means that the event happened after t and Y=1 means that the event happened at time t. The rows with t=0 are included to show the trial start time :
By clicking on the button plot the data
, it is possible to display the Kaplan Meier plot (i.e. the empirical survival function) before fitting any model:
A very basic model with constant hazard is used for this data:
[LONGITUDINAL] input = Te DEFINITION: Event = { type = event, hazard = 1/Te, maxEventNumber = 1 }
Here, Te is the expected time to event. Specification of the maximum number of events is required both for the estimation procedure and for the diagnostic plots based on simulation, such as the predicted interval for the Kaplan Meier plot which is obtained by Monte Carlo:
Single event interval censored or right censored
 tte2_project (data = tte2_data.txt , model=tte2_model.txt)
We may know the event has happened in an interval but not know the exact time . This is interval censoring. Here, Y=0 at time t means that the event happened after t and Y=1 means that the event happened before time t.
Event for individual 1 happened between t=10 and t=15. No event was observed until the end of the experiment (t=100) for individual 5. We use the same basic model, but we need now to specify that the events are interval censored:
[LONGITUDINAL] input = Te DEFINITION: Event = {type=event, hazard=1/Te, eventType=intervalCensored, maxEventNumber=1, intervalLength=5, ; used for the graphics (not mandatory) rightCensoringTime=200 ; used for the graphics (not mandatory) }
Repeated events
Sometimes, an event can potentially happen again and again, e.g., epileptic seizures, heart attacks. For any given hazard function h, the survival function S for individual i now represents the survival since the previous event at , given here in terms of the cumulative hazard from to :
Repeated events exactly observed or right censored
 tte3_project (data = tte3_data.txt , model=tte3_model.txt)
A sequence of event times is precisely observed before :
We can then display the Kaplan Meier plot for the first event and the mean number of events per individual:
After fitting the model, prediction intervals for these two curves can also be displayed on the same graph.
Repeated events interval censored or right censored
 tte4_project (data = tte4_data.txt , model=tte4_model.txt)
We do not know the exact event times, but the number of events that occurred for each individual in each interval of time:
User defined likelihood function for timetoevent data
 rtteWeibull_project (data = rtte_data.txt , model=rtteWeibull_model.txt)
A Weibull model is used in this example:
EQUATION: h = (beta/lambda)*(t/lambda)^(beta1) DEFINITION: Event = {type=event, hazard=h}
 rtteWeibullCount_project (data = rtteCount_data.txt , model=rtteWeibullCount_model.txt)
Instead of defining the data as events, it is possible to consider the data as count data: indeed, we count the number of events per interval. An additional column with the start of the interval is added in the data file and defined as a regression variable:
We then use a model for count data (see rtteWeibullCount_model.txt).
2.3.2.Count data model
 Introduction
 Count data with constant distribution over time
 Count data with time varying distribution
 Hidden Markov model for count data
Objectives: learn how to implement a model for count data.
Projects: count1a_project, count1a_project, count1a_project, count2_project, hmm0_project, hmm1_project
Introduction
Longitudinal count data is a special type of longitudinal data that can take only nonnegative integer values {0, 1, 2, …} that come from counting something, e.g., the number of seizures, hemorrhages or lesions in each given time period . In this context, data from individual j is the sequence where is the number of events observed in the jth time interval .
Count data models can also be used for modeling other types of data such as the number of trials required for completing a given task or the number of successes (or failures) during some exercise. Here, is either the number of trials or successes (or failures) for subject i at time . For any of these data types we will then model as a sequence of random variables that take their values in {0, 1, 2, …}. If we assume that they are independent, then the model is completely defined by the probability mass functions for and . Here, we will consider only parametric distributions for count data.
Count data with constant distribution over time
 count1a_project (data = ‘count1_data.txt’, model = ‘count_library/poisson_mlxt.txt’)
A Poisson model is used for fitting the data:
[LONGITUDINAL] input = lambda DEFINITION: Y = { type = count, log(P(Y=k)) = lambda + k*log(lambda)  factln(k) }
Residuals for noncontinuous data reduce to NPDE’s. We can compare the empirical distribution of the NPDE’s with the distribution of a standardized normal distribution:
VPC’s for count data compare the observed and predicted frequencies of the categorized data over time:
 count1b_project (data = ‘count1_data.txt’, model = ‘count_library/poissonMixture_mlxt.txt’)
A mixture of 2 Poisson distributions is used to fit the same data.
Count data with time varying distribution
 count2_project (data = ‘count2_data.txt’, model = ‘count_library/poissonTimeVarying_mlxt.txt’)
The distribution of the data changes with time in this example:
We then use a Poisson distribution with a time varying intensity:
[LONGITUDINAL] input = {a,b} EQUATION: lambda= a*exp(b*t) DEFINITION: y = {type=count, P(y=k)=exp(lambda)*(lambda^k)/factorial(k)}
This model seems to fit the data very well:
Hidden Markov model for count data
Markov chains are a useful tool for analyzing categorical longitudinal data. However, sometimes the Markov process cannot be directly observed and only some output, dependent on the (hidden) state, is seen. More precisely, we assume that the distribution of this observable output depends on the underlying hidden state. Such models are called hidden Markov models. An HMM is thus defined for individual as a pair of processes , where the latent sequence is a Markov chain and the distribution of observation at time depends on state .
In the following example, the latent sequence take its values in . When (resp. ), is a Poisson random variable with parameter (resp. ).
Remark: Models in the 2 following examples are implemented in Matlab (implementation with Mlxtran is not possible with the current version of Monolix). These models can be used with both the StandAlone and Matlab versions of Monolix, but they can only be modified with the Matlab version.
 hmm0_project (data = ‘hmm_data.txt’, model = ‘count_library/hmm0_mlx.m’)
The latent sequence is a sequence of independent and identically distributed random variables.
 hmm1_project (data = ‘hmm_data.txt’, model = ‘count_library/hmm1_mlx.m’)
The latent sequence is a (hidden) Markov chain.
2.3.3.Categorical data model
 Introduction
 Ordered categorical data
 Ordered categorical data with regression variables
 Discretetime Markov chain
 Continuoustime Markov chain
Objectives: learn how to implement a model for categorical data, assuming either independence or a Markovian dependence between observations.
Projects: categorical1_project, categorical2_project, markov0_project, markov1a_project, markov1b_project, markov1c_project, markov2_project, markov3a_project, markov3b_project
Introduction
Assume now that the observed data takes its values in a fixed and finite set of nominal categories . Considering the observations for any individual as a sequence of conditionally independent random variables, the model is completely defined by the probability mass functions for and . For a given , the sum of the probabilities is 1, so in fact only of them need to be defined. In the most general way possible, any model can be considered so long as it defines a probability distribution, i.e., for each , , and . Ordinal data further assume that the categories are ordered, i.e., there exists an order such that
We can think, for instance, of levels of pain (low moderate severe) or scores on a discrete scale, e.g., from 1 to 10. Instead of defining the probabilities of each category, it may be convenient to define the cumulative probabilities for , or in the other direction: for . Any model is possible as long as it defines a probability distribution, i.e., it satisfies
It is possible to introduce dependence between observations from the same individual by assuming that forms a Markov chain. For instance, a Markov chain with memory 1 assumes that all that is required from the past to determine the distribution of is the value of the previous observation ., i.e., for all ,
Ordered categorical data
 categorical1_project (data = ‘categorical1_data.txt’, model = ‘categorical1_model.txt’)
In this example, observations are ordinal data that take their values in {0, 1, 2, 3}:
 Cumulative odds ratio are used in this example to define the model
where
This model is implemented in categorical1_model.txt
:
[LONGITUDINAL] input = {th1, th2, th3} EQUATION: lgp0 = th1 lgp1 = lgp0 + th2 lgp2 = lgp1 + th3 DEFINITION: level = { type = categorical, categories = {0, 1, 2, 3}, logit(P(level<=0)) = th1 logit(P(level<=1)) = th1 + th2 logit(P(level<=2)) = th1 + th2 + th3 }
A normal distribution is used for , while lognormal distributions for and ensure that these parameters are positive (even without variability)
Residuals for noncontinuous data reduce to NPDE’s. We can compare the empirical distribution of the NPDE’s with the distribution of a standardized normal distribution:
VPC’s for categorical data compare the observed and predicted frequencies of each category over time:
The prediction distribution can also be computed by MonteCarlo:
Ordered categorical data with regression variables
 categorical2_project (data = ‘categorical2_data.txt’, model = ‘categorical2_model.txt’)
A proportional odds model is used in this example, where PERIOD and DOSE are used as regression variables (i.e. timevarying covariates)
Discretetime Markov chain
If observation times are regularly spaced (constant length of time between successive observations), we can consider the observations to be a discretetime Markov chain.
 markov0_project (data = ‘markov1a_data.txt’, model = ‘markov0_model.txt’)
In this project, states are assumed to be independent and identically distributed:
Observations in markov1a_data.txt
take their values in {1, 2}. Before fitting any model to this data, it is possible to display the empirical transitions between states by selecting Transition probabilities
in the list of selected plots.
We can see that the probability to be in state 1 or 2 clearly depends on the previous state. The proposed model should therefore be discarted.
 markov1a_project (data = ‘markov1a_data.txt’, model = ‘markov1a_model.txt’)
Here,
[LONGITUDINAL] input = {p11, p21} DEFINITION: State = {type = categorical, categories = {1,2}, dependence = Markov P(State=1State_p=1) = p11 P(State=1State_p=2) = p21 }
The distribution of the initial state is not defined in the model, which means that, by default,
 markov1b_project (data = ‘markov1b_data.txt’, model = ‘markov1b_model.txt’)
The distribution of the initial state, , is estimated in this example
DEFINITION: State = {type = categorical, categories = {1,2}, dependence = Markov P(State_1=1)= p P(State=1State_p=1) = p11 P(State=1State_p=2) = p21 }
 markov3a_project (data = ‘markov3a_data.txt’, model = ‘markov3a_model.txt’)
Transition probabilities change with time in this example:
We then define time varying transition probabilities in the model:
[LONGITUDINAL] input = {a1, b1, a2, b2} EQUATION: lp11 = a1 + b1*t/100 lp21 = a2 + b2*t/100 DEFINITION: State = { type = categorical, categories = {1,2}, dependence = Markov logit(P(State=1State_p=1)) = lp11 logit(P(State=1State_p=2)) = lp21 }
 markov2_project (data = ‘markov2_data.txt’, model = ‘markov2_model.txt’)
Observations in markov2_data.txt
take their values in {1, 2, 3}. Then, 6 transition probabilities need to be defined in the model.
Continuoustime Markov chain
The previous situation can be extended to the case where time intervals between observations are irregular by modeling the sequence of states as a continuoustime Markov process. The difference is that rather than transitioning to a new (possibly the same) state at each time step, the system remains in the current state for some random amount of time before transitioning. This process is now characterized by transition rates instead of transition probabilities:
The probability that no transition happens between and is
Furthermore, for any individual and time , the transition rates satisfy for any ,
Constructing a model therefore means defining parametric functions of time that satisfy this condition.
 markov1c_project (data = ‘markov1c_data.txt’, model = ‘markov1c_model.txt’)
Observation times are irregular in this example:
Then, a continuous time Markov chain should be used in order to take into account the Markovian dependence of the data:
DEFINITION: State = { type = categorical, categories = {1,2}, dependence = Markov transitionRate(1,2) = q12 transitionRate(2,1) = q21 }
 markov3b_project (data = ‘markov3b_data.txt’, model = ‘markov3b_model.txt’)
Time varying transition rates are used in this example.
2.4.Joint models for multivariate outcomes
2.4.1.Joint models for continuous outcomes
 Introduction
 Fitting first a PK model to the PK data
 Simultaneous PKPD modeling using the PK and PD libraries
 Simultaneous PKPD modeling using a user defined model
 Sequential PKPD modelling
 Fitting a PKPD model to the PD data only
 Case studies
Objectives: learn how to implement a joint model for continuous PKPD data.
Projects: warfarinPK_project, warfarinPKPDlibrary_project, warfarinPKPDelibrary_project, warfarin_PKPDimmediate_project, warfarin_PKPDeffect_project, warfarin_PKPDturnover_project, warfarin_PKPDseq1_project, warfarin_PKPDseq2_project, warfarinPD_project
Introduction
A “joint model” describes two or more types of observation that typically depend on each other. A PKPD model is a “joint model” because the PD depends on the PK. Here we demonstrate how several observations can be modeled simultaneously. We also discuss the special case of sequential PK and PD modelling, using either the population PK parameters or the individual PK parameters as an input for the PD model.
Fitting first a PK model to the PK data
 warfarinPK_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)
The column DV of the data file contains both the PK and the PD measurements: the keyword Y is used by Monolix
for this column. The column DVID is a flag defining the type of observation: DVID=1 for PK data and DVID=2 for PD data: the keyword YTYPE is then used for this column.
We will use the model oral1_1cpt_TlagkaVCl from the Monolix
PK library
[LONGITUDINAL] input = {Tlag, ka, V, Cl} PK: Cc = pkmodel(Tlag, ka, V, Cl) OUTPUT: output = Cc
Only the predicted concentration Cc is defined as an output of this model. Then, this prediction will be automatically associated to the outcome of type 1 (DVID=1) while the other observations (DVID=2) will be ignored.
Remark: any other ordered values could be used for YTYPE: the smallest one will always be associated to the first prediction defined in the model.
Simultaneous PKPD modeling using the PK and PD libraries
See project warfarinPKPDlibrary_project from demo folder 1.creating_and_using_models/libraries_of_models to see how to link the PK model to an immediate response model from the Monolix PD library for modelling simultaneously the PK and the PD warfarin data.
See project warfarinPKPDelibrary_project from demo folder 1.creating_and_using_models/libraries_of_models to see how to add an effect compartement by using the PKe and PDe Monolix
libraries.
Simultaneous PKPD modeling using a user defined model
 warfarin_PKPDimmediate_project (data = ‘warfarin_data.txt’, model = ‘immediateResponse_model.txt’)
Is is also possible for the user to write his own PKPD model. The same PK model used previously and an immediate response model are defined in the model file immediateResponse_model.txt
[LONGITUDINAL] input = {Tlag, ka, V, Cl, Imax, IC50, S0} EQUATION: Cc = pkmodel(Tlag, ka, V, Cl) E = S0 * (1  Imax*Cc/(Cc+IC50) ) OUTPUT: output = {Cc, E}
Two predictions are now defined in the model: Cc for the PK (DVID=1) and E for the PD (DVID=2).
 warfarin_PKPDeffect_project (data = ‘warfarin_data.txt’, model = ‘effectCompartment_model.txt’)
An effect compartment is defined in the model file effectCompartment_model.txt
[LONGITUDINAL] input = {Tlag, ka, V, Cl, ke0, Imax, IC50, S0} EQUATION: {Cc, Ce} = pkmodel(Tlag, ka, V, Cl, ke0) E = S0 * (1  Imax*Ce/(Ce+IC50) ) OUTPUT: output = {Cc, E}
Ce is the concentration in the effect compartment
 warfarin_PKPDturnover_project (data = ‘warfarin_data.txt’, model = ‘turnover1_model.txt’)
An indirect response (turnover) model is defined in the model file turnover1_model.txt
[LONGITUDINAL] input = {Tlag, ka, V, Cl, Imax, IC50, Rin, kout} EQUATION: Cc = pkmodel(Tlag, ka, V, Cl) E_0 = Rin/kout ddt_E = Rin*(1Imax*Cc/(Cc+IC50))  kout*E OUTPUT: output = {Cc, E}
Several implementations of the same pkmodel are possible. For instance, the predicted concentration Cc is defined using equations in turnover2_model.txt instead of using the pkmodel function. The residual error models for the PK and PD measurements are both defined in turnover3_model.txt. Any of these two models can be used instead of turnover1_model.txt by right clicking on the Model file
button and selecting Change model
.
Sequential PKPD modelling
In the sequential approach, a PK model is developed and parameters estimated in the first step. For a given PD model, different strategies are then possible for the second step, i.e., for estimating the population PD parameters:
Using estimated population PK parameters
 warfarin_PKPDseq1_project (data = ‘warfarin_data.txt’, model = ‘turnover1_model.txt’)
Population PK parameters are set to their estimated values but individual PK parameters are not assumed to be known and sampled from their conditional distributions at each SAEM iteration. In Monolix
, this simply means changing the status of the population PK parameter values so that they are no longer used as initial estimates for SAEM but considered fixed.
The joint PKPD model defined in turnover1_model.txt is again used with this project.
Using estimated individual PK parameters
 warfarin_PKPDseq2_project (data = ‘warfarinSeq_data.txt’, model = ‘turnoverSeq_model.txt’)
Individual PK parameters are set to their estimated values and used as constants in the PKPD model for the fitting the PD data. In this example, individual PK parameters were estimated as the modes of the conditional distributions . An additional column MDV is necessary in the datafile in order to ignore the PK data
The estimated individual PK parameters are defined as regression variables, using the reserved keyword X. The covariates used for defining the distribution of the individual PK parameters are ignored.
We use the same turnover model for the PD data. Here, the PK parameters are defined as regression variables (i.e. regressors).
[LONGITUDINAL] input = {Imax, IC50, Rin, kout, Tlag, ka, V, Cl} Tlag = {use = regressor} ka = {use = regressor} V = {use = regressor} Cl = {use = regressor} EQUATION: Cc = pkmodel(Tlag,ka,V,Cl) E_0 = Rin/kout ddt_E= Rin*(1Imax*Cc/(Cc+IC50))  kout*E OUTPUT: output = E
Fitting a PKPD model to the PD data only
 warfarinPD_project (data = ‘warfarinPD_data.txt’, model = ‘turnoverPD_model.txt’)
In this example, only PD data are available:
Nevertheless, a PKPD model – where only the effect is defined as a prediction – can be used for fitting this data:
[LONGITUDINAL] input = {Tlag, ka, V, Cl, Imax, IC50, Rin, kout} EQUATION: Cc = pkmodel(Tlag, ka, V, Cl) E_0 = Rin/kout ddt_E = Rin*(1Imax*Cc/(Cc+IC50))  kout*E OUTPUT: output = E
Case studies
 8.case_studies/PKVK_project (data = ‘PKVK_data.txt’, model = ‘PKVK_model.txt’)
 8.case_studies/hiv_project (data = ‘hiv_data.txt’, model = ‘hivLatent_model.txt’)
2.4.2.Joint models for non continuous outcomes
 Joint model for continuous PK and categorical PD data
 Joint model for continuous PK and count PD data
 Joint model for continuous PK and timetoevent data
Objectives: learn how to implement a joint model for continuous and non continuous data.
Projects: warfarin_cat_project, PKcount_project, PKrtte_project
Joint model for continuous PK and categorical PD data
 warfarin_cat_project (data = ‘warfarin_cat_data.txt’, model = ‘PKcategorical1_model.txt’)
In this example, the original continuous PD data has been recoded as 1 (Low), 2 (Medium) and 3 (High).
More details about the data
International Normalized Ratio (INR) values are commonly used in clinical practice to target optimal warfarin therapy. Low INR values (<2) are associated with high blood clot risk and high ones (>3) with high risk of bleeding, so the targeted value of INR, corresponding to optimal therapy, is between 2 and 3.
Prothrombin complex activity is inversely proportional to the INR. We can therefore associate the three ordered categories for the INR to three ordered categories for PCA: Low PCA values if PCA is less than 33% (corresponding to INR>3), medium if PCA is between 33% and 50% () and high if PCA is more than 50% (INR<2).
The column dv contains both the PK and the new categorized PD measurements.
Instead of modeling the original continuous PD data, we can model the probabilities of each of these categories, which have direct clinical interpretations. The model is still a joint PKPD model since this probability distribution is expected to depend on exposure, i.e., the plasmatic concentration predicted by the PK model. We introduce an effect compartment to mimic the effect delay. Let be the PCA level for patient i at time . We can then use a proportional odds model for modeling this categorical data:
where is the predicted concentration of warfarin in the effect compartment at time for patient with PK parameters . This model defines a probability distribution for if .
If , the probability of low PCA at time () increases along with the predicted concentration . The joint model is implemented in the model file PKcategorical1_model.txt
[LONGITUDINAL] input = {Tlag, ka, V, Cl, ke0, alpha, beta, gamma} EQUATION: {Cc,Ce}= pkmodel(Tlag,ka,V,Cl,ke0) lp1 = alpha + beta*Ce lp2 = lp1+ gamma ; gamma >= 0 DEFINITION: Level = { type=categorical categories={1,2,3} logit(P(Level<=1)) = lp1 logit(P(Level<=2)) = lp2 } OUTPUT: output = {Cc, Level}
In this example, the residual error model for the PK data is defined in the Monolix GUI. We can equivalently use the model file PKcategorical2_model.txt where the the residual error model is defined in the model file:
[LONGITUDINAL] input = {Tlag, ka, V, Cl, ke0, alpha, beta, gamma, a, b} EQUATION: {Cc,Ce}= pkmodel(Tlag,ka,V,Cl,ke0) lp1 = alpha + beta*Ce lp2 = lp1+ gamma ; gamma >= 0 DEFINITION: Concentration = { distribution=normal, prediction=Cc, errorModel=combined1(a,b) } Level = { type=categorical categories={1,2,3} logit(P(Level<=1)) = lp1 logit(P(Level<=2)) = lp2 } OUTPUT: output = {Concentration, Level}
To use this model with the Monolix project, right click on the button Model file
and select Change model
. See Categorical data model for more details about categorical data models.
Joint model for continuous PK and count PD data
 PKcount_project (data = ‘PKcount_data.txt’, model = ‘PKcount1_model.txt’)
The data file used for this project is PKcount_data.txt where the PK and the count PD data are simulated data.
We use a Poisson distribution for the count data, assuming that the Poisson parameter is function of the predicted concentration. For any individual i, we have
where is the predicted concentration for individual at time and
The joint model is implemented in the model file PKcount1_model.txt
[LONGITUDINAL] input = {ka, V, Cl, lambda0, IC50} EQUATION: Cc = pkmodel(ka,V,Cl) lambda=lambda0*(1  Cc/(IC50+Cc)) DEFINITION: Seizure = { type = count, log(P(Seizure=k)) = lambda + k*log(lambda)  factln(k) } OUTPUT: output={Cc,Seizure}
Model file PKcount1_model.txt can be replaced by PKcount2_model.txt where the residual error model for the PK data is defined. See Count data model for more details about count data models.
Joint model for continuous PK and timetoevent data
 PKrtte_project (data = ‘PKrtte_data.txt’, model = ‘PKrtteWeibull1_model.txt’)
The data file used for this project is PKrtte_data.txt where the PK and the timetoevent data are simulated data.
We use a Weibull model for the events count data, assuming that the baseline is function of the predicted concentration. For any individual i, we define the hazard function as
where is the predicted concentration for individual i at time t. The joint model is implemented in the model file PKrtteWeibull1_model.txt
[LONGITUDINAL] input = {ka, V, Cl, gamma, beta} EQUATION: Cc = pkmodel(ka, V, Cl) if t<0.1 aux=0 else aux = gamma*Cc*(t^(beta1)) end DEFINITION: Hemorrhaging = {type=event, hazard=aux} OUTPUT: output = {Cc, Hemorrhaging}
Model file PKrtteWeibull1_model.txt can be replaced by PKrtteWeibull2_model.txt where the residual error model for the PK data is defined. See Timetoevent data model for more details about timetoevent data models.
2.5.Models for the individual parameters
2.5.1.Model for the individual parameters: introduction
A model for observations depend on a vector of individual parameters ψ . As we want to work with a population approach, we now suppose that comes from some probability distribution .
In this section, we are interested in the implementation of individual parameter distributions . Generally speaking, we assume that individuals are independent. This means that in the following analysis, it is sufficient to take a closer look at the distribution of a unique individual i. The distribution plays a fundamental role since it describes the interindividual variability of the individual parameter .
In Monolix, we consider that some transformation of the individual parameters is normally distributed and is a linear function of the covariates:
This model gives a clear and easily interpreted decomposition of the variability of around , i.e., of around :
The component describes part of this variability by way of covariates that fluctuate around a typical value .
The random component describes the remaining variability, i.e., variability between subjects that have the same covariate values.
By definition, a mixed effects model combines these two components: fixed and random effects. In linear covariate models, these two effects combine additively. In the present context, the vector of population parameters to estimate is . Several extensions of this basic model are possible:
We can suppose for instance that the individual parameters of a given individual can fluctuate over time. Assuming that the parameter values remain constant over some periods of time called \emph{occasions}, the model needs to be able to describe the interoccasion variability (IOV) of the individual parameters.
If we assume that the population consists of several homogeneous subpopulations, a straightforward extension of mixed effects models is a finite mixture of mixed effects models, assuming for instance that the distribution is a mixture of distributions.
2.5.2.Probability distribution of the individual parameters
 Introduction
 Marginal distributions of the individual parameters
 Correlation structure of the random effects
Objectives: learn how to define the probability distribution and the correlation structure of the individual parameters.
Projects: warfarin_distribution1_project, warfarin_distribution2_project, warfarin_distribution3_project, warfarin_distribution4_project
Introduction
One way to extend the use of Gaussian distributions is to consider that some transformation of the parameters in which we are interested is Gaussian, i.e., assume the existence of a monotonic function such that is normally distributed. Then, there exists some and such that, for each individual :
where is the predicted value of . In this section, we consider models for the individual parameters without any covariate. Then, the predicted value of is the and
Transformation defines the distribution of . Some predifined distributions/transformations are available in Monolix
:
 Normal distribution:
The two mathematical representations for normal distributions are equivalent:
 Lognormal distribution:
A lognormally random variable takes positive values only. A lognormal distribution looks like a normal distribution for a small variance . On the other hand the asymmetry of the distribution increases when increases.
Remark: the two mathematical representations for lognormal distributions are equivalent:
 Powernormal (or BoxCox) distribution:
Here, (with ) follows a normal distribution truncated so that . It therefore takes positive values. This distribution converges to a lognormal one when and a truncated normal one when .
 Logitnormal distribution:
A random variable with a logitnormal distribution takes its values in . The logit of is normally distributed, i.e.,
 Probitnormal distribution:
A random variable with a probitnormal distribution also takes its values in (0,1).The probit function is the inverse cumulative distribution function (quantile function) associated with the standard normal distribution :
The probitnormal distribution with and is the uniform distribution on . Logit and probit transformations can be generalized to any interval (a,b) by setting where is a random variable that takes values in (0,1) with a logitnormal (or probitnormal) distribution.
Marginal distributions of the individual parameters
 warfarin_distribution1_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)
We use the warfarin PK example here. The four PK parameters Tlag, ka, V and Cl are lognormally distributed:
Letter L
is then used for these four lognormal distributions in the main Monolix
graphical user interface:
The distribution of the 4 PK parameters defined in the Monolix
GUI is automatically translated into Mlxtran in the project file:
[INDIVIDUAL] input = {Tlag_pop, omega_Tlag, ka_pop, omega_ka, V_pop, omega_V, Cl_pop, omega_Cl} DEFINITION: Tlag = {distribution=lognormal, typical=Tlag_pop, sd=omega_Tlag} ka = {distribution=lognormal, typical=ka_pop, sd=omega_ka} V = {distribution=lognormal, typical=V_pop, sd=omega_V} Cl = {distribution=lognormal, typical=Cl_pop, sd=omega_Cl}
Estimated parameters are the parameters of the 4 lognormal distributions and the parameters of the residual error model:
Here, and means that the estimated population distribution for the volume is: or, equivalently, where .
Remark: is not the population mean of the distribution of , but the median of this distribution.
The four probability distribution functions are displayed Figure Parameter distributions
:
Click on Settings and select the checkbox Informations to display a summary of these distributions:
Remark 1: is not the population mean of the distribution of , but the median of this distribution. The same property holds for the 3 other distributions which are not Gaussian.
Remark 2: here, standard deviations , , and are approximatively the coefficients of variation (CV) of Tlag, ka, V and Cl since these 4 parameters are lognormally distributed with variances < 1.
 warfarin_distribution2_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)
Other distributions for the PK parameters are used in this project:
 we use a uniform distribution on (0,3) for Tlag. This is equivalent to rescale a probitnormal distribution on (0,3)
and fix the population value to 1.5 (=(0 + 3)/2) and the standard deviation to 1:
 we use a logitnormal distribution rescaled on (0.25, 4.25) for ka:
 a normal distribution for
 a lognormal distribution for
Distributions of Tlag and ka are therefore defined as Custom
:
Estimated parameters are the parameters of the 4 transformed normal distributions and the parameters of the residual error model:
Here, and means that while and means that
The four probability distribution functions are displayed Figure Parameter distributions
:
Remark 1: population values , , and are the medians of the distributions of the 4 PK parameters.
Remark 2: here, standard deviations , and cannot be interpreted as coefficients of variation (CV) since Tlag, ka and V[ are not lognormally distributed.
Correlation structure of the random effects
 warfarin_distribution3_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)
Correlation between the random effects can be introduced in the model. A block structure is used in this project, assuming linear correlations between and and between and :
Estimated population parameters now include these 2 correlations:
Remark: corr_Tlag_ka is not the correlation between the individual parameters and but the (linear) correlation between the random effects and .
 warfarin_distribution4_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)
In this example, does not vary in the population, which means that for all , while the three other random effects are correlated:
Estimated population parameters now include the 3 correlations between , and :
2.5.3.Model for individual covariates
 Model with continuous covariates
 Model with categorical covariates
 Transforming categorical covariates
Objectives: learn how to implement a model for continuous and/or categorical covariates.
Projects: warfarin_covariate1_project, warfarin_covariate2_project, warfarin_covariate3_project, phenobarbital_project
See also: Complex parametercovariate relationships and timedependent covariates
Model with continuous covariates
 warfarin_covariate1_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)
The warfarin data contains 2 individual covariates: weight which is a continuous covariate and sex which is a categorical covariate with 2 categories (1=Male, 0=Female). We can ignore these columns if are sure not to use them, or declare them using respectively the reserved keywords COV
(for continuous covariate) and CAT
(for categorical covariate):
Even if these 2 covariates are now available, we can choose to define a model without any covariate:
Here, a 0 on the first row (weight) and the third column (V), for instance, means that there is no relationship between weight and volume in the model. A diagnostic plot Covariates
is generated which displays possible relationships between covariates and individual parameters (even if these covariates are not used in the model):
 warfarin_covariate2_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)
We decide to use the weight in this project in order to explain part of the variability of and . We will implement the following model for these two parameters:
which means that population parameters of the PK parameters are defined for a typical individual of the population with weight=70kg.
More details about the model
The model for and can be equivalently written as follows:
The individual predicted values for and are therefore
and the statistical model describes how and are distributed around these predicted values:
Here, and are linear functions of : we then need to transform first the original covariate into by clicking on the button Transform
of the main GUI. We can then transform and rename the original covariates of the dataset:
We then define a new covariate model, where and are linear functions of the transformed weight :
Coefficients and are now estimated with their s.e. and the pvalues of the Wald tests are derived to test if these coefficients are different from 0:
Model with categorical covariates
 warfarin_covariate3_project (data = ‘warfarin_data.txt’, model = ‘lib:oral1_1cpt_TlagkaVCl.txt’)
We use sex instead of weight in this project, assuming different population values of volume and clearance for males and females.
More precisely, we consider the following model for and :
where if individual is a female and 0 otherwise. Then, and are the population volume and clearance for males while and are the population volume and clearance for females. By clicking on the button Transform
, we can modify the name of the categories (click on the name of each category and enter a new name) and the reference category (Male is the reference category in this project):
Define then the covariate model in the main GUI:
We can furthermore assume different variances of the random effects for males and females by clicking on the button Cat. Var.
and selecting which variances depend on the categorical covariate:
Estimated population parameters, including the coefficients and and the 2 standard deviations and are displayed with the results:
Estimated population parameters are also computed and displayed per category:
We can display the probability distribution functions of the 4 PK parameters:
We can then distinguish the distributions for males and females by clicking on Settings and ticking the box By group:
Transforming categorical covariates
 phenobarbital_project (data = ‘phenobarbital_data.txt’, model = ‘lib:bolus_1cpt_Vk.txt’)
The phenobarbital data contains 2 covariates: the weight and the Apgar score which is considered as a categorical covariate:
Instead of using the 10 original levels of the Apgar score, we will transform this categorical covariate and create 3 categories: Low = {1,2,3}, Medium = {4, 5, 6, 7} and High={8,9,10}.
If we assume, for instance that the volume is related to the Apgar score, then and are estimated (assuming that Medium is the reference level).
2.5.4.Inter occasion variability (IOV)
 Introduction
 Cross over study
 Occasions with washout
 Occasions without washout
 Multiple levels of occasions
Objectives: learn how to take into account inter occasion variability (IOV).
Projects: iov1_project, iov2_project, iov3_project, iov4_project
Introduction
A simple model consists of splitting the study into K time periods or occasions and assuming that individual parameters can vary from occasion to occasion but remain constant within occasions. Then, we can try to explain part of the intraindividual variability of the individual parameters by piecewiseconstant covariates, i.e., occasiondependent or occasionvarying (varying from occasion to occasion and constant within an occasion) ones. The remaining part must then be described by random effects. We will need some additional notation for describing this new statistical model. Let
 be the vector of individual parameters of individual i for occasion k, where and .
 be the vector of covariates of individual i for occasion k. Some of these covariates remain constant (gender, group treatment, ethnicity, etc.) and others can vary (weight, treatment, etc.).
Let be the sequence of K individual parameters for individual i. We also need to define:
 , the vector of random effects which describes the random interindividual variability of the individual parameters,
 , the vector of random effects which describes the random intraindividual variability of the individual parameters in occasion k, for each .
Here and in the following, the superscript (0) is used to represent interindividual variability, i.e., variability at the individual level, while superscript (1) represents interoccasion variability, i.e., variability at the occasion level for each individual. The model now combines these two sequences of random effects:
Remark: Individuals do not need to share the same sequence of occasions: the number of occasions and the times defining the occasions can differ from one individual to another.
Cross over study
 iov1_project (data = ‘iov1_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)
In this example, PK data are collected for each patient during two independent treatment periods of time (each one starting at time 0). A column OCC is used to identify the period:
This column is defined using the reserved keyword OCC:
Here, treatment (column TREAT) changes with the occasion. Then, Monolix
automatically creates a categorical covariate sTREAT which is the sequence of treatments (AB or BA). This new categorical covariates doesn’t change with time and can be used at the individual level” with SEX. We first define the distribution of the individual parameters at the “
individual level” as usual:
By clicking on the button IOV
, we then define the distribution of the individual parameters at the occasion level”. In this example, we assume that the volume depends on the treatment we assume interoccasion variability for and :
The population parameters now include the standard deviations of the random effects for the 2 levels of variability (omega is used fo IIV and gamma for IOV):
By clicking on Stratify
in the menu bar of the graphics, you can use the occasion to split the plots
VPC per occasion:
Prediction distribution per plot:
Occasions with washout
 iov2_project (data = ‘iov2_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)
The time is increasing in this example, but the dynamical system (i.e. the compartments) is reset when the second period starts. Column EVID provides some information about events concerning dose administration. In particular, EVID=4 indicates that the system is reset (washout) when a new dose is administrated
This column is defined using the reserved keyword EVID:
Monolix
automatically proposes to define the treatment periods (between successive resetting) as statistical occasions and introduce IOV, as we did in the previous example. We can display the individual fit by splitting each occasion for each individual
Or by merging the different occasions in a unique plot for each individual:
Occasions without washout
 iov3_project (data = ‘iov3_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)
Multiple doses are administrated to each patient. We consider each period of time between successive doses as an statistical occasion. A column OCC is therefore necessary in the data file:
The model for IIV and IOV can then be defined as usual. The plot of individual fits allows us to check that the predicted concentration is now continuous over the different occasions for each individual:
Multiple levels of occasions
 iov4_project (data = ‘iov4_data.txt’, model = ‘lib:oral1_1cpt_kaVk.txt’)
We can easily extend such approach to multiple levels of variability. In this example, columns P1 and P2 define embedded occasions. The are both defined as occasions:
We then define a statistical model for each level of variability. We can switch to the next/previous level by clicking on the arrow button
2.5.5.Mixture of distributions
 Introduction
 Mixture of distributions – supervised learning
 Mixture of distributions – unsupervised learning
Objectives: learn how to implement a mixture of distributions for the individual parameters.
Projects: PKgroup_project, PKmixt_project
Introduction
Mixed effects models allow us to take into account betweensubject variability. One complicating factor arises when data is obtained from a population with some underlying heterogeneity. If we assume that the population consists of several homogeneous subpopulations, a straightforward extension of mixed effects models is a finite mixture of mixed effects models, assuming, for instance, that the probability distribution of some individual parameters vary from one subpopulation to another one. The introduction of a categorical covariate (e.g., sex, genotype, treatment, status, etc.) into such a model already supposes that the whole population can be decomposed into subpopulations. The covariate then serves as a \emph{label} for assigning each individual to a subpopulation.
In practice, the covariate can either be known or not. Mixture models usually refer to models for which the categorical covariate is unknown, but whichever the case, the model is the same. The difference appears when having to perform certain tasks and in the methods needed to implement them. For instance, model building is different depending on whether the labels are known or unknown: we have supervised learning if the labels are known and unsupervised otherwise. For the sake of simplicity, we will consider a basic model that involves individual parameters and observations . Then, the easiest way to model a finite mixture model is to introduce a label sequence that takes its values in such that if subject i belongs to subpopulation m.
In some situations, the label sequence is known and can be used as a categorical covariate in the model. If is unknown, it can modeled as a set of independent random variables taking its values in where for , is the probability that individual i belongs to group m. We will assume furthermore that the are identically distributed, i.e., does not depend on i for .
Mixture of distributions – supervised learning
 PKgroup_project (data = ‘PKmixt_data.txt’, model = ‘lib:oral1_1cpt_kaVCl.txt’)
The sequence of labels is known as GROUP in this project and is therefore defined as a categorical covariate:
We can then assume, for instance different population values for the volume in the 2 groups:
and estimate the population parameters using this covariate model
Then, this covariate GROUP can be used as a stratification variable.
Mixture of distributions – unsupervised learning
 PKmixt_project (data = ‘PKmixt_data.txt’, model = ‘lib:oral1_1cpt_kaVCl.txt’)
We will use the same data with this project but ignoring the column GROUP (which is equivalent to assume that the label is unknown)
If we suspect some heterogeneity in the population, we can introduce a “latent covariate” and select the number of categories for this latent covariate.
Remark: several latent covariates can be introduced in the model, with different number of categories.
We can then use this latent covariate lcat as any observed categorical covariate. We can assume again different population values for the volume in the 2 groups:
and estimate the population parameters using this covariate model. Proportions of each group are also estimated:
Once the population parameters are estimated, the sequence of latent covariates , i.e. the group to which belongs each subject, can be estimated together with the individual parameters, as the modes of the conditional distributions.
The sequence of estimated latent covariates can be used as a stratification variable:
We can for example display the VPC in the 2 groups:
By plotting the distribution of the individual parameters, we see that has a bimodal distribution
This distribution can be decomposed into 2 distributions by ticking the box By group:
2.6.Pharmacokinetic models
2.6.1.PK model: single route of administration
 Introduction
 Intravenous bolus injection
 Intravenous infusion
 Oral administration
 Using different parametrizations
Objectives: learn how to define and use a PK model for single route of administration.
Projects: bolusLinear_project, bolusMM_project, bolusMixed_project, infusion_project, oral1_project, oral0_project, sequentialOral0Oral1_project, simultaneousOral0Oral1_project, oralAlpha_project, oralTransitComp_project
Introduction
Once a drug is administered, we usually describe subsequent processes within the organism by the pharmacokinetics (PK) process known as ADME: absorption, distribution, metabolism, excretion. A PK model is a dynamical system mathematically represented by a system of ordinary differential equations (ODEs) which describes transfers between compartments and elimination from the central compartment.
See this web animation for more details.
Mlxtran
is remarkably efficient for implementing simple and complex PK models:
 The function
pkmodel
can be used for standard PK models. The model is defined according to the provided set of named arguments. Thepkmodel
function enables different parametrizations, different models of absorption, distribution and elimination.  PK macros define the different components of a compartmental model. Combining such PK components provide a high degree of flexibility for complex PK models. They can also extend a custom ODE system.
 A system of ordinary differential equations (ODEs) can be implemented very easily.
It is also important to highlight the fact that the data file use by Monolix
for PK modelling only contains information about dosing, i.e. how and when the drug is administrated. There is no need to integrate in the data file any information related to the PK model. This is an important remark since it means that any (complex) PK model can be used with the same data file. In particular, we make a clear distinction between administration (related to the data) and absorption (related to the model).
The pkmodel function
The PK model is defined by the names of the input parameters of the pkmodel
function. These names are reserved keywords.
Absorption
 p: Fraction of dose which is absorbed
 ka: absorption constant rate (first order absorption)
 or, Tk0: absorption duration (zero order absorption)
 Tlag: lag time before absorption
 or, Mtt, Ktr: mean transit time & transit rate constant
Distribution
 V: Volume of distribution of the central compartment
 k12, k21: Transfer rate constants between compartments 1 (central) & 2 (peripheral)
 or V2, Q2: Volume of compartment 2 (peripheral) & inter compartment clearance, between compartments 1 and 2,
 k13, k31: Transfer rate constants between compartments 1 (central) & 3 (peripheral)
 or V3, Q3: Volume of compartment 3 (peripheral) & inter compartment clearance, between compartments 1 and 3.
Elimination
 k: Elimination rate constant
 or Cl: Clearance
 Vm, Km: Michaelis Menten elimination parameters
Effect compartment
 ke0: Effect compartment transfer rate constant
Intravenous bolus injection
Linear elimination
 bolusLinear_project
A single iv bolus is administered at time 0 to each patient. The data file bolus1_data.txt contains 4 columns: id, time, amt (the amount of drug in mg) and y (the measured concentration):
The names of these columns are recognized as keywords by Monolix
:
It is important to remark that, in this data file, a row contains either some information about the dose (in which case y = ".") or a measurement (in which case amt = "."). We could equivalently use the data file bolus2_data.txt which contains 2 additional columns: EVID and MDV (these two names are also recognized by Monolix
):
Here, EVID=1 means that this record describes a dose while EVID=0 means that this record contains an observed value. On the other hand, MDV=1 means that the observed value of this record should be ignored while MDV=0 means that this record contains an observed value. The two data files bolus1_data.txt and bolus2_data.txt contain exactly the same information and provide exactly the same results. A one compartment model with linear elimination is used with this project:
Here, and are, respectively, the amount and the concentration of drug in the central compartment at time . When a dose arrives in the central compartment at time , an iv bolus administration assumes that
where (resp. ) is the amount of drug in the central compartment just before (resp. after) . Parameters of this model are and . We therefore use the model bolus_1cpt_Vk from the Monolix
PK library:
[LONGITUDINAL] input = {V, k} EQUATION: Cc = pkmodel(V, k) OUTPUT: output = Cc
We could equivalently use the model bolusLinearMacro.txt (click on the button Model
and select the new PK model in the library 6.PK_models/model)
[LONGITUDINAL] input = {V, k} PK: compartment(cmt=1, amount=Ac) iv(cmt=1) elimination(cmt=1, k) Cc = Ac/V
These two implementations generate exactly the same C++ code and then provide exactly the same results. Here, the ODE system is linear and Monolix
uses its analytical solution. Of course, it is also possible (but not recommended with this model) to use the ODE based PK model bolusLinearODE.txt :
[LONGITUDINAL] input = {V, k} PK: depot(target = Ac) EQUATION: ddt_Ac =  k*Ac Cc = Ac/V
Results obtained with this model are slightly different from the ones obtained with the previous implementations since a numeric scheme is used here for solving the ODE. Individual fits obtained with model look nice
but the VPC show some misspecification in the elimination process:
Michaelis Menten elimination
 bolusMM_project
A non linear elimination is used with this project:
This model is available in the Monolix
PK library as bolus_1cpt_VVmKm:
[LONGITUDINAL] input = {V, Vm, Km} PK: Cc = pkmodel(V, Vm, Km) OUTPUT: output = Cc
Instead of this model, we could equivalently use PK macros with bolusNonLinearMacro.txt from the library 6.PK_models/model:
[LONGITUDINAL] input = {V, Vm, Km} PK: compartment(cmt=1, amount=Ac, volume=V) iv(cmt=1) elimination(cmt=1, Vm, Km) Cc = Ac/V OUTPUT: output = Cc
or an ODE with bolusNonLinearODE:
[LONGITUDINAL] input = {V, Vm, Km} PK: depot(target = Ac) EQUATION: ddt_Ac = Vm*Ac/(V*Km+Ac) Cc=Ac/V OUTPUT: output = Cc
Results obtained with these three implementations are identical since no analytical solution is available for this non linear ODE. We can then check that this PK model seems to describe much better the elimination process of the data:
Mixed elimination
 bolusMixed_project
THe Monolix
PK library contains standard” PK models. More complex models should be implemented by the user in a model file. For instance, we assume in this project that the elimination process is a combination of linear and nonlinear elimination processes:
This model is not available in the Monolix
PK library. It is implemented in bolusMixed.txt:
[LONGITUDINAL] input = {V, k, Vm, Km} PK: depot(target = Ac) EQUATION: ddt_Ac = Vm*Ac/(V*Km+Ac)  k*Ac Cc=Ac/V OUTPUT: output = Cc
This model seems to describe very well the data:
Intravenous infusion
 infusion_project
Intravenous infusion assumes that the drug is administrated intravenously with a constant rate (infusion rate), during a given time (infusion time). Since the amount is the product of infusion rate and infusion time, an additional column rate or time is required in the data file: Monolix
can use both indifferently. Data file infusion_rate_data.txt has an additional column rate:
It can be replaced by infusion_tinf_data.txt which contains exactly the same information:
We use with this project a 2 compartment model with non linear elimination and parameters , , , , :
This model is available in the Monolix
PK library as infusion_2cpt_V1QV2VmKm:
[LONGITUDINAL] input = {V1, Q, V2, Vm, Km} PK: V = V1 k12 = Q/V1 k21 = Q/V2 Cc = pkmodel(V, k12, k21, Vm, Km) OUTPUT: output = Cc
Oral administration
firstorder absorption
 oral1_project
This project uses the data file oral_data.txt. For each patient, information about dosing is the time of administration and the amount:
A one compartment model with first order absorption and linear elimination is used with this project. Parameters of the model are , and . we will then use model oral1_kaVCl.txt from the Monolix
PK library
[LONGITUDINAL] input = {ka, V, Cl} EQUATION: Cc = pkmodel(ka, V, Cl) OUTPUT: output = Cc
Both the individual fits and the VPCs show that this model doesn’t describe properly the absorption process:
There exist many options for implementing this PK model with Mlxtran
:
– using PK macros: oralMacro.txt:
[LONGITUDINAL] input = {ka, V, Cl} PK: compartment(cmt=1, amount=Ac) oral(cmt=1, ka) elimination(cmt=1, k=Cl/V) Cc=Ac/V OUTPUT: output = Cc
– using a system of two ODEs as in oralODEb.txt:
[LONGITUDINAL] input = {ka, V, Cl} PK: depot(target=Ad) EQUATION: k = Cl/V ddt_Ad = ka*Ad ddt_Ac = ka*Ad  k*Ac Cc = Ac/V
– combining PK macros and ODE as in oralMacroODE.txt (macros are used for the absorption and ODE for the elimination):
[LONGITUDINAL] input = {ka, V, Cl} PK: compartment(cmt=1, amount=Ac) oral(cmt=1, ka) EQUATION: k = Cl/V ddt_Ac =  k*Ac Cc = Ac/V
– or equivalently, as in oralODEa.txt:
[LONGITUDINAL] input = {ka, V, Cl} PK: depot(target=Ac, ka) EQUATION: k = Cl/V ddt_Ac =  k*Ac Cc = Ac/V
Remark: Models using the pkmodel function or PK macros only use an analytical solution of the ODE system.
zeroorder absorption
 oral0_project
A one compartment model with zero order absorption and linear elimination is used to fit the same PK data with this project. Parameters of the model are Tk0, V and Cl. We will then use model oral1_Tk0VCl.txt from the Monolix
PK library
[LONGITUDINAL] input = {Tk0, V, Cl} EQUATION: Cc = pkmodel(Tk0, V, Cl) OUTPUT: output = Cc
Both the individual fits and the VPCs show that a zero absorption process better fit the data than a first order one, but the absorption process is not properly described yet…
Remark 1: implementing a zeroorder absorption process using ODEs is not easy… on the other hand, it becomes extremely easy to implement using either the pkmodel function or the PK macro oral(Tk0).
Remark 2: The duration of a zeroorder absorption has nothing to do with an infusion time: it is a parameter of the PK model (exactly as the absorption rate constant ka for instance), it is not part of the data.
sequential zeroorder firstorder absorption
 sequentialOral0Oral1_project
More complex PK models can be implemented using Mlxtran
. A sequential zeroorder firstorder absorption process assumes that a fraction Fr of the dose is first absorbed during a time Tk0 with a zeroorder process, then, the remaining fraction is absorbed with a firstorder process. This model is implemented in sequentialOral0Oral1.txt using PK macros:
[LONGITUDINAL] input = {Fr, Tk0, ka, V, Cl} PK: compartment(amount=Ac) absorption(Tk0, p=Fr) absorption(ka , Tlag=Tk0 , p=1Fr) elimination(k=Cl/V) Cc=Ac/V
Both the individual fits and the VPCs show that this PK model perfectly describes the whole ADME process for the same PK data:
simultaneous zeroorder firstorder absorption
 simultaneousOral0Oral1_project
A simultaneous zeroorder firstorder absorption process assumes that a fraction of the dose is absorbed with a zeroorder process while the remaining fraction is absorbed simultaneously with a firstorder process. This model is implemented in simultaneousOral0Oral1.txt using PK macros:
[LONGITUDINAL] input = {Fr, Tk0, ka, V, Cl} PK: compartment(amount=Ac) absorption(Tk0, p=Fr) absorption(ka, p=1Fr) elimination(k=Cl/V) Cc=Ac/V
alphaorder absorption
 oralAlpha_project
An order absorption process assumes that the rate of absorption is proportional to some power of the amount of drug in the depot compartment:
This model is implemented in oralAlpha.txt using ODEs:
[LONGITUDINAL] input = {r, alpha, V, Cl} PK: depot(target = Ad) EQUATION: dAd = Ad^alpha ddt_Ad = r*dAd ddt_Ac = r*Ad  (Cl/V)*Ac Cc = Ac/V
transit compartment model
 oralTransitComp_project
A PK model with transit compartment of transit rate and mean transit time can be implemented using the PK macro oral(ka, Mtt, Ktr), or using the pkmodel function, as in oralTransitComp.txt:
[LONGITUDINAL] input = {Mtt, Ktr, ka, V, Cl} EQUATION: Cc = pkmodel(Mtt, Ktr, ka, V, Cl)
Using different parametrizations
The PK macros and the function pkmodel use some preferred parametrizations and some reserved names as input arguments: Tlag, ka, Tk0, V, Cl, k12, k21… It is however possible to use another parametrization and/or other parameter names. As an example, consider a 2 compartments model for oral administration with a lag, a first order absorption and a linear elimination. We can use the pkmodel function with, for instance, parameters ka, V, k, and :
[LONGITUDINAL] input = {ka, V, k, k12, k21} PK: Cc = pkmodel(ka, V, k, k12, k21)
Imagine now that we want i) to use the clearance instead of the elimination rate constant , ii) to use capital letters for the parameter names. We can still use the pkmodel function as follows:
[LONGITUDINAL] input = {KA, V, CL, K12, K21} PK: Cc = pkmodel(ka=KA, V, k=CL/V, k12=K12, k21=K21)
2.6.2.PK model: multiple routes of administration
Objectives: learn how to define and use a PK model for multiple routes of administration..
Projects: ivOral1_project, ivOral2_project
Combining iv and oral administrations – Example 1
 ivOral1_project (data = ‘ivOral1_data.txt’ , model = ‘ivOral1Macro_model.txt’)
In this example, we combine oral and iv administrations of the same drug. The data file ivOral1_data.txt contains an additional column ADM which indicates the route of administration (1=oral, 2=iv):
We assume here a one compartment model with firstorder absorption process from the depot compartment (oral administration) and a linear elimination process from the central compartment. We further assume that only a fraction F (bioavailability) of the drug orally administered is absorbed. This model is implemented in ivOral1Macro_model.txt
using PK macros:
[LONGITUDINAL] input = {F, ka, V, k} PK: compartment(cmt=1, amount=Ac) oral(type=1, cmt=1, ka, p=F) iv(type=2, cmt=1) elimination(cmt=1, k) Cc = Ac/V
A logitnormal distribution is used for bioavability F that takes it values in (0,1):
The model properly fits the data:
Remark: the same PK model could be implemented using ODEs instead of PK macros.
Let and be, respectively, the amounts in the depot compartment (gut) and the central compartment (bloodtsream). Kinetics of and are described by the following system of ODEs
The target compartment is the depot compartment () for oral administrations and the central compartment () for iv administrations. This model is implemented in ivOral1ODE_model.txt
using a system of ODEs:
[LONGITUDINAL] input = {F, ka, V, k} PK: depot(type=1, target=Ad, p=F) depot(type=2, target=Ac) EQUATION: ddt_Ad = ka*Ad ddt_Ac = ka*Ad  k*Ac Cc = Ac/V
Solving this ODEs system is less efficient than using the PK macros which uses the analytical solution of the linear system.
Combining iv and oral administrations – Example 2
 ivOral2_project (data = ‘ivOral2_data.txt’ , model = ‘ivOral2Macro_model.txt’)
In this example (based on simulated PK data), we combine intraveinous injection with 3 different types of oral administrations of the same drug. The datafile ivOral2_data.txt contains column ADM which indicates the route of administration (1,2,3=oral, 4=iv):
We assume that one type of oral dose (adm=1) is absorbed into a latent compartment following a zeroorder absorption process. the 2 oral doses (adm=2,3) are absorbed into the central compartment following firstorder absorption processes with different rates. Bioavailabilities are supposed to be different for the 3 oral doses. There is linear transfer from the latent to the central compartment. A peripheral compartment is linked to the central compartment. The drug is eliminated by a linear process from the central compartment:
This model is implemented in ivOral2Macro_model.txt
using PK macros:
[LONGITUDINAL] input = {F1, F2, F3, Tk01, ka2, ka3, kl, k23, k32, V, Cl} PK: compartment(cmt=1, amount=Al) compartment(cmt=2, amount=Ac) peripheral(k23,k32) oral(type=1, cmt=1, Tk0=Tk01, p=F1) oral(type=2, cmt=2, ka=ka2, p=F2) oral(type=3, cmt=2, ka=ka3, p=F3) iv(type=4, cmt=2) transfer(from=1, to=2, kt=kl) elimination(cmt=2, k=Cl/V) Cc = Ac/V OUTPUT: output = Cc
Here, logitnormal distributions are used for bioavabilities , and . The model properly fits the data:
Remark: the number and type of doses vary from one patient to another one in this example.
2.6.3.From multiple doses to steadystate
Objectives: learn how to define and use a PK model with multiple doses or assuming steadystate.
Projects: multidose_project, addl_project, ss1_project, ss2_project, ss3_project
Multiple doses
 multidose_project (data = ‘multidose_data.txt’ , model = ‘lib:bolus_1cpt_Vk.txt’)
In this project, each patient receives several iv bolus injections. Each dose is represented by a row in the data file multidose_data.txt:
The PK model and the statistical model used in this project properly fit the observed data of each individual. Even if there is no observations between 12h and 72h, predicted concentrations computed on this time interval exhibit the multiple doses received by each patient:
VPCs, which is a diagnostic tool, are based on the design of the observations and therefore ignore” what may happen between 12h and 72h:
On the other hand, the prediction distribution, which is not a diagnostic tool, computes the distribution of the predicted concentration at any time point:
Additional doses (ADDL)
 addl_project (data = ‘addl_data.txt’ , model = ‘lib:bolus_1cpt_Vk.txt’)
We can remark in the previous project, that, for each patient, the interval time between two successive doses is the same (12 hours for each patient) and the amount of drug which is administrated is always the same as well (40mg for each patient). We can take advantage of this design in order to simplify the data file by defining, for each patient, a unique amount (AMT), the number of additional doses which are administrated after the first one (ADDL) and the time interval between successive doses (II):
The keywords ADDL and II are automatically recognized by Monolix
:
Remarks:
 Results obtained with this project, i.e. with this data file, are identical to the ones obtained with the previous project.
 It is possible to combine single doses (using ADDL=0) and repeated doses in a same data file.
Steadystate
 ss1_project (data = ‘ss1_data.txt’ , model = ‘lib:oral0_1cpt_Tk0VCl.txt’)
The dose orally administrated at time 0 to each patient is assumed to be a *steadystate dose* which means that a large” number of doses before time 0 have been administrated, with a constant amount and a constant interval dosing, such that steadystate, i.e. equilibrium, is reached at time 0. The data file ss1_data contains a column SS which indicates if the dose is a staedystate dose or not and a column II for the interdose interval:
The keywords SS and II are automatically recognized by Monolix
:
Click on Check the initial fixed effects
and set the minimum value of the Xaxis (min X) to display the predicted concentration between the last dose administrated at time 0:
We see on this plot that Monolix adds 4 doses before the last dose to reach steadystate. Individual fits display the predicted concentrations computed with these additional doses:
 ss2_project (data = ‘ss2_data.txt’ , model = ‘lib:oral0_1cpt_Tk0VCl.txt’)
Steadystate and non steadysates doses are combined in this project:
Individual fits display the predicted concentrations computed with this combination of doses:
 ss3_project (data = ‘ss3_data.txt’ , model = ‘lib:oral0_1cpt_Tk0VCl.txt’)
Two different steadystate doses are combined in this project:
Monolix
automatically creates occasions between successive steadystate doses in such situation. It is therefore possible to introduce interoccasion variability (IOV) in the model:
2.7.Extensions
2.7.1.Using regression variables
Objectives: learn how to define and use regression variables (time varying covariates).
Projects: reg1_project, reg2_project
Introduction
A regression variable is a variable which is a given function of time, which is not defined in the model but which is used in the model. is only defined at some time points (possibly different from the observation time points), but is a function of time that should be defined for any (if is used in an ODE for instance, or if a prediction is computed on a fine grid). Then, Mlxtran defines the function by intepolating the given values . In the current version of Mlxtran, interpolation is performed by using the last given value:
Continuous regression variables
 reg1_project (data = reg1_data.txt , model=reg1_model.txt)
We consider a basic PD model in this example, where some concentration values are used as a regression variable:
[LONGITUDINAL] input = {Emax, EC50, Cc} Cc = {use=regressor} EQUATION: E = Emax*Cc/(EC50 + Cc)
The predicted effect is therefore piecewise constant: it changes at the time points where concentration values are provided:
Categorical regression variables
 reg2_project (data = reg2_data.txt , model=reg2_model.txt)
The variable takes its values in {1, 2} in this example and represents the state of individual at time . We then assume that the observed data has a Poisson distribution with parameter if and parameter if . is known in this example: it is then defined as a regression variable in the model:
[LONGITUDINAL] input = {lambda1, lambda2, z} z = {use=regressor} EQUATION: if z==0 lambda=lambda1 else lambda=lambda2 end DEFINITION: y = {type=count, log(P(y=k)) = lambda + k*log(lambda)  factln(k) }
2.7.2.Bayesian estimation
 Introduction
 Computing the Maximum a posteriori (MAP) estimate
 Estimating the posterior distribution
 Fixing the value of a parameter
 Bayesian estimation of several population parameters
Objectives: learn how to combine maximum likelihood estimation and Bayesian estimation of the population parameters.
Projects: theobayes1_project, theobayes2_project, theobayes3_project
Introduction
The Bayesian approach considers the vector of population parameters as a random vector with a prior distribution . We can then define the *posterior distribution* of :
We can estimate this conditional distribution and derive statistics (posterior mean, standard deviation, quantiles, etc.) and the socalled maximum a posteriori (MAP) estimate of :
The MAP estimate maximizes a penalized version of the observed likelihood. In other words, MAP estimation is the same as penalized maximum likelihood estimation. Suppose for instance that is a scalar parameter and the prior is a normal distribution with mean and variance . Then, the MAP estimate is the solution of the following minimization problem:
This is a tradeoff between the MLE which minimizes the deviance, , and which minimizes . The weight given to the prior directly depends on the variance of the prior distribution: the smaller is, the closer to the MAP is. In the limiting case, ; this means that is fixed at and no longer needs to be estimated. Both the Bayesian and frequentist approaches have their supporters and detractors. But rather than being dogmatic and following the same rulebook every time, we need to be pragmatic and ask the right methodological questions when confronted with a new problem.
All things considered, the problem comes down to knowing whether the data contains sufficient information to answer a given question, and whether some other information may be available to help answer it. This is the essence of the art of modeling: find the right compromise between the confidence we have in the data and our prior knowledge of the problem. Each problem is different and requires a specific approach. For instance, if all the patients in a clinical trial have essentially the same weight, it is pointless to estimate a relationship between weight and the model’s PK parameters using the trial data. A modeler would be better served trying to use prior information based on physiological knowledge rather than just some statistical criterion.
Generally speaking, if prior information is available it should be used, on the condition of course that it is relevant. For continuous data for example, what does putting a prior on the residual error model’s parameters mean in reality? A reasoned statistical approach consists of including prior information only for certain parameters (those for which we have real prior information) and having confidence in the data for the others. Monolix allows this hybrid approach which reconciles the Bayesian and frequentist approaches. A given parameter can be
 a fixed constant if we have absolute confidence in its value or the data does not allow it to be estimated, essentially due to lack of identifiability.
 estimated by maximum likelihood, either because we have great confidence in the data or no information on the parameter.
 estimated by introducing a prior and calculating the MAP estimate or estimating the posterior distribution.
Computing the Maximum a posteriori (MAP) estimate
 theobayes1_project (data = ‘theophylline_amt_data.txt’ , model = ‘lib:oral1_1cpt_kaVCl.txt’)
We want to introduce a prior distribution for in this example. Right click on the initial value for and select Prior
We propose to use a lognormal distribution with mean and standard deviation for and to compute the MAP estimate for .
The MAP estimate of is a penalized maximum likelihood estimate:
Estimating the posterior distribution
 theobayes2_project (data = ‘theophylline_amt_data.txt’ , model = ‘lib:oral1_1cpt_kaVCl.txt’)
Using the same prior distribution, we can choose to estimate the posterior distribution of instead of its mode only:
Estimation of the individual parameters using the conditional mean and sd is checked in this example. Then, the MCMC algorithm used for estimating the conditional distributions of the individual parameters is also used for estimating the posterior distribution of :
Once the (population and individual) parameters have been estimated, the posterior and prior distributions of is displayed with the graphics
Fixing the value of a parameter
 theobayes3_project (data = ‘theophylline_amt_data.txt’ , model = ‘lib:oral1_1cpt_kaVCl.txt’)
We can combine different strategies for the population parameters: Bayesian estimation for , maximum likelihood estimation for and fixed value for , for instance.
Remark: is not estimated (it’s s.e. is not computed) but the standard deviation is estimated as usual in this example:
Bayesian estimation of several population parameters
 theobayes4_project (data = ‘theophylline_amt_data.txt’ , model = ‘lib:oral1_1cpt_kaVCl.txt’)
Prior distributions can be used for several parameters:
Then, the MCMC algorithm is used for estimating these 3 posterior distributions:
2.7.3.Delayed differential equations
 Ordinary differential equations based model
 Don’t forget the initial conditions!
 Delayed differential equations based model
 Case studies
Objectives: learn how to implement a model with ordinary differential equations (ODE) and delayed differential equations (DDE).
Projects: tgi_project, seir_project, seir_noDelay_project
Ordinary differential equations based model
 tgi_project (data = tgi_data.txt , model = tgi_model.txt)
We consider here the tumor growth inhibition (TGI) model proposed by Ribba *et al.* (Ribba, B., Kaloshi, G., Peyre, M., Ricard, D., Calvez, V., Tod, M., . & Ducray, F., *A tumor growth inhibition model for lowgrade glioma treated with chemotherapy or radiotherapy*. Clinical Cancer Research, 18(18), 50715080, 2012.). This model is defined by a set of ordinary differential equations
where is the total tumor size. This set of ODEs is valid for , while
for .This model (derivatives and initial conditions) can easily be implemented with Mlxtran:
[LONGITUDINAL] input = {K, KDE, KPQ, KQPP, LAMBDAP, GAMMA, DELTAQP, PT0, Q0} PK: depot(target=C) EQUATION: t0=0 PT_0=PT0 Q_0 = Q0 PSTAR = PT+Q+QP ddt_C = KDE*C ddt_PT = LAMBDAP*PT*(1PSTAR/K) + KQPP*QP  KPQ*PT  GAMMA*KDE*PT*C ddt_Q = KPQ*PT  GAMMA*KDE*Q*C ddt_QP = GAMMA*KDE*Q*C  KQPP*QP  DELTAQP*QP
Remark: t0, PT_0 and Q_0 are reserved keywords that define the initial conditions.
Then, the graphic of individual fits clearly shows that the tumor size is constant until and starts changing according to the model at .
Don’t forget the initial conditions!
 tgiNoT0_project (data = tgi_data.txt , model = tgi1NoT0_model.txt)
The initial time is not specified in this example
[LONGITUDINAL] input = {K, KDE, KPQ, KQPP, LAMBDAP, GAMMA, DELTAQP, PT0, Q0} PK: depot(target=C) EQUATION: PT_0=PT0 Q_0 = Q0 PSTAR = PT+Q+QP ddt_C = KDE*C ddt_PT = LAMBDAP*PT*(1PSTAR/K) + KQPP*QP  KPQ*PT  GAMMA*KDE*PT*C ddt_Q = KPQ*PT  GAMMA*KDE*Q*C ddt_QP = GAMMA*KDE*Q*C  KQPP*QP  DELTAQP*QP
Since is missing, Monolix uses the first time value encountered for each individual. If, for instance, the tumor size is computed for for the individual fits, then will be used for defining the initial conditions, which introduces a shift in the plot:
Conclusion: don’t forget to specify properly the initial conditions of a system of ODEs!
Delayed differential equations based model
A system of delay differential equations (DDEs) can be implemented in a block EQUATION
of the section [LONGITUDINAL]
of a script Mlxtran. Mlxtran provides the command delay(x,T) where x is a onedimensional component and T is the explicit delay. Therefore, DDEs with a nonconstant past of the form
can be solved.
 seir_project (data = seir_data.txt , model = seir_model.txt)
The model is a system of 4 DDEs:
[LONGITUDINAL] input = {A, lambda, gamma, epsilon, d, tau, omega} EQUATION: t0 = 0 S_0 = 15 E_0 = 0 I_0 = 2 R_0 = 3 N = S + E + I + R ddt_S = A  d*S  lambda*S*I/N + gamma*delay(I,tau)*exp(d*tau) ddt_E = lambda*S*I/N  d*E  lambda*delay(S,omega)*delay(I,omega)*exp(d*omega)/(delay(I,omega) + delay(S,omega) + delay(E,omega) + delay(R,omega)) ddt_I = (gamma+epsilon+d)*I + lambda*delay(S,omega)*delay(I,omega)*exp(d*omega)/(delay(I,omega) + delay(S,omega) + delay(E,omega) + delay(R,omega)) ddt_R = gamma*I  d*R  gamma*delay(I,tau)*exp(d*tau)
Introducing these delays allows to obtain nice fits for the 4 outcomes, including :
 seir_noDelay_project (data = seir_data.txt , model = seir_model.txt)
On the other hand, if ODEs are used, i.e. without any delay,
then, the model is not able anymore to fit correctly the observed data, such as :
Case studies
 8.case_studies/arthritis_project
3.Tasks
Monolix includes several estimation algorithms:
 estimation of the population parameters,
 the Fisher information matrix and standard errors,
 the individual parameters,
 and loglikelihood.
Also, different types of results are available in the form of graphics and tables. The tasks can run individually, or you can define a workflow (see here). Moreover, Monolix includes a convergence assessment tool. It is possible to execute a workflow as defined above but for different, randomly generated, initial values of fixed effects.
3.1.Population parameter estimation using SAEM
Purpose
Maximum likelihood estimation of the population parameters is performed with the button. The sequence of estimated parameters () is displayed in a figure which is automatically saved as saem.png in the results folder at the end of the estimation. Also, the final estimations are displayed in the command window.
This algorithm produces the file pop_parameters.txt in the result folder, with all the estimated parameters: population parameters, variances (or standard deviations), correlations, error model parameters, etc. The estimation of the population parameters by groups defined by the continuous and categorical covariates used in the model is also given.
For example, in the theophylline_project, where one has a continuous covariate based on a transformation, the population parameters estimation file writes
****************************************************************** * theophylline_project.mlxtran * November 23, 2015 at 13:09:49 * Monolix version: 4.4.0 ****************************************************************** Estimation of the population parameters parameter ka_pop : 1.57 V_pop : 0.454 beta_V_t_WEIGHT : 0.471 Cl_pop : 0.0399 omega_ka : 0.641 omega_V : 0.113 omega_Cl : 0.272 a : 0.735 Numerical covariates t_WEIGHT = log(WEIGHT/70)
and the following figure is shown.
In another example, in the iov1_project, where we have both categorical covariates (SEX, TREAT, OCC, sTREAT) and IOV (the estimated parameters are called gamma), the population parameters estimation file writes
****************************************************************** * iov1_project.mlxtran * November 24, 2015 at 08:44:55 * Monolix version: 4.4.0 ****************************************************************** Estimation of the population parameters parameter ka_pop : 1.42 beta_ka_sTREATBA : 0.153 V_pop : 0.469 beta_V_TREATB : 0.208 beta_V_sTREATBA : 0.0415 beta_V_SEXM : 0.0287 beta_V_OCC2 : 0.0261 Cl_pop : 0.0371 beta_Cl_SEXM : 0.0412 beta_Cl_OCC2 : 0.0188 omega_ka : 0.26 omega_V : 0.0957 omega_Cl : 0.124 gamma_ka : 0 gamma_V : 0.0275 gamma_Cl : 0.187 a : 0.108 b : 0.107
Main population parameter settings
Seed: The default value of the seed used for the random numbers generator is 123456. This seed can be randomly generated with the “generate” button. It is accessible in Menu/Settings/Seed.
Number of iterations: There are two stages in the population parameters estimation algorithm. Use K1 and K2 to define the number of iterations for each one. If auto is selected, then the algorithm will determine automatically if it can jump from one stage to the other or if it can finish the estimation process. In this case, the numbers and will represent the maximum number of iterations for each stage. Here, these numbers will not exceed 500 and 200. By default, the automatic determination of is proposed. It is accessible in Menu/Settings/Pop. parameters.
Number of chains: When the data set is small, just a few of subjects for instance, replicating the data set can improve the precision of the estimation. The number of (Markov) chains specifies how many times the data set need to be replicated. You can specify this number yourself, or you can let Monolix to compute how many replicates are needed to ensure a minimum number of subjects. In this case, you should specify this minimum size and any time you change the data set, the number of chains will be updated according to the number of subjects present in the new data set. For instance, in the theophylline example there are only 12 subjects in the data set. Setting to 50 the minimum number of subjects, Monolix choose to use 5 Markov chains. It is accessible in Menu/Settings/Pop. parameters.
Simulated annealing: A Simulated Annealing version of SAEM is used to estimate the population parameters (the variances are constrained to decrease slowly during the first iterations of SAEM). It is accessible in Menu/Settings/Pop. parameters.
Very advance population parameters settings
In this paragraph, we define all the settings used for the population parameter estimation. However, we strongly encourage the user to get help from Lixoft’s support when considering using these parameters.
SAEM algorithm: Additionally, is the number of burning iterations. The sequence of step sizes decreases as for j equals 1 or 1. By default, and , then during the first iterations is constant and is equal to 1 and during the next iterations. , and define the stopping rules when auto is checked. It is accessible in Menu/Settings/Pop. parameters and by clicking on “Advanced settings” .
MCMC: The default number of Markov chains is L = 1. , , and are the numbers of transitions of the four different kernels used in the MCMC algorithm. The default value are , , , and . The transition is recommended only in some specific cases, when the observed kinetics are very different (i.e. viral kinetics of responders, non responders, rebounders, …). It is accessible in Menu/Settings/Pop. parameters and by clicking on “Advanced settings” .
Simulated Annealing: When Simulated Annealing is checked, the temperature decreases as and in the two phases of the SAEM algorithm. Notice that you can use to force the estimated variance(s) to increase from a small initial value to the estimated value. Then, can be used if you want to obtain fits as good as possible and if you want to obtain intersubject variability as small as possible. It is accessible in Menu/Settings/Pop. parameters and by clicking on “Advanced settings” .
What is and what is not SAEM? And the related implications.
The objective of this paragraph is to explain what is the SAEM, what the user can wait for and its limitation.
The SAEM algorithm is the stochastic approximation expectationmaximization algorithm. It has been shown to be extremely efficient for a wide variety of complex models: categorical data, count data, timetoevent data, mixture models, hidden Markov models, differential equation based models, censored data, … Furthermore, convergence of SAEM has been rigorously proved and the implementation in Monolix is efficient.
SAEM is an iterative algorithm: SAEM as implemented in Monolix has two phases. The goal of the first is to get to a neighborhood of the solution in only a few iterations. A simulated annealing version of SAEM accelerates this process when the initial value is far from the actual solution. The second phase consists of convergence to the located maximum with behavior that becomes increasingly deterministic, like a gradient algorithm.
SAEM is a stochastic algorithm: We cannot claim that SAEM always converges (i.e., with probability 1) to the global maximum of the likelihood. We can only say that it converges under quite general hypotheses to a maximum – global or perhaps local – of the likelihood. A large number of simulation studies have shown that SAEM converges with high probability to a “good” solution – it is hoped the global maximum – after a small number of iterations. Therefore, we encourage the user to use the convergence assessment tool to look at the sensitivity of the results w.r.t. the initial parameters
SAEM is a stochastic algorithm (2): The trajectory of the outputs of SAEM depends on the sequence of random numbers used by the algorithm. This sequence is entirely determined by the “seed.” In this way, two runs of SAEM using the same seed will produce exactly the same results. If different seeds are used, the trajectories will be different but convergence occurs to the same solution under quite general hypotheses. However, if the trajectories converge to different solutions, that does not mean that any of these results are “false”. It just means that the model is case sensitive to the seed or to the initial conditions.
3.2.Standard error using the Fisher Information Matrix
Purpose
The variance of the maximum likelihood estimate (MLE) , and thus confidence intervals, can be derived from the observed Fisher information matrix (FIM), itself derived from the observed likelihood (i.e., the pdf of observations y):
There are two different algorithms: by linearization or by stochastic approximation. When “linearization” is used, the structural model is linearized, and the full statistical model is approximated by a Gaussian model. When “stochastic approximation” is used, the exact model is used, and the Fisher information matrix (F.I.M) is approximated stochastically. The final estimations are displayed in the command window (Matlab or DOS) together with the population parameters:
 the estimated fixed effects, their standarderrors, the absolute and relative pvalues obtained from the Wald test (only for the coefficients of the covariates),
 the estimated variances (or standard deviations) and their standarderrors,
 the estimated residual error parameters and their standarderrors,
 the estimated correlation matrix of the random effects if the covariance matrix is not diagonal,
 the correlation matrix of the fixed effect estimates, with the smallest and largest eigenvalues,
 the correlation matrix of the variance components estimates, with the smallest and largest eigenvalues,
All that information is appended to the file pop_parameters.txt. The following file presents an example on the theophylline_project.
****************************************************************** * theophylline_project.mlxtran * November 26, 2015 at 16:17:01 * Monolix version: 4.4.0 ****************************************************************** Estimation of the population parameters parameter s.e. (lin) r.s.e.(%) pvalue ka_pop : 1.57 0.31 20 V_pop : 0.454 0.019 4 beta_V_t_WEIGHT : 0.471 0.3 63 0.11 Cl_pop : 0.0399 0.0034 9 omega_ka : 0.641 0.14 22 omega_V : 0.113 0.035 31 omega_Cl : 0.272 0.066 24 a : 0.735 0.056 8 ______________________________________________ correlation matrix of the estimates(linearization) ka_pop 1 V_pop 0.15 1 beta_V_t_WEIGHT 0 0.09 1 Cl_pop 0.06 0.15 0.01 1 Eigenvalues (min, max, max/min): 0.79 1.3 1.6 omega_ka 1 omega_V 0.02 1 omega_Cl 0 0.01 1 a 0.02 0.11 0.05 1 Eigenvalues (min, max, max/min): 0.87 1.1 1.3 Numerical covariates t_WEIGHT = log(WEIGHT/70)
Notice that the chosen method is displayed in the file.
Best practices: when to use “linearization” and when to use “stochastic approximation”
Firstly, it is only possible to use the linearization algorithm for the continuous data. In that case, this method is generally much faster than stochastic approximation and also gives good estimates of the FIM. The FIM by model linearization will generally be able to identify the main features of the model. More precise– and timeconsuming – estimation procedures such as stochastic approximation and importance sampling will have very limited impact in terms of decisions for these most obvious features. Precise results are required for the final runs where it becomes more important to rigorously defend decisions made to choose the final model and provide precise estimates and diagnostic plots. In the theophylline example, the evaluation of the FIM is presented with the CPU time.
Method  s.e.  s.e.  e.v. correlation matrix of the estimates  CPU time [s] 

Linearization  (0.31, 0.019, .3, .0035)  (0.14, 0.035, .066)  (0.8, 1.3, 1.6, 0.87, 1.1, 1.3)  .1 
Stochastic Approximation  (0.31, 0.019, .3, .0035)  (0.15, 0.035, .068)  (0.8, 1.2, 1.5, 0.79, 1.2, 1.5)  3.4 
One can see that it the stochastic approximation is much more costly in computation time but the result is more precise.
3.3.Individual parameters estimation
Purpose
Although the population parameter estimation algorithm gives a rough estimation of the individual parameters, it can be estimated two more precise estimators: the conditional mode and the conditional mean.
If the option ”Estimate the conditional modes” is selected, the individual parameters are estimated by maximizing the conditional probabilities , i.e.
It corresponds to the “optimal” value for the fit. If the option “Estimate the cond. means and s.d.” is selected, the conditional means and standard deviations are estimated by MCMC. The conditional distribution of the vector of individual parameters can be estimated for each individual using the MetropolisHastings (MH) algorithm. For each , this algorithm generates a sequence which converges in distribution to the conditional distribution and can be used for estimating any summary statistic of it (mean, standard deviation, quantiles, etc.).
The MH algorithm therefore allows us to define an initial estimator of the individual parameter that approximates the conditional mean:
For each parameter, the mean of these quantities over all the subjects is displayed together with a interval. Two files indiv_parameters.txt and indiv_eta.txt are created with the estimated individual parameters and random effects in table format. Also, if there were defined priors on some fixed effects, and it was selected prior distribution method for some of them, a new file called simulatedPopParams.txt is created simulations using their posterior distribution laws.
Advanced settings for individual parameter estimation:
MCMC: , , and are the numbers of transitions of the four different kernels used in the MCMC algorithm. The default value are , , , and . and define the stopping rule of the MCMC algorithm. The number of iterations of MCMC increases with and decreases with . It is accessible in Menu/Settings/Indiv. parameters.
Best practices : When do we look at the conditional mode and when do we look at the conditional mean?
The choice of using the conditional mean or conditional mode is arbitrary. By default, Monolix uses the conditional mode for computing predictions, taking the philosophy that the “most likely” values of the individual parameters are the most suited for computing the “most likely” predictions.
3.4.Log Likelihood estimation
Purpose
Performing likelihood ratio tests and computing information criteria for a given model requires computation of the loglikelihood
where is the vector of population parameter estimates for the model being considered. The loglikelihood cannot be computed in closed form for nonlinear mixed effects models. It can however be estimated in a general framework for all kinds of data and models using the importance sampling Monte Carlo method. This method has the advantage of providing an unbiased estimate of the loglikelihood – even for nonlinear models – whose variance can be controlled by the Monte Carlo size.
The estimation of the loglikelihood is performed. Two different algorithms are proposed to estimate the loglikelihood: by linearization and by Importance sampling. The estimated loglikelihoods are appended to pop_parameters.txt.
Loglikelihood by importance sampling
The observed loglikelihood can be estimated without requiring approximation of the model, using a Monte Carlo approach. Since
we can estimate for each individual and derive an estimate of the loglikelihood as the sum of these individual loglikelihoods. We will now explain how to estimate for any individual i. Using the representation of the model (the individual parameters are transformed to be Gaussian), notice first that can be decomposed as follows:
Thus, is expressed as a mean. It can therefore be approximated by an empirical mean using a Monte Carlo procedure:
 Draw M independent values , , …, from the marginal distribution .
 Estimate with
By construction, this estimator is unbiased, and consistent since its variance decreases as 1/M:
We could consider ourselves satisfied with this estimator since we “only” have to select M large enough to get an estimator with a small variance. Nevertheless, it is possible to improve the statistical properties of this estimator.
The problem is that it is not possible to generate the with this conditional distribution, since that would require to compute a normalizing constant, which here is precisely .
Nevertheless, this conditional distribution can be estimated using the MetropolisHastings algorithm described in the MetropolisHastings algorithm for simulating the individual parameters and a practical proposal “close” to the optimal proposal can be derived. We can then expect to get a very accurate estimate with a relatively small Monte Carlo size M.
The mean and variance of the conditional distribution are estimated by MetropolisHastings for each individual i. Then, the are drawn with a noncentral student tdistribution:
where and are estimates of and , and is a sequence of i.i.d. random variables distributed with a Student’s tdistribution with degrees of freedom.
Remark: Even if is an unbiased estimator of , is a biased estimator of . Indeed, by Jensen’s inequality, we have :
However, the bias decreases as M increases and also if is close to . It is therefore highly recommended to use a proposal as close as possible to the conditional distribution , which means having to estimate this conditional distribution before estimating the loglikelihood (i.e run task “individual parameter” with “Cond. means and s.d.” option).
Advance settings for the loglikelihood
A tdistribution is used as proposal. The number of degrees of freedom of this distribution can be either fixed or optimized. In such a case, the default possible values are 2, 5, 10 and 20 degree of freedom. A distribution with a small number of degree of freedom (i.e. heavy tails) should be avoided in case of stiff ODE’s defined models. We recommend to set a degree of freedom at 5. It is accessible in Menu/Settings/Loglikelihood.
Loglikelihood by linearization
The likelihood of the nonlinear mixed effects model cannot be computed in a closedform. An alternative is to approximate this likelihood by the likelihood of the Gaussian model deduced from the nonlinear mixed effects model after linearization of the function f (defining the structural model) around the predictions of the individual parameters .
Notice that the loglikelihood can not be computed by linearization for discrete outputs (categorical, count, etc.) nor for mixture models or when the posterior distribution have been estimated for some parameters with priors.
Best practices: When should I use the linearization and when should I use the importance sampling?
Firstly, it is only possible to use the linearization algorithm for the continuous data. In that case, this method is generally much faster than importance sampling method and also gives good estimates of the LL. The LL calculation by model linearization will generally be able to identify the main features of the model. More precise– and timeconsuming – estimation procedures such as stochastic approximation and importance sampling will have very limited impact in terms of decisions for these most obvious features. Selection of the final model should instead use the unbiased estimator obtained by Monte Carlo. In the warfarin example, the evaluation of the loglikelihood (along with the AIC and BIC) is presented with the CPU time.
Method  2LL  AIC  BIC  CPU time [s] 

Linearization  2178.78  2220.78  2251.56  1.5 
Important sampling  2119.76  2161.76  2192.54  27.1 
One can see that it the importance sampling is much more costly in computation time but the result is more precise.
3.5.How to run several tasks ?
With the button Run, you can execute successively several tasks. The list of tasks to run is called “scenario” or “workflow”. You can define your own scenario by checking the corresponding checkboxes at the left of the Run button.
For example, define the following workflow in order to:
 estimate the population parameters,
 estimate the standard errors,
 estimate the individual parameters,
 display graphical results using the individual parameters estimated previously.
If you just want to compute the population parameters with their standard errors, define the workflow as follows:
Notice that you can load and save a workflow using the menu Workflow.
3.6.Algorithms convergence assessment
Monolix includes a convergence assessment tool. It is possible to execute a workflow as defined above but for different, randomly generated, initial values of fixed effects. Click in the “Assessment” button to open the Assessment graphical interface.
You can enter the number of runs, or replicates, the parameters to generate initial values and the intervals where those values should be (uniformly) simulated. Click on Run to execute the tool. Thus you are able to estimate the population parameters using several initial seeds and/or several initial conditions. Notice that you are able to set one initial parameter constant while generating the others. The minimum and maximum of the generated parameters can be modified by the user.
The workflow is the same between the runs and is the one defined in the interface (population parameters estimation must be selected). A result folder is generated for each set of initial parameters. Two kinds of graphics are given as a summary of the results
 one showing the estimated values (blue) with the estimated standard errors (red bars) for each replicate (if the Fisher information matrix estimation was included in the scenario)
 the superimposed convergence graphics for each parameter. In case of F.I.M estimation is included, then the black lines represents the mean of the estimated s.e..
Also, a .mat file is produced with all the results used for those graphics. Press Last Results button to reproduce the last result graphics if you already have ran the tool for the current project, and Cancel to close the interface.
Best practices: what is the use the convergence assessment tool?
We cannot claim that SAEM always converges (i.e., with probability 1) to the global maximum of the likelihood. We can only say that it converges under quite general hypotheses to a maximum – global or perhaps local – of the likelihood. A large number of simulation studies have shown that SAEM converges with high probability to a “good” solution – it is hoped the global maximum – after a small number of iterations. The purpose of this tool is to evaluate the SAEM algorithm with initial conditions and see if the estimated parameters are the “global” minimum.
The trajectory of the outputs of SAEM depends on the sequence of random numbers used by the algorithm. This sequence is entirely determined by the “seed.” In this way, two runs of SAEM using the same seed will produce exactly the same results. If different seeds are used, the trajectories will be different but convergence occurs to the same solution under quite general hypotheses. However, if the trajectories converge to different solutions, that does not mean that any of these results are “false”. It just means that the model is case sensitive to the seed or to the initial conditions. The purpose of this tool is to evaluate the SAEM algorithm with several seeds to see the robustness of the convergence.
4.Graphics
What kind of plots can be generated by Monolix?
The list of graphics corresponds to all the graphics that Monolix can generate
Model summary and convergence
 Project summary
 Spaghetti plot: This graphic displays the original data w.r.t. time along with some additional informations.
 Convergence SAEM: This graphic displays the convergence of the estimated parameters with respect to the iteration number.
Diagnostic plot based on observed data
 Individual fits: This graphics displays the individual fits, using the population parameters with the individual covariates (red) and the individual parameters (green) w.r.t. time on a continuous grid.
 Obs. vs Pred: This graphic displays observations w.r.t. the predictions computed using the population parameters (on the left), and w.r.t. the individual parameters (on the right).
 Residuals: This graphic displays the PWRES (population weighted residuals), the IWRES (individual weighted residuals), and the NPDE (Normalized Prediction Distribution Errors) w.r.t. the time (on the top) and the prediction (bottom).
 VPC: This graphic displays the Visual Predictive Check.
 NPC: This graphic displays the numerical predictive check.
Diagnostic plots based on individual parameters
 Covariates: This graphic displays displays the estimators of the individual parameters in the Gaussian space (and those for random effects) w.r.t. the covariates.
 Parameters Distribution: This graphic displays the estimated population distributions of the individual parameters.
 Random Effects (boxplot): This graphic displays the distribution of the random effects with boxplots.
 Random Effects (joint dist.): This graphic displays scatter plots for each pairs of random effects.
Extensions
 BLQ: This graphic displays the proportion of censored data w.r.t. time.
 Prediction distribution: This graphic displays the prediction distribution.
 Categorized Data
 Time to Event data: This graphic displays the empiric (from observations) and theoretical (from simulations) survival function and average number of events for time to event data.
 Transition Probabilities: This figures displays the proportion of each category knowing the previous one for discrete models.
 Posterior distribution
 Contribution to likelihood: This graphic displays the contribution of each individual to the loglikelihood.
What can be done with the plots ?
The graphics menu bar proposes different optionsIt is possible to
 “Save as”, “Print”, “Close”, “CopyFigure” save, print, an close the figure and, in Windows, copy and paste it to any other document.
 “Zoom” menu activates or deactivates the zoom
 “Axes” menu proposes to use semilog or loglog scale plots. Settings option gives the opportunity to choose labels and to change the scale of graphics. It is also possible to use the same axis limits for the different plots of the same graphic.
 “Stratify” menu opens the stratify window, see below.
 “Settings” menu opens a new window with the settings of the current graphic
How to stratify graphics ?
When clicking on the stratify menu, the following window pops up.
The Stratify menu allows to create and use covariates for stratification purposes. It is possible to select a categorical covariate for splitting, filtering or coloring the data set. Also, in the bottom section, it can be defined a time filter.
How to save the graphics ?
When clicking on the button “Outputs to save” in the interface, the following window pops up
The user can choose which graphics to save. Notice that
 the save starts after the display of the graphics.
 the graphics are saved in the result folder.
 if the graphic is not defined by the user using the “Select plots” button, it will not be saved.
4.1.Model summary and convergence
4.1.1.Spaghetti plot
Purpose
The purpose of this plot is to display the original data w.r.t. time along with some additional informations.
Example of graphic
In the proposed example, the concentration for the warfarin data set is presented.
Settings
 Segments : Add/remove the segments between the observations,
 BLQ (with the color): Add/remove the segments between the observations
 Legend: Add/remove the legend (available only if there are several groups on the figure),
 Informations : Add/remove the informations on the measurements (total number of subjects, mean number of doses per subject, total, average, min and max number of observations)
By default, only Segments and Informations are activated.
In the following example, we defined stratification by the category SEX for the effect on the warfarin data set. Additionally, the following settings are the default settings and the legend.
Best practices
 It is always good to have a look first at the Spaghetti plot before running the parameter estimation. Indeed, it is very convenient to see if all your data are ok, if something weird happens on a subject, …
 It is possible to see the Spaghetti plot just after loading the datas. For that, click on the icon next to the data file choice.
 For a better understanding and/or exploration of the data set, it is also possible to export the data set in Datxplore.
4.1.2.Convergence SAEM
Purpose
This graphic displays the convergence of the estimated parameters with respect to the iteration number.
Example of graphic
In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see for distribution of the random effects associated to each parameter.
Settings
 Display: number of graphics the user want to see at the same time. It is convenient when there are a lot of parameters. Indeed, using this setting, the user can focus on particular parameter convergence for example.
By default, all parameters are displayed.
4.2.Diagnostic plot based on observed data
4.2.1.Individual fits
Purpose
The figure displays the individual fits, using the population parameters with the individual covariates (red) and the individual parameters (green) on a continuous grid. This is a good way to see on each subject the validity of the model, and the actual fit proposed. It is possible to show the computed individual parameters on the figure. Moreover, it is also possible to display the median and a confidence interval for () estimated with a Monte Carlo procedure. The design and the covariates of each subject are used for the simulation.
Example of graphic
In the proposed example, the concentration for the warfarin data set is presented. For each subject, the data are displayed with blue crosses along with the individual fit and population fit (the prediction using the estimated individual and population parameters respectively).
Settings
 Grid arrange. Define the number of subjects that are displayed. The user can define the number of rows and the number of columns. Moreover, a slider is present to be able to change the subjects under consideration.
 Display
 Filter : filter the observations when stratify.
 Split occasions: Split the individual by occasions in case of IOV.
 Population fits : Model estimation on the subject’s design based on the population parameters.
 Individual fits : Model estimation on the subject’s design and covariate based on the individual parameters. The individual parameters come from the conditional mode or the conditional mean estimation.
 BLQ data : show and put in a different color the data that are BLQ (Below the Limit of Quantification).
 Legend : add/remove the legend. There is only one legend for all the individual fits.
 Informations : add/remove the individual parameter estimation for each individual fit.
 Prediction distribution
 The median value obtained with the predicted distribution coming from the population parameter.
 The prediction interval (associated with a level): allows to see the variability driven by the interindividual variability.
By default, only the observed data and the individual fit are proposed.
4.2.2.Obs. vs Pred.
Purpose
This figure displays observations () versus the predictions () computed using the population parameters (on the left), and with the individual parameters (on the right).
Example of graphic
In the proposed example, the concentration for the warfarin example is presented. It is modeled by a one compartment first order absorption model with a linear elimination. On the left, the scatter plot of the observed concentration with respect to the predicted concentration using the estimated population parameters is displayed. On the right, the scatter plot of the observed concentration with respect to the predicted concentration using the individual parameters is displayed.
Settings
 Prediction
 Population prediction: add/remove the figure with the comparison between the population prediction and the observations.
 Individual prediction: add/remove the figure with the comparison between the individual prediction (based on the chosen conditional estimation) and the observations.

 Observed datas,
 Segments by individuals: Add/remove segments between the observation of each individual
 BLQ data : show and put in a different color the datas that are BLQ (Below the Limit of Quantification)
 Splie: add/remove a spline interpolation on the comparison between the prediction and the observation.
 Legend : add/remove the legend. There is only one legend for all the individual fits
By default, both population and individual prediction are proposed.
4.2.3.Residuals
Purpose
This graphic displays the PWRES (population weighted residuals), the IWRES (individual weighted residuals), and the NPDE (Normalized Prediction Distribution Errors) w.r.t. the time (on the top) and the prediction (bottom). The PWRES are computed using the population parameters and the IWRES are computed using the individual parameters. For discrete outputs, only NPDE and individual NPDE are used.
Population Weighted Residuals () are defined as the normalized difference between the observations and their mean. Let be the vector of observations for subject i. The mean of is the vector . Let be the variancecovariance matrix of . Then, the ith vector of the population weighed residuals is defined by
and are not known in practice but can be estimated by MonteCarlo simulation.
Individual weighted residuals () are estimates of the standardized residual () based on individual predictions
If the residual errors are assumed to be correlated, the individual weighted residuals can be decorrelated by multiplying each individual vector by , where is the estimated correlation matrix of the vector of residuals .
Normalized prediction distribution errors () are a nonparametric version of based on a rank statistic. For any (i,j), let where is the cumulative distribution function (cdf) of . NPDEs are defined as an empirical estimation of . In practice, one simulates a large number of simulated data set using the model, and estimate as the fraction of simulated data below the original data, i.e:
By definition, the distribution of is uniform on [0,1], we thus rather use , which follows a standard normal distribution (with the cdf of the standard normal distribution). NPDEs are defined as an empirical estimation of , i.e .
For continuous outputs, this figure shows the empirical distributions of the weighted residuals and the NPDE together with the standard Gaussian pdf and qqplots to check if the residuals are Gaussian.
Example of graphic
In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see the PWRES, the IWRES and the NPDE w.r.t. the time (on the top), to the prediction (in the middle) and the comparison between the empirical and theoretical pdf of the PWRES, IWRES and NPDE are presented on the bottom. Notice, that in that example, the prediction intervals were added.
Settings
 Residuals: The user can choose which residual will be displayed
 PWRES
 IWRES (using the individual parameter estimated using the conditional mode or the conditional mean)
 NPDE
 Select Graphics: The user can define which graphics will be represented
 Scatter Plot (TIME): the representation w.r.t. the time
 Scatter Plot (Predictions): the representation w.r.t. the prediction
 Histogram: the pdf of the associated residual
 QQplot for the residuals.
 Scatter Plot
 Observed data can be added
 BLQ data (and a possibility to add a different color) can be added if present
 Spline interpolation can be added
 Empirical percentiles for the 10%, 50% and 90% quantiles
 Theoretical percentiles for the 10%, 50% and 90% quantiles
 Prediction interval for the 10%, 50% and 90% quantiles
 Outliers (in dots or in area) to show the residuals that are out of the prediction interval
 By default, both displays are proposed with the individual parameters coming from the conditional mode estimation.
 Residual Distribution
 Empirical pdf
 Theoretical pdf
 Histogram
By default, all the residuals and all the graphics are displayed.
4.2.4.VPC
Purpose
The principle of the VPC (Visual Predictive Check) is to assess graphically whether simulations from a model of interest are able to reproduce both the central trend and variability in the observed data, when plotted versus an independent variable (typically time). A VPC is based on multiple simulations with the model of interest and the design structure of the observed data (i.e., dosing, timing, and number of samples). Percentiles of the simulated data are compared to the corresponding percentiles of the observed data. The percentiles are calculated either for each unique value of the independent (xaxis) variable or for a bin across the independent variable. It allows us to summarize in the same graphic the structural and statistical models by computing several quantiles of the empirical distribution of the data after having regrouped them into bins over successive intervals.
By default, the median and 10th and 90th percentiles are presented. The interpercentile range between the outer percentiles of all the simulated data, i.e. the prediction interval, is displayed with a level of 90%.
Example of graphic
In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. The figure presents the VPC with the prediction interval for quantile 10%, 50% and 90%. Additionally,
Settings
 Display:
 Observed data
 BLQ data (with a possibility to color them specifically)
 Empirical percentile
 Theoretical percentile
 Prediction interval
 Outliers (in dots or in area)
 Bins
 pcVPC using Uppsala prediction correction
 Legend
4.2.5.NPC
Purpose
This graphic displays the numerical predictive check. It allows to compare empirical cumulative distribution function (CDF) of the observations with theoretical’s (computed from simulations).
Example of graphic
In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see the numerical predictive check of the concentration along with the data set along with the prediction interval and the theoretical cdf.
Settings
 Display
 Observed CDF
 Theoretical CDF
 Prediction interval (and its level)
 Outliers area
 Grid size for the sampling of the output
 Legend
4.3.Diagnostic plots based on individual parameters
4.3.1.Parameter Distributions
Purpose
This figure displays the estimated population distributions of the individual parameters. Settings allow you to show a summary of the theoretical distributions (mean, median, mode, percentiles), histograms, and a nonparametric estimation of the distribution. It is also possible to display the empirical distribution of the individual parameters and the shrinkages computed as follows for each random effect:
Example of graphic
In the following example, the parameter distribution of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) is proposed. One can see for each parameter (ka,V,Cl) the estimated law, the histogram of the realizations and the calculation of the shrinkage.
Settings
 Display
 Population distribution: parameter distribution given by the user with the estimated population parameter,
 Histogram: represents the histogram of the realization of the considered values,
 Non param. pdf: add/remove distribution estimated by the estimated realization of the random effects.
 Legend : add/remove the legend.
 Informations : add/remove the shrinkage in %
 Estimator. The user can define which estimator is used for the definition of the individual parameters.
By default, only the population distribution is proposed with the individual parameters coming from the conditional mode estimation.
4.3.2.Covariates
Purpose
The figure displays the estimators of the individual parameters in the Gaussian space, and those for random effects, (e.g. the conditional expectations and for i from 1 to N and the conditional modes) v.s. the covariates.
Example of graphic
In the proposed example, the parameters estimation for a PKPD model on the warfarin data set is presented. The random effects of 3 parameters of the PD model are displayed w.r.t. a transformed version of the weight (t_wt=log(wt/70)), the weight, and the sex category.
Settings
 Grid arrange. Define the number of individual parameters (or random effects) and covariate that are displayed. The user can define the number of rows and the number of columns.
 Display
 The user can choose to see either the individual parameters or the Random effects.
 Data
 Splines: add/remove the a spline interpolation
 Regression lines: add/remove the affine fit
 Baseline: add/remove the baseline.
 Informations : add/remove the correlation information for each continuous covariate and each random effect.
 Legend : add/remove the legend.
 Estimator. The user can define which estimator is used for the definition of the individual parameters.
4.3.3.Random Effects (boxplot)
Purpose
This graphic displays the distribution of the random effects with boxplots. The horizontal dotted lines show the interquartile interval of a standard Gaussian distribution.
Example of graphic
In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see for distribution of the random effects associated to each parameter.
Settings
 Display
 Theoretical median
 Theoretical quartiles
 Estimator. The user can define which estimator is used for the definition of the individual parameters and thus for the random effects.
By default, both displays are proposed with the individual parameters coming from the conditional mode estimation.
4.3.4.Random Effects (joint dist.)
Purpose
This graphic displays scatter plots for each pair of random effects.
Example of graphic
In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see the random effects associated with each other. Regressions and spline interpolation are added to see the correlation between random effects.
Settings
 Display
 Data
 Spline
 Regression
 Estimator. The user can define which estimator is used for the definition of the individual parameters and thus for the random effects.
By default, both displays are proposed with the individual parameters coming from the conditional mode estimation.
4.4.Extensions
4.4.1.BLQ
Purpose
This graphic displays the proportion of censored data w.r.t. time. It is possible to chose the censoring interval. This graphic is only available for projects with censored data.
Example of graphic
In the following example, the parameters of a PKPD model (on the censored warfarin data set) are estimated. The figure presents the BLQ w.r.t. with the prediction interval.
Settings
 Display:
 Censored data min and max. By default, the limit and the censored values are used. However, one can look at smaller censored interval for example.
 BLQ Frequency observation calculated by simulation
 Median
 prediction interval with the associated level
 Outliers (area)
 Grid time size
 Legend
By default, the censored area corresponds to the data set description and the BLQ frequency observation, the prediction interval, the outliers and the legend are displayed.
4.4.2.Prediction distribution
Purpose
This graphic displays the prediction distribution. It allows to compare the observations with the theoretical distribution of the predictions.
Example of graphic
In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see the prediction distribution of the concentration along with the data set.
Settings
 Display
 Observed data
 Median
 Continuous or discrete representation of the distribution
 Level (meaning that the distribution corresponds to )
 Number of band and the associated percentile in case of a discrete representation
 Legend
By default, only the prediction distribution and the median are displayed.
4.4.3.Transition probabilities
Purpose
This graphic displays the proportion of each category knowing the previous one. It is only available for discrete observations and any discrete model.
Example of graphic
In the following example, the parameters of a threecategories markov model are estimated. The figure presents the proportion of each category knowing the previous one. Moreover, there are no dependency on the previous one, only on figure representing the modalities proportion is displayed.
Settings
 Proportion
 Cumulated by modalities
 Separated
 Display
 Bins
 Legend
4.4.4.Time to Event data
Purpose
This graphic displays the empiric (from observations) and theoretical (from simulations) survival function and average number of events for time to event data. It is only available for outputs of type event.
Example of graphic
In the following example, the parameter Te of a event model with a constant hazard function (=1/Te) is estimated. The figure presents the KaplanMeier plot w.r.t. the time (top) and the mean number of events (bottom).
Settings
 Selection:KaplanMeier plot which represent the survival function.
 mean number of events
 Display
 Empirical
 Predicted median
 Prediction interval
 Censored data
 Accuracy
 Legend
 prediction distribution
 Level
 Number of bands
By default, the empirical of the survival function and of the mean number of events is displayed.
4.4.5.Contribution to likelihood
Purpose
This graphic displays the contribution of each individual to the loglikelihood. This graphic is only available if the loglikelihood was previously computed.
Example of graphic
In the following example, the parameters of a one compartmental model with first order absorption and linear elimination (on the theophylline data set) are estimated. One can see the contribution of each subject.
Settings
 Label
 Index
 Name of the subject
 Legend
5.FAQ
This page summarizes the frequent questions about Monolix.
Monolox2016R1: Evolution form Monolix433 and known bug list
 Evolution from Monolix433 to Monolix2016R1
 How to update your model to be compatible with the new Mlxtran V2 syntax ?
 Known bug and error messages list
 Error handling: If you encounter the error “There was an error while parsing the project file” while loading your project, check that your data file headers and strings contain only allowed characters.
Regulatory
 What is needed for a regulatory submission using Monolix? Monolix is used for regulatory submissions (including the FDA and the EMA) of population PK and PK/PD analyses. The summary of elements needed for submission can be found here.
 How to cite Monolix2016R1? To cite Monolix, please reference it as here
Monolix version 2016R1. Antony, France: Lixoft SAS, 2016. http://lixoft.com/products/monolix/
Running Monolix
 Do I need Matlab to run Monolix? No, there is a stand alone version.
 On what operating system Monolix run? MonolixSuite runs on Windows, Linux and MacOS platform.
 Is it possible to run Monolix using a simple command line? Yes, see here.
 Can I run a batch function to run a list of projects and/or all the projects in a directory? Yes, please see here R and Matlab functions to do that.
Initialization
 How to initialize my parameters? There are several ways to initialize your parameters and visualize the impact. See here the different possibilities.
Results
 Can I define my self the result folder? By default, the result folder corresponds to the project name. However, you can define it by yourself. See here to see how to define it on the user interface.
 Can I define the output to save (graphics, tables, …)? Yes, see here.
 What result files are generated by Monolix? Monolix proposes a lot of different output files depending on the tasks done by the user. here is a complete listing of the files along with the condition for creation. See here for more information.
 Can I replot the graphics using another plotting software? Yes, if you go to the menu Graphics and click on “Export graphics data”, all the data needed to reproduce the graphics are stored in text files.
Tasks
 How are the censored data handled? The handling of censored data is described here.
 How are the parameters without variability handled? The different methods for parameters without variability are explained here.
 What is the difference between the complete loglikelihood computed and displayed during SAEM and the loglikelihood computed as separate task (by linearization and/or importance sampling)? The loglikelihood is defined as . It is the relevant quantity to compare model, but unfortunately it cannot be computed in closed form because the individual parameters are unobserved. Thus, to estimate the loglikelihood an importance sampling Monte Carlo method is used in a separate task (or an approximation is calculated via linearization). To know more on the loglikelihood calculation using linearization or importance sampling, see here. On the contrary, the complete likelihood refers to the joint distribution . By decomposing the terms as , we see that this quantity can easily be computed using as the individual parameters drawn by MCMC for the current iteration of SAEM. This quantity is calculated at each SAEM step and is useful to assess the convergence of the SAEM algorithm.
 When estimating the loglikelihood via importance sampling, the estimates does not seem to stabilize. What can I do? The loglikelihood estimator obtained by importance sampling is biased by construction (see here for details). To reduce the bias, the conditional distribution should be known as well as possible. For this, estimate the individual parameters with the “Cond. means and s.d” option before estimating the loglikelihood.
Model definition
 Is it possible to use timevarying covariates? Yes, however the covariates relationship must be defined in the model instead of the GUI. See here how to do that.
 Is it possible to define complex covariateparameter relationships such as MichaelisMenten for instance? Yes, this can be done directly in the modle file. See here how to do it.
 Is it possible to define mixture of structural models? Yes, it may be necessary in some situations to introduce diversity into the structural models themselves using betweensubject model mixtures (BSMM) or withinsubject model mixtures (WSMM). The handling of mixture of structural models is defined here. Notice that in the case of a BSMM, the proportion p between groups is a population parameter of the model to estimate. There is no interpatient variability on p: all the subjects have the same probability and a logitnormal distribution for p must be used to constrain it to be between 0 and 1 without any variability.
 Is it possible to define mixture of distributions? Mixture models usually refer to models for which the categorical covariate is unknown, but whichever the case, the structural model is the same. The handling of mixture of structural models is defined here.
 Can I set bounds on the population parameters for example between a and b? Not directly through the interface. However, you can manage it with the distribution choice and some calculation in the Mlxtran. Indeed, a lognormal distribution will imply a strictly positive parameter, while a logit distribution will imply a parameter in ]0, 1[.
 Thus, to have a parameter p between two bounds a and b you have to
 define a p_logit as an input for your structural model
 set p = a+p_logit*(ba) in EQUATION: in your structural model
 define logit as the distribution for p_logit
 To have a parameter p higher than a, you have to 1) define a p_log as an input for your structural model, 2) set p = a+p_log in EQUATION: in your structural model, and 3) define log as the distribution for p_log
 To have a parameter p lower than b, you have to 1) define a p_log as an input for your structural model, 2) set p = bp_log in EQUATION: in your structural model, and 3) define log as the distribution for p_log
Notice that, using that transformation, you have to multiply the standard error of p_logit by (ba) in the first case to have the standard error of the initial p parameter. It is not necessary for the two other cases as it is an offset.
 Thus, to have a parameter p between two bounds a and b you have to
 Can I put any distribution on the population parameters? Not directly through the interface. Using the interface, you can only put normal, lognormal and logitnormal. However, you can set any transformation of your parameter in the EQUATION: part of the structural model and apply any transformation on it.
Tricks
 How to compute AUC, time interval AUC, … using Mlxtran in a Monolix project? See here.
Export
5.1.Evolutions from Monolix433 to Monolix2016R1
Evolutions of the Mlxtran language now allow to share a model in the workflow from model exploration (using Mlxplore), to parameter estimation (using Monolix), and simulation (using Simulx). Therefore, the proposed evolutions of the language merge the needs of all these applications and the model written with this new syntax will be compatible with all Lixoft software. See here to see the evolution and the eventual impact on your model.
Main improvements and new features:
– The interface was redesigned for clarity and simplicity.
– A button providing the statistical model summary was added
– A button providing a direct browse to the demo was added
– All the libraries are now using the Mlxtran syntax. Therefore, if you were using a model from the Matlab library, it will automatically be transformed into its Mlxtran equivalent model when saving. The Matlab models are still available although their use is no longer recommended.
Enhancements:
Numerical
– Improved LL computation when the sensitivity analysis is verified
– Improved simulated annealing initialization
System
– Added configuration option for sysadmin to enforce number of structural model threads and maximum computational matlab threads
Naming
– Latent covariate by default are now called lcat, lcat1, etc
– Error model parameters are by default now called (a,b,c) if there is only one continuous error model
– The autocorrelation parameter is by default now called (r) if there is only one output with autocorrelation
– IOV variability parameters are by default now called (gamma_) if there is only one level, (gamma1_, gamma2_) if there are more
– Latent covariate probabilities are by default now called (plcat_1, plcat_2) for a latent covariable with three categories if there is only one latent covariate, (plcat1_1, plcat1_2, ) if there are more.
Bugs fixed:
– bug fix: fixed effects were not correctly initialized when they were not simulated
– bug fix: fixed problems estimating autocorrelation (SAEM) and the corresponding Fisher information matrix
– bug fix: limits of the confidence intervals in VPC
– bug fix: dde solver precision in interpolation of delays
– bug fix: computation of the loglikelihood of a multivariate gaussian vector on an interval during FIM computation by linearization for censored data was corrected
– bug fix: simulated annealing on standard error parameters for mixture projects was not well used
– bug fix: when SS and ADDL were used at the same time in the data set, it was not well interpreted.
– bug fix: occasion time management was not correct when evid=1 and the first occasion of the subject had no observations
– bug fix: corrections when several covariates are created with the same initial covariate
– bug fix: corrections when IOV occurs with discrete models
– bug fix: correction on individual parameter estimate in conditional mode with no IIV
5.2.How to update your model to be compatible with the new Mlxtran V2 syntax ?
You have a Mlxtran model you used with Monolix 4.3.3 and older. In this case your model is using Mlxtran V1 syntax which is not compatible anymore with the current version of Monolix 2016R1. In order to continue using your model with Monolix 2016R1 you need to make some changes in the model syntax. These are very minor changes and, in the following, we show you how to convert your model from Mlxtran V1 to V2.
Why an evolution of Mlxtran ?
With the transition to Monolix 2016R1 we are introducing the new version (V2) Mlxtran syntax. We made sure that Monolix 2016 R1 is backwards compatible with the old V1 Mlxtran syntax. This means that all your previous models can be run with Monolix 2016R1 if this is needed. However, there are two critical limitations with V1 Mlxtran and we strongly advice you to adopt V2 Mlxtran syntax.
 The first limitation is that you cannot edit V1 Mxltran models with Monolix 2016 R1. This means that if you want to change a model you must convert it to V2 syntax anyway.
 The second limitation concerns the use of Mlxplore and Simulx after parameter estimation with Monolix 2016R1 (workflow). This will not be possible if you do not use V2 syntax in Monolix
What changed in the longitudinal model?
Addition of [LONGITUDINAL]
To define the file as a model, the block [LONGITUDINAL] must be added so that Monolix understands that the file contains a structural model. [LONGITUDINAL] should be placed at the beginning of the file or after the optional DESCRIPTION:.
Modification of the input definition
The declaration of the inputs is changed.
Modification of the parameter definition.
In the V1 syntax, the parameters were defined by “parameter=”. Now, in V2, they must be defined with “input=”.
Modification of the regressor definition
In the V1 syntax, the regressors were defined by “regressor=”. Now, in V2, the following syntax must be used:
1/ The name of the regressors have to be added in the inputs.
2/ Each regressor has to be declared in the following way “NameRegressor = {use=regressor}”
Note that the order of definition of the regressors in the input list must agree with the order of definition in the data set (X columns).
Modification of the observation definition
In the V1 syntax, the observations were defined by “OBSERVATION:”. Now, in V2, they must be defined with “DEFINITION:”.
Two examples
Here are two examples of models implemented with V1 for Monolix 4.3.3 and the correct translation to V2 syntax for 2016R1.
Turnover model for multiple doses.
In the warfarin demos, a turnover model for multiple doses is used. It writes
; Model using old V1 syntax DESCRIPTION: Turnover model for multiple doses. INPUT: parameter = {Tlag, ka, V, Cl, I_max, C50, Rin, kout} EQUATION: Cc = pkmodel(Tlag, ka, V, Cl) E_0 = Rin/kout ddt_E = Rin*(1I_max*Cc/(Cc+C50))  kout*E OBSERVATION: Concentration = {type=continuous, prediction=Cc, errorModel=combined1} PCA = {type=continuous, prediction=E, errorModel=constant} OUTPUT: output = {Concentration, PCA}
In this case, one has to modify the input and observation definition. Thus, the model writes in the new V2 syntax
; Model using new V2 syntax DESCRIPTION: Turnover model for multiple doses. [LONGITUDINAL] input = {Tlag, ka, V, Cl, I_max, C50, Rin, kout} EQUATION: Cc = pkmodel(Tlag, ka, V, Cl) E_0 = Rin/kout ddt_E = Rin*(1I_max*Cc/(Cc+C50))  kout*E DEFINITION: Concentration = {type=continuous, prediction=Cc, errorModel=combined1} PCA = {type=continuous, prediction=E, errorModel=constant} OUTPUT: output = {Concentration, PCA}
Sequential PK/PD model – Individual PK parameters have been estimated first, they are defined as regression variables.
In the warfarin demos, a turnover model for multiple doses is used. It writes
; Model using old V1 syntax DESCRIPTION: Sequential PK/PD model. Individual PK parameters have been estimated first, they are defined as regression variables. INPUT: parameter = {Imax, C50, Rin, kout} regressor = {Tlag, ka, V, Cl} EQUATION: Cc = pkmodel(Tlag,ka,V,Cl) E_0 = Rin/kout ddt_E= Rin*(1Imax*Cc/(Cc+C50))  kout*E OUTPUT: output = E
In this case, one has to modify the input definition. Thus, the model writes in the new V2 syntax
; Model using new syntax DESCRIPTION: Sequential PK/PD model. Individual PK parameters have been estimated first, they are defined as regression variables. [LONGITUDINAL] input = {Imax, C50, Rin, kout, Tlag, ka, V, Cl} Tlag = {use = regressor} ka = {use = regressor} V = {use = regressor} Cl = {use = regressor} EQUATION: Cc = pkmodel(Tlag,ka,V,Cl) E_0 = Rin/kout ddt_E= Rin*(1Imax*Cc/(Cc+C50))  kout*E OUTPUT: output = E
5.3.Monolix2016R1known bug list
Bugs have been reported for which there are workarounds
Structural model
 When you have a structural model using a delay differential equation, if the initial condition of the delayed variable is time varying, then the information is not taken into account. Workaround: It is possible to code the initial condition in a different way as can be seen here.
 If the initial value of a differential equation is defined by a regressor, it is not well taken into account. Workaround: In the Mlxtran model file, remove the regressor definition, and use instead a “PK:” block with a “depot” macro to add the initial value of your ODE at the desired time. If you do not have a AMT column, you just tag your regressor column X into AMT. Else wise, you need to modify your AMT column to include the regressor value and affect it to another administration route (using the ADM column).
 For the indirect response model from the PD/PDe libraries (all the turn_input and turn_output model files), the initialization is incorrect. Workaround: You have to add the initial condition E_0 = Rin/kout to your model. You have to change it directly in your library (in monolixSuiteInstallationDir/factory/library/pd and monolixSuiteInstallationDir/factory/library/pde).
 Error “The model could not be compiled. Error message: Cannot load the library …”: this error usually appears when the keyword inftDose is used. Workaround: use a regressor column containing the infusion duration information
Project definition
 Error “id:lixoft:Exception … message:assert(son) failed …” when saving: the error appears if you change the name of a categorical covariate without applying any transformation. Note that the first save may work but if the project is loaded and saved again, the error appears. Workaround: Use the button “ADD” in the covariate transformation window. Using that, you can define any transformed covariates, possibly just rename even if no transformation is applied.
 When selecting the option “mixed method” for parameter without variability in Settings > Pop. parameters > Advanced settings, the choice of the method is not properly saved when saving the project. As a consequence, when reloading a saved project, the default method “no variability” will be used. Workaround: manually set the method to “mixed method” after having loaded the project
 Error “A message identifier must be followed by another input argument, of type char, representing the message text” when saving: Reason 1: this message appears when a covariate transformation has been defined using centering by mean or median, and the data set contains individuals without any observation for some YTYPEs. Workaround: use option “other” and define your covariate transformation, for instance “log(cov/70)”. Reason 2: the same error appears if you have set an ADM column, but not AMT column. Workaround: ignore the ADM column.
 Error “The (lexicographical) ordering of the data observation type ’10’ does not match the ordering of its associated observation” when saving: this error appears when a project has more than 10 outputs (and 10 YTYPE). There is no workaround for the 2016R1 version, but you can use the 4.3.3 version which does not have this bug.
 Error “Error when estimating the individual parameters: Attempted to access X(6); index out of bounds because numel(X)=1.” in the console when estimating individual parameters: this error appears for projects with a latent covariate, and parameters that have no influence on some individuals (for instance ka for individuals which receive a bolus dose). Workaround: estimate the individual parameters using the “conditional mean” option.
 Error “Error: matrix must be definite positive” at the end of the first SAEM phase: this error may appear if you have a latent covariate and have chosen one the methods “add decreasing variability”, “variability in the first stage” or “mixed” for parameters without variability. Workaround: use method “no variability”.
 Error “Execution error : CAT arguments dimensions are not consistent” at the end of Fisher Information Matrix calculation: error appears for categorical data with IOV. No workaround for 2016R1 version.
 Error “Execution error: subscripted assigment dimension mismatch” when generating the graphics: this error may happen if some individuals do not have observations for all outputs/YTYPEs. Workaround: reorder your data set such that the last individual of the data set has observations for all outputs
 Error “Bad lexical cast: source type value could not be interpreted as target” at project load: this error happens when bayesian priors with negative typical value have been defined in the project. Workaround: manually remove the minus signs in the section [POPULATION] DEFINITION of the .mlxtran project file, load and put the minus signs back in the Monolix GUI before running the project.
 Error “Execution error: All of the input arguments must be of the same size and shape. Previous inputs had size 1 in dimension 1. Input #3 has size 0.“. This error appears for TTE models, when calculating the individual parameters via mean. Workaround: use the calculation via mode instead
Output files
 When using “Export Graphics Data” for predictioncorrected (uppsala correction) VPCs, the outputted observation data points are not predictioncorrected. As a workaround, an R script permits to generate the predictioncorrected output file for most cases (no BLQ data, no stratification). Download here.
Export to Mlxplore
 column RATE, TINF and ADDL of the data set are not properly exported. The infusions will be replaced by boluses and the additional doses not taken into account. Workaround: manually add “tinf=” or “rate=” and the additional doses in the lines of the [ADMINISTRATION] block. Mlxplore documentation here.
 export of custom parameter distributions is not supported. No error is generated at the time of export but generated Mlxplore script will not compile. It is not possible to use custom distributions with Mlxplore.
Meaning of error messages
 Error “Cannot merge a regression update, as it would erase a meaningful value” at start of SAEM: this error appear when different regressor values are defined at the time for the same individual. Solution: check our data set; or if you use occasions, and would like to have different regressor values at the same time for different occasions, add an EVID column and set EVID=4 at the beginning of each occasion.
 Error “There was a problem verifying the data file” when loading the data: the error might come from special characters (such as quotes, etc) in your data file. Solution: Check here for the allowed characters and modify your data set accordingly
 Error “A model’s first parameter must have one column for each individual parameter. Model’s buffer could be out of date” : this error may appear if you modify the structural model file in an external editor and run tasks without having recompiled the model. Solution: recompile the model
 Error “Execution error: Vectors must be the same lengths” during graphics generation: for some graphics predictions are calculated for time points not present in the original data set, and possibly for negative times. If your model returns NaNs for some time points (for instance if you output log(Cc) and Cc(t=0)=0 ), this error appears. Solution: use a saturation in your model file, for instance Ccsat=max(1e6,Cc) and then use log(Ccsat).
 Error “Execution error: Input #2 expected to be a cell array, was double instead” or “index exceeds matrix dimension” when generating the graphics: this error appears if you have selected the table “covariate summary” in the “outputs to save” and you have set as CAT a column that have only one covariate value (all individuals have ‘0’ for male for instance). Solution: ignore this column in the data set loading window.
 Error “Error while stratifying: the information about the covariates is not consistent. Impossible to load stratify, using default” when generating the graphics: This error may appear if you have used “attach current settings to project” and have changed the data set afterwards. Solution: manually remove the file stratify.mat from the result folder and the *_graphics.xmlx file, and load the project again.
 Error “Parse not ended” when loading a project: If the data set contains header or categorical covariate with modalities containing special characters “.”, ““, “*”,… (see here for the full list), the project runs fine and can be saved. However, there is an issue when reloading the project. Solution: To be able to reload the saved project, change the special character by a letter (“X” for example). Then, the modified project can be loaded. The categories will be automatically recomputed.
5.4.Running Monolix using a command line
Yes, it is possible to run Monolix using a simple command line:
 with the standalone version of Monolix
monolixSuiteInstallationFolder/bin/Monolix.sh [–nowin] p fullPathProjectName f [runsaemfimllgraphics]
(replace .sh into .bat for windows operating system)
 with the Matlab version of Monolix (the command has to be launched from the matlab directory of the installation of Monolix , i.e. /matlab)
matlab wait nosplash nodesktop r "monolix(’nowin’, ’p’, ’fullpathProjectname’,’f’,’[runsaemfimllgraphics]’, ’destroy’ ),exit"
Notice that the project name should be defined using a full path and not a partial path.
The program options are
 nowin : without opening a window, mandatory in nodesktop environments.
 p : project to run. It should be the full path name of the project
 f runsaemindivfimllgraphics to run a task
 f run: run the workflow defined in the project file.
 f saem: estimate population parameter.
 f indiv: estimation of the individual parameters
 f fim: estimate the standard errors of the estimates and Fisher information matrix.
 f LL: estimate the loglikelihood.
 f graphics: generate the result graphics.
 To run several tasks, you just have to set them successively, e.g. f saem f LL
5.5.Monolix batch run
It is possible to run calculation of several Monolix projects through a single command line from R or Matlab. For that, you can use the following functions (here for R and here for Matlab). To use it
source('runMonolix.R') runMonolix(name, mlxInstallDir, display, saveGraphics, scenario)
where
 name: Define the Mlxtran files to run in absolute path
 If it is a list of Mlxtran files, run them
 If it is a list of directory, run all the Mlxtran files in the folder
 It can be a combination of both
 mlxInstallDir : Define where MonolixSuite is installed
 Typically for windows mlxInstallDir < ‘C:/ProgramData/Lixoft/MonolixSuite2016R1/bin/’
 display : Boolean to define if Monolix calculation is displayed or not (FALSE by default)
 saveGraphics : Boolean to define if the graphics are saved or not (FALSE by default). It is not done by default in the Monolix project configuration.
 scenario : Monolix scenario could be a list of
 saem: estimate population parameters
 indiv: estimate individual parameters
 fim: estimate the standard errors of the estimates and Fisher information matrix
 ll: estimate the loglikelihood
 graphics: generate the result graphics
 By default, the run option is used to get the scenario from the project
Several examples,

source('runMonolix.R') runMonolix(name='C:/Users/jonathan/mydemos/1.creating_and_using_models/libraries_of_models', mlxInstallDir='C:/ProgramData/Lixoft/MonolixSuite2016R1/bin/')
run all the projects in the defined folder. The scenario will be the one defined in each projects

source('runMonolix.R') runMonolix(name='C:/Users/jonathan/mydemos/1.creating_and_using_models/libraries_of_models', mlxInstallDir='C:/ProgramData/Lixoft/MonolixSuite2016R1/bin/',scenario=c('saem','ll'))
run all the projects in the defined folder. The scenario will be the calculation of SAEM and the the LogLikelihood for each projects

source('runMonolix.R') runMonolix(name=c('C:/Users/jonathan/mydemos/1.creating_and_using_models/libraries_of_models', 'C:/Users/jonathan/mydemos/1.creating_and_using_models/libraries_of_models/2.2.censored_data/censoring1_project.mlxtran'), mlxInstallDir='C:/ProgramData/Lixoft/MonolixSuite2016R1/bin/', scenario=c('graphics'), saveGraphics = TRUE)
run all the projects in the defined folder and the other Mlxtran project. The scenario will only be the calculation of the graphics and they will be saved.
5.6.Submission of Monolix analysis to regulatory agencies
Monolix is used for regulatory submissions (including the FDA and the EMA) of population PK and PK/PD analyses. Monolix analyses for first in human dose estimation, dosefinding studies and registration studies have been routinely and successfully submitted to the FDA, EMA [3] and other agencies. Some of the agencies like the FDA or the EMA do have Monolix and the modelling experts to understand, review and run Monolix. Regulators are also taking part in publishing research articles with Monolix [2].
Regulatory guidelines [1] provide only little information on the required electronic files for submission. Based on exchanges with regulatory agencies and confirmed through past regulatory submissions using Monolix, the following listed files (in Table 1) are required for a complete Monolix analysis submission package. The Monolix analysis submission package listed in Table 1 has all the files needed to run Monolix without any implementation work and to reproduce the results. Attention must be payed to use relative file path definitions to facilitate the project transfer from one computer system to another (Monolix documentation). Further, it is important to consider that Table 1 represents the Monolix submission package for a typical analysis. There might be modelling cases that require the submission of additional material. Each submission should be treated separately and carefully reviewed for completeness cases by cases.
Table 1 Monolix analysis submission package
File  Explanation  File type, format 
Report (.pdf)  Report detailing all of the modelling according to the EMA or FDA guidelines [1]  
Data (.csv)  Data file containing all the observations, dosing history, patient specific information and other information provided via the data file  Text file, CSV 
Project file (.mlxtran)  Defines what data file, model file, algorithm settings, graphics settings and parameters have been used to generate the results  Text file, Mlxtran 
Model file (.txt)  The structural model in Mlxtran syntax  Text file, Mlxtran 
Graphics file (.xmlx)  Definition of the diagnostic plot settings  Ext file, XMLX 
In addition, some Monolix files define project independent parameters that are applied to all projects run on the same computer account. In all but the most special cases users will have these parameters set to the default values. In case these parameters are modified, they should also be communicated. They can be found in the following files:
Table 2 Computer account dependent settings
File  Parameter  Explanation 
monolix/config/config.ini  useAnalyticalFunctions=true  Setting defines if analytical functions are used (default is ‘true’). The rest of the file are path configurations for temporary files. 
monolix/config/preferences.xmlx  <dosesToAddForSteadyState value=”5″/>  This defines the number of doses (default is 4) added before the dose entered with the SS flag set to true (so 5 doses in total, by default). It is only used when a SS column is used in the data set. 
A Monolix run will automatically produce a large number of additional files. However, the files listed in Table 1 and Table 2 are sufficient to entirely reproduce the results. Note that all files in Table 1 are in a human readable format. Thus, the information contained in these files can also be included into the appendices of the report creating one single document that contains everything to reproduce the results.
References
[1] Applicable regulatory guidelines
 EMA “Guideline on reporting the results of population pharmacokinetic analyses” (CHMP/EWP/185990/06): details what should be contained in a PK or PK/PD analysis report.
 FDA “Guidance for Industry Population Pharmacokinetics”: takes a larger scope also addressing “Data Integrity and Computer Software’ and ‘Electronic Files”.
 FDA “ExposureResponse Relationships – Study Design, Data Analysis, and Regulatory Applications”.
[2] FDA publications using MonolixSuite
 “Plasma pharmacokinetics of ceftiofur metabolite desfuroylceftiofur cysteine disulfide in holstein steers: application of nonlinear mixedeffects modeling.”, J Vet Pharmacol Ther 2016 Apr;39(2):14956, DOI: 10.1111/jvp.12245. O. A. Chiesa, S. Feng, P. Kijak,E. A. Smith, H. Li and J. Qiu.
 “Quantification of disease progression and dropout for Alzheimer’s disease.”, Int. Journal of Clinical Pharmacology and Therapeutics, Volume 51 – February (120 – 131),(Doi: 10.5414/CP201787). D. WilliamFaltaos, Y. Chen, Y. Wang, J. Gobburu,, and H. Zhu.
 “Estimation of Population Pharmacokinetic Parameters Using MLXTRAN Interpreter in MONOLIX 2.4”, D. William Faltaos, Acop 2009.
[3] Submission with Monolix analyses to the EMA
5.7.Initialization
Introduction
Initial values are specified for the Fixed effects, for the Variances of the prandom effects and for the Residual error parameters. It is usually recommended to run SAEM with different initializations and to compare the results (this can be done via a convergence assessment, accessible through the “Assessment” button in the GUI).
Initialization of the “Fixed effects”
In this frame in the middle of the “Initialization” frame, the user can modify all the initial values of the fixed effects. There is at least on initial values by individual parameters. If no value is predefined (from a previously saved project for example), a default value at 1 is proposed. A right click on an initial value displays a list with three options.
 “Maximum likelihood estimation” (default): The parameter is estimated using maximum likelihood. In that case, the background remains white.
 “Fixed”: the parameter is kept to its initial value and so, it will not be estimated. In that case, the background is set to pink.
 “Bayesian estimation”: a popup windows opens to specify a prior distribution for this parameter. In that case, the background of the case is set to purple. To see more on the prior distribution, click here.
When an individual parameters depends on covariate, initial values for the dependency (named with prefix, for instance to add the depedency of sex, on parameter ka) are displayed. The default initial value is 0. The three same options are also valid for these parameters.
In case of a continuous covariate, the covariate is added linearly to the transformed parameter, with a coefficient . For categorical covariates, the initial value for the reference category will be the one of the fixed effect, while for all other categories it will be the initial value for the fixed effect plus the initial value of the , in the transformed parameter space. It is not possible to define different initial values for the nonreference categories. The equations for the parameters can be visualized by clicking on the white paper clip button next to the model file.
In case of IOV, the same initialization is performed but in the IOV window when clicking on “IOV” on the right of the covariance model.
Initialization of the “Stand. dev. of the random effects”
In this frame in the middle of the “Initialization” frame, the user can modify all the initial values of the standard deviation of the random effects. The default value is set to 1. Each individual parameter has its initial value for the corresponding random effect. However, if the user specifies in the covariance model that a parameter has no variance (put 0 in the corresponding diagonal term of the covariance model), the corresponding standard deviation is set to 0 and the box is grayed to show the user that no modification can be done. A right click on an initial value displays a list with two options: “Maximum likelihood estimation” (default choice), and “Fixed”.
A button is added to offer the possibility to work with either the variance or the standard deviation. Notice that the transformations are done automatically by Monolix.
Initialization of the “Residual error parameters”
In this frame on the right of the “Initialization” frame, the user can modify all the initial values of the residual error parameters. There are as many lines as outputs of the model. The nongrayed values depend on the considered observation model. The default value depends on the parameter (1 for “a”, 0.3 for “b” and 1 for “c”). A right click on an initial value displays a list with two options: “Maximum likelihood estimation” (default choice), and “Fixed”.
How to initialize your parameters?
Check initial fixed effects
When clicking on the “Check the initial fixed effects” button, a popup window opens. The simulations obtained with the initial population fixed effects values are displayed for each individual together with the data points, in case of continuous observations. This feature is very useful to find some “good” initial values. Although Monolix is quite robust with respect to initial parameter values, good initial estimates speed up the estimation. You can change the values of the parameters with the buttons + and – and see how the agreement with the data change.
On the use of Mlxplore
Another convenient method is to use Mlxplore to have an initial guess. For that, you can export your model to Mlxplore (blue button next to the modle file in the Monolix GUI) and play with the initial values to see your model response. In the “Data” tab of the Mlxplore GUI, you can choose to overlay your data on top of the simulation. The main interest, compared to the “check the initial fixed effects” feature, is that you can also modify your structural model.
On the use of last estimates
If you have already estimated the population parameters for this project, then you can use the “Use the last estimates” button to use the previous estimates as initial values.
5.8.How to compute AUC using in Monolix and Mlxtran
Often the Area under the PK curve (AUC) is needed as an important PK metric to link with the pharmacodymanic effect. We show here how the AUC can be computed within the mlxtran model file and be outputted for later analysis.
In the EQUATION section of the Mlxtran model you can use the following code. It is the basic implementation of the AUC for a 1compartmental model with absorption rate ka. It will compute the AUC from the start to the end of the integration.
Notice that the calculation can be used as an output (to be matched to observations of the data set) and thus in the output={} definition, or just computed and available for posttreatment using table={} as can be seen here.
AUC for t=0 to t=tend and fast computation with linear ODE
INPUT: parameter = {ka, V, Cl} PK: compartment(cmt=1, amount=Ac, volume=V, concentration=Cc) elimination(cmt=1, Cl) oral(ka, cmt=1) EQUATION: odeType = linear ddt_AUC = 1/V * Ac OUTPUT: output = {Cc} table = {AUC}
Time interval AUC
If you want to limit the computation of the AUC to a time period you can use the following code:
INPUT: parameter = {ka, V, Cl} PK: compartment(cmt=1, amount=Ac, volume=V, concentration=Cc) elimination(cmt=1, Cl) oral(ka, cmt=1) EQUATION: ddt_AUC = 1/V * Ac if(t < 50) AUC50 = AUC end if(t < 100) AUC100 = AUC end AUC50_100 = AUC100  AUC50 OUTPUT: output = {Cc} table = {AUC50_100}
Note that the t==tDose would not work because the integrator does not necessarily evaluate the time exactly at the times of doses. Thus the test t==tDose might not be tested at all during the computation.
Fast computation with linODE and dose interval AUC (AUCtau)
In many cases the mlxtran code above will be sufficient. For cases where you need fast execution you use a special solver for linear ODE systems by specifying odeType = linear in the EQUATION section. This does not work with if statements and you will need to use the code below. In this code the AUC is computed for each dose interval. Thus, at each dose the AUC is set to zero and the concentration is integrated until the next administration.
INPUT: parameter = {ka, V, Cl} PK: compartment(cmt=1, amount=Ac, volume=V, concentration=Cc) elimination(cmt=1, Cl) oral(ka, cmt=1) ; Create a separate dummy compartment for the AUC compartment(cmt=2, amount=AUCtau) iv(cmt=2, p=  AUCtau/amtDose) EQUATION: odeType=linear ddt_AUCtau = 1/V * Ac OUTPUT: output = {Cc} table = {AUC}
5.9.Output to save
Purpose
When clicking of the “Output to save” button, the user can define the outputs (graphics and tables) that will be saved in the result folder.
Graphics
The list of graphics corresponds to all the graphics that Monolix can generate
 Project summary
 Spaghetti plot: This graphic displays the original data w.r.t. time along with some additional informations.
 Individual fits: This graphics displays the individual fits, using the population parameters with the individual covariates (red) and the individual parameters (green) w.r.t. time on a continuous grid.
 Obs. vs Pred: This graphic displays observations w.r.t. the predictions computed using the population parameters (on the left), and w.r.t. the individual parameters (on the right).
 Covariates: This graphic displays displays the estimators of the individual parameters in the Gaussian space (and those for random effects) w.r.t. the covariates.
 Parameters Distribution: This graphic displays the estimated population distributions of the individual parameters.
 Random Effects (boxplot): This graphic displays the distribution of the random effects with boxplots.
 Random Effects (joint dist.): This graphic displays scatter plots for each pairs of random effects.
 Convergence SAEM
 Residuals
 VPC
 NPC
 BLQ
 Prediction distribution
 Categorized Data
 Time to Event data
 Transition Probabilities
 Posterior distribution
 Contribution to likelihood
To know more about these graphics, see the Graphic section. Moreover, if these graphics are not generated (if not feasible nor requested by the user in the Graphic selection), they will not be saved.
Tables
The list of tables corresponds to all the tables that Monolix can generate. A more detailed description of the content of the tables can be found here.
 Observation times: It generates a file predictions.txt containing Observation times table. It contains the regression variables, the individual predictions, the population predictions, the weighted population residuals, the weighted individual residuals and the NPDE, for all the observation times in the data set.
 Finegrid times: It generates a file finegrid.txt containing Finegrid times table. It contains the regression variables, the individual predictions, the population predictions in a fine (regular) grid.
 All times: It generates a file fulltimes.txt containing All times table. It contains the regression variables, the individual predictions, the population predictions for a possibly larger set of times in the data set, including observation times, lines where MDV column is set to 2.
 LL individuals contribution: it generates a file individualContributionToLL.txt containing LL individuals contribution table. It contains subject contribution to the total loglikelihood.
 Covariates summary: It generates a file covariatesSummary.txt containing Covariates summary table. It contains a summary of all the covariates defined in the project.
5.10.Bayesian feature : on the use of priors on a fixed effect
When you select to define a prior distribution on a fixed effect, a new window will open to define its law as in the following
As for individual parameters, you can choose from some given distributions (normal, lognormal, logitnormal and probitnormal) or you can define your own as a transform T of a Gaussian distributed variable.
Assuming to be the chosen fixed effect with the transformation where is a Gaussian distributed variable. You must specify the typical value of ( and the variance or standard deviation of Z. By default, the current initial value will be used as typical value. Also, selecting a different typical value will set it automatically as initial value for the corresponding parameter. The user can also choose between two estimation methods: M.A.P. and posterior distribution.
Notice that Monolix can estimate the M.A.P only for
 gaussian priors if the parameter is a covariate coefficient (a “”)
 priors with same distribution than the corresponding individual parameter if is an intercept. It means that if V is set as lognormal distributed, then the M.A.P of () can only be computed for lognormal priors on .
5.11.How to export to Datxplore, Mlxpore and Simulx ?
Using Monolix interface, it is possible to export your data set, your model, your project to other Lixoft software.
Export to Datxplore
Datxplore is a graphical and interactive software for the exploration and visualization of data. Datxplore provides various plots and graphics (box plots, histograms, survival curves…) to study the statistical properties of discrete and continuous data, and to analyze the behavior depending on covariates, individuals, etc. It is a great application to have a better look to your data set.
In Monolix, you can export your data set by clicking on the Datxplore icon in the Data and model frame as on the following figure. Datxplore will be directly launched with the defined data set.
Notice that, due to limitation in Datxplore, it is not possible to export data sets with IOV, or washout with an EVID=4.
Export to Mlxplore
Mlxplore is an application for the visual and interactive exploration of your models. It is designed for intuitive and easy use. It is a powerful solution in your daily modeling work as well as for sharing and teaching of PK/PD principles or for real time doseregimen exploration in front of an audience. Mlxplore also allows you to visualize the statistical components of the model, such as the impact of covariates and interindividual variability.
In Monolix, you can export your model (longitudinal and statistical) and the data set by clicking on the Mlxplore icon in the Data and model frame as on the following figure. When clicking on the Mlxplore icon, a window will pop up, where the user can define
 The prediction/simulation grid
 minimum
 step
 maximum
 The considered individual (with its administration design)
 Only the first one
 All individuals
 User choice
After clicking on OK, Mlxplore launches. Thus, you will be able to explore your model and associate it to the data set.
Notice that some cases are not managed by Mlxplore:
 Models with a noncontinuous output
 Models with regressors
 Models with bsmm functions
 Data set with IOV or EVID=4
Another important point is that Mlxplore is only managing predictions. Thus, even if a continuous error model is considered in the project, it will not be taken into account and the associated sliders have no impact on the results.
Moreover, when categorical covariate are considered, the following message appears in the export window
This implies that the categorical covariates are not anymore considered as a category but as a continuous covariate with 0 for the reference category.
Export to Simulx
Simulx is a powerful and flexible simulator for clinical trial pharmacometrics that runs on top of the Lixoft simulation engine. It covers a comprehensive range of data types for simulation and links to a ready to use PK and PD library. It directly connects with Monolix for seamless Modeling and Simulation, and is the statistically most rigorous solution available today. Simulx is currently available via a comprehensive R package thanks to its combination with mlxR, a DDMoResponsored library developed by Inria which contains several R functions for advanced clinical trial simulation.
The simulx function can either directly take the path to a Monolix project as input, or use a model file which can be generated from the Monolix project. This is done using the monolix2simulx function.
Several restrictions:
 Projects with IOV can not be exported.
 Projects with Matlab longitudinal projects can not be exported.
 Projects with several longitudinal models can not be exported.
 Projects with custom distributions for individual parameters can not be exported.
 Projects with bsmm function in the longitudinal model can not be exported.