This case study presents the modeling of **tobramycin pharmacokinetics** and the determination of a priori dosing regimens in patients with various degrees of renal impairment. It takes advantage of the integrated use of Datxplore for data visualization, Mlxplore for model exploration, Monolix for parameter estimation, and Simulx for simulations and best dosing regimen determination.

The case study is presented in 5 sequential parts, which we recommend reading in order:

- Part 1: Introduction
- Part 2: Data visualization with Datxplore
- Part 3: Model development with Monolix
- Part 4: Model exploration with Mlxplore
- Part 5: Simulations for individualized dosing with Simulx and Monolix

**Part 3: Model development with Monolix**

- Set up of a Monolix project: definition of data, structural and statistical model for the initial model
- Running tasks on the initial model
  - Estimation of the population parameters
  - Estimation of the standard errors of the parameters, via the Fisher Information Matrix
  - Estimation of the individual parameters
  - Estimation of the log-likelihood
  - Generation of the diagnostic plots
- Model assessment using the diagnostic plots
- Covariate search
- Test of a two-compartment model
- Identification of the observation error model
- Convergence assessment
- Overview

### Set up of a Monolix project: definition of data, structural and statistical model for the initial model

We first create a new project in Monolix and load the data, assigning the columns as we have done in Datxplore. We then need to **define the model**, both the structural part and the statistical part, including the residual error model, the parameter distributions (variance-covariance matrix) and the covariate model. We will **develop the model step-wise**, starting with the structural part.

As a first guess, we try a **one-compartment model** with bolus administration and linear elimination. This model is available in the **Monolix PK library**, which is accessed by clicking on “Model file > Monolix Library > PK”. A (V,Cl) and a (V,k) parameterization are available. We choose the (V,k) one:

For the **residual error model**, we try a combined model with an additive and a proportional term (model combined1 in the list); we will see later on whether one of the two terms can be dropped.

For the **parameter distributions**, we keep the default: log-normal distributions (“L” letter) for V and k, and no covariance. To start, we also neglect the **covariates**.

Before launching the parameter estimation, it is recommended to find **good initial parameter values with the “Check the initial fixed effects” tool**. In the pop-up window, the simulations with the population parameters are displayed together with the data for each individual. After having increased the number of points per curve in the bottom right corner to smooth the displayed curves (important when the Cmax has to be captured accurately), one can play with the values of V and k. In the present case, because the data is sparse, it is not easy to find good initial values. We thus keep the default V=1 and k=1.

Now that we have specified the data and the model, we save the project as “project_01” (menu: Project > Save). The result text files of all tasks run afterwards will be saved in a folder next to the project file. Graphics figures and specific outputs tables need to be selected in the “Outputs to save” pop-up window in order to be saved as files.

All Monolix project files can be downloaded at the end of this webpage.

### Running tasks

#### Estimation of the population parameters

To run the **maximum likelihood estimation of the population parameters**, click on the “Population” button in the “Tasks” frame. In the pop-up window, you can follow the progression of the parameter estimates over the iterations of the **stochastic approximation expectation-maximization** (**SAEM) algorithm**. This algorithm has been shown to be extremely efficient for a wide variety of possibly complex models. In a first phase (before the red line), the algorithm converges to a neighborhood of the solution, while in the second phase (after the red line) it converges to the maximum likelihood estimate. By default, automatic termination criteria are used. In the current case, even though the initial parameter values were pretty far from the final estimates, the algorithm shows good convergence:

The **estimated population parameter values** can be accessed by clicking on “last results” in the main window.

#### Estimation of the standard errors, via the Fisher Information Matrix

To calculate **standard errors** of the population parameter estimates, one can calculate the Fisher Information Matrix (FIM). Two options are proposed: via linearization or via stochastic approximation. The linearization method is faster but the stochastic approximation is usually more precise. As our model runs fast, we can afford to select the stochastic approximation option and click on the icon to calculate the FIM. After the calculation, the standard errors are added to the “last estimates” file:

#### Estimation of the individual parameters

The **individual parameters** can also be estimated (via MCMC): either the mean and standard deviation of the conditional probability \(p(\psi_i|y_i;\hat{\theta})\), or only its mode (faster). An approximation of the mean of the conditional probability has already been calculated using the last steps of the SAEM algorithm, but the dedicated estimation via MCMC is more accurate. With the individual parameter estimates, the individual predictions can be compared to the data of each individual, for instance to detect possible model mis-specifications.
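As a toy illustration of what estimating a conditional distribution by MCMC involves, here is a minimal Metropolis-Hastings sketch for a one-observation, one-random-effect model. All numerical values are hypothetical, and this is not Monolix's actual sampler; the point is only that MCMC draws recover the conditional mean of \(p(\eta|y)\):

```python
import math
import random

# Toy analogue of estimating p(eta | y; theta) by MCMC: one subject, one
# observation y = eta + e, with eta ~ N(0, omega^2) and e ~ N(0, sigma^2).
# All values below are hypothetical.
omega, sigma, y = 0.5, 0.3, 0.8

def log_post(eta):
    """Unnormalized log of p(eta | y) = p(y | eta) * p(eta)."""
    return -0.5 * ((y - eta) / sigma) ** 2 - 0.5 * (eta / omega) ** 2

def mh_sample(n_iter=50000, step=0.5, seed=1):
    """Random-walk Metropolis-Hastings targeting p(eta | y)."""
    random.seed(seed)
    eta, samples = 0.0, []
    for _ in range(n_iter):
        prop = eta + random.gauss(0.0, step)
        diff = log_post(prop) - log_post(eta)
        if diff > 0 or random.random() < math.exp(diff):
            eta = prop
        samples.append(eta)
    return samples[5000:]  # discard burn-in

draws = mh_sample()
post_mean = sum(draws) / len(draws)

# Conjugate-normal exact conditional mean, for comparison:
exact_mean = y * omega**2 / (omega**2 + sigma**2)
```

Because the toy model is conjugate, the MCMC posterior mean can be checked against the closed-form value; in the real mixed-effects setting no such closed form exists, which is why sampling is needed.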

#### Estimation of the log-likelihood

For the comparison of this model to more complex models later on, we can **estimate the log-likelihood**, either via linearization (faster) or via importance sampling (more precise). Clicking on the likelihood estimation icon after having selected the “importance sampling” option, we obtain a -2*log-likelihood of 669, printed in “last estimates”.
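The importance-sampling idea can be illustrated on a toy model where the marginal likelihood is known in closed form. Everything below is hypothetical and Monolix's actual implementation is more sophisticated (e.g. it uses a learned proposal per individual); here the proposal is simply the prior of the random effect:

```python
import math
import random

# Toy illustration of log-likelihood estimation by importance sampling:
# y = eta + e with eta ~ N(0, omega^2), e ~ N(0, sigma^2), so the exact
# marginal is y ~ N(0, omega^2 + sigma^2).  All values are hypothetical.
omega, sigma, y = 0.5, 0.3, 0.8

def norm_pdf(x, sd):
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def ll_importance_sampling(n=200000, seed=2):
    """Monte-Carlo estimate of log p(y) = log E_eta[ p(y|eta) ]."""
    random.seed(seed)
    total = 0.0
    for _ in range(n):
        # Proposal q = prior of eta, so the weights p(eta)/q(eta) are 1
        eta = random.gauss(0.0, omega)
        total += norm_pdf(y - eta, sigma)  # p(y | eta)
    return math.log(total / n)

ll_is = ll_importance_sampling()
ll_exact = math.log(norm_pdf(y, math.sqrt(omega**2 + sigma**2)))
```

With enough samples the stochastic estimate converges to the exact marginal log-likelihood, which is what makes importance sampling "more precise" than a linearization at the cost of run time.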

#### Generation of the diagnostic plots

To assess the quality of the model, we finally can generate **built-in graphics**. Clicking on “Select plots”, one can see the list of possible graphics; we keep the default “reduced” list. By clicking on the Graphics icon, the figures are automatically generated.

### Model assessment using the diagnostic plots

In the **“individual fits” plot**, it seems that the spikes corresponding to the bolus administrations are not well captured, due to an insufficient number of drawing points. To improve this, go to the menu bar at the top, Settings > Results, change the “grid size” to 1000 and generate the graphics again. In the “Individual Fits” plot, open the settings by clicking on the “Settings” menu at the top in order to browse through all individuals. Using the Zoom (in the menu bar), one can also zoom in on the zone of interest, which is particularly useful for such sparse data sets. For most individuals the fit looks good; no striking structural model mis-specification can be detected. As an example, here is a zoom on the last time points for individuals 90 and 91:

The **“Obs. vs Pred.” plot** using the predictions with individual parameters also looks good:

In the **“Parameter Distribution”** plot, the theoretical pre-selected parameter distributions (using the estimated population parameters) can be compared to the empirical distribution (histogram). In the “Settings”, if the “simulated parameters” option is chosen (default), the individual parameters for the histogram are drawn from the conditional probability \(p(\psi_i|y_i;\hat{\theta})\) to obtain an unbiased estimator of \(p(\psi_i)\). This permits the detection of mis-specifications of the parameter distributions; none is detected here (plot on the left). In contrast, if the empirical distribution of the estimated individual parameters (mean or mode option) is plotted, the eta-shrinkage can be seen (click on “information” in the “Settings” pop-up window to display it). The large eta-shrinkage for V means that the data do not allow us to correctly recover the individual volumes (plot on the right).
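The shrinkage mechanism can be reproduced in a toy setting: with a single noisy observation per subject, the conditional-mean estimates are pulled toward zero, so their spread underestimates the true omega. All values below are hypothetical, and the shrinkage definition used (1 − sd(EBE)/omega) is one common convention:

```python
import math
import random

# Toy illustration of eta-shrinkage.  Each subject has one observation
# y = eta + e, eta ~ N(0, omega^2), e ~ N(0, sigma^2).
# omega and sigma are hypothetical values.
omega, sigma, n_sub = 0.5, 0.5, 20000
random.seed(3)

ebes = []
for _ in range(n_sub):
    eta = random.gauss(0.0, omega)         # true random effect
    yobs = eta + random.gauss(0.0, sigma)  # single noisy observation
    # Conjugate-normal conditional mean: the "EBE" in this toy setting,
    # shrunk toward 0 by the factor omega^2 / (omega^2 + sigma^2)
    ebes.append(yobs * omega**2 / (omega**2 + sigma**2))

sd_ebe = math.sqrt(sum(e * e for e in ebes) / n_sub)
shrinkage = 1.0 - sd_ebe / omega
# Analytic value here: 1 - omega / sqrt(omega^2 + sigma^2) ≈ 0.29
```

The sparser or noisier the individual data, the stronger the pull toward the population mean, which is exactly why the "simulated parameters" option (sampling from the conditional distribution instead of using its mean or mode) gives undistorted histograms.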

In the **“Random Effects (boxplot)”** graphic, the standardized random effects plotted as a boxplot are in agreement with the expected standard normal distribution (pink lines represent the median and quartiles). So there is no reason to reject the model for the random effects:

In the **“Random effects (joint dist.)”** figure, the random effects of the two parameters are plotted against each other. No clear correlation can be seen.

In the **“Covariates”** figure, the transformed parameters (log(V) and log(k) as we have chosen log-normal distributions) are plotted against the covariate values. Notice that we use simulated individual parameters. This means that instead of using the mode or mean from the post-hoc estimation of each individual parameter, we are sampling from its distribution. This is critical because it is known that the plots using the mode or mean of the post-hoc distribution often show non-existing correlations (see Lavielle, M. & Ribba, B. Pharm Res (2016)).

The covariates figure shows that log(k) decreases with AGE and increases with CLCR. It is possible to display the **correlation coefficients** in the Settings pop-up window, by selecting “Informations”. Remember that we had noticed with Datxplore that there was a negative correlation between AGE and CLCR. From a **biological point of view**, a correlation between Tobramycin clearance and creatinine clearance makes complete sense, as tobramycin is mainly cleared via the kidney.

**In summary,** the graphics indicate no mis-specification of the structural model or of the residual error model. Yet, the covariates figure suggests adding CLCR as a covariate on the elimination rate k. So the current model can be rejected based on the graphics alone, **without the need to calculate and compare the log-likelihood**.

### Covariate search

In the covariate figure, the increase of log(k) with CLCR is clearly not linear. Clicking on “**Transform**” in the main window, we thus transform the CLCR covariate using a log-transformation centered around the mean, and call it logCLCR (screenshot below).

We then add logCLCR as covariate to k, by clicking on the **covariate matrix**. **To see the equations** associated with the parameter distribution, you can click on the paper clip button next to the “Model file” field:

Note that the transformed covariates are added linearly to the transformed parameters. In comparison to NONMEM, where the equations for the parameters are written directly, here we need to think in terms of transformed parameters. We would like to implement the following parameter distribution for the elimination rate (a log-normal distribution as before, now with CLCR as a covariate):

$$k_i = k_{pop} (\frac{CLCR_i}{\textrm{mean}(CLCR)})^{\beta} e^{\eta_i}$$

with \(\eta\) drawn from a normal distribution of mean 0 and standard deviation \(\omega_{k}\).

In NONMEM, one would typically write:

```
TVK = THETA(1)*((CLCR/100)**THETA(2))
K = TVK * EXP(ETA(1))
```

with THETA(1)=k_pop, THETA(2)=beta and std(ETA(1))= omega_k.

In Monolix, we think in terms of transformed variables:

$$\log(k_i)=\log(k_{pop})+\beta \times \log(CLCR_i/(\textrm{mean}(CLCR))) + \eta_i$$

Because covariates are added linearly to the transformed parameter \(\log(k_i)\), one needs to transform the covariate by applying a log-transformation too. Without the log-transformation of the covariate, the model would be:

$$\log(k_i)=\log(k_{pop})+\beta \times CLCR_i + \eta_i$$

i.e.,

$$k_i = k_{pop} e^{\beta \times CLCR_i} e^{\eta_i}$$
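The equivalence between the multiplicative covariate model and Monolix's log-linear form can be checked numerically. All values below (k_pop, beta, eta, the covariate values) are hypothetical, chosen only to exercise the two formulas:

```python
import math

# Check that the multiplicative covariate model and its log-linear form
# are the same model.  All numerical values are hypothetical.
k_pop, beta, eta_i = 0.25, 0.9, 0.1
clcr_i, clcr_mean = 80.0, 68.0

# Multiplicative form: k_i = k_pop * (CLCR_i/mean)^beta * exp(eta_i)
k_mult = k_pop * (clcr_i / clcr_mean) ** beta * math.exp(eta_i)

# Log-linear form applied to the transformed parameter, as in Monolix:
# log(k_i) = log(k_pop) + beta * log(CLCR_i/mean) + eta_i
log_k = math.log(k_pop) + beta * math.log(clcr_i / clcr_mean) + eta_i
k_loglin = math.exp(log_k)
```

This is why the covariate must be log-transformed first: adding the raw CLCR linearly to \(\log(k_i)\) would instead produce the exponential dependence shown above.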

After having added logCLCR as a covariate on log(k), and before running the tasks, we click **“use the last estimates”** to re-use the last estimated parameter values as the starting point for SAEM, and save the project as project_02. We then select the following tasks and click on run to launch the sequence of tasks (parameter estimation, FIM, individual parameters, graphics).

Opening the “last results”, we see that the **p-value** associated with the beta_k_logCLCR parameter is below 1e-10, suggesting that this parameter is significantly different from 0. **The effect of CLCR on k is pretty large**: for an individual with CLCR=40 mL/min (moderate renal impairment) k=0.12 /min, while for an individual with CLCR=100 mL/min (normal renal clearance) k=0.27 /min. In addition, the standard deviation of k has decreased from 0.56±0.05 to 0.32±0.03, as part of the inter-individual variability in k is now explained by the covariate.
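As a sanity check of the reported effect size, the exponent beta can be back-calculated from the two quoted k values, assuming the power model on CLCR introduced above:

```python
import math

# Back-of-the-envelope check of the reported covariate effect, using the
# power model k = k_pop * (CLCR/mean(CLCR))^beta.  The two k values are
# those quoted in the text; beta is back-calculated from them.
k_40, k_100 = 0.12, 0.27  # k at CLCR = 40 and 100 mL/min
beta = math.log(k_100 / k_40) / math.log(100.0 / 40.0)
# beta comes out close to 1, i.e. close to a proportional dependence of
# the elimination rate on creatinine clearance
```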

In the covariates graphic, we choose to plot the random effects rather than the individual parameters (Settings > Random effects) to see if there are remaining correlations. No correlation is seen anymore between eta_k and AGE, as expected since AGE and CLCR are strongly negatively correlated. It now appears that eta_k is correlated with the weight WT.

As the next step, we can **add either WT or logWT on log(k)**. We choose to log-transform and center WT, and add it as a covariate on log(k). We save the project as project_03 and run the sequence of tasks again. beta_k_logWT appears to be significantly different from 0 (p-value 0.0093). The covariate figure now suggests also adding logCLCR on the volume V. We do so and save as project_04. This time the **Wald test** suggests that logCLCR on log(V) is not significant (p-value=0.39). We thus remove logCLCR from log(V). Although the covariate figure doesn’t hint at it, it is common to have the volume V correlated with the weight WT. We try this in project_05, and the covariate is significant. We thus end with logWT on log(V) and log(k), and logCLCR on log(k). **Note that the statistical tests and graphics give indications about which model to choose, but the final decision lies with the modeler.**

Before continuing, let’s **calculate the log-likelihood** (via importance sampling) of this one-compartment model. The results are the following:

### Test of a two-compartment model

While visualizing the data set with Datxplore, we had noticed that some individuals were hinting at a peripheral compartment. We thus **change the structural model to the bolus_2cpt_Vkk12k21 model from the library**. As our data is sparse, we choose to remove the variability on k12 and k21 by clicking on the diagonal elements of the variance-covariance matrix. We save as project_06.

To choose initial values for k12 and k21, the “Check the initial fixed effects” tool is again not very helpful in this case. At this stage, we would like **to better understand the impact of extending a (V,k) model to a (V,k,k12,k21) model**. We thus open **Mlxplore** by clicking on the blue button next to the “Model file” field.

Click here to continue to the Mlxplore part of the case study.

Having better understood the influence of the model parameters, we go back to the Monolix interface, set the k12 and k21 initial values to the same order of magnitude as k (k12=k21=0.1), and launch the sequence of tasks, including the log-likelihood.

In the result file, we see that: **(i) the model with two compartments is significantly better (ΔAIC=32 and ΔBIC=27), and (ii) all covariates are still significant.**
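For reference, the AIC/BIC bookkeeping behind such a comparison can be sketched as follows. The -2LL drop and the subject count below are hypothetical placeholders chosen only to illustrate the formulas, not the actual run results:

```python
import math

# Information criteria from the -2*log-likelihood and parameter count.
def aic(minus2ll, n_par):
    return minus2ll + 2 * n_par

def bic(minus2ll, n_par, n_sub):
    return minus2ll + n_par * math.log(n_sub)

# Going from (V,k) to (V,k,k12,k21) without IIV on k12/k21 adds 2 fixed
# effects.  Hypothetical values: the richer model lowers -2LL by 36, and
# the data set is assumed to contain 97 subjects.
delta_m2ll, extra_par, n_sub = 36.0, 2, 97

delta_aic = aic(-delta_m2ll, extra_par) - aic(0.0, 0)
delta_bic = bic(-delta_m2ll, extra_par, n_sub) - bic(0.0, 0, n_sub)
# Negative deltas favor the richer model: the -2LL gain must outweigh a
# penalty of 2 per extra parameter (AIC) or log(n_sub) per parameter (BIC).
```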

**Could we have seen in the diagnostic plots of the one-compartment model that it was mis-specified?** Looking at the “Obs. vs Pred.” figure in log-log scale (Axes > log-log), we can see that the one-compartment model under-predicts the low concentrations, which is no longer the case for the two-compartment model:

### Identification of the observation error model

Until now, we have used a combined error model. The constant term “a” is estimated to be 0.001, which is very small compared to the smallest observation (around 0.1). It thus makes sense to try an error model with a proportional term only (project_07). The proportional model appears to be sufficient (ΔAIC=1.47 and ΔBIC=4.04).
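A quick calculation shows why the additive term is negligible here. Only a = 0.001 and the smallest observation are taken from the text; the proportional coefficient b is a hypothetical value, and the combined1 residual standard deviation a + b*f is as defined in Monolix:

```python
# Compare the residual standard deviations of the combined and
# proportional error models at the smallest observed concentration.
a = 0.001   # estimated additive term (from the text)
b = 0.1     # proportional term: hypothetical value
f = 0.1     # smallest observation in the data set

sd_combined = a + b * f       # combined1: y = f + (a + b*f)*e
sd_proportional = b * f       # proportional: y = f + b*f*e

rel_diff = (sd_combined - sd_proportional) / sd_proportional
# Even at the lowest concentration, where the additive term matters most,
# it changes the residual s.d. by only a/(b*f) = 10%, so dropping it is
# a reasonable simplification.
```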

### Convergence assessment

As a final check, we can do a **convergence assessment**, to verify that starting from different initial parameters or from different random number seeds leads to the same parameter estimates. For this, we select in the task bar the following tasks: population parameters, Fisher information matrix and log-likelihood (but we untick the individual parameters and the graphics, as they are not useful here). We then click on “Assessment”, select 5 replicates with randomly picked initial values for all parameters, and run:

The displayed figures show a **good convergence of all replicates to the same minimum**:

### Final model overview

Our final model is the following:

- 2-compartment model, parametrized as (V,k,k12,k21)
- the residual error model is proportional
- k and V have inter-individual variability. Their distributions are defined by:

$$k_i=k_{pop}\left(\frac{CLCR_i}{67.87}\right)^{\beta_{k,CLCR}}\left(\frac{WT_i}{66.47}\right)^{\beta_{k,WT}} e^{\eta_{k,i}}$$

$$V_i=V_{pop}\left(\frac{WT_i}{66.47}\right)^{\beta_{V,WT}} e^{\eta_{V,i}}$$
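These covariate models can be sketched as a function mapping individual covariates to individual parameters. The population values, betas and omegas below are hypothetical placeholders; only the covariate means (67.87 for CLCR, 66.47 for WT) are taken from the equations above:

```python
import math

# Hypothetical placeholder values for the population parameters:
k_pop, V_pop = 0.25, 20.0
beta_k_CLCR, beta_k_WT, beta_V_WT = 0.9, 0.5, 0.9

def individual_parameters(clcr, wt, eta_k=0.0, eta_V=0.0):
    """Final covariate model: power laws on centered covariates, with
    log-normal random effects (eta = 0 gives the typical individual)."""
    k = (k_pop
         * (clcr / 67.87) ** beta_k_CLCR
         * (wt / 66.47) ** beta_k_WT
         * math.exp(eta_k))
    V = V_pop * (wt / 66.47) ** beta_V_WT * math.exp(eta_V)
    return k, V

# Typical individual with the average CLCR and WT recovers the
# population values exactly:
k_typ, V_typ = individual_parameters(clcr=67.87, wt=66.47)
```

By construction, a patient with higher creatinine clearance gets a higher typical elimination rate, which is the covariate effect driving the renal-impairment dosing adjustments explored in Part 5.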

**The estimated parameters and their standard errors are:**

Click here to download all Monolix project files.

Click here to continue to Part 5: Simulations for individualized dosing with Simulx and Monolix