buildmlx uses SAMBA (Stochastic Approximation for Model Building Algorithm), an iterative procedure to accelerate and optimize the process of model building by identifying at each step how best to improve some of the model components. This method allows to find the optimal statistical model which minimizes some information criterion in very few steps.
Penalization criterion can be either a custom penalization of the form \(\gamma\)*(number of parameters), AIC (\(\gamma=2\)), BIC (\(\gamma=\log(N)\)), or BICc (\(\gamma=\log(N)\) for parameters related to the individual parameters and \(\gamma=\log(n)\) for parameters related to the observations).
Several strategies can be used for building the covariate model at each iteration of the algorithm: direction=“full” means that all the possible models are compared (default when the number of covariates is less than n.full). Otherwise, direction is the mode of stepwise search of stepAIC {MASS}, can be one of “both”, “backward”, or “forward”, with a default of “both” when there are at least n.full) covariates (n.full=10 by default).
It is possible to add some prior information on the models, either by introducing weights (which will then modify the penalties associated i) with the relationships between covariates and individual parameters, ii) with the correlations between random effects), or by introducing a priori probabilities for each of these components of the model. Note that these two approaches are in fact equivalent. For example, using a BIC (or BICc) penalty is equivalent to introducing an a priori probability equal to \(1/(1+\sqrt{N})\) for each relationship between a covariate and and an individual parameter, and for each correlation.
When the criteria to optimize is computed, i.e. when ll=T, some additional statistical tests are performed in order to look for possible improvements of the covariate model and the correlation model.
buildmlx <- function(project=NULL, final.project=NULL, model="all", prior=NULL, weight=NULL, coef.w1=0.5,
paramToUse="all", covToTest="all", covToTransform="none", center.covariate=FALSE,
criterion="BICc", linearization=FALSE, ll=T, test=T,
direction=NULL, steps=1000, n.full=10,
max.iter=20, explor.iter=2, fError.min=1e-3,
seq.cov=FALSE, seq.cov.iter=0, seq.corr=TRUE,
p.max=0.1, p.min=c(0.075, 0.05, 0.1),
print=TRUE, nb.model=1)
a string: the initial Monolix project
a string: the final Monolix project (default adds "_built" to the original project)
components of the model to optimize c("residualError", "covariate", "correlation"), (default="all")
list of prior probabilities for each component of the model (default=NULL)
list of penalty weights for each component of the model (default=NULL)
multiplicative weight coefficient used for the first iteration only (default=0.5)
list of parameters possibly function of covariates (default="all")
components of the covariate model that can be modified (default="all")
list of (continuous) covariates to be log-transformed (default="none")
TRUE/FALSE center the covariates of the final model (default=FALSE)
penalization criterion to optimize c("AIC", "BIC", "BICc", gamma) (default=BICc)
TRUE/FALSE whether the computation of the likelihood is based on a linearization of the model (default=FALSE)
TRUE/FALSE compute the observe likelihood and the criterion to optimize at each iteration
TRUE/FALSE perform additional statistical tests for building the model (default=TRUE)
method for covariate search c("full", "both", "backward", "forward"), (default="full" or "both")
maximum number of iteration for stepAIC (default=1000)
maximum number of covariates for an exhaustive comparison of all possible covariate models (default=10)
maximum number of iterations (default=20)
number of iterations during the exploratory phase (default=2)
minimum fraction of residual variance for combined error model (default = 1e-3)
TRUE/FALSE whether the covariate model is built before the correlation model
number of iterations before building the correlation model (only when seq.cov=F, default=0)
TRUE/FALSE whether the correlation model is built iteratively (default=TRUE)
maximum p-value used for removing non significant relationships between covariates and individual parameters (default=0.1)
vector of 3 minimum p-values used for testing the components of a new model (default=c(0.075, 0.05, 0.1))
TRUE/FALSE display the results (default=TRUE)
number of models to display at each iteration (default=1)
library(Rsmlx)
We select a Monolix project where the initial statistical model is defined
warfPK.project <- "projects/warfarinPK.mlxtran"
The structural PK model is a one compartment model for oral administration with a lag time. There are three individual covariates in the data: weight, sex and age.
The statistical model is a basic model, without any covariates, without any correlation between random and with a constant error model. The four individual PK parameters \(Tlag\), \(ka\), \(V\) and \(Cl\) are lognormally distributed.
We can now use buildmlx
for building the statistical
model:
warfPK.res1 <- buildmlx(warfPK.project)
##
## --------------------------------------------------
##
## Building:
## - The covariate model
## - The correlation model
## - The residual error model
##
## __________________________________________________
## - - - Initialization - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 0
## Cl 0 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined1"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 676.94 691.60 703.86 0.11
## __________________________________________________
## - - - Iteration 1 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 1 0 1
## Cl 0 1 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 1 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 642.32 662.84 675.10 0.08
## __________________________________________________
## - - - Iteration 2 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 0 1 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 2 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 641.91 660.96 673.22 0.11
## __________________________________________________
## - - - Iteration 3 - - -
##
## No difference between two successive iterations
## __________________________________________________
## - - - Further tests - - -
## __________________________________________________
##
## Final statistical model:
##
## Covariate model:
## age sex wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 1 0 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 641.91 660.96 673.22 0.11
##
## total time: 25.1s
## __________________________________________________
The output of buildmlx
contains the name of the new
project and the new statistical model
print(names(warfPK.res1))
## [1] "project" "niter" "time"
## [4] "covariate.model" "correlation.model" "error.model"
## [7] "change" "weight" "covToTest"
print(warfPK.res1$project)
## [1] "projects/warfarinPK_built.mlxtran"
print(warfPK.res1$covariate.model)
## $Tlag
## sex age wt
## FALSE FALSE FALSE
##
## $ka
## sex age wt
## FALSE FALSE FALSE
##
## $V
## sex age wt
## FALSE FALSE TRUE
##
## $Cl
## sex age wt
## FALSE TRUE TRUE
print(warfPK.res1$correlation.model)
## NULL
print(warfPK.res1$error.model)
## y1
## "combined2"
print(warfPK.res1$niter)
## [1] 3
print(warfPK.res1$time)
## elapsed
## 25.12
Remarks:
The new Monolix project tt>warfarinPK_built.mlxtran
created by buildmlx
can now be loaded and used in
Monolix.
A subfolder buildmlx
is created in the result folder
warfarinPK
of the original project. All the intermediary
runs and a summary are stored in this folder.
In project warfarinPK_built.mlxtran, volume is function of weight. Since \(V_i\) is log-normally distributed, \(\log(V_i)\) is a linear function of \(W_i\) in this model: \[ \log(V_i) = \log(V_{\rm pop}) + \beta_{V,W} \left({W_i}-{W_{\rm pop}} \right) + \eta_{V,i} \] where \(W_{\rm pop}=\bar{W}\) is the empirical mean weight (in other words, continuous covariates are automatically centered by their mean values). We will see below how to transform continuous covariates automatically.
when using ll = TRUE (default), buildmlx
combines SAMBA with statistical tests for building the statistical
model. When using ll = FALSE, basic SAMBA is used: the
objective function is not computed anymore at each iteration and the
tests are not performed. The convergence is faster but the model
obtained may be slightly less good than the one obtained with ll =
TRUE.
We can select the list of covariates to test (the original covariate model is used for the other ones)
warfPK.res2 <- buildmlx(project=warfPK.project, covToTest = "wt")
##
## --------------------------------------------------
##
## Building:
## - The covariate model
## - The correlation model
## - The residual error model
##
## __________________________________________________
## - - - Initialization - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 0
## Cl 0 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined1"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 676.94 691.60 703.86 0.11
## __________________________________________________
## - - - Iteration 1 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 0 0 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 1 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 644.95 662.54 674.80 0.12
## __________________________________________________
## - - - Iteration 2 - - -
##
## No difference between two successive iterations
## __________________________________________________
## - - - Further tests - - -
## __________________________________________________
##
## Final statistical model:
##
## Covariate model:
## age sex wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 0 0 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 644.95 662.54 674.80 0.12
##
## total time: 14.3s
## __________________________________________________
We can select the list of parameters for which a covariate model is built
warfPK.res3 <- buildmlx(project=warfPK.project, paramToUse =c("ka", "Cl"))
##
## --------------------------------------------------
##
## Building:
## - The covariate model
## - The correlation model
## - The residual error model
##
## __________________________________________________
## - - - Initialization - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 0
## Cl 0 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined1"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 676.94 691.60 703.86 0.11
## __________________________________________________
## - - - Iteration 1 - - -
##
## Covariate model:
## sex age wt
## ka 0 0 0
## Cl 0 1 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 1 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 665.86 683.45 695.72 0.14
## __________________________________________________
## - - - Iteration 2 - - -
##
## No difference between two successive iterations
## __________________________________________________
## - - - Further tests - - -
## __________________________________________________
##
## Final statistical model:
##
## Covariate model:
## age sex wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 0
## Cl 1 0 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 665.86 683.45 695.72 0.14
##
## total time: 17.9s
## __________________________________________________
Instead of considering that the individual log-parameters are linear functions of the original covariates, we can consider a linear relationship between log-covariates and log-parameters, such as, for instance: \[ \log(V_i) = \log(V_{\rm pop}) + \beta_{V,W} \log\left({W_i}/{W_{\rm pop}} \right) + \beta_{V,A} \log\left({A_i}/{A_{\rm pop}} \right) + \eta_{V,i} \] Another equivalent representation of this model writes \[ V_i = V_{\rm pop}\left(\frac{W_i}{W_{\rm pop}} \right)^{\beta_{V,W}}\left(\frac{A_i}{A_{\rm pop}} \right)^{\beta_{V,A}}e^{\eta_{V,i}} \]
In this example, additional covariates \({\rm lwt}_i=\log({W_i}/{\bar{W}})\) and \({\rm lage}_i=\log({A_i}/{\bar{A}})\), where \(\bar{W}\) and \(\bar{A}\) are the empirical means of \((W_i)\) and \((A_i)\), are automatically created,
warfPK.res4 <- buildmlx(project=warfPK.project, covToTransform = "all")
##
## --------------------------------------------------
##
## Building:
## - The covariate model
## - The correlation model
## - The residual error model
##
## __________________________________________________
## - - - Initialization - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 0
## Cl 0 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined1"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 676.94 691.60 703.86 0.11
##
## Estimation of the population parameters using the transformed covariates ...
## Sampling of the conditional distribution using the the transformed covariates ...
## __________________________________________________
## - - - Iteration 1 - - -
##
## Covariate model:
## sex age wt logtAge logtWt
## Tlag 0 0 0 0 0
## ka 0 0 0 0 0
## V 0 0 0 0 1
## Cl 0 0 0 1 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 1 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 640.48 659.53 671.80 0.11
## __________________________________________________
## - - - Iteration 2 - - -
##
## No difference between two successive iterations
## __________________________________________________
## - - - Further tests - - -
## __________________________________________________
##
## Final statistical model:
##
## Covariate model:
## age logtAge logtWt sex wt
## Tlag 0 0 0 0 0
## ka 0 0 0 0 0
## V 0 0 1 0 0
## Cl 0 1 1 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 640.48 659.53 671.80 0.11
##
## total time: 27.8s
## __________________________________________________
The selected model \(\hat{\cal M}\) minimizes a penalized version of the log-likelihood: \[ U({\cal M}) = - 2\log (\ell({\cal M})) + {\rm pen}({\cal M})\]
The penalty term \({\rm pen}({\cal M})\) depends on the criterion. Let \(d_{\cal M}\) be the number of parameters used in model \({\cal M}\). Then, \({\rm pen}_{AIC}({\cal M}) = 2 d_{\cal M}\) and \({\rm pen}_{BIC}({\cal M}) = \log(N) d_{\cal M}\), where \(N\) is the number of individuals.
The corrected BIC (BICc) combines the number of individuals \(N\) and the total number of observations \(n\) as follows: \({\rm pen}_{BICc}({\cal M}) = \log(N) d_{\cal M}^{(1)} + \log(n) d_{\cal M}^{(2)}\) where
warfPK.res5 <- buildmlx(project=warfPK.project, criterion="AIC")
##
## --------------------------------------------------
##
## Building:
## - The covariate model
## - The correlation model
## - The residual error model
##
## __________________________________________________
## - - - Initialization - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 0
## Cl 0 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined1"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 676.94 691.60 703.86 0.11
## __________________________________________________
## - - - Iteration 1 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 1 0 1
## Cl 0 1 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 1 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 642.32 662.84 675.10 0.08
## __________________________________________________
## - - - Iteration 2 - - -
##
## No difference between two successive iterations
## __________________________________________________
## - - - Further tests - - -
## _______________________
## Remove parameters/covariates relationships:
## coefficient p.value
## 2 beta_V_sex_1 0.144528
##
## Run scenario for model 3 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 642.03 661.08 673.34 0.09
## __________________________________________________
##
## Final statistical model:
##
## Covariate model:
## age sex wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 1 0 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 642.03 661.08 673.34 0.09
##
## total time: 24.7s
## __________________________________________________
Prior information about the statistical model can be introduced either by defining prior probabilities or introducing a weighting for the penalty term.
In this example, relationships between covariates and parameters are strongly penalized (weight=2) while correlations are lightly penalized (weight=0.1).
warfPK.res6 <- buildmlx(project=warfPK.project, weight=list(covariate=2, correlation=0.1), test=F, ll=F)
##
## --------------------------------------------------
##
## Building:
## - The covariate model
## - The correlation model
## - The residual error model
##
## __________________________________________________
## - - - Initialization - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 0
## Cl 0 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined1"
## __________________________________________________
## - - - Iteration 1 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 0 1 1
##
## Correlation model:
## [[1]]
## [1] "Cl" "ka" "V"
##
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 1 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## __________________________________________________
## - - - Iteration 2 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 0 0 0
##
## Correlation model:
## [[1]]
## [1] "Cl" "ka" "V"
##
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 2 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## __________________________________________________
## - - - Iteration 3 - - -
##
## No difference between two successive iterations
## __________________________________________________
##
## Final statistical model:
##
## Covariate model:
## age sex wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 0 0 0
##
## Correlation model:
## [[1]]
## [1] "Cl" "ka" "V"
##
##
## Residual error model:
## y1
## "combined2"
##
## total time: 19.7s
## __________________________________________________
`
Different weights can be used by defining weight matrices.
In this example, relationship between V and SEX is privileged while that between V and WT is strongly penalized:
library(lixoftConnectors)
initializeLixoftConnectors()
loadProject(warfPK.project)
w.cov <- do.call(rbind, getIndividualParameterModel()$covariateModel)
w.cov[] <- 1
w.cov['V', 'sex'] <- 0.2
w.cov['V', 'wt'] <- 5
print(w.cov)
## sex age wt
## Tlag 1.0 1 1
## ka 1.0 1 1
## V 0.2 1 5
## Cl 1.0 1 1
Correlations between \(\eta_V\) and \(\eta_{Cl}\) and between \(\eta_{ka}\) and \(\eta_{Tlag}\) are both privileged:
n.param <- names(which(getIndividualParameterModel()$variability$id))
d.param <- length(n.param)
w.cor <- matrix(1, nrow=d.param, ncol=d.param, dimnames=list(n.param, n.param))
w.cor['V','Cl'] <- w.cor['Cl','V'] <- 0.1
w.cor['Tlag','ka'] <- w.cor['ka','Tlag'] <- 0.01
print(w.cor)
## Tlag ka V Cl
## Tlag 1.00 0.01 1.0 1.0
## ka 0.01 1.00 1.0 1.0
## V 1.00 1.00 1.0 0.1
## Cl 1.00 1.00 0.1 1.0
warfPK.res7 <- buildmlx(project=warfPK.project, weight=list(covariate=w.cov, correlation=w.cor), test=F, ll=F)
##
## --------------------------------------------------
##
## Building:
## - The covariate model
## - The correlation model
## - The residual error model
##
## __________________________________________________
## - - - Initialization - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 0
## Cl 0 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined1"
## __________________________________________________
## - - - Iteration 1 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 1 0 1
## Cl 0 1 1
##
## Correlation model:
## [[1]]
## [1] "Cl" "V"
##
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 1 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## __________________________________________________
## - - - Iteration 2 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 1 0 0
## Cl 0 1 1
##
## Correlation model:
## [[1]]
## [1] "ka" "Tlag"
##
## [[2]]
## [1] "Cl" "V"
##
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 2 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## __________________________________________________
## - - - Iteration 3 - - -
##
## No difference between two successive iterations
## __________________________________________________
##
## Final statistical model:
##
## Covariate model:
## age sex wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 1 0
## Cl 1 0 1
##
## Correlation model:
## [[1]]
## [1] "ka" "Tlag"
##
## [[2]]
## [1] "Cl" "V"
##
##
## Residual error model:
## y1
## "combined2"
##
## total time: 17.4s
## __________________________________________________
Using BIC (or BICc) is equivalent to define a prior distribution for the statistical model, where the probability that a \(\beta\) coefficient or that a correlation is different from 0 is equal to \(1/(1+\sqrt(N))\). Then, the model that minimizes the BIC (or BICc) criteria is the one that maximizes the posterior distribution associated to this prior.
It is then possible to define different prior probabilities for selecting this model. Here, the reference prior probability associated to BIC (or BICc) is \(1/(1+\sqrt(32)) = 0.15)\). Relationships between covariates and parameters are penalized in the following example, while correlations between random effects are favored:
warfPK.res8 <- buildmlx(project=warfPK.project, prior=list(covariate=0.05, correlation=0.45), test=F, ll=F)
##
## --------------------------------------------------
##
## Building:
## - The covariate model
## - The correlation model
## - The residual error model
##
## __________________________________________________
## - - - Initialization - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 0
## Cl 0 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined1"
## __________________________________________________
## - - - Iteration 1 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 0 1 1
##
## Correlation model:
## [[1]]
## [1] "Cl" "ka"
##
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 1 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## __________________________________________________
## - - - Iteration 2 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 0 0 1
##
## Correlation model:
## [[1]]
## [1] "Cl" "ka"
##
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 2 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## __________________________________________________
## - - - Iteration 3 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 0 0 0
##
## Correlation model:
## [[1]]
## [1] "Cl" "ka"
##
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 3 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## __________________________________________________
## - - - Iteration 4 - - -
##
## No difference between two successive iterations
## __________________________________________________
##
## Final statistical model:
##
## Covariate model:
## age sex wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 1
## Cl 0 0 0
##
## Correlation model:
## [[1]]
## [1] "Cl" "ka"
##
##
## Residual error model:
## y1
## "combined2"
##
## total time: 21.8s
## __________________________________________________
As we did with the weights, different prior probabilities can be defined for the covariate model and for the correlation model:
N <- 32
p0 <- 1/(1+sqrt(N))
p.cov <- do.call(rbind, getIndividualParameterModel()$covariateModel)
p.cov[] <- p0
p.cov['V', 'sex'] <- p0*5
p.cov['V', 'wt'] <- p0/1000
print(p.cov)
## sex age wt
## Tlag 0.1502211 0.1502211 0.1502211048
## ka 0.1502211 0.1502211 0.1502211048
## V 0.7511055 0.1502211 0.0001502211
## Cl 0.1502211 0.1502211 0.1502211048
p.cor <- matrix(p0, nrow=d.param, ncol=d.param, dimnames=list(n.param, n.param))
p.cor['V','Cl'] <- p.cor['Cl','V'] <- p0*5
print(p.cor)
## Tlag ka V Cl
## Tlag 0.1502211 0.1502211 0.1502211 0.1502211
## ka 0.1502211 0.1502211 0.1502211 0.1502211
## V 0.1502211 0.1502211 0.1502211 0.7511055
## Cl 0.1502211 0.1502211 0.7511055 0.1502211
warfPK.res9 <- buildmlx(project=warfPK.project, prior=list(covariate=p.cov, correlation=p.cor), test=F, ll=F)
##
## --------------------------------------------------
##
## Building:
## - The covariate model
## - The correlation model
## - The residual error model
##
## __________________________________________________
## - - - Initialization - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 0 0
## Cl 0 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined1"
## __________________________________________________
## - - - Iteration 1 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 1 0 1
## Cl 0 1 1
##
## Correlation model:
## [[1]]
## [1] "Cl" "V"
##
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 1 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## __________________________________________________
## - - - Iteration 2 - - -
##
## Covariate model:
## sex age wt
## Tlag 0 0 0
## ka 0 0 0
## V 1 0 0
## Cl 0 1 1
##
## Correlation model:
## [[1]]
## [1] "Cl" "V"
##
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 2 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## __________________________________________________
## - - - Iteration 3 - - -
##
## No difference between two successive iterations
## __________________________________________________
##
## Final statistical model:
##
## Covariate model:
## age sex wt
## Tlag 0 0 0
## ka 0 0 0
## V 0 1 0
## Cl 1 0 1
##
## Correlation model:
## [[1]]
## [1] "Cl" "V"
##
##
## Residual error model:
## y1
## "combined2"
##
## total time: 15.8s
## __________________________________________________
SAMBA implemented in buildmlx
is very powerful when a
large number of covariates are available. In this new example, 40
artificial continuous and categorical covariates have been added to the
original warfarin datafile.
warfPK40.project <- "projects/warfarinPK_40covariates.mlxtran"
The total number of possible covariate models is really huge but
buildmlx
will find the solution very quickly:
warfPK40.res1 <- buildmlx(warfPK40.project, covToTransform = c("age", "wt"))
##
## --------------------------------------------------
##
## Building:
## - The covariate model
## - The correlation model
## - The residual error model
##
## __________________________________________________
## - - - Initialization - - -
##
## Covariate model:
## cat1 cat10 cat11 cat12 cat13 cat14 cat15 cat16 cat17 cat18 cat19 cat2
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0 0
## cat20 cat3 cat4 cat5 cat6 cat7 cat8 cat9 sex age cont1 cont10 cont11
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0 0 0
## cont12 cont13 cont14 cont15 cont16 cont17 cont18 cont19 cont2 cont20 cont3
## Tlag 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0
## cont4 cont5 cont6 cont7 cont8 cont9 wt
## Tlag 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "constant"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 715.76 728.95 739.17 0.13
##
## Estimation of the population parameters using the transformed covariates ...
## Sampling of the conditional distribution using the the transformed covariates ...
## __________________________________________________
## - - - Iteration 1 - - -
##
## Covariate model:
## cat1 cat10 cat11 cat12 cat13 cat14 cat15 cat16 cat17 cat18 cat19 cat2
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 1 0 0 1 0 0 0 0 0
## cat20 cat3 cat4 cat5 cat6 cat7 cat8 cat9 sex age cont1 cont10 cont11
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 1 0 0 0 1 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0 1 0
## cont12 cont13 cont14 cont15 cont16 cont17 cont18 cont19 cont2 cont20 cont3
## Tlag 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 1 0
## cont4 cont5 cont6 cont7 cont8 cont9 wt logtAge logtWt
## Tlag 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 1
## Cl 0 0 0 0 0 0 0 1 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 1 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 630.19 660.97 673.23 0.13
## __________________________________________________
## - - - Iteration 2 - - -
##
## Covariate model:
## cat1 cat10 cat11 cat12 cat13 cat14 cat15 cat16 cat17 cat18 cat19 cat2
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 1 0 0 1 0 0 0 0 0
## cat20 cat3 cat4 cat5 cat6 cat7 cat8 cat9 sex age cont1 cont10 cont11
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 1 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 1 1 0
## cont12 cont13 cont14 cont15 cont16 cont17 cont18 cont19 cont2 cont20 cont3
## Tlag 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0
## cont4 cont5 cont6 cont7 cont8 cont9 wt logtAge logtWt
## Tlag 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 1
## Cl 0 0 0 0 0 0 0 1 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 2 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 628.77 658.08 670.35 0.09
## __________________________________________________
## - - - Iteration 3 - - -
##
## No difference between two successive iterations
## __________________________________________________
## - - - Further tests - - -
## _______________________
## Add parameters/covariates relationships:
## parameter covariate p.value
## 13 Tlag cat20 0.004014
## 32 Tlag cont19 0.036530
## 35 Tlag cont3 0.032310
## 74 ka cont18 0.047310
## 97 V cat19 0.002150
##
## Run scenario for model 4 ...
## Estimation of the population parameters...
## _______________________
## Remove parameters/covariates relationships:
## coefficient p.value
## 3 beta_Tlag_cat20_1 0.844966
## 4 beta_Tlag_cat20_2 0.81646
## 2 beta_Tlag_cont3 0.849055
## 5 beta_ka_cont18 0.273821
## 7 beta_V_cat19_1 0.741462
## 8 beta_V_cat19_2 0.269861
##
## Run scenario for model 5 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 622.31 653.09 665.35 0.11
## _______________________
## Add correlation:
## randomEffect.1 randomEffect.2 correlation p.value p.wald_lin p.wald_SA
## 5 eta_ka eta_Cl -0.2915513 0.06859 NaN NaN
## in.model
## 5 FALSE
## [[1]]
## [1] "ka" "Cl"
##
##
## Run scenario for model 6 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 616.19 648.44 660.70 0.08
## __________________________________________________
##
## Final statistical model:
##
## Covariate model:
## age cat1 cat10 cat11 cat12 cat13 cat14 cat15 cat16 cat17 cat18 cat19 cat2
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 1 0 0 1 0 0 0 0 0
## cat20 cat3 cat4 cat5 cat6 cat7 cat8 cat9 cont1 cont10 cont11 cont12 cont13
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 1 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 1 1 0 0 0
## cont14 cont15 cont16 cont17 cont18 cont19 cont2 cont20 cont3 cont4 cont5
## Tlag 0 0 0 0 0 1 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0
## cont6 cont7 cont8 cont9 logtAge logtWt sex wt
## Tlag 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 1 0 0
## Cl 0 0 0 0 1 1 0 0
##
## Correlation model:
## [[1]]
## [1] "Cl" "ka"
##
##
## Residual error model:
## y1
## "combined2"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 616.19 648.44 660.70 0.08
##
## total time: 72.7s
## __________________________________________________
Log-weight and log-age are found again by buildmlx
, as
with the original data. There are only 6 additional false discoveries in
the final model (among the 160 possible false discoveries).
Note that buildmlx
performs some additional statistical
tests in order to improve the selected model. You may choose not to
perform these additional tests. The selected model is then ``slightly
worse’’ in terms of BICc:
warfPK40.res2 <- buildmlx(warfPK40.project, covToTransform = c("age", "wt"), test=F)
##
## --------------------------------------------------
##
## Building:
## - The covariate model
## - The correlation model
## - The residual error model
##
## __________________________________________________
## - - - Initialization - - -
##
## Covariate model:
## cat1 cat10 cat11 cat12 cat13 cat14 cat15 cat16 cat17 cat18 cat19 cat2
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0 0
## cat20 cat3 cat4 cat5 cat6 cat7 cat8 cat9 sex age cont1 cont10 cont11
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0 0 0
## cont12 cont13 cont14 cont15 cont16 cont17 cont18 cont19 cont2 cont20 cont3
## Tlag 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0
## cont4 cont5 cont6 cont7 cont8 cont9 wt
## Tlag 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "constant"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 715.76 728.95 739.17 0.13
##
## Estimation of the population parameters using the transformed covariates ...
## Sampling of the conditional distribution using the the transformed covariates ...
## __________________________________________________
## - - - Iteration 1 - - -
##
## Covariate model:
## cat1 cat10 cat11 cat12 cat13 cat14 cat15 cat16 cat17 cat18 cat19 cat2
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 1 0 0 1 0 0 0 0 0
## cat20 cat3 cat4 cat5 cat6 cat7 cat8 cat9 sex age cont1 cont10 cont11
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 1 0 0 0 1 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0 1 0
## cont12 cont13 cont14 cont15 cont16 cont17 cont18 cont19 cont2 cont20 cont3
## Tlag 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 1 0
## cont4 cont5 cont6 cont7 cont8 cont9 wt logtAge logtWt
## Tlag 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 1
## Cl 0 0 0 0 0 0 0 1 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 1 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 630.19 660.97 673.23 0.13
## __________________________________________________
## - - - Iteration 2 - - -
##
## Covariate model:
## cat1 cat10 cat11 cat12 cat13 cat14 cat15 cat16 cat17 cat18 cat19 cat2
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 1 0 0 1 0 0 0 0 0
## cat20 cat3 cat4 cat5 cat6 cat7 cat8 cat9 sex age cont1 cont10 cont11
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 1 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 1 1 0
## cont12 cont13 cont14 cont15 cont16 cont17 cont18 cont19 cont2 cont20 cont3
## Tlag 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0
## cont4 cont5 cont6 cont7 cont8 cont9 wt logtAge logtWt
## Tlag 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 1
## Cl 0 0 0 0 0 0 0 1 1
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Run scenario for model 2 ...
## Estimation of the population parameters...
## Sampling from the conditional distribution...
## Estimation of the log-likelihood...
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 628.77 658.08 670.35 0.09
## __________________________________________________
## - - - Iteration 3 - - -
##
## No difference between two successive iterations
## __________________________________________________
##
## Final statistical model:
##
## Covariate model:
## age cat1 cat10 cat11 cat12 cat13 cat14 cat15 cat16 cat17 cat18 cat19 cat2
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 1 0 0 1 0 0 0 0 0
## cat20 cat3 cat4 cat5 cat6 cat7 cat8 cat9 cont1 cont10 cont11 cont12 cont13
## Tlag 0 0 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 1 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 1 1 0 0 0
## cont14 cont15 cont16 cont17 cont18 cont19 cont2 cont20 cont3 cont4 cont5
## Tlag 0 0 0 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 0 0 0 0 0 0
## Cl 0 0 0 0 0 0 0 0 0 0 0
## cont6 cont7 cont8 cont9 logtAge logtWt sex wt
## Tlag 0 0 0 0 0 0 0 0
## ka 0 0 0 0 0 0 0 0
## V 0 0 0 0 0 1 0 0
## Cl 0 0 0 0 1 1 0 0
##
## Correlation model:
## [1] "NULL"
##
## Residual error model:
## y1
## "combined2"
##
## Estimated criteria (importanceSampling):
## AIC BIC BICc s.e.
## 628.77 658.08 670.35 0.09
##
## total time: 40.5s
## __________________________________________________
```