library(mrgsolve)
library(dplyr)
library(tidyr)
library(data.table)
library(here)
options(bbr.verbose = TRUE, mrgsolve.project = here("model/pk"))
1 Introduction
We frequently use mrgsolve to simulate from models estimated in NONMEM. This requires translation of the model from the NONMEM control stream into mrgsolve format.
There is a straightforward way to confirm correct coding of the mrgsolve model once the NONMEM model is in hand. This blog post will show you how to do that.
2 Data
The validation of the mrgsolve model coding is driven by PRED
which was tabled when the NONMEM run finished. This is the standard against which we will evaluate the mrgsolve model.
<- fread(here("model/pk/106/106.tab"), skip = 1, na.strings = '.')
tab head(tab)
NUM IPRED NPDE CWRES DV PRED RES
<num> <num> <num> <num> <num> <num> <num>
1: 1 0.00000 0.00000000 0.00000000 0.000 0.00000 0.0000000
2: 2 68.53183 -0.53401190 -0.51213920 61.005 60.58323 0.4217692
3: 3 90.81217 0.27931900 0.12607810 90.976 78.54761 12.4283900
4: 4 97.27639 1.55477400 1.44380800 122.210 83.03988 39.1701200
5: 5 96.73790 1.88079400 1.69356600 126.090 82.22198 43.8680200
6: 6 88.70574 0.06689328 -0.05272759 84.682 75.48693 9.1950690
WRES
<num>
1: 0.0000000
2: -0.5334058
3: 0.1424843
4: 1.6307490
5: 1.9132660
6: -0.0940618
We want to make sure we have the same input data for the validation that we had for the model estimation. So we will read in the analysis data
<- fread(here("data/derived/analysis3.csv"), na.strings = '.')
nmdata <- select(nmdata, -DV) nmdata
and then merge it on by a row counter called NUM
which uniquely connects output rows to input rows
<- left_join(tab, nmdata, by = "NUM") data
Note that in the join, tab
is on the left so that the resulting data
drops the records from nmdata
that didn’t make it into the analysis.
Now, we have the analysis data set with PRED
(from NONMEM) as a column
select(data, ID, NUM, TIME, EVID, AMT, CMT, DV, PRED)
ID NUM TIME EVID AMT CMT DV PRED
<int> <int> <num> <int> <int> <int> <num> <num>
1: 1 1 0.00 1 5 1 0.000 0.00000
2: 1 2 0.61 0 NA 2 61.005 60.58323
3: 1 3 1.15 0 NA 2 90.976 78.54761
4: 1 4 1.73 0 NA 2 122.210 83.03988
5: 1 5 2.15 0 NA 2 126.090 82.22198
---
4288: 160 4356 60.09 0 NA 2 99.059 39.55937
4289: 160 4357 72.03 0 NA 2 64.482 33.24345
4290: 160 4358 84.18 0 NA 2 77.621 27.89484
4291: 160 4359 96.02 0 NA 2 33.907 23.52130
4292: 160 4360 120.09 0 NA 2 36.249 16.63423
This data set contains 3100+ concentrations
count(data, EVID)
EVID n
<int> <int>
1: 0 3142
2: 1 1150
2.1 bbr::nm_join
If you are using bbr as your modeling platform, you can get the same data set with a one-liner call to bbr::nm_join()
<- bbr::nm_join(here("model/pk/106")) data
3 Model
The model is a standard two-compartment PK model from run 106
; we’ve already coded the model here
<- mread_cache("106.txt") mod
[ prob ]
106-104 + COV-effects(CRCL, AGE) on CL
[ pkmodel ] cmt = "GUT,CENT,PERIPH", depot = TRUE
[ param ]
= 70
WT = 90
EGFR = 4.5
ALB = 35
AGE
[ nmxml ]
= "106/106.xml"
path = "cppfile"
root
[ main ]
double V2WT = log(WT/70.0);
double CLWT = log(WT/70.0)*0.75;
double CLEGFR = log(EGFR/90.0)*THETA6;
double CLAGE = log(AGE/35.0)*THETA7;
double V3WT = log(WT/70.0);
double QWT = log(WT/70.0)*0.75;
double CLALB = log(ALB/4.5)*THETA8;
double KA = exp(THETA1+ETA(1));
double V2 = exp(THETA2+V2WT+ETA(2));
double CL = exp(THETA3+CLWT+CLEGFR+CLAGE+CLALB+ETA(3));
double V3 = exp(THETA4+V3WT);
double Q = exp(THETA5+QWT);
double S2 = V2/1000.0; //; dose in mcg, conc in mcg/mL
[ table ]
= CENT/S2;
capture IPRED = IPRED * (1+EPS(1)); capture Y
4 Simulate
Now, simulate from the mrgsolve model
<- mrgsim(zero_re(mod), data, carry_out = "PRED,EVID", digits = 7)
out head(out)
ID TIME EVID PRED GUT CENT PERIPH IPRED Y
1 1 0.00 1 0.00000 5.00000000 0.000000 0.00000000 0.00000 0.00000
2 1 0.61 0 60.58323 1.93352200 2.935854 0.06481082 60.58323 60.58323
3 1 1.15 0 78.54761 0.83383100 3.806404 0.17738390 78.54761 78.54761
4 1 1.73 0 83.03988 0.33787020 4.024099 0.31227170 83.03988 83.03988
5 1 2.15 0 82.22198 0.17565080 3.984464 0.40873200 82.22198 82.22198
6 1 3.19 0 75.48693 0.03476685 3.658084 0.62658760 75.48693 75.48693
Some important points to notice about this simulation
- We wrap the model object in
zero_re()
; this drops all the random effects from the simulation so that simulated concentrations are equivalent toPRED
regardless if we are looking atDV
(orY
) orIPRED
- We bring
PRED
andEVID
into the simulated output; we needPRED
to to compare against andEVID
is needed so we can filter out dosing records - We deliberately set the number of significant
digits
in the simulated result; this is really important if you want a sensitive validation index
5 Compare
Keeping only the observation records, we make a simple summary of the difference between the prediction generated by mrgsolve (Y
) and the value generated by NONMEM (PRED
)
<- filter(out, EVID==0)
obs summary(obs$Y - obs$PRED)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0 0 0 0 0 0
We see there is exact match up to the level of precision that we’ve requested (See Appendix).
This can be presented graphically as well
plot(obs$Y, obs$PRED)
Here we’ve looked at absolute differences between mrgsolve and NONMEM; if you do find a discrepancy, you might evaluate that as a percentage of the value of PRED
(percent difference).
We frequently see exact matches when models are coded with analytical solutions. This is never the case when coding models from ODEs. But even with ODE models you should expect to see only an extremely small discrepancy … out to a very distant digit.
6 Complete workflow
We’ve illustrated this step by step, explaining how it works. But wanted also to show you how little code is required once you have the NONMEM and mrgsolve models.
<- mread_cache("106.txt")
mod <- mrgsim(zero_re(mod), data, carry_out = "PRED,EVID", digits = 7)
out <- filter(out, EVID==0)
obs summary(obs$Y - obs$PRED)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0 0 0 0 0 0
We have this done in 5 lines and objects generated from 2 of those lines will be used to run the production simulation.
7 Discussion
It is important to note that this validation procedure only considers model PRED
given a clinical data set. This setup evaluates dosing events, covariate effects, structural model setup and the like.
Note that this setup does not look at whether or not the random effects are coded correctly. Certainly there can be mistakes in coding the random effects, but in my experience, most errors in translation are related to input data sets, covariate models or errors with the structural model (e.g. coding the ODE correctly).
We also note how critical it is to consider the number of digits and tolerances of ODE solvers when comparing model outputs; when digits and tolerances are ignored, any discrepancy can seem bigger than it should be.
Finally, we include a reminder that mrgsolve is not an exact clone of NONMEM: we don’t and can’t match every behavior. So if you do find a discrepancy in your validation, it could be that you’ve tapped some functionality that is not consistent between NONMEM and mrgsolve. That said … I’ve been using mrgsolve with NONMEM across many years of project work and the vast majority of models we’ve worked with have matched when we are very careful in the translation and in carrying out the validation.
If you are unable to validate your model, please check the code. Then check it again. If you’re sure the model coding is equivalent and you are still seeing discrepancy, we encourage you to open ticket on our GitHub issue tracker here so we can learn more about the discrepancy.
And as always, be sure to visit mrgsolve.org for more resources.
8 Appendix
See what happens when we don’t control digits on the simulation
<- mread_cache("106.txt")
mod <- mrgsim(zero_re(mod), data, carry_out = "PRED,EVID")
out <- filter(out, EVID==0)
obs summary(obs$Y - obs$PRED)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-4.952e-04 -2.714e-05 -3.701e-07 -2.581e-06 2.477e-05 4.991e-04
Here, we’ve used analytical solutions from both mrgsolve and NONMEM. This sort of error is what you might see when comparing two models which require ODEs to generate the solution.
This error is still very small relative to PRED
summary(100*(obs$Y - obs$PRED)/obs$PRED)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-4.602e-05 -6.848e-06 -2.349e-07 -3.855e-07 6.197e-06 4.560e-05
Next, let’s try to validate a model that was coded incorrectly. For the AGE
effect on CL
, we have the reference to be 45 years old but it should be 35.
<- mcode("bad", paste0(code, collapse = "\n")) mod2
: bad.cpp
Model file
$maindouble V2WT = log(WT/70.0);
double CLWT = log(WT/70.0)*0.75;
double CLEGFR = log(EGFR/90.0)*THETA6;
double CLAGE = log(AGE/45.0)*THETA7;
double V3WT = log(WT/70.0);
double QWT = log(WT/70.0)*0.75;
double CLALB = log(ALB/4.5)*THETA8;
double KA = exp(THETA1+ETA(1));
double V2 = exp(THETA2+V2WT+ETA(2));
double CL = exp(THETA3+CLWT+CLEGFR+CLAGE+CLALB+ETA(3));
double V3 = exp(THETA4+V3WT);
double Q = exp(THETA5+QWT);
double S2 = V2/1000.0;
When we try to validate this model against the NONMEM run with different reference AGE
, we can clearly see there’s an issue
<- mrgsim(zero_re(mod2), data, carry_out = "PRED,EVID", digits = 7)
out <- filter(out, EVID==0)
obs summary(obs$Y - obs$PRED)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-25.27000 -5.49420 -1.83945 -3.74970 -0.47830 -0.00678