| Title: | Stratified interval-censored semi-parametric survival models |
|---|---|
| Description: | Semi-parametric survival models for interval censored data with stratification, based on the `icenReg` package, including model fitting and testing. |
| Authors: | Isaac Gravestock [aut, cre] (ORCID: <https://orcid.org/0000-0003-0283-2065>), Clifford Anderson-Bergman [aut] |
| Maintainer: | Isaac Gravestock <[email protected]> |
| License: | LGPL (>= 2.0, < 3) |
| Version: | 0.0.0.9000 |
| Built: | 2026-05-29 19:36:21 UTC |
| Source: | https://github.com/igrave/icsp2 |
This is an internal implementation of right-censored logrank test for use within ic_logrank where many tests need to be performed efficiently for variance calculation.
cpp_logrank(time, event, group)cpp_logrank(time, event, group)
time |
Numeric vector of survival/censoring times. |
event |
Logical vector of event indicators (TRUE = event, FALSE = censored). |
group |
Integer vector of group labels, coded as consecutive integers
(i.e. |
An object of class "cpp_logrank", a list with components:
Numeric vector of observed events per group.
Numeric vector of expected events per group.
Numeric variance matrix
Numeric vector of sample sizes per group.
set.seed(1992) time <- rexp(100) event <- sample(c(TRUE, FALSE), 100, replace = TRUE) group <- sample(1:2, 100, replace = TRUE) cpp_logrank(time, event, group)set.seed(1992) time <- rexp(100) event <- sample(c(TRUE, FALSE), 100, replace = TRUE) group <- sample(1:2, 100, replace = TRUE) cpp_logrank(time, event, group)
Sun's Log-Rank Test for Interval-Censored Data
ic_logrank( formula, data, subset, na.action, B = c(0, 1), n_samples = 1000, type = c("sas", "hly"), ... )ic_logrank( formula, data, subset, na.action, B = c(0, 1), n_samples = 1000, type = c("sas", "hly"), ... )
formula |
A formula with |
data |
A data frame containing the variables in the formula. |
subset |
Optional expression indicating which subset of rows to use. |
na.action |
Function to handle missing values. |
B |
A vector of length 2 giving bounds for observation times. Default is c(0, 1). |
n_samples |
The number of "imputation" samples for the variance calculation. Default is 1000. |
type |
The method for calculating the test statistic and variance. Options are |
... |
Additional arguments (currently unused). |
Performs Sun's (1996) non-parametric log-rank test for comparing survival distributions across groups with interval-censored data. This test compares group-specific NPMLE survival curves without assuming proportional hazards.
The test statistic is , which follows a
chi-squared distribution with k-1 degrees of freedom under the null
hypothesis, where k is the number of groups.
The default test type is "sas", which uses Sun's test statistic combined
with variance estimated based on the Huang, Lee and Yu (2008) procedure
sampling exact observation times from the Turnbull intervals.
Alternatively, the "hly" type calculates the test statistic and variance
using the multiple imputation approach of Huang, Lee and Yu (2008) directly.
Stratified tests are constructed by calculating the and
matrices for each stratum separately and then summing the stratum-specific
matrices to give a global test statistic
.
This is the procedure described in the SAS documentation for PROC ICLIFETEST.
An object of class ic_logrank containing:
logrank |
The log-rank statistics "observed - expected" for all groups and strata |
logrank_overall |
The log-rank statistics "observed - expected" for all groups |
statistic |
The overall chi-squared test statistic based on imputation |
df |
Degrees of freedom |
p.value |
P-value from chi-squared distribution |
var |
Variance-covariance matrix |
groups |
Group levels being compared |
n |
Sample sizes per group |
method |
Description of test method |
data.name |
Name of the data |
call |
The matched call |
Sun, J. (1996). A non-parametric test for interval-censored failure time data with application to AIDS studies. Statistics in Medicine, 15(13), 1387-1395.
Huang, J., Lee, C., and Yu, Q. (2008). A Generalized Log-Rank Test for Interval-Censored Failure Time Data via Multiple Imputation. Statistics in Medicine 27:3217–3226. http://dx.doi.org/10.1002/sim.3211
SAS Institute Inc. (2026). SAS/STAT® 26.03 User's Guide: The ICLIFETEST Procedure. https://documentation.sas.com/doc/en/statug/latest/statug_iclifetest_details01.htm Accessed 14 April 2026.
# Simple two-group comparison data(miceData) ic_logrank(Surv(l, u, type = "interval2") ~ grp, data = miceData)# Simple two-group comparison data(miceData) ic_logrank(Surv(l, u, type = "interval2") ~ grp, data = miceData)
Control ic_sp2 fitting options
ic_sp_control( useGA = TRUE, maxIter = 10000, baseUpdates = 5, regStart = NULL, derivMethod = c(12, 1), updateCovars = TRUE )ic_sp_control( useGA = TRUE, maxIter = 10000, baseUpdates = 5, regStart = NULL, derivMethod = c(12, 1), updateCovars = TRUE )
useGA |
Should constrained gradient ascent step be used? |
maxIter |
Maximum iterations |
baseUpdates |
number of baseline updates (ICM + GA) per iteration |
regStart |
Initial values for regression parameters |
derivMethod |
Method for derivative calculations. |
updateCovars |
Should covariates be updated during fitting? @description
Creates the control options for |
The constrained gradient step, controlled by useGA = TRUE,
is a step that was added to improve the convergence in a special case.
The option to turn it off is only in place to help demonstrate its utility.
regStart also for seeding of initial value of regression parameters.
Intended for use in “warm start" for bootstrap samples
and providing fixed regression parameters.
Fit a semi-parametric model for interval-censored data
ic_sp2( formula, data, weights, subset, na.action, B = c(0, 1), control = ic_sp_control(...), model = c("ph", "po"), profile_ci = 0, ... ) ic_sp_ph( formula, data, weights, subset, na.action, B = c(0, 1), control = ic_sp_control(...), model = c("ph", "po"), profile_ci = 0, ... ) ic_sp_po( formula, data, weights, subset, na.action, B = c(0, 1), control = ic_sp_control(...), model = c("ph", "po"), profile_ci = 0, ... )ic_sp2( formula, data, weights, subset, na.action, B = c(0, 1), control = ic_sp_control(...), model = c("ph", "po"), profile_ci = 0, ... ) ic_sp_ph( formula, data, weights, subset, na.action, B = c(0, 1), control = ic_sp_control(...), model = c("ph", "po"), profile_ci = 0, ... ) ic_sp_po( formula, data, weights, subset, na.action, B = c(0, 1), control = ic_sp_control(...), model = c("ph", "po"), profile_ci = 0, ... )
formula |
A model formula with Surv(l, u, type = 'interval2') response and covariates on the right-hand side. May also contain |
data |
A data frame containing the variables in the formula, including strata terms. |
weights |
Optional vector of weights for each observation, or the name of a variable in |
subset |
Optional expression indicating a subset of the rows of |
na.action |
Optional function to handle missing data. Default is |
B |
A vector of length 2 giving the lower and upper bounds for the observation times. Default is c(0, 1). |
control |
A list of control settings, with defaults created by |
model |
Type of model to fit. Choices are |
profile_ci |
Confidence level for profile likelihood confidence intervals. Default is 0.95. Set to |
... |
Additional arguments passed to control. |
A list containing the fitted model information, including coefficients, variance-covariance matrix, and other details.
Data set contains interval censored survival time for time from onset of
diabetes to to diabetic nephronpathy. Identical to the diabetes
dataset found in the package glrt.
IR_diabetesIR_diabetes
An object of class data.frame with 731 rows and 3 columns.
leftleft side of observation interval
rightright side of observation interval
gendergender of subject
Borch-Johnsens, K, Andersen, P and Decker, T (1985). "The effect of proteinuria on relative mortality in Type I (insulin-dependent) diabetes mellitus." Diabetologia, 28, 590-596.
head(IR_diabetes) diabetes_fit <- ic_sp_po( Surv(left, right, type = 'interval2') ~ gender, data = IR_diabetes )head(IR_diabetes) diabetes_fit <- ic_sp_po( Surv(left, right, type = 'interval2') ~ gender, data = IR_diabetes )
Lines method for ic_sp2 objects
## S3 method for class 'ic_sp2' lines(x, y, type = "s", strata, ...)## S3 method for class 'ic_sp2' lines(x, y, type = "s", strata, ...)
x |
an object of class 'ic_sp2' |
y |
ignored |
type |
the type of plot to produce, default is "l" for lines |
strata |
Optional vector of indices to subset the strata to be plotted. |
... |
additional arguments passed to the lines function |
Due to the interval censoring, the survival curves are not unique, so two lines are drawn for each curve: one for the lower bound and one for the upper bound of the interval. A pair of lines is drawn for each stratum, if strata are present in the model. Therefore to unambiguously specify graphics parameters such as colour or line type, the user should specify them as a vector of length equal to the number of strata (for strata specific pars) or twice the number of strata (for line specific pars, e.g. strata 1 lower, strata 1 upper, strata 2 lower, strata 2 upper).
RFM mice were sacrificed and examined for lung tumors. This resulted in current status interval censored data: if the tumor was present, this implied left censoring and if no tumor was present this implied right censoring. Mice were placed in two different groups: conventional environment or germ free environment.
miceDatamiceData
An object of class data.frame with 144 rows and 3 columns.
lleft side of observation interval
uright side of observation interval
grpGroup for mouse. Either "ce" (conventional environment) or "ge" (germ-free environment)
Hoel D. and Walburg, H.,(1972), Statistical analysis of survival experiments, \emph{The Annals of Statistics},
18, 1259-1294
head(miceData) mice_ph_fit <- ic_sp_ph(Surv(l, u, type = 'interval2') ~ grp, data = miceData) summary(mice_ph_fit)head(miceData) mice_ph_fit <- ic_sp_ph(Surv(l, u, type = 'interval2') ~ grp, data = miceData) summary(mice_ph_fit)
Print method for ic_sp2 objects
## S3 method for class 'ic_sp2' print(x, ...)## S3 method for class 'ic_sp2' print(x, ...)
x |
Fitted model object from |
... |
Additional arguments. |
Refit the model with fixed regression parameters
profile_fit(object, beta = object$coefficients)profile_fit(object, beta = object$coefficients)
object |
Fitted model object from |
beta |
Vector of regression parameters to fix in the refit. Default is the original fitted regression parameters. |
Raw fitted model object with fixed regression parameters.
Simulates interval censored data from a regression model with a Weibull baseline distribution. Used for demonstration
simIC_weib( n = 100, b1 = 0.5, b2 = -0.5, model = "ph", shape = 2, scale = 2, inspections = 2, inspectLength = 2.5, rndDigits = NULL, prob_cen = 1 )simIC_weib( n = 100, b1 = 0.5, b2 = -0.5, model = "ph", shape = 2, scale = 2, inspections = 2, inspectLength = 2.5, rndDigits = NULL, prob_cen = 1 )
n |
Number of samples simulated |
b1 |
Value of first regression coefficient |
b2 |
Value of second regression coefficient |
model |
Type of regression model. Options are 'po' (prop. odds) and 'ph' (Cox PH) |
shape |
shape parameter of baseline distribution |
scale |
scale parameter of baseline distribution |
inspections |
number of inspections times of censoring process |
inspectLength |
max length of inspection interval |
rndDigits |
number of digits to which the inspection time is rounded to,
creating a discrete inspection time. If |
prob_cen |
probability event being censored. If event is uncensored, l == u |
Exact event times are simulated according to regression model: covariate x1
is distributed rnorm(n) and covariate x2 is distributed
1 - 2 * rbinom(n, 1, 0.5). Event times are then censored with a
case II interval censoring mechanism with inspections different inspection times.
Time between inspections is distributed as runif(min = 0, max = inspectLength).
Note that the user should be careful in simulation studies not to simulate data
where nearly all the data is right censored (or more over, all the data with x2 = 1 or -1)
or this can result in degenerate solutions!
Clifford Anderson-Bergman
set.seed(1) sim_data <- simIC_weib(n = 500, b1 = .3, b2 = -.3, model = 'ph', shape = 2, scale = 2, inspections = 6, inspectLength = 1) #simulates data from a cox-ph with beta Weibull distribution. ic_sp_ph(Surv(l, u, type = 'interval2') ~ x1 + x2, data = sim_data) ic_sp_po(Surv(l, u, type = 'interval2') ~ x1 + x2, data = sim_data) #'ph' fit looks better than 'po'; the difference between the transformed survival #function looks more constantset.seed(1) sim_data <- simIC_weib(n = 500, b1 = .3, b2 = -.3, model = 'ph', shape = 2, scale = 2, inspections = 6, inspectLength = 1) #simulates data from a cox-ph with beta Weibull distribution. ic_sp_ph(Surv(l, u, type = 'interval2') ~ x1 + x2, data = sim_data) ic_sp_po(Surv(l, u, type = 'interval2') ~ x1 + x2, data = sim_data) #'ph' fit looks better than 'po'; the difference between the transformed survival #function looks more constant
Profile Likelihood Covariance for Semi-Parametric Models
## S3 method for class 'ic_sp2' vcov(object, type = "oim_curvature", fixed = 5, typical = 1, large = 2, ...)## S3 method for class 'ic_sp2' vcov(object, type = "oim_curvature", fixed = 5, typical = 1, large = 2, ...)
object |
Fitted model object from |
type |
One of |
fixed |
A fixed factor to multiply by |
typical |
A typical value for the regression parameters, used to determine the scale of |
large |
A large value for the regression parameters, used to determine the scale of |
... |
Unused. |
The covariance matrix is calculated using the profile likelihood approach.
(Murphey and Vand Der Vaart 2000).
This method involves perturbing the regression parameters, updating the
baseline hazard estimates using the profile_fit function with the perturbed
parameters, and calculating the change in log-likelihood for each perturbation,
which is then used to compute the covariance matrix.
We borrowing the naming convention from the Stata stintcox manual
https://www.stata.com/manuals/ststintcox.pdf.
Type "oim_curvature" (Boruvka and Cook 2015) uses the curvature of the log-likelihood function to
determine the perturbation size for the profile likelihood calculations, as
described in Boruvka and Cook (2014).
The typical and large parameters are used to determine the scale of the
perturbations.
Type "oim_fixed" (Zeng et al 2016) uses a fixed perturbation size based on the fixed
parameter and the sample size n.
Type "opg_fixed" (Zeng et al 2017) uses a fixed perturbation size based on the fixed
parameter and the sample size n, but uses the outer product of gradients
instead of the observed information matrix for the covariance calculation.
For larged values of fixed the model fitting may fail to converge.
Variance-covariance matrix of the regression parameters.
Murphy, S. A., & Van Der Vaart, A. W. (2000). On Profile Likelihood. Journal of the American Statistical Association, 95(450), 449–465. https://doi.org/10.1080/01621459.2000.10474219
Boruvka, A., and Cook, R. J. (2015), A Cox-Aalen Model for Interval-censored Data. Scand J Statist, 42, 414–426. doi: 10.1111/sjos.12113.
Donglin Zeng, Lu Mao, D. Y. Lin, Maximum likelihood estimation for semiparametric transformation models with interval-censored data, Biometrika, Volume 103, Issue 2, June 2016, Pages 253–271, https://doi.org/10.1093/biomet/asw013
Donglin Zeng, Fei Gao, D. Y. Lin, Maximum likelihood estimation for semiparametric regression models with multivariate interval-censored data, Biometrika, Volume 104, Issue 3, September 2017, Pages 505–525, https://doi.org/10.1093/biomet/asx029