`survive`.Breslow¶

class survive.Breslow(*, conf_type='log', conf_level=0.95, var_type='aalen', tie_break='discrete')[source]¶

Breslow nonparametric survival function estimator.

Parameters:

conf_type : {‘log’, ‘linear’}: Type of confidence interval to report.
conf_level : float: Confidence level of the confidence intervals.
var_type : {‘aalen’, ‘greenwood’}: Type of variance estimate to compute.
tie_break : {‘discrete’, ‘continuous’}: Specify how to handle tied event times.
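As an illustration of what tie_break controls, the sketch below computes the increment that a group of tied events adds to the Nelson-Aalen cumulative hazard. It assumes the usual textbook conventions, which this page does not spell out: with ‘discrete’ handling, d tied events among n subjects at risk add d/n; with ‘continuous’ handling, the tied events are treated as if they happened one after another. The function name `hazard_increment` is hypothetical and not part of the library.

```python
def hazard_increment(n_events, n_at_risk, tie_break="discrete"):
    """Cumulative hazard increment at one distinct event time (illustrative)."""
    if tie_break == "discrete":
        # Tied events are treated as genuinely simultaneous.
        return n_events / n_at_risk
    if tie_break == "continuous":
        # Tied events are treated as if they occurred in sequence,
        # so the risk set shrinks by one after each event.
        return sum(1.0 / (n_at_risk - k) for k in range(n_events))
    raise ValueError(f"unknown tie_break: {tie_break!r}")


# Two tied events among ten subjects at risk.
discrete = hazard_increment(2, 10, "discrete")      # 2/10
continuous = hazard_increment(2, 10, "continuous")  # 1/10 + 1/9
```

Without ties (a single event at each time) the two conventions give the same increment.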

See also

survive.NelsonAalen: Nelson-Aalen cumulative hazard function estimator.

Notes

The Breslow estimator is a nonparametric estimator of the survival function of a time-to-event distribution defined as the exponential of the negative of the Nelson-Aalen cumulative hazard function estimator \(\widehat{A}(t)\):

\[\widehat{S}(t) = \exp(-\widehat{A}(t)).\]
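The definition above can be sketched in a few lines of plain Python. This is a minimal illustration for a single group of right-censored observations, not the library's implementation; the function name `breslow` and the toy data are assumptions, and ties are handled with the simple d/n increment.

```python
import math


def breslow(times, events):
    """Return (event_times, survival) for right-censored data.

    times  : observed times
    events : 1 if the event was observed, 0 if censored
    """
    event_times, survival = [], []
    cum_hazard = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if u == t and e)
        cum_hazard += n_events / at_risk        # Nelson-Aalen increment
        event_times.append(t)
        survival.append(math.exp(-cum_hazard))  # S(t) = exp(-A(t))
    return event_times, survival


# Five subjects; the second subject observed at time 2 and the last one are censored.
event_times, survival = breslow([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Here the cumulative hazard takes the values 1/5, 1/5 + 1/4 and 1/5 + 1/4 + 1/2 at the event times 1, 2 and 3, and the survival estimate is the exponential of the negative of each.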

This estimator was introduced in a discussion [1] following [2]. It was later studied by Fleming and Harrington in [3], and it is sometimes called the Fleming-Harrington estimator.

The parameters of this class are identical to the parameters of survive.NelsonAalen. The Breslow survival function estimates and confidence interval bounds are transformations of the Nelson-Aalen cumulative hazard estimates and confidence interval bounds, respectively. The variance estimate for the Breslow estimator is derived from the variance estimate for the Nelson-Aalen estimator via the delta method, which relies on the Nelson-Aalen estimator’s asymptotic normality:

\[\widehat{\mathrm{Var}}(\widehat{S}(t)) = \widehat{S}(t)^2 \widehat{\mathrm{Var}}(\widehat{A}(t))\]
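A minimal sketch of this transformation is given below. It takes a Nelson-Aalen estimate and its variance as inputs and applies the delta-method formula above. The confidence bounds use a linear-type interval for the cumulative hazard mapped through exp(-x); that choice, the truncation of the hazard bound at zero, and the function name `breslow_from_nelson_aalen` are assumptions made for illustration, not a description of the library's internals.

```python
import math
from statistics import NormalDist


def breslow_from_nelson_aalen(cum_hazard, cum_hazard_var, conf_level=0.95):
    """Transform a Nelson-Aalen estimate and variance into Breslow quantities."""
    surv = math.exp(-cum_hazard)
    surv_var = surv ** 2 * cum_hazard_var  # delta method
    # Linear-type interval for the cumulative hazard (assumed here).
    z = NormalDist().inv_cdf((1 + conf_level) / 2)
    half_width = z * math.sqrt(cum_hazard_var)
    hazard_lower = max(cum_hazard - half_width, 0.0)
    hazard_upper = cum_hazard + half_width
    # exp(-x) is decreasing, so the hazard bounds swap roles.
    return surv, surv_var, math.exp(-hazard_upper), math.exp(-hazard_lower)


# Example inputs: A = 1/5 + 1/4 with variance 1/25 + 1/16.
surv, surv_var, lower, upper = breslow_from_nelson_aalen(0.45, 0.1025)
```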

Comparisons of the Breslow estimator and the more popular Kaplan-Meier estimator (cf. survive.KaplanMeier) can be found in [3] and [4]. One takeaway is that the Breslow estimator was found to be more biased than the Kaplan-Meier estimator, but the Breslow estimator had a lower mean squared error.
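One elementary relationship between the two estimators can be checked directly: since exp(-x) ≥ 1 - x, the Breslow estimate is never smaller than the Kaplan-Meier estimate built from the same event counts and risk-set sizes. The sketch below computes both side by side; the function name `survival_estimates` and the counts are made up for illustration.

```python
import math


def survival_estimates(n_events, n_at_risk):
    """Kaplan-Meier and Breslow estimates after each distinct event time."""
    km, cum_hazard = 1.0, 0.0
    out = []
    for d, n in zip(n_events, n_at_risk):
        km *= 1.0 - d / n       # Kaplan-Meier product-limit factor
        cum_hazard += d / n     # Nelson-Aalen increment
        out.append((km, math.exp(-cum_hazard)))
    return out


# One event at each of three times, with 5, 4 and 2 subjects at risk.
pairs = survival_estimates([1, 1, 1], [5, 4, 2])
```

The gap between the two is largest when the risk set is small, which is where the comparisons cited above matter most.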

References

[1]	N. E. Breslow. “Discussion of Professor Cox’s Paper”. Journal of the Royal Statistical Society. Series B (Methodological), Volume 34, Number 2 (1972), pp. 216–217.

[2]	D. R. Cox. “Regression Models and Life-Tables”. Journal of the Royal Statistical Society. Series B (Methodological), Volume 34, Number 2 (1972), pp. 187–202.

[3]	Thomas R. Fleming and David P. Harrington. “Nonparametric Estimation of the Survival Distribution in Censored Data”. Communications in Statistics - Theory and Methods, Volume 13, Number 20 (1984), pp. 2469–2486.

[4]	Xuelin Huang and Robert L. Strawderman. “A Note on the Breslow Survival Estimator”. Journal of Nonparametric Statistics, Volume 18, Number 1 (2006), pp. 45–56.

Attributes

`conf_level`	Confidence level of the confidence intervals.
`conf_type`	Type of confidence intervals to report.
`data_`	Survival data used to fit the estimator.
`random_state`	Seed for this model’s random number generator.
`summary`	Get a summary of this estimator.
`tie_break`	How to handle tied event times.
`var_type`	Type of variance estimate to compute.

Methods

`check_fitted`()	Check whether this model is fitted.
`fit`(time, **kwargs)	Fit the Breslow estimator to survival data.
`plot`(*groups[, ci, ci_style, ci_kwargs, …])	Plot the estimates.
`predict`(time, *[, return_se, return_ci])	Compute estimates.
`quantile`(prob, *[, return_ci])	Empirical quantile estimates for the time-to-event distribution.
`to_string`([max_line_length])	String representation of this model.

check_fitted()[source]¶

Check whether this model is fitted. If not, raise an exception.

conf_level¶

Confidence level of the confidence intervals.

Returns:

conf_level : float: The confidence level.

conf_type¶

Type of confidence intervals to report.

Returns:

conf_type : str: The type of confidence interval.

data_¶

Survival data used to fit the estimator.

This property is only available after fitting.

Returns:

data : SurvivalData: The `survive.SurvivalData` instance used to fit the estimator.

fit(time, **kwargs)[source]¶

Fit the Breslow estimator to survival data.

Parameters:

time : one-dimensional array-like or str or SurvivalData: The observed times, or all the survival data. If this is a survive.SurvivalData instance, then it is used to fit the estimator and any other parameters are ignored. Otherwise, time and the keyword arguments in kwargs are used to initialize a survive.SurvivalData object on which this estimator is fitted.
**kwargs : keyword arguments: Any additional keyword arguments used to initialize a survive.SurvivalData instance.

Returns:

survive.Breslow: This estimator.

See also

survive.SurvivalData: Structure used to store survival data.
survive.NelsonAalen: Nelson-Aalen cumulative hazard estimator.

plot(*groups, ci=True, ci_style='fill', ci_kwargs=None, mark_censor=True, mark_censor_kwargs=None, legend=True, legend_kwargs=None, colors=None, palette=None, ax=None, **kwargs)[source]¶

Plot the estimates.

Parameters:

*groups : list of group labels: Specify the groups whose curves should be plotted. If none are given, the curves for all groups are plotted.
ci : bool, optional: If True, draw pointwise confidence intervals.
ci_style : {“fill”, “lines”}, optional: Specify how to draw the confidence intervals. If ci_style is “fill”, the region between the lower and upper confidence interval curves will be filled. If ci_style is “lines”, only the lower and upper curves will be drawn (this is inspired by the style of confidence intervals drawn by plot.survfit in the R package survival).
ci_kwargs : dict, optional: Additional keyword parameters to pass to fill_between() (if ci_style is “fill”) or step() (if ci_style is “lines”) when plotting the pointwise confidence intervals.
mark_censor : bool, optional: If True, indicate the censored times by markers on the plot.
mark_censor_kwargs : dict, optional: Additional keyword parameters to pass to scatter() when marking censored times.
legend : bool, optional: Indicates whether to display a legend for the plot.
legend_kwargs : dict, optional: Keyword parameters to pass to legend().
colors : list or tuple or dict or str, optional: Colors for each group. This is ignored if palette is provided. This can be a sequence of valid matplotlib colors to cycle through, or a dictionary mapping group labels to matplotlib colors, or the name of a matplotlib colormap.
palette : str, optional: Name of a seaborn color palette. Requires seaborn to be installed. Setting a color palette overrides the colors parameter.
ax : matplotlib.axes.Axes, optional: The axes on which to plot. If this is not specified, the current axes will be used.
**kwargs : keyword arguments: Additional keyword arguments to pass to step() when plotting the estimates.

Returns:

matplotlib.axes.Axes: The Axes on which the plot was drawn.

predict(time, *, return_se=False, return_ci=False)[source]¶

Compute estimates.

Parameters:

time : array-like: One-dimensional array of times at which to make estimates.
return_se : bool, optional: If True, also return standard error estimates.
return_ci : bool, optional: If True, also return confidence intervals.

Returns:

estimate : pandas.DataFrame: DataFrame of estimates. Each column represents a group, and each row represents an entry of time.
std_err : pandas.DataFrame, optional: Standard errors of the estimates. Same shape as estimate. Returned only if return_se is True.
lower : pandas.DataFrame, optional: Lower confidence interval bounds. Same shape as estimate. Returned only if return_ci is True.
upper : pandas.DataFrame, optional: Upper confidence interval bounds. Same shape as estimate. Returned only if return_ci is True.
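To illustrate how a fitted survival curve can be evaluated at arbitrary times, the sketch below treats the estimate as a right-continuous step function that equals 1 before the first event time and holds its last value afterwards. These conventions and the name `step_predict` are assumptions; the library may behave differently, for example beyond the last observed time.

```python
from bisect import bisect_right


def step_predict(event_times, survival, query_times):
    """Evaluate a right-continuous survival step function at arbitrary times."""
    out = []
    for t in query_times:
        k = bisect_right(event_times, t)  # number of event times <= t
        out.append(1.0 if k == 0 else survival[k - 1])
    return out


# Curve with jumps at times 1, 2 and 3, queried between and at the jumps.
predicted = step_predict([1, 2, 3], [0.82, 0.64, 0.39], [0.5, 1, 2.5, 10])
```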

quantile(prob, *, return_ci=False)[source]¶

Empirical quantile estimates for the time-to-event distribution.

Parameters:

prob : array-like: One-dimensional array of values between 0 and 1 representing the probability levels of the desired quantiles.
return_ci : bool, optional: Specify whether to return confidence intervals for the quantile estimates.

Returns:

quantiles : pandas.DataFrame: The quantile estimates. Rows are indexed by the entries of prob and columns are indexed by the model’s group labels. Entries for probability levels for which the quantile estimate is not defined are nan (not a number).
lower : pandas.DataFrame, optional: Lower confidence interval bounds for the quantile estimates. Returned only if return_ci is True. Same shape as quantiles.
upper : pandas.DataFrame, optional: Upper confidence interval bounds for the quantile estimates. Returned only if return_ci is True. Same shape as quantiles.

Notes

For a probability level \(p\) between 0 and 1, the empirical \(p\)-quantile of the time-to-event distribution with estimated survival function \(\widehat{S}(t)\) is defined to be the time at which the horizontal line at height \(1-p\) intersects with the estimated survival curve. If such a time is not unique, then instead there is a time interval on which the estimated survival curve is flat and coincides with the horizontal line at height \(1-p\). In this case the midpoint of this interval is taken to be the empirical \(p\)-quantile estimate (this is just one of many possible conventions, and the one used by the R package survival [1]). If the survival function estimate never gets as low as \(1-p\), then the \(p\)-quantile cannot be estimated.

The confidence intervals computed here are based on finding the times at which the horizontal line at height \(1-p\) intersects the upper and lower confidence bound curves for \(\widehat{S}(t)\). This mimics the implementation in the R package survival [1], which is based on the confidence interval construction in [2].
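The quantile rule in these notes can be sketched as follows for a single decreasing step curve. The function name `step_quantile` is hypothetical, and one choice is an assumption not covered by the text: when the curve is flat at height 1 - p after the last event time, the last event time is returned. Applying the same function to the lower and upper confidence bound curves gives a simple version of the interval described above.

```python
import math


def step_quantile(event_times, survival, p):
    """Empirical p-quantile of a decreasing survival step function."""
    level = 1.0 - p
    for j, (t, s) in enumerate(zip(event_times, survival)):
        if math.isclose(s, level):
            # The curve is flat at height 1 - p until the next event time:
            # take the midpoint of that interval when it exists.
            if j + 1 < len(event_times):
                return 0.5 * (t + event_times[j + 1])
            return t
        if s < level:
            return t  # the curve jumps across the line at time t
    return math.nan   # the curve never gets as low as 1 - p


times = [1, 2, 3]
median = step_quantile(times, [0.75, 0.5, 0.25], 0.5)       # flat segment on [2, 3)
median_lower = step_quantile(times, [0.45, 0.2, 0.05], 0.5)  # from the lower bound curve
median_upper = step_quantile(times, [0.95, 0.8, 0.55], 0.5)  # upper curve stays above 0.5
```

In this toy example the upper bound curve never reaches 0.5, so the upper limit of the interval for the median is undefined and comes back as nan.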

References

[1]	Terry M. Therneau. A Package for Survival Analysis in S. version 2.38 (2015).

[2]	Ron Brookmeyer and John Crowley. “A Confidence Interval for the Median Survival Time.” Biometrics, Volume 38, Number 1 (1982), pp. 29–41.

random_state¶

Seed for this model’s random number generator. This may not be a numpy.random.RandomState instance. The internal RNG is not a public attribute and should not be used directly.

Returns:

random_state : object: The seed for this model’s RNG.

summary¶

Get a summary of this estimator.

Returns:

summary : NonparametricEstimatorSummary: The summary of this estimator.

to_string([max_line_length])

String representation of this model.

Parameters:

max_line_length : int, optional: Specifies the maximum length of a line. If None, everything will be on one line.

Returns:

model_string : str: A string representation of this model that should be usable to instantiate a new, identical model.
