survive.SurvivalData

class survive.SurvivalData(time, *, status=None, entry=None, group=None, data=None, min_time=None, warn=True)

Class representing right-censored and left-truncated survival data.

Parameters:
time : array-like or str

The observed times. If the DataFrame parameter data is provided, this can be the name of a column in data from which to get the observed times. Otherwise this should be a one-dimensional array of positive numbers.

status : array-like or str, optional

Censoring indicators. 0 means a right-censored observation, 1 means a true failure/event. If not provided, it is assumed that there is no censoring. If the DataFrame parameter data is provided, this can be the name of a column in data from which to get the censoring indicators. Otherwise this should be an array of 0’s and 1’s of the same shape as the array of observed times.

entry : array-like or str, optional

Entry/birth times of the observations (for left-truncated data). If not provided, the entry time for each observation is set to 0. If the DataFrame parameter data is provided, this can be the name of a column in data from which to get the entry times. Otherwise this should be an array of non-negative numbers of the same shape as the array of observed times.

group : array-like or str, optional

Group/stratum labels for each observation. If not provided, the entire sample is taken as a single group. If the DataFrame parameter data is provided, this can be the name of a column in data from which to get the group labels. Otherwise this should be an array of the same shape as the array of observed times.

data : pandas.DataFrame, optional

Optional pandas.DataFrame from which to extract the data. If this parameter is specified, then the parameters time, status, entry, and group can be column names of this DataFrame.

min_time : numeric, optional

The minimum observed time to consider part of the sample. This is for conditional inference. Observations with earlier observed event or censoring times are ignored. If not provided, all observations are used.

warn : bool, optional

Indicates whether any warnings should be raised or ignored (e.g., if an individual’s entry time is later than that individual’s event time).
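The defaults described above can be pictured with a small standalone sketch. It does not call the library; `normalize` is a hypothetical helper, the placeholder group label `0` is an assumption, and the inclusive comparison for `min_time` is inferred from "observations with earlier observed event or censoring times are ignored".

```python
def normalize(time, status=None, entry=None, group=None, min_time=None):
    """Hypothetical helper: return (entry, time, status, group) records."""
    n = len(time)
    # No status given: assume every observation is a true event (no censoring).
    status = [1] * n if status is None else list(status)
    # No entry given: every observation enters at time 0.
    entry = [0] * n if entry is None else list(entry)
    # No group given: the whole sample is one group (label 0 is a placeholder).
    group = [0] * n if group is None else list(group)
    records = list(zip(entry, time, status, group))
    if min_time is not None:
        # Drop observations whose observed time is earlier than min_time.
        records = [r for r in records if r[1] >= min_time]
    return records


records = normalize([5, 3, 8, 2], status=[1, 0, 1, 1], min_time=3)
print(records)
```

The observation with observed time 2 is dropped because it is earlier than `min_time=3`; the remaining three keep their original order.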

Attributes:
time : numpy.ndarray

Each observed time.

status : numpy.ndarray

Event indicators for each observed time. 1 indicates an event, 0 indicates censoring.

entry : numpy.ndarray

Entry times of the observations (for left truncation).

group : numpy.ndarray

Label for each observation’s group/stratum within the sample.

group_labels : numpy.ndarray

Array of the distinct group labels in the sample.

n_groups : int

The number of distinct groups in the sample.

events : dict

Mapping of group labels to DataFrames with columns:

time

Distinct event times for that group.

n_events

Number of events at each event time.

n_at_risk

Number of individuals at risk at each event time.

censor : dict

Mapping of group labels to DataFrames with columns:

time

Distinct censored times for that group.

n_censor

Number of individuals censored at each censored time.

n_at_risk

Number of individuals at risk at each censored time.
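To make the layout of the events and censor tables concrete, here is a minimal pure-Python sketch for a single group. It builds lists of `(time, count, n_at_risk)` tuples rather than DataFrames, the data are made up, and the at-risk convention (entered strictly before `t`, observed time at or after `t`) is an assumption rather than something this page states.

```python
from collections import Counter

# (entry, time, status) records for one group; the values are made up.
records = [(0, 2, 1), (0, 3, 0), (1, 3, 1), (0, 5, 1), (2, 5, 1), (0, 7, 0)]


def at_risk(records, t):
    # Assumed convention: at risk at t if entry < t <= observed time.
    return sum(1 for entry, time, _ in records if entry < t <= time)


def table(records, status_value):
    # Count observations with the given status at each distinct time.
    counts = Counter(time for _, time, status in records if status == status_value)
    return [(t, counts[t], at_risk(records, t)) for t in sorted(counts)]


events = table(records, 1)   # rows: (time, n_events, n_at_risk)
censor = table(records, 0)   # rows: (time, n_censor, n_at_risk)
print(events)
print(censor)
```

Time 3 appears in both tables because one individual fails and another is censored at that time.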

Methods

n_at_risk(time)                             Get the number of individuals at risk (i.e., entered but yet to undergo an event or censoring) at the given times.
n_events(time)                              Get the number of events at the given times.
plot_at_risk([legend, legend_kwargs, …])    Plot the at-risk process.
plot_lifetimes([legend, legend_kwargs, …])  Plot the observed survival times.
reset_format()                              Restore string formatting defaults.
set_format(**kwargs)                        Set string formatting options.
to_string([group, max_line_length, …])      Get a string representation of the survival data within a group.

describe

Get a DataFrame with descriptive statistics about the survival data.

Returns:
pandas.DataFrame

A DataFrame with a row for every group. The columns are:

total

The total number of observations within a group.

events

The number of events within a group.

censored

The number of censored observations within a group.
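The three columns listed above amount to a per-group tally of the status indicators. A minimal standalone sketch with made-up data follows; it uses a plain dictionary instead of a DataFrame and is not the library's implementation.

```python
from collections import defaultdict

# Made-up status indicators (1 = event, 0 = censored) and group labels.
status = [1, 0, 1, 1, 0, 1]
group = ["a", "a", "a", "b", "b", "b"]

summary = defaultdict(lambda: {"total": 0, "events": 0, "censored": 0})
for s, g in zip(status, group):
    row = summary[g]
    row["total"] += 1
    # Each observation is either an event or a censored observation.
    row["events" if s == 1 else "censored"] += 1

for label, row in sorted(summary.items()):
    print(label, row)
```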

n_at_risk(time)

Get the number of individuals at risk (i.e., entered but yet to undergo an event or censoring) at the given times.

Parameters:
time : float or array-like

Times at which to report the risk set sizes.

Returns:
pandas.DataFrame

Number of individuals at risk at the given times within each group. The rows are indexed by the times in time, and the columns are indexed by group.
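As an illustration of the returned layout (times as rows, groups as columns), the following standalone sketch computes risk set sizes as a nested dictionary. The function `risk_set_sizes` is hypothetical, the data are made up, and the boundary convention `entry < t <= time` is an assumption; the library's handling of ties may differ.

```python
def risk_set_sizes(records, times):
    """Hypothetical helper: {time: {group: number at risk}}."""
    groups = sorted({g for _, _, g in records})
    return {
        t: {
            g: sum(
                1
                for entry, time, grp in records
                # Assumed convention: entered before t, not yet failed or censored.
                if grp == g and entry < t <= time
            )
            for g in groups
        }
        for t in times
    }


# (entry, time, group) records; the values are made up.
records = [(0, 4, "a"), (1, 6, "a"), (0, 3, "b"), (2, 5, "b")]
sizes = risk_set_sizes(records, [1, 3, 5])
print(sizes)
```

With left truncation the risk set can grow as well as shrink: in both groups it rises from 1 at time 1 to 2 at time 3 once the late entrant has entered.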

n_events(time)

Get the number of events at the given times.

Parameters:
time : float or array-like

Times at which to report the numbers of events.

Returns:
pandas.DataFrame

Number of events at the given times within each group. The rows are indexed by the times in time, and the columns are indexed by group.
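The same row-by-time, column-by-group layout can be sketched for event counts. This is a standalone illustration with made-up data and a hypothetical helper name, not the library's code; it counts only observations with status 1 whose observed time equals the requested time exactly.

```python
def event_counts(records, times):
    """Hypothetical helper: {time: {group: number of events}}."""
    groups = sorted({g for _, _, g in records})
    return {
        t: {
            g: sum(
                1
                for time, status, grp in records
                # Censored observations (status 0) are not counted as events.
                if grp == g and status == 1 and time == t
            )
            for g in groups
        }
        for t in times
    }


# (time, status, group) records; the values are made up.
records = [(2, 1, "a"), (2, 1, "a"), (3, 0, "a"), (2, 1, "b"), (4, 1, "b")]
counts = event_counts(records, [2, 3, 4])
print(counts)
```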

plot_at_risk(legend=True, legend_kwargs=None, colors=None, palette=None, ax=None, **kwargs)

Plot the at-risk process.

Parameters:
legend : bool, optional

Indicates whether to display a legend for the plot.

legend_kwargs : dict, optional

Keyword parameters to pass to matplotlib.axes.Axes.legend().

colors : list or tuple or dict or str, optional

Colors for each group. This is ignored if palette is provided. This can be a sequence of valid matplotlib colors to cycle through, or a dictionary mapping group labels to matplotlib colors, or the name of a matplotlib colormap.

palette : str, optional

Name of a seaborn color palette. Requires seaborn to be installed. Setting a color palette overrides the colors parameter.

ax : matplotlib.axes.Axes, optional

The axes on which to plot. If this is not specified, the current axes will be used.

**kwargs : keyword arguments

Additional keyword arguments to pass to matplotlib.axes.Axes.step() when plotting the at-risk process.

Returns:
matplotlib.axes.Axes

The axes on which the plot was drawn.
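The at-risk process is a step function that rises at entry times and falls at observed times. The sketch below computes its change points and the value holding just after each one for a single group, which is the kind of data a step plot needs. It is a standalone illustration with made-up data; the treatment of ties at the change points is an assumption, and no plotting library is called.

```python
# (entry, time) records for one group; the values are made up.
records = [(0, 4), (1, 6), (0, 3), (2, 5)]

# The process can only change at an entry time or an observed time.
change_points = sorted({x for record in records for x in record})

# Value just after each change point c: entered by c and observed time after c.
steps = [
    (c, sum(1 for entry, time in records if entry <= c < time))
    for c in change_points
]
print(steps)
```

The resulting pairs could be unpacked into x and y sequences for a step plot that holds each value until the next change point.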

plot_lifetimes(legend=True, legend_kwargs=None, colors=None, palette=None, ax=None, **kwargs)

Plot the observed survival times.

Parameters:
legend : bool, optional

Indicates whether to display a legend for the plot.

legend_kwargs : dict, optional

Keyword parameters to pass to matplotlib.axes.Axes.legend().

colors : list or tuple or dict or str, optional

Colors for each group. This is ignored if palette is provided. This can be a sequence of valid matplotlib colors to cycle through, or a dictionary mapping group labels to matplotlib colors, or the name of a matplotlib colormap.

palette : str, optional

Name of a seaborn color palette. Requires seaborn to be installed. Setting a color palette overrides the colors parameter.

ax : matplotlib.axes.Axes, optional

The axes on which to plot. If this is not specified, the current axes will be used.

**kwargs : keyword arguments

Additional keyword arguments to pass to matplotlib.axes.Axes.plot() when plotting the lifetimes.

Returns:
matplotlib.axes.Axes

The axes on which the plot was drawn.

reset_format()

Restore string formatting defaults.

set_format(**kwargs)

Set string formatting options.

Parameters:
**kwargs : keyword arguments

Formatting options. Allowed arguments:

max_line_length : int

Specify the maximum length of a single line.

separator : str

Specify how to separate individual times.

censor_marker : str

String to mark censored times.

to_string(group=None, *, max_line_length=None, separator=None, censor_marker=None)

Get a string representation of the survival data within a group.

Parameters:
group : group label, optional

Specify a single group to represent. If no group is specified, then the entire sample is treated as one group.

max_line_length : int, optional

Specify the maximum length of a single line.

separator : str, optional

Specify how to separate individual times.

censor_marker : str, optional

String to mark censored times.

Returns:
str

String representation of the observed survival times within a group.
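The three formatting options interact as follows: each time becomes a token, censored times get the marker appended, tokens are joined with the separator, and a new line starts when the maximum length would be exceeded. The sketch below is a standalone approximation. The function `format_times`, its default values, the ordering of ties (events before censored times), and the wrapping rule are all assumptions, not the library's documented behavior.

```python
def format_times(times, status, max_line_length=80, separator=" ", censor_marker="+"):
    """Hypothetical helper: render sorted times, marking censored ones."""
    # Sort by time; at ties, list events (status 1) before censored times.
    pairs = sorted(zip(times, status), key=lambda p: (p[0], -p[1]))
    tokens = [f"{t}{'' if s == 1 else censor_marker}" for t, s in pairs]

    lines, current = [], ""
    for token in tokens:
        candidate = token if not current else current + separator + token
        if current and len(candidate) > max_line_length:
            # The token does not fit: close the current line and start a new one.
            lines.append(current)
            current = token
        else:
            current = candidate
    if current:
        lines.append(current)
    return "\n".join(lines)


text = format_times([5, 3, 8, 3, 12], [1, 0, 1, 1, 0], max_line_length=8)
print(text)
```

With `max_line_length=8` the output wraps after the fourth token; passing `separator=", "` and `censor_marker="*"` with the default length gives a single line.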