survive.SurvivalData
class survive.SurvivalData(time, *, status=None, entry=None, group=None, data=None, min_time=None, warn=True)
Class representing right-censored and left-truncated survival data.
Parameters: - time : array-like or str
The observed times. If the DataFrame parameter data is provided, this can be the name of a column in data from which to get the observed times. Otherwise this should be a one-dimensional array of positive numbers.
- status : array-like or str, optional
Censoring indicators. 0 means a right-censored observation, 1 means a true failure/event. If not provided, it is assumed that there is no censoring. If the DataFrame parameter data is provided, this can be the name of a column in data from which to get the censoring indicators. Otherwise this should be an array of 0’s and 1’s of the same shape as the array of observed times.
- entry : array-like or str, optional
Entry/birth times of the observations (for left-truncated data). If not provided, the entry time for each observation is set to 0. If the DataFrame parameter data is provided, this can be the name of a column in data from which to get the entry times. Otherwise this should be an array of non-negative numbers of the same shape as the array of observed times.
- group : array-like or str, optional
Group/stratum labels for each observation. If not provided, the entire sample is taken as a single group. If the DataFrame parameter data is provided, this can be the name of a column in data from which to get the group labels. Otherwise this should be an array of the same shape as the array of observed times.
- data : pandas.DataFrame, optional
Optional pandas.DataFrame from which to extract the data. If this parameter is specified, then the parameters time, status, entry, and group can be column names of this DataFrame.
- min_time : numeric, optional
The minimum observed time to consider part of the sample. This is for conditional inference. Observations with earlier observed event or censoring times are ignored. If not provided, all observations are used.
- warn : bool, optional
Indicates whether any warnings should be raised or ignored (e.g., if an individual’s entry time is later than that individual’s event time).
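The defaults and the min_time filter described above can be illustrated with a small pure-Python sketch. This is not the library's implementation; the helper name prepare is hypothetical, and it only mirrors the documented behavior (no censoring when status is omitted, entry times of 0 when entry is omitted, and observations with observed times earlier than min_time dropped).

```python
def prepare(time, status=None, entry=None, min_time=None):
    """Illustrative only: apply the documented defaults and the min_time filter."""
    n = len(time)
    status = [1] * n if status is None else list(status)  # default: no censoring
    entry = [0] * n if entry is None else list(entry)     # default: everyone enters at 0
    rows = list(zip(time, status, entry))
    if min_time is not None:
        # Ignore observations whose observed time is earlier than min_time.
        rows = [row for row in rows if row[0] >= min_time]
    return rows


rows = prepare([5, 2, 8], status=[1, 0, 1], min_time=3)
print(rows)  # [(5, 1, 0), (8, 1, 0)]
```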
Attributes: - time : numpy.ndarray
Each observed time.
- status : numpy.ndarray
Event indicators for each observed time. 1 indicates an event, 0 indicates censoring.
- entry : numpy.ndarray
Entry times of the observations (for left truncation).
- group : numpy.ndarray
Label for each observation’s group/stratum within the sample.
- group_labels : numpy.ndarray
Array of the distinct group labels in the sample.
- n_groups : int
The number of distinct groups in the sample.
- events : dict
Mapping of group labels to DataFrames with columns:
- time
Distinct event times for that group.
- n_events
Number of events at each event time.
- n_at_risk
Number of individuals at risk at each event time.
- censor : dict
Mapping of group labels to DataFrames with columns:
- time
Distinct censoring times for that group.
- n_censor
Number of individuals censored at each censored time.
- n_at_risk
Number of individuals at risk at each censored time.
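To make the shape of the events tables concrete, here is a minimal sketch that builds the (time, n_events, n_at_risk) rows for a single group in plain Python. The function name event_table is hypothetical, and the risk-set convention (an individual is at risk at time t if entry < t <= observed time) is an assumption commonly used for left-truncated data, not something this page states.

```python
from collections import Counter


def event_table(time, status, entry):
    """Rows of (time, n_events, n_at_risk) at each distinct event time."""
    # Count events (status == 1) at each distinct observed time.
    n_events = Counter(t for t, d in zip(time, status) if d == 1)
    table = []
    for t in sorted(n_events):
        # Assumed convention: at risk at t if entered before t and not yet
        # removed by an event or censoring before t.
        at_risk = sum(1 for e, x in zip(entry, time) if e < t <= x)
        table.append((t, n_events[t], at_risk))
    return table


time = [2, 3, 3, 5, 7]
status = [1, 1, 0, 1, 0]
entry = [0, 0, 1, 4, 0]
print(event_table(time, status, entry))  # [(2, 1, 4), (3, 1, 3), (5, 1, 2)]
```

The same loop over the censored observations (status == 0) would give the analogous censor table with an n_censor column.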
Methods
n_at_risk(time) Get the number of individuals at risk (i.e., entered but yet to undergo an event or censoring) at the given times.
n_events(time) Get the number of events at the given times.
plot_at_risk([legend, legend_kwargs, …]) Plot the at-risk process.
plot_lifetimes([legend, legend_kwargs, …]) Plot the observed survival times.
reset_format() Restore string formatting defaults.
set_format(**kwargs) Set string formatting options.
to_string([group, max_line_length, …]) Get a string representation of the survival data within a group.
-
describe
Get a DataFrame with descriptive statistics about the survival data.
Returns: - pandas.DataFrame
A DataFrame with a row for every group. The columns are
- total
The total number of observations within a group.
- events
The number of events within a group.
- censored
The number of censored observations within a group.
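The three columns can be reproduced by hand from the status and group arrays. The sketch below uses a hypothetical helper, describe_counts, and returns a plain dictionary rather than a DataFrame; it is meant only to show what each column counts.

```python
from collections import defaultdict


def describe_counts(status, group):
    """Per-group counts of observations, events, and censored observations."""
    counts = defaultdict(lambda: {"total": 0, "events": 0, "censored": 0})
    for d, g in zip(status, group):
        row = counts[g]
        row["total"] += 1
        # status 1 is an event, status 0 is a right-censored observation.
        row["events" if d == 1 else "censored"] += 1
    return dict(counts)


status = [1, 0, 1, 1, 0]
group = ["a", "a", "b", "b", "b"]
print(describe_counts(status, group))
# {'a': {'total': 2, 'events': 1, 'censored': 1}, 'b': {'total': 3, 'events': 2, 'censored': 1}}
```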
-
n_at_risk(time)
Get the number of individuals at risk (i.e., entered but yet to undergo an event or censoring) at the given times.
Parameters: - time : float or array-like
Times at which to report the risk set sizes.
Returns: - pandas.DataFrame
Number of individuals at risk at the given times within each group. The rows are indexed by the times in time, and the columns are indexed by group.
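As a rough illustration of the returned layout (rows indexed by time, columns by group), the following sketch computes risk set sizes with nested dictionaries. The function name risk_set_sizes is hypothetical, and the convention entry < t <= observed time is an assumption; the library may treat boundary cases differently.

```python
def risk_set_sizes(query_times, time, entry, group):
    """Map each query time to {group label: number at risk}."""
    labels = sorted(set(group))
    table = {}
    for t in query_times:
        table[t] = {
            # Assumed convention: at risk at t if entry < t <= observed time.
            g: sum(
                1
                for x, e, lab in zip(time, entry, group)
                if lab == g and e < t <= x
            )
            for g in labels
        }
    return table


time = [2, 3, 3, 5, 7]
entry = [0, 0, 1, 4, 0]
group = ["a", "a", "b", "b", "b"]
print(risk_set_sizes([1, 4], time, entry, group))
# {1: {'a': 2, 'b': 1}, 4: {'a': 0, 'b': 1}}
```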
-
n_events(time)
Get the number of events at the given times.
Parameters: - time : float or array-like
Times at which to report the numbers of events.
Returns: - pandas.DataFrame
Number of events at the given times within each group. The rows are indexed by the times in time, and the columns are indexed by group.
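A comparable sketch for event counts is shown here. The helper name event_counts is hypothetical, and the result is a nested dictionary standing in for the time-by-group DataFrame; times at which no event occurred simply get a count of 0.

```python
def event_counts(query_times, time, status, group):
    """Map each query time to {group label: number of events at that time}."""
    labels = sorted(set(group))
    return {
        t: {
            g: sum(
                1
                for x, d, lab in zip(time, status, group)
                # Only true events (status == 1) observed exactly at t count.
                if lab == g and d == 1 and x == t
            )
            for g in labels
        }
        for t in query_times
    }


time = [2, 3, 3, 5, 7]
status = [1, 1, 0, 1, 0]
group = ["a", "a", "b", "b", "b"]
print(event_counts([3, 4], time, status, group))
# {3: {'a': 1, 'b': 0}, 4: {'a': 0, 'b': 0}}
```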
-
plot_at_risk(legend=True, legend_kwargs=None, colors=None, palette=None, ax=None, **kwargs)
Plot the at-risk process.
Parameters: - legend : bool, optional
Indicates whether to display a legend for the plot.
- legend_kwargs : dict, optional
Keyword parameters to pass to matplotlib.axes.Axes.legend().
- colors : list or tuple or dict or str, optional
Colors for each group. This is ignored if palette is provided. This can be a sequence of valid matplotlib colors to cycle through, or a dictionary mapping group labels to matplotlib colors, or the name of a matplotlib colormap.
- palette : str, optional
Name of a seaborn color palette. Requires seaborn to be installed. Setting a color palette overrides the colors parameter.
- ax : matplotlib.axes.Axes, optional
The axes on which to plot. If this is not specified, the current axes will be used.
- **kwargs : keyword arguments
Additional keyword arguments to pass to matplotlib.axes.Axes.step() when plotting the at-risk process.
Returns: - matplotlib.axes.Axes
The axes on which the plot was drawn.
-
plot_lifetimes(legend=True, legend_kwargs=None, colors=None, palette=None, ax=None, **kwargs)
Plot the observed survival times.
Parameters: - legend : bool, optional
Indicates whether to display a legend for the plot.
- legend_kwargs : dict, optional
Keyword parameters to pass to matplotlib.axes.Axes.legend().
- colors : list or tuple or dict or str, optional
Colors for each group. This is ignored if palette is provided. This can be a sequence of valid matplotlib colors to cycle through, or a dictionary mapping group labels to matplotlib colors, or the name of a matplotlib colormap.
- palette : str, optional
Name of a seaborn color palette. Requires seaborn to be installed. Setting a color palette overrides the colors parameter.
- ax : matplotlib.axes.Axes, optional
The axes on which to plot. If this is not specified, the current axes will be used.
- **kwargs : keyword arguments
Additional keyword arguments to pass to matplotlib.axes.Axes.plot() when plotting the lifetimes.
Returns: - matplotlib.axes.Axes
The axes on which the plot was drawn.
-
set_format(**kwargs)
Set string formatting options.
Parameters: - **kwargs : keyword arguments
Formatting options. Allowed arguments:
- max_line_length : int
Specify the maximum length of a single line.
- separator : str
Specify how to separate individual times.
- censor_marker : str
String to mark censored times.
-
to_string(group=None, *, max_line_length=None, separator=None, censor_marker=None)
Get a string representation of the survival data within a group.
Parameters: - group : group label, optional
Specify a single group to represent. If no group is specified, then the entire sample is treated as one group.
- max_line_length : int, optional
Specify the maximum length of a single line.
- separator : str, optional
Specify how to separate individual times.
- censor_marker : str, optional
String to mark censored times.
Returns: - str
String representation of the observed survival times within a group.
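The interplay of the three formatting options can be sketched as follows. This is an illustrative stand-in, not the library's formatter: the function name format_times, the default values (a "+" marker, a single-space separator, a 20-character line), and the ordering of ties (events before censored times) are all assumptions made for the example.

```python
def format_times(time, status, max_line_length=20, separator=" ", censor_marker="+"):
    """Render sorted observed times, marking censored ones, wrapped to a line length."""
    # Sort by time; at ties, list events (status 1) before censored times.
    ordered = sorted(zip(time, status), key=lambda row: (row[0], -row[1]))
    tokens = [f"{t}{'' if d == 1 else censor_marker}" for t, d in ordered]

    lines, current = [], ""
    for token in tokens:
        candidate = token if not current else current + separator + token
        if current and len(candidate) > max_line_length:
            # The token does not fit, so start a new line with it.
            lines.append(current)
            current = token
        else:
            current = candidate
    if current:
        lines.append(current)
    return "\n".join(lines)


print(format_times([5, 2, 8, 3], [1, 0, 1, 1]))
# 2+ 3 5 8
```

With max_line_length=4, the same data wraps onto two lines, "2+ 3" and "5 8".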