The linear_model module
Module contents
Logistic Regression
- class LogisticRegression(X, y, penalty, dual, tol, C, fit_intercept, intercept_scaling, class_weight, random_state, solver, max_iter, multi_class, verbose, warm_start, n_jobs, l1_ratio)
This class implements a logistic regression model. It inherits from sklearn.linear_model.LogisticRegression and adds methods for calculating confidence intervals, p-values, and model summaries.
- __init__(penalty, dual, tol, C, fit_intercept, intercept_scaling, class_weight, random_state, solver, max_iter, multi_class, verbose, warm_start, n_jobs, l1_ratio)
- Parameters:
X (Union[DataFrame, ndarray, None]) – A Pandas DataFrame or a NumPy array containing the model predictors.
y (Union[Series, ndarray, None]) – A Pandas Series or a NumPy array containing the model response.
penalty (Literal['l1', 'l2', 'elasticnet']) – The type of penalty to use. Can be one of "none" (default), "l1", "l2", or "elasticnet".
dual (bool) – Whether to use the dual formulation of the problem.
tol (float) – The tolerance for convergence.
C (int) – The inverse of the regularization strength; smaller values mean stronger regularization.
fit_intercept (bool) – Whether to fit an intercept term.
intercept_scaling (int) – The scaling factor for the intercept term.
class_weight (Union[None, str, dict]) – None (default), “balanced” or a dictionary that maps class labels to weights.
random_state (int) – The random seed.
solver (Literal['lbfgs', 'liblinear', 'newton-cg', 'newton-cholesky', 'sag', 'saga']) – The solver to use. Can be one of "lbfgs" (default), "liblinear", "newton-cg", "newton-cholesky", "sag", or "saga".
max_iter (int) – The maximum number of iterations.
multi_class (Literal['auto', 'ovr', 'multinomial']) – The type of multi-class classification to use. Can be one of "auto", "ovr", or "multinomial".
verbose (int) – The verbosity level.
warm_start (bool) – Whether to reuse the solution of the previous call to fit as the initialization.
n_jobs (int) – The number of jobs to use for parallel processing.
l1_ratio (Union[float, None]) – The l1_ratio parameter for elasticnet regularization.
- fit()
Fits the model to the data.
- predict(new_data: DataFrame)
Predicts the class labels for new data.
- conf_int(conf_level=0.95)
Calculates the confidence intervals for the model coefficients.
- se()
Calculates the standard errors for the model coefficients.
- z_values()
Calculates the z-scores for the model coefficients.
- p_values()
Calculates the p-values for the model coefficients.
- summary(conf_level=0.95)
Prints a summary of the model.
- from_formula(formula, data)
Class method to create an instance from a formula.
- params
Returns the estimated values for model parameters.
- aic
Calculates the Akaike information criterion (AIC) for the model.
- bic
Calculates the Bayesian information criterion (BIC) for the model.
- cov_matrix
Returns the covariance matrix for model parameters.
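The quantities behind se(), cov_matrix, aic and bic have standard textbook definitions for an unpenalized logistic regression. The following is a minimal sketch of those definitions using NumPy only. It is an assumption about what these members compute, not a copy of the library's implementation, and the helper name logit_inference is hypothetical.

```python
import numpy as np


def logit_inference(X, y, beta):
    """Textbook inference quantities for a logistic model with coefficients beta.

    X must already contain a column of ones if an intercept is wanted.
    Hypothetical helper for illustration; not part of estyp.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.asarray(beta, dtype=float)
    n, k = X.shape

    eta = X @ beta                      # linear predictor
    p = 1.0 / (1.0 + np.exp(-eta))      # fitted probabilities

    # Fisher information X' W X with W = diag(p * (1 - p)).
    fisher = X.T @ (X * (p * (1.0 - p))[:, None])
    cov = np.linalg.inv(fisher)         # asymptotic covariance of the coefficients
    se = np.sqrt(np.diag(cov))          # standard errors

    # Bernoulli log-likelihood in a numerically stable form.
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    aic = 2 * k - 2 * loglik            # Akaike information criterion
    bic = k * np.log(n) - 2 * loglik    # Bayesian information criterion
    return {"cov": cov, "se": se, "loglik": loglik, "aic": aic, "bic": bic}


# Intercept-only model on 100 balanced observations: the MLE is beta = 0.
X = np.ones((100, 1))
y = np.array([0, 1] * 50)
result = logit_inference(X, y, beta=[0.0])
print(result["se"], result["aic"], result["bic"])
```

In this intercept-only case every fitted probability is 0.5, so the information is 100 × 0.25 = 25 and the standard error is the square root of 1/25, that is 0.2. These formulas apply to an unpenalized fit; with a penalty the coefficients are no longer maximum-likelihood estimates and the interpretation changes.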
Examples
import numpy as np
import pandas as pd
from estyp.linear_model import LogisticRegression

np.random.seed(123)
data = pd.DataFrame({
    "y": np.random.randint(2, size=100),
    "x1": np.random.uniform(-1, 1, size=100),
    "x2": np.random.uniform(-1, 1, size=100),
})

formula = "y ~ x1 + x2"
spec = LogisticRegression.from_formula(formula, data)
model = spec.fit()
print(model.summary())
Made by Esteban Rucán. Contact me in LinkedIn: https://www.linkedin.com/in/estebanrucan/
           Estimate      S.E.         z  Pr(>|z|)   [Lower,    Upper]
Intercept -0.200864  0.202894 -0.989996  0.322176 -0.598530  0.196801
x1         0.032006  0.375254  0.085292  0.932030 -0.703478  0.767490
x2         0.438665  0.344263  1.274215  0.202587 -0.236078  1.113407
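The columns of the summary output are related in the way standard Wald statistics are: z is Estimate divided by S.E., Pr(>|z|) is the two-sided normal tail probability, and the interval is Estimate plus or minus a normal quantile times S.E. The sketch below recomputes the x2 row from its Estimate and S.E. using only the Python standard library. It assumes this is how the columns are derived, which is consistent with the printed numbers but not confirmed by the library source, and wald_row is a hypothetical helper name.

```python
from statistics import NormalDist


def wald_row(estimate, se, conf_level=0.95):
    """Return (z, p_value, lower, upper) for one coefficient."""
    std_normal = NormalDist()
    z = estimate / se
    # Two-sided p-value: probability that a standard normal exceeds |z| in either tail.
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    # Normal quantile for the requested confidence level (about 1.96 for 0.95).
    z_crit = std_normal.inv_cdf(0.5 + conf_level / 2.0)
    return z, p_value, estimate - z_crit * se, estimate + z_crit * se


# Estimate and S.E. of x2, copied from the summary output.
z, p_value, lower, upper = wald_row(0.438665, 0.344263)
print(f"{z:.6f} {p_value:.6f} {lower:.6f} {upper:.6f}")
```

The recomputed values agree with the x2 row up to rounding of the six-decimal inputs. Passing a different conf_level changes only the interval bounds, not z or the p-value.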