Logistic regression: Difference between revisions

Latest revision as of 11:33, 4 November 2024 (Mon)

Redirect to:

Logistic Regression

@@ Line 1: / Line 1: @@
-'''Logistic Regression''' is a statistical and machine learning algorithm used for binary classification tasks, where the output variable is categorical and typically represents two classes (e.g., yes/no, spam/not spam, fraud/not fraud). Despite its name, Logistic Regression is typically used as a classification algorithm: it estimates the probability of class membership and assigns a class by thresholding that probability, rather than predicting an unbounded continuous value.
+#REDIRECT [[Logistic Regression]]
-== How It Works ==
-Logistic Regression models the probability of a binary outcome using the logistic function, also known as the sigmoid function. The sigmoid function maps any real-valued input to a value between 0 and 1, which is interpreted as the probability of belonging to a particular class. The model predicts the probability that the input belongs to the positive class (1) and classifies it by applying a threshold, often 0.5.
-The logistic function is represented by:
-P(y=1 | X) = 1 / (1 + e<sup>-(b0 + b1X1 + b2X2 + ... + bnXn)</sup>)
-where:
-* '''P(y=1 | X)''' is the probability of the output being 1 given the input features.
-* '''X1, X2, ..., Xn''' are the input features.
-* '''b0''' is the intercept, and '''b1, b2, ..., bn''' are the coefficients of the features.
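The formula and threshold rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `predict_proba` and `classify`, and the example intercept and coefficients, are assumptions made up for this sketch.

```python
import math

def predict_proba(features, intercept, coefs):
    """Return P(y=1 | X) for one observation using the logistic function."""
    # Linear predictor: b0 + b1*X1 + ... + bn*Xn
    z = intercept + sum(b * x for b, x in zip(coefs, features))
    # Two algebraically equal forms, chosen to avoid overflow in exp()
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))

def classify(features, intercept, coefs, threshold=0.5):
    """Assign class 1 when the predicted probability reaches the threshold."""
    return 1 if predict_proba(features, intercept, coefs) >= threshold else 0

# Hypothetical coefficients, chosen only for illustration
p = predict_proba([2.0, 1.0], intercept=-1.0, coefs=[0.5, 0.25])
label = classify([2.0, 1.0], intercept=-1.0, coefs=[0.5, 0.25])
```

Here the linear predictor is -1.0 + 0.5*2.0 + 0.25*1.0 = 0.25, so the probability is slightly above 0.5 and the observation is assigned to class 1.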
-== Types of Logistic Regression ==
-* '''Binary Logistic Regression''': Used for binary classification with two possible outcomes (e.g., yes/no).
-* '''Multinomial Logistic Regression''': Used when the outcome variable has more than two categories without any ordering (e.g., classifying types of animals).
-* '''Ordinal Logistic Regression''': Used when the outcome variable has ordered categories (e.g., ranking levels from low to high).
-== Applications of Logistic Regression ==
-Logistic Regression is widely used across industries due to its simplicity, interpretability, and effectiveness in binary classification tasks:
-* '''Healthcare''': Predicting disease outcomes, risk assessments, and patient survival chances.
-* '''Finance''': Credit scoring, fraud detection, and risk analysis.
-* '''Marketing''': Customer churn prediction, targeting potential buyers, and lead qualification.
-* '''Social Sciences''': Survey analysis, where responses fall into categories like agree/disagree or support/oppose.
-== Key Metrics for Evaluating Logistic Regression ==
-To assess the performance of a Logistic Regression model, common metrics include:
-* '''[[Accuracy]]''': The proportion of correct predictions.
-* '''[[Precision]]''': The ratio of true positive predictions to all positive predictions.
-* '''[[Recall]]''': The ratio of true positive predictions to all actual positives.
-* '''[[F1 Score]]''': The harmonic mean of precision and recall, useful when dealing with imbalanced data.
-* '''[[AUC]]-[[ROC Curve]]''': Measures the model’s ability to distinguish between classes, where a higher Area Under the Curve (AUC) indicates better performance.
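As a rough sketch of how the metrics listed above are computed, the following Python derives them from true labels, predicted labels, and predicted scores. The helper names and the toy data are assumptions for illustration only; the AUC here uses the pairwise-ranking definition (the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half).

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data, invented for this example
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.8, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold of 0.5

m = classification_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

On this toy data there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so accuracy, precision, recall, and F1 all equal 0.75, while 15 of the 16 positive/negative pairs are ranked correctly.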
-== Assumptions of Logistic Regression ==
-Logistic Regression relies on several assumptions for accurate results:
-# '''Linearity of Independent Variables and Log-Odds''': Assumes a linear relationship between the log-odds of the outcome and the independent variables.
-# '''Independence of Observations''': Observations should be independent of each other; dependent observations (e.g., repeated measurements on the same subject) can make the estimated uncertainty of the coefficients misleading.
-# '''No Multicollinearity''': Independent variables should not be highly correlated with each other, which can be checked using the Variance Inflation Factor (VIF).
-# '''Sufficient Sample Size''': Logistic Regression requires a large enough sample size, with enough observations in each outcome class and in each level of any categorical variable, to produce stable coefficient estimates.
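To illustrate the multicollinearity check, the sketch below computes the VIF for the special case of exactly two predictors, where the R² from regressing one predictor on the other equals their squared Pearson correlation, so VIF = 1 / (1 - r²). The function names and data are assumptions for this example; with more than two predictors, each VIF requires a full regression of one predictor on all the others.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def vif_two_predictors(a, b):
    """VIF shared by both predictors when the model has only these two."""
    r_squared = pearson_r(a, b) ** 2
    if r_squared >= 1.0:
        return float("inf")  # perfectly collinear predictors
    return 1.0 / (1.0 - r_squared)

# Toy predictors, invented for this example
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(x1, x2)
vif = vif_two_predictors(x1, x2)
```

For this data the correlation is 0.8, giving a VIF of about 2.78. Which VIF value counts as problematic is a judgment call that depends on the field and the purpose of the model.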
-== Handling Limitations ==
-Logistic Regression may not perform well if the relationship between the features and the log-odds of the outcome is highly non-linear. In such cases, feature transformations, polynomial features, or a more flexible model such as Decision Trees or Neural Networks can be considered.
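The effect of polynomial features can be sketched with a small experiment. The data, learning rate, and step count below are assumptions chosen for illustration, and the plain batch gradient descent shown is only one simple way to fit the model. The labels are 1 when the input is far from zero on either side, so no single threshold on x separates the classes, but a threshold on x² does.

```python
import math

def sigmoid(z):
    # Overflow-safe logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))

def linear_predictor(weights, row):
    # weights[0] is the intercept; the rest pair up with the features
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))

def fit(rows, labels, lr=0.1, steps=5000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for row, y in zip(rows, labels):
            err = sigmoid(linear_predictor(weights, row)) - y
            grads[0] += err
            for j, x in enumerate(row):
                grads[j + 1] += err * x
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grads)]
    return weights

def accuracy(weights, rows, labels):
    hits = sum(
        1 for row, y in zip(rows, labels)
        if (sigmoid(linear_predictor(weights, row)) >= 0.5) == (y == 1)
    )
    return hits / len(rows)

# Toy data: positive class lies on both sides of the negative class
xs = [-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0]
labels = [1, 1, 0, 0, 0, 1, 1]

linear_rows = [[x] for x in xs]        # feature: x only
poly_rows = [[x, x * x] for x in xs]   # features: x and x squared

acc_linear = accuracy(fit(linear_rows, labels), linear_rows, labels)
acc_poly = accuracy(fit(poly_rows, labels), poly_rows, labels)
```

With only x as a feature the model cannot classify every point correctly, whereas adding x² makes the classes linearly separable in the expanded feature space, so the same algorithm fits the training data perfectly.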
-== See Also ==
-* [[Linear Regression]]
-* [[Support Vector Machine]]
-* [[K-Nearest Neighbor]]
-* [[Decision Tree]]
-* [[Naive Bayes]]
