Underfitting
Underfitting is a common issue in machine learning where a model is too simple to capture the underlying patterns in the data. As a result, the model performs poorly on both the training and test datasets, failing to achieve high accuracy. Underfitting occurs when the model lacks the capacity or complexity needed to represent the relationships within the data.

==Causes of Underfitting==
Several factors contribute to underfitting in machine learning models:
*'''Over-Simplified Model''': Models with too few parameters or too little complexity, such as linear regression applied to highly nonlinear data, may be unable to capture complex patterns.
*'''Insufficient Training Time''': Models, particularly neural networks, may underfit if they are not trained for enough epochs to learn the data's patterns.
*'''Inadequate Feature Representation''': When important features are missing or irrelevant features are present, the model may struggle to learn.
*'''High Regularization''': Excessive regularization can simplify the model too much, reducing its ability to fit the data properly.

==Signs of Underfitting==
There are several indicators that a model might be underfitting (the sketch after this list shows the first of them on synthetic data):
*'''Low Accuracy on Training and Test Data''': The model performs poorly on both the training set and new data, indicating it has not learned the underlying relationships.
*'''High Bias''': The model makes systematic errors, often resulting in predictions that deviate consistently from the target.
*'''Simple Decision Boundaries''': In models like decision trees, overly simplistic boundaries suggest the model has not captured the complexity of the data.
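The following minimal sketch illustrates the first sign above: a plain linear model fit to a quadratic pattern scores poorly on the data it was trained on, not just on held-out data. The synthetic dataset, model choice, and seed here are illustrative assumptions, not anything prescribed by this article.

<syntaxhighlight lang="python">
# Minimal sketch: a linear model fit to a quadratic pattern.
# The telltale sign of underfitting is a poor score on the training
# data itself, not just on held-out data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.5, size=200)  # nonlinear target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LinearRegression().fit(X_train, y_train)

# Both R^2 scores come out near zero: a straight line cannot represent
# x^2 on a symmetric interval, so the model fails on its own training
# data as well as on new data.
print(f"train R^2: {model.score(X_train, y_train):.2f}")
print(f"test  R^2: {model.score(X_test, y_test):.2f}")
</syntaxhighlight>

When both scores are low together like this, the failure points to bias (insufficient capacity) rather than to noise or bad luck in the split.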
==Techniques to Avoid Underfitting==
Various methods are available to mitigate or prevent underfitting in machine learning (the sketch after this list applies two of them):
*'''Increase Model Complexity''': Choose a more complex model, such as moving from linear regression to polynomial regression or adding layers to a neural network.
*'''Feature Engineering''': Add new, relevant features or transform existing ones to provide more information for the model.
*'''Reduce Regularization''': Lowering regularization strength (e.g., the L1 or L2 penalty) allows the model to learn more complex patterns.
*'''Longer Training Duration''': In neural networks, train the model for additional epochs to allow it to learn from the data.
*'''Parameter Tuning''': Optimize hyperparameters to increase model capacity, such as increasing tree depth in decision trees or adjusting learning rates in neural networks.
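As a hedged illustration of the first and third techniques, the sketch below fits a ridge model to data with a sine-shaped pattern while varying both the polynomial feature degree and the regularization strength <code>alpha</code>. The dataset and the specific parameter values are assumptions chosen only to make the contrast visible.

<syntaxhighlight lang="python">
# Sketch: curing underfitting by adding capacity (polynomial features)
# and by weakening an excessive regularization penalty (ridge alpha).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(1)
X = rng.uniform(-4, 4, size=(300, 1))
y = 3 * np.sin(X[:, 0]) + rng.normal(scale=0.3, size=300)  # wavy target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

def test_r2(degree, alpha):
    """Fit polynomial features -> scaling -> ridge, return test R^2."""
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        StandardScaler(),
        Ridge(alpha=alpha),
    )
    return model.fit(X_tr, y_tr).score(X_te, y_te)

# degree=1            : underfits for lack of capacity.
# alpha=10_000        : underfits because the penalty crushes the weights.
# degree=5, alpha=0.1 : flexible enough to track the sine-shaped pattern.
for degree, alpha in [(1, 0.1), (5, 10_000.0), (5, 0.1)]:
    print(f"degree={degree} alpha={alpha:>8}: test R^2 = {test_r2(degree, alpha):.2f}")
</syntaxhighlight>

The scaler is included so that the ridge penalty acts evenly on all polynomial terms; without it, high-degree features with large raw magnitudes can partially escape the penalty and blur the comparison.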
==Examples of Underfitting-Prone Algorithms==
Some algorithms are more likely to underfit if not properly tuned (the last case in this list is sketched below):
*'''Linear Regression''': Often underfits nonlinear data due to its simplicity.
*'''Decision Trees with Shallow Depth''': Trees with very few splits may fail to capture complex relationships.
*'''Naïve Bayes''': Due to its independence assumption, it may struggle with data that has dependent features.
*'''k-Nearest Neighbors (kNN) with Large k''': High values of k can lead to overly smooth decision boundaries, missing finer details in the data.
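A small, hedged sketch of the kNN case: as k approaches the size of the training folds, every prediction collapses toward a majority vote over most of the dataset, and cross-validated accuracy on the two-moons toy data drifts toward chance. The dataset and the grid of k values are illustrative choices.

<syntaxhighlight lang="python">
# Sketch: cross-validated accuracy of kNN as k grows on a toy dataset.
# Very large k averages over most of the data, smoothing the decision
# boundary until the classifier underfits.
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=400, noise=0.25, random_state=0)

for k in (1, 5, 25, 100, 250):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    print(f"k={k:3d}  mean 5-fold accuracy: {scores.mean():.2f}")
</syntaxhighlight>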
==Consequences of Underfitting==
Underfitting has several consequences for model performance and usability:
*'''Poor Predictive Accuracy''': The model's low accuracy on both training and test data makes it unsuitable for practical applications.
*'''High Bias''': Underfitted models often exhibit high bias, meaning they systematically fail to capture the relationships in the data.
*'''Lack of Generalization''': An underfit model fails to generalize, providing inaccurate predictions on unseen data.

==Related Concepts==
Understanding underfitting requires familiarity with related concepts:
*'''Overfitting''': The opposite problem, where a model is too complex and learns noise or specific patterns in the training data.
*'''Bias-Variance Tradeoff''': The balance between bias (error due to overly simplistic models) and variance (error due to overly complex models).
*'''Regularization''': Techniques to control model complexity, which, if excessive, can lead to underfitting.
*'''Cross-Validation''': A technique for evaluating model performance on unseen data, helping detect underfitting or overfitting; the sketch after this list contrasts the two.
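To make the bias-variance contrast concrete, the hedged sketch below sweeps the depth of a decision tree and compares training scores against cross-validated scores: when both are low together the model is underfitting, and when the training score saturates while pulling away from the cross-validated score the model is overfitting. The dataset and depth grid are assumptions for illustration.

<syntaxhighlight lang="python">
# Sketch: training vs cross-validated accuracy across tree depths.
# Low train AND low CV score -> underfitting (high bias).
# High train, lower CV score -> overfitting (high variance).
from sklearn.datasets import make_moons
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
depths = [1, 2, 4, 8, 16]

train_scores, cv_scores = validation_curve(
    DecisionTreeClassifier(random_state=0), X, y,
    param_name="max_depth", param_range=depths, cv=5,
)

for d, tr, cv in zip(depths, train_scores.mean(axis=1), cv_scores.mean(axis=1)):
    print(f"max_depth={d:2d}  train={tr:.2f}  cv={cv:.2f}")
</syntaxhighlight>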
==See Also==
*[[Overfitting]]
*[[Bias-Variance Tradeoff]]
*[[Regularization]]
*[[Cross-Validation]]
*[[Feature Engineering]]
*[[Linear Regression]]

[[Category:Data Science]]
[[Category:Artificial Intelligence]]