'''Principal Component Analysis (PCA)''' is a statistical technique used for dimensionality reduction by transforming a dataset into a new coordinate system. The transformation emphasizes the directions (principal components) that maximize the variance in the data, helping to reduce the number of features while preserving essential information.

==Key Concepts==
*'''Principal Components:''' New orthogonal axes computed as linear combinations of the original features. The first principal component captures the maximum variance; each subsequent component captures the largest remaining variance.
*'''Explained Variance:''' The proportion of the total variance captured by each principal component.
*'''Orthogonality:''' Principal components are mutually perpendicular, so they carry no redundant information.

==Steps in PCA==
#'''Standardize the Data:''' Center the data by subtracting the mean of each feature, and scale features to unit variance if they are measured on different scales.
#'''Compute the Covariance Matrix:''' Calculate the covariance matrix of the dataset to capture the pairwise relationships between features.
#'''Calculate Eigenvectors and Eigenvalues:''' Find the eigenvectors and eigenvalues of the covariance matrix; the eigenvectors define the principal components and the eigenvalues give each component's variance contribution.
#'''Select Principal Components:''' Retain the top k principal components that explain the majority of the variance.
#'''Transform the Data:''' Project the original data onto the new feature space defined by the selected principal components.
A NumPy sketch that carries out these five steps directly is given in the worked example near the end of this article.

==Applications of PCA==
PCA is widely used in various fields for the following purposes:
*'''Dimensionality Reduction:''' Reducing the number of features in datasets for efficient processing.
*'''Noise Reduction:''' Removing irrelevant or noisy dimensions to improve data quality.
*'''Data Visualization:''' Visualizing high-dimensional data in 2D or 3D for better interpretability.
*'''Feature Extraction:''' Creating new features that summarize the original dataset effectively.
*'''Anomaly Detection:''' Highlighting deviations by focusing on the dominant patterns in the data.

===Example===
Performing PCA using Python's scikit-learn library:
<syntaxhighlight lang="python">
from sklearn.decomposition import PCA
import numpy as np

# Example dataset
data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])

# Apply PCA to reduce dimensions to 1
pca = PCA(n_components=1)
reduced_data = pca.fit_transform(data)

print("Reduced Data:", reduced_data)
print("Explained Variance Ratio:", pca.explained_variance_ratio_)
</syntaxhighlight>

==Advantages==
*'''Dimensionality Reduction:''' Simplifies complex datasets while preserving essential information.
*'''Noise Reduction:''' Discards low-variance components that often correspond to noise, which can improve model accuracy.
*'''Efficient Data Representation:''' Reduces computation time and storage requirements.

==Limitations==
*'''Loss of Interpretability:''' Principal components are linear combinations of the original features, which makes them harder to interpret.
*'''Assumption of Linearity:''' PCA assumes that the data's variance is best captured along linear directions, which may not hold for all datasets.
*'''Sensitive to Scaling:''' PCA results change if the features are not on comparable scales, so the data usually needs to be standardized first (see the scaling example below).

==Relation to SVD==
PCA is closely related to [[Singular Value Decomposition (SVD)]]. For a centered data matrix whose rows are the n samples:
*The principal components are the eigenvectors of the covariance matrix, which correspond to the right singular vectors of the data matrix in SVD.
*The eigenvalues of the covariance matrix equal the squared singular values divided by (n - 1).
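===Numerical Check===
This correspondence can be verified numerically. The sketch below is illustrative (it reuses the small dataset from the scikit-learn example above) and compares the eigendecomposition of the covariance matrix with the SVD of the centered data matrix:
<syntaxhighlight lang="python">
import numpy as np

# Same small dataset as in the example above
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
n = X.shape[0]

# Center the data
Xc = X - X.mean(axis=0)

# Eigendecomposition of the covariance matrix (eigh sorts eigenvalues ascending)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))

# SVD of the centered data matrix
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Squared singular values divided by (n - 1) match the covariance eigenvalues
print(np.allclose(np.sort(S**2 / (n - 1)), eigvals))  # True

# The top right singular vector matches the top eigenvector (up to sign)
print(np.allclose(np.abs(Vt[0]), np.abs(eigvecs[:, -1])))  # True
</syntaxhighlight>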
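==Worked Example: PCA Step by Step==
The five steps listed above can be carried out directly with NumPy. This is a minimal sketch for illustration rather than a production implementation; full standardization is skipped because both features of the example dataset are already on comparable scales:
<syntaxhighlight lang="python">
import numpy as np

data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])

# Step 1: center the data (scaling to unit variance is unnecessary here)
centered = data - data.mean(axis=0)

# Step 2: covariance matrix of the features
cov = np.cov(centered, rowvar=False)

# Step 3: eigenvectors and eigenvalues of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)

# Step 4: select the top k = 1 component (largest eigenvalue)
order = np.argsort(eigvals)[::-1]
components = eigvecs[:, order[:1]]

# Step 5: project the data onto the selected component
reduced = centered @ components

print("Reduced Data:\n", reduced)
print("Explained Variance Ratio:", eigvals[order[0]] / eigvals.sum())
</syntaxhighlight>
Up to a possible sign flip of the component, the output matches the scikit-learn example above.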
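==Example: Effect of Feature Scaling==
As noted under Limitations, PCA is sensitive to feature scale. The sketch below uses a made-up dataset in which the second feature has a hundred times the spread of the first; without standardization that feature absorbs almost all the variance:
<syntaxhighlight lang="python">
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Made-up dataset: two independent features with very different scales
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(0, 1, 100), rng.normal(0, 100, 100)])

# Raw data: the large-scale feature dominates the first component
print(PCA(n_components=2).fit(X).explained_variance_ratio_)

# Standardized data: both features contribute comparably
X_std = StandardScaler().fit_transform(X)
print(PCA(n_components=2).fit(X_std).explained_variance_ratio_)
</syntaxhighlight>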
==See Also==
*[[Singular Value Decomposition]]
*[[Dimensionality Reduction]]
*[[Latent Semantic Analysis]]
*[[Feature Extraction]]
*[[Explained Variance]]
*[[Data Preprocessing]]
*[[Machine Learning]]

[[분류:Data Science]]