User contributions for 핵톤
From IT Wiki
14 November 2024
- 05:30, 14 November 2024 +2,823 N 데이터베이스 후보 키 Created page with "A '''candidate key''' is an attribute or set of attributes that can uniquely identify each row in a database table. A candidate key is a minimal set of attributes that uniquely distinguishes every row in the table, and is a candidate for selection as the primary key. ==Conditions for a Candidate Key== To be a candidate key, the following conditions must be satisfied. *'''Uniqueness''': ..." current Tag: Visual edit
- 05:28, 14 November 2024 +43 N 후보 키 Redirected page to 데이터베이스 후보 키 current Tags: New redirect Visual edit
- 05:28, 14 November 2024 +3,302 N 데이터베이스 보이스-코드 정규형 Created page with "'''Boyce-Codd Normal Form''' (BCNF) is the fourth stage of database normalization, a strengthened form of Third Normal Form (3NF). Boyce-Codd Normal Form satisfies Third Normal Form while requiring every determinant to be a candidate key, making the database design stricter. ==Conditions for Boyce-Codd Normal Form== To satisfy Boyce-Codd Normal Form, the following conditions must be met..." current Tag: Visual edit
- 04:55, 14 November 2024 +3,346 N 데이터베이스 제3정규형 Created page with "'''Third Normal Form''' (3NF) is the third stage of database normalization; it aims to eliminate transitive dependencies within a table while satisfying Second Normal Form (2NF). Third Normal Form is designed so that attributes depend only on the primary key, reducing data redundancy and further strengthening data integrity. ==Conditions for Third Normal Form== To satisfy Third Normal Form..." current Tag: Visual edit
- 04:48, 14 November 2024 +2,763 N 부분 함수 종속성 Created page with "A '''partial functional dependency''' arises in database normalization when, in a relation with a composite key, an attribute depends on only part of the primary key. Partial functional dependencies can cause data redundancy and inefficient data structures, and eliminating them is the goal of Second Normal Form (2NF). ==Overview== Partial functional dependencies ..." current Tag: Visual edit
- 04:46, 14 November 2024 +3,025 N 데이터베이스 제2정규형 Created page with "'''Second Normal Form''' (2NF) is the second stage of database normalization; it aims to eliminate '''partial functional dependencies''' within a table while satisfying First Normal Form (1NF). Second Normal Form removes attributes that depend on only part of the primary key, reducing data redundancy and improving data integrity. ==Second Normal Form..." current Tag: Visual edit
- 04:36, 14 November 2024 +32 데이터베이스 제1정규형 No edit summary current Tag: Visual edit
- 04:35, 14 November 2024 +2,606 N 데이터베이스 제1정규형 Created page with "'''First Normal Form''' (1NF) is the first stage of database normalization, in which a table is designed so that every attribute holds an atomic value. That is, each column (attribute) in the table must hold a single value that cannot be divided further. This reduces data redundancy and strengthens data integrity. ==Conditions for First Normal Form== To satisfy First Normal Form, ..." Tag: Visual edit
5 November 2024
- 09:10, 5 November 2024 +5,902 N Missing Data Created page with "Missing Data refers to the absence of values in a dataset, which can occur due to various reasons such as data entry errors, equipment malfunctions, or privacy concerns. Handling missing data is crucial in data science and machine learning, as it can impact the quality, accuracy, and interpretability of models. Properly addressing missing values ensures that analyses are more reliable and that models generalize well to new data. ==Types of Missing Data== There are three..." current Tag: Visual edit
- 09:04, 5 November 2024 +5,085 N Normalization (Data Science) Created page with "Normalization in data science is a preprocessing technique used to adjust the values of numerical features to a common scale, typically between 0 and 1 or -1 and 1. Normalization ensures that features with different ranges contribute equally to the model, improving training stability and model performance. It is especially important in machine learning algorithms that rely on distance calculations, such as k-nearest neighbors (kNN) and clustering. ==Importance of Normali..." current Tag: Visual edit
- 08:09, 5 November 2024 −4 Feature Selection No edit summary current Tag: Visual edit
- 07:49, 5 November 2024 +6,277 N Bias-Variance Trade-Off Created page with "The Bias-Variance Trade-Off is a fundamental concept in machine learning that describes the balance between two sources of error that affect model performance: bias and variance. The goal is to achieve a balance between bias and variance that minimizes the model’s total error, enabling it to generalize well to new, unseen data. ==Understanding Bias and Variance== *'''Bias''': Refers to the error introduced by approximating a complex real-world problem with a simplified..." current Tag: Visual edit
- 07:10, 5 November 2024 +4,593 N Decision Tree Prunning Created page with "Pruning is a technique used in decision trees and machine learning to reduce the complexity of a model by removing sections of the tree that provide little predictive power. The primary goal of pruning is to prevent overfitting, ensuring that the model generalizes well to unseen data. Pruning is widely used in decision trees and ensemble methods, such as random forests, to create simpler, more interpretable models. ==Types of Pruning== There are two main types of pruning..." current Tag: Visual edit
- 07:05, 5 November 2024 +5,458 N N-Fold Cross-Validation Created page with "N-Fold Cross-Validation is a technique used in machine learning to evaluate a model's performance by dividing the dataset into multiple subsets, or "folds." In this method, the dataset is split into N equal parts, where the model is trained on N-1 folds and tested on the remaining fold. This process is repeated N times, each time using a different fold as the test set, and the results are averaged to obtain an overall performance estimate. N-fold cross-validation helps t..." current Tag: Visual edit
- 06:50, 5 November 2024 +5,426 N Undersampling Created page with "'''Undersampling is a technique used in data science and machine learning to address class imbalance by reducing the number of samples in the majority class'''. Unlike oversampling, which increases the representation of the minority class, undersampling aims to balance the dataset by removing instances from the majority class. This technique is commonly applied in scenarios where the majority class significantly outnumbers the minority class, such as fraud detection..." current Tag: Visual edit
- 06:47, 5 November 2024 +5,524 N Oversampling Created page with "Oversampling is a technique used in data science and machine learning to address class imbalance by increasing the number of samples in the minority class. In classification tasks with imbalanced datasets, oversampling helps to balance the distribution of classes, allowing the model to learn patterns from both majority and minority classes. Oversampling is commonly used in applications such as fraud detection, medical diagnosis, and other areas where certain classes are..." current Tag: Visual edit
- 06:43, 5 November 2024 −1 Stratified Sampling No edit summary current Tag: Visual edit
- 06:42, 5 November 2024 +4,800 N Stratified Sampling Created page with "Stratified Sampling is a sampling technique used to ensure that subsets of data (called “strata”) maintain the same distribution of key characteristics as the original dataset. In data science and machine learning, stratified sampling is often used to create training, validation, and test splits, particularly when dealing with imbalanced datasets. This method ensures that each subset is representative of the entire dataset, improving the model's ability to generalize..." Tag: Visual edit
- 06:36, 5 November 2024 +5,033 N Data Partition Created page with "'''Data Partition is a process in data science and machine learning where a dataset is divided into separate subsets to train, validate, and test a model'''. Data partitioning ensures that the model is evaluated on data it has not seen before, helping prevent overfitting and ensuring that it generalizes well to new data. Common partitions include training, validation, and test sets, each serving a specific purpose in the model development process. ==Types of Data Partiti..." current Tag: Visual edit
- 06:27, 5 November 2024 +4,980 N Special:Badtitle/NS102:CRISP-DM Created page with "The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely adopted methodology for data mining and analytics projects. Developed in the 1990s, CRISP-DM provides a structured, six-phase approach to guide data scientists and analysts through the process of developing and deploying data mining models. It is industry-agnostic, making it applicable to various fields and data science projects. == Phases of CRISP-DM == CRISP-DM consists of six main phases, each..." current
- 06:27, 5 November 2024 +1,190 N Special:Badtitle/NS100:CRISP-DM Created page with "Category:Data Science ;A comprehensive methodology, jointly established by data mining companies, with which beginners and experts build models together with business experts; a standard data mining process applicable to any industry == Procedure == File:CRISP-DM 절차도.png {| class="wikitable" ! ! Step ! Detailed Activities ! Notes |- | ① | Business Understanding | Setting business objectives, ..." current
- 06:16, 5 November 2024 +5,626 N Feature Selection Created page with "'''Feature Selection is a process in machine learning and data science that involves identifying and selecting the most relevant features (or variables) in a dataset to improve model performance, reduce overfitting, and decrease computational cost'''. By removing irrelevant or redundant features, feature selection simplifies the model, enhances interpretability, and often improves accuracy. ==Importance of Feature Selection== Feature selection is a crucial step in the mo..." Tag: Visual edit
- 06:14, 5 November 2024 +3,236 Entropy (Data Science) No edit summary current Tag: Visual edit
- 06:11, 5 November 2024 +55 Information Gain No edit summary current Tag: Visual edit
- 06:10, 5 November 2024 +20 Information Gain No edit summary Tag: Visual edit
- 06:06, 5 November 2024 +42 N Gini Impurity Redirected page to Gini Impurity (Data Science) current Tags: New redirect Visual edit
- 06:06, 5 November 2024 +4,722 N Information Gain Created page with "Information Gain is a metric used in machine learning to measure the effectiveness of a feature in classifying data. It quantifies the reduction in entropy (impurity) achieved by splitting a dataset based on a particular feature. Information gain is widely used in decision tree algorithms to select the best feature for each node split, maximizing the model’s predictive accuracy. ==Definition of Information Gain== Information gain is defined as the difference in entropy..." Tag: Visual edit
- 06:04, 5 November 2024 +26 Impurity (Data Science) No edit summary current Tag: Visual edit
- 06:04, 5 November 2024 +4,904 N Impurity (Data Science) Created page with "In data science, impurity refers to the degree of heterogeneity in a dataset, specifically within a group of data points. Impurity is commonly used in decision trees to measure how "mixed" the classes are within each node or split. A high impurity indicates a mix of different classes, while a low impurity suggests that the data is homogenous or predominantly from a single class. Impurity measures guide the decision tree-building process by helping identify the best featu..." Tag: Visual edit
- 06:02, 5 November 2024 +36 Entropy (Data Science) No edit summary Tag: Visual edit
- 05:58, 5 November 2024 +36 N Entropy Redirected page to Entropy (Data Science) current Tags: New redirect Visual edit
- 05:57, 5 November 2024 +394 Main Page No edit summary current Tag: Visual edit: Switched
- 05:54, 5 November 2024 +6,606 N Clustering Algorithm Created page with "Clustering algorithms are a type of unsupervised learning technique used to group similar data points together based on their features. Unlike classification, clustering does not require labeled data, as the goal is to discover inherent structures within the data. Clustering is widely applied in data exploration, customer segmentation, image processing, and anomaly detection. ==Types of Clustering Algorithms== Several types of clustering algorithms are commonly used, eac..." current Tag: Visual edit
- 05:52, 5 November 2024 +34 N Clustering Redirected page to Clustering Algorithm current Tags: New redirect Visual edit
- 05:52, 5 November 2024 +38 N Classification Redirected page to Classification Algorithm current Tags: New redirect Visual edit
- 05:50, 5 November 2024 +5,617 N Gradient Descent Created page with "'''Gradient Descent''' is an optimization algorithm used to minimize a function by iteratively moving toward the function's minimum. In machine learning, gradient descent is commonly used to minimize the loss function, adjusting model parameters (weights and biases) to improve the model's performance. The algorithm calculates the gradient of the loss function with respect to each parameter and updates the parameters in the opposite direction of the gradient to reduce err..." current Tag: Visual edit
- 05:46, 5 November 2024 +7,157 N Deep Neural Network Created page with "A Deep Neural Network (DNN) is an artificial neural network with multiple hidden layers between the input and output layers. This deep structure allows the model to learn complex, hierarchical patterns in data by progressively extracting higher-level features from raw inputs. DNNs are foundational to deep learning and have achieved state-of-the-art results in various applications, including image recognition, natural language processing, and robotics. ==Structure of a De..." current Tag: Visual edit
- 05:43, 5 November 2024 +5,660 N Multi-Layer Perceptron Created page with "A Multi-Layer Perceptron (MLP) is a type of artificial neural network with multiple layers of neurons, including one or more hidden layers between the input and output layers. Unlike single-layer '''perceptrons''', which can only solve linearly separable problems, MLPs can model complex, non-linear relationships, making them suitable for a wide range of machine learning tasks. ==Structure of a Multi-Layer Perceptron== An MLP consists of three main types of..." current Tag: Visual edit
- 05:36, 5 November 2024 +4,388 N Perceptron Created page with "The Perceptron is a type of artificial neuron and one of the simplest models in machine learning, used for binary classification tasks. It is a linear classifier that learns to separate data into two classes by finding an optimal hyperplane. Originally developed in the 1950s, the perceptron laid the foundation for more complex neural network architectures. ==Structure of a Perceptron== A perceptron consists of several key components: *'''Inputs''': The feature values fro..." current Tag: Visual edit
- 05:32, 5 November 2024 +6,649 N Neural Network Created page with "A Neural Network is a machine learning model inspired by the structure and functioning of the human brain. Neural networks consist of layers of interconnected nodes, or "neurons," which process data and learn patterns through weighted connections. Neural networks are foundational to deep learning and are used extensively in complex tasks such as image and speech recognition, natural language processing, and robotics. ==Structure of a Neural Network== A typical neural net..." current Tag: Visual edit
- 05:29, 5 November 2024 +6,227 N Machine Learning Created page with "'''Machine Learning''' is a branch of artificial intelligence (AI) that focuses on building systems that can learn from data, identify patterns, and make decisions with minimal human intervention. By training algorithms on datasets, machine learning enables computers to make predictions, classify data, and detect insights automatically. ==Types of Machine Learning== Machine learning is typically categorized into several types based on the way models learn from data: *'''..." current Tag: Visual edit
- 02:54, 5 November 2024 +5,443 N Deep Learning Created page with "Deep Learning is a subset of machine learning focused on using neural networks with multiple layers to model complex patterns in large datasets. By learning hierarchies of features directly from data, deep learning can automatically extract representations that are often difficult to engineer manually. It is widely used in applications such as image recognition, natural language processing, and autonomous driving. ==Key Concepts in Deep Learning== Deep learning involves..." current Tag: Visual edit
- 02:53, 5 November 2024 +5,243 N Similarity (Data Science) Created page with "In data science, similarity refers to a measure of how alike two data points, items, or sets of features are. It is a fundamental concept in various machine learning and data analysis tasks, particularly in clustering, recommendation systems, and classification. Similarity metrics quantify the closeness or resemblance between data points, enabling models to group, rank, or classify them based on shared characteristics. ==Key Similarity Measures== Several similarity metri..." current Tag: Visual edit
- 02:28, 5 November 2024 +4,467 N Cross-Validation Created page with "Cross-Validation is a technique in machine learning used to evaluate a model’s performance on unseen data. It involves partitioning the dataset into multiple subsets, training the model on some subsets while testing on others. Cross-validation helps detect overfitting and underfitting, ensuring the model generalizes well to new data. ==Key Concepts in Cross-Validation== Cross-validation is based on the following key principles: *'''Training and Validation Splits''': Cr..." current Tag: Visual edit
- 02:26, 5 November 2024 +4,364 N Underfitting Created page with "Underfitting is a common issue in machine learning where a model is too simple to capture the underlying patterns in the data. As a result, the model performs poorly on both training and test datasets, failing to achieve high accuracy. Underfitting occurs when the model lacks the capacity or complexity needed to represent the relationships within the data. ==Causes of Underfitting== Several factors contribute to underfitting in machine learning models: *'''Over-Simplifie..." current Tag: Visual edit
- 02:25, 5 November 2024 +63 Overfitting No edit summary current Tag: Visual edit
- 02:25, 5 November 2024 +4,444 N Overfitting Created page with "'''Overfitting''' is a common issue in machine learning where a model learns the training data too closely, capturing noise and specific patterns that do not generalize well to new, unseen data. This results in high accuracy on the training set but poor performance on test data, as the model fails to generalize and instead memorizes irrelevant details. ==Causes of Overfitting== Several factors contribute to overfitting in machine learning models: *'''Complex Models''': M..." Tag: Visual edit
- 02:23, 5 November 2024 +37 Unsupervised Learning No edit summary current Tag: Visual edit
- 02:22, 5 November 2024 +4,865 N Unsupervised Learning Created page with "Unsupervised Learning is a type of machine learning where the model is trained on an unlabeled dataset, meaning the data has no predefined outputs. The goal is for the model to discover hidden patterns, structures, or relationships within the data. Unsupervised learning is widely used for tasks like clustering, dimensionality reduction, and anomaly detection, where understanding the inherent structure of data is valuable. ==Key Concepts in Unsupervised Learning== Several..." Tag: Visual edit
- 02:21, 5 November 2024 +4,449 N Supervised Learning Created page with "'''Supervised Learning''' is a type of machine learning where the model is trained on a labeled dataset, meaning each input comes with a corresponding output. The goal is to learn a mapping from inputs to outputs, allowing the model to make predictions or classifications based on new, unseen data. Supervised learning is widely used in applications where historical data can be used to predict future outcomes. ==Key Concepts in Supervised Learning== Several key concepts fo..." current Tag: Visual edit