목록머신러닝 (11)
mojo's Blog

Cross-Validation ※ Hold-out Method Divide a given data into a training set and a test set. - The training set and the test set should NOT overlap each other. How to choose a good model? - With the training set, build each model. - With the test set, evaluate each model. - Choose the model which shows the best performance with the test set. 훈련용 데이터와 테스트용 데이터는 중복이 허용되면 안된다. 훈련용 데이터를 통해 학습이 완료된 모델을..

The Overfitting Problem ※ Polynomial Curver Fitting Which order polynomial does best fit for the data?

Multinomial Logistic Regression ※ Multinomial Logistic Regression It is a classification method that generalizes logistic regression to the multiclass problem, i.e. with more than two possible discrete outcomes. -> Also called softmax regression and multinomial logit Example - Which major will a student choose, given a status of the student? - Which blood type does a person have, given the resul..

The Basic Concept of Logistic Regression ※ Problems in Simple Classification Models It is impossible to find a linear classifier in some cases. The loss function is not differential! 위 사진에서 왼쪽같은 경우는 Simple classification 모델을 통해 분류하는 것이 가능하다. 하지만 오른쪽은 어떤 선을 긋더라도 두 개의 클래스를 분류할 수 없는 경우가 존재한다. 이러한 케이스에서 Simple classification 을 통해 문제를 해결할 수 없다. ※ Probabilistic View for a Linear Classifier What if we ..

※ Estimation in Statistics Use sample statistics to estimate population parameters. -> E.g., Sample means are used to estimate population means. A point estimate of a population parameter is a single value of a statistic. An interval estimate is defined by two numbers, between which a population parameter is said to lie. -

※ Example: Linear Regression Fitting a linear model with a set of variables