ML (Machine Learning)

Classifying New Data into Categories with the SVM (Support Vector Machine) Algorithm

567Rabbit 2024. 4. 15. 15:41

Classification is one branch of supervised learning in machine learning. Its common algorithms are:

- Logistic Regression

- KNN (K-Nearest Neighbors)

- SVM (Support Vector Machine)

- DT (Decision Tree)

 

Train each of these four methods on your data, then choose the algorithm that gives the highest accuracy.
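
One way to carry out that comparison is sketched below. It is only an illustration: the data comes from `make_classification` rather than from this post's dataset, and the names `candidates`, `scores`, and `best_name` are made up for the example. Each model uses scikit-learn's default hyperparameters.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption, not the dataset used in this post).
X_demo, y_demo = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=1
)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.25, random_state=1)

# The four classifiers listed above, with default settings.
candidates = {
    "Logistic Regression": LogisticRegression(),
    "KNN": KNeighborsClassifier(),
    "SVC": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=1),
}

scores = {}
for name, model in candidates.items():
    # Scale inside the pipeline so every model sees the same preprocessing.
    pipe = make_pipeline(StandardScaler(), model)
    pipe.fit(Xtr, ytr)
    scores[name] = pipe.score(Xte, yte)  # accuracy on the held-out test set

best_name = max(scores, key=scores.get)
print(scores)
print("best:", best_name)
```

Which model wins depends on the data, so the ranking printed here says nothing about the dataset used in the rest of this post.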

 

 

SVM(Support Vector Machine)

  1. SVC (Support Vector Classifier):
    • SVC is the variant of SVM used to solve classification problems.
    • It classifies the given data by finding the optimal separating hyperplane.
    • The goal is the hyperplane that separates the classes with the widest margin.
  2. SVR (Support Vector Regressor):
    • SVR is the variant of SVM used to solve regression problems.
    • It uses the given data to predict continuous (numeric) values.
    • SVR learns the pattern in the data and finds the hyperplane that best predicts the target value.

 

Both SVC and SVR follow the basic idea of SVM, but they pursue different goals. For a classification problem, SVC finds the hyperplane that separates the classes. For a regression problem, SVR learns the pattern in the data and finds a hyperplane that lies close to the values being predicted.
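
The difference shows up in what each model returns. The sketch below fits both on a tiny made-up one-feature dataset (the data and the names `X_toy`, `y_class`, and `y_reg` are assumptions for illustration only): `SVC.predict` returns class labels, while `SVR.predict` returns continuous numbers.

```python
import numpy as np
from sklearn.svm import SVC, SVR

rng = np.random.default_rng(0)
X_toy = rng.uniform(-3, 3, size=(200, 1))

# Classification target: a discrete label (0 or 1).
y_class = (X_toy[:, 0] > 0).astype(int)
# Regression target: a continuous number (noisy sine curve).
y_reg = np.sin(X_toy[:, 0]) + rng.normal(0, 0.1, size=200)

clf = SVC().fit(X_toy, y_class)
reg = SVR().fit(X_toy, y_reg)

X_new = np.array([[-2.0], [2.0]])
labels = clf.predict(X_new)  # discrete class labels
values = reg.predict(X_new)  # continuous predictions
print(labels)
print(values)
```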

 

 

 

import numpy as np
import matplotlib.pyplot as plt
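import pandas as pd

# The original post does not show how df is loaded; it is assumed to have been
# read into a DataFrame (for example with pd.read_csv) before this point.
df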

     User ID   Gender  Age  EstimatedSalary  Purchased
0    15624510  Male    19   19000            0
1    15810944  Male    35   20000            0
2    15668575  Female  26   43000            0
3    15603246  Female  27   57000            0
4    15804002  Male    19   76000            0
...  ...       ...     ...  ...              ...
395  15691863  Female  46   41000            1
396  15706071  Male    51   23000            1
397  15654296  Female  50   20000            1
398  15755018  Male    36   33000            0
399  15594041  Female  49   36000            1

 

Purchased : 1

Did not purchase : 0

 

 

Goal: decide which of these two categories a new data point is closer to

 

 

 

 

Splitting into feature columns and a target column

 

The feature columns (X) hold the explanatory variables for each observation in the dataset.
The target column (y) is the column that contains the value we want to predict.

 

y = df['Purchased']

 

X = df.loc[ : , 'Age' : 'EstimatedSalary']
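
To see what these two selections produce, here is a small sketch that rebuilds only the first five rows shown in the DataFrame output above (the names `sample`, `X_sample`, and `y_sample` are made up for the example). Note that a label slice with `.loc` includes both end columns.

```python
import pandas as pd

# First five rows copied from the DataFrame output shown in this post.
sample = pd.DataFrame({
    "User ID": [15624510, 15810944, 15668575, 15603246, 15804002],
    "Gender": ["Male", "Male", "Female", "Female", "Male"],
    "Age": [19, 35, 26, 27, 19],
    "EstimatedSalary": [19000, 20000, 43000, 57000, 76000],
    "Purchased": [0, 0, 0, 0, 0],
})

y_sample = sample["Purchased"]                       # target column
X_sample = sample.loc[:, "Age":"EstimatedSalary"]    # label slice, both ends included

print(X_sample.columns.tolist())  # ['Age', 'EstimatedSalary']
print(X_sample.shape)             # (5, 2)
```

`User ID` and `Gender` fall outside the slice, so the model in this post is trained on age and salary only.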

 

 

 

Feature scaling

from sklearn.preprocessing import StandardScaler

 

X_scaler = StandardScaler()

 

X = X_scaler.fit_transform(X)
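
As a quick check of what `StandardScaler` does, the sketch below compares its output with the manual formula (value minus column mean, divided by column standard deviation). The five rows are the Age and EstimatedSalary values from the first rows of the table above; the variable names are made up for the example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Age and EstimatedSalary from the first five rows shown in this post.
ages_salaries = np.array(
    [[19, 19000], [35, 20000], [26, 43000], [27, 57000], [19, 76000]],
    dtype=float,
)

scaled = StandardScaler().fit_transform(ages_salaries)

# Manual standardization: per-column mean 0 and (population) standard deviation 1.
manual = (ages_salaries - ages_salaries.mean(axis=0)) / ages_salaries.std(axis=0)

print(np.allclose(scaled, manual))  # True
```

Scaling matters here because salary is measured in tens of thousands while age is below 100, and an SVM with the default RBF kernel is sensitive to such differences in range.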

 

 

 

 

Splitting into train and test sets

from sklearn.model_selection import train_test_split

 

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)
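
A common alternative to the order used in this post is to split first and fit the scaler on the training rows only, so that no statistics from the test set influence preprocessing. The sketch below shows that variant on randomly generated stand-in data (the data and all variable names here are assumptions, not the post's dataset).

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Random stand-in for 400 rows of (Age, EstimatedSalary) and a 0/1 target.
rng = np.random.default_rng(1)
X_raw = np.column_stack([
    rng.integers(18, 60, size=400),
    rng.integers(15000, 150000, size=400),
]).astype(float)
y_raw = rng.integers(0, 2, size=400)

# test_size=0.25 keeps 25% of the rows for testing.
X_tr, X_te, y_tr, y_te = train_test_split(X_raw, y_raw, test_size=0.25, random_state=1)

scaler = StandardScaler()
X_tr_scaled = scaler.fit_transform(X_tr)  # mean/std learned from training rows only
X_te_scaled = scaler.transform(X_te)      # test rows reuse those statistics

print(X_tr.shape, X_te.shape)  # (300, 2) (100, 2)
```

With 400 rows and `test_size=0.25`, 100 rows go to the test set, which matches the 100 predictions counted in the confusion matrix later in this post.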

 

 

 

Modeling

from sklearn.svm import SVC

 

classifier = SVC()

 

classifier.fit(X_train, y_train)

 

y_pred = classifier.predict(X_test)
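
To classify a brand-new user, the new values must go through the same scaling as the training data before `predict` is called. One convenient way, sketched here on made-up data, is to wrap the scaler and the classifier in a pipeline. The purchase rule used to generate labels and the two `new_users` rows are purely hypothetical.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data: the labeling rule below is an assumption for illustration.
rng = np.random.default_rng(0)
age = rng.integers(18, 60, size=400)
salary = rng.integers(15000, 150000, size=400)
X_users = np.column_stack([age, salary]).astype(float)
y_users = ((age > 40) | (salary > 100000)).astype(int)

# The pipeline scales the input and then applies the SVC in one step.
model = make_pipeline(StandardScaler(), SVC())
model.fit(X_users, y_users)

# Hypothetical new users given as raw (Age, EstimatedSalary) values.
new_users = np.array([[25, 30000], [55, 120000]], dtype=float)
pred = model.predict(new_users)  # 1 = purchased, 0 = did not purchase
print(pred)
```

Because the pipeline stores the fitted scaler, raw ages and salaries can be passed in directly without calling `transform` by hand.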

 

 

 

confusion_matrix

 

from sklearn.metrics import confusion_matrix, accuracy_score

 

confusion_matrix(y_test, y_pred)

array([[49,  9],
       [ 3, 39]], dtype=int64)

 

 

 

Computing accuracy

accuracy_score(y_test, y_pred)

0.88
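
The same accuracy can be recovered by hand from the confusion matrix printed above. In scikit-learn's layout, rows are the true classes and columns are the predicted classes, so the four cells are true negatives, false positives, false negatives, and true positives. The precision and recall lines are an extra not computed in the original post.

```python
import numpy as np

# Confusion matrix values copied from the output shown in this post.
cm = np.array([[49, 9],
               [3, 39]])

tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # correct predictions / all predictions
precision = tp / (tp + fp)        # of predicted buyers, how many really bought
recall = tp / (tp + fn)           # of real buyers, how many were found

print(accuracy)             # 0.88
print(round(precision, 4))  # 0.8125
print(round(recall, 4))     # 0.9286
```

Of the 100 test rows, 88 were classified correctly; the model missed 3 actual buyers and wrongly flagged 9 non-buyers.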