
CS/์ธ๊ณต์ง€๋Šฅ

SVM ์„ ํ™œ์šฉํ•œ ์ŠคํŒธ ๋ถ„๋ฅ˜๊ธฐ ( Spam Classification via SVM )


What is an SVM (Support Vector Machine)?

An SVM is an algorithm that classifies data using a decision boundary. The model looks for a separating hyperplane that divides the two classes with the largest possible margin. It is a powerful classifier used mainly for binary classification problems, and it can be used to classify emails as spam or non-spam with each email's word frequencies as features.

  • It can be used as either a linear or a non-linear classifier.
  • Using the kernel trick, it maps data into a higher-dimensional space, which makes non-linear classification possible as well.
  • Hard margins and soft margins make it possible to adjust how strictly the data must be separated.
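
To make the "largest margin" idea concrete, here is a minimal toy sketch (not part of the assignment code): a linear SVC is fit on a small separable dataset, and the learned weight vector, bias, margin width, and support vectors are printed.

import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable clusters
X = np.array([[1., 1.], [2., 1.], [1., 2.],    # class -1
              [4., 4.], [5., 4.], [4., 5.]])   # class +1
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e9)              # very large C ~ hard margin
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
print("decision boundary: w.x + b = 0 with w =", w, ", b =", b)
print("margin width = 2/||w|| =", 2 / np.linalg.norm(w))
print("support vectors:\n", clf.support_vectors_)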

Goal

We will implement a hard margin SVM, a soft margin SVM, and an SVM with a Gaussian RBF kernel, and compare their performance.

Implementation

Required libraries and version check

import sys
assert sys.version_info >= (3, 7)

from packaging import version
import sklearn

assert version.parse(sklearn.__version__) >= version.parse("1.0.1")

# Imports used in the rest of the code
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

ํŒŒ์ผ ์ฝ๊ธฐ : ๋ฐ์ดํ„ฐ ์ค€๋น„

ํŒŒ์ผ์„์„ ์ฝ์–ด์™€์„œ ํŠน์ง• ํ–‰๋ ฌ๊ณผ ๋ ˆ์ด๋ธ”์„ ๋ฐ˜ํ™˜ํ•œ๋‹ค.

def svm_readMatrix(file):
    fd = open(file, 'r')
    hdr = fd.readline()                                            # header line
    rows, cols = [int(s) for s in fd.readline().strip().split()]   # matrix dimensions
    tokens = fd.readline().strip().split()                         # vocabulary tokens
    matrix = np.zeros((rows, cols))
    Y = []
    for i, line in enumerate(fd):
        nums = [int(x) for x in line.strip().split()]
        Y.append(nums[0])                 # first number of the row: 0/1 label
        kv = np.array(nums[1:])           # remaining numbers: (gap, count) pairs
        k = np.cumsum(kv[:-1:2])          # cumulative gaps -> absolute column indices
        v = kv[1::2]                      # word counts
        matrix[i, k] = v
    category = (np.array(Y) * 2) - 1      # convert 0/1 labels to -1/+1
    return matrix, tokens, category
  • ๊ฐ ๋ฌธ์„œ๋Š” ์ŠคํŒธ์ธ์ง€ ์•„๋‹Œ์ง€๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ๋ ˆ์ด๋ธ”(0 ๋˜๋Š” 1)์„ ๊ฐ€์ง€๊ณ  ์žˆ๋‹ค. ์ด ๋ ˆ์ด๋ธ”์„ SVM์—์„œ ์‚ฌ์šฉํ•˜๋Š” -1๊ณผ 1๋กœ ๋ณ€ํ™˜ํ•˜์—ฌ ๋ฐ˜ํ™˜ํ•œ๋‹ค.
  • matrix๋Š” ๊ฐ ๋ฌธ์„œ์—์„œ ๋‹จ์–ด ๋นˆ๋„์ˆ˜๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ํ–‰๋ ฌ, category๋Š” ๊ฐ ๋ฌธ์„œ์˜ ๋ ˆ์ด๋ธ”(-1 ๋˜๋Š” 1)์ด๋‹ค.
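
As a small worked example of what svm_readMatrix does with one data row (this is a trace of the function's logic with made-up numbers, not separate documentation of the file format): the numbers after the label appear to be (gap, count) pairs, and the cumulative sum of the gaps recovers the absolute column indices.

import numpy as np

nums = [1, 3, 2, 4, 5]        # label, then (gap, count) pairs: (3, 2) and (4, 5)
Y = nums[0]                   # 1 -> spam
kv = np.array(nums[1:])       # [3, 2, 4, 5]
k = np.cumsum(kv[:-1:2])      # gaps 3, 4  -> columns [3, 7]
v = kv[1::2]                  # counts     -> [2, 5]

row = np.zeros(10)
row[k] = v
print(row)                    # [0. 0. 0. 2. 0. 0. 0. 5. 0. 0.]
print((Y * 2) - 1)            # label mapped to 1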

SVM ๋ชจ๋ธ ์„ค์ •

์„ธ ๊ฐ€์ง€ SVM ๋ถ„๋ฅ˜๊ธฐ๋ฅผ ์„ค์ •ํ•œ๋‹ค.

  • Hard margin SVM (svm_clf_hard): uses a linear kernel with C set to ∞, which enforces a very strict margin
  • Soft margin SVM (svm_clf_soft): uses a soft margin with C set to 1
  • SVM with a Gaussian RBF kernel (svm_clf_rbf): uses the RBF kernel so it can handle non-linear data, with the gamma and C values tuned for the model

๐Ÿค”โ“ ์ปค๋„ ํŠธ๋ฆญ(Kernel Trick)์ด๋ž€ โ“

๋น„์„ ํ˜• ๋ฐ์ดํ„ฐ๋ฅผ ๋‹ค๋ฃฐ ๋•Œ๋Š”, ๋ฐ์ดํ„ฐ๋ฅผ ๊ณ ์ฐจ์›์œผ๋กœ ๋งคํ•‘ํ•˜์—ฌ ์„ ํ˜• ๋ถ„๋ฆฌ๊ฐ€ ๊ฐ€๋Šฅํ•˜๊ฒŒ ๋งŒ๋“ ๋‹ค. ์ด๋•Œ ์‚ฌ์šฉ๋˜๋Š” ๋ฐฉ๋ฒ•์ด ์ปค๋„ ํŠธ๋ฆญ์ด๋‹ค.

def main():
    # Please set a training file that you want to use for this run below
    trainMatrix, tokenlist, trainCategory = svm_readMatrix('./data/hw2_MATRIX.TRAIN.400')
    testMatrix, tokenlist, testCategory = svm_readMatrix('./data/hw2_MATRIX.TEST')

    # SVM Classifier model

    # Hard margin SVM
    svm_clf_hard = SVC(kernel="linear", C=float("inf"), max_iter=10_000, random_state=42)      

    # Soft margin SVM
    # Find out the best parameters of C, max_iter, and so on
    svm_clf_soft = SVC(kernel="linear", C=1, max_iter=10_000, random_state=42)

    # Gaussian RBF SVM
    # Find out the best parameters of gamma, C, max_iter, and so on
    svm_clf_rbf = SVC(kernel="rbf", gamma=8, C=0.001, max_iter=10_000, random_state=42)

    scaler = StandardScaler()

    # Scaled version for each SVM and we will use these
    scaled_svm_clf_hard = make_pipeline(scaler, svm_clf_hard)
    scaled_svm_clf_soft = make_pipeline(scaler, svm_clf_soft)
    scaled_svm_clf_rbf = make_pipeline(scaler, svm_clf_rbf)
  • StandardScaler is used here to standardize the data. Standardization rescales every feature to mean 0 and standard deviation 1, which improves the SVM's performance.
  • Each SVM model is chained with the scaler via make_pipeline(), so the data is standardized before being passed to the SVM model.
  • The C parameter controls how much misclassification is tolerated: a large C allows few violations, while a small C allows more of them and thus produces a softer margin (see the small illustration below).

Finding the best parameters for the RBF SVM - grid search

Grid search tries many combinations of hyperparameter values and picks the best one among them.

It evaluates the performance of every combination of the candidate hyperparameter values and then selects the parameters that give the best performance.

SVM ๋ชจ๋ธ ํ•™์Šต

scikit-learn์˜ SVM ๋ชจ๋ธ์—์„œ๋Š” ํ•™์Šต ๊ณผ์ •์ด fit() ๋ฉ”์„œ๋“œ๋ฅผ ํ†ตํ•ด ์ด๋ฃจ์–ด์ง„๋‹ค.

scaled_svm_clf_hard.fit(trainMatrix, trainCategory)
scaled_svm_clf_soft.fit(trainMatrix, trainCategory)
scaled_svm_clf_rbf.fit(trainMatrix, trainCategory)

ํ•™์Šต ๋ฐ์ดํ„ฐ(X)์™€ ๋ ˆ์ด๋ธ”(y)์„ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์„ ํ•™์Šตํ•œ๋‹ค.

ํ•™์Šต ๊ณผ์ • ์ค‘์—๋Š” ๋‹ค์Œ ์ž‘์—…์ด ์ˆ˜ํ–‰๋œ๋‹ค.

  • ์ตœ์ ์˜ ๊ฒฐ์ • ๊ฒฝ๊ณ„๋ฅผ ์ฐพ์Œ.
  • ํ•™์Šต ๋ฐ์ดํ„ฐ์—์„œ ์„œํฌํŠธ ๋ฒกํ„ฐ๋ฅผ ์„ ํƒ.
  • ๊ฒฐ์ • ๊ฒฝ๊ณ„๋ฅผ ์ •์˜ํ•˜๋Š” ๊ฐ€์ค‘์น˜(weight)์™€ ์ ˆํŽธ(bias)์„ ํ•™์Šต

ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•œ ์˜ˆ์ธก

def svm_test(svm, matrix):
    M, N = matrix.shape            # (number of documents, vocabulary size)
    output = svm.predict(matrix)   # predicted label (-1 or 1) for each document

    return output
  • Here the trained SVM model is used to make predictions via svm.predict().
  • matrix is the test data; the predicted spam / non-spam result for each document is stored in the output array.

SVM ๋ชจ๋ธ ์„ฑ๋Šฅ ํ‰๊ฐ€

๋ชจ๋ธ์˜ ์˜ˆ์ธก ๊ฒฐ๊ณผ์™€ ์‹ค์ œ ๋ ˆ์ด๋ธ”์„ ๋น„๊ตํ•˜์—ฌ ์˜ค๋ฅ˜์œจ(error rate) ๋ฅผ ๊ณ„์‚ฐํ•œ๋‹ค.

def svm_evaluate(output, label):
    error = (output != label).sum() * 1. / len(output)
    print('Error: %1.4f' % error)
    return error

Back in main(), the predictions for the three classifiers are obtained with svm_test() and their error rates are printed:

    output_hard = svm_test(scaled_svm_clf_hard, testMatrix)
    output_soft = svm_test(scaled_svm_clf_soft, testMatrix)
    output_rbf = svm_test(scaled_svm_clf_rbf, testMatrix)

    print("\n== compare SVM implementations  ==\n")
    print("Hard margin SVM ", end="")
    svm_evaluate(output_hard, testCategory)

    print("Soft margin SVM ", end="")
    svm_evaluate(output_soft, testCategory)

    print("Gaussian RBF SVM ", end="")
    svm_evaluate(output_rbf, testCategory)

    print("\n=================================\n")

 

Results

The hard margin and soft margin SVMs record the lowest error rates, while the RBF SVM records a higher error rate than either of them.

ํ•™์Šต ๋ฐ์ดํ„ฐ ์ˆ˜์— ๋”ฐ๋ฅธ ๋ชจ๋ธ ์„ฑ๋Šฅ ๋น„๊ต

Test Error vs Training Set Size for Three SVM and Naive Bayes์— ๋Œ€ํ•œ ์˜ˆ์ธก:

1. Hard Margin SVM:

  • With a small training set, overfitting is likely: because the hard margin SVM tries to separate the data perfectly, it tends to fit small datasets too tightly.
  • As the training set grows, overfitting decreases and the test error converges to a stably low level.

2. Soft Margin SVM:

  • With a small training set, its flexibility (the soft margin) may let it perform better than the hard margin SVM.
  • As the training set grows, the test error will gradually decrease and stabilize.
  • However, if the C value is not chosen well, the soft margin SVM may not reach its best performance.
  • At first the test error stays at an intermediate level; it drops as more data is learned, but it may converge more slowly than for the hard margin SVM.

3. RBF SVM:

  • With a small training set, the non-linear structure may not be captured well and the model may learn the wrong patterns, so a high error rate is likely at first.
  • As the training set grows, the RBF kernel starts to learn the non-linear data well and the error rate will drop.

4. Naive Bayes:

  • Naive Bayes is a relatively simple model, so it is not strongly affected by the size of the training set.
  • It shows fairly stable performance even on small datasets.
  • Large improvements are not expected as the training set grows, but it is likely to keep a low error rate from the start.

๋”ฐ๋ผ์„œ, ์ตœ์ข…์ ์œผ๋กœ ํ›ˆ๋ จ ์„ธํŠธ ํฌ๊ธฐ๊ฐ€ ์ฆ๊ฐ€ํ•จ์— ๋”ฐ๋ผ ํ•˜๋“œ ๋งˆ์ง„ SVM๊ณผ ์†Œํ”„ํŠธ ๋งˆ์ง„ SVM์€ ์„ฑ๋Šฅ์ด ํ–ฅ์ƒ๋  ๊ฒƒ์ด๋ฉฐ, Naive Bayes๋Š” ์•ˆ์ •์ ์œผ๋กœ ์ข‹์€ ์„ฑ๋Šฅ์„ ๋ณด์ผ ๊ฒƒ์ด๋‹ค. RBF SVM์€ ๋งค๊ฐœ๋ณ€์ˆ˜์— ๋”ฐ๋ผ ์„ฑ๋Šฅ์ด ๊ฒฐ์ •๋˜์ง€๋งŒ, ์ผ๋ฐ˜์ ์œผ๋กœ ์ดˆ๊ธฐ์—๋Š” ๋‚ฎ์€ ์„ฑ๋Šฅ์„ ๋ณด์ด๋‹ค๊ฐ€ ํ›ˆ๋ จ ์„ธํŠธ ํฌ๊ธฐ๊ฐ€ ์ปค์ง์— ๋”ฐ๋ผ ์„ฑ๋Šฅ์ด ํ–ฅ์ƒ๋  ๊ฒƒ์ด๋‹ค.