Alternatively, download ASCII files with the vowel training and testing data.
knn.classifier(X.train, y.train, X.test, k.try = 1, pi = rep(1/K, K), CV = F)
where
X.train, y.train, X.test have the obvious meanings;
k.try is a vector of neighborhood sizes;
pi is a vector of prior probabilities for the K classes;
CV = T if cross-validation is to be used.
CV = T only makes sense if X.train = X.test.
The function should return a (n.test x length(k.try)) matrix of predicted
class identities for the n.test test observations and the different values
of $k$ provided in k.try.
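The interface described above can be sketched in R roughly as follows. This is only one possible reading of the specification, not a reference solution. The assumptions are: squared Euclidean distance, each neighbor's vote weighted by the class prior divided by the class's proportion in the training sample, vote ties resolved in favor of the first class in sorted order, and leave-one-out as the form of cross-validation. The toy data at the end are made up for illustration.

```r
# Illustrative sketch only: one possible reading of the interface described above.
knn.classifier <- function(X.train, y.train, X.test, k.try = 1,
                           pi = rep(1 / K, K), CV = FALSE) {
  X.train <- as.matrix(X.train)
  X.test <- as.matrix(X.test)
  y.train <- as.vector(y.train)          # factors become character labels
  classes <- sort(unique(y.train))
  K <- length(classes)                   # defined before the lazy default for pi is used
  n.k <- as.vector(table(factor(y.train, levels = classes)))
  # Assumption: a neighbor's vote is weighted by prior / training proportion.
  w <- pi / (n.k / length(y.train))
  Xt <- t(X.train)                       # p x n.train: columns are training points
  pred <- matrix(0L, nrow(X.test), length(k.try))
  for (i in seq_len(nrow(X.test))) {
    d2 <- colSums((Xt - X.test[i, ])^2)  # squared distances to test point i
    if (CV) d2[i] <- Inf                 # leave-one-out: exclude the point itself
    ord <- order(d2)
    for (j in seq_along(k.try)) {
      nbrs <- y.train[ord[seq_len(k.try[j])]]
      votes <- as.vector(table(factor(nbrs, levels = classes))) * w
      pred[i, j] <- which.max(votes)     # ties go to the first class in sorted order
    }
  }
  # One row per test observation, one column per value in k.try.
  matrix(classes[as.vector(pred)], nrow(X.test), length(k.try),
         dimnames = list(NULL, paste0("k=", k.try)))
}

# Toy data (hypothetical): two well-separated clusters of three points each.
X.toy <- rbind(c(0, 0), c(0, 1), c(1, 0), c(5, 5), c(5, 6), c(6, 5))
y.toy <- c(1, 1, 1, 2, 2, 2)
pred.toy <- knn.classifier(X.toy, y.toy, rbind(c(0.2, 0.2), c(5.5, 5.5)),
                           k.try = c(1, 3))
pred.cv <- knn.classifier(X.toy, y.toy, X.toy, k.try = 1, CV = TRUE)
```

The default `pi = rep(1/K, K)` works because R evaluates default arguments lazily, inside the function body, after `K` has been computed. With `CV = TRUE` the sketch keeps the class counts of the full training sample rather than recomputing them with one observation removed, which is a simplification.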
k.try = c(1, 3, 7, 11, 15, 21, 27, 35, 43)
Which of the spans (i.e. values of $k$) appears to be best? What is the resubstitution estimate of the risk for this span? What is the cross-validated estimate?
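As a rough illustration of how the two risk estimates compare, the sketch below computes both for every span in k.try under 0-1 loss, so that the estimated risk is the misclassification rate. It makes several assumptions: simulated two-class Gaussian data stand in for the vowel data, and `knn()` and `knn.cv()` from the R package `class` stand in for the knn.classifier function the exercise asks for (they use equal voting and do not take a vector of priors).

```r
library(class)  # recommended package shipped with R; provides knn() and knn.cv()

set.seed(1)
n <- 100
# Simulated stand-in data (assumption): two Gaussian classes in two dimensions.
X <- rbind(matrix(rnorm(n, mean = 0), ncol = 2),
           matrix(rnorm(n, mean = 1.5), ncol = 2))
y <- rep(c("a", "b"), each = n / 2)

k.try <- c(1, 3, 7, 11, 15, 21, 27, 35, 43)

# Resubstitution: classify the training points using the training points themselves.
resub.pred <- sapply(k.try, function(k) as.character(knn(X, X, factor(y), k = k)))
# Leave-one-out cross-validation: each point is classified without itself.
cv.pred <- sapply(k.try, function(k) as.character(knn.cv(X, factor(y), k = k)))

# Misclassification rate for each column (each value of k).
err <- function(pred) colMeans(pred != y)
risk <- data.frame(k = k.try,
                   resubstitution = err(resub.pred),
                   cross.validated = err(cv.pred))
print(risk)

# Span with the smallest cross-validated error (first one in case of ties).
best.k <- k.try[which.min(risk$cross.validated)]
```

With k = 1 and no duplicated points, every training point is its own nearest neighbor, so the resubstitution error is zero regardless of how well the classifier generalizes. This is why the resubstitution estimate is optimistic and why the span is better chosen from the cross-validated column.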
APER( c_n ) <= APER( f ) + ( d / n )
for any linear classifier f.