Consider the following set of two-dimensional records:
Record   x   y
1        8   4
2        5   4
3        2   4
4        2   6
5        2   8
6        8   6
Also consider two different clustering schemes: (1) where Cluster 1 contains records {1, 2, 3} and Cluster 2 contains records {4, 5, 6} and (2) where Cluster 1 contains records {1, 6} and Cluster 2 contains records {2, 3, 4, 5}. Which scheme is better and why?
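Compare the error of the two clustering schemes, where the error of a
cluster is the sum of the squared differences between each record and the
cluster mean, taken over both coordinates. The scheme with the smaller
total error is better.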
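Written out, the error used in the calculations that follow appears to be the sum of squared errors (SSE). The symbols C_k (the set of records in cluster k), m_k (its mean), and E (the total error of a scheme) are notation assumed here for illustration and are not taken from the original:

```latex
m_k = \frac{1}{|C_k|} \sum_{p \in C_k} p, \qquad
E = \sum_{k} \sum_{p \in C_k} \lVert p - m_k \rVert^2
```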
For SCHEME (1) we have C1 = {1,2,3} and C2 = {4,5,6}
M1 = ((8+5+2)/3, (4+4+4)/3) = (5,4)
C1_error = (8-5)^2 + (4-4)^2 + (5-5)^2 + (4-4)^2 + (2-5)^2 + (4-4)^2
= 18
For C2 we have
M2 = ((2+2+8)/3, (6+8+6)/3) = (4,6.67)
C2_error = (2-4)^2 + (6-6.67)^2 + (2-4)^2 + (8-6.67)^2 + (8-4)^2 + (6-6.67)^2
= 26.67
C1_error + C2_error = 44.67
For SCHEME (2) we have C1 = {1,6} and C2 = {2,3,4,5}
M1 = ((8+8)/2, (4+6)/2) = (8,5)
C1_error = (8-8)^2 + (4-5)^2 + (8-8)^2 + (6-5)^2
= 2
For C2 we have
M2 = ((5+2+2+2)/4, (4+4+6+8)/4) = (2.75,5.5)
C2_error = (5-2.75)^2 + (4-5.5)^2 + (2-2.75)^2 + (4-5.5)^2 + (2-2.75)^2 + (6-5.5)^2 + (2-2.75)^2 + (8-5.5)^2
= 17.75
C1_error + C2_error = 19.75
SCHEME (2) is better since its total error is less than that of
SCHEME (1).
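The hand arithmetic above can be checked with a short script. This is a minimal sketch, not part of the original solution: the record coordinates are read off the mean and error calculations above, and the names records, cluster_error, and scheme_error are hypothetical helpers introduced only for this check.

```python
# Records as (x, y). Coordinates are those used in the arithmetic above
# (an assumption recovered from the worked solution).
records = {1: (8, 4), 2: (5, 4), 3: (2, 4), 4: (2, 6), 5: (2, 8), 6: (8, 6)}


def cluster_error(ids):
    """Sum of squared distances from each record in the cluster to the cluster mean."""
    points = [records[i] for i in ids]
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    return sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points)


def scheme_error(clusters):
    """Total error of a clustering scheme: the sum of its cluster errors."""
    return sum(cluster_error(ids) for ids in clusters)


scheme_1 = [[1, 2, 3], [4, 5, 6]]
scheme_2 = [[1, 6], [2, 3, 4, 5]]

# Print the total error of each scheme, rounded to two decimals.
print("SCHEME (1):", round(scheme_error(scheme_1), 2))
print("SCHEME (2):", round(scheme_error(scheme_2), 2))
```

The second total comes out lower than the first, in line with the conclusion that SCHEME (2) is the better clustering under this error measure.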