Use the similarity matrix in Table 8.1 to perform single and complete link hierarchical clustering. Show your results by drawing a dendrogram. The dendrogram should clearly show the order in which the points are merged.
(b) Do both sets of centroids represent stable solutions; i.e., if the K-means
algorithm was run on this set of points using the given centroids as the
starting centroids, would there be any change in the clusters generated?
(c) What are the two clusters produced by single link?
(d) Which technique, K-means or single link, seems to produce the “most
natural” clustering in this situation? (For K-means, take the clustering
with the lowest squared error.)
(e) What definition(s) of clustering does this natural clustering correspond
to? (Well-separated, center-based, contiguous, or density.)
(f) What well-known characteristic of the K-means algorithm explains the
previous behavior?
The solutions are shown in Figures 8.6(a) and 8.6(b).
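(a) For the points {6, 12, 18, 24, 30, 42, 48} and each of the following sets of
initial centroids, create two clusters by assigning each point to the nearest
centroid, and then calculate the total squared error for each set of two clusters.
Show both the clusters and the total squared error for each set of centroids.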
i. {18, 45}
First cluster is {6, 12, 18, 24, 30}.
Error = 360.
Second cluster is {42, 48}.
Error = 18.
Total Error = 378.
ii. {15, 40}
First cluster is {6, 12, 18, 24}.
Error = 180.
Second cluster is {30, 42, 48}.
Error = 168.
Total Error = 348.
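(b) Yes, both sets of centroids are stable solutions. Each centroid is already
the mean of the points assigned to it, so rerunning K-means from {18, 45} or
{15, 40} changes neither the centroids nor the clusters.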
(c) The two clusters are {6, 12, 18, 24, 30} and {42, 48}.
(d) MIN (single link) produces the most natural clustering.
(e) MIN produces contiguous clusters. However, density is also an acceptable
answer. Even center-based is acceptable, since one set of centers
gives the desired clusters.
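(f) K-means is not good at finding clusters of different sizes, at least when
they are not well separated. The reason for this is that the objective of
minimizing squared error leads it to split the larger cluster: here the
{15, 40} solution has the lower total error (348 versus 378) even though it
is the less natural clustering.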
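The arithmetic in parts (a) to (c) can be checked with a short script. What follows is a minimal Python sketch, not part of the original solution. The point list is inferred from the clusters given in the answers, and the helper names (assign, sse, is_stable, single_link_two_clusters) are hypothetical. The single-link helper relies on an assumption that holds only for one-dimensional data: the last single-link merge joins the two groups separated by the largest gap between adjacent points.

```python
# Points inferred from the clusters listed in the solution (an assumption).
POINTS = [6, 12, 18, 24, 30, 42, 48]


def assign(points, centroids):
    """Assign each point to its nearest centroid; return one list per centroid.

    Ties go to the earlier centroid; no ties occur for the data used here.
    """
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    return clusters


def sse(cluster, centroid):
    """Sum of squared distances from each point in the cluster to the centroid."""
    return sum((p - centroid) ** 2 for p in cluster)


def is_stable(points, centroids):
    """True if recomputing the cluster means reproduces the given centroids.

    Assumes no cluster ends up empty.
    """
    clusters = assign(points, centroids)
    means = [sum(c) / len(c) for c in clusters]
    return means == [float(c) for c in centroids]


def single_link_two_clusters(points):
    """Split sorted 1-D points at the largest gap between neighbours."""
    pts = sorted(points)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    cut = gaps.index(max(gaps)) + 1
    return pts[:cut], pts[cut:]


if __name__ == "__main__":
    for centroids in ([18, 45], [15, 40]):
        clusters = assign(POINTS, centroids)
        errors = [sse(c, m) for c, m in zip(clusters, centroids)]
        print(centroids, clusters, errors, sum(errors), is_stable(POINTS, centroids))
    print(single_link_two_clusters(POINTS))
```

Run as a script, this should reproduce the total errors of 378 and 348, report both sets of centroids as stable, and return the single-link split given in part (c).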