You are given a data set with 100 records and are asked to cluster the data. You use K-means to cluster the data, but for all values of K, 1 ≤ K ≤ 100, the K-means algorithm returns only one non-empty cluster. You then apply an incremental version of K-means, but obtain exactly the same result. How is this possible? How would single link or DBSCAN handle such data?

What will be an ideal response?


(a) The data consists entirely of duplicates of one object. Any centroids chosen from the data then coincide, so no value of K can separate the points.

(b) Single link (and many of the other agglomerative hierarchical schemes) would produce a hierarchical clustering, but which points appear in which cluster would depend on the ordering of the points and the exact algorithm. However, if the dendrogram were plotted showing the proximity at which each object is merged, then it would be obvious that the data consisted of duplicates. DBSCAN would find that all points were core points connected to one another and produce a single cluster.
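
The behavior described in (a) and (b) can be checked with a small sketch. This is a minimal pure-Python illustration, not a reference implementation. It makes several assumptions: the function names `kmeans` and `core_points`, the duplicated point `(1.0, 2.0)`, and the DBSCAN parameters `eps=0.5` and `min_pts=4` are all hypothetical choices. Initial centroids are sampled from the data, distance ties go to the lowest-numbered centroid, and an empty cluster keeps its old centroid. Other K-means variants treat empty clusters differently.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iters=10, seed=0):
    """Basic (Lloyd's) K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: centroids start as data points
    labels = None
    for _ in range(max_iters):
        # Assignment step: nearest centroid, ties go to the lowest index.
        new_labels = []
        for p in points:
            dists = [sq_dist(p, c) for c in centroids]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:  # converged, no assignment changed
            break
        labels = new_labels
        # Update step: an empty cluster simply keeps its old centroid.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def core_points(points, eps, min_pts):
    """Indices of DBSCAN core points (neighborhood counts include the point itself)."""
    return [i for i, p in enumerate(points)
            if sum(1 for q in points if sq_dist(p, q) <= eps ** 2) >= min_pts]

# 100 records that are all duplicates of one object.
data = [(1.0, 2.0)] * 100

# Number of non-empty clusters for every K from 1 to 100.
nonempty_counts = {len(set(kmeans(data, k))) for k in range(1, 101)}

# Every point has all 100 points in its eps-neighborhood.
core = core_points(data, eps=0.5, min_pts=4)

print(nonempty_counts, len(core))
```

Because every pairwise distance is zero, all single link merges also happen at proximity 0, which is what makes the duplicates obvious in a dendrogram.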

Computer Science & Information Technology
