Consider the mean of a cluster of objects from a binary transaction data set. What are the minimum and maximum values of the components of the mean? What is the interpretation of components of the cluster mean? Which components most accurately characterize the objects in the cluster?
What will be an ideal response?
(a) Each component of the mean lies between 0 and 1, inclusive: it is 0
when no object in the cluster has that attribute and 1 when every
object does.
(b) For any specific component, its value is the fraction of the objects in
the cluster that have a 1 for that component. If we have asymmetric
binary data, such as market basket data, then this can be viewed as
the probability that, for example, a customer in the group represented by
the cluster buys that particular item.
(c) This depends on the type of data. For binary asymmetric data, the
components with higher values characterize the data, since, for most
clusters, the vast majority of components will have values of zero. For
regular binary data, such as the results of a true-false test, the
significant components are those that are unusually high or low with
respect to the entire set of data.
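
The three parts of the answer can be checked on a small example. The sketch below is illustrative only: the item names, the toy transactions, and the split into a cluster and "other" rows are all made up for the example, and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical toy data: rows are transactions, columns are items (1 = bought).
items = ["bread", "milk", "beer", "diapers", "caviar"]
cluster = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
])
# Hypothetical remaining transactions that are not in the cluster.
others = np.array([
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
])

# (a), (b): each component of the cluster mean is the fraction of cluster
# members with a 1 in that column, so it always lies in [0, 1].
mean = cluster.mean(axis=0)

# (c), asymmetric view: the components with the highest values
# characterize the cluster.
top_item = items[int(np.argmax(mean))]

# (c), symmetric view: compare each component with the mean over the whole
# data set and look for unusually high or low values.
overall_mean = np.vstack([cluster, others]).mean(axis=0)
deviation = mean - overall_mean
most_distinctive = items[int(np.argmax(np.abs(deviation)))]

print(dict(zip(items, mean)))
print("highest component:", top_item)
print("largest deviation from overall mean:", most_distinctive)
```

Here the cluster mean is (1.0, 0.75, 0.25, 0.25, 0.0): every transaction in the cluster contains bread, three of the four contain milk, and none contain caviar. Under either view, bread is the component that best characterizes this particular cluster.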