Association analysis can be used to find anomalies as follows. Find strong association patterns, which involve some minimum number of objects. Anomalies are those objects that do not belong to any such patterns. To make this more concrete, we note that the hyperclique association pattern discussed in Section 6.8 is particularly suitable for such an approach. Specifically, given a user-selected h-confidence level, maximal hyperclique patterns of objects are found. All objects that do not appear in a maximal hyperclique pattern of at least size three are classified as outliers.
(a) Does this technique fall into any of the categories discussed in this
chapter? If so, which one?
(b) Name one potential strength and one potential weakness of this
approach.
(a) In a hyperclique, every pair of objects is guaranteed to have a cosine
similarity at least as large as the h-confidence threshold. Thus, this
approach can be viewed as a proximity-based approach. However, rather
than a condition on the proximity of objects with respect to a particular
object, there is a requirement on the pairwise proximities of all objects
in a group.
(b) Strengths of this approach are that (1) objects that do not belong to
any hyperclique of size three or more are not strongly connected to other
objects and are therefore likely to be anomalous, and (2) it is
computationally efficient. Potential weaknesses are that (1) this approach
does not assign a numerical anomaly score, but simply classifies an object
as normal or anomalous, (2) it is not possible to directly control the
number of objects classified as anomalies, because the only parameters are
the h-confidence and support thresholds, and (3) the data needs to be
discretized.
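The procedure can be illustrated with a minimal brute-force sketch. It assumes binary data, with objects playing the role of items and attributes the role of transactions. The function names and the toy data are hypothetical, the support threshold is omitted, and a real implementation would use a hyperclique miner rather than enumerating triples. Because h-confidence is anti-monotone, an object lies in a maximal hyperclique of size at least three exactly when it lies in some hyperclique of size exactly three, so checking triples is sufficient here.

```python
from itertools import combinations
from math import sqrt


def h_confidence(group, columns):
    """h-confidence of a group of objects, treating each object as an item.

    columns maps an object id to the set of attribute indices where it has a 1.
    h-confidence = (number of attributes shared by all objects in the group)
                   / (largest attribute count of any single object in the group).
    """
    common = set.intersection(*(columns[o] for o in group))
    largest = max(len(columns[o]) for o in group)
    return len(common) / largest if largest else 0.0


def cosine(a, b, columns):
    """Cosine similarity of two binary objects, for checking the pairwise bound."""
    denom = sqrt(len(columns[a]) * len(columns[b]))
    return len(columns[a] & columns[b]) / denom if denom else 0.0


def hyperclique_outliers(columns, min_hconf, size=3):
    """Objects that appear in no hyperclique of `size` objects (brute force)."""
    covered = set()
    for group in combinations(sorted(columns), size):
        if h_confidence(group, columns) >= min_hconf:
            covered.update(group)
    return set(columns) - covered


# Toy data (assumed for illustration): a, b, c overlap heavily; d and e do not.
example = {
    "a": {0, 1, 2, 3},
    "b": {0, 1, 2, 4},
    "c": {0, 1, 2},
    "d": {5, 6},
    "e": {0, 7, 8, 9},
}

print(sorted(hyperclique_outliers(example, min_hconf=0.6)))  # ['d', 'e']
```

In this toy data the group (a, b, c) has h-confidence 3/4, and every pair within it has cosine similarity of at least 3/4, which matches the guarantee used in part (a). Note that the output is a plain set of outliers with no scores, which is the first weakness listed in part (b).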