Consider the data set shown in Table 7.13. Suppose we are interested in extracting the following association rule:

{α1 ≤ Age ≤ α2, Play Piano = Yes} → {Enjoy Classical Music = Yes}

To handle the continuous attribute, we apply the equal-frequency approach
with 3, 4, and 6 intervals. Categorical attributes are handled by introducing
as many new asymmetric binary attributes as the number of categorical
values. Assume that the support threshold is 10% and the confidence threshold
is 70%.
(a) Suppose we discretize the Age attribute into 3 equal-frequency intervals.
Find a pair of values for α1 and α2 that satisfy the minimum support
and minimum confidence requirements.
(b) Repeat part (a) by discretizing the Age attribute into 4 equal-frequency
intervals. Compare the extracted rules against the ones you had
obtained in part (a).
(c) Repeat part (a) by discretizing the Age attribute into 6 equal-frequency
intervals. Compare the extracted rules against the ones you had
obtained in part (a).
(d) From the results in parts (a), (b), and (c), discuss how the choice of
discretization intervals will affect the rules extracted by association rule
mining algorithms.
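
The procedure in parts (a) through (c) can be sketched in Python as below. This is only an illustrative sketch: Table 7.13 is not reproduced here, so HYPOTHETICAL_RECORDS is made-up data and not the table's contents, and the helper names equal_frequency_bins and rule_stats are assumptions introduced for this example. Each equal-frequency bin supplies a candidate pair (α1, α2), and the rule is then checked against the 10% support and 70% confidence thresholds.

```python
from typing import List, Tuple

# Made-up records (age, plays_piano, enjoys_classical_music).
# These are NOT the values from Table 7.13; they exist only to run the sketch.
HYPOTHETICAL_RECORDS = [
    (10, True, False), (12, True, True), (15, False, False), (18, True, True),
    (22, True, True), (25, False, True), (30, True, True), (34, False, False),
    (41, True, False), (47, False, False), (52, True, True), (60, False, True),
]


def equal_frequency_bins(values: List[float], k: int) -> List[Tuple[float, float]]:
    """Split the sorted values into k bins of (nearly) equal size and
    return the (min, max) range of each bin. Assumes k <= len(values)."""
    ordered = sorted(values)
    n = len(ordered)
    bins = []
    for i in range(k):
        chunk = ordered[i * n // k:(i + 1) * n // k]
        bins.append((chunk[0], chunk[-1]))
    return bins


def rule_stats(records, a1: float, a2: float) -> Tuple[float, float]:
    """Support and confidence of the rule
    {a1 <= Age <= a2, Play Piano = Yes} -> {Enjoy Classical Music = Yes}."""
    antecedent = [r for r in records if a1 <= r[0] <= a2 and r[1]]
    both = [r for r in antecedent if r[2]]
    support = len(both) / len(records)
    confidence = len(both) / len(antecedent) if antecedent else 0.0
    return support, confidence


def qualifying_intervals(records, k: int, min_sup: float = 0.10, min_conf: float = 0.70):
    """Return the bins whose rule meets both thresholds."""
    ages = [r[0] for r in records]
    result = []
    for a1, a2 in equal_frequency_bins(ages, k):
        sup, conf = rule_stats(records, a1, a2)
        if sup >= min_sup and conf >= min_conf:
            result.append((a1, a2, sup, conf))
    return result


if __name__ == "__main__":
    for k in (3, 4, 6):
        print(k, qualifying_intervals(HYPOTHETICAL_RECORDS, k))
```

Running the loop for 3, 4, and 6 intervals shows the trade-off that part (d) asks about: wider bins tend to raise support but can dilute confidence, while narrower bins can do the opposite. The sketch also ignores ties, so duplicate ages that straddle a bin boundary would need extra handling.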


Computer Science & Information Technology
