Consider the data set shown in Table 7.8. The first attribute is continuous, while the remaining two attributes are asymmetric binary. A rule is considered to be strong if its support exceeds 15% and its confidence exceeds 60%. The data given in Table 7.8 supports the following two strong rules:
(i) {(1 ≤ A ≤ 2), B = 1} → {C = 1}
(ii) {(5 ≤ A ≤ 8), B = 1} → {C = 1}
(a) Compute the support and confidence for both rules.
(b) To find the rules using the traditional Apriori algorithm, we need to
discretize the continuous attribute A. Suppose we apply the equal width
binning approach to discretize the data, with bin-width = 2, 3, 4. For
each bin-width, state whether the above two rules are discovered by
the Apriori algorithm. (Note that the rules may not be in the same
exact form as before because they may contain wider or narrower intervals
for A.) For each rule that corresponds to one of the above two rules,
compute its support and confidence.
(c) Comment on the effectiveness of using the equal width approach for
classifying the above data set. Is there a bin-width that allows you to find both rules satisfactorily? If not, what alternative approach can you
take to ensure that you will find both rules?
(a) s({(1 ≤ A ≤ 2), B = 1} → {C = 1}) = 1/6
c({(1 ≤ A ≤ 2), B = 1} → {C = 1}) = 1
s({(5 ≤ A ≤ 8), B = 1} → {C = 1}) = 1/6
c({(5 ≤ A ≤ 8), B = 1} → {C = 1}) = 1
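The figures in (a) follow from the usual definitions: support is the fraction of all records that satisfy both the antecedent and the consequent, and confidence is the fraction of records satisfying the antecedent that also satisfy the consequent. The sketch below illustrates the arithmetic. Table 7.8 is not reproduced here, so the `records` list is a hypothetical 12-row data set chosen only to agree with the values quoted above, and `support_confidence` is an illustrative helper name.

```python
from fractions import Fraction

# Hypothetical (A, B, C) records. This is NOT Table 7.8; the rows are an
# assumption chosen only to be consistent with the quoted results.
records = [
    (1, 1, 1), (2, 1, 1), (3, 1, 0), (4, 1, 0),
    (5, 1, 1), (6, 0, 1), (7, 0, 0), (8, 1, 1),
    (9, 0, 0), (10, 0, 0), (11, 0, 0), (12, 0, 1),
]

def support_confidence(records, lo, hi):
    """Support and confidence of {lo <= A <= hi, B = 1} -> {C = 1}."""
    # Records that satisfy the antecedent.
    antecedent = [r for r in records if lo <= r[0] <= hi and r[1] == 1]
    # Records that satisfy the antecedent and the consequent.
    both = [r for r in antecedent if r[2] == 1]
    support = Fraction(len(both), len(records))
    confidence = Fraction(len(both), len(antecedent)) if antecedent else None
    return support, confidence

print(support_confidence(records, 1, 2))  # rule (i)
print(support_confidence(records, 5, 8))  # rule (ii)
```

On this assumed data both calls give a support of 1/6 (about 16.7%, above the 15% threshold) and a confidence of 1.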
(b) When bin-width = 2, the bins are:
A1: 1 ≤ A ≤ 2; A2: 3 ≤ A ≤ 4;
A3: 5 ≤ A ≤ 6; A4: 7 ≤ A ≤ 8;
A5: 9 ≤ A ≤ 10; A6: 11 ≤ A ≤ 12;
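The bin boundaries above can be generated mechanically. The following is a minimal sketch of equal width binning for integer values of A, assuming the first bin starts at A = 1 (as it does for bin-width 2 above); `equal_width_bin` and its `start` parameter are illustrative names, not part of the exercise.

```python
def equal_width_bin(a, width, start=1):
    """Return (bin index, lower bound, upper bound) for an integer value a.

    Assumes bins of `width` consecutive integers, the first one beginning
    at `start`; bin indices are 1-based to match A1, A2, ...
    """
    index = (a - start) // width + 1
    lower = start + (index - 1) * width
    return index, lower, lower + width - 1

# Show where the endpoints of the two rule intervals fall for each bin-width.
for width in (2, 3, 4):
    print(width, [equal_width_bin(a, width) for a in (1, 2, 5, 8)])
```

Under this assumption, bin-width 2 splits the interval 5 ≤ A ≤ 8 across A3 and A4, while bin-width 4 keeps it in a single bin but absorbs 1 ≤ A ≤ 2 into the wider bin 1 ≤ A ≤ 4.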
For the first rule, there is one corresponding rule:
{A1 = 1, B = 1} → {C = 1}
s({A1 = 1, B = 1} → {C = 1}) = 1/6
c({A1 = 1, B = 1} → {C = 1}) = 1
Since the support and confidence are greater than the thresholds, the
rule can be discovered.
For the second rule, there are two corresponding rules:
{A3 = 1, B = 1} → {C = 1}
{A4 = 1, B = 1} → {C = 1}
For both rules, the support is 1/12 and the confidence is 1. Since
the support is less than the threshold (15%), these rules cannot
be discovered.