Consider the transactions shown in Table 7.14, with an item taxonomy given in Figure 7.25.



(a) What are the main challenges of mining association rules with item taxonomy?

(b) Consider the approach where each transaction t is replaced by an extended transaction t' that contains all the items in t as well as their respective ancestors. For example, the transaction t = {Chips, Cookies} will be replaced by t' = {Chips, Cookies, Snack Food, Food}. Use this approach to derive all frequent itemsets (up to size 4) with support ≥ 70%.

(c) Consider an alternative approach where the frequent itemsets are generated one level at a time. Initially, all the frequent itemsets involving items at the highest level of the hierarchy are generated. Next, we use the frequent itemsets discovered at the higher level of the hierarchy to generate candidate itemsets involving items at the lower levels of the hierarchy. For example, we generate the candidate itemset {Chips, Diet Soda} only if {Snack Food, Soda} is frequent. Use this ap


(a) The main difficulty is choosing the right support and confidence
thresholds. Items at higher levels of the taxonomy have at least as much
support as their descendants, so a single threshold tends either to favor
high-level items or to miss patterns at lower levels. Many of the rules
may also be redundant.
(b) There are 8 frequent 1-itemsets, 25 frequent 2-itemsets, 34 frequent
3-itemsets and 20 frequent 4-itemsets. The frequent 4-itemsets are:
{Food, Snack Food, Meat, Soda} {Food, Snack Food, Meat, Chips}
{Food, Snack Food, Meat, Pork} {Food, Snack Food, Meat, Chicken}
{Food, Snack Food, Soda, Chips} {Food, Snack Food, Chips, Pork}
{Food, Snack Food, Chips, Chicken} {Food, Meat, Soda, Chips}
{Food, Meat, Soda, Pork} {Food, Meat, Soda, Chicken}
{Food, Meat, Soda, Ham} {Food, Meat, Chips, Pork}
{Food, Meat, Chips, Chicken} {Food, Meat, Pork, Chicken}
{Food, Meat, Pork, Ham} {Food, Soda, Pork, Ham}
{Snack Food, Meat, Soda, Chips} {Snack Food, Meat, Chips, Pork}
{Snack Food, Meat, Chips, Chicken} {Meat, Soda, Pork, Ham}
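
The extended-transaction idea from part (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PARENT` map is a hypothetical fragment of the taxonomy built from the examples in the question (Chips and Cookies under Snack Food under Food, Diet Soda under Soda), and the `toy` transactions are made up, since neither Figure 7.25 nor Table 7.14 is reproduced here. The helper names `ancestors`, `extend`, and `support` are likewise illustrative.

```python
# Hypothetical child -> parent links; a fragment inferred from the examples
# in the question, NOT the full taxonomy of Figure 7.25.
PARENT = {
    "Chips": "Snack Food",
    "Cookies": "Snack Food",
    "Snack Food": "Food",
    "Diet Soda": "Soda",
}


def ancestors(item, parent=PARENT):
    """Return every ancestor of item by walking up the child -> parent map."""
    found = []
    while item in parent:
        item = parent[item]
        found.append(item)
    return found


def extend(transaction, parent=PARENT):
    """Return the extended transaction t': items of t plus all their ancestors."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, parent))
    return extended


def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


# Made-up toy transactions (NOT Table 7.14).
toy = [{"Chips"}, {"Cookies"}, {"Chips", "Diet Soda"}, {"Diet Soda"}]
extended = [extend(t) for t in toy]

print(extend({"Chips", "Cookies"}))       # Chips, Cookies, Snack Food, Food
print(support({"Chips"}, extended))       # 0.5
print(support({"Snack Food"}, extended))  # 0.75
```

On this toy data, Snack Food has higher support than Chips, which illustrates the threshold problem raised in (a). Any standard frequent-itemset miner could then be run on `extended`, at the cost of longer transactions and many itemsets that pair an item with its own ancestor.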
(c) There are 8 frequent 1-itemsets, 6 frequent 2-itemsets, and 1 frequent
3-itemset. The frequent 2-itemsets and 3-itemsets are:
{Snack Food, Meat} {Snack Food, Soda}
{Meat, Soda} {Chips, Pork}
{Chips, Chicken} {Pork, Chicken}
{Snack Food, Meat, Soda}
(d) The method in part (b) is more complete but less efficient than the
method in part (c). The method in part (c) is more efficient but may
lose some frequent itemsets.
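
The level-wise candidate filter described in part (c) can be sketched as follows. This is a minimal illustration, not the textbook's algorithm: the `PARENT` map is a hypothetical one-level fragment taken from the examples in the question, and `lift` and `lower_level_candidates` are names introduced here. A lower-level candidate is kept only if replacing each item by its parent yields an itemset already found frequent one level up.

```python
from itertools import combinations

# Hypothetical child -> parent links for one level of the taxonomy;
# a fragment inferred from the question, NOT the full Figure 7.25.
PARENT = {
    "Chips": "Snack Food",
    "Cookies": "Snack Food",
    "Diet Soda": "Soda",
}


def lift(itemset, parent=PARENT):
    """Replace every item by its parent one level up."""
    return frozenset(parent[item] for item in itemset)


def lower_level_candidates(items, size, frequent_above, parent=PARENT):
    """Keep size-k itemsets over items whose lifted image is frequent above."""
    kept = []
    for combo in combinations(sorted(items), size):
        if lift(combo, parent) in frequent_above:
            kept.append(frozenset(combo))
    return kept


# Assume, for illustration only, that {Snack Food, Soda} was found frequent.
frequent_above = {frozenset({"Snack Food", "Soda"})}
candidates = lower_level_candidates(
    ["Chips", "Cookies", "Diet Soda"], 2, frequent_above
)
print(candidates)  # {Chips, Diet Soda} and {Cookies, Diet Soda}
```

In this sketch {Chips, Cookies} is dropped because both items lift to the single parent Snack Food; whether sibling pairs should be kept is a design choice the exercise does not settle. Because candidates are built within one level at a time, itemsets that mix levels, such as {Food, Snack Food, Meat, Soda} from part (b), are never generated, which is consistent with the completeness gap noted in (d).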

Computer Science & Information Technology
