Build a decision tree on the data set. Does the tree capture the “+” and “−” concepts?

Following is a data set that contains two attributes, X and Y, and two class labels, “+” and “−”. Each attribute can take three different values: 0, 1, or 2.

The concept for the “+” class is Y = 1 and the concept for the “−” class is X = 0 ∨ X = 2.


There are 30 positive and 600 negative examples in the data. Therefore, at the root node, the error rate is

Eorig = 1 − max(30/630, 600/630) = 30/630.



If we split on X, the gain in error rate is:
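The worked computation is missing from the text at this point. The following is a reconstruction, not the original derivation. It assumes per-node counts inferred from the class totals and the leaf counts stated in this solution: 10 “+” and 300 “−” in each of X = 0 and X = 2, and 10 “+” with no “−” in X = 1. The symbols E_{X=v} and Δ_X are notation introduced here.

```latex
\begin{align*}
E_{X=0} &= 1 - \max\left(\tfrac{10}{310}, \tfrac{300}{310}\right) = \tfrac{10}{310}, \\
E_{X=1} &= 1 - \max\left(\tfrac{10}{10}, \tfrac{0}{10}\right) = 0, \\
E_{X=2} &= 1 - \max\left(\tfrac{10}{310}, \tfrac{300}{310}\right) = \tfrac{10}{310}, \\
\Delta_X &= E_{\text{orig}} - \tfrac{310}{630}\cdot\tfrac{10}{310}
            - \tfrac{10}{630}\cdot 0 - \tfrac{310}{630}\cdot\tfrac{10}{310}
          = \tfrac{30}{630} - \tfrac{20}{630} = \tfrac{10}{630}.
\end{align*}
```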







If we split on Y, the gain in error rate is:
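The worked computation is missing from the text at this point. The following is a reconstruction, not the original derivation. It assumes per-node counts inferred from the class totals and the “+” concept Y = 1: no “+” and 200 “−” in each of Y = 0 and Y = 2, and 30 “+” with 200 “−” in Y = 1. The symbols E_{Y=v} and Δ_Y are notation introduced here.

```latex
\begin{align*}
E_{Y=0} &= 1 - \max\left(\tfrac{0}{200}, \tfrac{200}{200}\right) = 0, \\
E_{Y=1} &= 1 - \max\left(\tfrac{30}{230}, \tfrac{200}{230}\right) = \tfrac{30}{230}, \\
E_{Y=2} &= 1 - \max\left(\tfrac{0}{200}, \tfrac{200}{200}\right) = 0, \\
\Delta_Y &= E_{\text{orig}} - \tfrac{200}{630}\cdot 0
            - \tfrac{230}{630}\cdot\tfrac{30}{230} - \tfrac{200}{630}\cdot 0
          = \tfrac{30}{630} - \tfrac{30}{630} = 0.
\end{align*}
```

Under these counts, splitting on Y yields no reduction in the error rate.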







Therefore, X is chosen to be the first splitting attribute. Since the X = 1 child node is pure, it does not require further splitting. We may use attribute Y to split the impure nodes, X = 0 and X = 2, as follows:



• The Y = 0 and Y = 2 nodes each contain 100 − instances.
• The Y = 1 node contains 100 − and 10 + instances.



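In all three cases for Y, the child nodes are labeled as −. The resulting concept is: class = + if X = 1, and − otherwise. The tree therefore captures the “−” concept (X = 0 ∨ X = 2) but not the “+” concept (Y = 1).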
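As a quick check of the steps above, the following Python sketch recomputes both gains and counts the training errors of a tree that predicts “+” only when X = 1. The per-cell counts are an assumption inferred from the class totals and leaf counts stated in this solution, not quoted from the data table. The names `counts`, `error`, `gain`, and `predict` are illustrative.

```python
from fractions import Fraction

# Assumed counts, inferred from the solution text (not quoted from a table):
# counts[(x, y)] = (number of "+", number of "-")
counts = {(x, y): (0, 0) for x in range(3) for y in range(3)}
for x in (0, 2):
    for y in range(3):
        counts[(x, y)] = (10 if y == 1 else 0, 100)
counts[(1, 1)] = (10, 0)


def error(pos, neg):
    """Error rate of a node labeled with its majority class."""
    total = pos + neg
    return Fraction(min(pos, neg), total) if total else Fraction(0)


def gain(attr):
    """Gain in error rate from splitting the root on attr (0 = X, 1 = Y)."""
    n = sum(p + q for p, q in counts.values())
    root = error(sum(p for p, _ in counts.values()),
                 sum(q for _, q in counts.values()))
    weighted = Fraction(0)
    for v in range(3):
        pos = sum(p for k, (p, _) in counts.items() if k[attr] == v)
        neg = sum(q for k, (_, q) in counts.items() if k[attr] == v)
        weighted += Fraction(pos + neg, n) * error(pos, neg)
    return root - weighted


def predict(x, y):
    """Concept learned by the tree: "+" if X = 1, "-" otherwise."""
    return "+" if x == 1 else "-"


# Count training examples whose true class differs from the prediction.
misclassified = sum(
    neg if predict(x, y) == "+" else pos
    for (x, y), (pos, neg) in counts.items()
)

print(gain(0), gain(1), misclassified)
```

Exact fractions are used so the gains can be compared without rounding. Under the assumed counts, the 20 misclassified examples are the “+” instances with Y = 1 that fall in the X = 0 and X = 2 branches.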



