Suppose the table of Figure 17.16 is stored in a relational database. Use SQL to compute the probabilities needed to compute the information gain when using the PrevDefault attribute as the topmost attribute of a decision tree based on that table.
What will be an ideal response?
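-- For each PrevDefault value, compute P(Default = 'yes' | PrevDefault) and
-- P(Default = 'no' | PrevDefault). Default is a reserved word, so it is quoted.
-- The inner joins drop any PrevDefault value that has no 'yes' rows or no 'no' rows.
SELECT C.PrevDefault,
       1.0 * DefYes.Cnt / COUNT(*) AS PYes,
       1.0 * DefNo.Cnt / COUNT(*) AS PNo
FROM Customer C,
     (SELECT C1.PrevDefault, COUNT(*) AS Cnt
      FROM Customer C1
      WHERE C1."Default" = 'yes'
      GROUP BY C1.PrevDefault) AS DefYes,
     (SELECT C1.PrevDefault, COUNT(*) AS Cnt
      FROM Customer C1
      WHERE C1."Default" = 'no'
      GROUP BY C1.PrevDefault) AS DefNo
WHERE C.PrevDefault = DefYes.PrevDefault
  AND C.PrevDefault = DefNo.PrevDefault
GROUP BY C.PrevDefault, DefYes.Cnt, DefNo.Cnt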
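The probabilities asked for can be checked end to end with a small script. The following is a minimal sketch, not the textbook's solution: it assumes a Customer table with only the two columns PrevDefault and Default, loads hypothetical toy rows (not the data of Figure 17.16) into an in-memory SQLite database, and uses a CASE-based variant of the query instead of joined subqueries. It also returns P(PrevDefault = v), which the information gain formula needs as the weight of each branch. The names ROWS, QUERY, entropy and information_gain are illustrative.

```python
import math
import sqlite3

# Hypothetical toy rows (NOT the data of Figure 17.16): (PrevDefault, Default)
ROWS = [("yes", "yes"), ("yes", "yes"), ("yes", "no"),
        ("no", "yes"), ("no", "no"), ("no", "no"), ("no", "no"), ("no", "no")]

# One row per PrevDefault value: its relative frequency and the
# conditional probabilities of Default = 'yes' / 'no' within that value.
QUERY = """
SELECT PrevDefault,
       1.0 * COUNT(*) / (SELECT COUNT(*) FROM Customer) AS PValue,
       AVG(CASE WHEN "Default" = 'yes' THEN 1.0 ELSE 0.0 END) AS PYes,
       AVG(CASE WHEN "Default" = 'no' THEN 1.0 ELSE 0.0 END) AS PNo
FROM Customer
GROUP BY PrevDefault
ORDER BY PrevDefault
"""


def entropy(probs):
    """Shannon entropy in bits; zero probabilities contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gain(rows):
    """Return (query rows, information gain of splitting on PrevDefault)."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE Customer (PrevDefault TEXT, "Default" TEXT)')
    conn.executemany("INSERT INTO Customer VALUES (?, ?)", rows)
    stats = conn.execute(QUERY).fetchall()
    conn.close()
    # Overall P(Default = 'yes') by the law of total probability.
    p_yes = sum(p_value * p_y for _, p_value, p_y, _ in stats)
    before = entropy([p_yes, 1 - p_yes])
    # Expected entropy after the split, weighted by branch frequency.
    after = sum(p_value * entropy([p_y, p_n]) for _, p_value, p_y, p_n in stats)
    return stats, before - after


if __name__ == "__main__":
    stats, gain = information_gain(ROWS)
    for row in stats:
        print(row)
    print("information gain:", gain)
```

Unlike the join-based form, the CASE-based query keeps a PrevDefault value even when one of the two Default outcomes never occurs for it, in which case the corresponding probability is simply 0.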