Pages in a database have P bytes. A table T has R rows and n integer attributes, A1, A2, .An. Each attribute is r bytes long. Assume that r is much greater than the size of a pointer, so the pointer size can be ignored when computing the size of an entry in an index. Assume that the domain of A1 is the integers between 1 and 100 and that the values of this attribute are randomly distributed in

the domain. Consider the following statement:


SELECT A2
FROM T
WHERE A1 <= 50 AND A1 > 25

Evaluate the cost, in terms of the number of I/O operations, of executing the statement under the following conditions. In order to get credit you must explain your work clearly.

(a) The table is stored as a heap (no indexes).
(b) There is a non-integrated (separate index file) clustered index with search key A1.
(c) There is an unclustered index with search key (A2, A1).


(a) The table is stored as a heap (no indexes).
Solution:
scan entire table
size of table = n ? r ? R
number pages in table = (n ? r ? R)/P = cost
(b) There is a non-integrated (separate index file) clustered index with search key A1.
Solution:
Search down the tree using A1 = 25, follow pointer to first qualifying row in data file, scan data file and return all rows until row with A1>50 is found.
fan-out of a page in the index = P/r (or P/2r for a worst case analysis) num. of levels in tree = logP/r R = cost of tree search
cost of file scan = .25((n r R)/P) since approximately a quarter of the rows have to be
returned (values of A1 are randomly distributed) total cost = logP/r R + 1 + .25((n ? r ? R)/P)
(c) There is an unclustered index with search key (A2, A1).
Solution:
Scan leaf level of tree (index covering). An index entry has 2r bytes, hence the (mini- mum) number of leaf level pages in the index = R/(P/2r) (worst case analysis doubles the cost).

Computer Science & Information Technology

You might also like to view...

The maximum speed of a 10BaseFL network is __________________.

Fill in the blank(s) with the appropriate word(s).

Computer Science & Information Technology

A surge is a short transient in the voltage that can be due to a short circuit or power outage

Indicate whether the statement is true or false

Computer Science & Information Technology

Find the mean of the set of data. 2,  4,  6,  5,  2,  4,  9,  4, 8, 12

A. 5.1 B. 4.5 C. 5.6 D. 10 E. 4

Computer Science & Information Technology

Why would a manager prefer a decision tree instead of a decision table?

What will be an ideal response?

Computer Science & Information Technology