Consider the use of unclustered B + trees for external sorting. Let R denote the number of data records per disk block, and let F be the number of blocks in the data ?le. Estimate the cost of such a sorting procedure as a function of R and F. Compare this cost to merge-based external sorting. Consider the cases of R = 1, 10, and 100.

What will be an ideal response?


The cost of using unclustered B+ trees is 2R x F because we have to scan the leaves of the B+ tree and for each entry there to access the corresponding data ?le block twice (to read and then to write). When R increases, the cost increases.
Suppose we have M pages of main memory in which to do the sorting and that we use k-way merge. The cost of merge-based external sorting is 2F logk F /M . This cost does not depend on R.
When R = 1, B+ tree sorting is slightly better. For R = 10, merge-sort is likely to be better even for large ?les. For instance, if M = 10, 000 and R = 100, 000, 000 (i.e., 400G), then we can do 999-way merge and

log999(100000000/10000) < 2 < R = 10

Computer Science & Information Technology

You might also like to view...

Provide the class with a glossary of the most important project management terms and definitions.

What will be an ideal response?

Computer Science & Information Technology

What is the proper syntax when using a message dialog box?

(A) MessageBox.Show("Hi there", "Hi") (B) MessageBox.Show(Hi there, Hi) (C) MessageBox.Show "Hi There", "Hi" (D) MessageBox.Show Hi There, Hi

Computer Science & Information Technology

Recursion is often less efficient than iteration because ________.

a. it can cause an explosion of method calls. b. it is not as intuitive. c. recursive methods are harder to debug. d. recursive methods take longer to program.

Computer Science & Information Technology

What do the following commands do on your UNIX or LINUX system,? Where does the output of these commands go?

a. gcc myProg.c b. gcc myProg.c -o myProg c. gcc -o myProg myProg.c d. gcc myProg.c -o myProg.c e gcc myProg.c -o myProg -lm -lsocket f. gcc -E myProg.c g. gcc -E myProg.c > myProg.i h. gcc -S -DSIZE=256 myProg.c i. gcc -S -DSIZE=256 myProg.c -o myProg

Computer Science & Information Technology