Consider the use of unclustered B + trees for external sorting. Let R denote the number of data records per disk block, and let F be the number of blocks in the data ?le. Estimate the cost of such a sorting procedure as a function of R and F. Compare this cost to merge-based external sorting. Consider the cases of R = 1, 10, and 100.
What will be an ideal response?
The cost of using unclustered B+ trees is 2R x F because we have to scan the leaves of the B+ tree and for each entry there to access the corresponding data ?le block twice (to read and then to write). When R increases, the cost increases.
Suppose we have M pages of main memory in which to do the sorting and that we use k-way merge. The cost of merge-based external sorting is 2F logk F /M . This cost does not depend on R.
When R = 1, B+ tree sorting is slightly better. For R = 10, merge-sort is likely to be better even for large ?les. For instance, if M = 10, 000 and R = 100, 000, 000 (i.e., 400G), then we can do 999-way merge and
log999(100000000/10000) < 2 < R = 10
You might also like to view...
Provide the class with a glossary of the most important project management terms and definitions.
What will be an ideal response?
What is the proper syntax when using a message dialog box?
(A) MessageBox.Show("Hi there", "Hi") (B) MessageBox.Show(Hi there, Hi) (C) MessageBox.Show "Hi There", "Hi" (D) MessageBox.Show Hi There, Hi
Recursion is often less efficient than iteration because ________.
a. it can cause an explosion of method calls. b. it is not as intuitive. c. recursive methods are harder to debug. d. recursive methods take longer to program.
What do the following commands do on your UNIX or LINUX system,? Where does the output of these commands go?
a. gcc myProg.c b. gcc myProg.c -o myProg c. gcc -o myProg myProg.c d. gcc myProg.c -o myProg.c e gcc myProg.c -o myProg -lm -lsocket f. gcc -E myProg.c g. gcc -E myProg.c > myProg.i h. gcc -S -DSIZE=256 myProg.c i. gcc -S -DSIZE=256 myProg.c -o myProg