A company sells merchandise through agents. Each sales agent is assigned one or more cities to cover. Consider the following schema, where the keys are underlined:





Transaction means that item i was sold on date d in the area with zip code z.)



Consider the following query:







Show the three ?most promising? relational algebra expressions that the query optimizer is

likely to consider; then nd the most ecient query plan and estimate its cost.

Assume 50 bu er pages and the following statistics and indices:

 Agent: 1,000 tuples, 5 tuples/page; 200 agents.

Clustered hash index on Name.

Unclustered B+ tree index on City.

 Location: 2500 tuples, 50 tuples/page; 500 cities.

Unclustered hash index on ZIP.

 Transaction: 10,000 tuples, 20 items; 10 tuples/page.

2-level clustered B+ tree index on Item.


Solution:

One most promising expression is where the selection Name='007' is pushed, the other is

where Item='land' is pushed, and the third is where both are pushed.



The most efficient query plan is depicted in the figure.





Sizes:



 Name=00070(Agent): 5 tuples, 1 page, 1 I/O (since Name is clustered).

 Item=0land0(Transaction): 500 tuples, 50 pages. There can be up to 250 ZIP codes among

these tuples, 50 I/Os + 2 to search the index.

 Location: 50 pages

The rst join can potentially create 25 tuples (5 ZIP codes per city, 5 cities selected from

Agent) at the cost of 25 I/O. The result will occupy at most 10 pages (given that tuples can be

twice as long).

In the block-nested loops join, we will scan the selection on Transaction once in the outer

loop (50 pages) and the result of the previous join (10 pages) at most twice in the inner loop.

(Twice because we have 50 bu er pages only and the outer relation is also 50 pages long). So,

the cost of this join is 50+20=70pages.

Total: 1 + 50 + 2 + 25 + 70 = 148 page I/Os.

Computer Science & Information Technology

You might also like to view...

The collection of computer programs that controls the interaction of the user and the computer hardware is called

a. the application program b. the compiler c. the linker d. the operating system

Computer Science & Information Technology

Which operator can be used in string concatenation?

a. * b. += c. ++ d. =+

Computer Science & Information Technology

Competing technologies that use FHSS can create ______________ interference.

Fill in the blank(s) with the appropriate word(s).

Computer Science & Information Technology

Users can add additional tables to databases created from templates

Indicate whether the statement is true or false

Computer Science & Information Technology