Consider the following schema, where the keys are underlined:





Company is an unnormalized relation that contains information on the states where companies

are registered (Name ! RegistrationState is an FD) and the industries where they belong

(company-industry is a many-to-many relationship). The City relation contains information

about the number of people living in each city. Different cities can have the same name

in different states, but city names within the same state are unique. Consider the following

query:



SELECT C.Name, T.Name

FROM Company C, City T

WHERE C.RegistrationState = T.State AND

T.Population > 9000000 AND

C.Industry = 'Apparel'

ORDER BY C.Name



Assume the following statistical information:

 20,000 tuples in Company relation

 5,000 tuples in City relation

 50 tuples/page in each relation

 6-page bu er

 The domain of Population consists of integers in the range of 1000 to 10,000,000

 There are 50 industries represented in the database and 50 states.

 Indices:

? Company relation:

On Industry: Clustered, hash

? City relation:

On State: Clustered, 2-level B+ tree

Question: Find the best execution plan for the above query, draw it, and estimate its cost.


Solution:

The index on Company cannot be used in a join, so it makes sense to push the selection and

the projection to Company. The size of Name;RegistrationState(Industry=Apparel(Company)) is:

20000/50 = 400 tuples (after selection) = 8 pages. Projection further reduces the size by 2/3

to 5.4 (i.e., 6) pages.

If we use the clustered index on City.State in the join, we would have to use the index 400 times

for each tuple in Name;RegistrationState(Industry=Apparel(Company)). But if we do it smartly only for the states that actually occur in that expression, we still would have to use the index 50times. This is cheaper than computing Population>9M(City), because this selection will need toscan City (100 pages).

Since City is clustered on State, this relation is ordered on the State attribute. Therefore,

we can pull Name;RegistrationState(Industry=Apparel(Company)) through the 6-page bu er using just one page and join it with City using the index on City.State.

Thus, the total cost is: 8 + 50 = 58 I/Os.

The query plan is shown below.

Computer Science & Information Technology

You might also like to view...

OpenOffice Writer's ________ feature can be used to create documents such as mailing labels and print envelopes

Fill in the blank(s) with correct word

Computer Science & Information Technology

Which of the following is NOT an advantage of Office 365?

A) Microsoft Office 365 changes a company's expenses for software, because it pays only for the number of users who actually use the software B) Because users can log in from multiple places that may not be secure, their accounts are at a risk of having their passwords hacked or stolen by keyloggers. C) Microsoft guarantees that Office 365 will be available 99.9 percent of the time. D) Office 365 provides secure access to and storage of company records and files using geo-redundant data centers

Computer Science & Information Technology

?When setting the transition timing, it must be ensured that the speed of the transition is constant.

Answer the following statement true (T) or false (F)

Computer Science & Information Technology

An Answer Report shows original and final values of the target cell and changing cells plus ____.

A. cell formatting B. constraints C. previous trial and error solutions D. all scenarios

Computer Science & Information Technology