Design a query evaluation algorithm for the CUBE operator that uses the results of the previously computed aggregations to compute newaggregations. (Hint: Organize the GROUP BY clauses used in the computation of a data cube into a lattice, that is, a partial order with the least upper bound and the greatest lower bound for each pair of elements. Here is an example of such a partial order: GROUP BY A > GROUP BY A,B > GROUP BY A,B,C and GROUP BY B > GROUP BY B,C > GROUP BY A,B,C. Describe howaggregates computed for the lower parts of the lattice can be used in the computation of the upper parts.)

What will be an ideal response?


The idea is basically the same as in the case of ROLLUP: to compute


SELECT A1,..., An aggr(B)
FROM SomeRelation
WHERE ...
GROUP BY A1,..., An

where aggr is min, max, sum, or count, we can first compute the relation
SomeRelation1,...,n,n+1

SELECT A1,..., An, An+1, aggr(B)
FROM SomeRelation
WHERE ...
GROUP BY A1,..., An, An+1

and then compute

SELECT A1,..., An, aggr(B)
FROM SomeRelation1,...,n,n+1
WHERE ...
GROUP BY A1,..., An

In this way, we are reusing the previous aggregation of aggr(B) over the attributes
of SomeRelation minus A1,..., An, An+1, B and nowonly need to aggregate the
values of B in SomeRelation1,...,n,n+1 over the attribute An+1.
If aggr is avg, we need to compute sum and count using the above optimization
and then divide one by the other.
We can thus start computing

GROUP BY CUBE (A1,..., An)

with an aggregation that groups by all the attributes, A1,..., An, then move to
the next layer in the lattice, where we aggregate over all attributes but one etc. For
instance, the second level of aggregation will perform the following groupings:

GROUP BY A2, A3, ..., An
GROUP BY A1, A3,..., An
... ... ...
GROUP BY A1, A2, ..., An?1

Thus, the second layer will have N aggregations. The first GROUP BY above aggregates B in SomeRelation1,...,n over the attribute A1, the second aggregates it over A2, etc.
The third layer will have N (N ? 1)/2! aggregations, etc. The important point is that at layer 2 aggregation is done over the relation computed at layer 1, SomeRelation1,...,n,
and at layer 3 we aggregate over the appropriate relations computed at layer 2. We
continue in this way until we reach the top layer where we have N aggregations, each over a single attribute.

Computer Science & Information Technology

You might also like to view...

To place a new item in a List at a specific index use the method ________.

a) Add b) Insert c) IndexOf d) InsertAt

Computer Science & Information Technology

A layer 2 switch does which of the following? (Select all that apply.)

a. Provides a direct connection for networking devices in a LAN b. Uses MAC addressing from the data link layer c. Uses MAC addressing from the network layer d. Uses IP addressing from the network layer

Computer Science & Information Technology

The ____________________ frame does not change the appearance of the button.

Fill in the blank(s) with the appropriate word(s).

Computer Science & Information Technology

When you place grouped artwork into a layer, a sublayer is automatically created with the name ____.

A. B. C. D.

Computer Science & Information Technology