vault backup: 2024-01-25 14:02:02
# Query Optimisation

The dominant cost in query processing is secondary storage access. The fewer blocks a query has to access, the faster it runs.
Many tuples fit into a single block, so changing one tuple requires the query to find the block, find the tuple, edit the tuple, and place it back in the block. This is inefficient when many different queries access different blocks.
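
To make the block-access cost concrete, here is a minimal Python sketch. It assumes a hypothetical file in which tuples are packed into fixed-size blocks in storage order; `TUPLES_PER_BLOCK` and `blocks_touched` are illustrative names, not part of any DBMS.

```python
# Hypothetical layout: tuple i lives in block i // TUPLES_PER_BLOCK.
TUPLES_PER_BLOCK = 4  # assumed block capacity, chosen only for illustration


def blocks_touched(tuple_positions):
    """Return the set of block numbers that must be read for these tuples."""
    return {pos // TUPLES_PER_BLOCK for pos in tuple_positions}


# Four tuples stored next to each other share a single block...
clustered = blocks_touched([0, 1, 2, 3])
# ...while four tuples scattered through the file need one block each.
scattered = blocks_touched([0, 5, 10, 15])

print(len(clustered), "block read for neighbouring tuples")
print(len(scattered), "block reads for scattered tuples")
```

The same number of tuples costs four times as many block reads when they are spread across the file, which is the cost an index and a good file ordering try to avoid.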

# Types of Index

## Primary Index

The data file is sequentially ordered by an ordering key field, and the index is built on that same field. The ordering key is guaranteed to have a unique value for each tuple.
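
A primary index lookup might be sketched as follows. This assumes one index entry per block holding the first key in that block (a sparse index), which the notes do not state; the `blocks` data and the `lookup` helper are hypothetical.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of (key, value) tuples, sorted by a unique key.
blocks = [
    [(1, "a"), (3, "b"), (5, "c")],
    [(7, "d"), (9, "e"), (11, "f")],
    [(13, "g"), (15, "h"), (17, "i")],
]

# Primary index: one entry per block, holding the first key in that block.
index = [block[0][0] for block in blocks]


def lookup(key):
    """Find the value for key by reading at most one data block."""
    block_no = bisect_right(index, key) - 1  # last block whose first key <= key
    if block_no < 0:
        return None  # key is smaller than every key in the file
    for k, v in blocks[block_no]:
        if k == key:
            return v
    return None


print(lookup(9))  # key stored in the second block
print(lookup(4))  # key that is not in the file
```

Because the file is ordered on the key, a binary search over the small index is enough to pick the one block that could contain the tuple.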

## Clustering Index

The data file is sequentially ordered on a non-key field, and the index is built on that same non-key field. More than one tuple can correspond to a value in the indexing field.
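
A clustering index can be sketched in the same style. The assumption here is one index entry per distinct value, pointing at the first block that contains it; the department and name data are made up for illustration.

```python
# Hypothetical data file ordered on a non-key field (dept); values repeat.
blocks = [
    [("art", "Ann"), ("art", "Bob"), ("cs", "Cat")],
    [("cs", "Dan"), ("cs", "Eve"), ("math", "Fay")],
    [("math", "Gus"), ("math", "Hal"), ("math", "Ivy")],
]

# Clustering index: one entry per distinct value, pointing at its first block.
cluster_index = {}
for block_no, block in enumerate(blocks):
    for dept, _name in block:
        cluster_index.setdefault(dept, block_no)


def lookup(dept):
    """Return all names for dept, scanning forward from the first matching block."""
    names = []
    start = cluster_index.get(dept)
    if start is None:
        return names
    for block in blocks[start:]:
        matches = [name for d, name in block if d == dept]
        if not matches:
            break  # the file is ordered, so no later block can match
        names.extend(matches)
    return names


print(lookup("cs"))
```

Since equal values are stored next to each other, the scan can stop at the first block that has no match.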

## Primary / Clustered Indices

Primary and clustering indices determine the order in which the data is stored in the file.

## Secondary Indices

Secondary indices provide a separate lookup table into the file and do not change how the data is ordered.
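
As a rough illustration of the lookup-table idea, the sketch below builds a secondary index on a field the file is not ordered by. The file layout, the `name_index` structure, and the `(block, slot)` pointer format are assumptions made for this example.

```python
# Hypothetical data file ordered by student ID; names are in no particular order.
blocks = [
    [(1, "Zed"), (2, "Amy")],
    [(3, "Kim"), (4, "Amy")],
    [(5, "Bea"), (6, "Kim")],
]

# Secondary index on name: each value maps to the (block, slot) of every match.
name_index = {}
for block_no, block in enumerate(blocks):
    for slot, (_stu_id, name) in enumerate(block):
        name_index.setdefault(name, []).append((block_no, slot))


def lookup(name):
    """Return matching tuples by following pointers instead of scanning the file."""
    return [blocks[b][s] for b, s in name_index.get(name, [])]


print(lookup("Amy"))
```

The data file stays in ID order; the index only records where each name can be found, so matches for one name may sit in different blocks.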

# Index Restrictions

- A table can have one primary index OR one clustering index, because the file can only be physically ordered in one way.
- The most frequently looked-up field is often the best choice for it.
- Some DBMSs assume the PK is the primary index, as it is usually used to refer to rows.
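
SQLite is one example of a DBMS that ties the primary key to the storage order: an `INTEGER PRIMARY KEY` column is an alias for the rowid that the table's own B-tree is keyed on. The sketch below uses Python's standard `sqlite3` module to show the planner using that key directly and using a separate index for another field. The table, column, and index names are hypothetical, and the exact plan text varies between SQLite versions, so it is printed rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The INTEGER PRIMARY KEY column doubles as the rowid the table is stored by.
conn.execute("CREATE TABLE student (stuID INTEGER PRIMARY KEY, lastName TEXT)")
# A secondary index is a separate structure built on another field.
conn.execute("CREATE INDEX idx_student_lastName ON student (lastName)")


def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


pk_plan = plan("SELECT * FROM student WHERE stuID = 1")
name_plan = plan("SELECT * FROM student WHERE lastName = 'Burns'")
print(pk_plan)
print(name_plan)
```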

# Exercise

1. What is a Query Tree?
A query tree is a visual model of the logical structure of a database query. Each leaf node represents a relation. Each internal node represents a relational algebra operation, e.g. selection, projection, product, etc.
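
One way to picture this structure in code is a small tree of labelled nodes. The `Node` class and `render` helper below are illustrative only, and the example query over `lecturer` and `module` is a hypothetical one.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a query tree: a relation (leaf) or an operation (internal)."""
    label: str
    children: list = field(default_factory=list)

    def is_leaf(self):
        return not self.children


def render(node, depth=0):
    """Return the tree as indented lines, root first."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# pi lecName ( sigma lecturer.lecID = module.lectID ( lecturer x module ) )
tree = Node("pi lecName", [
    Node("sigma lecturer.lecID = module.lectID", [
        Node("x", [Node("lecturer"), Node("module")]),
    ]),
])

print("\n".join(render(tree)))
```

The relations end up as the leaves, and evaluation proceeds from the leaves up to the root operation.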

2. Write a relational algebra expression for the following query:

```sql
SELECT lecName, schedule
FROM lecturer, module, enrol, student
WHERE lastName='Burns'
AND firstName='Edward'
AND module.moduleNumber=enrol.moduleNumber
AND lecturer.lecID=module.lectID
AND student.stuID=enrol.stuID;
```

pi lecName, schedule (
sigma lastName = 'Burns' (
sigma firstName = 'Edward' (
sigma student.stuID = enrol.stuID (
sigma module.moduleNumber = enrol.moduleNumber (
sigma lecturer.lecID = module.lectID ( lecturer x module ) x enrol
) x student
)
)
)
)
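
For reference, one way to typeset a relational algebra expression for this query in LaTeX is sketched below. It uses the unoptimised canonical form (a single selection over the Cartesian product of all four relations) and collects every condition from the WHERE clause into a predicate named $C$, which is a label introduced here for readability.

```latex
\pi_{\text{lecName},\ \text{schedule}}
\Big( \sigma_{C} \big( \text{lecturer} \times \text{module} \times \text{enrol} \times \text{student} \big) \Big)

\text{where } C =\;
\text{lastName} = \text{'Burns'}
\;\land\; \text{firstName} = \text{'Edward'}
\;\land\; \text{module.moduleNumber} = \text{enrol.moduleNumber}
\;\land\; \text{lecturer.lecID} = \text{module.lectID}
\;\land\; \text{student.stuID} = \text{enrol.stuID}
```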

Draw a Query Tree for this SQL query:

3. Why can we not use SQL for query optimisation?
SQL only states what result is wanted, not the order in which operations are carried out. Relational algebra is less abstract and easier to visualise as a sequence of operations, which lets us reorder them to optimise the query flow.

4. List the heuristics that optimisers use to reduce optimisation cost.
- Begin with the initial query tree for the SQL query.
- Move SELECT operations down the tree.
- Apply more restrictive SELECT operations first (e.g. equalities before range queries).
- Replace Cartesian products followed by a selection with theta joins (e.g. *sigma(f) (R x S)* -> *R theta(f) S*).
- Move PROJECT operations down the query tree (add project operations as inputs to theta joins).
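
The effect of these heuristics can be sketched on toy data. The relations `R` and `S`, their attributes, and the row counts below are all made up for illustration; the point is only that pushing selections down and joining afterwards gives the same answer while looking at far fewer tuple pairs.

```python
from itertools import product

# Hypothetical relations as lists of dicts.
R = [{"id": i, "x": i % 5} for i in range(20)]
S = [{"rid": i % 20, "y": i % 4} for i in range(40)]

# Unoptimised: build the full Cartesian product, then apply one big selection.
cartesian = list(product(R, S))
naive = [
    (r, s) for r, s in cartesian
    if r["id"] == s["rid"] and r["x"] == 0 and s["y"] == 0
]

# Heuristic: move the selections down so each relation is filtered first...
r_small = [r for r in R if r["x"] == 0]
s_small = [s for s in S if s["y"] == 0]
# ...then join the filtered inputs on R.id = S.rid without storing the product.
joined = [(r, s) for r in r_small for s in s_small if r["id"] == s["rid"]]

pairs_naive = len(cartesian)
pairs_pushed = len(r_small) * len(s_small)
print(pairs_naive, "pairs examined without the heuristics")
print(pairs_pushed, "pairs examined with selections pushed down")
```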

5. Draw a near-optimal query tree for the following SQL query, and write a relational algebra expression for this tree.
```sql
SELECT sailors.name
FROM sailors, reservations
WHERE reservations.sID=sailors.ID
AND reservations.bID=100
AND sailors.rating=7;
```
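
One possible answer, obtained by applying the heuristics from question 4, is sketched below in LaTeX. It is not necessarily the only near-optimal form. Both selections sit directly above their relations, each input is projected down to the attributes still needed, the product plus join condition becomes a theta join, and the final projection is the root.

```latex
\pi_{\text{name}} \Big(
  \pi_{\text{ID},\ \text{name}} \big( \sigma_{\text{rating} = 7} (\text{sailors}) \big)
  \;\bowtie_{\ \text{sailors.ID} = \text{reservations.sID}}\;
  \pi_{\text{sID}} \big( \sigma_{\text{bID} = 100} (\text{reservations}) \big)
\Big)
```

Read as a tree, `sailors` and `reservations` are the leaves, each with a selection and then a projection above it, the theta join combines the two branches, and the projection on `name` is the root.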