vault backup: 2024-01-25 14:02:02

2024-01-25 14:02:03 +00:00
parent c09a4d605d
commit 14b75ba16c
32 changed files with 19979 additions and 26 deletions
--- a/Systems/Week
+++ b/Systems/Week
@@ -0,0 +1,78 @@
+# Query Optimisation
+
+The dominant cost in query processing is secondary storage access: the fewer blocks a query has to access, the faster it runs.
+Many tuples fit into a single block, so updating one tuple means finding the block, reading it, finding the tuple within it, editing the tuple, and writing the block back. This is inefficient when many different queries access different blocks.
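To put rough numbers on the block-access cost, here is a minimal sketch. The file size and blocking factor are hypothetical, and the formulas are the usual textbook approximations (about half the blocks for a linear scan of an unordered file, about log2 of the block count for a binary search on an ordered file), not measurements from any particular DBMS.

```python
import math


def linear_scan_blocks(num_blocks: int) -> float:
    """Average blocks read to find one record by scanning an unordered file."""
    return num_blocks / 2


def binary_search_blocks(num_blocks: int) -> int:
    """Approximate blocks read by binary search on a file ordered on the search field."""
    return math.ceil(math.log2(num_blocks))


# Hypothetical file: 30,000 records at 10 records per block.
num_records = 30_000
records_per_block = 10
num_blocks = math.ceil(num_records / records_per_block)

print(num_blocks)                        # 3000
print(linear_scan_blocks(num_blocks))    # 1500.0
print(binary_search_blocks(num_blocks))  # 12
```

Under these assumptions, ordering the file on the search field cuts the expected cost from about 1500 block reads to about 12.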
+
+# Types of Index
+
+## Primary Index
+
+The data file is sequentially ordered on an ordering key field, and the index is built on that same field. The ordering key is guaranteed to have a unique value for each tuple.
+
+## Clustering Index
+
+The data file is sequentially ordered on a non-key field, and the index is built on that same non-key field. More than one tuple can correspond to a single value of the indexing field.
+
+## Primary / Clustered Indices
+
+These determine the order in which the data is stored in the file.
+
+## Secondary Indices
+
+These provide a separate lookup table into the file, without affecting the order in which the data is stored.
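The difference between the index types can be sketched in a few lines. This is an in-memory illustration only: the block size, the `(id, colour)` records, and the function names are all hypothetical, and the primary index is assumed to be sparse (one entry per block), which is a common textbook arrangement rather than something stated in these notes.

```python
from bisect import bisect_right

BLOCK_SIZE = 3  # hypothetical number of records per block

# Data file ordered on the key field `id`; `colour` is a non-ordering field.
records = [(1, "red"), (4, "blue"), (7, "red"), (9, "green"),
           (12, "blue"), (15, "red"), (20, "green"), (23, "blue")]
blocks = [records[i:i + BLOCK_SIZE] for i in range(0, len(records), BLOCK_SIZE)]

# Primary index (assumed sparse): the first key of each block.
primary_index = [block[0][0] for block in blocks]


def primary_lookup(key):
    """Search the index for the candidate block, then read only that block."""
    block_no = bisect_right(primary_index, key) - 1
    if block_no < 0:
        return None
    for record in blocks[block_no]:
        if record[0] == key:
            return record
    return None


# Secondary index (dense): each colour maps to every (block, slot) holding it.
secondary_index = {}
for block_no, block in enumerate(blocks):
    for slot, (_, colour) in enumerate(block):
        secondary_index.setdefault(colour, []).append((block_no, slot))


def secondary_lookup(colour):
    """Follow the lookup table to each matching record, wherever it is stored."""
    return [blocks[b][s] for b, s in secondary_index.get(colour, [])]


print(primary_lookup(12))        # (12, 'blue')
print(secondary_lookup("blue"))  # [(4, 'blue'), (12, 'blue'), (23, 'blue')]
```

The primary lookup touches one block because the file order matches the index order; the secondary lookup here touches three different blocks for the same colour, since the file is not ordered on that field.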
+
+# Index Restrictions
+
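+- A table can have one primary index OR one clustering index, not both, because the file can only be physically ordered on one field.
+	- The most frequently looked-up field is often the best choice
+	- Some DBMSs assume the primary key (PK) is the primary index, as it is usually used to refer to rows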
+
+# Exercise
+
+1. What is a Query Tree?
+A query tree is a tree representation of a relational algebra expression, used to model a database query logically. Each leaf node represents a relation. Each internal node is a relational algebra operation, e.g. selection, projection, or product.
+
+2. Write a relational algebra expression for the following query
+
+```sql
+SELECT lecName, schedule 
+FROM lecturer, module, enrol, student 
+WHERE lastName='Burns' 
+AND firstName='Edward' 
+AND module.moduleNumber=enrol.moduleNumber 
+AND lecturer.lecID=module.lectID 
+AND student.stuID=enrol.stuID;
+```
+![](Pasted%20image%2020231121134251.png)
+
+pi lecName, schedule 
+( sigma lastName = 'Burns'
+	( sigma firstName = 'Edward' 
+		( sigma student.stuID = enrol.stuID 
+			( sigma module.moduleNumber = enrol.moduleNumber 
+				( ( sigma lecturer.lecID = module.lectID ( lecturer x module ) ) x enrol ) x student
+			)
+		)
+	)
+)
+
+Draw a Query Tree for this SQL query:
+
+3. Why can we not use SQL for query optimisation?
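+SQL is declarative: it states what result is wanted, not the order in which the operations are carried out. Relational algebra is procedural, so an expression fixes an order of operations that is easy to visualise as a tree, and equivalent expressions can be compared and rearranged to find a cheaper one.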
+
+4. List the heuristics that optimisers use to reduce optimisation cost.
+- Begin with the initial query tree for the SQL query
+- Move SELECT operations down the tree
+- Apply more restrictive SELECT operations first (e.g. equalities before range queries)
+- Replace Cartesian products followed by selection with theta joins (e.g. *sigma(f) ( R x S )* -> *R theta(f) S*)
+- Move PROJECT operations down the query tree (add PROJECT operations as inputs to theta joins)
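The effect of moving a SELECT down the tree can be seen by counting intermediate tuples. The sketch below uses two made-up relations `R` and `S` and hand-written `select` and `cartesian` helpers (all hypothetical names); it compares applying both selections after the product with applying the selection on `R` first.

```python
from itertools import product

# Hypothetical relations, each a list of dict-shaped tuples.
R = [{"a": i, "x": i % 10} for i in range(100)]
S = [{"b": i % 100, "y": i} for i in range(200)]


def select(rel, pred):
    """sigma: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]


def cartesian(r, s):
    """x: pair every tuple of r with every tuple of s."""
    return [{**t, **u} for t, u in product(r, s)]


# Unoptimised: sigma x=3 ( sigma a=b ( R x S ) )
full_product = cartesian(R, S)
naive = select(select(full_product, lambda t: t["a"] == t["b"]),
               lambda t: t["x"] == 3)

# SELECT moved down: sigma a=b ( ( sigma x=3 ( R ) ) x S )
r_small = select(R, lambda t: t["x"] == 3)
pushed_product = cartesian(r_small, S)
pushed = select(pushed_product, lambda t: t["a"] == t["b"])

print(len(full_product), len(pushed_product))  # 20000 2000
print(naive == pushed)                         # True
```

Both plans return the same 20 tuples, but the second builds an intermediate result a tenth of the size. Replacing the remaining product and selection with a theta join would avoid materialising the product at all.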
+
+5. Draw a near optimal query tree for the following SQL query, and write a relational algebra expression for this tree.
+```sql
+SELECT sailors.name 
+FROM sailors, reservations 
+WHERE reservations.sID=sailors.ID 
+AND reservations.bID=100 
+AND sailors.rating=7;
+```
+	
+![](Pasted%20image%2020240123164006.png)
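
One near-optimal expression consistent with the heuristics in question 4 (a sketch, not necessarily identical to the tree in the image) is:

pi name ( ( sigma rating = 7 ( sailors ) ) theta(ID = sID) ( sigma bID = 100 ( reservations ) ) )

The check below runs the original SQL in an in-memory SQLite database and compares it with the same plan evaluated by hand. The table definitions and sample rows are hypothetical; only the column names used in the query come from the exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sailors (ID INTEGER PRIMARY KEY, name TEXT, rating INTEGER);
CREATE TABLE reservations (sID INTEGER, bID INTEGER);
INSERT INTO sailors VALUES (1, 'Ann', 7), (2, 'Bob', 7), (3, 'Cat', 9);
INSERT INTO reservations VALUES (1, 100), (2, 101), (3, 100);
""")

# The SQL query from the exercise, unchanged apart from layout.
original = conn.execute("""
    SELECT sailors.name
    FROM sailors, reservations
    WHERE reservations.sID = sailors.ID
      AND reservations.bID = 100
      AND sailors.rating = 7
""").fetchall()

# The relational algebra plan, evaluated step by step.
sailors = conn.execute("SELECT ID, name, rating FROM sailors").fetchall()
reservations = conn.execute("SELECT sID, bID FROM reservations").fetchall()

s7 = [(sid, name) for sid, name, rating in sailors if rating == 7]  # sigma rating = 7
r100 = [sid for sid, bid in reservations if bid == 100]             # sigma bID = 100
plan = [(name,) for sid, name in s7 for rsid in r100 if sid == rsid]  # join, then pi name

print(original)  # [('Ann',)]
print(plan)      # [('Ann',)]
```

Both selections are applied before the join, so the join only sees sailors rated 7 and reservations for boat 100.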