vault backup: 2024-10-17 23:28:49

2024-10-17 23:28:49 +01:00
parent df5e8dd1bf
commit cf0301a749
11 changed files with 233 additions and 30 deletions
--- a/Mining/Week
+++ b/Mining/Week
@@ -0,0 +1,95 @@
+# Covering Algorithms
+
+- Each class in turn; find set of rules covering all examples
+- At each stage, rule is identified that covers some examples
+- Example is covered if satisfying conditions in the antecedent (LHS) of the rule
+- Consider dataset with 2 predicting numeric attributes, and two class values.
+- ![](Pasted%20image%2020241017130933.png)
+
+# PRISM: Simple Covering Algorithm
+
+- Generates rule by adding tests that maximise probability of desired class
+- Similar to situation in decision trees; problem of selecting attribute to split on
+- Each new test reduces rules coverage
+- Rule becomes more specific as tests are added
+- Search strategy is general-to-specific
+- ![](Pasted%20image%2020241017131053.png)
+
+## Selecting a Test
+
+Goal: Maximise probability of desired class
+- $t$ = total number of examples covered by rule
+- $p$ = number of positive examples of the class covered by rule
+- $t - p$ = number of errors made by rule
+- => Select test that maximises ratio $p/t$
+Stop Condition: $t-p=0$
+- $p = t$, $p/t=1$
+- Or, set of examples cannot be split further.
+
+## Example: Contact Lenses Dataset | Class = Hard
+
+### Selecting 1st Test of 1st Rule
+
+- Rule to Seek: If ? { then recommendation = Hard }
+![](Pasted%20image%2020241017131525.png)
+
+#### Modified Rule and Coverage
+
+- Rule with best test added: If astigmatism = Yes { then recommendation = Hard }
+![](Pasted%20image%2020241017131740.png)
+
+### Selecting 2nd Test of 1st Rule
+
+- If astigmatism = Yes and ? { then recommendation = Hard }
+
+![](Pasted%20image%2020241017131912.png)
+
+#### Modified Rule and its Coverage
+
+- Rule with best test added: If astigmatism = Yes and tear rate = Normal { then recommendation = Hard }
+![](Pasted%20image%2020241017132019.png)
+
+### Selecting 3rd Test of 1st Rule
+
+- If astigmatism = Yes and tear rate = Normal and ? { then recommendation = Hard }
+![](Pasted%20image%2020241017132059.png)
+- PRISM will use test with highest sample size, therefore using Myope.
+
+### 1st Rule for Class = Hard
+
+- Final Rule:
+If astigmatism = Yes
+	and tear rate = Normal
+	and spectacle prescription = Myope
+	then recommendation = Hard
+
+$p/t = 3/3 = 1$
+
+# Pseudo-code for PRISM
+
+For each class C
+	Init E to set of training examples
+	While E contains examples in class C
+		Create rule R with empty LHS predicting class C
+		Until p/t=1, do
+			For each attribute A not mentioned in R, and each value v
+				Consider adding condition A=v to LHS of R
+				Select A and v to maximise p/t
+					Break Ties by choosing largest sample
+			Add A=v to R
+		Remove examples covered by R from E
+
+# Separate and Conquer
+
+- PRISM with outer loop removed generates list of rules for one class
+- PRISM with outer loop removed is separate and conquer algorithm
+	- Identify useful rule
+	- Separate examples covered
+	- Conquer remaining examples
+
+# Rule Execution
+
+- Default Rule
+	- If no rules cover example, prediction is the majority class (most frequent in training data)
+- Conflict Resolution Strategy
+	- If more than one rule covers an example, select predicted class with highest recurrance in training data