Lecture 9 - PRISM

Covering Algorithms

  • Consider each class in turn; find a set of rules that covers all examples of that class
  • At each stage, a rule is identified that covers some of the examples
  • An example is covered if it satisfies the conditions in the antecedent (LHS) of the rule
  • Consider a dataset with two numeric predicting attributes and two class values.

PRISM: Simple Covering Algorithm

  • Generates a rule by adding tests that maximise the probability of the desired class
  • Similar to the situation in decision trees: the problem of selecting an attribute to split on
  • Each new test reduces the rule's coverage
  • The rule becomes more specific as tests are added
  • The search strategy is general-to-specific

Selecting a Test

Goal: Maximise probability of desired class

  • t = total number of examples covered by the rule
  • p = number of positive examples of the class covered by the rule
  • t - p = number of errors made by the rule
  • => Select the test that maximises the ratio p/t (see the sketch after this list)
  • Stop condition: t - p = 0, i.e. p = t, so p/t = 1
  • Or, the set of examples cannot be split further.
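
To make the ratio concrete, here is a minimal Python sketch of scoring one candidate test A = v. The four-row dataset and the attribute/class names (astigmatism, tear_rate, lens) are made up for illustration; they are not the lecture's full contact lenses data.

```python
# Hypothetical examples: two nominal attributes plus a class label "lens".
examples = [
    {"astigmatism": "yes", "tear_rate": "normal",  "lens": "hard"},
    {"astigmatism": "yes", "tear_rate": "reduced", "lens": "none"},
    {"astigmatism": "no",  "tear_rate": "normal",  "lens": "none"},
    {"astigmatism": "no",  "tear_rate": "reduced", "lens": "none"},
]

def score(examples, attribute, value, class_attr, target_class):
    """Return (p, t) for the candidate test `attribute = value`."""
    covered = [ex for ex in examples if ex[attribute] == value]
    t = len(covered)                                            # examples covered
    p = sum(ex[class_attr] == target_class for ex in covered)   # positives covered
    return p, t

p, t = score(examples, "astigmatism", "yes", "lens", "hard")
print(f"p/t = {p}/{t} = {p / t:.2f}")   # 1/2 = 0.50 on this toy data
```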

Example: Contact Lenses Dataset | Class = Hard

Selecting 1st Test of 1st Rule

  • Rule to seek: If ? then recommendation = Hard

Modified Rule and Coverage

  • Rule with best test added: If astigmatism = Yes then recommendation = Hard

Selecting 2nd Test of 1st Rule

  • If astigmatism = Yes and ? then recommendation = Hard

Modified Rule and its Coverage

  • Rule with best test added: If astigmatism = Yes and tear rate = Normal then recommendation = Hard

Selecting 3rd Test of 1st Rule

  • If astigmatism = Yes and tear rate = Normal and ? then recommendation = Hard
  • When more than one test gives the same p/t, PRISM uses the one with the largest coverage (sample size); here that is spectacle prescription = Myope.

1st Rule for Class = Hard

  • Final Rule: If astigmatism = Yes and tear rate = Normal and spectacle prescription = Myope then recommendation = Hard

p/t = 3/3 = 1

Pseudo-code for PRISM

    For each class C
        Initialise E to the set of training examples
        While E contains examples in class C
            Create a rule R with an empty LHS that predicts class C
            Until p/t = 1, do
                For each attribute A not mentioned in R, and each value v
                    Consider adding the condition A = v to the LHS of R
                Select A and v to maximise p/t
                    (break ties by choosing the test with the largest coverage)
                Add A = v to R
            Remove the examples covered by R from E
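
Below is a minimal Python sketch of this pseudo-code (my own illustration, not the lecture's implementation). It assumes examples are dicts mapping attribute names to nominal values, with the class label stored under `class_attr`, and it breaks ties in p/t by the largest coverage, as above.

```python
def prism(examples, class_attr):
    """Learn a list of (conditions, class) rules, where conditions is a
    dict of attribute -> value tests that must all hold."""
    rules = []
    for c in sorted({ex[class_attr] for ex in examples}):  # each class in turn
        E = list(examples)                     # init E to the training examples
        while any(ex[class_attr] == c for ex in E):
            conditions = {}                    # rule R with an empty LHS
            covered = E
            while True:
                p = sum(ex[class_attr] == c for ex in covered)
                if p == len(covered):          # p/t = 1: the rule is perfect
                    break
                best, best_key = None, (-1.0, -1)
                for A in covered[0]:
                    if A == class_attr or A in conditions:
                        continue
                    for v in {ex[A] for ex in covered}:
                        subset = [ex for ex in covered if ex[A] == v]
                        pv = sum(ex[class_attr] == c for ex in subset)
                        # maximise p/t; break ties by the largest coverage
                        key = (pv / len(subset), len(subset))
                        if key > best_key:
                            best, best_key = (A, v), key
                if best is None:               # no further split is possible
                    break
                A, v = best
                conditions[A] = v              # add A = v to R
                covered = [ex for ex in covered if ex[A] == v]
            rules.append((conditions, c))
            # separate: remove the examples covered by R from E
            E = [ex for ex in E
                 if not all(ex.get(A) == v for A, v in conditions.items())]
    return rules

# Using the toy `examples` list from the earlier sketch:
#   for conds, c in prism(examples, "lens"):
#       print("If", conds, "then recommendation =", c)
```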

Separate and Conquer

  • With the outer loop removed, PRISM generates a list of rules for a single class
  • In that form, PRISM is a separate-and-conquer algorithm:
    • Identify a useful rule
    • Separate out the examples it covers
    • Conquer the remaining examples

Rule Execution

  • Default Rule
    • If no rule covers an example, the prediction is the majority class (the most frequent class in the training data)
  • Conflict Resolution Strategy
    • If more than one rule covers an example, select the predicted class that occurs most frequently in the training data (see the sketch below)
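
A minimal sketch of this execution strategy, assuming rules in the (conditions, class) form produced by the `prism` sketch above; both the default class and the conflict-resolution counts come from the training data.

```python
from collections import Counter

def classify(example, rules, training, class_attr):
    """Apply a rule list with a majority-class default rule and
    training-frequency conflict resolution."""
    class_freq = Counter(ex[class_attr] for ex in training)
    # collect the classes predicted by every rule that covers the example
    fired = [c for conds, c in rules
             if all(example.get(A) == v for A, v in conds.items())]
    if not fired:
        # default rule: the majority class in the training data
        return class_freq.most_common(1)[0][0]
    # conflict resolution: among the classes of the rules that fired,
    # choose the one that is most frequent in the training data
    return max(set(fired), key=lambda c: class_freq[c])

# Hypothetical usage with the rules and toy data from the earlier sketches:
#   rules = prism(examples, "lens")
#   print(classify({"astigmatism": "yes", "tear_rate": "normal"},
#                  rules, examples, "lens"))
```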