# Assessment

## T1

- Exam (50%)

## T2

- Coursework (50%)

# Resources

Data Mining: Practical Machine Learning Tools and Techniques (Witten, Frank, Hall & Pal), 4th Edition, 2016

Scientific calculator

# Data Vs Information

- Too much data
- Valuable resource
- Raw data is less important; we need to develop techniques to extract the information in it
- Data: recorded facts
- Information: patterns underlying data

# Philosophy

## Cow Culling

- Cows described by 700 attributes
- Problem: selecting which cows to cull
- Data: historical records and farmer decisions
- Machine learning used to ascertain which factors the farmers took into account, rather than to automate the decision-making process

# Definition of Data Mining

- The extraction of:
	- Implicit,
	- Previously unknown,
	- Potentially useful information
- Programs that detect patterns and regularities are needed
- Strong patterns => good predictions
- Issues:
	- Most patterns are not interesting
	- Patterns may be inexact
	- Data may be garbled or missing

# Machine Learning Techniques

- Algorithms for acquiring structural descriptions from examples
- Structural descriptions represent patterns explicitly
- Predict the outcome in a new situation
- Understand and explain how a prediction was derived
- Methods originate from AI, statistics and research on databases

# Can Machines Learn?

- By the dictionary definition ("to obtain knowledge by study, experience or being taught"), arguably yes - but this ability is very difficult to measure
- Does learning imply intention?

# Terminology

- Concept - the thing to be learned
- Example / Instance - an individual, independent example of a concept
- Attributes / Features - the measured aspects of an example / instance
- Concept description (pattern, model, hypothesis) - the output of a data mining algorithm

# Famous Small Datasets

- Will be used in the module
- Unrealistically simple

## Weather Dataset - Nominal

Concept: conditions which are suitable for playing a game.
Reference: Quinlan, J.R. (1986). Induction of decision trees. Machine Learning, 1(1), 81-106.

### Attributes

3\*3\*2\*2 = 36 possible combinations of values.

Outlook

- sunny, overcast, rainy

Temperature
- hot, mild, cool

Humidity
- high, normal

Windy
- true, false

Play
- Class
- yes, no

### Dataset

![[weather_nominal.png]]

![[weather_nominal_rules.png]]

Rules are ordered: higher in the list = higher priority.

## Weather Dataset - Mixed

![[weather_mixed.png]]

![[weather_mixed_rules.png]]

## Contact Lenses Dataset

Describes conditions under which an optician might want to prescribe soft, hard or no contact lenses.
Grossly over-simplified.
Reference: Cendrowska, J. (1987). PRISM: An algorithm for inducing modular rules. International Journal of Man-Machine Studies, 27(4), 349-370.

### Attributes

3\*2\*2\*2 = 24 possibilities
Dataset is exhaustive, which is unusual.

Age

- young, pre-presbyopic, presbyopic

Spectacle Prescription
- myope (short-sighted), hypermetrope (long-sighted)

Astigmatism
- yes, no

Tear Production Rate
- reduced, normal

Recommended Lenses
- Class
- hard, soft, none

### Dataset

![[contact_lenses.png]]

## Iris Dataset

Used in many statistical experiments.
Contains four numeric attributes for three different types of iris.
Introduced in 1936 by Sir Ronald Fisher.

### Dataset

![[iris.png]]

# Styles of Learning

- Classification Learning: Predicting a **nominal** class
- Numeric Prediction (Regression): Predicting a **numeric** quantity
- Clustering: Grouping similar examples into clusters
- Association Learning: Detecting associations between attributes

## Classification Learning

- Nominal
- Supervised
- Provided with the actual value of the class
- Measure success on fresh data for which class labels are known (test data)

## Numeric Prediction (Regression)

- Numeric
- Supervised
- Success measured on test data

![[regression.png]]
Example uses a linear regression function to provide an estimated performance value based on attributes.
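
A minimal sketch of what such a function computes; the attribute values, weights and bias below are made up for illustration, not taken from the lecture's example.

```python
# Numeric prediction with a linear regression function:
# predicted value = bias + weighted sum of attribute values.

def predict(values, weights, bias):
    return bias + sum(w * v for w, v in zip(weights, values))

# e.g. two hypothetical attributes with hand-picked coefficients
print(predict([3.0, 7.5], weights=[2.0, -0.5], bias=10.0))  # 12.25
```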

## Clustering

- Finding similar groups
- Unsupervised
- Class of example is unknown
- Success measured **subjectively**

## Association Learning

- Applied if no class is specified and any kind of structure is interesting
- Differences to Classification Learning:
	- Predicts any attribute's value, not just the class
	- Can predict more than one attribute's value at a time
	- Far more association rules than classification rules

## Classification Vs Association Rules

Classification Rule:

- Predicts the value of a given attribute (the class of the example)
- ``If outlook = sunny and humidity = high, then play = no``

Association Rule:

- Predicts the value of an arbitrary attribute / combination

```
If temperature = cool then humidity = normal
If humidity = normal and windy = false then play = yes
If outlook = sunny and play = no then humidity = high
If windy = false and play = no then outlook = sunny and humidity = high
```
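
A quick sketch (not from the lecture) of how an association rule can be checked against the nominal weather examples: count how often the antecedent holds (coverage) and how often the consequent then also holds. The dataset literal is the standard nominal weather data.

```python
# Evaluate "if humidity = normal and windy = false then play = yes".
# Tuples: (outlook, temperature, humidity, windy, play)
data = [
    ("sunny", "hot", "high", "false", "no"),      ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),  ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),  ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),  ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),   ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),("rainy", "mild", "high", "true", "no"),
]

antecedent = lambda r: r[2] == "normal" and r[3] == "false"  # humidity, windy
consequent = lambda r: r[4] == "yes"                         # play

covered = [r for r in data if antecedent(r)]
correct = [r for r in covered if consequent(r)]
print(len(covered), len(correct))  # 4 4 -> the rule is exact on this data
```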

# Data Mining and Ethics

- Ethical issues arise in practical applications
- Data mining is often used to discriminate
- The ethical situation depends on the application
- Attributes may contain problematic information
- Does ownership of data bestow the right to use it in ways other than those stated when it was originally collected?
- Who is permitted to access the data?
- For what purpose was the data collected?
- What conclusions can sensibly be drawn?
- Caveats must be attached to results
- Purely statistical arguments are never sufficient

---
**AI & Data Mining/Week 1/Lecture 2 - Input and Output.md**

# Attributes

- Each example described by a fixed, pre-defined set of features (attributes)
- The relevant attributes may vary between examples
	- ex. transportation vehicles
	- no. of wheels not applicable to ships
	- no. of masts not applicable to cars
	- Possible solution: "irrelevant value" flag
- Attributes may be dependent on other attributes

# Taxonomy of Data Types

![[data_types.png]]

# Nominal Attributes

- Distinct symbols
- Serve as labels or names
- No relation implied among nominal values
- Only equality tests can be performed
- ex. outlook = sunny

# Sources of Missing Values

- Malfunctioning / misconfigured equipment
- Changes in design
- Collation of different datasets
- Data not collected for mining
	- Errors and omissions don't affect the original purpose of the data
	- ex. banks do not need to know age in banking datasets, so DOB may contain missing values
- A missing value may itself have significance
	- ex. a diagnosis may be deducible from which tests a doctor chose to run, not just their outcomes
	- Most DM algorithms assume this is not the case, hence "missing" may need to be coded as an additional nominal value

# Inaccurate Values

- Typographical errors in nominal attributes
- Typographical and measurement errors in numeric attributes
- Deliberate errors
	- ex. incorrect ZIP codes, unsanitised inputs
- Duplicate examples

# Weka and ARFF

## Weather Dataset in ARFF

![[weather_arff.png]]

### Getting to Know the Data

- First task: get to know the data
- Simple visualisations are useful:
	- Nominal: bar graphs
	- Numeric: histograms
- 2D and 3D plots show dependencies
- Need to consult experts
- Too much data? Take a sample.

# Concept Descriptions

- Output of a DM algorithm
- Many ways of representing it:
	- Decision trees
	- Rules
	- Linear regression functions

## Decision Trees

- Divide-and-conquer approach
- Trees are drawn upside down
	- Node at the top is the root
	- Edges are branches
	- Rectangles represent leaves
	- Leaves assign classifications
	- Nodes involve testing an attribute

### Decision Tree with Nominal Attributes

![[decision_tree_nominal.png]]

- Number of branches usually equals the number of values
- Attribute not tested more than once.
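
A minimal sketch of this tree for the weather data - an assumed nested-dict representation, not the lecture's code: internal nodes map an attribute to its branches, and leaves are plain class labels.

```python
# Weather decision tree: outlook at the root, humidity and windy below.
tree = {
    "outlook": {
        "sunny": {"humidity": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rainy": {"windy": {"true": "no", "false": "yes"}},
    }
}

def classify(node, instance):
    # Walk branches until a leaf (a plain class label) is reached.
    while isinstance(node, dict):
        attribute = next(iter(node))  # the attribute tested at this node
        node = node[attribute][instance[attribute]]
    return node

print(classify(tree, {"outlook": "sunny", "humidity": "high", "windy": "false"}))  # -> no
```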

### Decision Tree with Numeric Attributes

![[decision_tree_numeric.png]]

- Test whether the value is greater or less than a constant
- An attribute may be tested multiple times

### Decision Trees with Missing Values

- Not clear which branch should be taken when a node tests an attribute whose value is missing
- Does the absence of a value have significance?
	- Yes => treat "missing" as a separate value during training
	- No => treat it in a special way during testing
		- ex. assign the sample to the most popular branch

# Classification Rules

- Popular alternative to decision trees
- Antecedent (pre-condition) - a series of tests
	- Tests usually logically ANDed together
- Consequent (conclusion) - usually a class
- Individual rules are often logically ORed together

## If-Then Rules for Contact Lenses

![[contact_lenses_rules.png]]

# Nuggets

- Are rules independent "nuggets" of knowledge?
- Problem: this view ignores the process of executing rules
	- Ordered set (decision list)
		- Order is important for interpretation
	- Unordered set
		- Rules may overlap and lead to different conclusions for the same example
		- Needs conflict resolution

## Executing Rules

- What if $\geq$ 2 rules conflict?
	- Give no conclusion?
	- Go with the rule that covers the largest no. of training samples?
- What if no rule applies to a test example?
	- Give no conclusion?
	- Go with the class that is most frequent?

## Special Case: Boolean Classes

- Assumption: if an example does not belong to class "yes", it belongs to "no"
- Solution: only learn rules for class "yes", and use a default rule for "no"

![[boolean_rules.png]]
- Order is important, no conflicts.
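
A sketch of executing such an ordered rule set (decision list); the rules below are illustrative ones consistent with the weather data, not the lecture's exact set.

```python
# Rules for class "yes" first, default rule for "no" last.
rules = [
    (lambda x: x["outlook"] == "overcast", "yes"),
    (lambda x: x["humidity"] == "normal" and x["windy"] == "false", "yes"),
    (lambda x: True, "no"),  # default rule: everything else is "no"
]

def classify(instance):
    # The first matching rule wins, so order resolves any overlap.
    for antecedent, consequent in rules:
        if antecedent(instance):
            return consequent

print(classify({"outlook": "sunny", "humidity": "high", "windy": "true"}))  # -> no
```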

---
**AI & Data Mining/Week 3/Lecture 5 - Naive Bayes.md**

# Statistical Modelling

- Using statistical modelling for classification
- Bayesian techniques were adopted by the machine learning community in the 90s
- Opposite of 1R: uses all attributes
- Assumptions:
	- Attributes are equally important
	- Attributes are statistically independent
- The independence assumption is never correct
	- But the scheme works well in practice

# Weather Dataset

![[weather_nominal.png]]
![[weather_counts.png]]

# Bayes' Rule of Conditional Probability

- Probability of event H given evidence E:

# $Pr[H|E] = \frac{Pr[E|H]\times Pr[H]}{Pr[E]}$

- H may be ex. play = yes
- E may be the particular weather for a new day
- A priori probability of H: $Pr[H]$
	- Probability before seeing the evidence
- A posteriori probability of H: $Pr[H|E]$
	- Probability after seeing the evidence

## Naive Bayes for Classification

- Classification learning: what is the probability of the class given an instance?
	- Evidence $E$ = instance
	- Event $H$ = class for the given instance
- Naive assumption: the evidence splits into attributes that are independent

# $Pr[H|E] = \frac{Pr[E_1|H] \times Pr[E_2|H] \times \dots \times Pr[E_n|H] \times Pr[H]}{Pr[E]}$
- Denominator cancels out during conversion into probability by normalisation
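
A worked sketch of this computation for the weather data, assuming the standard counts (9 "yes" days, 5 "no" days) and a new day with outlook = sunny, temperature = cool, humidity = high, windy = true:

```python
# Per-class likelihoods: product of conditional probabilities times the prior.
likelihoods = {
    "yes": (2/9) * (3/9) * (3/9) * (3/9) * (9/14),
    "no":  (3/5) * (1/5) * (4/5) * (3/5) * (5/14),
}

# Normalisation makes Pr[E] cancel: divide by the sum over both classes.
total = sum(likelihoods.values())
posteriors = {c: v / total for c, v in likelihoods.items()}
print(posteriors)  # roughly {'yes': 0.205, 'no': 0.795}
```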

### Weather Data Example

![[naive_bayes_example.png]]

# Laplace Estimator

- Remedy to the zero-frequency problem: add 1 to the count for every attribute value-class combination (Laplace estimator)
- Result: probabilities will never be 0 (also stabilises probability estimates)
- This simple remedy is often used in practice when the zero-frequency problem arises.
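
A minimal sketch of the smoothed estimate (helper name is mine, not Weka's):

```python
def laplace_prob(value_count: int, class_count: int, n_values: int) -> float:
    # Add 1 to the count of each of the n_values possible attribute values.
    return (value_count + 1) / (class_count + n_values)

# outlook = overcast never occurs with play = no (0 of the 5 "no" days):
print(laplace_prob(0, 5, 3))  # 1/8 = 0.125 instead of 0
```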

## Example

![[laplace_example.png]]

# Modified Probability Estimates

- Consider attribute *outlook* for class *yes*

# $\frac{2+\frac{1}{3}\mu}{9+\mu}$
Sunny

# $\frac{4+\frac{1}{3}\mu}{9+\mu}$
Overcast

# $\frac{3+\frac{1}{3}\mu}{9+\mu}$
Rainy

- Each value is treated the same way
- Prior to seeing the training set, assume each value is equally likely, ex. the prior probability is $\frac{1}{3}$
- When we decided to add 1 to the counts, we implicitly set $\mu$ to 3
- However, there is no particular reason to add 1 to the count; we could increment by 0.1 instead, setting $\mu$ to 0.3
- A large value of $\mu$ says the prior probabilities are very important compared to the evidence in the training set
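
A sketch of the generalised estimate, where $\mu$ weights a prior probability against the observed counts (function name is mine):

```python
def m_estimate(value_count: float, class_count: float, prior: float, mu: float) -> float:
    return (value_count + mu * prior) / (class_count + mu)

# mu = 3 with the uniform prior 1/3 reproduces the Laplace estimator:
print(m_estimate(2, 9, 1/3, 3))    # (2 + 1) / (9 + 3) = 0.25  (sunny | yes)
print(m_estimate(2, 9, 1/3, 0.3))  # gentler smoothing with mu = 0.3
```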

## Fully Bayesian Formulation

# $\frac{2+\mu p_1}{9+\mu}$
Sunny

# $\frac{4+\mu p_2}{9+\mu}$
Overcast

# $\frac{3+\mu p_3}{9+\mu}$
Rainy

- Where $p_1 + p_2 + p_3 = 1$
- $p_1, p_2, p_3$ are the prior probabilities of outlook being sunny, overcast or rainy before seeing the training set. In practice, however, it is not clear how these prior probabilities should be assigned.
- With the uniform prior $p_i = \frac{1}{3}$, these reduce to the modified estimates above.

---
**AI & Data Mining/Week 3/Tutorial 3.md**

| Temperature | Skin   | Blood Pressure | Blocked Nose | Diagnosis |
| ----------- | ------ | -------------- | ------------ | --------- |
| Low         | Pale   | Normal         | True         | N         |
| Moderate    | Pale   | Normal         | True         | B         |
| High        | Normal | High           | False        | N         |
| Moderate    | Pale   | Normal         | False        | B         |
| High        | Red    | High           | False        | N         |
| High        | Red    | High           | True         | N         |
| Moderate    | Red    | High           | False        | B         |
| Low         | Normal | High           | False        | B         |
| Low         | Pale   | Normal         | False        | B         |
| Low         | Normal | Normal         | False        | B         |
| High        | Normal | Normal         | True         | B         |
| Moderate    | Normal | High           | True         | B         |
| Moderate    | Red    | Normal         | False        | B         |
| Low         | Normal | High           | True         | N         |

Counts per attribute value and class:

| | Temperature | | | Skin | | | Pressure | | | Blocked | | Diag | |
| -------- | ----------- | --- | ------ | ---- | --- | ------ | -------- | --- | ----- | ------- | --- | ---- | ---- |
| | N | B | | N | B | | N | B | | N | B | N | B |
| Low | 2 | 3 | Pale | 1 | 3 | Normal | 1 | 6 | True | 3 | 3 | 5 | 9 |
| Moderate | 0 | 5 | Normal | 2 | 4 | High | 4 | 3 | False | 2 | 6 | | |
| High | 3 | 1 | Red | 2 | 2 | | | | | | | | |

Relative frequencies:

| | Temperature | | | Skin | | | Pressure | | | Blocked | | Diag | |
| -------- | ----------- | --- | ------ | ---- | --- | ------ | -------- | --- | ----- | ------- | --- | ---- | ---- |
| | N | B | | N | B | | N | B | | N | B | N | B |
| Low | 2/5 | 3/9 | Pale | 1/5 | 3/9 | Normal | 1/5 | 6/9 | True | 3/5 | 3/9 | 5/14 | 9/14 |
| Moderate | 0/5 | 5/9 | Normal | 2/5 | 4/9 | High | 4/5 | 3/9 | False | 2/5 | 6/9 | | |
| High | 3/5 | 1/9 | Red | 2/5 | 2/9 | | | | | | | | |

# Problem 1

Evidence: Temperature = Low, Skin = Normal, Blood Pressure = High, Blocked Nose = True.

# $Pr[Diagnosis=N|E] = \frac{2}{5} \times \frac{2}{5} \times \frac{4}{5} \times \frac{3}{5} \times \frac{5}{14} = 0.027428571$
# $Pr[Diagnosis = B|E] = \frac{3}{9} \times \frac{4}{9} \times \frac{3}{9} \times \frac{3}{9} \times \frac{9}{14} = 0.010582011$

# $p(B) = \frac{0.0106}{0.0106+0.0274} = 0.2789$

# $p(N) = \frac{0.0274}{0.0106+0.0274} = 0.7211$

Diagnosis N is much more likely than Diagnosis B

# Problem 2

Evidence: Temperature = Low, Skin missing, Blood Pressure = Normal, Blocked Nose = True (the missing attribute is simply omitted from the products).

# $Pr[Diagnosis = N|E] = \frac{2}{5} \times \frac{1}{5} \times \frac{3}{5} \times \frac{5}{14} = 0.0171$
# $Pr[Diagnosis = B|E] = \frac{3}{9} \times \frac{6}{9} \times \frac{3}{9} \times \frac{9}{14} = 0.0476$
# $p(N) = \frac{0.0171}{0.0171+0.0476} = 0.2643$
# $p(B) = \frac{0.0476}{0.0476+0.0171} = 0.7357$

Diagnosis B is much more likely than Diagnosis N

# Problem 3

Evidence: Temperature = Moderate, Skin = Normal, Blood Pressure = High, Blocked Nose = True.

# $Pr[Diagnosis = N|E] = \frac{0}{5} \times \frac{2}{5} \times \frac{4}{5} \times \frac{3}{5} \times \frac{5}{14} = 0$
# $Pr[Diagnosis = B|E] = \frac{5}{9} \times \frac{4}{9} \times \frac{3}{9} \times \frac{3}{9} \times \frac{9}{14} = 0.018$
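
With a zero count for Moderate given N, $Pr[N|E]$ collapses to 0 and the classifier is certain of B - exactly the zero-frequency problem the Laplace estimator from Lecture 5 addresses. A minimal sketch reproducing Problem 1's arithmetic from the relative-frequency table:

```python
# Evidence: Low, Normal skin, High blood pressure, blocked nose.
pr_n = (2/5) * (2/5) * (4/5) * (3/5) * (5/14)  # ~0.02743
pr_b = (3/9) * (4/9) * (3/9) * (3/9) * (9/14)  # ~0.01058

total = pr_n + pr_b  # Pr[E] cancels: normalise over the two classes
print(round(pr_n / total, 4), round(pr_b / total, 4))  # 0.7216 0.2784
```

(The 0.7211 / 0.2789 above come from normalising the rounded values 0.0274 and 0.0106.)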

---
**AI & Data Mining/Week 3/Workshop 3.md**

# Weather Dataset

## Dataset

```
% This is a comment about the data set.
% This data describes examples of whether to play
% a game or not depending on weather conditions.
@relation letsPlay
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}

@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no
```

## Output

```
=== Run information ===

Scheme: weka.classifiers.bayes.NaiveBayes
Relation: letsPlay
Instances: 14
Attributes: 5
  outlook
  temperature
  humidity
  windy
  play
Test mode: evaluate on training data

=== Classifier model (full training set) ===

Naive Bayes Classifier

            Class
Attribute     yes     no
           (0.63) (0.38)
===============================
outlook
  sunny       3.0    4.0
  overcast    5.0    1.0
  rainy       4.0    3.0
  [total]    12.0    8.0

temperature
  mean       72.9697 74.8364
  std. dev.   5.2304   7.384
  weight sum       9       5
  precision   1.9091  1.9091

humidity
  mean       78.8395 86.1111
  std. dev.   9.8023  9.2424
  weight sum       9       5
  precision   3.4444  3.4444

windy
  TRUE        4.0    4.0
  FALSE       7.0    3.0
  [total]    11.0    7.0

Time taken to build model: 0 seconds

=== Evaluation on training set ===

Time taken to test model on training data: 0.01 seconds

=== Summary ===

Correctly Classified Instances   13    92.8571 %
Incorrectly Classified Instances  1     7.1429 %
Kappa statistic                   0.8372
Mean absolute error               0.2798
Root mean squared error           0.3315
Relative absolute error          60.2576 %
Root relative squared error      69.1352 %
Total Number of Instances        14
```

# Medical Dataset

## Dataset

```
@relation medical
@attribute Temperature {Low,Moderate,High}
@attribute Skin {Pale,Normal,Red}
@attribute BloodPressure {Normal,High}
@attribute BlockedNose {True,False}
@attribute Diagnosis {N,B}

@data
Low, Pale, Normal, True, N
Moderate, Pale, Normal, True, B
High, Normal, High, False, N
Moderate, Pale, Normal, False, B
High, Red, High, False, N
High, Red, High, True, N
Moderate, Red, High, False, B
Low, Normal, High, False, B
Low, Pale, Normal, False, B
Low, Normal, Normal, False, B
High, Normal, Normal, True, B
Moderate, Normal, High, True, B
Moderate, Red, Normal, False, B
Low, Normal, High, True, N
```

## Output

```
=== Run information ===

Scheme: weka.classifiers.bayes.NaiveBayes
Relation: diagnosis
Instances: 14
Attributes: 5
  Temperature
  Skin
  BloodPressure
  BlockedNose
  Diagnosis
Test mode: evaluate on training data

=== Classifier model (full training set) ===

Naive Bayes Classifier

            Class
Attribute       N      B
           (0.38) (0.63)
==============================
Temperature
  Low         3.0    4.0
  Moderate    1.0    6.0
  High        4.0    2.0
  [total]     8.0   12.0

Skin
  Pale        2.0    4.0
  Normal      3.0    5.0
  Red         3.0    3.0
  [total]     8.0   12.0

BloodPressure
  Normal      2.0    7.0
  High        5.0    4.0
  [total]     7.0   11.0

BlockedNose
  True        4.0    4.0
  False       3.0    7.0
  [total]     7.0   11.0

Time taken to build model: 0 seconds

=== Evaluation on training set ===

Time taken to test model on training data: 0 seconds

=== Summary ===

Correctly Classified Instances   12    85.7143 %
Incorrectly Classified Instances  2    14.2857 %
Kappa statistic                   0.6889
Mean absolute error               0.2635
Root mean squared error           0.3272
Relative absolute error          56.7565 %
Root relative squared error      68.2385 %
Total Number of Instances        14
```

# Using Test Data

## Test Data

```
@relation medical
@attribute Temperature {Low,Moderate,High}
@attribute Skin {Pale,Normal,Red}
@attribute BloodPressure {Normal,High}
@attribute BlockedNose {True,False}
@attribute Diagnosis {N,B}
@data
Low,Normal,High,True,N
Low,?,Normal,True,B
Moderate,Normal,High,True,B
```

## Output

```
=== Run information ===

Scheme: weka.classifiers.bayes.NaiveBayes
Relation: medical
Instances: 14
Attributes: 5
  Temperature
  Skin
  BloodPressure
  BlockedNose
  Diagnosis
Test mode: user supplied test set: size unknown (reading incrementally)

=== Classifier model (full training set) ===

Naive Bayes Classifier

            Class
Attribute       N      B
           (0.38) (0.63)
==============================
Temperature
  Low         3.0    4.0
  Moderate    1.0    6.0
  High        4.0    2.0
  [total]     8.0   12.0

Skin
  Pale        2.0    4.0
  Normal      3.0    5.0
  Red         3.0    3.0
  [total]     8.0   12.0

BloodPressure
  Normal      2.0    7.0
  High        5.0    4.0
  [total]     7.0   11.0

BlockedNose
  True        4.0    4.0
  False       3.0    7.0
  [total]     7.0   11.0

Time taken to build model: 0 seconds

=== Predictions on test set ===

inst#  actual  predicted  error  prediction
    1     1:N        1:N         0.652
    2     2:B        2:B         0.677
    3     2:B        2:B         0.706

=== Evaluation on test set ===

Time taken to test model on supplied test set: 0 seconds

=== Summary ===

Correctly Classified Instances    3   100      %
Incorrectly Classified Instances  0     0      %
Kappa statistic                   1
Mean absolute error               0.3215
Root mean squared error           0.3223
Relative absolute error          70.1487 %
Root relative squared error      68.0965 %
Total Number of Instances         3
```

---
**AI & Data Mining/Week 4/Lecture 7 - Nearest Neighbor.md**

- Instance based
- Solution to a new problem is the solution to the closest example
- Must be able to measure the distance between a pair of examples
- Normally Euclidean distance

# Normalisation of Numeric Attributes

- Attributes are measured on different scales
- Attributes on larger scales would otherwise have a higher impact
- Must normalise (transform to the scale [0, 1])

# $a_i = \frac{v_i - minv_i}{maxv_i - minv_i}$

Where:
- $a_i$ is the normalised value for attribute $i$
- $v_i$ is the current value for attribute $i$
- $maxv_i$ is the largest value of attribute $i$
- $minv_i$ is smallest value of attribute $i$
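
A minimal sketch of this formula, checked against the two examples that follow:

```python
def normalise(v: float, v_min: float, v_max: float) -> float:
    # Maps v_min -> 0 and v_max -> 1.
    return (v - v_min) / (v_max - v_min)

print(normalise(80.5, 65, 96))  # humidity example: 15.5/31 = 0.5
print(normalise(3, 2, 5))       # doors example: 1/3
```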

## Example

# $maxv_{humidity} = 96$
# $minv_{humidity} = 65$
# $v_{humidity} = 80.5$

# $a_{humidity} = \frac{80.5-65}{96-65} = \frac{15.5}{31} = 0.5$

## Example (Transport Dataset)

# $maxv_{doors} = 5$
# $minv_{doors} = 2$
# $v_{doors} = 3$
# $a_{doors} = \frac{3-2}{5-2} = \frac{1}{3}$

# Nearest Neighbor Applied (Transport Dataset)

- Last row is the new vehicle to be classified
- N denotes normalised
- Right-most column shows the euclidean distance between each vehicle and the new vehicle
- The new vehicle is closest to the 1st example, a taxi, so NN predicts taxi

![[transport_nn.png]]
# $minv_{doors} = 2$
# $maxv_{doors} = 5$
# $minv_{seats} = 7$
# $maxv_{seats} = 65$

# Missing Values

## Missing Nominal Values

- Assume a missing value is maximally different from any other value
- Distance is:
	- 0 if both values are identical and not missing
	- 1 otherwise

## Missing Numeric Values

- 1 if both are missing
- Assume maximum distance if one is missing. The largest of:
	- (normalised) size of known value or
- 1 - (normalised) size of known value
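
A sketch of these conventions for a single attribute, assuming numeric values are already normalised to [0, 1] and using None to mark a missing value (helper names are mine):

```python
def nominal_distance(a, b):
    if a is None or b is None:
        return 1                      # missing: assume maximally different
    return 0 if a == b else 1

def numeric_distance(a, b):
    if a is None and b is None:
        return 1                      # both missing: maximum distance
    if a is None or b is None:
        known = b if a is None else a
        return max(known, 1 - known)  # worst case against the known value
    return abs(a - b)

print(numeric_distance(0.36, None))   # humidity example below: 0.64
```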

## Example (Weather Data)

- Humidity of one example = 76
- Normalised = 0.36
- The other example's value is missing
- Max distance = 1 - 0.36 = 0.64

## Example (Transport Data)

- Number of seats of one example = 16
- Normalised = 9/58
- The other example's value is missing
- Max distance = 1 - 9/58 = 49/58

## Normalised Transport Data with Missing Values

- Last row is to be classified
- N denotes normalised
- Right-most column shows the euclidean distances

![[transport_nn_missing.png]]

# Definitions of Proximity

## Euclidean Distance

# $\sqrt{(a_1-a_1')^2 + (a_2-a_2')^2 + \dots + (a_n-a_n')^2}$

Where $a$ and $a'$ are two examples with $n$ attributes, and $a_i$ is the value of attribute $i$ for example $a$.

## Manhattan Distance

# $|a_1-a_1'|+|a_2-a_2'|+\dots+|a_n-a_n'|$

Vertical bars denote absolute value: negative differences become positive.

Another distance measure could be the cube root of the sum of cubes.
The higher the power, the greater the influence of large differences.
Euclidean distance is generally a good compromise
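
A minimal sketch of the two measures for already-normalised numeric vectors:

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

a, b = [0.0, 0.5, 1.0], [1.0, 0.5, 0.0]
print(euclidean(a, b))  # sqrt(2) ~ 1.414
print(manhattan(a, b))  # 2.0
```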

# Problems with Nearest Neighbor

- Slow, since every example must be compared with the new one
- Assumes all attributes are equally important
- Remedy: only use important attributes to compute the distance
- Remedy: weight attributes according to importance
- Does not detect noise
- Use k-NN, get k closest examples and take majority vote on solutions
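
A sketch of k-NN with a majority vote over the k closest examples; the tiny training set below is made up for illustration, and vectors are assumed already normalised.

```python
from collections import Counter

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def knn_predict(train, query, k=3):
    # train is a list of (vector, class); vote among the k closest examples.
    nearest = sorted(train, key=lambda ex: euclidean(ex[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [([0.1, 0.2], "yes"), ([0.15, 0.25], "yes"),
         ([0.9, 0.8], "no"), ([0.95, 0.9], "no")]
print(knn_predict(train, [0.2, 0.2], k=3))  # -> yes (2 of 3 neighbours)
```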

![[knn.png]]

---
**AI & Data Mining/Week 4/Tutorial 4 - Nearest Neighbor.md**

![[weather_mixed.png]]

## Normalisation Equation
# $a_i = \frac{v_i - minv_i}{maxv_i - minv_i}$

## Euclidean Distance Equation
# $\sqrt{(a_1-a_1')^2 + (a_2-a_2')^2 + \dots + (a_n-a_n')^2}$

# $maxv_{temp} = 85$
# $minv_{temp} = 64$

# $a_{temp} = \frac{v_{temp} - 64}{21}$

# $maxv_{humidity} = 96$
# $minv_{humidity} = 65$

# $a_{humidity} = \frac{v_{humidity} - 65}{31}$

| outlook  | temp | NT   | humidity | NH   | windy | play | Euclidean Distance to a' Calculation               | Euclidean Distance |
| -------- | ---- | ---- | -------- | ---- | ----- | ---- | -------------------------------------------------- | ------------------ |
| sunny    | 85   | 1    | 85       | 0.65 | F     | N    | $\sqrt{(85-72)^2 + (85-76)^2 + (2-2)^2 + (0-1)^2}$ | 15.84              |
| sunny    | 80   | 0.76 | 90       | 0.81 | T     | N    | $\sqrt{(80-72)^2 + (90-76)^2 + (2-2)^2 + (1-1)^2}$ | 16.12              |
| overcast | 83   | 0.90 | 86       | 0.68 | F     | Y    | $\sqrt{(83-72)^2 + (86-76)^2 + (1-2)^2 + (0-1)^2}$ | 14.93              |
| rainy    | 70   | 0.29 | 96       | 1    | F     | Y    | $\sqrt{(70-72)^2 + (96-76)^2 + (0-2)^2 + (0-1)^2}$ | 20.22              |
| rainy    | 68   | 0.19 | 80       | 0.48 | F     | Y    | $\sqrt{(68-72)^2 + (80-76)^2 + (0-2)^2 + (0-1)^2}$ | 6.08               |
| rainy    | 65   | 0.05 | 70       | 0.16 | T     | N    | $\sqrt{(65-72)^2 + (70-76)^2 + (0-2)^2 + (1-1)^2}$ | 9.43               |
| overcast | 64   | 0    | 65       | 0    | T     | Y    | $\sqrt{(64-72)^2 + (65-76)^2 + (1-2)^2 + (1-1)^2}$ | 13.64              |
| sunny    | 72   | 0.38 | 95       | 0.97 | F     | N    | $\sqrt{(72-72)^2 + (95-76)^2 + (2-2)^2 + (0-1)^2}$ | 19.03              |
| sunny    | 69   | 0.24 | 70       | 0.16 | F     | Y    | $\sqrt{(69-72)^2 + (70-76)^2 + (2-2)^2 + (0-1)^2}$ | 6.78               |
| rainy    | 75   | 0.52 | 80       | 0.48 | F     | Y    | $\sqrt{(75-72)^2 + (80-76)^2 + (0-2)^2 + (0-1)^2}$ | 5.48               |
| sunny    | 75   | 0.52 | 70       | 0.16 | T     | Y    | $\sqrt{(75-72)^2 + (70-76)^2 + (2-2)^2 + (1-1)^2}$ | 6.71               |
| overcast | 72   | 0.38 | 90       | 0.81 | T     | Y    | $\sqrt{(72-72)^2 + (90-76)^2 + (1-2)^2 + (1-1)^2}$ | 14.04              |
| overcast | 81   | 0.81 | 75       | 0.32 | F     | Y    | $\sqrt{(81-72)^2 + (75-76)^2 + (1-2)^2 + (0-1)^2}$ | 9.17               |
| rainy    | 71   | 0.33 | 91       | 0.84 | T     | N    | $\sqrt{(71-72)^2 + (91-76)^2 + (0-2)^2 + (1-1)^2}$ | 15.17              |
| sunny    | 72   | 0.38 | 76       | 0.35 | T     | ??   |                                                    |                    |

The smallest distance is 5.48 (rainy, 75, 80, F, play = yes), so 1-NN predicts play = yes for the last row. (The distances here are computed on the raw values, with outlook coded sunny = 2, overcast = 1, rainy = 0 and windy T = 1, F = 0.)

---
**AI & Data Mining/Week 4/Workshop 4 - Nearest Neighbor.md**

```
=== Run information ===

Scheme: weka.classifiers.lazy.IBk -K 3 -W 0 -A "weka.core.neighboursearch.LinearNNSearch -A \"weka.core.EuclideanDistance -R first-last\""
Relation: letsPlay
Instances: 14
Attributes: 5
  outlook
  temperature
  humidity
  windy
  play
Test mode: user supplied test set: size unknown (reading incrementally)

=== Classifier model (full training set) ===

IB1 instance-based classifier
using 3 nearest neighbour(s) for classification

Time taken to build model: 0 seconds

=== Predictions on test set ===

inst#  actual  predicted  error  prediction
    1   1:yes      1:yes         0.659
    2   1:yes      1:yes         0.659

=== Evaluation on test set ===

Time taken to test model on supplied test set: 0 seconds

=== Summary ===

Correctly Classified Instances    2   100      %
Incorrectly Classified Instances  0     0      %
Kappa statistic                   1
Mean absolute error               0.3409
Root mean squared error           0.3409
Relative absolute error          90.9091 %
Root relative squared error      90.9091 %
Total Number of Instances         2
```