vault backup: 2025-03-16 18:59:42

boris
2025-03-16 18:59:42 +00:00
parent 6befcc90d4
commit ae837183f1
188 changed files with 17794 additions and 409 deletions


@@ -12,6 +12,7 @@
# $a_i = \frac{v_i - minv_i}{maxv_i - minv_i}$
Where:
- $a_i$ is normalised value for attribute $i$
- $v_i$ is the current value for attribute $i$
- $maxv_i$ is largest value of attribute $i$
@@ -19,8 +20,10 @@ Where:
## Example
# $maxv_{humidity} = 96$
# $maxv_{humidity} = 96$
# $minv_{humidity} = 65$
# $v_{humidity} = 80.5$
# $a_i = \frac{80.5-65}{96-65} = \frac{15.5}{31} = 0.5$
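The normalisation equation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `normalise` and the handling of a constant attribute (where max equals min) are assumptions made for the example.

```python
def normalise(value, min_value, max_value):
    """Min-max normalise a value into [0, 1] using the attribute's observed range."""
    if max_value == min_value:
        # Assumption: an attribute with a single observed value is mapped to 0.
        return 0.0
    return (value - min_value) / (max_value - min_value)


# Humidity example from the notes: min 65, max 96, current value 80.5
print(normalise(80.5, 65, 96))  # 0.5
```

The same function covers any numeric attribute, since only the minimum and maximum change between attributes.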
@@ -28,8 +31,11 @@ Where:
## Example (Transport Dataset)
# $maxv_{doors} = 5$
# $minv_{doors} = 2$
# $v_{doors} = 3$
# $a_i = \frac{3-2}{5-2} = \frac{1}{3}$
# Nearest Neighbor Applied (Transport Dataset)
@@ -39,14 +45,18 @@ Where:
- The rightmost column shows the Euclidean distance between each vehicle and the new vehicle
- The new vehicle is closest to the 1st example, a taxi, so NN predicts taxi
![](Pasted%20image%2020241010133818.png)
# $vmin_{doors} = 2$
# $vmax_{doors} = 5$
# $vmin_{seats} = 7$
# $vmax_{seats} = 65$
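A rough sketch of the nearest-neighbour prediction with these ranges follows. The attribute ranges come from the notes, but the training vehicles and the new vehicle are hypothetical stand-ins, because the real table only exists as an image; the names `RANGES`, `TRAINING` and `nearest_neighbour` are likewise assumptions.

```python
import math

# Attribute ranges from the notes: (min, max) per attribute.
RANGES = {"doors": (2, 5), "seats": (7, 65)}

# Hypothetical training examples, NOT the values from the original table.
TRAINING = [
    ({"doors": 4, "seats": 7}, "taxi"),
    ({"doors": 3, "seats": 16}, "minibus"),
    ({"doors": 2, "seats": 65}, "bus"),
]


def normalise(example):
    """Min-max normalise every attribute of an example into [0, 1]."""
    return [(example[name] - lo) / (hi - lo) for name, (lo, hi) in RANGES.items()]


def euclidean(a, b):
    """Euclidean distance between two equal-length lists of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbour(new_example):
    """Return the class of the single closest training example."""
    target = normalise(new_example)
    _, label = min(TRAINING, key=lambda pair: euclidean(normalise(pair[0]), target))
    return label


print(nearest_neighbour({"doors": 4, "seats": 8}))  # taxi
```

Normalising first matters here: without it, the seats attribute (range 58) would dominate the doors attribute (range 3) in the distance.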
# Missing Values
## Missing Nominal Values
## Missing Nominal Values
- Assume missing feature is maximally different from any other value
- Distance is:
@@ -72,7 +82,7 @@ Where:
- Number of seats of one example = 16
- Normalised = 9/58
- One missing
- 1 - 9/58 = 49/58
- 1 - 9/58 = 49/58
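The seats calculation above can be reproduced with a small helper. This sketch assumes the usual convention that a missing numeric value is treated as being as far away as possible from the known one, so the difference is the larger of $a$ and $1 - a$; the function name `numeric_difference`, the use of `None` for a missing value, and the both-missing case are assumptions, not taken from the notes.

```python
from fractions import Fraction


def numeric_difference(a, b):
    """Difference between two normalised values in [0, 1]; None marks a missing value."""
    if a is None and b is None:
        # Assumption: two missing values are treated as maximally different.
        return Fraction(1)
    if a is None or b is None:
        known = b if a is None else a
        # The missing value is assumed to sit at whichever end of [0, 1] is further away.
        return max(known, 1 - known)
    return abs(a - b)


# Seats example: value 16 with range 7..65 normalises to 9/58.
seats = Fraction(16 - 7, 65 - 7)
print(numeric_difference(seats, None))  # 49/58
```

Exact fractions are used only so the printed result matches the hand calculation; floats would work equally well in practice.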
## Normalised Transport Data with Missing Values
@@ -85,13 +95,13 @@ Where:
## Euclidean Distance
# $\sqrt{(a_1-a_1')^2 + (a_2-a_2')^2 + ... + (a_n-a_n')^2}$
# $\sqrt{(a_1-a_1')^2 + (a_2-a_2')^2 + … + (a_n-a_n')^2}$
Where $a$ and $a'$ are two examples with $n$ attributes, $a_i$ is the value of attribute $i$ for $a$, and $a_i'$ is the value of attribute $i$ for $a'$
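As a minimal sketch, the Euclidean distance between two examples can be written as below. The function name `euclidean_distance` and the sample vectors are illustrative assumptions; the inputs are assumed to be already normalised.

```python
import math


def euclidean_distance(a, b):
    """Square root of the summed squared differences between two examples."""
    if len(a) != len(b):
        raise ValueError("both examples must have the same number of attributes")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two hypothetical examples with two normalised attributes each.
print(euclidean_distance([0.25, 0.0], [1.0, 1.0]))  # 1.25
```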
## Manhattan Distance
# $|a_1-a_1'|+|a_2-a_2'|+...+|a_n-a_n'|$
# $|a_1-a_1'|+|a_2-a_2'|+…+|a_n-a_n'|$
Vertical bars denote the absolute value
A negative difference becomes positive
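The Manhattan distance is the same loop with absolute differences in place of squares. This is an illustrative sketch; the function name `manhattan_distance` and the sample vectors are assumptions.

```python
def manhattan_distance(a, b):
    """Sum of absolute attribute differences between two examples."""
    if len(a) != len(b):
        raise ValueError("both examples must have the same number of attributes")
    return sum(abs(x - y) for x, y in zip(a, b))


# Two hypothetical examples with two normalised attributes each.
print(manhattan_distance([0.25, 0.0], [1.0, 1.0]))  # 1.75
```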
@@ -109,4 +119,3 @@ Euclidean distance is generally a good compromise
- Does not detect noise
- Use k-NN: take the k closest examples and predict by majority vote over their solutions
![](Pasted%20image%2020241011131542.png)
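The majority vote can be sketched as follows. The data is hypothetical and chosen so that one mislabelled ("noisy") example sits right next to the query; `knn_predict`, `TRAINING` and `QUERY` are names invented for this illustration, and the feature vectors are assumed to be already normalised.

```python
import math
from collections import Counter

# Hypothetical normalised examples; the third one is deliberately noisy.
TRAINING = [
    ([0.10, 0.10], "taxi"),
    ([0.20, 0.10], "taxi"),
    ([0.15, 0.20], "bus"),   # noisy label sitting among the taxis
    ([0.90, 0.90], "bus"),
    ([0.80, 0.95], "bus"),
]
QUERY = [0.15, 0.18]


def knn_predict(training, query, k=3):
    """Predict the majority class among the k training examples closest to query."""
    by_distance = sorted(training, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(knn_predict(TRAINING, QUERY, k=1))  # bus (misled by the noisy example)
print(knn_predict(TRAINING, QUERY, k=3))  # taxi (the two real neighbours outvote it)
```

An odd k is a common choice for two-class problems because it avoids tied votes; in this sketch a tie is broken in favour of the class whose nearest member comes first.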


@@ -1,18 +1,21 @@
![](Pasted%20image%2020241011131844.png)
## Normalisation Equation
# $a_i = \frac{v_i - minv_i}{maxv_i - minv_i}$
## Euclidean Distance Equation
# $\sqrt{(a_1-a_1')^2 + (a_2-a_2')^2 + ... + (a_n-a_n')^2}$
# $a_i = \frac{v_i - minv_i}{maxv_i - minv_i}$
## Euclidean Distance Equation
# $\sqrt{(a_1-a_1')^2 + (a_2-a_2')^2 + … + (a_n-a_n')^2}$
# $vmax_{temp} = 85$
# $vmin_{temp} = 64$
# $a_{temp} = \frac{v_{temp} - 64}{21}$
# $vmax_{humidity} = 96$
# $vmin_{humidity} = 65$
# $a_{humidity} = \frac{v_{humidity} - 65}{31}$


@@ -39,4 +39,4 @@ Root mean squared error 0.3409
Relative absolute error 90.9091 %
Root relative squared error 90.9091 %
Total Number of Instances 2
```
```