vault backup: 2025-03-16 18:59:42

2025-03-16 18:59:42 +00:00
parent 6befcc90d4
commit ae837183f1
188 changed files with 17794 additions and 409 deletions
--- a/Mining/Week
+++ b/Mining/Week
@@ -1,21 +1,21 @@
-1)
 a) Binomial Distribution
 b) It describes how the number of correct predictions S is spread around its mean. For each possible value of S from 0 to N, it gives the probability of observing S correct predictions in a sample of N independent examples when the true accuracy is p.
-
-2)
 a) (150 + 180 + 420) / (150 + 180 + 420 + 30 + 50 + 50 + 40 + 50 + 30) = 0.75
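
The accuracy calculation above can be checked with a short sketch. The list names `correct` and `errors` are illustrative assumptions; the counts are the ones in the answer, taken here to be the correctly classified and misclassified counts from a three-class confusion matrix.

```python
# Counts taken from the worked answer; the variable names are illustrative.
correct = [150, 180, 420]            # correctly classified examples
errors = [30, 50, 50, 40, 50, 30]    # misclassified examples

# Accuracy = correct predictions / all predictions.
total = sum(correct) + sum(errors)
accuracy = sum(correct) / total
print(sum(correct), total, accuracy)
```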

-# Variance of S $\sigma^2_S = Np(1-p)$ 
+# Variance of S $\sigma^2_S = Np(1-p)$
+
 # Std Dev of S $\sigma_S = \sqrt{Np(1-p)}$
+
 # Std Dev of f $\sigma_f = \frac{\sigma_S}{N} = \sqrt{\frac{Np(1-p)}{N^2}} = \sqrt{\frac{p(1-p)}{N}}$

 # Estimate of Predictive Accuracy $\mu_f = \frac{S}{N}$
+
 # Successful Trials $S$
+
 # Number of Trials $N$

-
 750 Successes 1000 Trials
-S = 750 
+S = 750
 N = 1000
 $\mu_f$ = 0.75
 $\sigma_f = \sqrt{(0.75 \times 0.25)/1000} = 0.0137$
@@ -25,9 +25,7 @@ $\mu_f \pm z \times \sigma_f = 0.75 \pm (1.28 \times 0.0137)$
 $= 0.75 \pm 0.0175$

 p lies between 73.25% and 76.75%, with 80% confidence.
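
A minimal sketch of the same interval calculation, assuming the normal approximation used in the notes and z = 1.28 for 80% confidence. The function name `accuracy_confidence_interval` is hypothetical, not from the source.

```python
import math

def accuracy_confidence_interval(successes, trials, z):
    """Normal-approximation interval for the true accuracy p.

    successes: S, the number of correct predictions
    trials:    N, the number of test examples
    z:         z-value for the chosen confidence level (1.28 for 80%)
    """
    f = successes / trials                      # observed accuracy, mu_f = S / N
    sigma_f = math.sqrt(f * (1 - f) / trials)   # std dev of f, using f as the estimate of p
    margin = z * sigma_f
    return f - margin, f + margin

low, high = accuracy_confidence_interval(750, 1000, z=1.28)
print(f"p lies between {low:.2%} and {high:.2%}")
```

Using the observed accuracy in place of the unknown p inside the square root is the same substitution the worked example makes.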
-
-3)
 a)
 Stratified Holdout: the data is split so that the training and test sets have the same distribution of class values
 b)
-Repeated Holdout: training and testing are done several times with different splits. The overall estimate of predictive accuracy is the average of the accuracies from the different iterations
+Repeated Holdout: training and testing are done several times with different splits. The overall estimate of predictive accuracy is the average of the accuracies from the different iterations
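
The two holdout variants can be combined in a short sketch using only the standard library. All function names here are hypothetical, and the majority-class predictor is a toy stand-in for whatever model would really be trained.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx) with every class split in the same proportion."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx

def majority_class_accuracy(labels, train_idx, test_idx):
    """Toy stand-in for a real model: predict the most common training class."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    correct = sum(1 for i in test_idx if labels[i] == majority)
    return correct / len(test_idx)

def repeated_holdout(labels, repeats, test_fraction=0.3, seed=0):
    """Average the holdout accuracy over several different stratified splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        train_idx, test_idx = stratified_split(labels, test_fraction, rng)
        scores.append(majority_class_accuracy(labels, train_idx, test_idx))
    return sum(scores) / len(scores)

# Illustrative data: 70 "yes" and 30 "no" examples.
labels = ["yes"] * 70 + ["no"] * 30
print(repeated_holdout(labels, repeats=10))
```

Because every split is stratified, each test set keeps the 70/30 class ratio of the full data, so the per-iteration accuracies differ only through which examples land in each set.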