The Normal Distribution
It is bell-shaped.
Mean = Median = Mode.
It is symmetrical around the mean.
The random variable has an infinite theoretical range: $-\infty$ to $+\infty$.
[Figure: bell-shaped density $f(x)$ centered at $\mu$ with spread $\sigma$]
If a random variable $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$, this is written $X \sim N(\mu, \sigma^2)$, and the probability density function is
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
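As a minimal sketch, the density can be evaluated directly from the formula and compared with Python's standard-library `statistics.NormalDist`. The function name `normal_pdf` and the parameter values $\mu = 8$, $\sigma = 5$ are illustrative assumptions, not part of the notes.

```python
import math
from statistics import NormalDist

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2) at x, written out from the formula."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Illustrative parameters (assumed): mu = 8, sigma = 5
by_formula = normal_pdf(8.0, mu=8.0, sigma=5.0)
by_library = NormalDist(mu=8.0, sigma=5.0).pdf(8.0)
print(by_formula, by_library)  # the two values should agree
```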
The cumulative distribution function is
$$F(x_0) = P(X \le x_0) = \int_{-\infty}^{x_0} f(x)\,dx$$
[Figure: area under $f(x)$ to the left of $x_0$]
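The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean and half is below:
$$P(-\infty < X < \mu) = 0.5, \qquad P(\mu < X < \infty) = 0.5, \qquad P(-\infty < X < \infty) = 1.0$$
[Figure: density $f(x)$ with an area of 0.5 on each side of $\mu$]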
The Standardized Normal
Any normal distribution can be transformed into the standardized normal distribution ($Z$), with mean 0 and variance 1:
$$Z = \frac{X - \mu}{\sigma}, \qquad Z \sim N(0, 1)$$
[Figure: density $f(z)$ of $Z \sim N(0, 1)$, centered at 0 with standard deviation 1]
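Note that the distribution is the same; only the scale is standardized:
$$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right) = F\left(\frac{b-\mu}{\sigma}\right) - F\left(\frac{a-\mu}{\sigma}\right)$$
[Figure: the area between $a$ and $b$ under $f(x)$ equals the area between $\frac{a-\mu}{\sigma}$ and $\frac{b-\mu}{\sigma}$ under the density of $Z$]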
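The standardization step can be sketched in a few lines: convert both endpoints to $z$-values and take the difference of the standard normal cumulative probabilities. The helper name `interval_probability` and the numbers used ($X \sim N(8, 25)$, interval from 3 to 13) are assumptions chosen for illustration.

```python
from statistics import NormalDist

def interval_probability(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X < b) for X ~ N(mu, sigma^2), computed through Z."""
    std_normal = NormalDist()          # Z ~ N(0, 1)
    z_a = (a - mu) / sigma             # standardized lower endpoint
    z_b = (b - mu) / sigma             # standardized upper endpoint
    return std_normal.cdf(z_b) - std_normal.cdf(z_a)

# Illustrative values (assumed): one standard deviation either side of the mean
p = interval_probability(3.0, 13.0, mu=8.0, sigma=5.0)
print(round(p, 4))
```

Working on the original scale with `NormalDist(8, 5).cdf(13) - NormalDist(8, 5).cdf(3)` should give the same probability, which is the point of the slide: only the scale changes.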
The Standardized Normal Table gives the cumulative probability for any value of $z$.
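Ex: $X \sim N(8, 25) \Rightarrow P(X < 8.6) = ?$
$$Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8}{5} = 0.12, \qquad P(Z < 0.12) = 0.5478$$
[Figure: the area $P(X < 8.6)$ under the density of $X$ ($\mu = 8$, $\sigma = 5$) equals the area $P(Z < 0.12)$ under the density of $Z$ ($\mu = 0$, $\sigma = 1$)]
That is, 54.78% of the values the random variable $X$ can take lie below 8.6.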
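The table lookup in this example can be checked numerically. This is a sketch using the standard-library `statistics.NormalDist` in place of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0            # X ~ N(8, 25), so sigma = sqrt(25) = 5
x0 = 8.6

z = (x0 - mu) / sigma           # standardize the cutoff
prob = NormalDist().cdf(z)      # P(Z < z) from the standard normal CDF

print(round(z, 2), round(prob, 4))
```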
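For negative $Z$-values, use the fact that the distribution is symmetric.
Ex: $P(Z < -2.00) = ?$
$$P(Z < -2.00) = P(Z > 2.00) = 1 - P(Z < 2.00) \Rightarrow P(Z < -2.00) = 1 - 0.9772 = 0.0228$$
[Figure: the lower tail below $-2.00$ and the upper tail above $2.00$ each have area 0.0228; the remaining area on either side is 0.9772]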
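Ex: Finding the X value for a Known Probability
If $X \sim N(8, 25)$, which value of $X$ lies above 20% of all the values $X$ can take?
[Figure: area 0.20 to the left of the unknown $X$ value and 0.80 to its right, with mean 8.0; on the $Z$ scale the cutoff is $-0.84$ and the mean is 0]
From the standard normal table we know that the corresponding $Z$ value is $-0.84$. Hence
$$Z = \frac{X - \mu}{\sigma} \Rightarrow X = \mu + Z\sigma = 8 + (-0.84)(5) = 3.8$$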
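The reverse lookup can be sketched with the inverse CDF: find the $z$-value with 20% of the probability below it, then convert it back to the scale of $X$. `NormalDist.inv_cdf` stands in for reading the table backwards; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0                 # X ~ N(8, 25)
target = 0.20                        # probability that should lie below x

z = NormalDist().inv_cdf(target)     # z-value with 20% of the area below it
x = mu + z * sigma                   # undo the standardization

print(round(z, 2), round(x, 1))
```

The unrounded result differs slightly from the hand calculation because the table value $-0.84$ is itself rounded.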
Lognormal Distribution
If $X = \ln Y$ is normally distributed with mean $\mu$ and variance $\sigma^2$, then $Y$ has a lognormal distribution:
$$\ln(Y) \sim N(\mu, \sigma^2)$$
The lognormal distribution is used to model continuous random quantities when the distribution is believed to be skewed, such as certain income and lifetime variables.
The lognormal is skewed to the right ($\ln 100 = 4.6$, $\ln 10 = 2.3$).
DISTRIBUTION OF SAMPLE STATISTICS
Sampling from a Population
Example: Suppose we have a population consisting of the numbers 2, 4, 6, 6, 7, 8. We can select a 3-element sample from these numbers; let its elements be 2, 6, 7. The mean of these three numbers is 5. The population mean, on the other hand, is 5.5.
If we continue selecting samples:

Sample      Mean
2, 6, 7     5
2, 7, 8     5.67
4, 7, 8     6.33
2, 4, 7     4.33

This gives us an idea of how much the means of 3-element samples can vary (4.33, 5, 5.67, 6.33, ...), that is, of the distribution of sample means.
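Because the population is tiny, the full distribution of sample means can be enumerated rather than sampled. The sketch below assumes sampling without replacement and treats the two 6s as distinct elements, which gives 20 possible samples; neither assumption is stated explicitly in the notes.

```python
from itertools import combinations

population = [2, 4, 6, 6, 7, 8]
population_mean = sum(population) / len(population)

# Every possible 3-element sample (without replacement, by position)
samples = list(combinations(population, 3))
sample_means = [sum(s) / 3 for s in samples]

mean_of_sample_means = sum(sample_means) / len(sample_means)

print(len(samples), min(sample_means), max(sample_means))
print(population_mean, round(mean_of_sample_means, 2))
```

The individual sample means range from 4 to 7, but their average matches the population mean of 5.5.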
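Sampling Distribution of Sample Means
Central Limit Theorem: As $n$ becomes large, the distribution of
$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
approaches the standard normal distribution regardless of the underlying probability distribution. That is,
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$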
The standard deviation of the distribution of $\bar{X}$ decreases when the sample size, $n$, increases.
Law of large numbers: The central limit theorem states that $\bar{X} \sim N(\mu, \sigma^2/n)$. Hence, as $n$ becomes large, the variance $\sigma^2/n$ shrinks toward zero and the sample mean, $\bar{X}$, converges to the population mean, $\mu$.
CONFIDENCE INTERVAL ESTIMATION: ONE POPULATION
A point estimator of a population parameter is a function of the sample information that yields a single number.
An interval estimator of a population parameter is a rule for determining (based on the sample information) a range, or interval, in which the parameter is likely to fall.
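Interval Estimation
Let $\theta$ be the unknown population parameter, and suppose the sample information gives
$$P(a < \theta < b) = 1 - \alpha$$
The quantity $100(1-\alpha)\%$ is called the confidence level of the interval.
The interval from $a$ to $b$ is called the $100(1-\alpha)\%$ confidence interval of $\theta$.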
Confidence Interval Estimation for the Mean of a Normal Distribution: Population Variance Known
Example: Suppose we select a sample of $n$ elements from a population with mean $\mu$ and standard deviation $\sigma$, and want to find the population mean by interval estimation. For instance, if we are interested only in the middle 90% of this distribution, we discard 5% from each tail. Cutting off the right tail gives the $Z$ value $1.645$; cutting off the left tail gives its mirror image, $-1.645$.
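The 90% confidence interval can be found as follows:
$$0.90 = P(-1.645 < Z < 1.645)$$
$$-1.645 < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < 1.645$$
$$-1.645\,\frac{\sigma}{\sqrt{n}} < \bar{x} - \mu < 1.645\,\frac{\sigma}{\sqrt{n}}$$
$$\bar{x} - 1.645\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.645\,\frac{\sigma}{\sqrt{n}}$$
Going 1.645 standard deviations of the sample mean ($\sigma/\sqrt{n}$) to the right and to the left of the sample mean gives the 90% confidence interval for the population mean.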
When different samples are used, confidence intervals for $\mu$ such as the following can be obtained. 90% of these confidence intervals will contain $\mu$.
[Figure: confidence intervals for $\mu$ computed from different samples]
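General form of confidence intervals:
$$\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
Besides 90%, the most commonly used confidence levels are 95% and 99%.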
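The $\alpha$ values for these are 5% and 1% respectively, and the corresponding $z$ values are
$$z_{\alpha/2} = z_{0.025} = 1.96 \quad \text{since } F(1.96) = 0.975$$
$$z_{\alpha/2} = z_{0.005} = 2.575 \quad \text{since } F(2.575) = 0.995$$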
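The critical values for the usual confidence levels can be recovered from the inverse standard normal CDF instead of the table. A minimal sketch; the helper name `z_critical` is an assumption introduced here.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    # alpha/2 is cut from each tail, so 1 - alpha/2 lies below the cutoff
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

The 99% value comes out slightly above the tabulated 2.575 because printed tables interpolate between 2.57 and 2.58.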
Confidence Interval Estimation for the Mean of a Normal Distribution: Population Variance Unknown: The t Distribution
For a random sample from a normal population with mean $\mu$ and variance $\sigma^2$, the random variable $\bar{X}$ has a normal distribution with mean $\mu$ and variance $\sigma^2/n$; i.e.
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
has the standard normal distribution.
But if $\sigma$ is unknown, the sample estimate $s_x$ is usually used instead:
$$t = \frac{\bar{X} - \mu}{s_x/\sqrt{n}}$$
In this case the random variable $t$ follows the Student's t distribution with $(n-1)$ degrees of freedom.
A random variable having the Student's t distribution with $\nu$ degrees of freedom will be denoted $t_\nu$. Then $t_{\nu,\alpha}$ is the number for which
$$P(t_\nu > t_{\nu,\alpha}) = \alpha$$
A $100(1-\alpha)\%$ confidence interval for the population mean, variance unknown, is given by
$$\bar{x} - t_{n-1,\alpha/2}\,\frac{s_x}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\alpha/2}\,\frac{s_x}{\sqrt{n}}$$
Example: The fuel consumption of 6 randomly selected cars, in gallons/mile, is as follows: 18.6, 18.4, 19.2, 20.8, 19.4 and 20.5. If fuel consumption is normally distributed in the population from which these cars were selected, find the 90% confidence interval for the population mean fuel consumption.
Since the population variance is not given, we first compute the sample variance and then use the formula for the unknown-variance case. For the sample variance:

i       x_i      x_i^2
1       18.6     345.96
2       18.4     338.56
3       19.2     368.64
4       20.8     432.64
5       19.4     376.36
6       20.5     420.25
Sums    116.9    2,282.41
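Hence the sample mean is
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{116.9}{6} = 19.48$$
the sample variance is
$$s_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1} = \frac{2282.41 - 6\,(116.9/6)^2}{5} = 0.96$$
and the standard deviation is
$$s_x = \sqrt{0.96} = 0.98$$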
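The confidence interval we are looking for is
$$\bar{x} - t_{n-1,\alpha/2}\,\frac{s_x}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\alpha/2}\,\frac{s_x}{\sqrt{n}}$$
where $n = 6$ and $\alpha/2 = 0.10/2 = 0.05 \Rightarrow t_{5,0.05} = 2.015$, so
$$19.48 - 2.015\,\frac{0.98}{\sqrt{6}} < \mu < 19.48 + 2.015\,\frac{0.98}{\sqrt{6}}$$
and therefore
$$18.67 < \mu < 20.29$$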
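The whole fuel-consumption calculation can be reproduced as a short sketch. Python's standard library has no Student's t distribution, so the critical value $t_{5,0.05} = 2.015$ is hard-coded from the table value quoted in the example; this is a deliberate simplification, and the variable names are illustrative.

```python
import math
from statistics import mean, stdev

fuel = [18.6, 18.4, 19.2, 20.8, 19.4, 20.5]
n = len(fuel)

x_bar = mean(fuel)          # sample mean
s_x = stdev(fuel)           # sample standard deviation (divides by n - 1)

# Table value t_{n-1, alpha/2} for n = 6, alpha = 0.10 (taken from the text)
t_crit = 2.015

half_width = t_crit * s_x / math.sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width

print(round(x_bar, 2), round(s_x, 2))
print(round(lower, 2), round(upper, 2))
```

Carrying full precision through the calculation moves the lower limit by about 0.01 relative to the hand computation, which rounds $\bar{x}$ and $s_x$ to two decimals first.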
The results for different confidence levels are as follows:
[Figure: confidence intervals at different confidence levels]
HYPOTHESIS TESTING
We test the validity of a claim about a population parameter by using sample data.
Null Hypothesis: The hypothesis that is maintained unless there is strong evidence against it.
Alternative Hypothesis: The hypothesis that is accepted when the null hypothesis is rejected.
Note: If you do not reject the null hypothesis, it does not mean that you accept it. You just fail to reject it.
Simple Hypothesis: A hypothesis that the population parameter, $\theta$, is equal to a specific value, $\theta_0$:
$$H_0: \theta = \theta_0$$
Composite Hypothesis: A hypothesis that the population parameter lies in a range of values.
Hypothesis Test Decisions:
Type I Error: Rejecting a true null hypothesis.
Type II Error: The failure to reject a false null hypothesis.
Significance Level of a Test: The probability of making a Type I error, which is often expressed as a percentage and denoted by $\alpha$.
Power of a Test: The probability of not making a Type II error.
                       Null is True     Null is False
Reject Null            Type I Error     Correct
Fail to Reject Null    Correct          Type II Error

Type I and Type II errors are inversely related: as one increases, the other decreases (but not one to one).
Tests of the Mean of a Normal Distribution: Population Variance Known
A random sample of $n$ observations is obtained from a normally distributed population with mean $\mu$ and known variance $\sigma^2$. We know that the standardized sample mean
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
has a standard normal distribution, with mean 0 and variance 1.
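A test with significance level $\alpha$ of the null hypothesis
$$H_0: \mu = \mu_0$$
against the alternative
$$H_1: \mu > \mu_0$$
is obtained by using the following decision rule:
$$\text{Reject } H_0 \text{ if } \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$$
or equivalently
$$\bar{x} > \mu_0 + z_\alpha\,\frac{\sigma}{\sqrt{n}}$$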
[Figure: the decision rule shown graphically]
In this case $\alpha$ is the significance level of the test (the probability of rejecting a true null hypothesis). If it were a two-sided test with the same critical value in each tail, the significance level of the test would be $2\alpha$.
Yet the power of the test (the probability of rejecting a false null hypothesis) is not $1 - 2\alpha$: if the null hypothesis is false, then the alternative hypothesis holds, which means the underlying distribution is different.
Example: When the production system for a product is working correctly, the product weights have been found to be normally distributed with a mean of 5 kg and a standard deviation of 0.1 kg. A change made by the production manager is intended to increase the mean product weight while leaving the standard deviation unchanged. After this change, a random sample of 16 items is selected, and the mean weight of the items in the sample is found to be 5.038 kg. Test the null hypothesis that the mean product weight in the new population is 5 kg against the alternative hypothesis that it is greater than 5 kg, at the 5% and 10% significance levels.
We want to test the hypothesis
$$H_0: \mu = 5$$
against the alternative hypothesis
$$H_1: \mu > 5$$
We can reject $H_0$ in favor of $H_1$ when the following condition holds:
$$\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$$
Given in the question: $\bar{x} = 5.038$, $\mu_0 = 5$, $n = 16$, $\sigma = 0.1$; therefore
$$\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{5.038 - 5}{0.1/\sqrt{16}} = 1.52$$
If the significance level is 5%, the $z$ value corresponding to 5% in the standard normal table is
$$z_{0.05} = 1.645$$
Since 1.52 is not greater than this number, we cannot reject the null hypothesis at the 5% significance level (fail to reject).
If the significance level is 10%, the $z$ value corresponding to 10% in the standard normal table is
$$z_{0.10} = 1.28$$
This time 1.52 is greater than this number, so we can reject the null hypothesis at the 10% significance level.
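The decision rule applied in this example can be sketched as a small function. The name `upper_tail_z_test` and its return layout are assumptions made for illustration; the critical value comes from `NormalDist.inv_cdf` rather than the printed table.

```python
import math
from statistics import NormalDist

def upper_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """Test H0: mu = mu0 against H1: mu > mu0 with known sigma.

    Returns (z statistic, critical value z_alpha, reject H0?).
    """
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # upper-tail cutoff
    return z, z_alpha, z > z_alpha

# Product-weight example from the notes
for alpha in (0.05, 0.10):
    z, z_alpha, reject = upper_tail_z_test(5.038, 5.0, 0.1, 16, alpha)
    print(alpha, round(z, 2), round(z_alpha, 3), reject)
```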
Probability Value (p-value): In the previous example we saw that we could not reject the null hypothesis at the 5% significance level, but we could at 10%. Hence it is possible to find the smallest significance level at which the null hypothesis is rejected; this is called the p-value of a test.
Formally, if a random sample of $n$ observations is obtained from a normally distributed population with mean $\mu$ and known variance $\sigma^2$, and if the observed sample mean is $\bar{x}$, the null hypothesis
$$H_0: \mu = \mu_0$$
is tested against the alternative
$$H_1: \mu > \mu_0$$
The p-value of the test is
$$p\text{-value} = P\left(\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \ge z_p \;\middle|\; H_0: \mu = \mu_0\right)$$
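Example: In the previous example we found
$$\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{5.038 - 5}{0.1/\sqrt{16}} = 1.52$$
From the standard normal table, the upper-tail probability corresponding to this value is 0.0643, which is the p-value of the test.
[Figure: the p-value as the area to the right of 1.52 under the standard normal density]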
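The p-value of the product-weight example is the upper-tail area beyond the observed statistic. A minimal sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x_bar, mu0, sigma, n = 5.038, 5.0, 0.1, 16

z = (x_bar - mu0) / (sigma / math.sqrt(n))   # observed test statistic
p_value = 1.0 - NormalDist().cdf(z)          # area to the right of z

print(round(z, 2), round(p_value, 4))
```

The result lies between 0.05 and 0.10, which matches the earlier finding that the null hypothesis is rejected at the 10% level but not at the 5% level.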
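Simple Null Against Two-Sided Alternative
To test the null hypothesis
$$H_0: \mu = \mu_0$$
against the alternative
$$H_1: \mu \ne \mu_0$$
at significance level $\alpha$, use the following decision rule:
$$\text{Reject } H_0 \text{ if } \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} < -z_{\alpha/2} \quad \text{or} \quad \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z_{\alpha/2}$$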
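The two-sided rule differs from the one-sided rule only in splitting $\alpha$ between the two tails. The sketch below applies it to the statistic $z = 1.52$ from the product-weight example purely as an illustration; the notes do not carry out a two-sided test on those data, and the function name `two_sided_z_reject` is an assumption.

```python
from statistics import NormalDist

def two_sided_z_reject(z: float, alpha: float) -> bool:
    """Reject H0: mu = mu0 against H1: mu != mu0 at significance level alpha."""
    z_half = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # z_{alpha/2}
    return z < -z_half or z > z_half

# Illustration only: reuse z = 1.52 from the one-sided example
for alpha in (0.05, 0.10):
    print(alpha, two_sided_z_reject(1.52, alpha))
```

With $\alpha = 0.10$ the two-sided cutoff is 1.645 rather than 1.28, so the same statistic that was significant in the one-sided test would not be here.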
[Figure: the two-sided decision rule shown graphically, with rejection regions in both tails]
Tests of the Mean of a Normal Distribution: Population Variance Unknown
We are given a random sample of $n$ observations from a normally distributed population with mean $\mu$. Using the sample mean and sample standard deviation, $\bar{x}$ and $s_x$ respectively, we can use the following tests with significance level $\alpha$.
1. To test the null hypothesis
$$H_0: \mu = \mu_0 \quad \text{or} \quad H_0: \mu \le \mu_0$$
against the alternative
$$H_1: \mu > \mu_0$$
the decision rule is as follows:
$$\text{Reject } H_0 \text{ if } \frac{\bar{x} - \mu_0}{s_x/\sqrt{n}} > t_{n-1,\alpha}$$
2. To test the null hypothesis
$$H_0: \mu = \mu_0 \quad \text{or} \quad H_0: \mu \ge \mu_0$$
against the alternative
$$H_1: \mu < \mu_0$$
the decision rule is as follows:
$$\text{Reject } H_0 \text{ if } \frac{\bar{x} - \mu_0}{s_x/\sqrt{n}} < -t_{n-1,\alpha}$$
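3. To test the null hypothesis
$$H_0: \mu = \mu_0$$
against the alternative
$$H_1: \mu \ne \mu_0$$
the decision rule is as follows:
$$\text{Reject } H_0 \text{ if } \frac{\bar{x} - \mu_0}{s_x/\sqrt{n}} < -t_{n-1,\alpha/2} \quad \text{or} \quad \frac{\bar{x} - \mu_0}{s_x/\sqrt{n}} > t_{n-1,\alpha/2}$$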
[Figure: the t-test decision rules shown graphically]
Assessing the Power of a Test
Determining the Probability of Type II Error
Consider the test
$$H_0: \mu = \mu_0$$
against the alternative
$$H_1: \mu > \mu_0$$
using the decision rule
$$\text{Reject } H_0 \text{ if } \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$$
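Now suppose the null hypothesis is wrong and the population mean, $\mu$, is in the region of $H_1$. Type II error is the failure to reject a false null hypothesis. Thus, we consider a value $\mu = \mu^*$ such that $\mu^* > \mu_0$. Writing $\bar{x}_c = \mu_0 + z_\alpha\,\sigma/\sqrt{n}$ for the critical value of the sample mean, the probability of making a Type II error is
$$\beta = P\left(Z < \frac{\bar{x}_c - \mu^*}{\sigma/\sqrt{n}}\right)$$
and therefore the Power of a Test (the probability of not making a Type II error) is $1 - \beta$.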
Example: In the example given earlier, with a random sample of 16 items, we tested the null hypothesis that the mean product weight is 5 kg against the alternative hypothesis that it is greater than 5 kg, at the 5% significance level.
We want to test the hypothesis
$$H_0: \mu = 5$$
against the alternative hypothesis
$$H_1: \mu > 5$$
Given in the question: $\mu_0 = 5$, $n = 16$, $\sigma = 0.1$, $z_\alpha = z_{0.05} = 1.645$; therefore the decision rule for rejecting $H_0$ in favor of $H_1$ is
$$\frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\bar{x} - 5}{0.1/4} > 1.645$$
or
$$\bar{x} > 1.645\,(0.1/4) + 5 = 5.041$$
This means that when the sample mean is less than 5.041 we will fail to reject the null hypothesis.
Suppose the population mean is 5.05 (that is, the alternative hypothesis is true), and let us find the probability of making a Type II error by failing to reject the null hypothesis. This is the probability that the sample mean is less than 5.041 when the population mean is 5.05:
$$P(\bar{X} \le 5.041) = P\left(Z \le \frac{5.041 - \mu^*}{\sigma/\sqrt{n}}\right) = P\left(Z \le \frac{5.041 - 5.05}{0.1/4}\right) = P(Z \le -0.36) = 1 - 0.64 = 0.36$$
Hence $\beta = 0.36$ and the power of our test is
$$\text{Power} = 1 - \beta = 0.64$$
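The power calculation can be sketched as a function of the assumed true mean. The name `power_upper_tail` and its argument `mu_star` (the hypothesized true mean under $H_1$) are illustrative assumptions; the numbers are those of the example.

```python
import math
from statistics import NormalDist

def power_upper_tail(mu0, mu_star, sigma, n, alpha):
    """Power of the test H0: mu = mu0 vs H1: mu > mu0 when the true mean is mu_star.

    Returns (critical sample mean, beta, power).
    """
    std_normal = NormalDist()
    se = sigma / math.sqrt(n)                          # sd of the sample mean
    x_crit = mu0 + std_normal.inv_cdf(1.0 - alpha) * se
    # Type II error: the sample mean falls below x_crit although mu = mu_star
    beta = std_normal.cdf((x_crit - mu_star) / se)
    return x_crit, beta, 1.0 - beta

x_crit, beta, power = power_upper_tail(5.0, 5.05, 0.1, 16, 0.05)
print(round(x_crit, 3), round(beta, 2), round(power, 2))
```

Calling the function with a larger `mu_star` or a larger `n` shows the power rising, since the distribution of the sample mean under the alternative moves further from, or concentrates more tightly away from, the critical value.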
[Figure: the Type II error probability and the power of the test shown graphically]