Fitting Distributions

Continuous Distributions

  • Assessing Distributions Visually
  • Formal Tests for Distribution Fit
  • Maximum Likelihood Calculation

Normal Distribution

The Q-Q plot suggests the data may be normal: the points stay close to the reference line. The box plot, histogram, and density curve all support this assumption.

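None of the formal tests rejects the null hypothesis of normality (every p-value is well above 0.05), so the data are consistent with a normal distribution. The Shapiro-Wilk test is generally considered the most powerful of the three for testing normality.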


    Shapiro-Wilk normality test

data:  x1
W = 0.95643, p-value = 0.4753

    Anderson-Darling normality test

data:  x1
A = 0.27523, p-value = 0.6216

    One-sample Kolmogorov-Smirnov test

data:  x1
D = 0.13528, p-value = 0.8109
alternative hypothesis: two-sided
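
As a rough illustration of what a one-sample Kolmogorov-Smirnov result reports, the Python sketch below computes the statistic D and an asymptotic two-sided p-value using only the standard library. The vector x1 is not reproduced in this document, so `sample` and the hypothesised `NormalDist(mu=5.0, sigma=0.5)` are made-up assumptions, and the helper names `ks_statistic` and `kolmogorov_pvalue` are hypothetical. The asymptotic p-value can differ from an exact small-sample p-value, so this is not expected to reproduce the figures above.

```python
import math
from statistics import NormalDist


def ks_statistic(data, cdf):
    """One-sample KS statistic: the largest gap between the empirical CDF and cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def kolmogorov_pvalue(d, n, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov series (an approximation)."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        # The alternating series converges poorly here; the true value is ~1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical data and a hypothetical, fully specified null distribution.
sample = [4.8, 5.1, 5.6, 4.9, 5.3, 5.0, 4.7, 5.4, 5.2, 5.5]
null_dist = NormalDist(mu=5.0, sigma=0.5)

d_stat = ks_statistic(sample, null_dist.cdf)
p_value = kolmogorov_pvalue(d_stat, len(sample))
print(f"D = {d_stat:.5f}, asymptotic p-value = {p_value:.4f}")
```

If the mean and standard deviation are estimated from the same sample instead of being fixed in advance, the KS p-value tends to be conservative, which is one reason Shapiro-Wilk or Anderson-Darling is often preferred for normality.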

Chi-squared Distribution

      df    
  1.8140625 
 (0.3244201)

    One-sample Kolmogorov-Smirnov test

data:  x2
D = 0.15608, p-value = 0.6584
alternative hypothesis: two-sided
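
The df estimate above, with its standard error in parentheses, is the kind of result a maximum likelihood fit produces. A minimal Python sketch of such a fit follows, assuming the chi-squared log-likelihood is maximised over df by golden-section search and the standard error is taken from the numerical curvature at the maximum. This is not necessarily the optimiser behind the output above. The vector x2 is not available here, so the data are simulated, and the function names `chi2_loglik` and `fit_chi2_df` are hypothetical.

```python
import math
import random


def chi2_loglik(df, n, sum_log_x, sum_x):
    """Chi-squared log-likelihood written in terms of sufficient statistics."""
    half = df / 2.0
    return ((half - 1.0) * sum_log_x - sum_x / 2.0
            - n * (half * math.log(2.0) + math.lgamma(half)))


def fit_chi2_df(data, lo=0.01, hi=100.0, tol=1e-8):
    """Return (df_hat, standard_error) for strictly positive data."""
    n = len(data)
    sum_log_x = sum(math.log(x) for x in data)
    sum_x = sum(data)

    def loglik(df):
        return chi2_loglik(df, n, sum_log_x, sum_x)

    # Golden-section search; the log-likelihood is concave in df.
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if loglik(c) > loglik(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    df_hat = (a + b) / 2.0

    # Standard error from the observed information (numerical second derivative).
    h = 1e-3
    curvature = (loglik(df_hat + h) - 2.0 * loglik(df_hat) + loglik(df_hat - h)) / h ** 2
    return df_hat, 1.0 / math.sqrt(-curvature)


# Simulated stand-in for x2: chi-squared(2) is Gamma(shape=1, scale=2).
rng = random.Random(0)
simulated = [rng.gammavariate(1.0, 2.0) for _ in range(2000)]

df_hat, se = fit_chi2_df(simulated)
print(f"df = {df_hat:.4f} (se {se:.4f})")
```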

Calculating the MLE manually

[1] 276565.8
[1] 0.2

    One-sample Kolmogorov-Smirnov test

data:  X
D = 0.1475, p-value = 0.7232
alternative hypothesis: two-sided


Discrete Distributions

  • Fitting a Binomial model
  • Chi-squared goodness of fit test
  • Producing a markdown table

A rare but fatal genetic disease that occurs chiefly in infants and children is under investigation. An experiment was conducted on 100 couples in which both partners are carriers of the disease and who each have 5 children. For each couple, a researcher recorded the number of children with the disease.

  Diseased Count
1        0    21
2        1    42
3        2    24
4        3     8
5        4     4
6        5     1
[1] 135
[1] 500
[1] 0.27
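
The three values above are consistent with the usual binomial estimate of p: total diseased children divided by total children. A short Python sketch of that arithmetic, using the counts from the table (the variable names are illustrative):

```python
# Observed data: number of diseased children per couple and how many couples.
diseased = [0, 1, 2, 3, 4, 5]
count = [21, 42, 24, 8, 4, 1]
children_per_couple = 5

total_diseased = sum(d * c for d, c in zip(diseased, count))
total_children = children_per_couple * sum(count)

# Maximum likelihood estimate of p under a common binomial model.
p_hat = total_diseased / total_children

print(total_diseased, total_children, p_hat)
```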
  Diseased Count Exp.Prob Exp.Diseased
1        0    21   0.2073        20.73
2        1    42   0.3834        38.34
3        2    24   0.2836        28.36
4        3     8   0.1049        10.49
5        4     4   0.0194         1.94
6        5     1   0.0014         0.14
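
The Exp.Prob and Exp.Diseased columns above agree with Binomial(5, 0.27) probabilities and those probabilities scaled by the 100 couples. A minimal Python sketch that recomputes them, assuming p = 0.27 as estimated from the data (names are illustrative):

```python
import math

n_children = 5
n_couples = 100
p_hat = 0.27

# Binomial probability of k diseased children out of n_children.
exp_prob = [
    math.comb(n_children, k) * p_hat ** k * (1.0 - p_hat) ** (n_children - k)
    for k in range(n_children + 1)
]
# Expected number of couples in each category.
exp_count = [n_couples * pr for pr in exp_prob]

for k, (pr, ex) in enumerate(zip(exp_prob, exp_count)):
    print(f"{k}  {pr:.4f}  {ex:.2f}")
```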
  Diseased Count Exp.Prob Exp.Diseased    X.2
1        0    21   0.2073        20.73 0.0035
2        1    42   0.3834        38.34 0.3494
3        2    24   0.2836        28.36 0.6703
4        3     8   0.1049        10.49 0.5910
5   4 or 5     5   0.0208         2.08 4.0992
[1] 5.7134
[1] 0.2215985
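
The statistic 5.7134 is the sum of the X.2 column after pooling the "4" and "5" categories, and the reported p-value 0.2215985 matches a chi-squared upper tail with 4 degrees of freedom (5 categories minus 1). Because p was estimated from the same data, a common convention subtracts one further degree of freedom, giving df = 3. The Python sketch below shows both, taking the rounded expected counts from the table as given; `chi2_sf` is a hypothetical helper that only covers these two cases in closed form.

```python
import math

# Observed and expected counts after pooling "4 or 5" (expected values as tabulated).
observed = [21, 42, 24, 8, 5]
expected = [20.73, 38.34, 28.36, 10.49, 2.08]


def chi2_sf(x, df):
    """Upper-tail chi-squared probability; closed forms for df = 3 and df = 4 only."""
    if df == 4:
        return math.exp(-x / 2.0) * (1.0 + x / 2.0)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2.0))
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("only df = 3 or df = 4 are implemented in this sketch")


statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_df4 = chi2_sf(statistic, 4)  # categories - 1, as in the output above
p_df3 = chi2_sf(statistic, 3)  # categories - 1 - 1 estimated parameter

print(f"X^2 = {statistic:.4f}, p (df=4) = {p_df4:.4f}, p (df=3) = {p_df3:.4f}")
```

Under either choice the p-value is above 0.05, so the binomial model is not rejected at that level. The pooled cell still has an expected count of about 2, below the usual rule of thumb of 5, so the chi-squared approximation should be read with some caution.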

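| Diseased | Count | Exp.Prob | Exp.Diseased |    X.2 |
|:---------|------:|---------:|-------------:|-------:|
| 0        |    21 |   0.2073 |        20.73 | 0.0035 |
| 1        |    42 |   0.3834 |        38.34 | 0.3494 |
| 2        |    24 |   0.2836 |        28.36 | 0.6703 |
| 3        |     8 |   0.1049 |        10.49 | 0.5910 |
| 4 or 5   |     5 |   0.0208 |         2.08 | 4.0992 |
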
[1] 0.1829839 0.3570161
[1] 0.7926928
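
The last two results are not labelled in the output. Numerically, they are consistent with a 95% Wald confidence interval for p (using the multiplier 1.96 and n = 100 couples) and with the probability that a couple has at least one diseased child among five, 1 - (1 - p)^5. That reading is an inference from the numbers, not something the output states. A Python sketch under those assumptions:

```python
import math

p_hat = 0.27
n = 100   # assumed sample size behind the interval (number of couples)
z = 1.96  # assumed normal multiplier for a 95% interval

# Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)
half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
ci_lower = p_hat - half_width
ci_upper = p_hat + half_width

# Probability of at least one diseased child out of five.
p_at_least_one = 1.0 - (1.0 - p_hat) ** 5

print(f"{ci_lower:.7f} {ci_upper:.7f}")
print(f"{p_at_least_one:.7f}")
```

Treating the 500 children as the independent trials would give a narrower interval, so the choice of n is worth stating explicitly when reporting this result.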