problem #3
test=matrix(scan("prob3-spr07.txt"), ncol=3, byrow=TRUE)

x=test[,2]	#contamination for site 1 for the 20-day period
y=test[,3]	#contamination for site 2 for the 20-day period

# Check Normality..
#histogram
hist(x)
#QQ plot
qqnorm(x)
qqline(x)

ks.test(x, "pnorm", mean=mean(x), sd=sd(x))

        One-sample Kolmogorov-Smirnov test

data:  x
D = 0.0924, p-value = 0.9956
alternative hypothesis: two.sided

#histogram
hist(y)
#QQ plot
qqnorm(y)
qqline(y)

ks.test(y, "pnorm", mean=mean(y), sd=sd(y), exact=TRUE)

        One-sample Kolmogorov-Smirnov test

data:  y
D = 0.1215, p-value = 0.9294
alternative hypothesis: two.sided
-------
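The K-S p-values are high (0.9956 for x, 0.9294 for y), so the test gives no
evidence against Normality; this is not proof that the data are Normally
distributed. Because mean and sd are estimated from the same data, the K-S
p-values are also conservative (too large). The Q-Q plots and the histograms
indicate there might be slight non-Normality.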
---------------
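A complementary Normality check, as a minimal sketch: shapiro.test (base R,
stats package) does not need the mean and sd to be supplied, so it avoids the
issue of plugging estimated parameters into ks.test. The vector x_demo is
hypothetical stand-in data, not the contents of prob3-spr07.txt; replace it
with x or y when the file is available.

```r
# Hypothetical stand-in for x = test[,2]; NOT the real data.
set.seed(1)
x_demo <- rnorm(20, mean = 18, sd = 3.5)

# Shapiro-Wilk test: H0 is that the sample comes from a Normal distribution.
sw <- shapiro.test(x_demo)
print(sw)

# K-S test with parameters estimated from the same sample, for comparison.
ks <- ks.test(x_demo, "pnorm", mean = mean(x_demo), sd = sd(x_demo))
print(ks)
```

In both tests a large p-value only means Normality is not rejected.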
(i) Parametric tests
Two-sample t-test (Equation 10-15, unequal variances) and
paired t-test (Equation 10-22)

Paired t-test
-------------
t.test(x, y, paired=TRUE)

        Paired t-test

data:  x and y
t = 3.5992, df = 19, p-value = 0.001912
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.255446 4.744554
sample estimates:
mean of the differences
                      3

Two-sample t-test (unequal variances)
-------------------------------------
t.test(x, y, var.equal=FALSE)

        Welch Two Sample t-test

data:  x and y
t = 2.8498, df = 37.59, p-value = 0.00706
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.8681746 5.1318254
sample estimates:
mean of x mean of y
    17.85     14.85
-----------
Both tests indicate that the mean contamination at the two sites is
significantly different at 95% confidence (p = 0.0019 for the paired test,
p = 0.0071 for the Welch test).
--------
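The paired t-test output above can be checked by hand from the reported
numbers alone. The sketch below uses only the values printed by R (mean
difference 3, t = 3.5992, df = 19) and rebuilds the standard error, the
two-sided p-value and the 95% confidence interval; the variable names are
chosen here for illustration.

```r
# Values copied from the Paired t-test output above.
mean_diff <- 3
t_stat    <- 3.5992
df        <- 19

# Standard error of the mean difference: t = mean_diff / se.
se <- mean_diff / t_stat

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), df = df)

# 95% confidence interval for the mean difference.
ci <- mean_diff + c(-1, 1) * qt(0.975, df = df) * se

print(p_value)
print(ci)
```

The results agree with the printed p-value and interval up to the rounding of
the t statistic.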

Nonparametric test
-------------------
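The Wilcoxon Rank Sum test (independent samples) or the Wilcoxon Signed Rank
test (paired samples) is appropriate, as both are for data from continuous
distributions that differ only in location (mean). For the Rank Sum test we
have to assume the two distributions have the same shape, hence the same
variance.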
Wilcoxon Rank Sum Test
-----------------------
wilcox.test(x,y, exact=FALSE)

        Wilcoxon rank sum test with continuity correction

data:  x and y
W = 294, p-value = 0.01114
alternative hypothesis: true mu is not equal to 0

This test suggests that the mean contamination is different between the sites
at greater than 95% confidence (p = 0.011, so just short of 99%).
----
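How the W statistic reported by wilcox.test is built, as a minimal sketch:
pool the two samples, rank them, sum the ranks of the first sample, and
subtract the smallest possible rank sum n1(n1+1)/2. The vectors x_demo and
y_demo are hypothetical stand-ins, not the real data.

```r
# Hypothetical stand-in data; NOT the contents of prob3-spr07.txt.
set.seed(3)
x_demo <- rnorm(20, mean = 18, sd = 3.5)
y_demo <- rnorm(20, mean = 15, sd = 3.5)

n1 <- length(x_demo)

# Rank the pooled sample; the first n1 ranks belong to x_demo.
pooled_ranks <- rank(c(x_demo, y_demo))

# W = (rank sum of the first sample) - n1 * (n1 + 1) / 2
W_manual <- sum(pooled_ranks[seq_len(n1)]) - n1 * (n1 + 1) / 2

# Same statistic from wilcox.test.
W_r <- unname(wilcox.test(x_demo, y_demo)$statistic)

print(c(manual = W_manual, wilcox.test = W_r))
```

W ranges from 0 to n1*n2; values far from n1*n2/2 point to a location shift.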

Wilcoxon Signed Rank Test  (assuming a paired situation)
--------------------------

wilcox.test(x,y, paired=TRUE, exact = FALSE)

        Wilcoxon signed rank test with continuity correction

data:  x and y
V = 180, p-value = 0.005294
alternative hypothesis: true mu is not equal to 0

This test also suggests that the mean contamination is different between the
sites, at greater than 99% confidence (p = 0.0053).
----------
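The V statistic of the signed rank test can be reproduced in the same way, as
a sketch: take the paired differences, drop exact zeros, rank the absolute
differences, and sum the ranks that belong to positive differences. The data
below are hypothetical stand-ins for the paired site measurements.

```r
# Hypothetical paired stand-in data; NOT the real measurements.
set.seed(4)
x_demo <- rnorm(20, mean = 18, sd = 3.5)
y_demo <- x_demo - 3 + rnorm(20, mean = 0, sd = 3.5)

# Paired differences, with exact zeros removed.
d <- x_demo - y_demo
d <- d[d != 0]

# V = sum of the ranks of |d| over the positive differences.
V_manual <- sum(rank(abs(d))[d > 0])

# Same statistic from wilcox.test in paired mode.
V_r <- unname(wilcox.test(x_demo, y_demo, paired = TRUE)$statistic)

print(c(manual = V_manual, wilcox.test = V_r))
```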

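A very general test, which assumes only independent pairs and a continuous
distribution of the differences (no particular shape), is the Sign Test.
sign.test is available in the R library BSDA (the function is named
SIGN.test in current BSDA releases).
Install and load the library (install.packages("BSDA"); library(BSDA)) and
the function will be available.
sign.test(x,y)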
$rval
       Dependent-samples Sign-Test

data:  test[, 2] and test[, 3]
S = 16, p-value = 0.01182
alternative hypothesis: true median difference is not equal to 0
95 percent confidence interval:
1.116471 6.000000
sample estimates:
median of x-y
         3.5

The null hypothesis is rejected at greater than 95% confidence (p = 0.012,
so just short of 99%).
----------------------------------------------------------------------------
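The sign test is an exact binomial test on the number of positive
differences, so it can also be run with binom.test from base R without extra
packages. The sketch below uses S = 16 from the output above and assumes that
none of the 20 paired differences is exactly zero (n = 20); sign_test_base is
a hypothetical helper name introduced here.

```r
# S = number of positive differences (from the output above).
# n = 20 assumes no zero differences among the 20 pairs.
S <- 16
n <- 20

# Under H0 (median difference = 0), S ~ Binomial(n, 0.5).
bt <- binom.test(S, n, p = 0.5)
print(bt$p.value)   # about 0.0118, matching the reported p-value

# Hypothetical helper: the same test computed directly from paired data.
sign_test_base <- function(x, y) {
  d <- x - y
  d <- d[d != 0]                      # zero differences carry no sign
  binom.test(sum(d > 0), length(d), p = 0.5)
}
```

With the real data loaded, sign_test_base(x, y) should give the same p-value
if the assumption about zero differences holds.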
(ii)
The K-S p-values are high, so Normality is not rejected, but the Q-Q plots
and the histograms indicate there might be slight non-Normality. Hence,
nonparametric tests are to be preferred, as they do not assume a particular
distribution for the data. Treating the two sites as paired by day, the
Wilcoxon Signed Rank test or the Sign test is preferred.
--
(iii)
delta = 0.42
N = 10
d = delta/sd(x)  = 0.12
From Chart VII (g) for N = 10 and d=0.12
Beta = 0.15
Probability of rejecting Ho (Mu = 2.03) is 1 - Beta = 1 - 0.15 = 0.85, which
is the power of the test.
The sample size is just adequate to achieve a power of 0.8.
More samples will increase the power of the test.
----------------
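A chart reading like the one in (iii) can be cross-checked numerically with
power.t.test from base R. This is only a sketch under assumptions not stated
in the original: a one-sample, one-sided t-test at the 0.05 level, and
sd = 3.5 back-calculated from d = delta/sd(x) = 0.12. The helper name t_power
is hypothetical. The result need not agree with the chart if the chart
defines d or N differently.

```r
# Hypothetical helper: power of a one-sample t-test (assumed setup).
t_power <- function(delta, sd, n, alternative = "one.sided") {
  power.t.test(n = n, delta = delta, sd = sd, sig.level = 0.05,
               type = "one.sample", alternative = alternative)$power
}

# delta and N from part (iii); sd = 0.42 / 0.12 is back-calculated.
p10 <- t_power(delta = 0.42, sd = 3.5, n = 10)
print(p10)

# Power increases with sample size for fixed delta and sd.
p40 <- t_power(delta = 0.42, sd = 3.5, n = 40)
print(p40)
```

Comparing p10 with 1 - Beta from the chart is a quick way to confirm that the
right chart and the right value of d were used.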