a) The contingency table displaying the relationship between ‘Treatment code’ and ‘Time period’ to first infection for patients in this study is given below:

Table 1: Treatment code * Time_period Crosstabulation

                                             Time_period
Treatment code                               Short time  Medium time  Long time  Total
Gamma Interferon  Count                      5           39           32         76
                  % within Treatment code    6.6%        51.3%        42.1%      100.0%
                  % within Time_period       21.7%       44.8%        53.3%      44.7%
                  % of Total                 2.9%        22.9%        18.8%      44.7%
Placebo           Count                      18          48           28         94
                  % within Treatment code    19.1%       51.1%        29.8%      100.0%
                  % within Time_period       78.3%       55.2%        46.7%      55.3%
                  % of Total                 10.6%       28.2%        16.5%      55.3%
Total             Count                      23          87           60         170
                  % within Treatment code    13.5%       51.2%        35.3%      100.0%
                  % within Time_period       100.0%      100.0%       100.0%     100.0%
                  % of Total                 13.5%       51.2%        35.3%      100.0%

b) Out of a total of 170 patients, 5 received Gamma Interferon and experienced a short time to first infection: 5/170 = 2.9%

c) Of the 94 patients receiving placebo, 28 experienced a long time to first infection: 28/94 = 29.8%
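The two percentages in b) and c) can be checked directly from the cell counts in Table 1. The sketch below is only an illustration; the dictionary layout and variable names are assumptions, not part of the original SPSS output.

```python
# Cell counts copied from Table 1 (rows: treatment, columns: time period).
counts = {
    "Gamma Interferon": {"short": 5, "medium": 39, "long": 32},
    "Placebo": {"short": 18, "medium": 48, "long": 28},
}

grand_total = sum(sum(row.values()) for row in counts.values())
placebo_total = sum(counts["Placebo"].values())

# b) share of ALL patients who received Gamma Interferon and had a short time
pct_gamma_short = 100 * counts["Gamma Interferon"]["short"] / grand_total

# c) share of PLACEBO patients who had a long time (a row percentage)
pct_placebo_long = 100 * counts["Placebo"]["long"] / placebo_total

print(f"b) {pct_gamma_short:.1f}%")   # b) 2.9%
print(f"c) {pct_placebo_long:.1f}%")  # c) 29.8%
```

Note that b) divides by the grand total (a "% of Total" figure), while c) divides by the placebo row total (a "% within Treatment code" figure).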

d) Null hypothesis: There is no association between the two variables.

Alternative hypothesis: There is some association between the two variables.

To disprove independence, it is enough to show that, for two events A and B, P(A|B) is not equal to P(A); then A and B are not independent.

Let A be the event that a patient received Gamma Interferon and B the event that a patient experienced a short time to first infection.

P(A|B) = 5/23 = 0.217

P(A) = 76/170 = 0.447

Since P(A|B) is not equal to P(A), there appears to be an association between ‘Treatment code’ and the ‘Time period’ to first infection of the patients suffering from chronic granulomatous disease (CGD).
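The comparison of P(A|B) with P(A) can be reproduced from the counts in Table 1. This is a minimal sketch of that descriptive check only (variable names are assumptions); it is not a formal significance test of the hypotheses stated above.

```python
# Counts from Table 1.
gamma_short = 5      # Gamma Interferon and short time
placebo_short = 18   # Placebo and short time
gamma_total = 76     # all Gamma Interferon patients
grand_total = 170    # all patients

# P(A|B): proportion of short-time patients who received Gamma Interferon.
p_a_given_b = gamma_short / (gamma_short + placebo_short)

# P(A): overall proportion of patients who received Gamma Interferon.
p_a = gamma_total / grand_total

print(round(p_a_given_b, 3), round(p_a, 3))
# Under independence the two values would be (roughly) equal.
print("gap:", round(abs(p_a_given_b - p_a), 3))
```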

a) A histogram with a fitted normal curve is given below for the variable:

b) From the histogram with the normal curve overlaid, the distribution is close to normal, as the shape is close to a bell shape, if not perfect. There are some outliers towards the tail end of the curve and some high values at the start, due to which the distribution is not perfectly symmetric.

c) The sample size is 170, the mean is 84.69 mmHg and the standard deviation is 13.019 mmHg

d) The median is 83 mmHg and the IQR is 15 mmHg

e) Since the distribution is fairly normal and the mean is almost equal to the median, the mean would be an appropriate measure of central tendency and the standard deviation an appropriate measure of spread, as it measures deviation from the mean

a) The two variables are height and weight and these are continuous variables

b) The scatter plot is:

c) From the scatter plot, it is clear that there is a strong, positive linear relationship between the two variables. There seem to be no outliers or leverage points in this scatter plot

d) The correlation coefficient is used to measure the strength and direction of the linear relationship. The correlation coefficient value is:

Correlations

                               Height   Weight
Height   Pearson Correlation   1        .922**
         Sig. (2-tailed)                .000
         N                     170      170
Weight   Pearson Correlation   .922**   1
         Sig. (2-tailed)       .000
         N                     170      170
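To see why a correlation of .922 with N = 170 is reported with a significance of .000, one can compute the usual t-statistic for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2). The table itself does not show this step, so treating it as the test behind the reported significance is an assumption; the sketch uses only the r and N values from the table.

```python
import math

r = 0.922  # Pearson correlation between height and weight (from the table)
n = 170    # number of paired observations (from the table)

# Standard t-statistic for testing "true correlation = 0", with n - 2 df.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

# Share of the variation in one variable explained linearly by the other.
r_squared = r**2

print(f"t = {t_stat:.1f} on {n - 2} degrees of freedom")
print(f"r^2 = {r_squared:.3f}")
```

A t-value this far from zero corresponds to a two-tailed p-value far below 0.001, which is consistent with the table showing it as .000.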

a) The variable is Age and the unit is years

b) Let X be the random variable denoting Age

X ~ N(14, 81), i.e. mean 14 and variance 81, so the standard deviation is 9

P(X > 25)

Standardizing:

= P(Z > (25 - 14)/9)

= P(Z > 1.222)

= 0.111

c) Let x be the age such that

P(X < x) = 0.85

=> P(Z < (x - 14)/9) = 0.85
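Both normal calculations can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8+). This is a sketch under the stated model X ~ N(14, 81); it also solves the equation in c) for x, a step the text above sets up but does not carry out.

```python
from statistics import NormalDist

# X ~ N(14, 81): NormalDist takes the standard deviation, sqrt(81) = 9.
age = NormalDist(mu=14, sigma=9)

# b) P(X > 25) as the upper-tail area.
p_over_25 = 1 - age.cdf(25)

# c) the age x with P(X < x) = 0.85, i.e. the 85th percentile.
x_85 = age.inv_cdf(0.85)

# Same percentile via the standardized form: x = 14 + 9 * z.
z_85 = NormalDist().inv_cdf(0.85)

print(f"P(X > 25) = {p_over_25:.3f}")
print(f"z = {z_85:.3f}, x = {x_85:.1f} years")
```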

a) This is an experimental study, as the subjects are assigned a treatment and then observed over a certain length of time. 19 subjects were given different levels of cranberry juice over a period of three months

b)

i) The response variables are: antioxidant level, total cholesterol, HDL, and triglycerides

ii) The factor is cranberry juice, at three different levels

iii) The sample size is 19 (11 women and 8 men)

c) The 4 principles of experimental design are:

2) Randomize: There was randomization, as the 19 subjects were split randomly into 2 groups of 10 and 9.

3) Replicate: There are no replicates here, as the whole experiment was performed just once

4) Block: There were no homogeneous blocks assigned; there were just 2 different randomly assigned groups with different treatments applied.
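A random split of 19 subjects into groups of 10 and 9, as described under "Randomize", could be carried out along these lines. The subject IDs, the seed, and the group names are hypothetical; the study's actual randomization procedure is not described here.

```python
import random

# Hypothetical subject IDs 1..19; a fixed seed only makes the sketch repeatable.
rng = random.Random(2024)
subjects = list(range(1, 20))

# Shuffle a copy of the subject list, then cut it into groups of 10 and 9.
shuffled = rng.sample(subjects, k=len(subjects))
group_a = sorted(shuffled[:10])
group_b = sorted(shuffled[10:])

print("Group A:", group_a)
print("Group B:", group_b)
```

Because assignment depends only on the shuffle, every subject has the same chance of landing in either group, which is what guards against systematic differences between the groups.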

d) A confounding variable is one which is directly or indirectly related to both the independent variable and the dependent variable. A plausible confounding variable could be the number of hours of walking during the experimental study. It is directly linked to cholesterol level, so unless the number of hours of walking is kept in check, a resulting loss of cholesterol can be confounded with the effect of the treatment

Question 6

a) An appropriate model here would be Binomial. Let X be the number of children aged 5-17 who are overweight or obese. X ~ Bin(20, 0.25), where 20 is the sample size and 0.25 is the probability of success of the event.

b) The conditions are:

i) The sample size is fixed, n = 20

ii) The observations are independent, as these are 20 independent children aged 5-17

iii) Each child is either overweight or obese, or not

iv) The probability of a child being obese or overweight is 0.25 for each child

c) Mean = np = 20 * 0.25 = 5

Variance = np(1 - p) = 20 * 0.25 * 0.75 = 3.75

d) Required probability: P(X >= 5)

= 1 - P(X <= 4)

= 1 - Σ (x = 0 to 4) C(20, x) * 0.25^x * 0.75^(20 - x)

= 1 - 0.4148

= 0.585
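The mean, variance, and tail probability for X ~ Bin(20, 0.25) can be verified with the standard library alone. This is a minimal sketch; the helper name `binom_pmf` is an assumption introduced for illustration.

```python
from math import comb

n, p = 20, 0.25

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

mean = n * p
variance = n * p * (1 - p)

# P(X <= 4) is the sum of the pmf over x = 0, 1, 2, 3, 4.
p_at_most_4 = sum(binom_pmf(k, n, p) for k in range(5))

# Complement rule: P(X >= 5) = 1 - P(X <= 4).
p_at_least_5 = 1 - p_at_most_4

print(mean, variance)
print(f"P(X <= 4) = {p_at_most_4:.4f}")
print(f"P(X >= 5) = {p_at_least_5:.3f}")
```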

e) In this case, X~Bin(100,0.25)

np=25>=10

np(1-p) = 18.75>=10

Since np and np(1 - p) are both >= 10, by the rule of thumb, a normal approximation to the Binomial is appropriate here.

Let Y be the normal counterpart to the Binomial X. Then

Y ~ N(25, 18.75)

Required probability: P(Y > 29.5) (after applying the continuity correction)

Standardizing:

P(Z > (29.5 - 25)/SQRT(18.75))
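The standardized probability can be evaluated numerically and compared with the exact binomial tail. The sketch assumes the quantity of interest is P(X >= 30), which is what a continuity-corrected cutoff of 29.5 implies; that reading is an inference, not something stated in the text.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.25
mu = n * p                      # 25
sigma = sqrt(n * p * (1 - p))   # sqrt(18.75)

# Normal approximation with continuity correction: P(Y > 29.5).
z = (29.5 - mu) / sigma
approx = 1 - NormalDist().cdf(z)

# Exact binomial tail P(X >= 30), for comparison (assumed target event).
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(30, n + 1))

print(f"z = {z:.3f}")
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

Printing both values side by side shows how close the approximation is when np and np(1 - p) are both at least 10.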


1) Control: Factors such as diet, exercise regime, etc. were not controlled in the study