Tasks: Most of your statistical calculations should be carried out using Excel only. You will use Microsoft Word and Excel to complete this assignment.
1. Select a Random Sample: Select a random sample of size 50 from the given 1000 cases. You will use this sample data to complete tasks 2 to 6. Because random numbers can repeat and each case can be included only once in your sample, some of you may end up with fewer than 50 cases. If so, do not draw another sample to reach 50 values. Continue to work with the sample you have, even though it is smaller than 50.
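For illustration only (the assignment itself is to be completed in Excel), the sampling rule above can be sketched in Python. The function name, the seed and the assumption that cases are numbered 1 to 1000 are hypothetical and not part of the assignment.

```python
import random

def draw_sample_ids(population_size=1000, draws=50, seed=None):
    """Draw `draws` random case numbers, then keep each case only once."""
    rng = random.Random(seed)
    # Random numbers are drawn independently, so repeats are possible.
    drawn = [rng.randint(1, population_size) for _ in range(draws)]
    # Keep the first occurrence of each case number; do NOT top the sample up.
    unique_ids = list(dict.fromkeys(drawn))
    return drawn, unique_ids

drawn, sample_ids = draw_sample_ids(seed=1)  # seed chosen arbitrarily
print(f"{len(drawn)} numbers drawn, {len(sample_ids)} distinct cases kept")
```

If the second count is below 50, that smaller set is the sample used for the remaining tasks.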
2. Descriptive Statistics: Use appropriate data summary methods to describe the bank data in your sample using the six variables. Use an appropriate graphical and/or summary statistical technique, chosen according to the type of variable. These techniques will be chosen from:
Tabular Techniques: Frequency tables and Grouped frequency tables.
Graphical Techniques: Pie chart, Bar graph, Histogram, Frequency Polygon.
Summary Statistics: Mode, Median, Mean, Standard Deviation, Range, Coefficient of Variation and Interquartile Range.
NB: You will need to choose the most appropriate technique(s) for each variable being analysed. Less appropriate/inappropriate techniques will receive fewer/no marks. Do not draw an ogive, a stem plot or a box plot in this assignment.
There are six variables altogether to be analysed.
For a nominal or an ordinal variable, draw a graph and present a frequency table in percentages.
For a ratio or an interval variable, draw a graph and present a summary statistics table.
Try to vary the graphs you draw as much as possible, e.g. pie chart/bar chart or histogram/polygon. Do not draw two different graphs for the same variable.
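As a rough illustration of the two kinds of table described above (not a substitute for the Excel work), the sketch below builds a percentage frequency table for a categorical variable and a summary statistics table for a numeric one. The function names and all data values are made up. Quartile conventions differ between tools, so the interquartile range may not match Excel exactly.

```python
from collections import Counter
import statistics

def percent_frequency_table(categories):
    """Percentage of cases falling in each level of a nominal/ordinal variable."""
    counts = Counter(categories)
    total = len(categories)
    return {level: 100 * count / total for level, count in counts.items()}

def summarise(values):
    """Summary statistics for one ratio/interval variable."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile method is an assumption
    return {
        "mode": statistics.mode(values),
        "median": statistics.median(values),
        "mean": mean,
        "std_dev": sd,
        "range": max(values) - min(values),
        "coef_variation_pct": 100 * sd / mean,
        "iqr": q3 - q1,
    }

# Made-up example data, not taken from the bank data set.
account_types = ["Visa", "Other", "Visa", "Visa"]
ages = [23, 35, 35, 41, 52, 29, 35, 47, 60, 38]

print(percent_frequency_table(account_types))
print(summarise(ages))
```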
3. Confidence Intervals: Estimate the following quantities, using 95% confidence intervals. Explain the meaning of your confidence intervals.
(i) The average age of account holders for the open accounts only.
(ii) The average total transaction dollars for all accounts.
Compare both intervals with their respective true means. The computations and output should be placed in an appendix.
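A minimal sketch of a 95% confidence interval for a mean is given below, for illustration only. It assumes a normal (z) critical value, which is an approximation. An interval based on a t critical value with n - 1 degrees of freedom would be slightly wider for a sample of about 50. The function name and the data are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the mean (normal critical value)."""
    n = len(values)
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / math.sqrt(n)
    # z critical value, e.g. about 1.96 for 95%; a t value is an alternative.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * std_error
    return mean - margin, mean + margin

# Made-up ages, not taken from the bank data set.
ages = [23, 35, 35, 41, 52, 29, 35, 47, 60, 38]
lower, upper = mean_confidence_interval(ages)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

The interval can then be checked against the true mean of all 1000 cases to see whether it captures it.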
4. Hypothesis Testing:
(a) It is often felt that female account holders have different account balances on average than male account holders. For this hypothesis only consider OPEN accounts. Investigate this contention by carrying out an appropriate hypothesis test.
(b) It is often felt that the average transaction dollars differ for Visa and non-Visa accounts. Use ACCOUNT TYPE for this test.
Only report a non-technical explanation of your methodology and your findings in the main section of the report. The computations and output should be placed in an appendix.
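One possible way to compare two group means is a two-sample t test that does not assume equal variances (Welch's test). The sketch below is an illustration under that assumption and is not necessarily the test the assignment intends. It computes only the test statistic and its approximate degrees of freedom, which would then be compared with a critical value from a t table. The function name and data are made up.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch two-sample t statistic and approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variances (n - 1 divisor)
    var_b = statistics.variance(sample_b)
    se_sq = var_a / n_a + var_b / n_b
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_sq)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t_stat, df

# Made-up balances for two groups, not taken from the bank data set.
female_balances = [1200.0, 850.0, 1430.0, 990.0, 1610.0]
male_balances = [1100.0, 1320.0, 760.0, 1540.0, 1010.0]
t_stat, df = welch_t(female_balances, male_balances)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```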
5. Correlation and Regression
In this section you will investigate the relationship between the transaction dollars and age of the account holders for OPEN accounts only.
Using these two variables, develop a regression model to predict transaction dollars from the account holder's age. Make sure that you undertake a full regression analysis, with appropriate discussion, and include:
• a scattergram and a brief discussion
• an estimate of the linear regression model
• the coefficients of correlation and determination
• a test of the hypothesis that there is no linear relationship between transaction dollars and the age of an account holder.
The computations and output should be placed in an appendix.
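The numerical parts of the regression analysis listed above can be sketched as follows, for illustration only. The code fits a least-squares line, computes the coefficients of correlation and determination, and forms the t statistic for testing that the slope is zero (n - 2 degrees of freedom). The function name and data are hypothetical, and the sketch assumes the fit is not perfect (r squared below 1).

```python
import math

def simple_linear_regression(x, y):
    """Least-squares line y = intercept + slope * x, with r, r^2 and slope t statistic."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    # t statistic for H0: no linear relationship (slope = 0), df = n - 2.
    t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))
    return {"slope": slope, "intercept": intercept,
            "r": r, "r_squared": r ** 2, "t_stat": t_stat, "df": n - 2}

# Made-up ages and transaction dollars, not taken from the bank data set.
ages = [23, 35, 41, 52, 29, 47, 60, 38]
dollars = [310.0, 420.0, 380.0, 610.0, 290.0, 500.0, 640.0, 450.0]
print(simple_linear_regression(ages, dollars))
```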