Visit the Australian Securities Exchange (ASX) website, www.asx.com.au, and from the “Prices and research” drop-down menu select “Company information”. Type in the ASX code “CCL” (Coca-Cola Amatil Limited) and find out details about the company. Your task is to obtain the opening price of a CCL share for every quarter from January 2001 to December 2015. If you are working with monthly prices, read the values at the beginning of every quarter (January, April, July, October) for every year from 2001 to 2015. Finding the information on an appropriate website is part of the assignment task. If you are unable to do so, you may read the values from the chart provided below, obtained from Etrade Australia. Reading from the chart will not be accurate, and you may expect around 60 percent of the marks with such inaccuracy. After you have recorded the share prices, answer the following questions:

(a) List all the values in a table and then construct a stem-and-leaf display for the data.

(b) Construct a relative frequency histogram for these data with equal class widths, the first class being “$4 to less than $6”.

(c) Briefly describe what the histogram and the stem-and-leaf display tell you about the data. What would be the effect of doubling the class width, so that the first class becomes “$4 to less than $8”?

(d) What proportion of stock prices were above $10?
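As a minimal sketch of the mechanics behind parts (a), (b) and (d), the following Python shows one way to build a stem-and-leaf grouping and relative frequencies for equal-width classes. The prices listed are hypothetical placeholders, not CCL data, and the function names `stem_and_leaf` and `relative_frequencies` are illustrative choices rather than anything prescribed by the assignment. Stems are assumed to be whole dollars and leaves the first decimal digit.

```python
from collections import defaultdict

# Hypothetical prices for illustration only; replace with the recorded CCL values.
prices = [4.5, 5.2, 6.1, 7.8, 7.9, 9.0, 10.4, 12.3]

def stem_and_leaf(values):
    """Group values by whole-dollar stem, with the first decimal digit as the leaf."""
    stems = defaultdict(list)
    for v in sorted(values):
        tenths = round(v * 10)  # price expressed in units of 10 cents
        stems[tenths // 10].append(tenths % 10)
    return dict(stems)

def relative_frequencies(values, start=4.0, width=2.0):
    """Relative frequency of each class [lower, upper), all classes of equal width."""
    counts = defaultdict(int)
    for v in values:
        counts[int((v - start) // width)] += 1
    n = len(values)
    return {
        (start + k * width, start + (k + 1) * width): c / n
        for k, c in sorted(counts.items())
    }

for stem, leaves in stem_and_leaf(prices).items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")

for (lower, upper), rel in relative_frequencies(prices).items():
    print(f"${lower:g} to less than ${upper:g}: {rel:.3f}")

# Part (d): share of observations strictly above $10.
proportion_above_10 = sum(p > 10 for p in prices) / len(prices)
print(f"Proportion above $10: {proportion_above_10:.3f}")
```

Calling `relative_frequencies(prices, width=4.0)` gives the doubled class width asked about in part (c), which makes it easy to compare how much detail each choice of width hides.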

(a) Compute the mean, median, first quartile, and third quartile for each capital city (use only the data provided for that city; do not add or delete values for any new or given suburb of Question 2) using the exact position, (n+1)f, where n is the number of observations and f is the relevant fraction for the quartile.

(b) Compute the standard deviation, range and coefficient of variation from the sample data for each city.

(c) Draw a box and whisker plot for the median weekly rents of each city and put them side by side on the same scale so that the prices can be compared.

(d) Compare the box plots and comment on the distribution of the data.
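The exact-position rule in part (a) and the measures in part (b) can be sketched as follows. The rents are hypothetical values for a single city, and `value_at_position` is an illustrative helper name. It assumes that a fractional position is resolved by linear interpolation between the two neighbouring ordered observations.

```python
import statistics

# Hypothetical median weekly rents for one city (illustration only).
rents = [300, 320, 350, 380, 400, 420, 450, 500]

def value_at_position(data, fraction):
    """Value at the 1-based position (n + 1) * fraction of the sorted data,
    interpolating linearly when the position is not a whole number."""
    ordered = sorted(data)
    n = len(ordered)
    position = (n + 1) * fraction
    lower = int(position)          # whole part of the position
    remainder = position - lower   # fractional part of the position
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + remainder * (ordered[lower] - ordered[lower - 1])

q1 = value_at_position(rents, 0.25)
median = value_at_position(rents, 0.50)
q3 = value_at_position(rents, 0.75)

mean = statistics.mean(rents)
std_dev = statistics.stdev(rents)            # sample standard deviation (n - 1 divisor)
data_range = max(rents) - min(rents)
coeff_of_variation = std_dev / mean * 100    # expressed as a percentage

print(q1, median, q3)
print(mean, round(std_dev, 2), data_range, round(coeff_of_variation, 2))
```

With n = 8 the first quartile sits at position 2.25, so it lies a quarter of the way from the second to the third ordered value. The five numbers needed for the box and whisker plot in part (c) are the minimum, `q1`, `median`, `q3` and the maximum.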

(a) What is the probability that an Australian household, randomly selected, uses solar as a source of energy?

(b) What is the probability that an Australian household, randomly selected, uses mains gas and is located in Victoria?

(c) Given that a household uses LPG/bottled gas, what is the probability that the household is located in South Australia?

(d) Is the percentage of Australian households using mains gas independent of the state?
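The four probabilities above are a marginal, a joint and a conditional probability, followed by an independence check. A minimal sketch, assuming the table is a two-way table of household counts by state and energy source, is given below. The counts, the two states shown and the category labels are all hypothetical. If the table is supplied as percentages, the same ratios apply after dividing by 100.

```python
import math

# Hypothetical household counts by state and energy source (illustration only).
counts = {
    "VIC": {"mains_gas": 60, "lpg": 10, "solar": 30},
    "SA": {"mains_gas": 30, "lpg": 20, "solar": 50},
}
total = sum(sum(row.values()) for row in counts.values())

def p_source(source):
    """Marginal probability of an energy source across all states."""
    return sum(row[source] for row in counts.values()) / total

def p_state(state):
    """Marginal probability of a state."""
    return sum(counts[state].values()) / total

def p_joint(state, source):
    """Joint probability of a state and an energy source."""
    return counts[state][source] / total

def p_state_given_source(state, source):
    """Conditional probability P(state | source) = P(state and source) / P(source)."""
    return p_joint(state, source) / p_source(source)

print(p_source("solar"))
print(p_joint("VIC", "mains_gas"))
print(p_state_given_source("SA", "lpg"))

# Independence requires P(state and source) = P(state) * P(source) for every cell.
independent = all(
    math.isclose(p_joint(s, src), p_state(s) * p_source(src))
    for s in counts
    for src in counts[s]
)
print(independent)
```

For part (d) it is enough to find one cell where the joint probability differs from the product of the marginals to conclude that source and state are not independent.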

Question 4 (4 Marks)

(a) The following data, collected from the Australian Bureau of Meteorology website, give the daily rainfall for the year 2015 in Brisbane. Zero values indicate no rainfall, and the left-most column gives the date. There are 52 weeks in a year, and a week is assumed to start on Monday. The first week starts on 29 December 2014; you are expected to visit the website and obtain the daily values that are not given in the table below. Make sure you enter the correct station number. Ignore the last few days of 2015 if they extend beyond 52 weeks. Assuming that the weekly rainfall event (the number of days in a week with rainfall) follows a Poisson distribution:

(i) What is the probability that in any given week of the year there would be no rainfall?

(ii) What is the probability that there will be 2 or more days of rainfall in a week?
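One way to approach part (a) is to estimate the Poisson rate as the average number of rainy days per week and then evaluate the Poisson probabilities directly. The sketch below uses two weeks of hypothetical daily values rather than the Brisbane data, and assumes the list starts on a Monday so that consecutive blocks of seven values form the weeks. The helper names are illustrative.

```python
import math

# Hypothetical daily rainfall in mm, starting on a Monday (illustration only).
daily_mm = [0, 1.2, 0, 0, 3.4, 0, 0,
            0, 0, 0, 5.0, 0, 0, 0]

def rainy_days_per_week(daily):
    """Number of days with rainfall above zero in each complete 7-day block."""
    complete = len(daily) - len(daily) % 7   # drop trailing days beyond whole weeks
    return [
        sum(1 for mm in daily[i:i + 7] if mm > 0)
        for i in range(0, complete, 7)
    ]

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

weekly_counts = rainy_days_per_week(daily_mm)
lam = sum(weekly_counts) / len(weekly_counts)   # estimated mean rainy days per week

p_no_rain = poisson_pmf(0, lam)
p_two_or_more = 1 - poisson_pmf(0, lam) - poisson_pmf(1, lam)

print(weekly_counts, lam)
print(round(p_no_rain, 4), round(p_two_or_more, 4))
```

The second probability uses the complement rule, P(X ≥ 2) = 1 − P(X = 0) − P(X = 1), which avoids summing an open-ended tail.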


(b) Assuming that the weekly total amount of rainfall (in mm) from the data provided in part (a) has a normal distribution, compute the mean and standard deviation of weekly totals.

(i) What is the probability that in a given week there will be between 5 mm and 10 mm of rainfall?

(ii) What is the amount of rainfall if only 13% of the weeks have that amount of rainfall or higher?
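The two normal-distribution calculations in part (b) can be sketched with `statistics.NormalDist` from the Python standard library. The weekly totals below are hypothetical, and the sample mean and sample standard deviation are used as the parameters of the assumed normal distribution.

```python
import statistics
from statistics import NormalDist

# Hypothetical weekly rainfall totals in mm (illustration only).
weekly_totals = [0.0, 12.4, 3.2, 25.0, 8.6, 0.0, 40.2, 5.8]

mu = statistics.mean(weekly_totals)
sigma = statistics.stdev(weekly_totals)   # sample standard deviation
rain = NormalDist(mu, sigma)

# (i) P(5 < X < 10) is the difference of two cumulative probabilities.
p_between = rain.cdf(10) - rain.cdf(5)

# (ii) If 13% of weeks reach the amount or more, 87% of weeks fall below it,
# so the amount is the 87th percentile of the distribution.
threshold = rain.inv_cdf(1 - 0.13)

print(round(mu, 2), round(sigma, 2))
print(round(p_between, 4), round(threshold, 2))
```

The same answers can be obtained by hand by standardising with z = (x − μ)/σ and reading the standard normal table; the code simply carries out those lookups numerically.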

The following data are taken from the UCI machine learning data repository. They list a few attributes of red wine, randomly sampled from thousands of bottles, which can be classified as being of good, medium or poor quality.

(a) Test for normality of all the variables separately for good wine using a normal probability plot.

(b) Construct a 95% confidence interval for each of the variables for good wine.

(c) Find the mean of each of the variables for medium quality red wine. Do the same for the poor quality red wine.

(d) Check if the means calculated for the medium and poor quality red wines fall within the corresponding confidence intervals of the good quality wine. For those attributes whose means lie outside the confidence interval, the attributes are significant in determining the quality. This assumption is, however, partially compromised if the attribute fails the normality test. Identify the significant and non-significant variables, and comment.
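Parts (b) and (d) can be sketched as a t-based confidence interval followed by a simple containment check. The sample values, the attribute they stand for and the comparison means are hypothetical. The critical value 2.262 is the two-sided 95% t value for 9 degrees of freedom read from a t-table, so it must be replaced when the sample size differs. The function names are illustrative.

```python
import math
import statistics

# Hypothetical values of one attribute for 10 good-quality wines (illustration only).
good_sample = [11.0, 11.5, 12.0, 12.5, 11.8, 12.2, 11.6, 12.4, 11.9, 12.1]

def t_confidence_interval(sample, t_crit):
    """Confidence interval for the mean: mean +/- t_crit * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    margin = t_crit * statistics.stdev(sample) / math.sqrt(n)
    return mean - margin, mean + margin

def lies_within(value, interval):
    """True when value falls inside the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high

T_CRIT_95_DF9 = 2.262   # from a t-table: 95% two-sided, n - 1 = 9 degrees of freedom
interval = t_confidence_interval(good_sample, T_CRIT_95_DF9)

# Hypothetical attribute means for the other two quality groups.
medium_mean = 11.7
poor_mean = 10.5

print(tuple(round(x, 3) for x in interval))
print(lies_within(medium_mean, interval), lies_within(poor_mean, interval))
```

Under the rule stated in part (d), an attribute whose medium or poor mean falls outside the interval would be flagged as significant, subject to the caveat about attributes that fail the normality test in part (a).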

2018-03-05
Assignment