Omics Course: Assignment Univariate analysis and clustering

Dr. Jeroen Jansen and MSc. Gerjen Tinnevelt

Univariate analysis (3.3 points)

Data source:

http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-887/

Experimental background:

Paper of this study (use for background reading only!)

Use Excel to analyse these microarray data, using the file ‘assignment data.xlsx’ provided to you on Blackboard.

Figure 1: Fulvestrant (source: http://en.wikipedia.org/wiki/File:Fulvestrant.svg)

These data are taken from a paper on the effect of the medicine ‘Fulvestrant’ on the development of mammary tumours in women. The women were measured before treatment (t = 0) and after four weeks of treatment. They received either a low dose (250 mg) or a high dose (500 mg) of this medication. The data given to you are from the women who received the high dose, measured before and after 4 weeks of treatment.

Use a separate t-test for each gene to analyse the difference in transcription between the measurements before and after treatment.

1. Calculate a p-value for each gene with the Excel function T.TEST. Discuss whether you need a one-tailed or a two-tailed test (do this by defining an alternative hypothesis and a null hypothesis). Assume that the people measured before the experiment are different from those measured afterwards, i.e. that the samples are unpaired.
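
The assignment asks for Excel, where T.TEST(array1, array2, tails, type) takes tails = 1 or 2 and type = 3 for an unpaired test with unequal variances. As a cross-check outside Excel, the same calculation can be sketched in Python with only the standard library. This is a minimal sketch and not part of the course material: the function names are hypothetical, the unequal-variance (Welch) form is an assumption (type = 2 in Excel would assume equal variances instead), and the tail probability comes from simple numerical integration, so very small p-values are only approximate.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_sf(t, df, steps=2000):
    """Upper-tail probability P(T > t) for t >= 0 (Simpson's rule on [0, t])."""
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 0.5 - total * h / 3)


def welch_t_test(a, b, tails=2):
    """Unpaired, unequal-variance t-test; returns (t, df, p)."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    p_one = t_sf(abs(t), df)
    # The one-tailed value assumes the observed direction matches
    # the direction stated in the alternative hypothesis.
    return t, df, (2 * p_one if tails == 2 else p_one)


if __name__ == "__main__":
    # Made-up expression values for one gene, before and after treatment.
    before = [7.1, 6.8, 7.4, 7.0, 6.9]
    after = [6.2, 6.5, 6.0, 6.4, 6.1]
    print(welch_t_test(before, after, tails=2))
```

In practice the function would be applied once per gene (one row of the data sheet at a time), giving one p-value per gene.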

2. Perform a Bonferroni correction. Explain in words exactly what the result of this correction means.
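
To make the arithmetic of the Bonferroni correction concrete, here is a minimal Python sketch. The function name and the example p-values are hypothetical, and alpha = 0.05 is taken from the threshold mentioned later in this assignment. With m tests, each p-value is either compared against alpha / m or, equivalently, multiplied by m (capped at 1) and compared against alpha.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-values, significance flags) for m simultaneous tests."""
    m = len(p_values)
    # Adjusted p-value: raw p-value times the number of tests, capped at 1.
    adjusted = [min(1.0, p * m) for p in p_values]
    # Equivalent decision rule: compare each raw p-value with alpha / m.
    significant = [p <= alpha / m for p in p_values]
    return adjusted, significant


if __name__ == "__main__":
    # Made-up p-values for four genes.
    adjusted, significant = bonferroni([0.001, 0.02, 0.04, 0.3])
    print(adjusted, significant)
```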

3. Perform an FDR (false discovery rate) correction on these p-values. Show the effect of the correction on the selected biomarkers in a figure. Why is this correction more sensible than Bonferroni?
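
The assignment does not name a specific FDR method. Assuming the Benjamini-Hochberg procedure is intended, the adjustment can be sketched as follows. The function name and example values are hypothetical. The p-values are ranked from smallest to largest, each is scaled by m / rank, and a running minimum from the largest rank downwards keeps the adjusted values monotone.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never decrease
    # with increasing rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up p-values for four genes.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A gene is then selected when its adjusted value is at or below the chosen FDR level, for example 0.05.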

4. a. Make a volcano plot of this analysis. Indicate which genes are significant.

b. You have selected a threshold for the p-value of 0.05. What is the effect of increasing this threshold? Does that make sense in the context of microarray analysis?

c. Describe what knowledge you would need to select a minimum fold-change threshold.
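
The two coordinates of a volcano plot can be sketched per gene as below. This is an illustration under stated assumptions: the function names are hypothetical, it is not known from the assignment whether the expression values are already log2-transformed (hence the `already_log2` switch), and the fold-change cut-off of 1 on the log2 scale (a two-fold change) is only a placeholder, since choosing it is the subject of question 4c.

```python
import math
from statistics import mean


def volcano_point(before, after, p_value, already_log2=True):
    """Return (log2 fold change, -log10 p) for one gene."""
    if already_log2:
        # On the log2 scale a fold change is a difference of means.
        log2_fc = mean(after) - mean(before)
    else:
        log2_fc = math.log2(mean(after) / mean(before))
    return log2_fc, -math.log10(p_value)


def is_hit(log2_fc, neg_log10_p, alpha=0.05, min_abs_log2_fc=1.0):
    """True if the gene passes both the p-value and the fold-change threshold."""
    return neg_log10_p >= -math.log10(alpha) and abs(log2_fc) >= min_abs_log2_fc


if __name__ == "__main__":
    # Made-up log2 expression values and p-value for one gene.
    x, y = volcano_point([8.0, 8.2, 7.9], [10.1, 9.8, 10.0], 0.001)
    print(x, y, is_hit(x, y))
```

Raising alpha lowers the horizontal cut-off line at -log10(alpha), so more genes end up above it.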

Clustering (3.3 points)

This section uses the program Cluster 3.0, installed on your computer, and the program TreeView. You can try out the program with the file ‘wijn.txt’. The Cluster 3.0 program contains an excellent manual, which tells you step by step what to do. Follow the steps described below for the file ‘Example data.txt’ to cluster this data.

The data used for the assignment are described in Eisen et al., a paper that has been made available to you on Blackboard. This paper also describes how the authors analysed these data with an earlier version of this program.

- Open the program Cluster and load the data ‘Example data.txt’. The data is already in an appropriate format for direct use.

- ‘Filter Data’ tab: Select which genes to include: discard all missing values.

- Skip the ‘Adjust Data’ tab.

- Perform a ‘Hierarchical’ clustering on this data. Cluster both the genes and the arrays. Choose appropriate distances for this clustering.

Note: if you redo an analysis, I advise you to restart the Cluster program to clear the memory.

- After the clustering has completed, start the TreeView program and load the .cdt file you just made.

- Look at the cluster plot with the TreeView program. If your cluster plot is black, go to ‘pixel settings’ in the ‘Settings’ top menu bar and adjust the contrast.
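
To illustrate what the hierarchical clustering step does conceptually, here is a small pure-Python sketch. It is not how Cluster 3.0 is implemented. The choice of distance (1 minus the Pearson correlation) and of average linkage are assumptions made for the example, the function names are hypothetical, and the data are assumed to contain no missing values and no constant rows.

```python
import math
from itertools import combinations


def pearson_distance(x, y):
    """1 - Pearson correlation: near 0 for similar shapes, 2 for opposite ones."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)  # fails for constant profiles


def average_linkage(profiles, distance=pearson_distance):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        def linkage(pair):
            # Average of all pairwise distances between the two clusters.
            a, b = pair
            total = sum(distance(profiles[i], profiles[j]) for i in a for j in b)
            return total / (len(a) * len(b))

        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        # Replace the two merged clusters by their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


if __name__ == "__main__":
    # Made-up expression profiles: rows are genes, columns are arrays.
    genes = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [8, 6, 4, 3]]
    for step in average_linkage(genes):
        print(step)
```

The merge history is what a dendrogram draws: each merge is a node whose height is the linkage distance. Clustering the arrays instead of the genes amounts to running the same function on the transposed table, for example `average_linkage(list(zip(*genes)))`.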

1. Now select a cluster in the Example data and carefully explain what this cluster tells us about the data.

2. Redo the analysis with the alpha data: ‘example data_alpha.txt’.

3. Using the results given in the lecture sheets, discuss what effect you expect clustering the time points to have. Do you see such clusters in your cluster diagrams?

Final assignment (3.3 points)

As a final assignment: try to explain IN WORDS the differences and similarities between univariate and cluster analysis, by comparing the 2-D dendrogram with the volcano plot of the wine data described in the lecture sheets given below. Refer to the described characteristics of a biomarker.

[Figure: lecture sheets on the wine data. Volcano plot axes: −log10(p) and log2 fold change. Variables shown: Alcohol, Malic acid, Ash, Alkalinity of ashes, Magnesium, Total phenols, Flavonoids, Non-flavonoid phenols, Proanthocyanidins, Color, Hue, Optical density 280/315 nm of diluted wines, Proline.]