    Omics Course: Assignment Univariate analysis and clustering
    Dr. Jeroen Jansen and MSc. Gerjen Tinnevelt
    Univariate analysis (3.3 points)
    Data source:
    http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-887/
    Experimental background:
    Paper of this study (use for background reading only!)
    Use Excel for the analysis of this microarray data, using the file ‘assignment data.xlsx’ provided to you on Blackboard.
    Figure 1 http://en.wikipedia.org/wiki/File:Fulvestrant.svg
    This data is taken from a paper on the effect of a medicine ‘Fulvestrant’ on the development of mammary tumours in human females. Women were measured before treatment (t=0) and four weeks after treatment. They received either a low dose (250 mg) or a high dose (500 mg) of this medication. The data given to you are of the women who received a high dose and who are measured before and after 4 weeks of treatment.
    Analyse, with a separate t-test for each gene, the difference in transcription before and after treatment.
    1. Calculate a p-value for each gene with the Excel function T.TEST. Discuss whether you need a one- or two-tailed test (do this by defining a null hypothesis and an alternative hypothesis). Assume that the women measured before treatment are different individuals from those measured afterwards.
    2. Perform a Bonferroni correction. Explain exactly in words, what the result of this correction means.
    3. Perform an FDR correction on these p-values. Show the effect of the correction on the selected biomarkers in a figure. Why is this correction more sensible than Bonferroni?
    4. a. Make a volcano plot of this analysis. Indicate which genes are significant.
       b. You have selected a threshold for the p-value of 0.05. What is the effect of increasing this threshold? Does that make sense in the context of microarray analysis?
       c. Describe which knowledge you would need to select a minimal fold change threshold.
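    The assignment itself is to be done in Excel, but the two corrections and the volcano-plot coordinates from questions 2 to 4 can be sketched in a few lines of Python as a cross-check. This is a minimal sketch under stated assumptions: the FDR procedure shown is Benjamini-Hochberg (one common choice; the assignment does not name a specific procedure), the p-values are invented for illustration, and `volcano_point` assumes raw intensities that are not already log-transformed.

```python
import math

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def volcano_point(mean_before, mean_after, p_value):
    """Return (log2 fold change, -log10 p) for one gene.

    Assumes raw (non-log) intensities; for log2 data the fold change
    would be the difference of the two means instead.
    """
    return math.log2(mean_after / mean_before), -math.log10(p_value)

# Hypothetical p-values for five genes, invented for illustration only.
p = [0.001, 0.008, 0.02, 0.041, 0.60]
alpha = 0.05

bonf = bonferroni(p)
bh = benjamini_hochberg(p)
n_bonf = sum(q <= alpha for q in bonf)  # genes kept after Bonferroni
n_bh = sum(q <= alpha for q in bh)      # genes kept after Benjamini-Hochberg
print(n_bonf, n_bh)
```

    With these made-up p-values, Bonferroni keeps two genes and Benjamini-Hochberg keeps three, which illustrates why the FDR correction is the less conservative of the two.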
    Clustering (3.3 points)
    This section uses the program Cluster 3.0, installed on your computer, and the program TreeView. You can try out the program with the file ‘wijn.txt’. The Cluster 3.0 program contains an excellent manual, which tells you what to do step by step. Follow the steps described below to cluster the file ‘Example data.txt’.
    The data used for the assignment is described in Eisen et al., a paper that has been made available to
    you on Blackboard. This paper also contains a description about how they did the analysis of this data
    with an earlier version of this program.
    - Open the program Cluster and load the data ‘Example data.txt’. The data is already in an
    appropriate format for direct use.
    - ‘Filter Data’ tab: Select which genes to include: discard all missing values.
    - Skip the ‘Adjust Data’ tab.
    - Perform a ‘Hierarchical’ clustering on this data. Cluster both the genes and the arrays. Choose
    appropriate distances for this clustering.
    Note: if you re-do an analysis, I advise you to re-start the cluster program to clear the memory.
    - After the clustering has completed, start the TreeView program and load the .cdt file you just made.
    - Look at the cluster plot with the TreeView program. If your cluster plot is black, go to ‘pixel settings’ in the ‘Settings’ top menu bar and adjust the contrast.
    1. Now select a cluster in the Example data and carefully explain what this cluster tells us about
    the data.
    2. Re-do the analysis with the alpha data: ‘example data_alpha.txt’.
    3. Using the results given in the lecture sheets, discuss what effect you expect clustering the time points to have. Do you see such clusters in your cluster diagrams?
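    To make the choice of ‘appropriate distances’ more concrete, the following is a minimal sketch of hierarchical clustering in plain Python. It is not the Cluster 3.0 implementation: the distance (1 minus the Pearson correlation) and the average-linkage rule are assumptions chosen for illustration, and the four gene profiles are invented.

```python
import math

def pearson_distance(x, y):
    """1 - Pearson correlation: small when two profiles rise and fall together.

    Fails with a division by zero if a profile has zero variance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def average_linkage(profiles, distance):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    clusters = [[i] for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(distance(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

# Hypothetical expression profiles (rows = genes, columns = arrays).
genes = [
    [1.0, 2.0, 3.0, 4.0],  # gene 0: rising
    [2.0, 4.1, 6.0, 8.2],  # gene 1: rising, larger amplitude
    [4.0, 3.0, 2.1, 1.0],  # gene 2: falling
    [8.0, 6.2, 4.0, 2.0],  # gene 3: falling, larger amplitude
]
merges = average_linkage(genes, pearson_distance)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

    Because the correlation-based distance ignores amplitude, the two rising genes join one cluster and the two falling genes join another, and the two clusters only merge at a distance close to 2. A Euclidean distance would instead be sensitive to the difference in amplitude, which is one reason the choice of distance deserves discussion.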
    Final assignment (3.3 points)
    As a final assignment: try to explain IN WORDS the differences and similarities between univariate and
    cluster analysis, by comparing the 2-D dendrogram with the volcano plot of the wine data described
    in the lecture sheets given below. Refer to the described characteristics of a biomarker.
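    [Figures: volcano plot (axes: −log10(p) and log2 fold change) and 2-D dendrogram of the wine data. Variable labels: Alcohol, Malic acid, Ash, Alkalinity of Ashes, Magnesium, Total phenols, Flavonoids, Non-flavonoid phenols, Proanthocyanidins, Color, Hue, Optical Density 280/315 nm of diluted wines, Proline.]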