DESeq package


Maia de Oliveira, Julio ▴ 20

@maia-de-oliveira-julio-5557

Last seen 9.6 years ago

Dear all,

I am a PhD student in plant physiology at Wageningen UR and I need your help. I recently obtained some proteomics data and was not satisfied with the conventional analysis methods, so I decided to try the DESeq package with some modifications. For example, I did not normalize the data by the number of sequences per gene, as one would for RNA-seq data. This is what my data look like:

ID                    C_WT         C_WT         C_WT         PEG_WT       PEG_WT       PEG_WT
AT3G21380_eID: 0005   32226632.41  19278731.46  25208573.73  21830699.73  19152002.48  25693292.2
AT3G51800_eID: 0005   32226632.41  19278731.46  25208573.73  21830699.73  19152002.48  25693292.2
AT4G14960_eID: 0007   29369953.85  23428679.56  35486655.03  22267518.14  20372442.03  20090671.03
AT5G19770_eID: 0007   29369953.85  23428679.56  35486655.03  22267518.14  20372442.03  20090671.03
AT5G19780_eID: 0007   29369953.85  23428679.56  35486655.03  22267518.14  20372442.03  20090671.03
AT3G58610_eID: 0021   5589074.283  5418898.375  8965797.349  16127244.2   13529093.34  11655398.67
AT4G01870_eID: 0027   2753358.267  3657843.548  5927653.877  7214439.251  3256542.178  4595558.56
AT3G11930_eID: 0029   23168859.54  18555307.69  26202779.43  21907172.5   25295000.28  24971871.92
AT4G28520_eID: 0029   23168859.54  18555307.69  26202779.43  21907172.5   25295000.28  24971871.92
AT3G21370_eID: 0033   23482254.09  14374658.31  27199081.02  16256182.11  10129321.89  15323853.37

Basically, I have a table of proteins with their relative abundances in each treatment. I am not familiar with all the maths behind the method, so I was wondering whether the approach described in your paper (quoted below) for calculating fold changes in expression/abundance is correct for this type of data. I ask because the number of significant changes differs between my "manual" calculation and nbinomTest.

"Having estimated the dispersion for each gene, it is straight-forward to look for differentially expressed genes. To contrast two conditions, e.g., to see whether there is differential expression between conditions "untreated" and "treated", we simply call the function nbinomTest. It performs the tests as described in [1] and returns a data frame with the p-values and other useful information.

    > res = nbinomTest( cds, "untreated", "treated" )
    > head(res)"

I hope to hear from you soon.

Kind regards,
Julio

--------------------
Julio Maia
Laboratory of Plant Physiology, Department of Plant Sciences
Wageningen University
Building 107, room W1Be065/Desk 16
Droevendaalsesteeg 1, 6708 PB Wageningen
tel: 0031-61-4632069 / 0317482800
E-mail: julio.maiadeoliveira@wur.nl
Website: http://www.wageningenseedlab.nl/
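For readers who want to reproduce a "manual" fold change, here is a minimal sketch in R using three rows copied from the table in the post. The post does not say how the manual calculation was done, so the formula below (ratio of group means, then log2) is an assumption. The row names are shortened to the AGI code for readability, and `abund`, `group` and `log2fc` are illustrative names.

```r
# Three rows copied from the posted table: 3 x C_WT followed by 3 x PEG_WT.
abund <- rbind(
  AT3G58610 = c(5589074.283, 5418898.375, 8965797.349,
                16127244.2, 13529093.34, 11655398.67),
  AT4G01870 = c(2753358.267, 3657843.548, 5927653.877,
                7214439.251, 3256542.178, 4595558.56),
  AT3G21370 = c(23482254.09, 14374658.31, 27199081.02,
                16256182.11, 10129321.89, 15323853.37)
)
group <- rep(c("C_WT", "PEG_WT"), each = 3)

# Assumed "manual" method: mean abundance per condition, then log2 of the ratio.
mean_c <- rowMeans(abund[, group == "C_WT"])
mean_p <- rowMeans(abund[, group == "PEG_WT"])
log2fc <- log2(mean_p / mean_c)

print(round(log2fc, 2))
```

A positive value means higher abundance in PEG_WT than in C_WT. A fold change alone carries no significance information, so it is not directly comparable to the p-values returned by a test.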

Proteomics DESeq • 2.0k views

updated 11.5 years ago by Bernd Fischer ▴ 550 • written 11.5 years ago by Maia de Oliveira, Julio ▴ 20


Bernd Fischer ▴ 550

@bernd-fischer-5348

Last seen 7.3 years ago

Germany / Heidelberg / DKFZ

Dear Julio,

Your data do not look like count data, which is what the DESeq package requires. If you had spectral counts from peptide identifications, DESeq would be appropriate, but that does not seem to be the case here. Your values look like abundance data, either from protein microarrays or from quantitative mass spectrometry.

If the data come from protein microarrays, you should consider using the Bioconductor package limma and, if necessary, normalizing your data beforehand, e.g. with the Bioconductor package vsn.

If the data are mass spectrometry data (e.g. abundances from the MS1 level, or iTRAQ data), you will initially get abundance values per peptide. In this case, I advise you to first take the log-ratios per peptide per sample and then summarize the peptide ratios to a protein ratio (e.g. by the mean). Afterwards, you can use the Bioconductor package limma to test for differential abundance.

Best,
Bernd

--
Bernd Fischer
EMBL Heidelberg
Meyerhofstraße 1
69117 Heidelberg
Tel: +49 [0] 6221 387-8131
E-Mail: bernd.fischer@embl.de
Homepage: http://www-huber.embl.de/users/befische/

On 15.10.2012, at 18:12, "Maia de Oliveira, Julio" <julio.maiadeoliveira@wur.nl> wrote:

> Dear all,
>
> I am a PhD student in plant physiology at Wageningen UR and I need
> your help. Recently I got some proteomics data and I was not
> satisfied with the conventional analysis methods. Because of that I
> decided to try the DESeq package with some modifications.
> [...]
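The mass-spectrometry route described above (log-ratios per peptide, mean per protein, then limma) could be sketched roughly as follows. This is an illustration under stated assumptions, not the exact procedure from the reply: the peptide intensities are simulated, the peptide-to-protein map is invented, and the reference for the log-ratio (each peptide's mean log2 intensity over the control samples) is one possible choice that the reply does not specify. The limma step runs only if the package is installed.

```r
set.seed(1)

# Simulated peptide-level intensities (hypothetical, not from the post):
# 20 proteins x 3 peptides each, 3 control and 3 treated samples.
protein <- rep(sprintf("prot%02d", 1:20), each = 3)
group   <- factor(rep(c("C_WT", "PEG_WT"), each = 3))
n_pep   <- length(protein)
log_sim <- matrix(rnorm(n_pep, mean = 20, sd = 2), nrow = n_pep, ncol = 6) +
  rnorm(n_pep * 6, sd = 0.3)
# Give the first protein a true 2-fold increase under treatment.
up <- protein == "prot01"
log_sim[up, group == "PEG_WT"] <- log_sim[up, group == "PEG_WT"] + 1
pep <- 2^log_sim
dimnames(pep) <- list(paste0("pep", seq_len(n_pep)),
                      paste0(group, "_", rep(1:3, times = 2)))

# Step 1: log-ratio per peptide per sample.
# Assumption: reference = the peptide's mean log2 intensity in the controls.
log_pep   <- log2(pep)
ref       <- rowMeans(log_pep[, group == "C_WT"])
log_ratio <- sweep(log_pep, 1, ref, "-")

# Step 2: summarize peptide log-ratios to one value per protein (mean).
prot_sum   <- rowsum(log_ratio, group = protein)       # per-protein sums
pep_count  <- as.vector(table(protein)[rownames(prot_sum)])
prot_ratio <- prot_sum / pep_count                     # per-protein means

# Step 3: test for differential abundance with limma, if available.
if (requireNamespace("limma", quietly = TRUE)) {
  design <- model.matrix(~ group)   # coefficient 2: PEG_WT versus C_WT
  fit    <- limma::eBayes(limma::lmFit(prot_ratio, design))
  print(head(limma::topTable(fit, coef = 2, number = Inf)))
}
```

Working on the log scale makes the values behave more like the continuous, roughly symmetric data that limma's linear models assume, which is why raw abundances are transformed before testing.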

written 11.5 years ago by Bernd Fischer ▴ 550
