Question

preparing sequencing data for use with anota

Guest User ★ 13k

@guest-user-4897

Last seen 9.7 years ago

I would like to analyse my sequencing data with anota, starting with the function "anotaPerformQc". Regrettably I get the following error message:

    anotaQcOut <- anotaPerformQc(dataT= my_data_cytosolic_mRNA, dataP=my_data_translational_Activity, phenoVec=vec, nDfbSimData=500, useProgBar=TRUE)

    Running anotaPerformQc quality control
    Calculating omnibus interactions & effects and dfbetas
    Error in if (groupSlope[i] > 1 | groupSlope[i] < 0) { : missing value where TRUE/FALSE needed

    > traceback()
    1: anotaPerformQc(dataT = t, dataP = r, phenoVec = vec, nDfbSimData = 500, useProgBar = TRUE)

My input data looks as follows:

    > head(my_data_cytosolic_mRNA)
            1  2 3 4 5 6 7  8
    A2M     3  0 7 0 6 4 5 13
    A2ML1   4 11 3 0 3 1 6  3
    A2MP1   2  2 2 0 0 2 2  6
    A3GALT2 0  1 1 0 0 0 1  3
    A4GALT  0  0 0 0 0 0 0  0
    A4GNT   0  0 3 0 0 0 1  0

    > head(my_data_translational_Activity)
            1 2  3 4 5  6 7 8
    A2M     9 0 18 4 9 41 0 0
    A2ML1   4 5  1 1 0  0 2 0
    A2MP1   0 0  0 0 0  0 0 0
    A3GALT2 2 0  1 0 1  1 5 0
    A4GALT  0 0  0 0 0  0 0 0
    A4GNT   0 0  0 0 0  0 0 0

    > vec
    [1] "wt"  "wt"  "wt"  "wt"  "mut" "mut" "mut" "mut"

I read the anota vignette and reference manual, which mention "groupSlope" in the explanation for the "omniGroupStats" argument. The arguments for the input data are simply described as "data matrix with non numerical rownames". Looking at the sample data provided with the package (see below), I ASSUME I need to process the sequencing count data before I use it within anota.

    > head(anota_example_counts)
          yorf     norm     dens count  len  total
    1 15S_rRNA 1471.349 1261.805  2111 1673 857584
    2 21S_rRNA 1192.194 1022.406  4563 4463 857584
    3     HRA1    0.000    0.000     0  588 857584
    4     LSR1 1548.272 1327.773  1592 1199 857584
    5     NME1  105.715   90.659    33  364 857584

    > head(anota_example_processed)
                  [,1]
    15S_rRNA 5.6848584
    21S_rRNA 5.3864571
    HRA1     0.5289467
    LSR1     5.7882936
    NME1     2.9789340

In the following paper introducing the anota package (http://www.pnas.org/content/107/50/21487.long) I found how the authors processed the sequencing data for analysis:

"For the sequencing dataset, we used the count data supplied by the authors, filtered for identifiers originating from the coding regions, and used quantile normalization and a transformation to stabilize the variance."

In case I am right that my data needs processing first, could somebody please suggest how I do "quantile normalization and a transformation to stabilize the variance" with my data? If the error I get is due to something else, please let me know how to solve my problem. I am new to R and Bioconductor, so please accept my apologies if I have overlooked something obvious.

Thank you very much for your help!

-- output of sessionInfo():

    > sessionInfo()
    R version 3.0.2 (2013-09-25)
    Platform: i386-w64-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252  LC_MONETARY=German_Switzerland.1252  LC_NUMERIC=C
    [5] LC_TIME=German_Switzerland.1252

    attached base packages:
    [1] stats  graphics  grDevices  utils  datasets  methods  base

    other attached packages:
    [1] edgeR_3.4.2  limma_3.18.13  anota_1.10.0  qvalue_1.36.0

    loaded via a namespace (and not attached):
    [1] Biobase_2.22.0  BiocGenerics_0.8.0  MASS_7.3-30  multtest_2.18.0  parallel_3.0.2  splines_3.0.2  stats4_3.0.2  survival_2.37-7
    [9] tcltk_3.0.2  tools_3.0.2

-- Sent via the guest posting facility at bioconductor.org.

Sequencing Normalization PROcess anota • 1.7k views

updated 8.7 years ago by kesarwani.anil • 0 • written 10.1 years ago by Guest User ★ 13k

score 0 · Answer 1 · 2014-03-26

Hi Nils,

You will need to filter away genes without reads (e.g. A4GALT in your data). You will also need to filter away genes that have a standard deviation equal to 0. You will also need to normalize the data, e.g. using the RPKM approach, and then transform it, e.g. using log2.

Let me know if you need more help.

ATB
Ola
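The filtering, RPKM normalization and log2 steps suggested in this answer could be sketched in base R roughly as follows. This is only an illustration under stated assumptions, not the anota authors' own pipeline: the toy matrices, the `gene_length` vector, the helper name `log2_rpkm` and the pseudo-count of 1 are all invented for the example. RPKM needs real gene lengths, which are not given in this thread.

```r
# Toy count matrices (genes x samples); invented values, not the poster's data.
genes   <- c("geneA", "geneB", "geneC", "geneD")
samples <- paste0("s", 1:4)
dataT <- matrix(c( 10, 12,  30,  28,
                    5,  7,   6,   9,
                    0,  0,   0,   0,
                  100, 90, 120, 110),
                nrow = 4, byrow = TRUE, dimnames = list(genes, samples))
dataP <- matrix(c(20, 25, 15, 18,
                   3,  0,  4,  2,
                   0,  0,  0,  0,
                  80, 95, 60, 70),
                nrow = 4, byrow = TRUE, dimnames = list(genes, samples))

# Hypothetical gene lengths in base pairs (assumption; use real annotation).
gene_length <- c(geneA = 2000, geneB = 1500, geneC = 1000, geneD = 3000)

# RPKM = reads / (gene length in kb * library size in millions), then log2.
# The pseudo-count of 1 is an assumption that keeps log2(0) finite.
log2_rpkm <- function(counts, len_bp) {
  per_million <- sweep(counts, 2, colSums(counts) / 1e6, "/")
  rpkm <- per_million / (len_bp[rownames(counts)] / 1e3)
  log2(rpkm + 1)
}

allT <- log2_rpkm(dataT, gene_length)
allP <- log2_rpkm(dataP, gene_length)

# Keep genes that have reads and a non-zero standard deviation in BOTH
# matrices, so that dataT and dataP keep identical rows.
keep <- rowSums(dataT) > 0 & rowSums(dataP) > 0 &
  apply(allT, 1, sd) > 0 & apply(allP, 1, sd) > 0

normT <- allT[keep, , drop = FALSE]
normP <- allP[keep, , drop = FALSE]
```

Here `geneC` is dropped because it has no reads in either matrix. For the quantile normalization mentioned in the question, limma's `normalizeBetweenArrays()` with `method = "quantile"` is one existing option. Whether it suits a given data set is a judgment call.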

score 0 · Answer 2 · 2015-08-06

kesarwani.anil • 0

@kesarwanianil-8573

Last seen 4.7 years ago

United States

I am getting this error when I run "anotaPerformQc" with default parameters. There are no zeros in my matrix. Could you please help me out?

    Error in quantile.default(as.numeric(x), c(0.25, 0.75), na.rm = na.rm,
      missing values and NaN's not allowed if 'na.rm' is FALSE
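The thread does not establish what caused this error. The message itself says that `quantile()` received NA or NaN values, so one thing worth checking is whether the input matrices contain non-finite entries (for example from taking the log of zero) or rows with zero standard deviation. A minimal sketch of such a check, where `check_anota_input` is a hypothetical helper and not part of anota:

```r
# Hypothetical helper (not an anota function): list rows that contain
# NA/NaN/Inf values and rows whose standard deviation is exactly zero.
check_anota_input <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x))
  non_finite <- rowSums(!is.finite(x)) > 0
  # Only test sd on fully finite rows; sd() returns NA/NaN otherwise.
  zero_sd <- !non_finite & apply(x, 1, sd) == 0
  list(non_finite = rownames(x)[non_finite],
       zero_sd    = rownames(x)[zero_sd])
}

# Toy matrix with one clean row and three problematic ones (invented values).
x <- rbind(ok   = c(1, 2, 3),
           flat = c(2, 2, 2),
           bad  = c(1, NA, 3),
           inf  = c(log2(0), 1, 2))   # log2(0) is -Inf

res <- check_anota_input(x)
print(res)
```

Running the same check on both `dataT` and `dataP` before calling `anotaPerformQc` would show whether either matrix has rows of this kind. It does not rule out other causes.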

written 8.7 years ago by kesarwani.anil • 0