Question: plot QuSAGE problem
0
5.9 years ago by
Guest User12k
Guest User12k wrote:
Dear all, I would like to quantify gene set differential expression using QuSAGE. When I run qusage a warning is printed because some pathways don???t contain any values matching the rownames of eset and the function returns NAs for the values of these pathways. It seems normal according to the manual; however, when I try to plot the results I get this error: Error en bar.col[p.vals == 0] = c("#00FF00", "#FF0000")[(means > 0) + : NAs no son permitidos en asignaciones subscritas As I didn???t find any forum discussion related to qusage, to solve this problem I tried to remove columns and/or rows with NAs using different commands from this forum http://stackoverflow.com/questions/4862178/remove-rows-with-nas-in- data-frame that didn???t work. I???ve also tried to export the result file to manually remove the NAs values, but I get another error: Error en data.frame(NUCLEOPLASM = c(40L, 100L, 228L, 255L, 265L, 433L, : arguments imply differing number of rows: 8, 1, 43, 0, 2, 5, 35, 3, 7, 9, 15, 57, 72, 76, 6, 4, 89, 12, 104, 20, 10, 13, 50, 11, 34, 45, 22, 14, 36, 41, 18, 28, 38, 75, 16, 55, 51, 21, 29, 19, 31, 26, 47, 46, 30, 23, 33, 37, 17, 24 So, I don???t know what else I can do. Could you please help me? Thank you in advance! Cheers Natalia here is my code: > library("qusage") Loading required package: limma > eset <- read.table("CarcinomaLog2b.txt") > dim(eset) [1] 670 4 > eset[1:4,1:4] B1238noCI B898noCI N457CI CIO4CI ago61 7.246293 7.383201 5.556158 5.646811 ABAT 7.386900 7.473050 5.935872 6.037316 ABCA8 6.423412 6.838683 4.357163 4.706340 ACAP2 7.359547 7.006194 5.658119 5.422421 > labels = c(rep("noCI",2),rep("CI",2)) > labels [1] "noCI" "noCI" "CI" "CI" > contrast = "CI-noCI" > MSIG.geneSets = read.gmt("c5.all.v4.0.symbols.gmt") > summary(MSIG.geneSets[1:5]) Length Class Mode NUCLEOPLASM 279 -none- character EXTRINSIC_TO_PLASMA_MEMBRANE 13 -none- character ORGANELLE_PART 1197 -none- character CELL_PROJECTION_PART 19 -none- character CYTOPLASMIC_VESICLE_MEMBRANE 28 -none- character > MSIG.geneSets[2] $EXTRINSIC_TO_PLASMA_MEMBRANE [1] "GNA14" "APC2" "GNAI1" "SCUBE1" "RGS19" "EEA1" "ARRB1" "TDGF1" "SYTL4" "TGM3" "SYTL2" "GNAS" "SYTL1" > qs.results.msig = qusage(eset, labels, contrast, MSIG.geneSets) Calculating gene-by-gene comparisons...Done. Aggregating gene data for gene sets................................... .....................................Done. Calculating variance inflation factors...Done. There were 50 or more warnings (use warnings() to see the first 50) > warnings() Warning messages: 1: In calcIndividualExpressions(eset.2, eset.1, paired = paired, ... : Some degrees of freedom are below minimum. They have been set to 3. 2: In aggregateGeneSet(results, geneSets, silent = F) : Gene set: (index 4) has 0 overlap with eset. 3: In aggregateGeneSet(results, geneSets, silent = F) : Gene set: (index 8) has 0 overlap with eset. ??? > p.vals = pdf.pVal(qs.results.msig) > head(p.vals) [1] 0.004420964 0.002192533 0.944526837 NA 0.015565253 0.665047012 > q.vals = p.adjust(p.vals, method="fdr") > head(q.vals) [1] 0.02186234 0.01575066 0.95799937 NA 0.03560862 0.74452140 > plot(qs.results.msig) Error en bar.col[p.vals == 0] = c("#00FF00", "#FF0000")[(means > 0) + : NAs no son permitidos en asignaciones subscritas > write.table(qs.results.msig, "C:/CarcinomaQusage/qs.results.msig.txt", sep="\t") Error en data.frame(NUCLEOPLASM = c(40L, 100L, 228L, 255L, 265L, 433L, : arguments imply differing number of rows: 8, 1, 43, 0, 2, 5, 35, 3, 7, 9, 15, 57, 72, 76, 6, 4, 89, 12, 104, 20, 10, 13, 50, 11, 34, 45, 22, 14, 36, 41, 18, 28, 38, 75, 16, 55, 51, 21, 29, 19, 31, 26, 47, 46, 30, 23, 33, 37, 17, 24 > traceback() It???s too long, I only get the final part of it: ??? 5.2071044013943e-15, 5.19547834250736e-15, 5.18371266247415e-15, 5.1716289962509e-15, 5.16099603768818e-15, 5.15051690240292e-15, 5.14112007625988e-15, 5.13108039263482e-15, 5.12053454327034e-15, 5.10908904454783e-15, 5.09777420691446e-15, 5.08657452313037e-15, 5.07631739482464e-15, 5.06622484767238e-15, 5.0557022518093e-15, 5.04523476479514e-15, 5.0346664383151e-15, 5.02514098459579e-15, ??? NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ... vif = c(0.233781854096956, NA, 0.267630050168576, NA, NA, 0.20656384515213, NA, NA, NA, NA, NA, NA, 0.984323955738163, 1.85096210586702, 1.68418716723949, NA, NA, NA, NA, NA, NA, 1.29506034501887, 1.6117330327914, NA, 0.0769539109044178, 2.3903824686852, NA, NA, NA, 1.56236758547383, 0.320770501799159, 0.912003653287104, 0.672521204303017, NA, 1.03652420528229, NA, 1.02504640850195, 1.09865040867118, NA, NA, 1.83839137148713, 2.18615700744765, NA, NA, NA, 1.75828917222515, ... 6: eval(expr, envir, enclos) 5: eval(as.call(c(expression(data.frame), x, check.names = !optional, stringsAsFactors = stringsAsFactors))) 4: as.data.frame.list(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) 3: as.data.frame(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) 2: data.frame(x) 1: write.table(qs.results.msig, "C:/CarcinomaQusage/qs.results.msig.txt", sep = "\t") -- output of sessionInfo(): R version 3.0.2 (2013-09-25) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252 LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C LC_TIME=Spanish_Spain.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] qusage_1.2.0 limma_3.18.2 loaded via a namespace (and not attached): [1] Biobase_2.22.0 BiocGenerics_0.8.0 parallel_3.0.2 -- Sent via the guest posting facility at bioconductor.org. pathways qusage • 941 views ADD COMMENTlink modified 5.9 years ago by Natalia Sevane20 • written 5.9 years ago by Guest User12k Answer: plot QuSAGE problem 0 5.9 years ago by Chris Bolen20 Chris Bolen20 wrote: Hi Natalia, Sorry about that. Looks like you found a bug in the plotCIs function. The code will have to be fixed, but there are two workarounds that you can use in the mean time. First is to only try to plot non-NA pathways by specifying the path.index parameter, as follows: non.na = which(!is.na(qs.results.msig$path.mean) plot(qs.results.msig, path.index=non.na) You'll have to make sure to add that every time you plot the data, but it's short. The alternative is to remove the offending pathways from the pathway list before you run qusage, which should prevent any other problems down the road. You'd do this as follows: overlap = sapply(MSIG.geneSets, function(x){ sum(x %in% rownames(eset)) }) > MSIG.geneSets = MSIG.geneSets[overlap>0] Then run qusage as normal. Regarding the second error you have in your code -- I don't think that "write.table" is designed to work with list objects. You'll have to pick out specific parts of the results object to save (such as path.mean and path.sd), and then call write.table on a matrix you create from those. Also, just a note -- I notice that not all of the row names in your gene set are standard gene symbols (e.g. ago61 should be POMGNT2 according to NCBI's Gene DB). It's quite likely that a lot of pathways with 0 overlap could be because the names being used are different. This is not an easy thing to fix, but I think there may be some tools online which can help you fix those. I'll search around a bit to see if I can find them. On Mon, Nov 18, 2013 at 1:48 AM, Natalia [guest] <guest@bioconductor.org>wrote: > > Dear all, > I would like to quantify gene set differential expression using QuSAGE. > When I run qusage a warning is printed because some pathways dont contain > any values matching the rownames of eset and the function returns NAs for > the values of these pathways. It seems normal according to the manual; > however, when I try to plot the results I get this error: > Error en bar.col[p.vals == 0] = c("#00FF00", "#FF0000")[(means > 0) + : > NAs no son permitidos en asignaciones subscritas > > As I didnt find any forum discussion related to qusage, to solve this > problem I tried to remove columns and/or rows with NAs using different > commands from this forum > http://stackoverflow.com/questions/4862178/remove-rows-with-nas-in- data-framethat didnt work. Ive also tried to export the result file to manually > remove the NAs values, but I get another error: > Error en data.frame(NUCLEOPLASM = c(40L, 100L, 228L, 255L, 265L, 433L, : > arguments imply differing number of rows: 8, 1, 43, 0, 2, 5, 35, 3, 7, > 9, 15, 57, 72, 76, 6, 4, 89, 12, 104, 20, 10, 13, 50, 11, 34, 45, 22, 14, > 36, 41, 18, 28, 38, 75, 16, 55, 51, 21, 29, 19, 31, 26, 47, 46, 30, 23, 33, > 37, 17, 24 > > So, I dont know what else I can do. Could you please help me? > Thank you in advance! > Cheers > > Natalia > > > here is my code: > > library("qusage") > Loading required package: limma > > eset <- read.table("CarcinomaLog2b.txt") > > dim(eset) > [1] 670 4 > > eset[1:4,1:4] > B1238noCI B898noCI N457CI CIO4CI > ago61 7.246293 7.383201 5.556158 5.646811 > ABAT 7.386900 7.473050 5.935872 6.037316 > ABCA8 6.423412 6.838683 4.357163 4.706340 > ACAP2 7.359547 7.006194 5.658119 5.422421 > > labels = c(rep("noCI",2),rep("CI",2)) > > labels > [1] "noCI" "noCI" "CI" "CI" > > contrast = "CI-noCI" > > MSIG.geneSets = read.gmt("c5.all.v4.0.symbols.gmt") > > summary(MSIG.geneSets[1:5]) > Length Class Mode > NUCLEOPLASM 279 -none- character > EXTRINSIC_TO_PLASMA_MEMBRANE 13 -none- character > ORGANELLE_PART 1197 -none- character > CELL_PROJECTION_PART 19 -none- character > CYTOPLASMIC_VESICLE_MEMBRANE 28 -none- character > > MSIG.geneSets[2] > $EXTRINSIC_TO_PLASMA_MEMBRANE > [1] "GNA14" "APC2" "GNAI1" "SCUBE1" "RGS19" "EEA1" "ARRB1" > "TDGF1" "SYTL4" "TGM3" "SYTL2" "GNAS" "SYTL1" > > qs.results.msig = qusage(eset, labels, contrast, MSIG.geneSets) > Calculating gene-by-gene comparisons...Done. > Aggregating gene data for gene > sets................................................................ ........Done. > Calculating variance inflation factors...Done. > There were 50 or more warnings (use warnings() to see the first 50) > > warnings() > Warning messages: > 1: In calcIndividualExpressions(eset.2, eset.1, paired = paired, ... : > Some degrees of freedom are below minimum. They have been set to 3. > 2: In aggregateGeneSet(results, geneSets, silent = F) : > Gene set: (index 4) has 0 overlap with eset. > 3: In aggregateGeneSet(results, geneSets, silent = F) : > Gene set: (index 8) has 0 overlap with eset. > > > p.vals = pdf.pVal(qs.results.msig) > > head(p.vals) > [1] 0.004420964 0.002192533 0.944526837 NA 0.015565253 0.665047012 > > q.vals = p.adjust(p.vals, method="fdr") > > head(q.vals) > [1] 0.02186234 0.01575066 0.95799937 NA 0.03560862 0.74452140 > > plot(qs.results.msig) > Error en bar.col[p.vals == 0] = c("#00FF00", "#FF0000")[(means > 0) + : > NAs no son permitidos en asignaciones subscritas > > write.table(qs.results.msig, "C:/CarcinomaQusage/qs.results.msig.txt", > sep="\t") > Error en data.frame(NUCLEOPLASM = c(40L, 100L, 228L, 255L, 265L, 433L, : > arguments imply differing number of rows: 8, 1, 43, 0, 2, 5, 35, 3, 7, > 9, 15, 57, 72, 76, 6, 4, 89, 12, 104, 20, 10, 13, 50, 11, 34, 45, 22, 14, > 36, 41, 18, 28, 38, 75, 16, 55, 51, 21, 29, 19, 31, 26, 47, 46, 30, 23, 33, > 37, 17, 24 > > > traceback() > Its too long, I only get the final part of it: > > 5.2071044013943e-15, 5.19547834250736e-15, 5.18371266247415e-15, > 5.1716289962509e-15, 5.16099603768818e-15, 5.15051690240292e-15, > 5.14112007625988e-15, 5.13108039263482e-15, 5.12053454327034e-15, > 5.10908904454783e-15, 5.09777420691446e-15, 5.08657452313037e-15, > 5.07631739482464e-15, 5.06622484767238e-15, 5.0557022518093e-15, > 5.04523476479514e-15, 5.0346664383151e-15, 5.02514098459579e-15, > > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > ... > vif = c(0.233781854096956, NA, > 0.267630050168576, NA, NA, 0.20656384515213, NA, NA, NA, > NA, NA, NA, 0.984323955738163, 1.85096210586702, 1.68418716723949, > NA, NA, NA, NA, NA, NA, 1.29506034501887, 1.6117330327914, > NA, 0.0769539109044178, 2.3903824686852, NA, NA, NA, > 1.56236758547383, > 0.320770501799159, 0.912003653287104, 0.672521204303017, > NA, 1.03652420528229, NA, 1.02504640850195, 1.09865040867118, > NA, NA, 1.83839137148713, 2.18615700744765, NA, NA, NA, > 1.75828917222515, > ... > 6: eval(expr, envir, enclos) > 5: eval(as.call(c(expression(data.frame), x, check.names = !optional, > stringsAsFactors = stringsAsFactors))) > 4: as.data.frame.list(x[[i]], optional = TRUE, stringsAsFactors = > stringsAsFactors) > 3: as.data.frame(x[[i]], optional = TRUE, stringsAsFactors = > stringsAsFactors) > 2: data.frame(x) > 1: write.table(qs.results.msig, "C:/CarcinomaQusage/qs.results.msig.txt", > sep = "\t") > > > -- output of sessionInfo(): > > R version 3.0.2 (2013-09-25) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252 > LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C > LC_TIME=Spanish_Spain.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] qusage_1.2.0 limma_3.18.2 > > loaded via a namespace (and not attached): > [1] Biobase_2.22.0 BiocGenerics_0.8.0 parallel_3.0.2 > > > -- > Sent via the guest posting facility at bioconductor.org. > [[alternative HTML version deleted]] ADD COMMENTlink written 5.9 years ago by Chris Bolen20 Answer: plot QuSAGE problem 0 5.9 years ago by Natalia Sevane20 wrote: Hi, Thank you again for your help. Qusage works really well now and the plots are very useful to understand the data :) Im sorry to bother you again, but Im wondering if there is the possibility of obtaining a list with the genes from eset included in each gene set instead of using plotCIsGenes(qs.results.msig,path.index=113) for each gene set. Maybe it is explained in the manual, but Im not able to find it. Thank you in advance! Cheers Natalia From: nata_sf@hotmail.com To: cbolen1@gmail.com Subject: RE: plot QuSAGE problem Date: Mon, 18 Nov 2013 20:13:57 +0000 Thank you very much for your help!! From: cbolen1@gmail.com Date: Mon, 18 Nov 2013 10:08:11 -0800 Subject: Re: plot QuSAGE problem To: guest@bioconductor.org CC: bioconductor@r-project.org; nata_sf@hotmail.com Hi Natalia, Sorry about that. Looks like you found a bug in the plotCIs function. The code will have to be fixed, but there are two workarounds that you can use in the mean time. First is to only try to plot non-NA pathways by specifying the path.index parameter, as follows: non.na = which(!is.na(qs.results.msig$path.mean) plot(qs.results.msig, path.index=non.na) You'll have to make sure to add that every time you plot the data, but it's short. The alternative is to remove the offending pathways from the pathway list before you run qusage, which should prevent any other problems down the road. You'd do this as follows: overlap = sapply(MSIG.geneSets, function(x){ sum(x %in% rownames(eset)) }) MSIG.geneSets = MSIG.geneSets[overlap>0] Then run qusage as normal. Regarding the second error you have in your code -- I don't think that "write.table" is designed to work with list objects. You'll have to pick out specific parts of the results object to save (such as path.mean and path.sd), and then call write.table on a matrix you create from those. Also, just a note -- I notice that not all of the row names in your gene set are standard gene symbols (e.g. ago61 should be POMGNT2 according to NCBI's Gene DB). It's quite likely that a lot of pathways with 0 overlap could be because the names being used are different. This is not an easy thing to fix, but I think there may be some tools online which can help you fix those. I'll search around a bit to see if I can find them. On Mon, Nov 18, 2013 at 1:48 AM, Natalia [guest] <guest@bioconductor.org> wrote: Dear all, I would like to quantify gene set differential expression using QuSAGE. When I run qusage a warning is printed because some pathways dont contain any values matching the rownames of eset and the function returns NAs for the values of these pathways. It seems normal according to the manual; however, when I try to plot the results I get this error: Error en bar.col[p.vals == 0] = c("#00FF00", "#FF0000")[(means > 0) + : NAs no son permitidos en asignaciones subscritas As I didnt find any forum discussion related to qusage, to solve this problem I tried to remove columns and/or rows with NAs using different commands from this forum http://stackoverflow.com/questions/4862178 /remove-rows-with-nas-in-data-frame that didnt work. Ive also tried to export the result file to manually remove the NAs values, but I get another error: Error en data.frame(NUCLEOPLASM = c(40L, 100L, 228L, 255L, 265L, 433L, : arguments imply differing number of rows: 8, 1, 43, 0, 2, 5, 35, 3, 7, 9, 15, 57, 72, 76, 6, 4, 89, 12, 104, 20, 10, 13, 50, 11, 34, 45, 22, 14, 36, 41, 18, 28, 38, 75, 16, 55, 51, 21, 29, 19, 31, 26, 47, 46, 30, 23, 33, 37, 17, 24 So, I dont know what else I can do. Could you please help me? Thank you in advance! Cheers Natalia here is my code: > library("qusage") Loading required package: limma > eset <- read.table("CarcinomaLog2b.txt") > dim(eset) [1] 670 4 > eset[1:4,1:4] B1238noCI B898noCI N457CI CIO4CI ago61 7.246293 7.383201 5.556158 5.646811 ABAT 7.386900 7.473050 5.935872 6.037316 ABCA8 6.423412 6.838683 4.357163 4.706340 ACAP2 7.359547 7.006194 5.658119 5.422421 > labels = c(rep("noCI",2),rep("CI",2)) > labels [1] "noCI" "noCI" "CI" "CI" > contrast = "CI-noCI" > MSIG.geneSets = read.gmt("c5.all.v4.0.symbols.gmt") > summary(MSIG.geneSets[1:5]) Length Class Mode NUCLEOPLASM 279 -none- character EXTRINSIC_TO_PLASMA_MEMBRANE 13 -none- character ORGANELLE_PART 1197 -none- character CELL_PROJECTION_PART 19 -none- character CYTOPLASMIC_VESICLE_MEMBRANE 28 -none- character > MSIG.geneSets[2] $EXTRINSIC_TO_PLASMA_MEMBRANE [1] "GNA14" "APC2" "GNAI1" "SCUBE1" "RGS19" "EEA1" "ARRB1" "TDGF1" "SYTL4" "TGM3" "SYTL2" "GNAS" "SYTL1" > qs.results.msig = qusage(eset, labels, contrast, MSIG.geneSets) Calculating gene-by-gene comparisons...Done. Aggregating gene data for gene sets................................... .....................................Done. Calculating variance inflation factors...Done. There were 50 or more warnings (use warnings() to see the first 50) > warnings() Warning messages: 1: In calcIndividualExpressions(eset.2, eset.1, paired = paired, ... : Some degrees of freedom are below minimum. They have been set to 3. 2: In aggregateGeneSet(results, geneSets, silent = F) : Gene set: (index 4) has 0 overlap with eset. 3: In aggregateGeneSet(results, geneSets, silent = F) : Gene set: (index 8) has 0 overlap with eset. > p.vals = pdf.pVal(qs.results.msig) > head(p.vals) [1] 0.004420964 0.002192533 0.944526837 NA 0.015565253 0.665047012 > q.vals = p.adjust(p.vals, method="fdr") > head(q.vals) [1] 0.02186234 0.01575066 0.95799937 NA 0.03560862 0.74452140 > plot(qs.results.msig) Error en bar.col[p.vals == 0] = c("#00FF00", "#FF0000")[(means > 0) + : NAs no son permitidos en asignaciones subscritas > write.table(qs.results.msig, "C:/CarcinomaQusage/qs.results.msig.txt", sep="\t") Error en data.frame(NUCLEOPLASM = c(40L, 100L, 228L, 255L, 265L, 433L, : arguments imply differing number of rows: 8, 1, 43, 0, 2, 5, 35, 3, 7, 9, 15, 57, 72, 76, 6, 4, 89, 12, 104, 20, 10, 13, 50, 11, 34, 45, 22, 14, 36, 41, 18, 28, 38, 75, 16, 55, 51, 21, 29, 19, 31, 26, 47, 46, 30, 23, 33, 37, 17, 24 > traceback() Its too long, I only get the final part of it: 5.2071044013943e-15, 5.19547834250736e-15, 5.18371266247415e-15, 5.1716289962509e-15, 5.16099603768818e-15, 5.15051690240292e-15, 5.14112007625988e-15, 5.13108039263482e-15, 5.12053454327034e-15, 5.10908904454783e-15, 5.09777420691446e-15, 5.08657452313037e-15, 5.07631739482464e-15, 5.06622484767238e-15, 5.0557022518093e-15, 5.04523476479514e-15, 5.0346664383151e-15, 5.02514098459579e-15, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ... vif = c(0.233781854096956, NA, 0.267630050168576, NA, NA, 0.20656384515213, NA, NA, NA, NA, NA, NA, 0.984323955738163, 1.85096210586702, 1.68418716723949, NA, NA, NA, NA, NA, NA, 1.29506034501887, 1.6117330327914, NA, 0.0769539109044178, 2.3903824686852, NA, NA, NA, 1.56236758547383, 0.320770501799159, 0.912003653287104, 0.672521204303017, NA, 1.03652420528229, NA, 1.02504640850195, 1.09865040867118, NA, NA, 1.83839137148713, 2.18615700744765, NA, NA, NA, 1.75828917222515, ... 6: eval(expr, envir, enclos) 5: eval(as.call(c(expression(data.frame), x, check.names = !optional, stringsAsFactors = stringsAsFactors))) 4: as.data.frame.list(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) 3: as.data.frame(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) 2: data.frame(x) 1: write.table(qs.results.msig, "C:/CarcinomaQusage/qs.results.msig.txt", sep = "\t") -- output of sessionInfo(): R version 3.0.2 (2013-09-25) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252 LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C LC_TIME=Spanish_Spain.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] qusage_1.2.0 limma_3.18.2 loaded via a namespace (and not attached): [1] Biobase_2.22.0 BiocGenerics_0.8.0 parallel_3.0.2 -- Sent via the guest posting facility at bioconductor.org. [[alternative HTML version deleted]] ADD COMMENTlink written 5.9 years ago by Natalia Sevane20 It's not stored directly, but the indices of the genes are stored in the "pathways" slot of the results. You could use these indices to find out which rows are being used, or you could just use R's "intersect" function on each of your pathways, as follows: sapply(MSIG.geneSets, intersect, rownames(eset)) On Tue, Nov 19, 2013 at 6:58 AM, Natalia Sevane <nata_sf@hotmail.com> wrote: > Hi, > > Thank you again for your help. Qusage works really well now and the plots > are very useful to understand the data :) > > Im sorry to bother you again, but Im wondering if there is the > possibility of obtaining a list with the genes from eset included in each > gene set instead of using *plotCIsGenes(qs.results.msig,path.index=113)*for each gene set. Maybe it is explained in the manual, but Im not able to > find it. > > Thank you in advance! > > Cheers > > > > Natalia > > > ------------------------------ > From: nata_sf@hotmail.com > To: cbolen1@gmail.com > Subject: RE: plot QuSAGE problem > Date: Mon, 18 Nov 2013 20:13:57 +0000 > > > Thank you very much for your help!! > > ------------------------------ > From: cbolen1@gmail.com > Date: Mon, 18 Nov 2013 10:08:11 -0800 > Subject: Re: plot QuSAGE problem > To: guest@bioconductor.org > CC: bioconductor@r-project.org; nata_sf@hotmail.com > > Hi Natalia, > > Sorry about that. Looks like you found a bug in the plotCIs function. The > code will have to be fixed, but there are two workarounds that you can use > in the mean time. > > First is to only try to plot non-NA pathways by specifying the path.index > parameter, as follows: > > non.na = which(!is.na(qs.results.msig$path.mean) > > plot(qs.results.msig, path.index=non.na) > > > You'll have to make sure to add that every time you plot the data, but > it's short. > > The alternative is to remove the offending pathways from the pathway list > before you run qusage, which should prevent any other problems down the > road. You'd do this as follows: > > overlap = sapply(MSIG.geneSets, function(x){ sum(x %in% rownames(eset)) }) > MSIG.geneSets = MSIG.geneSets[overlap>0] > > > Then run qusage as normal. > > Regarding the second error you have in your code -- I don't think that > "write.table" is designed to work with list objects. You'll have to pick > out specific parts of the results object to save (such as path.mean and > path.sd), and then call write.table on a matrix you create from those. > > Also, just a note -- I notice that not all of the row names in your gene > set are standard gene symbols (e.g. ago61 should be POMGNT2 according to > NCBI's Gene DB). It's quite likely that a lot of pathways with 0 overlap > could be because the names being used are different. This is not an easy > thing to fix, but I think there may be some tools online which can help you > fix those. I'll search around a bit to see if I can find them. > > > On Mon, Nov 18, 2013 at 1:48 AM, Natalia [guest] <guest@bioconductor.org>wrote: > > > Dear all, > I would like to quantify gene set differential expression using QuSAGE. > When I run qusage a warning is printed because some pathways dont contain > any values matching the rownames of eset and the function returns NAs for > the values of these pathways. It seems normal according to the manual; > however, when I try to plot the results I get this error: > Error en bar.col[p.vals == 0] = c("#00FF00", "#FF0000")[(means > 0) + : > NAs no son permitidos en asignaciones subscritas > > As I didnt find any forum discussion related to qusage, to solve this > problem I tried to remove columns and/or rows with NAs using different > commands from this forum > http://stackoverflow.com/questions/4862178/remove-rows-with-nas-in- data-framethat didnt work. Ive also tried to export the result file to manually > remove the NAs values, but I get another error: > Error en data.frame(NUCLEOPLASM = c(40L, 100L, 228L, 255L, 265L, 433L, : > arguments imply differing number of rows: 8, 1, 43, 0, 2, 5, 35, 3, 7, > 9, 15, 57, 72, 76, 6, 4, 89, 12, 104, 20, 10, 13, 50, 11, 34, 45, 22, 14, > 36, 41, 18, 28, 38, 75, 16, 55, 51, 21, 29, 19, 31, 26, 47, 46, 30, 23, 33, > 37, 17, 24 > > So, I dont know what else I can do. Could you please help me? > Thank you in advance! > Cheers > > Natalia > > > here is my code: > > library("qusage") > Loading required package: limma > > eset <- read.table("CarcinomaLog2b.txt") > > dim(eset) > [1] 670 4 > > eset[1:4,1:4] > B1238noCI B898noCI N457CI CIO4CI > ago61 7.246293 7.383201 5.556158 5.646811 > ABAT 7.386900 7.473050 5.935872 6.037316 > ABCA8 6.423412 6.838683 4.357163 4.706340 > ACAP2 7.359547 7.006194 5.658119 5.422421 > > labels = c(rep("noCI",2),rep("CI",2)) > > labels > [1] "noCI" "noCI" "CI" "CI" > > contrast = "CI-noCI" > > MSIG.geneSets = read.gmt("c5.all.v4.0.symbols.gmt") > > summary(MSIG.geneSets[1:5]) > Length Class Mode > NUCLEOPLASM 279 -none- character > EXTRINSIC_TO_PLASMA_MEMBRANE 13 -none- character > ORGANELLE_PART 1197 -none- character > CELL_PROJECTION_PART 19 -none- character > CYTOPLASMIC_VESICLE_MEMBRANE 28 -none- character > > MSIG.geneSets[2] > \$EXTRINSIC_TO_PLASMA_MEMBRANE > [1] "GNA14" "APC2" "GNAI1" "SCUBE1" "RGS19" "EEA1" "ARRB1" > "TDGF1" "SYTL4" "TGM3" "SYTL2" "GNAS" "SYTL1" > > qs.results.msig = qusage(eset, labels, contrast, MSIG.geneSets) > Calculating gene-by-gene comparisons...Done. > Aggregating gene data for gene > sets................................................................ ........Done. > Calculating variance inflation factors...Done. > There were 50 or more warnings (use warnings() to see the first 50) > > warnings() > Warning messages: > 1: In calcIndividualExpressions(eset.2, eset.1, paired = paired, ... : > Some degrees of freedom are below minimum. They have been set to 3. > 2: In aggregateGeneSet(results, geneSets, silent = F) : > Gene set: (index 4) has 0 overlap with eset. > 3: In aggregateGeneSet(results, geneSets, silent = F) : > Gene set: (index 8) has 0 overlap with eset. > > > p.vals = pdf.pVal(qs.results.msig) > > head(p.vals) > [1] 0.004420964 0.002192533 0.944526837 NA 0.015565253 0.665047012 > > q.vals = p.adjust(p.vals, method="fdr") > > head(q.vals) > [1] 0.02186234 0.01575066 0.95799937 NA 0.03560862 0.74452140 > > plot(qs.results.msig) > Error en bar.col[p.vals == 0] = c("#00FF00", "#FF0000")[(means > 0) + : > NAs no son permitidos en asignaciones subscritas > > write.table(qs.results.msig, "C:/CarcinomaQusage/qs.results.msig.txt", > sep="\t") > Error en data.frame(NUCLEOPLASM = c(40L, 100L, 228L, 255L, 265L, 433L, : > arguments imply differing number of rows: 8, 1, 43, 0, 2, 5, 35, 3, 7, > 9, 15, 57, 72, 76, 6, 4, 89, 12, 104, 20, 10, 13, 50, 11, 34, 45, 22, 14, > 36, 41, 18, 28, 38, 75, 16, 55, 51, 21, 29, 19, 31, 26, 47, 46, 30, 23, 33, > 37, 17, 24 > > > traceback() > Its too long, I only get the final part of it: > > 5.2071044013943e-15, 5.19547834250736e-15, 5.18371266247415e-15, > 5.1716289962509e-15, 5.16099603768818e-15, 5.15051690240292e-15, > 5.14112007625988e-15, 5.13108039263482e-15, 5.12053454327034e-15, > 5.10908904454783e-15, 5.09777420691446e-15, 5.08657452313037e-15, > 5.07631739482464e-15, 5.06622484767238e-15, 5.0557022518093e-15, > 5.04523476479514e-15, 5.0346664383151e-15, 5.02514098459579e-15, > > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > ... > vif = c(0.233781854096956, NA, > 0.267630050168576, NA, NA, 0.20656384515213, NA, NA, NA, > NA, NA, NA, 0.984323955738163, 1.85096210586702, 1.68418716723949, > NA, NA, NA, NA, NA, NA, 1.29506034501887, 1.6117330327914, > NA, 0.0769539109044178, 2.3903824686852, NA, NA, NA, > 1.56236758547383, > 0.320770501799159, 0.912003653287104, 0.672521204303017, > NA, 1.03652420528229, NA, 1.02504640850195, 1.09865040867118, > NA, NA, 1.83839137148713, 2.18615700744765, NA, NA, NA, > 1.75828917222515, > ... > 6: eval(expr, envir, enclos) > 5: eval(as.call(c(expression(data.frame), x, check.names = !optional, > stringsAsFactors = stringsAsFactors))) > 4: as.data.frame.list(x[[i]], optional = TRUE, stringsAsFactors = > stringsAsFactors) > 3: as.data.frame(x[[i]], optional = TRUE, stringsAsFactors = > stringsAsFactors) > 2: data.frame(x) > 1: write.table(qs.results.msig, "C:/CarcinomaQusage/qs.results.msig.txt", > sep = "\t") > > > -- output of sessionInfo(): > > R version 3.0.2 (2013-09-25) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252 > LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C > LC_TIME=Spanish_Spain.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] qusage_1.2.0 limma_3.18.2 > > loaded via a namespace (and not attached): > [1] Biobase_2.22.0 BiocGenerics_0.8.0 parallel_3.0.2 > > > -- > Sent via the guest posting facility at bioconductor.org. > > > [[alternative HTML version deleted]]