AgiMicroRna -1 filtered values
@coonen-maarten-tgx-6396
Dear mailing list,

I'm using the AgiMicroRna-package to analyze my Agilent microRNA arrays. It occurred to me that the ProcessedData.txt output file contains -1 values for miRNAs that were below the 0.5 intensity threshold in the 'tgsMicroRna' function, when half=TRUE.

These -1 values are subsequently used in the analysis of differential genes (limma), which can (in my opinion) lead to false positive results. E.g. imagine a situation where the intensities for a miRNA of 2 control samples were below the detection limit of the scanner (noise) and therefore end up with a value of -1. For the 2 treated samples however the intensities were higher (1.395 and 1.147 respectively). This would generate a logFC of 2.271 and most likely a significant p-value. As a result, this miRNA would be selected as differentially expressed, whilst actually not being measured in 2 of the 4 samples.

Would it be better to leave these miRNAs out of the differential expression analysis, i.e. setting their values to NA?

Below I provided some code using the sample-data from the AgiMicroRna package that can illustrate my observations. The miRNA described in the text above is hsa-miR-339-5p.

    ### code chunk number 1: data
    library("AgiMicroRna")
    data(targets.micro)

    ### code chunk number 2: data
    data(dd.micro)

    ### code chunk number 11: tgsMicroRna
    ddTGS=tgsMicroRna(dd.micro, half=TRUE, makePLOT=FALSE, verbose=FALSE)

    ### code chunk number 12: tgsNormalization
    ddNORM=tgsNormalization(ddTGS, "quantile", makePLOTpre=FALSE,
                            makePLOTpost=FALSE, targets.micro, verbose=TRUE)

    ### code chunk number 14: filterMicroRna
    ddPROC=filterMicroRna(ddNORM, dd.micro, control=TRUE, IsGeneDetected=TRUE,
                          wellaboveNEG=FALSE, limIsGeneDetected=75, limNEG=25,
                          makePLOT=FALSE, targets.micro, verbose=TRUE,
                          writeout=FALSE)

    ### code chunk number 15: esetMicroRna
    esetPROC=esetMicroRna(ddPROC, targets.micro, makePLOT=FALSE, verbose=TRUE)

    ### code chunk number 16: writeEset (eval = FALSE)
    writeEset(esetPROC, ddPROC, targets.micro, verbose=TRUE)

    > sessionInfo()
    R version 3.0.2 (2013-09-25)
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
     [1] AgiMicroRna_2.12.0    affycoretools_1.34.0  KEGG.db_2.10.1
     [4] GO.db_2.10.1          RSQLite_0.11.4        DBI_0.2-7
     [7] AnnotationDbi_1.24.0  preprocessCore_1.24.0 affy_1.40.0
    [10] limma_3.18.3          Biobase_2.22.0        BiocGenerics_0.8.0

Thanks for your input,

Best regards,
Maarten Coonen
Bioinformatics Associate
Dpt. of Toxicogenomics (TGX), Maastricht University
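The arithmetic in the question can be reproduced with a few lines of base R. This is only a sketch using the four numbers quoted in the post. It assumes that the -1 values arise as log2(0.5), i.e. that signals below 0.5 are set to 0.5 before the log2 transform; that reading fits the description above but is not checked against the package source here. The variable names are made up for illustration.

```r
# Assumption: a floored raw signal of 0.5 becomes -1 on the log2 scale.
floor_value <- log2(0.5)                 # -1

# Values for one miRNA as quoted in the post (log2 scale).
control <- c(floor_value, floor_value)   # both control arrays at the floor
treated <- c(1.395, 1.147)               # both treated arrays above it

# Log fold change as a difference of group means.
logFC <- mean(treated) - mean(control)
print(round(logFC, 3))                   # 2.271

# The alternative raised in the post: treat floored values as missing.
expr <- c(control, treated)
expr[expr == floor_value] <- NA

# With every control value missing, the control mean cannot be estimated,
# so no fold change would be reported for this miRNA.
control_mean_na <- mean(expr[1:2], na.rm = TRUE)
print(control_mean_na)
```

The second half shows what setting the values to NA would mean in practice for this particular miRNA: the control group has no usable observations left, so the comparison drops out rather than producing a large fold change.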
Hi,

On Tue, Feb 11, 2014 at 12:04 AM, Coonen Maarten (TGX) <m.coonen at maastrichtuniversity.nl> wrote:
> Dear mailing list,
>
> I'm using the AgiMicroRna-package to analyze my Agilent microRNA arrays.
> It occurred to me that the ProcessedData.txt output file contains -1 values for miRNAs that were below the 0.5 intensity threshold in the 'tgsMicroRna' function, when half=TRUE.
>
> These -1 values are subsequently used in the analysis of differential genes (limma), which can (in my opinion) lead to false positive results.
> E.g. imagine a situation where the intensities for a miRNA of 2 control samples were below the detection limit of the scanner (noise) and therefore end up with a value of -1.
> For the 2 treated samples however the intensities were higher (1.395 and 1.147 respectively).
> This would generate a logFC of 2.271 and most likely a significant p-value. As a result, this miRNA would be selected as differentially expressed, whilst actually not being measured in 2 of the 4 samples.

I think I'm somewhat confused ... isn't the scenario you are describing -- a miRNA is not detected in the control arrays, but detected in the (presumably "treatment") arrays -- the definition of differential expression?

Also, have you checked to see if this scenario you are imagining is actually happening?

-steve

--
Steve Lianoglou
Computational Biologist
Genentech
Dear Steve,

> I think I'm somewhat confused ... isn't the scenario you are describing -- a miRNA is not detected in the control arrays, but detected in the (presumably "treatment") arrays -- the definition of differential expression?

I see your point, but this holds only when one can be 100% sure that -1 means that the miRNA is not above the detection limit. In that case the miRNA really is expressed at a low level, and I agree with your statement above.

But what if the probe replicates of a miRNA are not accurately measured and the Feature Extraction Software (FES) decides to flag this miRNA as not detected (gIsGeneDetected)? AgiMicroRna then converts the expression for this miRNA into -1 and the miRNA continues in the analysis, while it actually did not pass the FES QC.

On the other hand, I am not sure whether I am misinterpreting the FES output. Which of the options below is true? I couldn't find any of this in the FES manual.

a) FES performs QC on the probe replicates and summarizes all good probes into 1 TotalGeneSignal. Since a miRNA is represented by 30 probes, FES will always succeed in summarizing the good probes and will always generate a reliable TotalGeneSignal.

b) FES performs QC on the probe replicates. If all probes fail to pass QC, it is unable to generate a reliable TotalGeneSignal and puts a flag in the IsGeneDetected column.

If a) holds true, I agree with using the miRNAs with -1 in further analysis. If b) holds true, we are not able to distinguish between miRNAs that are absent and miRNAs that are not accurately measured.

> Also, have you checked to see if this scenario you are imagining is actually happening?

In the sample data set it occurs multiple times, of which hsa-miR-339-5p is the first example. In my own data, I have seen these -1 values occurring multiple times, which made me curious about what was causing this.
Best regards,
Maarten
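If option b) were the case, one way to act on it would be to mask flagged values before the differential expression step. The sketch below uses base R only, on a made-up expression matrix and a made-up 0/1 detection matrix standing in for the gIsGeneDetected flags. Both matrices, the second miRNA name and all variable names are hypothetical; how the real flags are stored in the AgiMicroRna objects is not shown here.

```r
# Hypothetical log2 expression matrix: rows = miRNAs, columns = arrays.
expr <- matrix(c(-1,  -1,  1.395, 1.147,
                 5.2, 5.0, 5.4,   5.1),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("hsa-miR-339-5p", "miR-example"),
                               c("ctrl1", "ctrl2", "trt1", "trt2")))

# Hypothetical detection flags with the same shape (1 = detected, 0 = not),
# standing in for the per-array gIsGeneDetected column.
detected <- matrix(c(0, 0, 1, 1,
                     1, 1, 1, 1),
                   nrow = 2, byrow = TRUE,
                   dimnames = dimnames(expr))

# Replace values flagged as not detected with NA instead of keeping -1.
expr_na <- expr
expr_na[detected == 0] <- NA

# Number of arrays in which each miRNA was flagged as detected.
n_detected <- rowSums(detected == 1)

print(expr_na)
print(n_detected)
```

Whether masking is appropriate depends on the answer to the a) versus b) question: under a) the -1 values carry real information about low expression and masking would throw it away.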
Hi,

A caveat here is that I have never processed data from these arrays before, and I am also no Agilent software expert (i.e. I have no idea of the internal workings of the Feature Extraction Software (FES) that you're talking about).

Comments inline:

On Wed, Feb 12, 2014 at 12:51 AM, Coonen Maarten (TGX) <m.coonen at maastrichtuniversity.nl> wrote:
[snip]
> On the other hand, I am in doubt if I might be misinterpreting the FES-output. Which of the options below is true? I couldn't find any of this in the FES manual.
> a) FES performs QC on the probe replicates and summarizes all good probes into 1 TotalGeneSignal. Since a miRNA is represented by 30 probes, FES will always succeed in summarizing the good probes and will always generate a reliable TotalGeneSignal.
> b) FES performs QC on the probe replicates. If all probes fail to pass QC, it is unable to generate a reliable TotalGeneSignal and puts a flag in the IsGeneDetected column.

I don't know; perhaps someone else here can answer this question.

>> Also, have you checked to see if this scenario you are imagining is actually happening?
> In the sample data set it occurs multiple times, of which hsa-miR-339-5p is the first example.
> In my own data, I have seen these -1 values occurring multiple times, which made me curious to what was causing this.

Perhaps you can bring your own domain knowledge into this analysis and do some data-detective work that will allow you to sidestep knowing precisely how FES works.

There are surely miRNAs that you expect to be expressed in your control data, and others which you know should not be. Look at where these -1s are popping up in your control data: does this match up with your expectations?

-steve

--
Steve Lianoglou
Computational Biologist
Genentech
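The kind of check suggested above could start from a simple tabulation of where the -1 values fall. The following is a base R sketch on a made-up matrix. The values in the first row are the ones quoted in the thread; the other two rows, their miRNA names and the group labels are invented for illustration.

```r
# Hypothetical log2 expression matrix: rows = miRNAs, columns = arrays.
expr <- matrix(c(-1,  -1,  1.395, 1.147,
                 -1,  -1,  -1,    -1,
                 6.1, 5.9, 6.3,   6.0),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("hsa-miR-339-5p", "miR-absent", "miR-high"),
                               c("ctrl1", "ctrl2", "trt1", "trt2")))

# Group label for each column (assumed design: 2 control, 2 treated).
groups <- c("control", "control", "treated", "treated")

# TRUE wherever a value sits at the floor of -1.
is_floor <- expr == -1

# Count floored values per miRNA and per group.
floor_by_group <- t(apply(is_floor, 1, function(x) tapply(x, groups, sum)))
print(floor_by_group)

# miRNAs floored in every control array but in no treated array: these are
# the cases to compare against what is known about the biology.
n_control <- sum(groups == "control")
candidates <- rownames(floor_by_group)[
  floor_by_group[, "control"] == n_control & floor_by_group[, "treated"] == 0
]
print(candidates)
```

The resulting list is a starting point for inspection, not a filter: each candidate would still need to be judged against which miRNAs are expected to be present or absent in the control samples.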