edgeR


Sridhara Gupta Kunjeti ▴ 320

@sridhara-gupta-kunjeti-4449

Last seen 10.8 years ago

United States

Hello,

I was using edgeR to read in data and create DGEList objects, following the instructions in the edgeR User's Guide. I noticed that not all the entries in the files were loaded into R; the first few entries in particular were missing.

The steps and code I used were:

Step 1. Created a plain text file with the following information:

     files           group     description
F0a  PhyP18B1.txt    PhyP18    Phytophthora phaseoli
F0b  PhyP18B2.txt    PhyP18    Phytophthora phaseoli
P3a  PPLB3dpiB1.txt  PPLB3dpi  Phytophthora phaseoli
P3b  PPLB3dpiB2.txt  PPLB3dpi  Phytophthora phaseoli

Step 2.

> setwd("C:/Users/SRIDHARA/Documents/test/bowtie/0_mismatch")
> targets <- read.delim(file = "targets.txt", stringsAsFactors = FALSE)
> targets
     files           group     description
F0a  PhyP18B1.txt    PhyP18    Phytophthora phaseoli
F0b  PhyP18B2.txt    PhyP18    Phytophthora phaseoli
P3a  PPLB3dpiB1.txt  PPLB3dpi  Phytophthora phaseoli
P3b  PPLB3dpiB2.txt  PPLB3dpi  Phytophthora phaseoli
P6a  PPLB6dpiB1.txt  PPLB6dpi  Phytophthora phaseoli
P6b  PPLB6dpiB2.txt  PPLB6dpi  Phytophthora phaseoli

Step 3.

> d <- readDGE(targets, skip = 5, comment.char = "!")
> d
An object of class "DGEList"
$samples
     files           group     description            lib.size  norm.factors
F0a  PhyP18B1.txt    PhyP18    Phytophthora phaseoli   2442435             1
F0b  PhyP18B2.txt    PhyP18    Phytophthora phaseoli   7355562             1
P3a  PPLB3dpiB1.txt  PPLB3dpi  Phytophthora phaseoli    474592             1
P3b  PPLB3dpiB2.txt  PPLB3dpi  Phytophthora phaseoli     13778             1
P6a  PPLB6dpiB1.txt  PPLB6dpi  Phytophthora phaseoli   3280812             1
P6b  PPLB6dpiB2.txt  PPLB6dpi  Phytophthora phaseoli   3906611             1

$counts
                                                                     F0a  F0b  P3a  P3b  P6a  P6b
PITG_23029 | Pi Crinkler (CRN) family protein, pseudogene (1794 nt)  170  109    0    0   12    8
PITG_14644 | Pi AMP-binding enzyme, putative (2568 nt)                 5   46    0    0   44   44
PITG_09824 | Pi metalloprotease family M12A, putative (1230 nt)        7   17    1    0   33    8

Here the first five entries were removed. When I use either of the following calls instead:

> d <- readDGE(targets, skip = 0, comment.char = "!")

OR

> d <- readDGE(targets)

I noticed that the first entry is removed. That first entry has counts, and I want it to be taken into account for the DGE analysis.

I was wondering whether I am doing something wrong, or whether there is a way to fix this problem?

Any comments or suggestions will be appreciated.

Many thanks in advance,
Sridhara

--
Sridhara G Kunjeti
PhD Candidate
University of Delaware
Department of Plant and Soil Science
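One way to confirm how many entries go missing under each call is to compare the raw line count of a count file with the number of rows read.delim() returns. The sketch below uses base R only and a small made-up count file (the gene names and counts are placeholders, not the data from this post); it assumes a two-column file with no header line, and readDGE() itself is not called.

```r
# Hypothetical two-column count file (tag, count) with NO header line.
count_file <- tempfile(fileext = ".txt")
writeLines(c("geneA\t170",
             "geneB\t5",
             "geneC\t7",
             "geneD\t0",
             "geneE\t12",
             "geneF\t44",
             "geneG\t3"), count_file)

# Number of lines actually present in the file.
n_lines <- length(readLines(count_file))

# Rows that survive under the argument combinations used in the post.
rows_skip5   <- nrow(read.delim(count_file, skip = 5, comment.char = "!"))
rows_skip0   <- nrow(read.delim(count_file, skip = 0, comment.char = "!"))
rows_default <- nrow(read.delim(count_file))

# Summary: how many lines of the file never became data rows.
rows_read <- c(rows_skip5, rows_skip0, rows_default)
print(data.frame(call      = c("skip = 5", "skip = 0", "defaults"),
                 rows_read = rows_read,
                 rows_lost = n_lines - rows_read))
```

Running the same comparison on one of the real count files would show whether the missing rows line up with the `skip` value, the header handling, or both.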

edgeR edgeR • 1.7k views

updated 15.0 years ago by Davis McCarthy ▴ 260 • written 15.0 years ago by Sridhara Gupta Kunjeti ▴ 320


Davis McCarthy ▴ 260

@davis-mccarthy-4138

Last seen 11.4 years ago

Hi Sridhara

This is not a problem as such; your issue will hopefully be solved with a little more explanation of how readDGE() (an edgeR function) and read.delim() (a base R function) work.

You will see in the documentation for readDGE()

> ?readDGE

that it accepts several named arguments and an argument '...', which indicates that any further arguments are passed to read.delim(). In your example from the User's Guide, 'skip=5' and 'comment.char="!"' are arguments that are passed to read.delim(). 'skip=x' skips the first x lines of each file being read in. 'comment.char="!"' skips (does not read in) any line beginning with '!'. These extra arguments were needed for the dataset used in the example in the User's Guide, but are not needed generally. That, perhaps, is not clear in the User's Guide (an overhaul thereof is on my TODO list).

You are missing one line even when you set 'skip=0' because one of the defaults for read.delim() is 'header=TRUE', which means that by default read.delim() assumes the first line of your file is a header and does not read it into the table in R as data. Unsurprisingly, you can change this behaviour by setting 'header=FALSE'. See

> ?read.delim

for more information about the use of read.delim().

For your example, the call

> d <- readDGE(targets, header=FALSE)

should get all of your data read into a DGEList object in your R session.

Hope that explains why your issue was arising and fixes it so that you can proceed with your analysis.

Best wishes
Davis

On Feb 17, 2011, at 1:47 PM, Sridhara Gupta Kunjeti wrote:
> [...]

--
Davis J McCarthy
Research Technician
Bioinformatics Division
Walter and Eliza Hall Institute of Medical Research
1G Royal Parade, Parkville, Vic 3052, Australia
http://www.wehi.edu.au
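The mechanism described in this answer can be sketched without edgeR. The wrapper `read_counts()` below is a hypothetical stand-in for readDGE(), written only to show how arguments supplied through `...` reach read.delim(); it is not edgeR's implementation, and the file contents are made up for illustration.

```r
# Hypothetical stand-in for readDGE(): anything given in `...`
# is handed straight on to read.delim().
read_counts <- function(file, ...) {
  read.delim(file, ..., stringsAsFactors = FALSE)
}

# Made-up count file: one "!" comment line, then three entries, no header line.
count_file <- tempfile(fileext = ".txt")
writeLines(c("!generated for illustration",
             "geneA\t170",
             "geneB\t5",
             "geneC\t7"), count_file)

# comment.char = "!" drops the comment line. With the default header = TRUE,
# the geneA line is then consumed as column names and is lost as data.
with_header <- read_counts(count_file, comment.char = "!")

# header = FALSE is forwarded through `...`, so geneA is kept as a data row.
no_header <- read_counts(count_file, comment.char = "!", header = FALSE)

print(nrow(with_header))  # rows kept when the first entry is treated as a header
print(nrow(no_header))    # rows kept when every entry is treated as data
print(no_header)
```

With edgeR installed, the corresponding call from the answer is `readDGE(targets, header=FALSE)`; whether `skip` or `comment.char` are also needed depends on what the count files actually contain.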

15.0 years ago Davis McCarthy ▴ 260


Hello Davis,

Yes, you have explained the issue and now it is fixed. Many thanks!

Sridhara

On Wed, Feb 16, 2011 at 11:03 PM, Davis McCarthy wrote:
> [...]

15.0 years ago Sridhara Gupta Kunjeti ▴ 320
