gff files: how to tell if right-open interval convention used?
@karlerhardberkeleyedu-4569
Last seen 9.9 years ago
Hi all,

I'm a grad student at UC Berkeley. I'm new to the list, and to R in general, so I hope you'll forgive my simplistic questions.

I'm working with the girafe package to generate a counts table that can be input into edgeR. I've noticed that the readGff3 function is sensitive to whether the GFF file being read uses the "right-open interval convention" or not. I'm just not sure how to tell whether the GFF file I am using follows this convention. Is there a simple way to find out?

Any help on this would be greatly appreciated.

best,
karl
edgeR girafe • 1.3k views
@herve-pages-1542
Last seen 2 days ago
Seattle, WA, United States
Hi karl,

On 03/30/2011 01:58 PM, karlerhard at berkeley.edu wrote:
> I've noticed that the readGff3 function is sensitive to whether the gff file being read uses this "right-open interval convention" or not. I'm just not sure how to tell if the gff file I am using follows this convention. Is there a simple way to find out?

Valid GFF3 files should never use the "right-open interval convention". Starts and ends are always 1-based and inclusive:

http://www.sequenceontology.org/gff3.shtml

Look at the first line in your file. If you see:

  ##gff-version 3

then the file claims to adhere to the GFF3 specs. Of course, you can never be 100% sure. How much do you trust the tool that generated your GFF3 file? Is it documented? Do you have access to its source code?

Cheers,
H.

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
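The two checks described above (the version pragma on the first line, and coordinates that look 1-based) can be scripted outside R as a quick sanity test. The following is only a rough sketch in Python; the function name `check_gff3` and the sample records are made up for illustration, and it assumes tab-separated feature lines with integer start and end columns. Passing the check does not prove a file is 1-based: a 0-based file with no feature starting at position 0 would pass too.

```python
def check_gff3(lines):
    """Scan GFF3 text lines; return (claims_gff3, suspicious).

    claims_gff3 is True when the first line is a '##gff-version 3' pragma.
    suspicious lists (line_number, start, end) for feature lines whose
    coordinates cannot be valid 1-based closed intervals.
    """
    claims_gff3 = False
    suspicious = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if number == 1:
            parts = line.split()
            claims_gff3 = (
                len(parts) >= 2
                and parts[0] == "##gff-version"
                and parts[1].startswith("3")
            )
        if not line or line.startswith("#"):
            continue  # skip blank lines, pragmas and comments
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a 9-column feature line; ignored in this sketch
        start, end = int(fields[3]), int(fields[4])
        # In 1-based closed coordinates, start >= 1 and start <= end.
        if start < 1 or start > end:
            suspicious.append((number, start, end))
    return claims_gff3, suspicious


# Hypothetical records: the exon on line 3 starts at 0, which is
# impossible in valid GFF3 and hints at 0-based coordinates.
example = [
    "##gff-version 3\n",
    "chr1\tdemo\tgene\t1\t100\t.\t+\t.\tID=gene1\n",
    "chr1\tdemo\texon\t0\t50\t.\t+\t.\tID=exon1\n",
]
claims_gff3, suspicious = check_gff3(example)
print(claims_gff3, suspicious)  # True [(3, 0, 50)]
```

To run it on a real file, pass an open file handle in place of the list, e.g. `check_gff3(open("annotation.gff3"))`, where the file name is a placeholder.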
Chris Fields ▴ 90
@chris-fields-4329
Last seen 2.3 years ago
United States
karl,

GFF should always be 1-based closed, not 0-based right-open (unlike the BED format). I think this convention goes back to the original version of GFF from Sanger and holds up to the latest version, GFF3. So it probably comes down to whether the tool that produced your GFF output is generating the correct coordinates, not how R/BioC is processing it. The exception would be if the girafe method in question also allows BED output to be read (I would consider that bad).

chris

On Mar 30, 2011, at 3:58 PM, karlerhard at berkeley.edu wrote:
> I've noticed that the readGff3 function is sensitive to whether the gff file being read uses this "right-open interval convention" or not. I'm just not sure how to tell if the gff file I am using follows this convention. Is there a simple way to find out?
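To make the difference between the two conventions in this reply concrete: the same stretch of sequence is written with a start one lower in 0-based right-open (BED-style) coordinates than in 1-based closed (GFF-style) coordinates, while the end value is the same number. Here is a small illustrative sketch in Python; the helper names are invented for this example and are not part of girafe or any other package.

```python
def bed_to_gff(start, end):
    """Convert a 0-based right-open interval to 1-based closed."""
    return start + 1, end


def gff_to_bed(start, end):
    """Convert a 1-based closed interval to 0-based right-open."""
    return start - 1, end


def bed_length(start, end):
    """Number of bases covered by a 0-based right-open interval."""
    return end - start


def gff_length(start, end):
    """Number of bases covered by a 1-based closed interval."""
    return end - start + 1


# The first 100 bases of a sequence in both conventions.
bed_interval = (0, 100)
gff_interval = bed_to_gff(*bed_interval)
print(gff_interval)  # (1, 100)
print(bed_length(*bed_interval), gff_length(*gff_interval))  # 100 100
```

Reading a file under the wrong convention shifts every interval boundary by one base, which is why it matters to know what the generating tool actually wrote.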