[Bioc-sig-seq] extract id from ShortRead
Ramzi TEMANNI ▴ 160
@ramzi-temanni-3819
You are right, Sean. That's what is needed, thanks.

The idea was to generate mate pairs from single reads by combining reads with the same id. I extract the read id 8_1_3_659 from the full id HWI-EA332_8_1_3_659#GGGGNN/1, and

substr(id(aln.data), 11, nchar(id(aln.data))-9)

does that.

Thanks again for your help.
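For readers following along, the positional extraction described above can be sketched in base R on a plain character vector. The ids below are copied from the thread; in practice they would come from as.character(id(aln)). The regular-expression variant is an assumption added for illustration, not something proposed in the thread; it avoids hard-coding the prefix and suffix lengths.

```r
# Example ids copied from the thread; in a real session they would come from
# as.character(id(aln)).
ids <- c("HWI-EA332_8_1_3_659#GGGGNN/1",
         "HWI-EA332_8_1_3_1738#CCCCNN/1",
         "HWI-EA332_8_120_1474_1350#CCTGNC/1")

# Positional approach from the post: skip the 10-character prefix
# "HWI-EA332_" and drop the 9-character suffix such as "#GGGGNN/1".
read_id_substr <- substr(ids, 11, nchar(ids) - 9)

# Pattern-based alternative (an assumption, not from the thread): keep what
# lies between the first underscore and the "#".
read_id_regex <- sub("^[^_]*_(.*)#.*$", "\\1", ids)

print(read_id_substr)
print(read_id_regex)
```

The positional version only works while every id shares the same prefix and suffix lengths, which holds for the ids shown in this thread but may not hold for other runs.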
@sean-davis-490
On Mon, Nov 30, 2009 at 10:20 AM, Ramzi TEMANNI <ramzi.temanni at gmail.com> wrote:
> Hi Sean,
> Thanks for your help as.character give 2 ids by row:
> [1] "HWI-EA332_8_1_3_659#GGGGNN/1"  "HWI-EA332_8_1_3_1738#CCCCNN/1"
> [3] "HWI-EA332_8_1_3_1094#AGGANN/1" "HWI-EA332_8_1_3_558#TTTCNN/1"
> [5] "HWI-EA332_8_1_3_1920#AAAANN/1" "HWI-EA332_8_1_3_228#GGGGNN/1"
> it should be some accessors to extract one Id by row.
> i've take a look at suggested help but there's no useful info to extract
> what I want

This is NOT two ids per row. It is one vector. R outputs two elements per row only because your screen is wide enough for that.

If you do:

tmp <- as.character(id(aln))
class(tmp)
tmp[1]
tmp[2]
tmp[1:5]
length(tmp)

it might give you more of an idea what is going on.

Sean

>
> On Mon, Nov 30, 2009 at 3:40 PM, Sean Davis <seandavi at gmail.com> wrote:
>>
>> On Mon, Nov 30, 2009 at 9:27 AM, Ramzi TEMANNI <ramzi.temanni at gmail.com> wrote:
>> > Hi,
>> > I have a sequence loaded from bowtie alignment
>> > aln <- readAligned("./S1", pattern="S1_1.hg19.bowtie.align", type="Bowtie")
>> > I would like to to extract the id to select specific reads
>> > I run id(aln) and I get:
>> > id(aln)
>> >   A BStringSet instance of length 4340867
>> >           width seq
>> >       [1]    28 HWI-EA332_8_1_3_659#GGGGNN/1
>> >       [2]    29 HWI-EA332_8_1_3_1738#CCCCNN/1
>> >       [3]    29 HWI-EA332_8_1_3_1094#AGGANN/1
>> >       [4]    28 HWI-EA332_8_1_3_558#TTTCNN/1
>> >       [5]    29 HWI-EA332_8_1_3_1920#AAAANN/1
>> >       [6]    28 HWI-EA332_8_1_3_228#GGGGNN/1
>> >       [7]    29 HWI-EA332_8_1_3_1261#AGGGNN/1
>> >       [8]    28 HWI-EA332_8_1_3_908#ACTTNN/1
>> >       [9]    27 HWI-EA332_8_1_3_53#CTGCNN/1
>> >       ...   ... ...
>> > [4340859]    33 HWI-EA332_8_120_1596_499#TTGANA/1
>> > [4340860]    34 HWI-EA332_8_120_1599_1161#CCACNT/1
>> > [4340861]    33 HWI-EA332_8_120_1601_255#CTCTNA/1
>> > [4340862]    33 HWI-EA332_8_120_1601_504#CCATNC/1
>> > [4340863]    33 HWI-EA332_8_120_1624_899#CTCTNT/1
>> > [4340864]    33 HWI-EA332_8_120_1487_658#ACCCNA/1
>> > [4340865]    32 HWI-EA332_8_120_1533_28#CACANG/1
>> > [4340866]    33 HWI-EA332_8_120_1564_807#CCCGNG/1
>> > [4340867]    34 HWI-EA332_8_120_1474_1350#CCTGNC/1
>> >
>> > This BStringSet instance has 'width' and 'seq'
>> > runing str(id(aln)) i got this
>> >
>> > Formal class 'BStringSet' [package "Biostrings"] with 5 slots
>> >   ..@ pool           :Formal class 'SharedRaw_Pool' [package "IRanges"] with 2 slots
>> >   .. .. ..@ xp_list                    :List of 1
>> >   .. .. .. ..$ :<externalptr>
>> >   .. .. ..@ .link_to_cached_object_list:List of 1
>> >   .. .. .. ..$ :<environment: 0x2af6400>
>> >   ..@ ranges         :Formal class 'GroupedIRanges' [package "IRanges"] with 7 slots
>> >   .. .. ..@ group          : int [1:4340867] 1 1 1 1 1 1 1 1 1 1 ...
>> >   .. .. ..@ start          : int [1:4340867] 1 29 58 87 115 144 172 201 229 256 ...
>> >   .. .. ..@ width          : int [1:4340867] 28 29 29 28 29 28 29 28 27 29 ...
>> >   .. .. ..@ NAMES          : NULL
>> >   .. .. ..@ elementMetadata: NULL
>> >   .. .. ..@ elementType    : chr "integer"
>> >   .. .. ..@ metadata       : list()
>> >   ..@ elementMetadata: NULL
>> >   ..@ elementType    : chr "BString"
>> >   ..@ metadata       : list()
>> >
>> > But i'm wondering how to extract only the 'seq' from all that and store
>> > result in a table ?
>>
>> as.character(id(aln))
>>
>> will return a character vector of the names. You might want to look
>> at the help for AlignedRead-class and BStringSet-class for some help
>> in understanding these classes and what can be done with them. It may
>> be that you will not need to go to character vector to do what you
>> want with the reads.
>>
>> Sean
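A small base R sketch of the point made in this answer: the ids form a single character vector, and the two-per-row layout is only how print() wraps it. The vector tmp below is built by hand from ids shown in the thread and stands in for as.character(id(aln)); the data frame name id_table is hypothetical.

```r
# Hand-built stand-in for as.character(id(aln)), using ids from the thread.
tmp <- c("HWI-EA332_8_1_3_659#GGGGNN/1",
         "HWI-EA332_8_1_3_1738#CCCCNN/1",
         "HWI-EA332_8_1_3_1094#AGGANN/1",
         "HWI-EA332_8_1_3_558#TTTCNN/1")

# One vector with one id per element, however print() lays it out.
class(tmp)
length(tmp)
tmp[2]

# Write one id per line if that layout is easier to read.
writeLines(tmp)

# Store the ids in a one-column table (id_table is a hypothetical name).
id_table <- data.frame(id = tmp, stringsAsFactors = FALSE)
print(id_table)
```

The bracketed numbers that print() shows at the start of each row, such as [1] and [3], are the index of the first element on that row, which is why they skip when two ids fit on one line.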