reading a .psl file
Fahim Md ▴ 250
@fahim-md-4018
Last seen 9.7 years ago
Does anyone know how to read a .psl file (.psl is the BLAT output format) in R? The .psl layout is as follows (enclosed by #-lines):

########
psLayout version 3

match  mis-   rep.   N's  Q gap  Q gap  T gap  T gap  strand  Q     Q     Q      Q    T     T     T      T    block  blockSizes  qStarts  tStarts
       match  match       count  bases  count  bases          name  size  start  end  name  size  start  end  count
---------------------------------------------------------------------------------------------------------------------------------------------------------------
25  0  0  0  0  0  0  0  +  1367452_atAAAAAABBBBBBBBB  25  0  25  chr11  87759784  81380087  81380112  1  25,  0,  81380087,
----
----
---
#######

I don't want to use file() followed by incremental calls to readLines(). Does anyone know of a package that handles this format, or any alternative?

Thanks and regards
Fahim
@sean-davis-490
Last seen 4 months ago
United States
On Wed, May 12, 2010 at 10:22 PM, Fahim Md <fahim.md@gmail.com> wrote:
> I dont want to use file() followed by incremental readLines() function.
> Does anyone know of any package dealing with it or any alternative?

See read.table() and the skip argument.

Sean
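A minimal sketch of that suggestion, assuming the header always ends with a line of dashes as in the sample in the question. The file written here is only a stand-in built from the single record quoted in the thread, and `psl_path`, `n_header` and `hits` are illustrative names rather than part of any package.

```r
# Stand-in for a BLAT output file: five header lines, then one alignment.
psl_path <- tempfile(fileext = ".psl")
writeLines(c(
  "psLayout version 3",
  "",
  "match\tmis- \trep. \tN's\tQ gap\tQ gap\tT gap\tT gap\tstrand\tQ \tQ \tQ \tQ \tT \tT \tT \tT \tblock\tblockSizes \tqStarts\t tStarts",
  " \tmatch\tmatch\t \tcount\tbases\tcount\tbases\t \tname \tsize\tstart\tend\tname \tsize\tstart\tend\tcount",
  strrep("-", 60),
  "25\t0\t0\t0\t0\t0\t0\t0\t+\t1367452_atAAAAAABBBBBBBBB\t25\t0\t25\tchr11\t87759784\t81380087\t81380112\t1\t25,\t0,\t81380087,"
), psl_path)

# Locate the dashed separator among the first few lines (assumption: it is
# the last header line), so the skip count is not hard-coded.
n_header <- grep("^-{5,}$", readLines(psl_path, n = 10))[1]

# Skip the header and read the tab-separated alignment rows.
# quote = "" and comment.char = "" keep stray quotes or '#' in sequence
# names from being interpreted by read.table().
hits <- read.table(psl_path, header = FALSE, sep = "\t", skip = n_header,
                   quote = "", comment.char = "", stringsAsFactors = FALSE)
dim(hits)
```

Peeking at the first ten lines costs one small readLines() call, but it avoids guessing how many header lines a given BLAT run produced.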
@sean-davis-490
Last seen 4 months ago
United States
On Thu, May 13, 2010 at 12:31 PM, Fahim Md <fahim.md@gmail.com> wrote:
> The data is in output file, so I need to have at least one argument to read
> that file. Also I think read.table() needs the number of columns in all the
> rows to be same which is not true with .psl files. The following output
> shows that:
>
> 1> dataTable = read.table('output.psl');
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   line 1 did not have 25 elements
> In addition: Warning message:
> In read.table("output.psl") :
>   incomplete final line found by readTableHeader on 'output.psl'

Hello, Fahim.

Please keep the replies on the list so you get the best help possible and everyone can learn from the answers. You really need to read the help for read.table(). In particular, you will want to know what the sep, skip, nlines, and fill arguments do.

Sean

> Do you know any method through which I can read a block of lines in a
> file (say from line 50 to line 100)? That will solve my case.
> The readLines() function reads the first n lines. For example:
>
> R> x<-readLines(dataFile,n=6)
> 1> x
> [1] "psLayout version 3"
> [2] ""
> [3] "match\tmis- \trep. \tN's\tQ gap\tQ gap\tT gap\tT gap\tstrand\tQ \tQ \tQ \tQ \tT \tT \tT \tT \tblock\tblockSizes \tqStarts\t tStarts"
> [4] " \tmatch\tmatch\t \tcount\tbases\tcount\tbases\t \tname \tsize\tstart\tend\tname \tsize\tstart\tend\tcount"
> [5] "---------------------------------------------------------------------------------------------------------------------------------------------------------------"
> [6] "25\t0\t0\t0\t0\t0\t0\t0\t+\t1367452_atAAAAAABBBBBBBBB\t25\t0\t25\tchr11\t87759784\t81380087\t81380112\t1\t25,\t0,\t81380087,"
>
> Thanks and regards.
> Fahim
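For the "line 50 to line 100" question, here is a hedged sketch of how those arguments could be combined. One caveat: in read.table() the row limit is spelled `nrows`; `nlines` is the corresponding argument of the underlying scan(). The 120-line file and the names `path`, `block`, `ragged` and `filled` are made up for illustration.

```r
# Illustrative input: 120 tab-separated lines, numbered so the result is
# easy to check.
path <- tempfile()
writeLines(sprintf("%d\trow%d", 1:120, 1:120), path)

first_line <- 50
last_line  <- 100

# skip drops the lines before the block; nrows caps how many rows are read.
block <- read.table(path, header = FALSE, sep = "\t",
                    skip = first_line - 1,
                    nrows = last_line - first_line + 1,
                    stringsAsFactors = FALSE)
range(block[[1]])

# fill = TRUE pads short rows with blank fields instead of raising the
# "did not have N elements" error.
ragged <- tempfile()
writeLines(c("a\tb\tc", "d\te"), ragged)
filled <- read.table(ragged, header = FALSE, sep = "\t", fill = TRUE,
                     stringsAsFactors = FALSE)
dim(filled)
```

Note that skip counts physical lines, so blank lines inside the skipped region are counted too.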
I am using

dataFile = read.table('output.psl', header = FALSE, sep = "\t", skip = 5)

Thanks a lot
Fahim

--
Mohammad Fahim
Louisville, KY, USA
Ph: +1-502-409-1167
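Building on that call, the sketch below attaches column names and unpacks the comma-separated block fields. The names in `psl_cols` follow the UCSC description of the PSL format, not anything stated in this thread, and `read_psl()` is a hypothetical helper written for this example, not a function from an existing package. The five placeholder header lines stand in for the real psLayout header.

```r
# Column names as given in the UCSC PSL format description (an assumption
# brought in from outside this thread).
psl_cols <- c("matches", "misMatches", "repMatches", "nCount",
              "qNumInsert", "qBaseInsert", "tNumInsert", "tBaseInsert",
              "strand", "qName", "qSize", "qStart", "qEnd",
              "tName", "tSize", "tStart", "tEnd",
              "blockCount", "blockSizes", "qStarts", "tStarts")

# Hypothetical helper: the same read.table() call as above, plus names.
read_psl <- function(path, skip = 5) {
  read.table(path, header = FALSE, sep = "\t", skip = skip,
             col.names = psl_cols, quote = "", comment.char = "",
             stringsAsFactors = FALSE)
}

# Demo file: five placeholder header lines and the record from the question.
demo_path <- tempfile(fileext = ".psl")
writeLines(c(
  rep("header", 5),
  "25\t0\t0\t0\t0\t0\t0\t0\t+\t1367452_atAAAAAABBBBBBBBB\t25\t0\t25\tchr11\t87759784\t81380087\t81380112\t1\t25,\t0,\t81380087,"
), demo_path)

psl <- read_psl(demo_path)

# blockSizes, qStarts and tStarts are comma-terminated lists; split each
# into an integer vector per alignment.
block_sizes <- lapply(strsplit(psl$blockSizes, ","), as.integer)
psl[, c("qName", "tName", "tStart", "tEnd")]
```

Named columns make later filtering readable, for example `psl[psl$matches == psl$qSize, ]` to keep full-length perfect matches.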