Ken Termiso
I apologize in advance if this is confusing...
When I use write.exprs (which, as I understand it, calls write.table) to write expression data to a text file, the output file has one fewer column name than it has columns: the probe ID column gets no name, and the remaining column names are shifted all the way to the left margin. When this text file is read into R with read.table(file="exprs.txt",header=TRUE), R converts it into a data frame and correctly uses the probeset IDs as row labels.
(The spacing may be a little off here, depending on the display font, but you can see that the probeset name is the row label.)
6187.CEL 6188.CEL 6189.CEL 6190.CEL 6191.CEL 6192.CEL
1007_s_at 8.779289 8.732751 8.822360 8.743272 8.768605 8.813886
1053_at 3.508310 3.389342 3.434458 3.410836 3.373940 3.387063
117_at 3.139897 3.105285 3.114203 3.131865 3.073855 3.038960
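The round trip described above can be reproduced with a small sketch. The matrix values, the two sample names, and the temporary file are made up for illustration, and write.table is called directly here on the assumption that it behaves the same way when write.exprs calls it.

```r
# Toy expression matrix; values and sample names are illustrative only.
ex <- matrix(c(8.78, 3.51, 3.14, 8.73, 3.39, 3.11), nrow = 3,
             dimnames = list(c("1007_s_at", "1053_at", "117_at"),
                             c("6187.CEL", "6188.CEL")))
f <- tempfile(fileext = ".txt")

# With the defaults row.names = TRUE and col.names = TRUE, write.table
# writes no header entry for the row-name column.
write.table(ex, file = f, quote = FALSE, sep = "\t")
readLines(f, n = 2)

# The header line has one fewer field than the data lines, so read.table
# takes the first column as row names.
back <- read.table(f, header = TRUE, sep = "\t", check.names = FALSE)
rownames(back)
```

check.names = FALSE is used only so that the sample names, which start with a digit, are not rewritten on import.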
However, with the limma toptables, every column has a name, including the probeset column ("ID"). When I write a toptable to a text file and then read it back into R, R treats the probeset IDs as a column of data (since it is labelled "ID") and adds row numbers to the data frame. This makes it difficult to do other operations (at least in my novice hands!!)
>tt[1:3,]
ID M A t P.Value B
1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
2 1053_at -0.053423214 3.417325 -1.60706334 0.9999627 -5.499340
3 117_at -0.038235209 3.100678 -1.42248721 0.9999627 -5.724391
If I open the toptable text file in Excel, delete the "ID" column name, and do not shift the other names over, this is what happens:
>tt_spc[1:3,]
X M A t P.Value B
1 1007_s_at -0.002879009 8.776694 -0.09459093 0.9999627 -6.721547
2 1053_at -0.053423210 3.417325 -1.60706300 0.9999627 -5.499340
3 117_at -0.038235210 3.100678 -1.42248700 0.9999627 -5.724391
R silently filled in "X" as the name of the now-unnamed column.
If I open the toptable file in Excel, delete the "ID" column name, shift the other column names one position to the left, and then read the text file into R, it looks perfect:
>tt_shft[1:3,]
M A t P.Value B
1007_s_at -0.00288 8.776694 -0.0946 0.9999627 -6.721547
1053_at -0.05340 3.417325 -1.6100 0.9999627 -5.499340
117_at -0.03820 3.100678 -1.4200 0.9999627 -5.724391
BUT, I don't want to have to edit each toptable file in Excel before re-opening it in R.
I also tried setting the column name to "", and also giving the toptable data frame a set of names without the ID, but neither worked: in both cases R filled in "NA" for the column name.
Is there any way to avoid editing the file in Excel, so that I can write it to a text file, read it back into R, and have it display the probeset names as the row labels?
I guess what I'm asking is this: is there a way for me to modify the toptable data frame so that the "ID" column name is removed and R uses the "ID" column as the row labels?
Thanks in advance,
-Ken
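
For reference, the transformation asked about above might be sketched as follows. This is only an illustration under stated assumptions: tt is a hand-built stand-in for a limma toptable (values rounded from the post, only three of the columns kept), the IDs are assumed to be unique, and the file names are temporary files.

```r
# Stand-in for a limma toptable; numbers are rounded from the post.
tt <- data.frame(ID = c("1007_s_at", "1053_at", "117_at"),
                 M = c(-0.0029, -0.0534, -0.0382),
                 A = c(8.7767, 3.4173, 3.1007),
                 stringsAsFactors = FALSE)

# Option 1: move the ID column into the row names before writing.
tt2 <- tt
rownames(tt2) <- tt2$ID   # requires unique IDs
tt2$ID <- NULL
f <- tempfile(fileext = ".txt")
write.table(tt2, file = f, quote = FALSE, sep = "\t")
back <- read.table(f, header = TRUE, sep = "\t")

# Option 2: write the table unchanged but without row numbers, and tell
# read.table which column holds the row labels.
g <- tempfile(fileext = ".txt")
write.table(tt, file = g, quote = FALSE, sep = "\t", row.names = FALSE)
back2 <- read.table(g, header = TRUE, sep = "\t", row.names = "ID")
```

Both routes should give a data frame whose row labels are the probeset IDs; the second leaves the file with a complete header line, which may be easier to open in other programs.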