Question

reading text file in R

niutster • 0

Hi,

I have 6 text files containing the chromosome name, start, end, probe name, and CpG number. I want to read these files into R and then process the data. The commands I used are below:

my_data <- read.delim("Chr.txt",header=TRUE, sep="\t")
d=as.numeric(my_data[1,1])

> my_data[1,1]
[1] 13
Levels: 1 10 11 12 13 14 15 16 17 18 19 2 20 21 22 3 4 5 6 7 8 9 X Y
> d
[1] 5

There are two problems:

1. It does not convert my_data[1,1] correctly.

2. Chr.txt starts with "17", "13", ..., so the first element should be "17", not "13".

warning text read.delim • 2.1k views

updated 6.8 years ago by Gordon Smyth 50k • written 6.8 years ago by niutster • 0

score 0 · Answer 1 · 2017-07-23

We would know for sure if you showed us some lines from your text file, but I suspect that if you use:

my_data <- read.delim("Chr.txt", header=FALSE, sep="\t", stringsAsFactors=FALSE)

then you will find that it reads the file as you expect.

You're getting confused at the moment for two reasons: (1) the chromosome column has been converted into a factor, and (2) I suspect your file doesn't actually have a line of column headings, so header=TRUE has used the first data line (the one starting with "17") as the column names.
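The header effect can be reproduced with a small stand-in file. This is only a sketch: the file contents, probe names, and the temporary file are made up for illustration and are assumed to resemble Chr.txt (tab-separated, no header line).

```r
# Hypothetical stand-in for Chr.txt: tab-separated, no line of column headings.
tmp <- tempfile(fileext = ".txt")
writeLines(c("17\t100\t200\tprobeA\t3",
             "13\t300\t400\tprobeB\t5",
             "X\t500\t600\tprobeC\t2"), tmp)

# header=TRUE treats the first data line as column names, so it is lost as data.
with_header <- read.delim(tmp, header = TRUE, sep = "\t",
                          stringsAsFactors = FALSE)

# header=FALSE keeps every line as data; columns are named V1, V2, ...
no_header <- read.delim(tmp, header = FALSE, sep = "\t",
                        stringsAsFactors = FALSE)

nrow(with_header)   # 2 rows: the "17" line became the column names
nrow(no_header)     # 3 rows
no_header[1, 1]     # "17", stored as character
```

If the real file does have a heading line, header=TRUE is the right choice and only stringsAsFactors=FALSE is needed.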

Actually, read.delim() has read my_data[1,1] correctly. You will notice that "13" is the 5th possible value that my_data[,1] can take according to the list of Levels. When you coerce my_data[1,1] to numeric (why did you do that?), you get the position of the element's level in the ordered list of levels (the 5th level), not the label of that level ("13"). That's why you get d=5. If you wanted the actual level, you would use as.character(my_data[1,1]) instead. Note that chromosome names are best stored as character rather than numeric because "X" and "Y" are possible values.
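The factor behaviour is easy to see on a small vector. The values below are made up for illustration and are not taken from Chr.txt.

```r
# A small factor of chromosome names; levels are sorted as strings.
chr <- factor(c("13", "17", "X", "1", "10", "11", "12"))

levels(chr)            # "1" "10" "11" "12" "13" "17" "X"

# as.numeric() on a factor returns the integer level code, not the label.
code <- as.numeric(chr[1])     # 5, because "13" is the 5th level

# as.character() returns the label itself.
label <- as.character(chr[1])  # "13"
```

Converting the label with as.numeric(as.character(...)) would work for "13" but gives NA (with a warning) for "X" and "Y", which is another reason to keep the column as character.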