I'm new to ALDEx2 and have a very simple question about how to create the aldex.clr object described here https://www.bioconductor.org/packages/devel/bioc/manuals/ALDEx2/man/ALDEx2.pdf from a data frame. The documentation provides this example:
# The 'reads' data.frame or
# RangedSummarizedExperiment object should
# have row and column names that are unique,
# and looks like the following:
#
# T1a T1b T2 T3 N1 N2 Nx
# Gene_00001 0 0 2 0 0 1 0
# Gene_00002 20 8 12 5 19 26 14
# Gene_00003 3 0 2 0 0 0 1
# Gene_00004 75 84 241 149 271 257 188
# Gene_00005 10 16 4 0 4 10 10
# Gene_00006 129 126 451 223 243 149 209
# ... many more rows ...
data(selex)
#subset for efficiency
selex <- selex[1201:1600,]
conds <- c(rep("NS", 7), rep("S", 7))
x <- aldex.clr(selex, conds, mc.samples=2, denom="all", verbose=FALSE)
In my case, I need to load the selex data like this:
reads_df <- read.table(file="~/selex.txt", header=TRUE, sep="\t", dec=".", as.is=FALSE);
We now have the reads_df:
head(reads_df)
X X1_ANS X1_BNS X1_CNS X1_DNS X2_ANS X2_CNS X2_DNS X1_AS X1_BS X1_CS
1 S:D:A:D 524 355 443 489 465 509 754 0 0 0
2 S:D:A:E 588 383 564 462 559 564 961 5 5 11
3 S:E:A:D 596 318 542 443 605 459 1022 77 44 8
4 S:E:A:E 535 352 549 514 555 465 1476 718 168 76
5 S:D:C:D 218 104 192 193 177 190 709 0 0 0
6 S:D:C:E 269 180 151 234 281 269 467 1 0 0
X1_DS X2_AS X2_CS X2_DS
1 13 675 1 4
2 437 10 4 1
3 12 4 2 89
4 459 10 31 5
5 0 1 0 0
6 4 0 0 0
Here are the column types:
sapply(reads_df, class)
"factor" "integer" "integer" "integer" "integer" "integer" "integer" "integer" "integer" "integer" "integer" "integer" "integer" "integer" "integer"
The first column is the “feature” column, which holds strings (read in as a factor) rather than integers, so attempting to create the aldex.clr object throws an error:
conds <- c(rep("NS", 7), rep("S", 7));
aldex_clr_df <- aldex.clr(reads_df, conds=conds, mc.samples=128, denom="all");
Error in FUN(newX[, i], ...) : invalid 'type' (character) of argument
Calls: aldex.clr -> aldex.clr -> aldex.clr.function -> apply
Execution halted
I'm sure that I must be missing something simple here, but I'm not quite sure what. I so appreciate any help with this.
Thanks!
I have no experience with ALDEx2, but based on what you show regarding expected input, and your input + error, I would say that the rownames of your data (a data.frame?) should be the content of the 1st column (X) (and not 1, 2, 3 etc.). Next you should remove this column X from your data. Something like:
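A minimal sketch of that idea (not necessarily the exact code originally posted), assuming the data frame is called reads_df as in the question. The toy data frame below only reuses the first three rows and two sample columns shown in the question's head() output:

```r
# Toy data frame built from the first rows/columns shown in the question;
# the real reads_df comes from read.table() and has 14 sample columns.
reads_df <- data.frame(
  X      = c("S:D:A:D", "S:D:A:E", "S:E:A:D"),
  X1_ANS = c(524L, 588L, 596L),
  X1_BNS = c(355L, 383L, 318L),
  stringsAsFactors = TRUE  # mimics the factor column reported by sapply()
)

# Use the feature names in column X as row names ...
rownames(reads_df) <- as.character(reads_df$X)

# ... then drop column X so only the integer count columns remain.
reads_df$X <- NULL
```

After these two steps every remaining column is an integer count column, and the feature names are carried by the row names instead of the default 1, 2, 3.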
Also, please realize that the use of colons (:) in names is syntactically not valid in R. You should replace them. See for example ?make.names.
Thanks for your response on this. Yes, reads_df is a data frame, created with this code:
Here are a few lines of the selex.txt file, with the first line being the header:
The values in the first column are character strings - from the aldex.clr perspective, I believe these are the features. So the colons here are simply part of the string.
The row numbers are a result of printing the data frame to show what it contains. The first column in the data frame is actually the X column containing the strings. The call to the function aldex.clr() is failing because it expects integers, and so doesn't handle the character column.
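The coercion behind that error can be reproduced in base R. This is only a sketch: it assumes, based on the traceback (aldex.clr.function -> apply), that the data frame is handed to apply(), which converts it to a matrix first. The sum() call is a stand-in for whatever numeric function ALDEx2 actually applies, and the toy data reuses two rows from the head() output above:

```r
# Toy data frame: one factor column of feature names plus integer counts.
reads_df <- data.frame(
  X      = factor(c("S:D:A:D", "S:D:A:E")),
  X1_ANS = c(524L, 588L),
  X1_BNS = c(355L, 383L)
)

# A matrix can hold only one type, so a single non-numeric column
# turns every cell, including the counts, into a character string.
m <- as.matrix(reads_df)
matrix_type <- typeof(m)

# apply() performs the same conversion internally, so a numeric
# function applied per column fails on the character data.
res <- tryCatch(
  apply(reads_df, 2, sum),
  error = function(e) conditionMessage(e)
)
```

Here matrix_type is "character", and res holds the error message rather than column sums, which is consistent with the "invalid 'type' (character) of argument" error reported above.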
I've tried removing the first column:
The aldex.clr object is built:
But the features are now incorrect:
So I'm wondering how to load the data frame in such a way that the contents mimic what is being done in the documentation example:
I am not sure if I fully got your question...
After saving the 4x14 table from your 2nd post in a txt file named select.txt, I am able to obtain the expected results by assigning the rownames to be the content of the first column; just add the argument row.names=1 when calling read.table.
Thanks so much for this simple fix. The row.names argument is what was missing from my read.table() call. This code now works as expected.
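For anyone landing here later, the fix can be sketched as follows. The temporary file and its three rows are stand-ins built from the head() output in the question (the real input is the tab-separated selex.txt), and the ALDEx2 call is left commented out because it needs the full 14-sample table:

```r
# Write a tiny tab-separated stand-in for selex.txt to a temporary file.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "X\tX1_ANS\tX1_BNS",
  "S:D:A:D\t524\t355",
  "S:D:A:E\t588\t383",
  "S:E:A:D\t596\t318"
), tmp)

# row.names = 1 tells read.table() to use the first column as row names
# and to leave it out of the data columns.
reads_df <- read.table(file = tmp, header = TRUE, sep = "\t", row.names = 1)

# With the full data set, the call from the question should then apply:
# library(ALDEx2)
# conds <- c(rep("NS", 7), rep("S", 7))
# aldex_clr_df <- aldex.clr(reads_df, conds = conds, mc.samples = 128, denom = "all")
```

The resulting data frame has the feature strings as row names and only integer columns, which matches the layout shown in the documentation example.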