detroit.drive
I'm using the ALL dataset from Bioconductor. My task is to convert the "BT" attribute into a categorical variable with just two levels, "B" and "T", then test all genes for significant association with B/T disease subtype using ANOVA, and finally plot the p-values.
As a first attempt I ran the following:
> bcell = grep("^B", as.factor(ALL$BT))
> tcell = grep("^T", as.factor(ALL$BT))
Then I adapted a snippet of code that I've also used for age:
> anova.lm.bcell <- function(x) {
+ df.tmp <-data.frame(Expr=x,bcell)
+ anova(lm(bcell~Expr,df.tmp)) ["Expr", "Pr(>F)"]
+ }
But when I test each gene for association with B-cell status (from BT) using the code below, I get the dreaded "differing number of rows" error:
> p.bcell <- apply(exprs(ALL), 1, anova.lm.bcell)
Error in data.frame(Expr = x, bcell) :
arguments imply differing number of rows: 128, 95
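The mismatch itself makes sense: I have 95 "B" samples and 33 "T" samples, so bcell holds only 95 entries while each gene has 128 expression values. However, I can't work out how to write a script that works.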
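For reference, here is a minimal sketch of one way the mismatch might be avoided. It rests on the observation that grep() returns the positions of the matching samples (95 of them) rather than one label per sample. To stay runnable without Bioconductor it uses simulated stand-ins: `expr` plays the role of exprs(ALL) and `bt_raw` the role of ALL$BT, and the subtype labels (B1, T2, ...) and the 50-gene matrix are assumptions made for illustration only. The names `bt2`, `anova.lm.bt` and `p.bt` are likewise hypothetical.

```r
# Simulated stand-ins (assumption): 128 samples, 95 "B*" and 33 "T*" labels,
# mimicking the counts reported in the error message above.
set.seed(1)
bt_raw <- factor(c(rep(c("B", "B1", "B2", "B3", "B4"), length.out = 95),
                   rep(c("T", "T1", "T2", "T3", "T4"), length.out = 33)))
expr <- matrix(rnorm(50 * 128), nrow = 50,
               dimnames = list(paste0("gene", 1:50), NULL))

# Collapse the subtypes to a two-level factor with ONE entry per sample,
# instead of a vector of matching indices as returned by grep().
bt2 <- factor(substr(as.character(bt_raw), 1, 1), levels = c("B", "T"))

# One-way ANOVA per gene: expression is the response, B/T is the grouping factor.
anova.lm.bt <- function(x) {
  df.tmp <- data.frame(Expr = x, BT = bt2)
  anova(lm(Expr ~ BT, data = df.tmp))["BT", "Pr(>F)"]
}

# One p-value per gene (row), then a histogram of the p-values.
p.bt <- apply(expr, 1, anova.lm.bt)
hist(p.bt, breaks = 20, main = "ANOVA p-values, B vs T", xlab = "p-value")
```

With the real data, the same idea would presumably apply by replacing `expr` with exprs(ALL) and `bt_raw` with ALL$BT. Note that the model here is Expr ~ BT, so the gene's expression is the response and the B/T label is the explanatory factor.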