colData: use only certain values of a variable
Bine ▴ 20
@bine-23912
UK

Dear all,

I have been trying many things now, but I think there must be an easier way for my quite simple problem.

My colData has Sample.Site = Heart, Liver, ...

Now for my analysis I just want to compare males and females within Sample.Site = Heart. How can I filter the colData so that I only have Sample.Site = Heart?

Thank you so much for any idea, Bine

DESeq2 • 468 views
@james-w-macdonald-5106
United States

The SummarizedExperiment class works just like a data.frame or matrix when you use the [ operator, so you would do just what one would expect. Using the example object from ?SummarizedExperiment:

> colData(se)
DataFrame with 6 rows and 1 column
    Treatment
  <character>
A        ChIP
B       Input
C        ChIP
D       Input
E        ChIP
F       Input

> se_sub <- se[,colData(se)$Treatment == "Input"]
> se_sub
class: RangedSummarizedExperiment
dim: 200 3
metadata(0):
assays(1): counts
rownames: NULL
rowData names(1): feature_id
colnames(3): B D F
colData names(1): Treatment

> colData(se_sub)
DataFrame with 3 rows and 1 column
    Treatment
  <character>
B       Input
D       Input
F       Input

You can also make different assumptions, such as assuming (or, like, checking) that the within-group variability is pretty consistent across groups, and just fitting the model in such a way that you can make the between-sex comparisons for heart using all your data. Or alternatively, you can specify a design that stratifies your model fit internally, using the / operator, so something like

design <- model.matrix(~Sample.Site/sex + othercovariates - 1, colData(se))

So an explicit stratification of your data is not always necessary or desirable.

Ah, I thought there was a shortcut... You don't need to extract the colData to subset

> se_sub <- se[,se$Treatment == "Input"]

Does the same thing

Thank you. I wonder how I would then remove the samples that are not from heart from the count data?

Please re-read my answer more carefully. It makes no sense to have a function that subsets one part of the SummarizedExperiment object and leaves another part unchanged. The whole idea behind encapsulating the data in the SummarizedExperiment object is to allow end users to be able to easily subset the object without having to worry if the colData still match up with the columns of the assays, or if the rowRanges still match up with the rows of the assays.
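
To make the point above concrete, here is a minimal sketch on a made-up object (the counts, the sample names S1 to S6, and the `sex` column are assumptions for illustration, not the poster's data). It shows that a single `[` call drops the matching columns from both the assay and the colData:

```r
library(SummarizedExperiment)

# Hypothetical toy data: 4 genes x 6 samples, three per site.
counts <- matrix(seq_len(24), nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("S", 1:6)))
cd <- DataFrame(Sample.Site = rep(c("Heart", "Liver"), each = 3),
                sex = rep(c("M", "F"), times = 3),
                row.names = colnames(counts))
se <- SummarizedExperiment(assays = list(counts = counts), colData = cd)

# One subsetting call; assay and colData are subset together.
heart <- se[, se$Sample.Site == "Heart"]

dim(assay(heart, "counts"))   # the count matrix now has only the heart columns
colData(heart)                # and the sample table has only the heart rows

# The sample names still line up between the two parts.
identical(colnames(assay(heart, "counts")), rownames(colData(heart)))
```

A DESeqDataSet extends SummarizedExperiment, so the same subsetting should apply to it as well.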

OK, so you are saying that this way my count data is already taken care of and I do not need to manipulate it separately. Sorry, I am still very new to SummarizedExperiment objects.

There is a vignette for the SummarizedExperiment package. I often make the point that Open Source software is free in the sense that you don't have to pay money, but there is a cost in your time and effort to understand how the software works. If you plan on using R/Bioconductor to any extent, then you will need to get accustomed to seeking out and reading the information that is made available to you, and the vignette is the very first thing you should read. Do note that the point I make above is in the second paragraph of the introduction! So you wouldn't have to read far to already know this.

Thank you, got it. Just confirming for another person who might read this, my last comment is correct.

One more question: since you said it works like a data.frame, I assume I can do this in case I want to use two sites, heart + lung:

dds0_sub <- dds0[,dds0$Sample.Site == "heart"]
dds0_sub_1 <- dds0[,dds0$Sample.Site == "lung"]

dds0_sub <- cbind(dds0_sub, dds0_sub_1)

Or, rather just:

ddsub <- dds0[, dds0$Sample.Site %in% c("heart", "lung")]
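
For anyone comparing the two approaches above, a small sketch on a hypothetical toy object (made-up counts and sample names, with the sites deliberately interleaved) suggests they select the same samples but can differ in column order:

```r
library(SummarizedExperiment)

# Hypothetical toy object with interleaved sites; not the poster's data.
counts <- matrix(seq_len(18), nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("S", 1:6)))
se <- SummarizedExperiment(
  assays = list(counts = counts),
  colData = DataFrame(Sample.Site = rep(c("heart", "lung", "liver"), times = 2),
                      row.names = colnames(counts))
)

# Single logical index using %in%.
by_in <- se[, se$Sample.Site %in% c("heart", "lung")]

# Two separate subsets glued back together with cbind().
by_cbind <- cbind(se[, se$Sample.Site == "heart"],
                  se[, se$Sample.Site == "lung"])

colnames(by_in)     # original column order is kept
colnames(by_cbind)  # heart samples first, then lung samples
```

Both objects hold the same four samples, so the difference only matters if later code relies on column position.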

thank you :)

I was today years old when I learned that / did something in R's model.matrix formula.

I guess I shouldn't be surprised since * is a thing, but man ... still wet behind the ears, I guess ...

Thanks for the tutelage, Jim!

Of late I have been working with Epidemiologists, and if there is something they like more than stratifying I have yet to find it. Oh, wait. Power calculations. So like I was saying, other than power calculations nothing pleases an Epidemiologist more than stratifying. And nothing pleases me less than cutting data up into ever smaller chunks, so...

Notably, the / operator isn't even mentioned in ?formula. I know about it from Modern Applied Statistics with S, because V&R are old school legit.
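
For readers who have not met the / operator before, here is a minimal base R sketch of what it expands to. The sample table is made up (a balanced two-site, two-sex layout), and the `othercovariates` term from the answer is left out, so treat it as an illustration only:

```r
# Hypothetical toy sample table; column names mirror the thread.
cd <- data.frame(
  Sample.Site = factor(rep(c("Heart", "Liver"), each = 4)),
  sex         = factor(rep(c("F", "M"), times = 4))
)

# a/b is shorthand for a + a:b, so both formulas carry the same terms.
nested   <- attr(terms(~ Sample.Site / sex), "term.labels")
expanded <- attr(terms(~ Sample.Site + Sample.Site:sex), "term.labels")
identical(nested, expanded)

# With the intercept removed there is one column per site,
# plus one sex effect nested within each site.
design <- model.matrix(~ Sample.Site / sex - 1, cd)
colnames(design)
```

In a design like this, the nested sex column for heart is the coefficient one would test for the within-heart sex comparison, while the model is still fit on all samples.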