TCGA data analysis
lily ▴ 20
@lily-11438
India

I have RSEM-normalized, log2-transformed data downloaded from Firehose, and I found that a number of values are missing and filled in as NA. However, when I checked the raw counts for the same datasets, the corresponding values were given as 0. So, for downstream analysis, can I convert all the NAs to 0? Please guide me.

RSEM • 1.3k views
Kevin Blighe ★ 3.9k
@kevin
Republic of Ireland

I would check the accompanying notes to see exactly what post-processing the Broad Institute has performed on these data. Based on the information that you provide, it seems likely that they converted values of 0 to NA to avoid producing negative infinity (log2(0) == -Inf).
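
To make the point concrete, here is a minimal sketch in Python of why a 0 might end up as NA after a log2 transform, and of what filling that NA with 0 would imply. The function names and the toy values are made up for illustration, and whether Firehose actually does this is an assumption that the accompanying notes should confirm.

```python
import math

def log2_or_na(value):
    """Return log2(value), or None (standing in for NA) when value is 0."""
    if value == 0:
        return None  # log2(0) is undefined; its limit is -Inf
    return math.log2(value)

def log2_plus_one(value):
    """Return log2(value + 1), which maps 0 to 0 and never produces NA."""
    return math.log2(value + 1)

# Hypothetical normalised expression values for one gene across four samples.
normalized = [0.0, 1.0, 7.0, 1023.0]

with_na = [log2_or_na(v) for v in normalized]
with_pseudocount = [log2_plus_one(v) for v in normalized]

print(with_na)           # first entry is None (NA), second is 0.0
print(with_pseudocount)  # [0.0, 1.0, 3.0, 10.0]
```

Note that on a plain log2 scale a value of 0 corresponds to a normalised value of 1 (2**0), so replacing NA with 0 makes "not detected" indistinguishable from "detected at 1". If the values had been transformed as log2(x + 1) instead, 0 would be the natural value for a zero count.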

However, if you already have raw counts, then why not use those? They can easily be used with edgeR or DESeq2.

TCGA raw HTSeq counts are also held at UCSC's Xena Browser.

Kevin

Thank you for the response. I took the normalised data so that I could proceed directly with feature selection and a machine learning approach. The problem is that there are so many missing values that I am not able to discriminate the two classes with good accuracy, sensitivity and specificity. I also tried mean imputation, and with that I got very high accuracy. So please advise: should I take the raw counts data and perform the pre-processing steps?
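
For reference, per-gene mean imputation could be sketched as below. This is not necessarily how the imputation above was done; it is a minimal illustration in which the means are learned from the training samples only and then reused for the test samples, because computing them over all samples can let test information leak into the model and may inflate accuracy. The helper names and the toy matrices are hypothetical, and None stands in for NA.

```python
def column_means(rows):
    """Per-gene mean over the observed (non-None) values in rows."""
    n_genes = len(rows[0])
    means = []
    for j in range(n_genes):
        observed = [row[j] for row in rows if row[j] is not None]
        # Fall back to 0.0 if a gene is missing in every training sample.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means

def impute(rows, means):
    """Replace each None with the supplied mean for that gene."""
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]

# Toy data: rows are samples, columns are genes.
train = [[2.0, None, 5.0],
         [4.0, 1.0, None],
         [None, 3.0, 7.0]]
test = [[None, 2.0, None]]

gene_means = column_means(train)          # learned from training samples only
train_filled = impute(train, gene_means)
test_filled = impute(test, gene_means)    # reuses the training means

print(gene_means)   # [3.0, 2.0, 6.0]
print(test_filled)  # [[3.0, 2.0, 6.0]]
```

It may also be worth checking whether the NAs are spread evenly across the two classes. If missingness itself differs by class, any fill value can act as a class label.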

"So please advise: should I take the raw counts data and perform the pre-processing steps?"

You could try it, if you have time, and then come back with the answer if possible. There will likely be a difference between the RSEM values and the values produced by a standard edgeR or DESeq2 normalisation and transformation.
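
As a rough idea of what "normalisation + transformation" of raw counts involves, here is a minimal Python sketch of log2(CPM + 1). This is a deliberately simple stand-in, not what edgeR (TMM scaling) or DESeq2 (median-of-ratios size factors, variance-stabilising transforms) actually compute, and the function name, sample names and counts are all made up.

```python
import math

def log2_cpm(counts_by_sample, pseudocount=1.0):
    """Convert raw counts to log2(counts-per-million + pseudocount) per sample."""
    result = {}
    for sample, counts in counts_by_sample.items():
        library_size = sum(counts.values())
        if library_size == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        result[sample] = {
            gene: math.log2(count * 1e6 / library_size + pseudocount)
            for gene, count in counts.items()
        }
    return result

# Hypothetical raw counts for two samples with very different library sizes.
raw = {
    "sample_A": {"GENE1": 0, "GENE2": 250_000, "GENE3": 750_000},
    "sample_B": {"GENE1": 10, "GENE2": 40, "GENE3": 50},
}

transformed = log2_cpm(raw)
print(transformed["sample_A"]["GENE1"])  # 0.0: a zero count stays defined
```

With the pseudocount, a raw count of 0 maps to 0.0 instead of NA, so no imputation step is needed for zero counts.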

Let me try.
