Which input data is acceptable for MAST?
@d1ce014b
Last seen 8 weeks ago
United States

I'd like to run MAST on a single-cell RNA-seq dataset of roughly 250,000 cells, but it is unclear which data are appropriate to use as input. The MAST paper (page 10) states that the input data are "log2(TPM + 1)." In this comment (https://github.com/RGLab/MAST/issues/147#issuecomment-770277174), Finak says "The input data SHOULD be log2 transformed but NOT scaled (i.e. Normalized). If you do not log2(x+1) transform the data you will have meaningless estimates of log fold change since the data are assumed to be on the log scale." The same thread contains several questions about which other input data may be appropriate, but the only answers address TPM, not raw RNA counts, which is what was asked. It seems like he's saying that only log2-transformed TPM data are acceptable, but I'd like to either confirm that or be corrected if I'm misunderstanding. We'd like to try with SCT v2 data, if possible. If I do need to use log2-TPM data, is there a standard way to convert raw RNA counts to TPM, or to CPM if that is appropriate? I've seen conflicting answers. Thank you.

SCTransform MAST • 304 views
@andrew_mcdavid-11488
Last seen 12 weeks ago
United States

I'm not too familiar with SCT v2. If it also has a point mass at zero and the non-zero component is roughly symmetric, then it might work OK with MAST. But why not just use [FindAllMarkers](https://satijalab.org/seurat/reference/findallmarkers)? It uses the appropriate data layer and calls MAST directly.
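
For the arithmetic behind the "log2(x + 1)" transform quoted in the question, here is a minimal sketch of converting raw counts to log2(CPM + 1) per cell. It is plain Python for illustration only, not MAST's or Seurat's own code: the function name `log2_cpm` and the toy matrix are hypothetical, and it assumes that CPM (each cell scaled to one million total counts, with no gene-length correction) is an acceptable stand-in for TPM on this dataset, which the thread does not confirm.

```python
import math

def log2_cpm(counts):
    """Return log2(CPM + 1) for a cells x genes list of raw counts.

    Each cell (row) is scaled so its counts sum to one million,
    then log2(x + 1) is applied. Zeros stay exactly zero, so the
    point mass at zero is preserved.
    """
    result = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # Empty cell: no scaling is possible, keep all zeros.
            result.append([0.0] * len(cell))
            continue
        scale = 1_000_000 / total
        result.append([math.log2(c * scale + 1.0) for c in cell])
    return result

# Hypothetical toy data: 3 cells x 3 genes of raw counts.
toy = [[0, 3, 7], [5, 0, 5], [0, 0, 0]]
transformed = log2_cpm(toy)
for row in transformed:
    print([round(v, 3) for v in row])
```

Note that this differs from Seurat's default `LogNormalize`, which uses a scale factor of 10,000 and the natural log, so fold changes computed from the two are on different log bases.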
