#### Posts by wunderl

1
175
views
1
... Thanks, that makes sense. I can confirm from my data that the 'outlier' sample from the raw count plot had both the largest library size and highest variance. What fully convinced me was doing a simple normalization by library size. This ended up producing the same pattern of sample distances tha ...
written 3 months ago by wunderl (20)
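The effect described in this post can be illustrated with a small sketch. The counts below are hypothetical (not from the original thread): sample `s3` has the same expression profile as the others but roughly ten times the library size. On raw counts it looks like an outlier; after a simple library-size normalization (counts per million, then log2) it does not.

```python
import math

# Hypothetical raw counts for four genes in three samples.
# s3 is sequenced about 10x deeper than s1 and s2.
counts = {
    "s1": [10, 20, 30, 40],
    "s2": [12, 18, 33, 37],
    "s3": [100, 210, 290, 400],
}

def log_cpm(sample):
    """Scale counts to counts per million, then take log2 (with +1 to avoid log2(0))."""
    total = sum(sample)
    return [math.log2(c / total * 1e6 + 1) for c in sample]

# Euclidean distances on raw counts: dominated by library size.
raw_12 = math.dist(counts["s1"], counts["s2"])
raw_13 = math.dist(counts["s1"], counts["s3"])

# Euclidean distances after library-size normalization.
norm_12 = math.dist(log_cpm(counts["s1"]), log_cpm(counts["s2"]))
norm_13 = math.dist(log_cpm(counts["s1"]), log_cpm(counts["s3"]))

print(f"raw:        d(s1,s2)={raw_12:.1f}  d(s1,s3)={raw_13:.1f}")
print(f"normalized: d(s1,s2)={norm_12:.3f}  d(s1,s3)={norm_13:.3f}")
```

This is only a toy check of the idea; it is not the transformation DESeq2 applies (`vst` or `rlog`), which also stabilizes the variance.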
2
249
views
2
... You are likely getting this error because of log=TRUE in your call to cpm(). The log of anything less than 1 will be negative, so if after normalization you end up with any fractional counts they will produce a negative number when you take the log. This issue also occurs with DESeq2's rl ...
written 4 months ago by wunderl (20)
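The arithmetic behind this answer can be checked with a short sketch. The counts, library size, and offset below are hypothetical, and the offset is a plain additive constant for illustration; it does not reproduce how edgeR's `cpm()` applies `prior.count`.

```python
import math

library_size = 10_000_000   # hypothetical total reads in one sample
counts = [0, 1, 5, 50]      # hypothetical raw counts for four genes
prior = 0.5                 # illustrative offset so that log2(0) is never taken

# Counts per million: any gene with fewer reads than library_size / 1e6 ends up below 1.
cpm = [c / library_size * 1e6 for c in counts]

# log2 of a value below 1 is negative, so low-count genes get negative log-CPM.
log_cpm = [math.log2(v + prior) for v in cpm]

for c, v, lv in zip(counts, cpm, log_cpm):
    print(f"count={c:>3}  cpm={v:.2f}  log2(cpm + prior)={lv:.3f}")
```

Negative values here are expected rather than an error: they simply mark genes whose normalized abundance is below one count per million.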
2
249
views
2
... If you want to truly compare pipeline performance, you should let each method do the normalization its own way. Each method has been designed with its own philosophy and underlying assumptions, so mixing parts of one method with another is likely to give you suboptimal performance. (or, if ...
written 4 months ago by wunderl (20)
1
175
views
1
... ::Edit:: The embedded images are not formatting well, so I have changed them to external links. I have been following the [DESeq2 vignette][1] for bulk RNA-Seq data and got to the section where it is recommended you make a [heatmap of the Euclidean distances between samples][2]. In addition to the ...
written 4 months ago by wunderl (20) • updated 4 months ago by Michael Love (26k)
1
153
views
1
... Thank you for the elaboration. I didn't realize removeBatchEffect operated at the log-transformed level; I can see now why it is distinct from the suite of tools DESeq2 provides. The point about GC bias was very interesting as well. I am very new to this sort of space so I really appreciate th ...
written 5 months ago by wunderl (20)
2
241
views
2
... I am new to DESeq2 myself so take this answer with a grain of salt. I was initially confused by the difference between a Wald test and a t-test as well. To explain why a Wald test is used we first need to explain why we are fitting a generalized linear model (GLM). A **lot** is going on ...
written 5 months ago by wunderl (20)
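As a rough illustration of the test discussed in this answer: in its general form, a Wald statistic divides a coefficient estimate from the fitted model by its standard error and compares the result to a standard normal distribution. The function name and the numbers below are hypothetical, and this is a sketch of the general idea, not DESeq2's implementation.

```python
import math

def wald_test(estimate, standard_error):
    """Return the Wald z-statistic and its two-sided p-value under a standard normal."""
    z = estimate / standard_error
    # Two-sided tail probability of N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical GLM output: a log2 fold change of 0.98 with standard error 0.5.
z, p = wald_test(0.98, 0.5)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A t-test, by contrast, refers a similar ratio to a t distribution whose degrees of freedom depend on the sample size.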
1
153
views
1
... Thanks Michael, I really appreciate the help!! I will look into Salmon. So far my pipeline has been STAR --> featureCounts --> DESeq2. Would you say that not correcting for GC bias is a significant oversight? Also, is there a reason DESeq2 does not have anything analogous to limma ...
written 5 months ago by wunderl (20)
1
150
views
1
... I have the following design matrix:

    batch treatment
    A     1
    A     2
    A     3
    B     1
    B     2
    B     3
    C     1
    C     2
    C     3

    > batch
    [1] A A A B B B C C C
    Levels: A B C
    > treatment
    [1] 1 2 3 1 2 3 1 2 3
    Levels: 1 2 3

I wo ...
written 6 months ago by wunderl (20) • updated 5 months ago by Michael Love (26k)
1
153
views
1
... I am analyzing bulk RNA data from neurons derived from iPSC cells. The experimental design matrix is as follows:

| Lineage | Method |
| ------- | ------ |
| A | 1 |
| A | 2 |
| A | 3 |
| B | 1 |
| B | 2 |
| B | 3 |
| C | 1 |
| C | 2 |
| C | 3 |

Here lineage indicates what iPSC line the neuronal ...
written 6 months ago by wunderl (20) • updated 5 months ago by Michael Love (26k)
1
141
views
1
... When using Rsubread v 1.28.1, it successfully parsed my input file names by removing the file path and keeping just the names. As of the current version, however, the names contain the entire file path, which makes for very long and ugly column names. Example ---- Setup --- files: /path/to/file/here/ ...
written 6 months ago by wunderl (20) • updated 6 months ago by Gordon Smyth (39k)
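One generic workaround for the long column names described in this post is to strip the directory part after the counts are produced. The sketch below shows the idea with hypothetical file paths and a hypothetical helper name; it is not an Rsubread option, and the thread's accepted fix may differ.

```python
import posixpath

def strip_paths(column_names):
    """Keep only the file name from each full path (POSIX-style separators assumed)."""
    return [posixpath.basename(name) for name in column_names]

# Hypothetical column names that carry the full input path.
columns = [
    "/data/run1/sampleA.bam",
    "/data/run1/sampleB.bam",
]

print(strip_paths(columns))
```

Base names are only safe as column labels when they are unique; two inputs with the same file name in different directories would collide.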

