Question

Handling both technical and biological replicates in Ballgown


linda.boshans • 0

@lindaboshans-12526

Last seen 7.8 years ago

Hello,

I am using the HISAT2 - StringTie - Ballgown pipeline that was published in Nature Protocols in 2016. When I load my Ballgown data into R, I have 4 lane-level technical replicates per biological sample, and 3 biological samples per condition (2 conditions total). This gives 24 "samples", as Ballgown calls them. I have marked which samples are tech_reps and biol_reps using the pData function. However, I'd like to collapse the technical replicates by averaging expression so that I'm left with 6 samples, 3 biological replicates per condition. I am not familiar with working with S4 objects. I was able to get the average expression by using rowMeans(subset(bg@expr$trans, select = c(my columns))). I averaged all technical replicates for each biological replicate, and then took a subset of that to drop the tech_rep columns. However, when I fed that bg object into stattest, I got the following error:

Error in `[.data.frame`(x, r, vars, drop = drop) :
undefined columns selected

Which leads me to believe I am doing this incorrectly. Any help would be greatly appreciated!!

ballgown biologicalreplicates technical replicates stringtie • 3.0k views

ADD COMMENT • link updated 7.2 years ago by Alyssa Frazee ▴ 210 • written 8.0 years ago by linda.boshans • 0

score 0 · Answer 1 · 2017-03-13


Jeff Leek ▴ 650

@jeff-leek-5015

Last seen 4.0 years ago

United States

Hello,

I think the way to handle this would be to indicate the technical reps in the model matrix. If you include them as factor terms in the matrix then you will get a similar result to if you averaged them before analyzing. Hope that helps!

Jeff
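For anyone who does want to try the averaging route mentioned here, below is a minimal sketch of collapsing lane-level columns into one column per biological sample. It uses a made-up FPKM matrix, not a ballgown object; the names `fpkm`, `bio_rep` and `collapsed` are hypothetical. It leaves the S4 object untouched, since overwriting slots by hand can leave the expression table and pData out of sync.

```r
# Toy FPKM matrix: 3 transcripts x 8 lanes (2 biological samples x 4 lanes).
# All names and values are invented for illustration.
set.seed(1)
fpkm <- matrix(rexp(24, rate = 0.1), nrow = 3,
               dimnames = list(paste0("tx", 1:3),
                               c(paste0("D1_0", 1:4), paste0("V1_0", 1:4))))

# Biological sample of each column, taken here from the column-name prefix.
bio_rep <- sub("_.*$", "", colnames(fpkm))

# Average the lanes that belong to the same biological sample.
collapsed <- sapply(split(seq_len(ncol(fpkm)), bio_rep),
                    function(cols) rowMeans(fpkm[, cols, drop = FALSE]))

dim(collapsed)       # 3 transcripts x 2 biological samples
colnames(collapsed)  # "D1" "V1"
```

In a real analysis the input matrix would come from the ballgown object rather than being simulated. The help page for stattest describes `gowntable` and `pData` arguments for passing in an expression table with a matching phenotype table; it is worth checking `?stattest` in your installed version before relying on that.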

ADD COMMENT • link 7.9 years ago Jeff Leek ▴ 650


Hi Jeff,

Thank you for your response, but I don't quite understand. Could you please elaborate? How do I include the technical replicates as factors and how would that address the replicate issue? Do I assign the tech reps as a separate column in pData?

I have the following pData table:

   id    condition replicate
1  D1_01 Dlx2      D1
2  D1_02 Dlx2      D1
3  D1_03 Dlx2      D1
4  D1_04 Dlx2      D1
5  D2_01 Dlx2      D2
6  D2_02 Dlx2      D2
7  D2_03 Dlx2      D2
8  D2_04 Dlx2      D2
9  D3_01 Dlx2      D3
10 D3_02 Dlx2      D3
11 D3_03 Dlx2      D3
12 D3_04 Dlx2      D3
13 V1_01 Vector    V1
14 V1_02 Vector    V1
15 V1_03 Vector    V1
16 V1_04 Vector    V1
17 V2_01 Vector    V2
18 V2_02 Vector    V2
19 V2_03 Vector    V2
20 V2_04 Vector    V2
21 V4_01 Vector    V4
22 V4_02 Vector    V4
23 V4_03 Vector    V4
24 V4_04 Vector    V4

So when I run stattest:

results_transcripts = stattest(bg_filt, feature="transcript", covariate="condition", adjustvars = "replicate", getFC=TRUE, meas="FPKM")

I get the following error:

Coefficients not estimable: replicateV4
Error in solve.default(t(mod) %*% mod) :
system is computationally singular: reciprocal condition number = 5.58641e-27

This makes me believe I am doing something incorrectly. I've read through ALL the topics that address tech reps in Ballgown, and they all lead back to adding a replicate column in pData(bg), which I have already done.
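For future readers, here is a small base-R sketch of why a design like this one can come out singular. It rebuilds the pData shown in this post and assumes the model is equivalent to `~ condition + replicate` (that formula is an assumption for illustration, not taken from the ballgown source). Because every replicate label belongs to exactly one condition, the condition column is already determined by the replicate columns.

```r
# pData as posted: 2 conditions, 3 biological samples each, 4 lanes per sample.
samples <- c("D1", "D2", "D3", "V1", "V2", "V4")
pd <- data.frame(
  id        = paste0(rep(samples, each = 4), "_0", 1:4),
  condition = rep(c("Dlx2", "Vector"), each = 12),
  replicate = rep(samples, each = 4)
)

# Assumed design: condition of interest plus replicate as an adjustment term.
mod <- model.matrix(~ condition + replicate, data = pd)

ncol(mod)      # 7 coefficients requested
qr(mod)$rank   # 6 can actually be estimated
```

The rank is one less than the number of columns, which matches the "Coefficients not estimable: replicateV4" message: the last replicate coefficient cannot be separated from the condition effect.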

ADD REPLY • link 7.8 years ago linda.boshans • 0


Probably a bit late, but just for future reference. The `adjustvars` option takes a vector, not a string. If you change it to

adjustvars=c("replicate")

it should work.
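One thing worth checking before relying on this: in base R, a quoted string is already a character vector of length one, so wrapping it in c() gives the same object. This change alone may therefore not alter the model that gets fitted. A quick check:

```r
# A single string and c() around the same string are the same object in R.
same <- identical("replicate", c("replicate"))
same                      # TRUE
is.vector("replicate")    # TRUE
length("replicate")       # 1
```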

ADD REPLY • link 7.5 years ago o.diogosilva • 0


Hi Linda, I am facing the same error that you are facing here. The error is:

Coefficients not estimable: technicalreplicateNBM9
Error in solve.default(t(mod) %*% mod) :
system is computationally singular: reciprocal condition number = 1.51496e-27
Calls: stattest -> f.pvalue -> solve -> solve.default
In addition: Warning message:
Partial NA coefficients for 47686 probe(s)
Execution halted

Can you please tell me how you solved it?

Thanks,

Sandeep

ADD REPLY • link 7.4 years ago sandybioteck • 0

score 0 · Answer 2 · 2017-11-11


Alyssa Frazee ▴ 210

@alyssa-frazee-6710

Last seen 4.3 years ago

San Francisco, CA, USA

The "system is computationally singular" error generally means that either one of the variables (in either "covariate" or "adjustvars") has the same value for every sample, or that two variables that are multiples of each other are in that list. Hope this helps!
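As a rough illustration of the first case (a variable with the same value for every sample), here is a small base-R check on a phenotype table. The data frame and the `batch` column are hypothetical; with a real ballgown object, the table would be whatever pData(bg) returns.

```r
# Hypothetical phenotype table with one column that never varies.
pd <- data.frame(
  condition = rep(c("Dlx2", "Vector"), each = 3),
  batch     = rep("run1", 6),   # invented column: same value for every sample
  stringsAsFactors = FALSE
)

# Count distinct values per column and flag columns with fewer than two.
n_levels <- vapply(pd, function(x) length(unique(x)), integer(1))
constant_vars <- names(n_levels)[n_levels < 2]
constant_vars   # "batch"
```

Any variable flagged this way carries no information for the model and would need to be dropped from "covariate" or "adjustvars".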

ADD COMMENT • link 7.3 years ago Alyssa Frazee ▴ 210

score 0 · Answer 3 · 2017-11-27


jnorth • 0

@jnorth-14485

Last seen 7.2 years ago

Hi Alyssa,

Thank you for all your feedback and help over the years.

How do you exclude genes that possess this property, if possible? I am having difficulty finding the whereabouts for the line/s of code to achieve this.

Kind regards, Julian

ADD COMMENT • link 7.2 years ago jnorth • 0

score 0 · Answer 4 · 2017-11-29

Hey Julian, the issue isn't that genes need to be excluded. If you get the "system is computationally singular" error, it means that one of your "adjustvars" is perfectly correlated with either another adjustvar or the "covariate" variable. You can see this in the pData: if there's any combination of adjustvars + covariate where there's only one example of that combination in your pData, you'll need to either gather more data or define a different set of adjustment variables.
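One way to look at the pData for this is a cross-tabulation of the adjustment variable against the covariate. The sketch below uses the condition/replicate layout from the question earlier in this thread; the object names `tab` and `nested` are just for illustration.

```r
# Condition and replicate columns laid out as in the question's pData.
pd <- data.frame(
  condition = rep(c("Dlx2", "Vector"), each = 12),
  replicate = rep(c("D1", "D2", "D3", "V1", "V2", "V4"), each = 4)
)

# Rows: replicate labels; columns: conditions; cells: number of samples.
tab <- table(pd$replicate, pd$condition)
tab

# Replicate labels that occur under only one condition.
nested <- rownames(tab)[rowSums(tab > 0) == 1]
nested
```

Here every replicate label shows up under a single condition, so knowing the replicate already tells you the condition. That is the kind of perfect correlation that makes the two variables impossible to estimate together.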