Question: Handling both technical and biological replicates in Ballgown
21 months ago, linda.boshans wrote:

Hello,

I am using the HISAT2 - StringTie - Ballgown pipeline that was published in Nature Protocols in 2016. When I load my Ballgown data into R, I have 4 lane-level technical replicates per biological sample, and 3 biological samples per condition (2 conditions total). This leads to 24 "samples", as Ballgown calls them. I have denoted which samples are tech_reps and biol_reps using the pData function. However, I'd like to collapse the technical replicates by averaging expression so that I'm left with 6 samples, 3 biological replicates per condition. I am not familiar with working with S4 objects. I was able to get the average expression by using rowMeans(subset(bg@expr$trans, select = c(my columns))). I averaged all technical replicates for each biological replicate, and then took a subset of the result to eliminate the tech_rep columns. However, when I fed that bg object into stattest, I got the following error:

Error in `[.data.frame`(x, r, vars, drop = drop) : 
  undefined columns selected

Which leads me to believe I am doing this incorrectly. Any help would be greatly appreciated!! 
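
A note on the averaging step: the error is consistent with the expression table and pData no longer describing the same set of samples after columns are dropped by hand, although nothing in this thread confirms that. As a minimal sketch, assuming a plain FPKM matrix whose column names follow the id pattern used in this thread (D1_01, D1_02, ...), the averaging itself can be done outside the ballgown object in base R. The matrix values and the names fpkm, bio_rep and collapsed are made up for illustration.

```r
# Toy FPKM matrix: 3 transcripts x 8 lane-level columns (hypothetical values).
set.seed(1)
lanes <- c(paste0("D1_0", 1:4), paste0("V1_0", 1:4))
fpkm <- matrix(runif(3 * length(lanes), 0, 100), nrow = 3,
               dimnames = list(paste0("tx", 1:3), lanes))

# Biological replicate for each lane: the part of the id before "_".
bio_rep <- sub("_.*$", "", colnames(fpkm))

# rowsum() sums rows by group, so transpose, sum per group, transpose back,
# then divide each column by the number of lanes in that group.
lane_counts <- table(bio_rep)
sums <- t(rowsum(t(fpkm), group = bio_rep))
collapsed <- sweep(sums, 2, as.numeric(lane_counts[colnames(sums)]), "/")

collapsed  # one column per biological replicate
```

Any downstream test would then also need a pData table with one row per biological replicate, in the same order as the columns of the collapsed matrix. The stattest help page (?stattest) describes which inputs it accepts, so it is worth checking there before passing in a table built this way.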

21 months ago, Jeff Leek (United States) wrote:
Hello,

I think the way to handle this would be to indicate the technical reps in the model matrix. If you include them as factor terms in the matrix then you will get a similar result to if you averaged them before analyzing. Hope that helps!

Jeff

Hi Jeff,

Thank you for your response, but I don't quite understand. Could you please elaborate? How do I include the technical replicates as factors and how would that address the replicate issue? Do I assign the tech reps as a separate column in pData? 

I have the following pData table:

      id condition replicate
1  D1_01      Dlx2        D1
2  D1_02      Dlx2        D1
3  D1_03      Dlx2        D1
4  D1_04      Dlx2        D1
5  D2_01      Dlx2        D2
6  D2_02      Dlx2        D2
7  D2_03      Dlx2        D2
8  D2_04      Dlx2        D2
9  D3_01      Dlx2        D3
10 D3_02      Dlx2        D3
11 D3_03      Dlx2        D3
12 D3_04      Dlx2        D3
13 V1_01    Vector        V1
14 V1_02    Vector        V1
15 V1_03    Vector        V1
16 V1_04    Vector        V1
17 V2_01    Vector        V2
18 V2_02    Vector        V2
19 V2_03    Vector        V2
20 V2_04    Vector        V2
21 V4_01    Vector        V4
22 V4_02    Vector        V4
23 V4_03    Vector        V4
24 V4_04    Vector        V4

So when I run stattest:

results_transcripts = stattest(bg_filt, feature="transcript", covariate="condition", adjustvars = "replicate", getFC=TRUE, meas="FPKM")

I get the following error:

Coefficients not estimable: replicateV4 
Error in solve.default(t(mod) %*% mod) : 
  system is computationally singular: reciprocal condition number = 5.58641e-27

This makes me believe I am doing something incorrectly. I've read through all the topics that address tech reps in ballgown, and they all lead back to denoting a replicate column in pData(bg), which I have already done.
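
The singularity can be reproduced without ballgown. The sketch below rebuilds the pData table above and forms a design matrix with condition plus replicate. The formula is an assumption about how the model is put together (see ?stattest), chosen because it matches the "replicateV4" coefficient named in the error. Each replicate label occurs in only one condition, so the condition column is already determined by the replicate columns.

```r
# Rebuild the pData table from the post: 6 biological replicates x 4 lanes each.
reps <- c("D1", "D2", "D3", "V1", "V2", "V4")
pd <- data.frame(
  id        = paste0(rep(reps, each = 4), "_0", 1:4),
  condition = rep(c("Dlx2", "Vector"), each = 12),
  replicate = rep(reps, each = 4),
  stringsAsFactors = TRUE
)

# Every replicate appears under exactly one condition (nested design).
table(pd$replicate, pd$condition)

# Assumed design: covariate first, then the adjustment variable.
mod <- model.matrix(~ condition + replicate, data = pd)

ncol(mod)      # 7 coefficients requested
qr(mod)$rank   # 6 linearly independent columns
```

With 7 columns but rank 6, t(mod) %*% mod cannot be inverted, which is what solve.default reports as "computationally singular".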

Reply written 20 months ago by linda.boshans

Probably a bit late, but just for future reference. The `adjustvars` option takes a vector, not a string. If you change it to

adjustvars=c("replicate")

it should work.
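
A quick base R check is worth running before relying on this. R has no separate scalar string type, so wrapping a single string in c() should give stattest exactly the same value as before. The variable names below are only for illustration.

```r
# A single string is already a length-one character vector in R.
single  <- "replicate"
wrapped <- c("replicate")

identical(single, wrapped)  # TRUE
length(single)              # 1
```

If that holds, this change alone is unlikely to remove the singularity. c() is still needed when more than one adjustment variable is passed.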

Reply written 16 months ago by o.diogosilva

Hi Linda, I am facing the same error that you describe here. The error is:

Coefficients not estimable: technicalreplicateNBM9 
Error in solve.default(t(mod) %*% mod) : 
  system is computationally singular: reciprocal condition number = 1.51496e-27
Calls: stattest -> f.pvalue -> solve -> solve.default
In addition: Warning message:
Partial NA coefficients for 47686 probe(s) 
Execution halted

Can you please tell me how you solved it?

Thanks,

Sandeep

Reply written 14 months ago by sandybioteck

13 months ago, Alyssa Frazee (San Francisco, CA, USA) wrote:

The "system is computationally singular" error generally means that either one of the variables (in either "covariate" or "adjustvars") has the same value for every sample, or that two variables that are multiples of each other are in that list. Hope this helps!

12 months ago, jnorth wrote:

Hi Alyssa,

Thank you for all your feedback and help over the years. 

How do you exclude genes that have this property, if that is possible? I am having difficulty finding the lines of code that would achieve this.

Kind regards, Julian

12 months ago, Alyssa Frazee (San Francisco, CA, USA) wrote:

Hey Julian, the issue isn't that genes need to be excluded. If you get the "system is computationally singular" error, it means that one of your "adjustvars" is perfectly correlated with either another adjustvar or the "covariate" variable (you can see this in the pData -- if there's any combination of adjustvars + covariate where there's only one example of that combination in your pData, you'll need to either gather more data or define a different set of adjustment variables).
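
This kind of check can be run on a pData table before fitting anything. The sketch below is a minimal version, assuming the design is an additive formula over the covariate and adjustment variables. The helper design_is_full_rank and the table pd6 are hypothetical and not part of ballgown. pd6 stands for the six collapsed samples discussed earlier in the thread.

```r
# Hypothetical helper: TRUE when the additive design over `vars`
# has as many independent columns as coefficients.
design_is_full_rank <- function(pd, vars) {
  mod <- model.matrix(reformulate(vars), data = pd)
  qr(mod)$rank == ncol(mod)
}

# One row per biological replicate, technical replicates already collapsed.
pd6 <- data.frame(
  id        = c("D1", "D2", "D3", "V1", "V2", "V4"),
  condition = rep(c("Dlx2", "Vector"), each = 3)
)

design_is_full_rank(pd6, "condition")           # TRUE
design_is_full_rank(pd6, c("condition", "id"))  # FALSE
```

The second call fails the check because each id identifies a single sample, so it is perfectly confounded with condition. That is the same situation as the replicate column in the 24-sample table.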
