Question: R: estimating error rate of replicated samples
9 months ago, meriam.nef wrote:

Hi, I am studying the genetic diversity of a population. I'm using R to filter my genotypic data (single nucleotide polymorphisms, SNPs), build dendrograms, and estimate the error rate between replicated samples (duplicates and triplicates).

My code:

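# Distance matrix computed as 1 - ibsmat
mat <- 1 - ibsmat
# Drop rows and columns 99 and 100
mat1 <- mat[-(99:100), -(99:100)]

# Error rate for triplicates
x <- seq(1, 274, 3)                       # first row of each triplicate
err.r <- rep(NA, length(x))
for (i in 1:(length(x) - 1)) {
  k <- x[i]
  k1 <- k + 2
  ibx <- mat1[k:k1, k:k1]                 # 3 x 3 block for one triplicate
  print(ibx)
  err.r[i] <- mean(ibx[lower.tri(ibx)])   # mean pairwise distance in the block
}
errorrate <- mean(na.omit(err.r))

# Error rate for duplicates
x <- seq(1, 274, 2)                       # first row of each duplicate pair
err.r <- rep(NA, length(x))
for (i in 1:(length(x) - 1)) {
  k <- x[i]
  k1 <- k + 1
  ibx <- mat1[k:k1, k:k1]                 # 2 x 2 block for one duplicate pair
  print(ibx)
  err.r[i] <- mean(ibx[lower.tri(ibx)])   # the single pairwise distance
}
errorrate <- mean(na.omit(err.r))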
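
The two loops above differ only in the group size, so the same calculation can be sketched as a single helper. This is a minimal sketch, not a verified fix. It assumes that the replicates of each sample sit in consecutive rows of the distance matrix and that every group has the same size. The name `replicate_error` and the toy matrix are hypothetical and only there for illustration. Unlike the loops above, it uses every complete group and ignores any leftover rows.

```r
# Hypothetical helper (not part of the original code): mean pairwise distance
# within consecutive blocks of `size` replicated samples.
replicate_error <- function(dist_mat, size) {
  n_groups <- nrow(dist_mat) %/% size            # only complete groups are used
  starts <- seq(1, by = size, length.out = n_groups)
  vapply(starts, function(k) {
    idx <- k:(k + size - 1)
    block <- dist_mat[idx, idx]
    mean(block[lower.tri(block)], na.rm = TRUE)  # off-diagonal distances only
  }, numeric(1))
}

# Toy 6 x 6 distance matrix holding two triplicates (made-up values).
toy <- matrix(0.5, 6, 6)
toy[1:3, 1:3] <- 0.02
toy[4:6, 4:6] <- 0.04
diag(toy) <- 0

err_trip <- replicate_error(toy, size = 3)   # one value per triplicate
errorrate_trip <- mean(err_trip)             # overall triplicate error rate
```

Storing the results under separate names (for example `errorrate_trip` and a corresponding value for duplicates) avoids the second calculation overwriting the first.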

My questions are:

1) Should my .csv file contain only the sorted triplicates and duplicates, or all the data (duplicates, triplicates, and non-replicated samples)?

2) Should I filter that .csv file before estimating the error rate? By filtering I mean, for example, filtering on the percentage of missing values (%NA) per genotype.

3) If there's an error in my code, please feel free to comment.

Thanks, Meriam
