R: estimating error rate of replicated samples
meriam.nef

Hi, I am studying the genetic diversity of a population. I'm using R to filter my genotypic data (single nucleotide polymorphisms, SNPs), construct a dendrogram, and estimate the error rate between replicated samples (duplicates/triplicates).

My code:

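# Convert the IBS similarity matrix to a distance matrix
mat <- 1 - ibsmat
# Drop samples 99 and 100
mat1 <- mat[-(99:100), -(99:100)]

# Error rate for triplicates
x <- seq(1, 274, 3)  # first row index of each triplicate block
err.r <- rep(NA, length(x))
for (i in 1:(length(x) - 1)) {
  k <- x[i]
  k1 <- k + 2
  ibx <- mat1[k:k1, k:k1]
  print(ibx)
  # Mean pairwise distance among the three replicates
  err.r[i] <- mean(ibx[lower.tri(ibx)])
}
errorrate <- mean(na.omit(err.r))

# Error rate for duplicates
x <- seq(1, 274, 2)  # first row index of each duplicate pair
err.r <- rep(NA, length(x))
for (i in 1:(length(x) - 1)) {
  k <- x[i]
  k1 <- k + 1
  ibx <- mat1[k:k1, k:k1]
  print(ibx)
  # Distance between the two replicates
  err.r[i] <- mean(ibx[lower.tri(ibx)])
}
errorrate <- mean(na.omit(err.r))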
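
For comparison, here is a minimal sketch of the same calculation driven by a replicate-group label for each sample instead of fixed row positions. It would let a matrix that mixes duplicates, triplicates and unreplicated samples be processed in one pass. The function `replicate_error`, the `group` vector and the toy matrix are all hypothetical and are not part of my real data.

```r
# Hypothetical helper: mean pairwise distance within each replicate group.
# `dist_mat` is a square distance matrix (e.g. 1 - IBS); `group` holds one
# replicate-group label per sample, in the same order as the matrix rows.
replicate_error <- function(dist_mat, group) {
  stopifnot(nrow(dist_mat) == ncol(dist_mat),
            length(group) == nrow(dist_mat))
  sizes <- table(group)
  # Groups with a single member have no pair to compare, so they are skipped
  replicated <- names(sizes)[as.vector(sizes) > 1]
  vapply(replicated, function(g) {
    idx <- which(group == g)
    sub <- dist_mat[idx, idx, drop = FALSE]
    mean(sub[lower.tri(sub)], na.rm = TRUE)
  }, numeric(1))
}

# Toy data (made up): one triplicate, one duplicate, one unreplicated sample
toy <- matrix(0, 6, 6)
toy[1:3, 1:3] <- 0.02
toy[4:5, 4:5] <- 0.04
diag(toy) <- 0
group <- c("A", "A", "A", "B", "B", "C")

per_group <- replicate_error(toy, group)  # one value per replicated group
overall <- mean(per_group)                # average over replicated groups
```

With labels like these the rows would not need to be sorted, and no block would be skipped or run past the end of the matrix.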

My questions are: 1) Should my .csv file contain only the sorted triplicates and duplicates, or all of the data (duplicates, triplicates, and unreplicated samples)?

2) Should I filter that .csv file before estimating the error rate? By filtering I mean, for example, removing genotypes based on their percentage of missing data (%NA).
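
To make clear what I mean by filtering on %NA, here is a rough sketch on a made-up genotype matrix (rows are samples, columns are SNPs, `NA` is a missing call). The names `geno` and `max_missing` and the threshold value are assumptions for illustration only, not values I am recommending.

```r
# Toy genotype matrix (made up): rows = samples, columns = SNPs
geno <- matrix(c(0,  1,  2, NA,
                 0,  1, NA, NA,
                 2,  1,  0, NA,
                 0, NA,  2, NA),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("s", 1:4), paste0("snp", 1:4)))

max_missing <- 0.5  # illustrative threshold only

# Fraction of missing calls per SNP, then drop SNPs above the threshold
snp_missing <- colMeans(is.na(geno))
geno_f <- geno[, snp_missing <= max_missing, drop = FALSE]

# Fraction of missing calls per sample, computed on the remaining SNPs
sample_missing <- rowMeans(is.na(geno_f))
geno_f <- geno_f[sample_missing <= max_missing, , drop = FALSE]
```

In this toy case only `snp4` (missing in every sample) is removed and all four samples are kept.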

3) If there's an error in my code, please feel free to comment.

Thanks, Meriam

R SNP genotyping genetic diversity • 740 views