Question: Different order of samples in output of snpgdsSlidingWindow to calculate Fst and aggregation method
serpalma.v (Germany) wrote, 4 months ago:

Hello!

I am using SNPRelate to calculate Fst for sliding windows. There are two things that I cannot find information about.

(1) Suppose I pass a set of samples in a specific order, together with their corresponding populations, for example:

> samps
 [1] "H07750-L1" "H07754-L1" "H07760-L1" "H07775"    "H07762-L1" "H07782-L1"
 [7] "H07758-L1" "H07792-L1" "H07793-L1" "H07742-L1" "H07751-L1" "H07784"
[13] "H07746-L1" "H07767-L1" "H07781-L1" "H07741-L1" "H07779-L1" "H07748-L1"
[19] "H07778"    "H07773-L1"

> pops
 [1] pop1 pop1 pop1 pop1 pop1 pop1 pop1 pop1 pop1 pop1 pop2 pop2 pop2 pop2 pop2 pop2 pop2 pop2 pop2 pop2
Levels: pop1 pop2

After running the command:

res <- snpgdsSlidingWindow(genofile, winsize = 500000, shift = 250000, FUN = "snpgdsFst", sample.id = samps, population = pops, method = "W&C84")

The order of the samples is changed (sorted) in the output:

> res$sample.id
 [1] "H07741-L1" "H07742-L1" "H07746-L1" "H07748-L1" "H07750-L1" "H07751-L1"
 [7] "H07754-L1" "H07758-L1" "H07760-L1" "H07762-L1" "H07767-L1" "H07773-L1"
[13] "H07775"    "H07778"    "H07779-L1" "H07781-L1" "H07782-L1" "H07784"
[19] "H07792-L1" "H07793-L1"

I'm not sure which of the following this means:

  • Is this the order in which the samples are matched to the population argument? (That would not be what I want.)
  • Or does res$sample.id just list the samples that were used, while each sample was still assigned to its population as originally intended?

(2) Finally, how is the Fst score for each window calculated? Is it the arithmetic mean of all per-SNP Fst scores within the window?

Thanks in advance

Answer: Different order of samples in output of snpgdsSlidingWindow to calculate Fst and
zhengx (United States) wrote, 4 months ago:

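SNPRelate re-orders "population" internally to follow the order of the sample IDs, so the population labels stay attached to the samples you supplied them for. res$sample.id lists the samples in the order they are stored in the GDS file.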

If you are not sure whether the population order is handled correctly, you could arrange your input sample IDs in the same order as in the GDS file and supply the population vector in that same order.
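A minimal sketch of that reordering in base R, using a subset of the sample IDs from the question. The vector gds_ids is a stand-in (an assumption) for the sample order stored in the GDS file; with a real file it would typically be read from the "sample.id" node, as noted in the comment.

```r
# Sample IDs and populations in the user's original order
# (a subset of the IDs from the question).
samps <- c("H07750-L1", "H07754-L1", "H07775",
           "H07751-L1", "H07784", "H07741-L1")
pops  <- factor(c("pop1", "pop1", "pop1", "pop2", "pop2", "pop2"))

# Assumption: stand-in for the sample order in the GDS file.
# With an open GDS file this would be something like:
#   gds_ids <- read.gdsn(index.gdsn(genofile, "sample.id"))
gds_ids <- c("H07741-L1", "H07750-L1", "H07751-L1",
             "H07754-L1", "H07775", "H07784")

# Keep only the samples of interest, in GDS file order.
samps_gds <- gds_ids[gds_ids %in% samps]

# Carry each sample's population label along with it.
pops_gds <- pops[match(samps_gds, samps)]

# Inspect the pairing before passing both vectors to SNPRelate.
print(data.frame(sample.id = samps_gds, population = pops_gds))
```

Passing samps_gds and pops_gds as sample.id and population then gives an input order that matches res$sample.id, which makes the assignment easy to check by eye.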

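See the function "snpgdsFst": it reports two Fst estimates, the weighted Fst and the mean Fst. snpgdsSlidingWindow() returns the weighted Fst for each window, as "W&C84" (Weir & Cockerham, 1984) suggests, so it is not the arithmetic mean of the per-SNP values.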
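To illustrate the difference between the two estimates, here is a small sketch with made-up numbers. It assumes the usual reading of the two terms: the weighted Fst is a ratio of sums over SNPs, while the mean Fst is the average of per-SNP ratios. The vectors num and den are hypothetical per-SNP numerator and denominator variance components, not output from SNPRelate.

```r
# Hypothetical per-SNP variance components for three SNPs in one window
# (toy values chosen for illustration only).
num <- c(0.02, 0.10, 0.01)   # between-population component per SNP
den <- c(0.40, 0.50, 0.10)   # total variance per SNP

# Per-SNP Fst: one ratio per SNP.
fst_snp <- num / den

# Mean Fst: arithmetic mean of the per-SNP ratios.
mean_fst <- mean(fst_snp)

# Weighted Fst: ratio of the summed components, so SNPs with a
# larger total variance contribute more to the estimate.
weighted_fst <- sum(num) / sum(den)

print(c(mean = mean_fst, weighted = weighted_fst))
```

The two values generally differ; they coincide only when every SNP has the same denominator or the same per-SNP ratio.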
