I have a RADseq data set (from Stacks) that has no chromosome identifiers for the SNP calls, and importing the VCF file fails with an error because of this. To get around it, I replaced the "unk" chromosome value in the VCF with 1 for every SNP, and the file now loads. When I filter out linked SNPs, only about 400 of the 40,000 SNPs remain. My PCA shows virtually no structure, although we were expecting a strong signal: the top principal components each explain only around 5% of the variance, and that share does not drop appreciably after the first two. If I run the PCA without filtering out the linked SNPs I get a similar result, which seems even more surprising.
Does anyone have any thoughts on the source of these odd results?
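
For reference, the relabelling step described above can be sketched roughly as follows. This is only an illustration of that kind of edit, not the exact command I ran: the function name `relabel_chrom` and the example records are made up, and it assumes a plain tab-delimited, uncompressed VCF. It leaves header lines alone, so any `##contig` lines would still mention the old name.

```python
def relabel_chrom(lines, old="unk", new="1"):
    """Yield VCF lines with the CHROM value `old` replaced by `new`.

    Hypothetical helper for illustration. Header lines (starting with '#')
    are passed through unchanged; only the first column of data lines is
    touched.
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        # CHROM is the first tab-separated field of every data line.
        chrom, sep, rest = line.partition("\t")
        if chrom == old:
            chrom = new
        yield chrom + sep + rest


# Made-up records, just to show the effect on the CHROM column.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "unk\t17\tsnp1\tA\tG\t.\tPASS\t.\n",
    "unk\t52\tsnp2\tC\tT\t.\tPASS\t.\n",
]
relabelled = list(relabel_chrom(example))
```

Note that after this edit every SNP sits on the same chromosome, so anything downstream that uses chromosome and position treats the whole data set as one linkage group.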