remove duplicated rows generated by read.ilmn()


Rao,Xiayu ▴ 550

@raoxiayu-6003

Last seen 8.9 years ago

United States

Hello,

It is interesting to see that read.ilmn() generates some extra rows. It turns out that the extra rows are duplicates, possibly caused by the different numbers of decimal places used in the different data sets when I combined the data from different chips. How do I remove the duplicates (keeping only the first row) from the EListRaw object x?

I simply merged the different data sets to keep only the shared probe IDs and SYMBOLs, and I removed the rows with blank values in the SYMBOL column.

    library(limma)
    x <- read.ilmn(files="all_samples.txt", ctrlfiles="all_samples_cont.txt", other.columns="Detection")
    x

    > x$E[rownames(x$E)=="1430239",]
            X8508853077_D X8508853077_G X8909358290_B X8909358290_C X8909358290_E X8909358290_F X8909358290_G X8909358290_H X8909358290_I
    1430239      9375.439      6950.326      2623.736      2923.202      13051.55      9663.131      9752.901      6377.608      7601.526
    1430239      9375.400      6950.300      2623.700      2923.200      13051.60      9663.100      9752.900      6377.600      7601.500
            X167.H1 X403.H1 X495.E6 X527.E5 X544.B8 X619.C7 X625.D3 X340.H1 X648.B4
    1430239 11586.2 12608.4 13100.6 10750.6 13198.5 12399.2 17938.1 14340.3   17329
    1430239 11586.2 12608.4 13100.6 10750.6 13198.5 12399.2 17938.1 14340.3   17329

Note: I am using the latest versions of R and limma.

Thank you very much!

Xiayu
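One way to keep only the first row for each probe ID is base R's duplicated() applied to the row names. The sketch below is not taken from the thread: it uses a small made-up matrix E as a stand-in for x$E, with invented sample names S1 and S2 and an invented second probe ID. The last comment shows how the same index might be applied to the whole object, assuming row subsetting with `[` behaves for this EListRaw as it does for other limma data objects.

```r
# Toy stand-in for x$E: expression matrix with probe IDs as row names.
# All values except the first two rows' IDs are made up for illustration.
E <- matrix(c(9375.439, 9375.400, 120.5,
              6950.326, 6950.300,  98.1),
            nrow = 3,
            dimnames = list(c("1430239", "1430239", "10008"),
                            c("S1", "S2")))

# duplicated() flags the second and later occurrences of each ID,
# so negating it keeps the first row for every probe.
keep <- !duplicated(rownames(E))
E_unique <- E[keep, , drop = FALSE]
print(E_unique)

# On the real object (not run here; assumes limma is loaded and x exists):
# x_unique <- x[!duplicated(rownames(x$E)), ]
```

Note that this keeps whichever copy comes first, so it is worth checking which rows are being dropped before relying on it.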

limma limma • 786 views

9.7 years ago Rao,Xiayu ▴ 550


Rao,Xiayu ▴ 550


Hello,

Sorry for my last email. The extra rows are the control probes, which will be removed by neqc() during background correction and normalization.

Thanks,
Xiayu

-----Original Message-----
From: bioconductor-bounces@r-project.org On Behalf Of Rao,Xiayu
Sent: Tuesday, July 29, 2014 12:59 PM
To: 'bioconductor at r-project.org'
Subject: [BioC] remove duplicated rows generated by read.ilmn()
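To check that repeated probe IDs come from control probes rather than from true duplicates, the probe annotation can be inspected before normalization. The sketch below is illustrative only: it builds a made-up genes table and matrix in place of x$genes and x$E, and it assumes the annotation has a Status column that labels ordinary probes "regular" and control probes by their control type (here "NEGATIVE"), which is how read.ilmn() is expected to mark them when ctrlfiles is supplied.

```r
# Toy stand-ins for x$genes and x$E; the Status values are assumptions.
genes <- data.frame(ProbeID = c("1430239", "1430239", "10008"),
                    Status  = c("regular", "NEGATIVE", "regular"),
                    stringsAsFactors = FALSE)
E <- matrix(c(9375.439, 9375.400, 120.5,
              6950.326, 6950.300,  98.1),
            nrow = 3,
            dimnames = list(genes$ProbeID, c("S1", "S2")))

# Count probes of each type to see how many control rows were read in.
print(table(genes$Status))

# Keep only the regular probes; the control rows drop out.
regular <- genes$Status == "regular"
E_regular <- E[regular, , drop = FALSE]
print(E_regular)

# On the real object (not run here; assumes limma is loaded and x exists):
# table(x$genes$Status)
# y <- neqc(x)   # uses the control probes, then removes them
```

If the duplicated IDs disappear once the non-regular rows are excluded, no manual de-duplication is needed.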

9.7 years ago Rao,Xiayu ▴ 550
