Dear Naima,
You are right, I must have missed this.
Please replace "ds <- rbind(ds, tmp)" with:
ds <- rbind(ds, tmp[setdiff(rownames(tmp),rownames(ds)),])
However, please note that it does not matter, since later on I intersect
the rownames with the rownames of the expression data. Furthermore,
please note that this was only a trial to compare the three arrays, and
there is no warranty that "script4bestmatch.R" is correct. Other people
might have better solutions to compare the three arrays based on the
BestMatch.txt files of Affymetrix.
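
To illustrate the difference, here is a minimal sketch on made-up toy data. The data frame contents, the probeset ID "1552258_at", the percentages, and the names `old` and `new` are hypothetical and not taken from the BestMatch file. It assumes the best-matching row for a duplicated HuGene ID is also its first occurrence, which is the case that leads to the renamed rows:

```r
# Toy data (hypothetical values, for illustration only)
ma <- data.frame(
  HuGene  = c(8076569, 8076569, 8074791),
  Percent = c(99.41, 80.00, 98.42),
  row.names = c("1552257_a_at", "1552258_at", "1552264_a_at")
)

dup <- duplicated(ma[, 1])

# Row that maxunique() would pick for the duplicated HuGene ID 8076569
ds <- ma["1552257_a_at", ]

# First occurrences of each HuGene ID; this again contains "1552257_a_at"
tmp <- ma[!dup, ]

# Original line: the repeated row name is made unique by rbind()
old <- rbind(ds, tmp)
rownames(old)
# "1552257_a_at" "1552257_a_at1" "1552264_a_at"

# Replacement line: rows already present in ds are skipped
new <- rbind(ds, tmp[setdiff(rownames(tmp), rownames(ds)), ])
rownames(new)
# "1552257_a_at" "1552264_a_at"
```

The extra "1" suffix appears because rbind() on data frames does not allow duplicated row names and makes them unique instead.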
Best regards
Christian
On 11/4/10 11:28 AM, Naïma Oumouhou wrote:
> Dear Christian,
>
> I read your vignette "Introduction to the xps Package: Comparison to
> Affymetrix Power Tools" and I tried to compare 2 gene expression arrays:
> U133 Plus 2 and Human Gene ST 1.
>
> I followed your R instructions in the script "script4bestmatch.R". But I
> noticed something strange in my output.
>
> I downloaded "U133PlusVsHuGene_BestMatch.txt" from the Affymetrix website.
>
> My instructions are:
>
> #Function "uniqueframe"
> uniqueframe <- function(ma) {
>    maxunique <- function(id, m) {
>       m <- m[which(m[,1] == id),];
>       m <- m[which(m[,2] == max(m[,2])),];
>       return(m[1,]);
>    }
>
>    dup <- duplicated(ma[,1])
>    uni <- unique(ma[dup,1])
>    ds <- NULL
>    for (i in uni) {ds <- rbind(ds, maxunique(i,ma))}
>    tmp <- ma[dup==F,]
>    ds <- rbind(ds, tmp)
>    ds <- ds[order(rownames(ds)),]
>    return(ds)
> }
>
> # Importation of "U133PlusVsHuGene_BestMatch.txt"
> up2hg<-read.delim("D:/Naima/CancerMoelleOsseuse_EFS/Analyse_Package_XPS/U133PlusVsHuGene_BestMatch.txt",row.names=3,comment.char="")
> dim(up2hg)
> [1] 29129    19
> up2hg<-up2hg[,5:6]
> up2hg_cor<-uniqueframe(up2hg)
> colnames(up2hg_cor)<-c("HuGene","PercentU2G")
> dim(up2hg_cor)
> [1] 25251     2
> write.csv2(up2hg_cor,"D:/Naima/CancerMoelleOsseuse_EFS/Outputs/Probesets_U133PlusVsHuGene.csv")
>
> The initial data frame "up2hg" contains 29129 rows, and after applying
> "uniqueframe" the resulting data frame has 25251 rows. But the number of
> unique probesets for the Human Gene array is 17984.
>
> When I look at the output (Probesets_U133PlusVsHuGene.csv), there is
> something strange:
>
> For example:
>
> U1332P          HuGene    PercentU2G
> 1552257_a_at    8076569   99,41
> 1552257_a_at1   8076569   99,41
> 1552264_a_at    8074791   98,42
> 1552264_a_at1   8074791   98,42
>
> There are still duplicated probesets among the HuGene probesets, and new
> probesets such as "1552257_a_at1" are created in U1332P.
>
> Have I done something wrong?
>
> Thank you for your help.
>
> Naïma
>