how to change file format
weinong han ▴ 270
@weinong-han-1250
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20050725/ 48f382d4/attachment.pl
@uri-david-akavia-1277
If you have a UNIX system you can use AWK. Assuming that the original file (ORIGINAL) is separated by tabs, I would use something like (on one line):

    cat ORIGINAL | awk -F"\t" '{print $1"\t"$2" - "$3"\t"$4"\t"$5"\t"$6"\t"$7"\t"$8"\t"$9"\t"$10}'

If the original file is separated by something else, say commas, replace the -F"\t" with the appropriate separator (-F"," and so forth). Note that the header row is processed like every other row, so its Name and Description labels are joined with " - " as well.

Or you could try using something like Excel.

I'm not sure R would be very useful, since I believe it would have to read the entire file into memory, which might be slow.

Yours,
Uri David Akavia

weinong han wrote:
> Dear All,
>
> My question may not be a good fit for the mailing list, but I really need your help. Crouching tigers and hidden dragons are there!
>
> I have a file with 10 column headers (gene, name, description, array1, array2, ..., array7):
>
> Gene    Name    Description    Array 1  Array 2  Array 3  Array 4  Array 5  Array 6  Array 7
> Gene 1  Name 1  Description 1  0.2      -0.1     -1.1     0.4      -4       -2       0.2
> Gene 2  Name 2  Description 2  2.3      2.1      -3       1.1      1.2      -1.6     0.1
> Gene 3  Name 3  Description 3  0.1      1.6      1.2      1.5      2.7      0.4      -0.4
> Gene 4  Name 4  Description 4  0.3      -1.5     -1.7     0.2      0.4      2        -2.1
> Gene 5  Name 5  Description 5  1.7      2.3      2.3      2.3      3        -2       2.1
> Gene 6  Name 6  Description 6  0.2      4        4        4        0.2      -3       -4
> Gene 7  Name 7  Description 7  -0.3     1.5      1.5      1.5      -0.2     1.7      3
> Gene 8  Name 8  Description 8  1.4      -0.6     -1.1     -0.3     -3       -3       1.4
>
> I want to get the following file format:
>
> Gene    Name                    Array 1  Array 2  Array 3  Array 4  Array 5  Array 6  Array 7
> Gene 1  Name 1 - Description 1  0.2      -0.1     -1.1     0.4      -4       -2       0.2
> Gene 2  Name 2 - Description 2  2.3      2.1      -3       1.1      1.2      -1.6     0.1
> Gene 3  Name 3 - Description 3  0.1      1.6      1.2      1.5      2.7      0.4      -0.4
> Gene 4  Name 4 - Description 4  0.3      -1.5     -1.7     0.2      0.4      2        -2.1
> Gene 5  Name 5 - Description 5  1.7      2.3      2.3      2.3      3        -2       2.1
> Gene 6  Name 6 - Description 6  0.2      4        4        4        0.2      -3       -4
> Gene 7  Name 7 - Description 7  -0.3     1.5      1.5      1.5      -0.2     1.7      3
> Gene 8  Name 8 - Description 8  1.4      -0.6     -1.1     -0.3     -3       -3       1.4
>
> In the above file format, the first row is a header row, where the names of the arrays/experiments are specified from column 3 on. The second row and on specify expression data for each gene, where the first column is the unique identifier of each gene, the second column specifies the name and the description of the gene, where the name and description are separated by " - " (the surrounding spaces are important), and column 3 and on specify the expression data for the gene across all experiments.
>
> Thanks very much for your help in advance. Any suggestions and advice will be much appreciated.
>
> Best Regards
> Han Weinong
> hanweinong at yahoo.com
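A slightly more general variant of the awk one-liner above can be sketched as follows. It assumes a POSIX shell and awk and a tab-separated input; the file names original.txt and converted.txt are hypothetical, and the sample rows are a shortened copy of the data in the question. The header row is handled separately so that it reads Gene, Name, Array 1, ... as requested, and the remaining fields are copied in a loop so the number of array columns does not have to be hard-coded.

```shell
#!/bin/sh
# Build a small tab-separated sample (hypothetical file name, shortened data).
printf 'Gene\tName\tDescription\tArray 1\tArray 2\n' > original.txt
printf 'Gene 1\tName 1\tDescription 1\t0.2\t-0.1\n' >> original.txt
printf 'Gene 2\tName 2\tDescription 2\t2.3\t2.1\n' >> original.txt

# Header row (NR == 1): keep columns 1 and 2, drop the Description label.
# Data rows: merge columns 2 and 3 with " - " (surrounding spaces included).
# In both cases, copy columns 4..NF unchanged.
awk -F'\t' 'BEGIN { OFS = "\t" }
{
    line = (NR == 1) ? ($1 OFS $2) : ($1 OFS $2 " - " $3)
    for (i = 4; i <= NF; i++) line = line OFS $i
    print line
}' original.txt > converted.txt

cat converted.txt
```

For a comma-separated input, -F'\t' would become -F',' as described above; OFS controls the separator written to the output and can be changed independently.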
@adaikalavan-ramasamy-675
If I understand your question, this is probably what you want.

    df <- read.delim( file="lala.txt", row.names=NULL, check.names=FALSE )

This will read in a tab-delimited file; check.names=FALSE keeps headers such as "Array 1" exactly as they are in the file. If your file is comma separated values or other formats see help(read.csv) or help(read.table). At this point, R will automatically assign rownames 1,2,...,8 but we can ignore this.

    df[ , "Name"] <- paste( df[ , "Name"], df[ , "Description"], sep=" - " )
    df <- df[ , names(df) != "Description" ]
    write.table( df, file="modified_lala.txt", sep="\t", quote=FALSE, row.names=FALSE )

The merged text replaces the Name column, so it stays in second position, and the Description column is dropped.

Hopefully this should do the trick. If it does not, then try changing quote=FALSE or some other parameters of write.table.

At this point I would strongly suggest you read help(subset) and the Introduction to R (http://cran.r-project.org/doc/manuals/R-intro.html).

Regards, Adai

On Mon, 2005-07-25 at 22:54 -0700, weinong han wrote:
> [...]
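Whichever approach is used, the result can be checked against the format described in the question. The following is a minimal sketch, assuming a POSIX shell and awk; the file name modified_lala.txt is taken from the R code above, while the sample contents written here are hypothetical and shortened. It verifies that every row has as many tab-separated fields as the header and that column 2 of each data row contains the " - " separator.

```shell
#!/bin/sh
# Hypothetical, shortened sample of a converted file (tab-separated).
printf 'Gene\tName\tArray 1\tArray 2\n' > modified_lala.txt
printf 'Gene 1\tName 1 - Description 1\t0.2\t-0.1\n' >> modified_lala.txt
printf 'Gene 2\tName 2 - Description 2\t2.3\t2.1\n' >> modified_lala.txt

# Remember the header's field count, then report any data row that has a
# different field count or lacks " - " in column 2. Print OK if none found.
result=$(awk -F'\t' '
NR == 1 { n = NF; next }
NF != n { printf "line %d: %d fields, expected %d\n", NR, NF, n; bad = 1 }
index($2, " - ") == 0 { printf "line %d: no \" - \" in column 2\n", NR; bad = 1 }
END { if (!bad) print "OK" }
' modified_lala.txt)

echo "$result"
```

Any line reported by the check points at a row where the merge or the column selection did not behave as intended.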