Question: How to automate a process on the command line for 1500 genes?
0
4.2 years ago by
Germany
ChIP-Tease0 wrote:

Hello everybody,

I have a problem I need some help with.

When I run a program on the command line for a single gene, it works fine:

program.sh genename_1 ../../../unchanged_file.txt ../../aa_bb_cc_dd_genename_1.gff genename_1.bed

But I need to run it about 1500 times on the command line and I don't know how to automate it.

I have a folder. Within this folder, there are different files with different endings. I only want to analyze the files which end with .gff.

All the .gff files start with the same prefix, aa_bb_cc_dd_, followed by the gene name, an underscore, and a number, like this:

aa_bb_cc_dd_Genename_1.gff

aa_bb_cc_dd_Genename_2.gff

aa_bb_cc_dd_Genename_3.gff

aa_bb_cc_dd_otherGenename_1.gff

aa_bb_cc_dd_otherGenename_2.gff

There are more than 1500 combinations.

The code which does the job for one file looks like this:

program.sh genename_1 ../../../unchanged_file.txt ../../aa_bb_cc_dd_genename_1.gff genename_1.bed

Is there any way to do this for all 1500 .gff files in a few steps? I'm very sorry I cannot suggest anything myself, but I don't have much experience with the command line. I could do something like this in R, but that doesn't help much here.

Thanks a lot, Alex

command line automation loop • 638 views
modified 4.2 years ago • written 4.2 years ago by ChIP-Tease0
Answer: How to automate a process on the command line for 1500 genes?
1
4.2 years ago by
Jim Hester10
United States
Jim Hester10 wrote:

Note that you can do this in R as well: the system() function can call any command-line program. Remove the echo from the examples below to actually call program.sh.

gffs <- list.files(pattern = "\\.gff$", full.names = TRUE)
lapply(gffs, function(file) {
  # Strip the directory, the fixed prefix and the extension to get the gene name
  gene <- gsub(".*aa_bb_cc_dd_(.*)\\.gff$", "\\1", file)
  system(sprintf("echo program.sh %s ../../../unchanged_file.txt %s %s.bed", gene, file, gene))
})

But you can of course do a similar thing with bash:

for file in *.gff; do
  temp=${file##aa_bb_cc_dd_}
  gene=${temp%.gff}
  echo program.sh "$gene" ../../../unchanged_file.txt "$file" "$gene.bed"
done
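A slightly more defensive sketch of the same loop, assuming (as in the example command from the question) that the .gff files may sit in another directory such as ../../ and that program.sh is on the PATH. The function names gene_from_gff and run_all and the DRY_RUN switch are made up for illustration and are not part of program.sh.

```shell
#!/usr/bin/env bash
# Sketch only: gene_from_gff, run_all and DRY_RUN are illustrative names.

# Print the gene label for one .gff path, e.g.
# ../../aa_bb_cc_dd_Genename_1.gff -> Genename_1
gene_from_gff() {
    local name="${1##*/}"          # drop the directory part
    name="${name#aa_bb_cc_dd_}"    # drop the fixed prefix
    printf '%s\n' "${name%.gff}"   # drop the extension
}

# Print (or, with DRY_RUN=0, execute) one program.sh call per .gff file.
run_all() {
    local gff_dir="${1:-../..}" path gene
    for path in "$gff_dir"/aa_bb_cc_dd_*.gff; do
        [ -e "$path" ] || continue   # skip the literal pattern if nothing matched
        gene=$(gene_from_gff "$path")
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo program.sh "$gene" ../../../unchanged_file.txt "$path" "$gene.bed"
        else
            program.sh "$gene" ../../../unchanged_file.txt "$path" "$gene.bed"
        fi
    done
}
```

Calling run_all ../.. only prints the commands, so they can be checked first; DRY_RUN=0 run_all ../.. would actually run them. Quoting the variables keeps file names with spaces intact.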
modified 4.2 years ago • written 4.2 years ago by Jim Hester10

Thanks a lot, I didn't know that R can call command-line programs. This will be very useful for me.

I guess I will try both ways.

Thanks a lot again!

written 4.2 years ago by ChIP-Tease0
Answer: How to automate a process on the command line for 1500 genes?
0
4.2 years ago by
tangming2005140
United States
tangming2005140 wrote:

Something like this:

for file in *.gff
do
  command "$file"
done

written 4.2 years ago by tangming2005140

Thank you!

written 4.2 years ago by ChIP-Tease0
Answer: How to automate a process on the command line for 1500 genes?
0
4.2 years ago by
Germany
ChIP-Tease0 wrote:

Hello everybody,

I cannot really make this suggestion work.

for file in *gff;do
gene=${${file##aa_bb_cc_dd_}%.gff}
echo program.sh $gene ../../../unchanged_file.txt $file $gene.bed
done

The problem seems to be this part:

gene=${${file##aa_bb_cc_dd_}%.gff}

I understand that the $ sign excludes what is written in the brackets from the output.

Meaning

${file##aa_bb_cc_dd_}

on aa_bb_cc_dd_example_gene.gff will give me example_gene.gff, and

gene=${${file##aa_bb_cc_dd_}%.gff}

on aa_bb_cc_dd_example_gene.gff should give me example_gene.

But it tells me "bad substitution".

This probably means that some symbol is wrong, but I cannot figure out which one, and I don't really know what to google for to find the rules for defining variables. Maybe someone has a link or knows what is wrong.

Thanks a lot, Alex

written 4.2 years ago by ChIP-Tease0

I forgot that bash, unlike zsh, doesn't support nested substitutions. In bash you have to do it in two steps:

for file in *.gff; do
  temp=${file##aa_bb_cc_dd_}
  gene=${temp%.gff}
  echo program.sh "$gene" ../../../unchanged_file.txt "$file" "$gene.bed"
done

I have updated my answer accordingly. If it answers your question, please mark it as accepted.
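For reference, a small sketch of what the two expansions do, using the file name from the follow-up above. Inside ${...}, ##pattern removes the longest match of the pattern from the front of the value and %pattern removes the shortest match from the end; the $ itself only starts the expansion. The variable names are just for illustration.

```shell
file=aa_bb_cc_dd_example_gene.gff

temp=${file##aa_bb_cc_dd_}   # strip the fixed prefix from the front
gene=${temp%.gff}            # strip the extension from the end

echo "$temp"   # prints example_gene.gff
echo "$gene"   # prints example_gene
```

Because each expansion takes a variable name rather than another expansion, bash needs the intermediate variable temp.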

modified 4.2 years ago • written 4.2 years ago by Jim Hester10

Hello Jim,

Thanks a lot!

I accepted it. I didn't know until now that I could accept answers. Thanks for that hint as well.

written 4.2 years ago by ChIP-Tease0