Alias in C-shell:
Aliases let you replace a long shell command with a short string. On all binf machines, user-defined aliases are kept in the file '.alias' in your home directory. The alias syntax differs between the C shell (csh; used on all of our binf machines) and the Bourne-again shell (bash; e.g., on Ubuntu and Fedora). The word following alias is the new short command. Some useful aliases:
Disk usage of all the folders in a directory
For csh: alias du 'du -h --max-depth=1'
For bash: alias du='du -h --max-depth=1'

Logging into mutant e.g., m 10
For csh: alias m 'ssh -X mutant\!*'
For bash: m () { ssh -X mutant"$@"; }

Copy files between machines using tar and ssh e.g., shcp 10 test (copies folder "/linuxhome/tmp/user/test" from mutant10 to the current directory)
For csh: alias shcp 'ssh mutant\!:1 "cd /linuxhome/tmp/user/\!:2 ; tar cf - ./" | tar xvf -'
For bash: shcp () { ssh mutant"$1" "cd /linuxhome/tmp/user/$2 ; tar cf - ./" | tar xvf -; }

(A bash alias cannot use arguments such as $1, so in bash m and shcp are defined as functions.)
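A bash alias is plain text substitution and has no positional parameters, which is why m above is written as a function in bash. Below is a minimal sketch of that pattern. The helper names mkhost and shcp_cmd are made up for illustration, and they only print what would be run instead of calling ssh, so the example is safe to try anywhere.

```bash
#!/bin/bash
# Hypothetical helper: builds the host name that 'm 10' would log into.
mkhost () {
    # "$1" is the first argument given to the function, e.g. 10
    echo "mutant$1"
}

# Hypothetical helper: a function can check its arguments before doing anything.
shcp_cmd () {
    if [ "$#" -ne 2 ]; then
        echo "usage: shcp_cmd <machine number> <folder>" >&2
        return 1
    fi
    # Only print the remote command that would be run; nothing is copied here.
    echo "ssh mutant$1 \"cd /linuxhome/tmp/user/$2 ; tar cf - ./\""
}

host=$(mkhost 10)
cmd=$(shcp_cmd 10 test)
echo "$host"
echo "$cmd"
```

Put functions like these in your bash startup file (for example ~/.bashrc) to have them available in every new terminal.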
Bash scripts for repetitive tasks:
Disclaimer: if you have a specific task you can of course always look it up with Google, and you will find ten ways of doing it, many of them much better than what I put here.
But this little piece could serve as an intermediate between "how to use an array in bash" and "crazy complicated sysadmin stuff".
Bash scripts are a great way to start many simulations in one go, to do simple tasks on files etc, but sometimes it takes a bit of fiddling to get stuff to work. Basically you can put a number of commands that you'd normally type in the terminal in a file, put the line #!/bin/bash on top, make it executable with chmod +x and you are good to go. However, you can do clever stuff with loops and variables that prevent you from having to do a lot of copy-paste, and you can have Bash read files for you and determine by itself what should be done. Be careful though: misplaced rm -f commands are also dangerous here, especially when paired with a loop!
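The steps described above can be sketched as follows. The file name hello.sh, the run_* directories and the scratch directory from mktemp are invented for illustration. The loop at the end shows one cautious way to combine rm with a loop: quote the variable and write ${d:?}, so that an empty variable aborts the command instead of expanding to a path you did not intend.

```bash
#!/bin/bash
workdir=$(mktemp -d)          # scratch directory so nothing real is touched
cd "$workdir" || exit 1

# 1. put the commands in a file, with #!/bin/bash on top
cat > hello.sh <<'EOF'
#!/bin/bash
echo "hello from a script"
EOF

# 2. make it executable and run it
chmod +x hello.sh
output=$(./hello.sh)
echo "$output"

# 3. rm inside a loop: quote the variable and refuse to run on an empty value
mkdir run_1 run_2 run_3
for d in run_1 run_2; do
    rm -rf -- "./${d:?}"
done
remaining=$(ls -d run_*)
echo "$remaining"
```

A cheap extra check is to put echo in front of the rm line for a first dry run, so you see exactly which paths the loop would delete.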
simple simulation start script:
#!/bin/bash
# a very simple runscript. I provide different seeds to my program which I store in the arrays STR2 and STR3
STR2=("963" "39" "244" "398" "62" "517" "887" "611" "166" "138")
STR3=("523" "32" "1346" "23476" "446" "2341" "3434" "61342" "5234" "9754")
#I only want to start 10 simulations at a time. there are different ways of splitting your runs up, but here I took the lazy route.
for i in ${STR2[*]}; do
  ./my_program -d directory_$i -s $i parfile.cfg &
done
wait # makes sure all the simulations are done before starting the new set.
for i in ${STR3[*]}; do
  ./my_program -d directory_$i -s $i parfile.cfg &
done
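A slightly less lazy route than two hand-made arrays is to keep all seeds in one array and call wait after every batch. This is only a sketch: fake_program is a hypothetical stand-in for ./my_program that just writes a log file, and the names SEEDS and BATCH and the batch size of 3 are assumptions chosen to keep the example small.

```bash
#!/bin/bash
outdir=$(mktemp -d)

# Hypothetical stand-in for ./my_program: records the seed it was given.
fake_program () {
    echo "seed $1" > "$outdir/run_$1.log"
}

SEEDS=("963" "39" "244" "398" "62" "517" "887")
BATCH=3      # start at most this many runs at a time
started=0

for i in "${SEEDS[@]}"; do
    fake_program "$i" &
    started=$((started + 1))
    # after every BATCH runs, wait for all of them before starting more
    if (( started % BATCH == 0 )); then
        wait
    fi
done
wait         # wait for the last, possibly incomplete, batch

nlogs=$(ls "$outdir" | wc -l)
echo "finished runs: $nlogs"
```

Each batch still waits for its slowest run before the next one starts, so this does not keep all cores busy all the time, but the seed list no longer has to be split by hand.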
more complex analysis code with data reading, formatting and plotting
The script below is modified from one I use to analyse many things from my simulations; it does a lot of different tasks. The comments (starting with #) explain what each part does.
Note that this script is no longer functional! I stripped a lot of redundant content and not all variables will be initialised.
In short you will find code here for:
using variables in a bash script
allowing the script user to type in what the value of a variable should be (interactively)
reading a file and storing (some of) its content into variables and arrays
selecting files and storing (part of) their names
looping through arrays with for loops
looping with while loops
selectively executing commands with if statements
parsing file contents and printing data with awk (good for combining files)
plotting from the command line and with batchfiles with gnuplot
combining pictures into one with montage (an imagemagick tool)
#!/bin/bash
##the top line should always be there. Make script executable with chmod +x
# Declare array
declare -a DIRARRAY
declare -a AGENTARRAY
declare -a SUCCESS
############################## Data collection ##############################
# with read, you can ask the user to type in information that ends up in a variable.
echo "please enter the directory general name and the parfile"
read direcname parfile
echo "Analysing directories $direcname, with parfile $parfile"
#read the name of the file with each seed to analyse
echo "Now, please enter the file with seeds"
read agfile
#print the contents of the file for checking
echo "The contents of the file: "
cat $agfile
## extract the contents of the agfile into the appropriate arrays in Bash ##
mapfile -t FILEARRAY < $agfile #FILEARRAY contains as elements the lines in the file
COUNTER=0
#for loop!
for el in "${FILEARRAY[@]}"; do
  IFS=' ' read seed <<< "$el" #extract the seed from each line in the file
  DIRARRAY[$COUNTER]=$seed
  ((COUNTER++))
done #end of for loop
#note how, when you set a variable, you just give the variable name.
#However, when you read out its value, you add a $.
COUNTER=0
COUNTER2=0
echo $COUNTER
#some normal commands as you'd usually type them.
rm -rf $direcname\_datafiles/
mkdir $direcname\_datafiles/
#printf: formatted printing: does not automatically append a newline
printf "" > $direcname\_datafiles/nrbands.dat
######################### How I do my analysis #########################
#another loop, looping over every element in DIRARRAY.
#the "all" is denoted by the @.
# here I loop through the directories containing my simulation data
for seed in ${DIRARRAY[@]}; do
  #This directory contains a number of files starting with "Coded". ls lists them all, head then selects the first of those
  #the $(...) allows me to store this in the variable named FILE.
  FILE=$(ls $direcname\_$seed/FittestGeneration0000009900/Coded* | head -1)
  AGENTID=${FILE:(-10)} #extract the number (which is the last 10 characters of the file name)
  AGENTARRAY[$COUNTER]=$AGENTID #store in this array.
  echo "agent: $AGENTID"
  #read some data from a file: was this run a success?
  #first get the filename again
  FILE2=$(ls $direcname\_$seed/FittestGeneration0000009900/FitnessDetails* | head -1)
  # read data from the file into a variable. awk is a mighty handy tool for all kinds of file reading and manipulation
  # the piece outside the curly brackets is the condition and the piece between them is what is executed if the condition is satisfied.
  # NR is the current line number, so awk prints the second element of the second line in this case.
  read -r bandnr < <(awk 'NR==2 {printf $2}' $FILE2)
  SUCCESS[$COUNTER]=$bandnr
  #append data to a file with >> because > overwrites the file.
  echo $bandnr >> $direcname\_datafiles/nrbands.dat
  # I only run my analysis program if the run was successful.
  # if statements! Friggin' sensitive things
  # note the spaces around [ and ]? Don't forget those.
  # for more info on if statements: http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_07_01.html
  if [ "${SUCCESS[$COUNTER]}" -gt "1" ]; then
    ((COUNTER2++))
    rm -rf $direcname\_$seed/analyse$AGENTID
    ## run analysis program ##
    ./shortanalyse $AGENTID $direcname\_$seed/analyse$AGENTID -d $direcname\_$seed/FittestGeneration0000009900/ -s $seed $parfile
  fi #end of if statement.
  ((COUNTER++))
done
########################################## Data collection and plotting ##########################################
COUNTER=0
# while statement!!
# note how you can either compare vars as "$var" -lt "16" or as $(($var)) -lt 16.
# Both are weird. take your pick, they should both compare the numerical value.
while [ $(($tel)) -lt 16 ]; do
  printf "$tel 0.0 0.0 0.0\n" >> $direcname\_datafiles/degreedistr_original.dat #print some data
  if [ $(($tel)) -lt 7 ]
  then
    printf "$tel 0\n" >> $direcname\_datafiles/loopfreqs_segm.dat
  fi
  ((tel++))
done
#some more awk because it is cool
awk '$1=="0" { printf $3 " " }' $some_file >> $some_otherfile #printf does not append newline at the end
awk '$1=="5" { print $2, $3, $4 }' $some_3rdfile >> $some_otherfile #print does end with newline
: '
more elaborate command line awk usage: f is a variable which can be set in the {} part and checked in the condition.
# the script below checks whether the 8th element on the line equals 2. if so, and nothing has been done yet (f=0), print the first element of that line. now something has been done so f=1.
if the 8th element is >2 and nothing has been done yet, print first 0, then element 1 of that line and element 8, and finish (f=2)
if instead you already printed something (so f=1), then just print the first element, the 8th and finish (f=2).
'
awk '$8==2 && f==0 {printf $1 " "; f=1} $8>2 && f==1 {print $1 " " $8; f=2} $8>2 && f==0 {print "0 " $1 " " $8; f=2}' ${direcname}_$seed/PopBandDynamics >> $direcname\_datafiles/firstband_time.dat
#collect data from two files into one, using conditionals
#NR==FNR is only true while awk reads the first file. Store multiple elements of this line in arrays (a[NR]=$2) then skip to the next line with next
#for the second file only the second {} block runs: there we sum and print data from the second file with the data from the first file (in the arrays)
awk 'NR==FNR {a[NR]=$2;b[NR]=$3;c[NR]=$4;next} {print $1, $2+a[FNR],$3+b[FNR], $4+c[FNR]}' $firstfile $secondfile > $endfile
#when the file contains a single element, you can easily read it into a variable:
genetoplot=$(cat "filename.dat")
#gnuplot is great for quick plotting from the command line. you can give as many consecutive commands as you want with -e and " ..;.."
#this one will make a 2d plot with the xcoordinates from col 4, the ycoordinates from col 5 and the points colored according to col 2.
gnuplot -e "unset key; set term svg; set output 'bands_size.svg'; plot 'FullAncestry' u 4:5:2 w linespoints pt 7 palette"
#if you want to make things pretty or have a lot of commands to give, a batchfile collecting these may be handier.
# with -e you can also pass an argument to the batchfile: neat!
gnuplot -e "gene=$genetoplot" gnubatchfile
#or we just call python to do something for us.
python somescript.py filename.dat $genetoplot
#below I collect the filenames of many pictures in an array to paste them together in one figure.
declare -a FOUR
## picture collection ##
# collect the file names
COUNTER=0
for seed in ${DIRARRAY[@]}; do
  FOUR=("${FOUR[@]}" "$direcname""_$seed/somesubdir${AGENTARRAY[$COUNTER]}/thispic.png")
  ((COUNTER++))
done
#an imagemagick command combining the pictures in a certain way. tile specifies the number of columns (-tile nr x) or rows (-tile x nr) of pictures
#geometry here specifies the number of pixels between each picture. here I added 10 for both the x and y direction.
montage "${FOUR[@]}" -geometry +10+10 -tile 4x $direcname\_datafiles/fourierdata.png
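Since the script above does not run as-is, here is a small self-contained sketch of three of its techniques: reading a file into an array with mapfile, picking one field out of a file with awk, and summing two files column by column with the NR==FNR idiom. The file names (seeds.txt, first.dat, second.dat, sum.dat) and their contents are invented for illustration. The single field is read with $(...) and print rather than read and printf, which is a small deviation from the script above.

```bash
#!/bin/bash
tmp=$(mktemp -d)              # scratch directory for the made-up files
cd "$tmp" || exit 1

# made-up input files
printf '963 agentA\n39 agentB\n244 agentC\n' > seeds.txt
printf '0 1 2 3\n1 4 5 6\n' > first.dat
printf '0 10 20 30\n1 40 50 60\n' > second.dat

# read every line of seeds.txt into an array, one element per line
mapfile -t FILEARRAY < seeds.txt

declare -a DIRARRAY
for el in "${FILEARRAY[@]}"; do
    # split the line on spaces: the first word is the seed, the rest goes into _
    IFS=' ' read -r seed _ <<< "$el"
    DIRARRAY+=("$seed")
done
echo "seeds: ${DIRARRAY[*]}"

# second field of the second line of a file
bandnr=$(awk 'NR==2 {print $2}' first.dat)
echo "bandnr: $bandnr"

# NR==FNR is only true while awk reads the first file: store its columns,
# then add them to the matching line of the second file
awk 'NR==FNR {a[NR]=$2; b[NR]=$3; c[NR]=$4; next}
     {print $1, $2+a[FNR], $3+b[FNR], $4+c[FNR]}' first.dat second.dat > sum.dat
cat sum.dat
```

Writing DIRARRAY+=("$seed") appends to the array without a separate counter, which saves the COUNTER bookkeeping when you only need to add elements.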