I have been working with BLAST for aligning sequences. This is a fairly computationally intensive task that is well worth parallelising. GNU parallel is a great tool for this, but I found it rather unintuitive to use. I also didn't realise at first that my initial implementation was not actually parallelising BLAST: it was still using only one processor core (traps for the unwary!). I did eventually find an implementation that worked, and it's below:
cat "${in_fd}" | parallel --gnu --max-procs "${max_cores}" --block 100k --recstart '>' --pipe blastx -evalue 0.01 -outfmt 6 -db "${blast_DB_loc}" -query - > output_file.dat
# ${in_fd} is a FASTA-formatted input file
# ${blast_DB_loc} is the database to BLAST against
# ${max_cores} is the number of cores one wishes to use
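Please note that parallelisation like this is only appropriate when the analysis of one segment does not rely on the output of the analysis of another segment, and when the order of the output does not matter: results are written as each job finishes, not in input order. If the order does matter, GNU parallel's --keep-order option should keep the output in the same order as the input.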
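
One way to check that the input really is being split across several jobs is a dry run with a cheap command in place of blastx. The sketch below is only an illustration under stated assumptions, not part of the pipeline above: toy.fasta, the records_per_chunk helper and the 1k block size are all made up for the example. It swaps blastx for grep so that each job reports how many FASTA records its chunk received.

```shell
#!/usr/bin/env bash
# Dry run: replace blastx with `grep -c '^>'` so that every job prints the
# number of FASTA records in the chunk it was handed.
# toy.fasta, records_per_chunk and the sizes used here are hypothetical.

# Build a toy FASTA file: 200 records, each with one 60-base sequence line.
for i in $(seq 1 200); do
    printf '>seq%d\nACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT\n' "$i"
done > toy.fasta

# Print one line per job: the record count of that job's chunk.
#   $1 = FASTA file, $2 = block size (e.g. 1k), $3 = maximum number of jobs
records_per_chunk() {
    # The command is passed as a single quoted string so that the '>' in the
    # grep pattern is not treated as a redirection by the shell parallel starts.
    parallel --max-procs "$3" --block "$2" --recstart '>' --pipe "grep -c '^>'" < "$1"
}

# Only run the check if GNU parallel is installed.
if command -v parallel >/dev/null 2>&1; then
    records_per_chunk toy.fasta 1k 2 |
        awk '{ total += $1 } END { print NR " chunks, " total " records" }'
fi
```

If this reports a single chunk, the input is no larger than the --block size, so only one job is started no matter how many cores are allowed. Several chunks whose counts add up to the number of records in the file suggest the splitting on '>' is working as intended.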