BioMotif: Simple Examples

Example 1

Please note: motifs must be defined using lowercase letters. DNA sequences may be in upper or lower case.

To search for a HindIII restriction site 5 to 25 bases away from an EcoRI site, one could define the following motif in a file named ta:

>ta1
[0,]n (aagctt) [5,25]n (gaattc)
with the following sequence in a file named toto:
>toto
atataagctttatccggaattctaaatgca
running the command
bioMotif DNA -s toto -m ta
one obtains the following results:
>ta1 is the MOTIF name 
 
>toto is the SEQUENCE name
n 1 solutions 
m aagctt 5-10 gaattc 17-22
f
Remarks

Both the motif and the sequence are saved in FASTA-format files. Motif files and sequence files may each contain more than one entry. The motif element [0,]n (zero or more bases of any kind) asks bioMotif to scan the entire sequence.
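The positions reported above can be cross-checked without bioMotif using a short Python sketch. This is not part of bioMotif and does not parse its motif syntax: the function name find_spaced_sites and its parameters are made up for illustration, and positions are assumed to be 1-based and inclusive, as the output above suggests.

```python
import re


def find_spaced_sites(seq, first, second, min_gap, max_gap):
    """Find `first` followed by `second` with min_gap..max_gap bases between.

    Returns a list of ((start1, end1), (start2, end2)) tuples using
    1-based, inclusive positions (assumed to match bioMotif's output).
    """
    seq = seq.lower()
    hits = []
    # A zero-width lookahead lets finditer report overlapping occurrences.
    for m1 in re.finditer("(?=%s)" % re.escape(first), seq):
        end1 = m1.start() + len(first)  # 0-based index just past the first site
        for m2 in re.finditer("(?=%s)" % re.escape(second), seq):
            gap = m2.start() - end1  # number of bases between the two sites
            if min_gap <= gap <= max_gap:
                hits.append(
                    ((m1.start() + 1, end1),
                     (m2.start() + 1, m2.start() + len(second)))
                )
    return hits


# HindIII site, 5 to 25 bases of anything, then EcoRI site (Example 1).
print(find_spaced_sites("atataagctttatccggaattctaaatgca",
                        "aagctt", "gaattc", 5, 25))
```

For the toto sequence this yields the single pair (5, 10) and (17, 22), in agreement with the bioMotif result.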

Example 2

To search for a direct repeat of 6 base pairs located 3 to 10 bases after the original copy, one could use the following motif:

>ta2
[0,]n (a|t|c|g)(nnnnnn) $fragSavAs(%fr1,-5,0)
[3,10]n
$fBBsup(/usr/local/lib/bioMotif/data_IdentityBB,%fr1,1,.99)
with the following sequence:
>toto2
atatgggcccacgtagagggccctcgt
one obtains the following result:
>ta2 is the MOTIF name 
 
>toto2 is the SEQUENCE name
n 1 solutions 
m t 4-4
f fBBsup 18-23
 
>toto2 REVERSE COMPLEMENT
n 1 solutions 
m a 4-4
f fBBsup 18-23
Remarks

The motif element (a|t|c|g) asks bioMotif to match a or t or c or g, in other words any base, which could also be written simply as n. However, as seen in the output, matches to n are not reported. Using the explicit (a|t|c|g) element causes bioMotif to report the position of the matched base, which helps one locate the match more quickly.

Using the function library (in this case, the identity function) requires the full path to the function file. On this computer (GENETICS), the bioMotif library files are in the directory /usr/local/lib/bioMotif/
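The direct-repeat search of Example 2 can be illustrated in the same way with a short Python sketch. It is an assumption-laden stand-in, not bioMotif's method: the names reverse_complement and find_direct_repeats are hypothetical, exact string identity replaces the identity table and threshold passed to fBBsup, and positions are taken to be 1-based and inclusive.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (lowercase)."""
    comp = {"a": "t", "t": "a", "c": "g", "g": "c", "n": "n"}
    return "".join(comp[base] for base in reversed(seq.lower()))


def find_direct_repeats(seq, size=6, min_gap=3, max_gap=10):
    """Find exact direct repeats of `size` bases separated by min_gap..max_gap.

    Returns ((start1, end1), (start2, end2)) tuples, 1-based inclusive.
    Exact identity is an assumption standing in for bioMotif's comparison.
    """
    seq = seq.lower()
    hits = []
    for start in range(len(seq) - size + 1):
        unit = seq[start:start + size]
        for gap in range(min_gap, max_gap + 1):
            second = start + size + gap
            # A slice running past the end is shorter than `unit`,
            # so it can never compare equal.
            if seq[second:second + size] == unit:
                hits.append(((start + 1, start + size),
                             (second + 1, second + size)))
    return hits


toto2 = "atatgggcccacgtagagggccctcgt"
print(find_direct_repeats(toto2))                      # forward strand
print(find_direct_repeats(reverse_complement(toto2)))  # reverse complement
```

On both strands of toto2 the sketch finds the copy at 5-10 repeated at 18-23, which matches the fBBsup position in the output above; the single base reported at position 4 is the one immediately preceding the first copy.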


John Morris 19.Aug.1997