Please note: motifs must be defined using lower-case letters. DNA sequences may be in upper or lower case.
To search for a HindIII restriction site 5 to 25 bases away from an EcoRI site, one could define the following motif in a file named ta:

    >ta1
    [0,]n (aagctt) [5,25]n (gaattc)

With the following sequence in a file named toto:

    >toto
    atataagctttatccggaattctaaatgca

running the command

    bioMotif DNA -s toto -m ta

one obtains the following results:

    >ta1 is the MOTIF name
    >toto is the SEQUENCE name
    n 1 solutions
    m aagctt 5-10 gaattc 17-22
    f

Remarks
Both the motif and the sequence are saved in files in FASTA format. Motif files and sequence files may each contain more than one entry. The motif element [0,]n (zero or more bases of any kind) asks bioMotif to scan the entire sequence.
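For readers who want to cross-check the reported positions without bioMotif, the same search can be approximated with a regular expression. The following is only a sketch in Python, not part of bioMotif: the names HINDIII_ECORI and find_sites are made up for this illustration, and the pattern reports only the nearest EcoRI site for each HindIII site rather than every possible solution.

```python
import re

# Rough analogue of the ta1 motif: aagctt, then 5 to 25 bases, then gaattc.
# The lookahead lets the scan restart at every position, which plays the
# role of the leading [0,]n element. (Illustrative names, not bioMotif's.)
HINDIII_ECORI = re.compile(r"(?=(aagctt)([acgtn]{5,25}?)(gaattc))")

def find_sites(sequence):
    """Return 1-based inclusive (start, end) pairs for both sites of each hit."""
    hits = []
    for match in HINDIII_ECORI.finditer(sequence.lower()):
        hind = (match.start(1) + 1, match.end(1))
        eco = (match.start(3) + 1, match.end(3))
        hits.append((hind, eco))
    return hits

print(find_sites("atataagctttatccggaattctaaatgca"))
# prints [((5, 10), (17, 22))]
```

The positions agree with the 5-10 and 17-22 ranges in the bioMotif output above. The sequence is lower-cased before matching, mirroring the rule that motifs are lower case while sequences may be in either case.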
Example 2
To search for a direct repeat of 6 base pairs located 3 to 10 bases away from the original copy, one could use the following motif:

    >ta2
    [0,]n (a|t|c|g)(nnnnnn) $fragSavAs(%fr1,-5,0) [3,10]n $fBBsup(/usr/local/lib/bioMotif/data_IdentityBB,%fr1,1,.99)

With the following sequence:

    >toto2
    atatgggcccacgtagagggccctcgt

one obtains the following result:

    >ta2 is the MOTIF name
    >toto2 is the SEQUENCE name
    n 1 solutions
    m t 4-4
    f fBBsup 18-23
    >toto2 REVERSE COMPLEMENT
    n 1 solutions
    m a 4-4
    f fBBsup 18-23

Remarks
The motif element (a|t|c|g) asks bioMotif to match a or t or c or g, in other words any base, which could also be written simply as n. However, as seen in the output, matches to n are not reported. Using the explicit (a|t|c|g) element causes bioMotif to output the position of the matched base, which helps one locate the match more quickly.
Using the function library, in this case the identity function, requires the full path to the function file. On this computer (GENETICS) the bioMotif library files are in /usr/local/lib/bioMotif/.
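The direct-repeat search can also be cross-checked outside bioMotif. The sketch below is a Python approximation and makes several assumptions: a regular-expression backreference stands in for the identity comparison done by fBBsup, only exact repeats are found, and the names DIRECT_REPEAT, reverse_complement and find_direct_repeats are invented for this illustration. It is not how bioMotif itself evaluates the motif.

```python
import re

# Six bases, a spacer of 3 to 10 bases, then the same six bases again.
# The backreference \1 plays the role of the identity comparison.
DIRECT_REPEAT = re.compile(r"(?=([acgt]{6})([acgt]{3,10}?)(\1))")
COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence, in lower case."""
    return sequence.lower().translate(COMPLEMENT)[::-1]

def find_direct_repeats(sequence):
    """Return (repeat, first_range, second_range) with 1-based inclusive ranges."""
    hits = []
    for match in DIRECT_REPEAT.finditer(sequence.lower()):
        first = (match.start(1) + 1, match.end(1))
        second = (match.start(3) + 1, match.end(3))
        hits.append((match.group(1), first, second))
    return hits

seq = "atatgggcccacgtagagggccctcgt"
for label, strand in (("forward", seq),
                      ("reverse complement", reverse_complement(seq))):
    print(label, find_direct_repeats(strand))
```

On both strands the sketch finds gggccc at 5-10 repeated at 18-23, which is consistent with the fBBsup 18-23 lines in the output above and with the single base reported at position 4, immediately before the first copy.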