🔍 FASTA (Nucleotide-Nucleotide): Fast DNA/RNA Similarity Search
FASTA (Nucleotide-Nucleotide) is a widely used bioinformatics tool for rapidly searching a nucleotide (DNA or RNA) query sequence against a nucleotide sequence database. It identifies regions of local similarity between nucleic acid sequences, helping to infer functional or evolutionary relationships.
❓ What is FASTA (Nucleotide-Nucleotide)?
FASTA (often denoted FASTA-N when used for nucleotide-nucleotide searches) takes a DNA or RNA query sequence and compares it against a chosen nucleotide sequence database. It uses a heuristic algorithm to quickly find regions of high similarity and returns a list of hits with their alignments. It is a foundational tool for direct nucleic acid sequence comparison.
- Nucleotide Query vs. Nucleotide Database: Compares DNA/RNA to DNA/RNA.
- Local Similarity Search: Finds regions of highest similarity.
- Rapid Homology Detection: Efficiently identifies related nucleic acids.
🎯 Why Use FASTA (Nucleotide-Nucleotide)? For Quick DNA/RNA Homology
FASTA (Nucleotide-Nucleotide) is indispensable for:
- 🔍 Gene Identification: Finding known genes or homologous sequences in a newly sequenced genome or transcript.
- 🧬 Primer/Probe Specificity: Checking the specificity of PCR primers or hybridization probes against a genome or transcript database.
- 📊 Sequence Verification: Confirming the identity of a cloned DNA fragment.
- 🎯 Contamination Detection: Identifying contaminating sequences in your sample.
- 📈 Variant Screening: Quickly screening for known sequence variants.
🧑‍💻 How to Use FASTA (Nucleotide-Nucleotide) on Job Dispatcher: A Step-by-Step Guide
Follow these simple steps to perform a nucleotide-nucleotide FASTA search:
1️⃣ Navigate to the Tool
- From the main menu, go to All Tools (or search for "FASTA (Nucleotide vs. Nucleotide Search)").
- Click the prominent Use Tool button located next to "FASTA (Nucleotide vs. Nucleotide Search)."
2️⃣ Input Your Nucleotide Sequence
Locate the input box (large text area) or the "upload a Sequence File" option.
Paste your nucleotide (DNA or RNA) sequence(s) in FASTA format or upload a FASTA file.
>my_dna_query
ATGGCCATGGCACTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCATG
Important: You can provide a sequence either by typing into the text area OR by uploading a file, but not both simultaneously. Please clear one input to proceed.
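Before pasting or uploading, it can help to confirm that the input really is nucleotide FASTA. The following is a minimal sketch, not part of the tool: the function names `parse_fasta` and `invalid_symbols` are hypothetical, and it assumes the IUPAC nucleotide alphabet (including ambiguity codes) is acceptable input.

```python
# Minimal FASTA sanity check (illustrative helper, not part of the web form).
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")

def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def invalid_symbols(sequence):
    """Return the sorted symbols in `sequence` that are not IUPAC nucleotide codes."""
    return sorted(set(sequence.upper()) - IUPAC_NUCLEOTIDES)

example = """>my_dna_query
ATGGCCATGGCACTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCTAGCATG
"""
for name, seq in parse_fasta(example):
    print(name, len(seq), invalid_symbols(seq))
```

An empty list from `invalid_symbols` suggests the record contains only nucleotide codes; anything else (for example amino acid letters) is worth checking before submitting.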
3️⃣ Configure Parameters
📝 Title: Provide a descriptive title for your job (e.g., "My FASTA-NN Search").
💡 Program: Select the specific FASTA program to run.
- FASTA - Default
- SSEARCH (More sensitive Smith-Waterman)
- GGSEARCH (Global-Global alignment)
- GLSEARCH (Global-Local alignment)
🗄️ Databases: Select one or more nucleotide databases to search against.
- Default: em_est_env, em_gss_env, em_htc_env, em_htg_env, em_pat_env, em_std_env, em_sts_env, em_tsa_env
- (Many other options available in the Nucleotide Databases Tree on the form)
➕ MATCH/MISMATCH SCORES (match_scores): Define scores for matches and mismatches.
- Default: +5/-4
- Options: +5/-4, +3/-2, N/A (none)
➖ Gap Open (gapopen): The penalty for opening a new gap.
- Default: -10
- Options: default (10), 0, -1, ..., -19
➖ Gap Extend (gapext): The penalty for extending an existing gap.
- Default: -2
- Options: default (10), 0, -1, ..., -16
📏 KTUP (ktup): The size of the word (k-tuple) used for initial seeding. Higher values are faster but less sensitive.
- Default: 6
- Options: 6, 5, 4, 3, 2, 1, N/A (-1)
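To see why a larger ktup is faster but less sensitive, consider a simplified view of word seeding: every k-letter word of the library sequence is indexed, and each query word is looked up to find shared words lying on the same diagonal. This is a rough sketch of the idea only, not the FASTA implementation; `kmer_positions` and `diagonal_hits` are hypothetical names.

```python
from collections import Counter, defaultdict

def kmer_positions(seq, k):
    """Map each k-letter word in `seq` to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index

def diagonal_hits(query, subject, k):
    """Count shared k-letter words per diagonal (subject start - query start)."""
    index = kmer_positions(subject, k)
    diagonals = Counter()
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], ()):
            diagonals[s - q] += 1
    return diagonals

query, subject = "ACGTAC", "TTACGTACGG"
print(dict(diagonal_hits(query, subject, 3)))  # several seeds, some spurious
print(dict(diagonal_hits(query, subject, 6)))  # a single exact 6-letter seed
```

With k = 3 the short words match in several places, including off-target diagonals, so there is more to examine; with k = 6 only the exact six-letter match survives, which is quicker but would miss a region interrupted by a mismatch.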
📈 EXPECTATION UPPER LIMIT (expupperlim): Maximum E-value for reported matches. Lower values are stricter.
- Default: 10
- Options: 1e-300, 1e-100, 1e-50, 1e-10, 1e-5, 0.001, 0.1, 1, 2, 5, 10, 20, 50
📉 EXPECTATION LOWER LIMIT (explowlim): Minimum E-value for reported matches. Allows excluding very closely related hits.
- Default: 0
- Options: 0, 1e-300, 1e-100, 1e-50, 1e-10, 1e-5, 0.001, 0.1, 1, 2, 5, 10, 20, 50
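Together, the two expectation limits define a window of E-values that will be reported. The sketch below shows the effect on a handful of made-up hits; the hit names and E-values are invented for illustration, and treating both bounds as inclusive is an assumption rather than documented behaviour.

```python
def within_evalue_window(hits, lower=0.0, upper=10.0):
    """Keep (name, evalue) pairs whose E-value lies in [lower, upper].

    Inclusive bounds are an assumption made for this illustration.
    """
    return [(name, evalue) for name, evalue in hits if lower <= evalue <= upper]

# Invented example hits, ordered from most to least significant.
hits = [
    ("near_identical", 1e-120),
    ("clear_homolog", 1e-8),
    ("borderline", 3.0),
    ("likely_noise", 25.0),
]

# Defaults (0 to 10) drop only the weakest hit.
print(within_evalue_window(hits))
# Raising the lower limit to 1e-50 also hides the near-identical match.
print(within_evalue_window(hits, lower=1e-50, upper=10.0))
```

Raising the lower limit is useful when the closest matches are already known and only more distant relatives are of interest.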
↔️ STRAND (strand): For nucleotide sequences, specify the sequence strand to be used for the search.
- both - Default
- top
- bottom
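The strand choice matters because a match may lie on the opposite strand of a database entry. Assuming 'top' means the query as entered and 'bottom' means its reverse complement (the usual reading, but an assumption here), the second sequence can be derived as in this short sketch; `reverse_complement` is a hypothetical helper.

```python
# Complement table for unambiguous bases plus N; U is treated like T,
# so RNA input yields a DNA-alphabet result.
_COMPLEMENT = str.maketrans("ACGTUNacgtun", "TGCAANtgcaan")

def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

top = "ATGGCC"
bottom = reverse_complement(top)
print(top, bottom)  # ATGGCC GGCCAT
```

Searching both strands covers either orientation; restricting to one strand can be appropriate when the orientation of the query is already known, such as a transcript sequence.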
📊 HISTOGRAM (hist): Display a histogram of scores in the FASTA result.
- false (no) - Default
- true (yes)
🧹 FILTER (filter): Filter regions of low sequence complexity.
- none - Default
- dust (DUST filter)
📊 STATISTICAL ESTIMATES (stats): Method for calculating statistical significance.
- 1 (Regress) - Default
- 2 (MLE)
- 3 (Altschul-Gish)
- 4 (Regress/shuf.)
- 5 (MLE/shuf.)
🔢 SCORES (scores): Maximum number of match score summaries to report.
- Default: 50
- Options: 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 500, 750, 1000
↔️ ALIGNMENTS (alignments): Maximum number of alignments to report.
- Default: 50
- Options: 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 500, 750, 1000
📏 SEQUENCE RANGE (seqrange): Specify a range within the query sequence to search.
- Default: START-END (entire sequence)
🗄️ DATABASE RANGE (dbrange): Specify a length range for database sequences to search against.
- Default: START-END (all lengths)
🔢 MULTI HSPS (hsps): Display all significant High-scoring Segment Pairs (HSPs) between query and library sequence.
- no (false) - Default
- yes (true)
⚙️ SCORE REPORT FORMAT (scoreformat): Choose the format for the score report.
- default - Default
- -m 8 -- BLAST tabular
- -m 8C -- BLAST tabular with comments
- etc. (various tabular and ASN.1 formats)
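The tabular score formats are convenient for downstream scripts. The sketch below parses such output, assuming it follows the standard 12-column BLAST tabular layout (query, subject, % identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score) and that comment lines in the "with comments" variant begin with `#`. The sample rows are invented, and `TabularHit` and `parse_tabular` are hypothetical names.

```python
from typing import NamedTuple

class TabularHit(NamedTuple):
    query: str
    subject: str
    identity: float
    length: int
    mismatches: int
    gap_opens: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    evalue: float
    bit_score: float

def parse_tabular(text):
    """Parse 12-column BLAST-style tabular text, skipping blank and '#' lines."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"expected 12 tab-separated columns, got {len(fields)}")
        hits.append(TabularHit(
            fields[0], fields[1], float(fields[2]),
            *map(int, fields[3:10]),
            float(fields[10]), float(fields[11]),
        ))
    return hits

# Invented sample output, for illustration only.
sample = (
    "# example comment line\n"
    "my_dna_query\tEXAMPLE_HIT_1\t98.00\t50\t1\t0\t1\t50\t101\t150\t2.1e-15\t85.3\n"
    "my_dna_query\tEXAMPLE_HIT_2\t84.00\t25\t4\t0\t3\t27\t40\t64\t0.35\t28.1\n"
)
for hit in parse_tabular(sample):
    print(hit.subject, hit.identity, hit.evalue)
```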
4️⃣ Submit Your Job
- Once your sequence is entered and parameters are set, click the Submit or Run button.
- Your job will be dispatched to the EMBL-EBI Web Service. You will be automatically redirected to a Job Status page to monitor its progress.
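When running many searches, it can be useful to keep a record of the settings used for each job and to catch typos before submitting. The sketch below does this for a subset of the parameters, using the names, defaults, and options listed in step 3. It is a local bookkeeping aid only: it does not contact any service, `build_settings` is a hypothetical helper, and whether a programmatic interface accepts these names and values in exactly this form is an assumption.

```python
# Defaults and allowed values copied from the parameter list in step 3.
# The "N/A (-1)" ktup option is deliberately left out of this sketch.
_COUNTS = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 500, 750, 1000}

DEFAULTS = {
    "ktup": 6,
    "strand": "both",
    "filter": "none",
    "expupperlim": 10,
    "scores": 50,
    "alignments": 50,
}

ALLOWED = {
    "ktup": {1, 2, 3, 4, 5, 6},
    "strand": {"both", "top", "bottom"},
    "filter": {"none", "dust"},
    "expupperlim": {1e-300, 1e-100, 1e-50, 1e-10, 1e-5, 0.001, 0.1, 1, 2, 5, 10, 20, 50},
    "scores": _COUNTS,
    "alignments": _COUNTS,
}

def build_settings(**overrides):
    """Return the default settings with validated overrides applied."""
    settings = dict(DEFAULTS)
    for name, value in overrides.items():
        if name not in ALLOWED:
            raise ValueError(f"unknown parameter: {name}")
        if value not in ALLOWED[name]:
            raise ValueError(f"{value!r} is not a listed option for {name}")
        settings[name] = value
    return settings

print(build_settings(ktup=4, filter="dust", expupperlim=0.001))
```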
5️⃣ Interpret Results
- On the results page, you will find a summary of your FASTA (Nucleotide-Nucleotide) search, including a list of significant hits, their scores, and alignments.
- Pay attention to the E-value, which indicates the statistical significance of the match. Lower E-values are more significant.
- ⭐ Tip: Use the STRAND option carefully; searching both strands is common, but specific needs might require 'top' or 'bottom' only.
💬 Need Help?
If you run into issues, please visit our Contact Us page for support. Happy FASTA searching!