🔍 BLASTP: Protein-Protein BLAST

BLASTP (Basic Local Alignment Search Tool for Proteins) is a fundamental bioinformatics tool used to compare a protein query sequence against a protein sequence database. It identifies regions of local similarity between sequences, helping to infer functional and evolutionary relationships.

❓ What is BLASTP?

Given an amino acid (protein) query sequence and a chosen protein sequence database, BLASTP reports a list of hits (matching sequences) together with their alignments. It is the standard BLAST program for protein-to-protein sequence comparison.

  • Protein Query vs. Protein Database: Compares protein to protein.
  • Local Alignment: Finds the regions of highest similarity rather than aligning the sequences end to end.
  • Homology Inference: Helps deduce function and evolutionary relationships.

🎯 Why Use BLASTP? For Protein Homology & Function Prediction

BLASTP is indispensable for:

  • 🔍 Functional Annotation: Predicting the function of a novel protein by finding homologous proteins with known functions.
  • 🧬 Homology Search: Identifying evolutionarily related proteins across different species.
  • 📊 Protein Family Classification: Assigning a protein to a known family or superfamily.
  • 🎯 Target Identification: Finding similar proteins that could be drug targets or have specific binding properties.
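  • 📈 Gene Discovery: Supporting gene identification by searching predicted protein translations (e.g., ORF products) against protein databases. To search a raw nucleotide sequence against proteins, use BLASTX instead.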

🧑‍💻 How to Use BLASTP on Job Dispatcher: A Step-by-Step Guide

Follow these simple steps to perform a protein-protein BLAST search:

1️⃣ Navigate to the Tool

  1. From the main menu, go to All Tools (or search for "BLASTP").
  2. Click the prominent Use Tool button located next to "BLASTP."

2️⃣ Input Your Protein Sequence

  • Locate the input box (large text area) or the "upload a Sequence File" option.

  • Paste your protein sequence(s) in FASTA format or upload a FASTA file.

    >my_protein_query
    MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIWGEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE
    
  • Important: Provide your sequence either by pasting it into the text area or by uploading a file, not both. If both are filled in, clear one before proceeding.
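
Before pasting, it can help to check locally that the input really is protein FASTA. The following is a minimal sketch, not part of Job Dispatcher: the function names `parse_fasta` and `check_protein` are hypothetical, and the accepted residue alphabet (the 20 standard amino acids plus the ambiguity codes B, J, O, U, X, Z, the stop symbol `*` and the gap symbol `-`) is an assumption about what the form tolerates.

```python
# Hypothetical local pre-check; Job Dispatcher performs its own validation.
# Assumed alphabet: 20 standard residues plus common ambiguity/extra codes.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY" "BJOUXZ" "*-")


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data found before a '>' header line")
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_protein(records):
    """Raise ValueError if any record is empty or has unexpected characters."""
    for header, seq in records:
        if not seq:
            raise ValueError(f"record {header!r} has no sequence")
        bad = sorted(set(seq) - VALID_RESIDUES)
        if bad:
            raise ValueError(f"record {header!r} has unexpected characters: {bad}")
    return records


if __name__ == "__main__":
    sample = ">my_protein_query\nMGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYT\n"
    for name, seq in check_protein(parse_fasta(sample)):
        print(name, len(seq))
```

A check like this catches common slips, such as a missing `>` header line or digits copied along with the sequence, before the job is submitted.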

3️⃣ Configure Parameters

  • 📝 Title: Provide a descriptive title for your job (e.g., "My BLASTP Search").

  • 💡 Sequence Type: (Automatically set to Protein for BLASTP).

  • 🗄️ Databases: Select one or more protein databases to search against.

    • Default: uniprotkb_swissprot
    • (Many other options available in the Protein Databases Tree on the form)
  • 📝 INCL. TAXONOMY IDs (taxids): Restrict the search to sequences from these organisms. Enter NCBI taxonomy IDs separated by commas (e.g., 9606, 10090, 7227).

  • 📝 EXCL. TAXONOMY IDs (negative_taxids): Exclude sequences from these organisms from the search. Enter NCBI taxonomy IDs separated by commas (e.g., 9606, 10090, 7227).

  • 📊 Matrix (matrix): Select the scoring matrix to use for protein alignments.

    • BLOSUM62 - Default
    • BLOSUM45, BLOSUM50, BLOSUM80, BLOSUM90
    • PAM30, PAM70, PAM250
    • NONE (M10)
  • ➖ Gap Open (gapopen): The penalty for opening a new gap.

    • Default: 11
    • Options: -1, 0, 1, ..., 25
  • ➖ Gap Extend (gapext): The penalty for extending an existing gap.

    • Default: 1
    • Options: -1, 0, 1, ..., 10
  • 📉 EXP.THR (exp): The Expectation Value (E-value) threshold. Matches with E-values higher than this will not be reported. Lower values are stricter.

    • Default: 10
    • Options: 1e-200, 1e-100, 1e-50, 1e-10, 1e-5, 1e-4, 0.001, 0.01, 0.1, 1.0, 10, 20, 50, 100, 1000
  • 🧹 FILTER (filter): Apply a low-complexity filter (e.g., SEG filter for protein).

    • no (F) - Default
    • yes (T)
  • 🗑️ DROPOFF (dropoff): Dropoff value for the Gapped BLAST algorithm.

    • Default: 0
    • Options: 0, 2, 4, 6, 8, 10
  • 🔢 SCORES (scores): Maximum number of match score summaries to report in the results.

    • Default: 50
    • Options: 0, 5, 10, 20, 50, 100, 150, 200, 250, 500, 750, 1000
  • ↔️ ALIGNMENTS (alignments): Maximum number of alignments to report.

    • Default: 50
    • Options: 0, 5, 10, 20, 50, 100, 150, 200, 250, 500, 750, 1000
  • 📏 SEQUENCE RANGE (seqrange): Define a specific range within the query sequence to search.

    • Default: START-END (entire sequence)
  • 🔢 HSPS (hsps): Maximum number of High-scoring Segment Pairs (HSPs) to report.

    • Default: 100
    • Input type: Number
  • ↔️ GAPALIGN (gapalign): Perform gapped alignments.

    • true - Default
    • false
  • 👁️ ALIGN VIEWS (align): Choose the format for displaying alignments.

    • 0 (pairwise) - Default
    • 1 (Query-anchored identities)
    • 2 (Query-anchored non-identities)
    • 3 (Flat query-anchored identities)
    • 4 (Flat query-anchored non-identities)
    • 5 (BLASTXML)
    • 6 (Tabular)
    • 7 (Tabular with comment lines)
    • 8 (Text ASN.1)
    • 9 (Binary ASN.1)
    • 10 (Comma-separated values)
    • 11 (BLAST archive format (ASN.1))
    • 12 (Tabular with comment lines with btop)
  • 📊 COMPOSITION-BASED (compstats): Use composition-based statistics.

    • F - Default
    • D, 1, 2, 3
  • 📏 WORD SIZE (wordsize): The length of the initial word match (seed) required to initiate an alignment. For proteins, a seed can be a similar, high-scoring word and need not be an exact match.

    • Default: 6
    • Input type: Number
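
The parameter list above can also be captured as a small, checked settings dictionary, which is handy for recording exactly how a search was run. This is a sketch under stated assumptions: the keys, defaults and option values are copied from the list above, while the names `DEFAULTS`, `ALLOWED`, `format_taxids` and `build_params` are hypothetical and do not correspond to any Job Dispatcher API.

```python
# Hypothetical settings helper; keys and values mirror the form fields above.
DEFAULTS = {
    "matrix": "BLOSUM62", "gapopen": 11, "gapext": 1, "exp": "10",
    "filter": "F", "dropoff": 0, "scores": 50, "alignments": 50,
    "hsps": 100, "gapalign": "true", "align": 0, "compstats": "F",
    "wordsize": 6,
}

COUNTS = {0, 5, 10, 20, 50, 100, 150, 200, 250, 500, 750, 1000}

# Only fields with an explicit option list are checked; free numeric
# fields (hsps, wordsize) are accepted as given.
ALLOWED = {
    "matrix": {"BLOSUM45", "BLOSUM50", "BLOSUM62", "BLOSUM80", "BLOSUM90",
               "PAM30", "PAM70", "PAM250"},
    "gapopen": set(range(-1, 26)),   # -1 .. 25
    "gapext": set(range(-1, 11)),    # -1 .. 10
    "exp": {"1e-200", "1e-100", "1e-50", "1e-10", "1e-5", "1e-4", "0.001",
            "0.01", "0.1", "1.0", "10", "20", "50", "100", "1000"},
    "filter": {"F", "T"},
    "scores": COUNTS,
    "alignments": COUNTS,
    "gapalign": {"true", "false"},
    "align": set(range(13)),         # 0 .. 12
    "compstats": {"F", "D", "1", "2", "3"},
}


def format_taxids(taxids):
    """Join taxonomy IDs into the comma-separated form the fields expect."""
    ids = [int(t) for t in taxids]
    if any(t <= 0 for t in ids):
        raise ValueError("taxonomy IDs must be positive integers")
    return ",".join(str(t) for t in ids)


def build_params(taxids=None, negative_taxids=None, **overrides):
    """Return the defaults with overrides applied, rejecting unlisted values."""
    params = dict(DEFAULTS)
    for name, value in overrides.items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown parameter: {name}")
        if name in ALLOWED and value not in ALLOWED[name]:
            raise ValueError(f"{value!r} is not a listed option for {name}")
        params[name] = value
    if taxids:
        params["taxids"] = format_taxids(taxids)
    if negative_taxids:
        params["negative_taxids"] = format_taxids(negative_taxids)
    return params


if __name__ == "__main__":
    # A stricter search restricted to human and mouse sequences.
    print(build_params(exp="1e-5", matrix="BLOSUM80", taxids=[9606, 10090]))
```

Keeping E-value thresholds as strings preserves the exact spelling shown on the form (for example `1e-5`), which avoids floating-point formatting surprises when the settings are written to a log.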

4️⃣ Submit Your Job

  • Once your sequence is entered and parameters are set, click the Submit or Run button.
  • Your job will be dispatched to the EMBL-EBI Web Service. You will be automatically redirected to a Job Status page to monitor its progress.
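
Monitoring a job amounts to checking its status repeatedly until it reaches a final state. The sketch below illustrates that loop for anyone scripting around a job service; it is not how the Job Status page is implemented. The function `wait_for_job`, the caller-supplied `get_status` callable and the status names in `TERMINAL_STATES` are all assumptions made for illustration.

```python
import time

# Assumed status names; check what your job service actually reports.
TERMINAL_STATES = {"FINISHED", "ERROR", "FAILURE", "NOT_FOUND"}


def wait_for_job(get_status, job_id, interval=5.0, max_checks=120,
                 sleep=time.sleep):
    """Poll get_status(job_id) until a terminal state is returned.

    get_status is a caller-supplied function (hypothetical here) that
    returns the current status string for a job ID.
    """
    for _ in range(max_checks):
        status = get_status(job_id)
        if status in TERMINAL_STATES:
            return status
        sleep(interval)  # wait before asking again
    raise TimeoutError(f"job {job_id} did not finish after {max_checks} checks")


if __name__ == "__main__":
    # Stand-in status source so the example runs without a network call.
    fake_states = iter(["QUEUED", "RUNNING", "RUNNING", "FINISHED"])
    result = wait_for_job(lambda job: next(fake_states), "example-job",
                          interval=0.0)
    print(result)
```

Passing the status function and the sleep function in as arguments keeps the loop testable without contacting any server.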

5️⃣ Interpret Results

  • On the results page, you will find a summary of your BLASTP search, including a graphical overview of matches, a table of significant alignments, and detailed pairwise alignments.
  • Pay attention to the E-value (Expectation Value): the number of hits with an equal or better score that you would expect to find by chance in a database of the same size. The lower the E-value, the more significant the match.
  • ⭐ Tip: Explore the detailed alignments to understand the exact matching regions and any gaps or mismatches.
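
If you chose a tabular alignment view, the hits can be filtered by E-value with a few lines of code. The following is a sketch that assumes the output uses the standard 12 tab-separated BLAST tabular columns named in `COLUMNS`; confirm this against your own result file. The function name `significant_hits` and the two sample rows are invented for illustration.

```python
import csv
import io

# Assumed column order: the standard 12-column BLAST tabular layout.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def significant_hits(tabular_text, max_evalue=1e-5):
    """Return hits with E-value <= max_evalue, most significant first."""
    hits = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if len(row) < len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(row)}")
        hit = dict(zip(COLUMNS, row))
        for field in ("pident", "evalue", "bitscore"):
            hit[field] = float(hit[field])
        if hit["evalue"] <= max_evalue:
            hits.append(hit)
    hits.sort(key=lambda h: h["evalue"])
    return hits


if __name__ == "__main__":
    # Invented example rows, not real search results.
    sample = (
        "# comment line\n"
        "my_protein_query\tsubject_A\t98.0\t104\t2\t0\t1\t104\t1\t104\t2e-70\t210\n"
        "my_protein_query\tsubject_B\t30.0\t50\t30\t2\t5\t54\t10\t59\t3.1\t28.5\n"
    )
    for hit in significant_hits(sample):
        print(hit["sseqid"], hit["evalue"], hit["bitscore"])
```

The `max_evalue` cutoff of 1e-5 is only an example; a suitable threshold depends on the database size and on how distant the homologs of interest are.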

💬 Need Help?

If you run into issues, please visit our Contact Us page for support. Happy BLASTing!