🧬 InterProScan

InterProScan is a powerful bioinformatics tool that helps you understand the function of your protein or nucleic acid sequences by identifying domains, families, and functional sites. It integrates various protein signature recognition methods from the InterPro consortium databases.

❓ What is InterProScan?

InterProScan integrates diverse protein signature databases (like Pfam, SMART, Gene3D, PRINTS, etc.) into a single query. When you submit a sequence, InterProScan runs it against these different methods, providing a comprehensive annotation of known domains, families, and functional sites.

  • Integrated Analysis: Combines multiple prediction methods for a holistic view.

  • Comprehensive Annotation: Identifies protein features, helping infer function.

  • Versatile Input: Can analyze both protein and nucleic acid sequences; nucleic acid input is translated into protein before it is scanned.

🎯 Why Use InterProScan? Unlock Deeper Biological Understanding

InterProScan is indispensable for:

  • 🔍 Protein Functional Prediction: Infer the likely function of novel or uncharacterized proteins by identifying known domains and motifs.

  • 🧬 Domain Architecture Mapping: Understand the modular organization of proteins, revealing how different functional units combine.

  • 📊 Family Classification: Assign proteins to known families and superfamilies, which places them in an evolutionary context.

  • 🎯 Target Identification: Pinpoint specific functional sites crucial for drug discovery or experimental design.

  • 📈 Data Validation: Cross-reference predictions from multiple databases to increase confidence in your annotations.

🧑‍💻 How to Use InterProScan on Job Dispatcher: A Step-by-Step Guide

Follow these simple steps to analyze your sequence with InterProScan:

1️⃣ Navigate to the Tool

  1. From the main menu, go to All Tools (or search for "InterProScan").

  2. Click the prominent Use Tool button located next to "InterProScan."

2️⃣ Input Your Sequence

  • Locate the input box (large text area) or the "upload a Sequence File" option.

  • Paste your sequence in FASTA format or upload a FASTA file.

    >my_protein_sequence
    MAMALASLASLASLASLASLAMALASLASLASLASLASLAMALASLASLASLASLASLAM
    
  • Important: You can provide a sequence either by typing into the text area OR by uploading a file, but not both simultaneously. Please clear one input to proceed.
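Before pasting or uploading, it can help to confirm that your input really is well-formed FASTA. The following is a minimal sketch, not part of Job Dispatcher itself: the function names `parse_fasta` and `invalid_residues` are hypothetical, and the set of accepted protein letters is an assumption (the 20 standard amino acids plus the common ambiguity codes B, Z, X, U, and O).

```python
# Assumed protein alphabet: 20 standard residues plus B, Z, X, U, O.
PROTEIN_LETTERS = set("ACDEFGHIKLMNPQRSTVWYBZXUO")


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before a '>' header line")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def invalid_residues(sequence, alphabet=PROTEIN_LETTERS):
    """Return the set of characters in `sequence` that are not in `alphabet`."""
    return set(sequence) - alphabet


example = """>my_protein_sequence
MAMALASLASLASLASLASLAMALASLASLASLASLASLAMALASLASLASLASLASLAM
"""

for name, seq in parse_fasta(example):
    bad = invalid_residues(seq)
    print(name, len(seq), "residues,", "OK" if not bad else f"unexpected: {sorted(bad)}")
```

A check like this only catches formatting problems; the service applies its own validation when the job is submitted.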

3️⃣ Configure Parameters

  • 📝 Title: Provide a descriptive title for your job (e.g., "My Protein Analysis").

  • 💡 Sequence Type: Select the type of sequence you are submitting:

    • Protein (p)

    • Nucleic Acid (n)

  • ✅ Applications: This is crucial. Select one or more applications (databases/methods) you want InterProScan to run your sequence against. Common choices include:

    • NCBIfam: (HMM-based protein families; includes the former TIGRFAMs collection)

    • SFLD: (Structure-Function Linkage Database; HMM-based classification of enzyme families)

    • Phobius: (Combined transmembrane topology and signal peptide predictor)

    • SignalP: (Predicts signal peptides and their cleavage sites)

    • SUPERFAMILY: (Structural and functional annotations)

    • PANTHER: (Protein ANalysis THrough Evolutionary Relationships)

    • Pfam: (Large collection of protein families)

    • (...and many more options available on the form)

    • You can use "Select All" or "Unselect All" buttons for convenience.
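If you prepare many jobs, it may be useful to record the chosen settings in a small structure and check them before filling in the form. This is only an illustrative sketch: the `JobSettings` class and its field names are hypothetical, and the application names are the labels listed above, not necessarily the identifiers the service uses internally. The `p` and `n` codes are the ones shown next to the Sequence Type options.

```python
from dataclasses import dataclass, field

# Sequence type codes as shown on the form.
SEQUENCE_TYPES = {"p": "Protein", "n": "Nucleic Acid"}


@dataclass
class JobSettings:
    """Hypothetical container for the form fields described above."""
    title: str
    sequence_type: str = "p"
    applications: list = field(default_factory=list)

    def validate(self):
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        if self.sequence_type not in SEQUENCE_TYPES:
            problems.append(
                f"sequence type must be one of {sorted(SEQUENCE_TYPES)}, "
                f"got {self.sequence_type!r}"
            )
        if not self.applications:
            problems.append("select at least one application")
        return problems


settings = JobSettings(
    title="My Protein Analysis",
    sequence_type="p",
    applications=["Pfam", "PANTHER", "SUPERFAMILY"],
)
print(settings.validate() or "settings look complete")
```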

4️⃣ Submit Your Job

  • Once your sequence is entered and parameters are set, click the Submit or Run button.

  • Your job will be dispatched to the EMBL-EBI Web Service. You will be automatically redirected to a Job Status page to monitor its progress.
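The Job Status page does the monitoring for you. If you script the same wait yourself, the logic is a simple polling loop. The sketch below is an assumption-laden illustration, not the service's client: `wait_for_job` is a hypothetical helper, the status strings (`QUEUED`, `RUNNING`, `FINISHED`, `ERROR`, `FAILURE`, `NOT_FOUND`) are assumed values, and `fetch_status` stands for whatever function you supply to look up the status of a job.

```python
import time

# Assumed status values that mean the job will not change any further.
TERMINAL_STATUSES = {"FINISHED", "ERROR", "FAILURE", "NOT_FOUND"}


def wait_for_job(job_id, fetch_status, poll_seconds=5.0, max_polls=120,
                 sleep=time.sleep):
    """Poll `fetch_status(job_id)` until a terminal status is returned.

    `fetch_status` is supplied by the caller; `sleep` can be replaced
    in tests so that no real waiting takes place.
    """
    for _ in range(max_polls):
        status = fetch_status(job_id)
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


# Usage with a stand-in status source (no network access involved).
fake_statuses = iter(["QUEUED", "RUNNING", "FINISHED"])
final = wait_for_job("example-job", lambda job_id: next(fake_statuses),
                     sleep=lambda seconds: None)
print(final)
```

Keeping the status lookup as a parameter means the loop can be tested without contacting any server, and a fixed poll limit prevents a stuck job from blocking a script indefinitely.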

5️⃣ Interpret Results

  • On the results page, you will find a summary of the domains, families, and sites identified in your sequence by the selected applications.

  • Results often include graphical representations, detailed annotations, and links to the source databases for more information.

  • ⭐ Tip: Pay attention to features that several methods predict independently; agreement between applications is generally more reliable than a hit from a single method.
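One way to look for agreement between methods is to group hits by the InterPro entry they map to and count how many different applications support each entry. The sketch below assumes you have the results as tab-separated text with the application name in column 4 and the InterPro accession in column 12 (with `-` meaning no entry); check the column layout of your own download before relying on it. The function name `consensus_entries` and the rows in the usage example are hypothetical placeholders, not real results.

```python
import csv
import io
from collections import defaultdict


def consensus_entries(tsv_text, min_methods=2):
    """Map each InterPro accession to the applications that support it.

    Only accessions supported by at least `min_methods` different
    applications are returned. Assumes the application name is in
    column 4 and the InterPro accession in column 12 (1-based).
    """
    support = defaultdict(set)
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 12:
            continue  # hit without an InterPro column
        application, interpro_acc = row[3], row[11]
        if interpro_acc and interpro_acc != "-":
            support[interpro_acc].add(application)
    return {acc: sorted(apps) for acc, apps in support.items()
            if len(apps) >= min_methods}


def make_row(application, signature, interpro_acc):
    """Build a placeholder row; every value here is made up."""
    return "\t".join(["seq1", "md5", "100", application, signature, "desc",
                      "1", "50", "0.0", "T", "date", interpro_acc, "entry"])


sample = "\n".join([
    make_row("Pfam", "SIG_A", "ENTRY_1"),
    make_row("SMART", "SIG_B", "ENTRY_1"),
    make_row("PRINTS", "SIG_C", "ENTRY_2"),
    make_row("Phobius", "SIG_D", "-"),
])
print(consensus_entries(sample))
```

Raising `min_methods` makes the filter stricter. Hits that fall below the threshold are not necessarily wrong; they are only supported by fewer independent lines of evidence.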

🧪 Example Usage

Input Sequence (FASTA - Protein Example):