🔬 Pfamscan: Identifying Protein Families & Domains

Pfamscan identifies known protein families, domains, and motifs within your protein sequences. It scans your input sequence(s) against the Pfam database, a large collection of protein families represented by multiple sequence alignments and hidden Markov models (HMMs).

❓ What is Pfamscan?

Pfamscan is the sequence search tool for the Pfam database. It takes a protein sequence as input and compares it to every protein family and domain defined in Pfam. Because it uses profile HMMs (hidden Markov models), it can detect distant evolutionary relationships, which gives clues to a protein's likely function and evolutionary history.

  • Database Integration: Scans against the extensive Pfam-A database.

  • Domain Identification: Pinpoints known protein domains and motifs.

  • Evolutionary Insights: Helps classify proteins into families based on shared ancestry.

🎯 Why Use Pfamscan? Uncover Protein Function & Evolution

Pfamscan is indispensable for:

  • 🔍 Functional Annotation: Quickly assign putative functions to novel or uncharacterized proteins.

  • 🧬 Domain Mapping: Identify the modular architecture of a protein, revealing its constituent functional units.

  • 📊 Family Classification: Classify proteins into established families and superfamilies.

  • 📈 Comparative Analysis: Understand evolutionary relationships by identifying conserved domains across different species.

  • 🎯 Experimental Design: Guide experimental work by highlighting regions of interest for structural or functional studies.

🧑‍💻 How to Use Pfamscan on Job Dispatcher: A Step-by-Step Guide

Follow these simple steps to scan your protein sequence(s) with Pfamscan:

1️⃣ Navigate to the Tool

  1. From the main menu, go to All Tools (or search for "Pfamscan").

  2. Click the prominent Use Tool button located next to "Pfamscan."

2️⃣ Input Your Sequence

  • Locate the input box (large text area) or the "upload a Sequence File" option.

  • Paste your protein sequence(s) in FASTA format or upload a FASTA file.

    >my_protein_sequence
    MAMALASLASLASLASLASLAMALASLASLASLASLASLAMALASLASLASLASLASLAM
    
  • Important: Provide a sequence either by pasting it into the text area or by uploading a file, not both. If both are filled in, clear one before submitting.
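Before pasting or uploading, it can help to check locally that the input really is well-formed FASTA. The following is a minimal sketch, not part of Pfamscan itself: the function name parse_fasta and the accepted character set are assumptions made for illustration, and the service may apply different rules.

```python
import string

# Assumption: IUPAC amino-acid codes, including ambiguity and rare codes,
# cover every letter A-Z; "*" is accepted here as a stop symbol.
VALID_RESIDUES = set(string.ascii_uppercase) | {"*"}


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        else:
            if header is None:
                raise ValueError("sequence data found before a '>' header line")
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))

    # Reject empty records and characters that are not residue codes.
    for name, seq in records:
        if not seq:
            raise ValueError(f"record {name!r} has no sequence")
        bad = set(seq) - VALID_RESIDUES
        if bad:
            raise ValueError(f"record {name!r} has unexpected characters: {sorted(bad)}")
    return records


example = """>my_protein_sequence
MAMALASLASLASLASLASLAMALASLASLASLASLASLAMALASLASLASLASLASLAM
"""
records = parse_fasta(example)
print(records[0][0], len(records[0][1]))
```

The check catches the most common paste errors (a missing ">" header line, stray digits, or an empty record) before a job is submitted.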

3️⃣ Configure Parameters

  • 📝 Title: Provide a descriptive title for your job (e.g., "My Pfamscan Analysis").

  • 💡 Sequence Type: Set automatically to Protein, because Pfamscan is a protein-specific tool.

  • 📉 Expectation Value Threshold (evalue): Set the E-value cutoff for reporting matches. Matches with an E-value above the cutoff are not reported, so a lower cutoff is stricter.

    • 1e-300, 1e-100, 1e-50, 1e-10, 1e-5, 0.0001, 0.001, 0.1, 1, 2, 5, 10, 20, 50 - (Default: 10)

  • 🎯 Active Site Prediction (asp): Choose whether to perform active site prediction.

    • yes

    • no - (Default)

  • ⚙️ Output Format (format): Select the format for your results.

    • JSON - (Default)

    • Plain Text
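The parameter choices above can be captured in a small validation helper, which is handy when the same settings are reused across many jobs. This is a sketch under stated assumptions: build_params and is_stricter are hypothetical helpers, and the key "title" is an assumed name. The identifiers evalue, asp, and format and all option values are taken from the list above.

```python
# Option values copied from the parameter list in this guide.
ALLOWED_EVALUES = ("1e-300", "1e-100", "1e-50", "1e-10", "1e-5", "0.0001",
                   "0.001", "0.1", "1", "2", "5", "10", "20", "50")
ALLOWED_ASP = ("yes", "no")
ALLOWED_FORMATS = ("JSON", "Plain Text")


def build_params(title, evalue="10", asp="no", output_format="JSON"):
    """Validate the chosen options and return them as a dict.

    Defaults mirror the defaults listed in this guide.
    """
    if evalue not in ALLOWED_EVALUES:
        raise ValueError(f"evalue must be one of {ALLOWED_EVALUES}")
    if asp not in ALLOWED_ASP:
        raise ValueError(f"asp must be one of {ALLOWED_ASP}")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}")
    return {"title": title, "evalue": evalue, "asp": asp, "format": output_format}


def is_stricter(evalue_a, evalue_b):
    """Return True if cutoff a is stricter (numerically lower) than cutoff b."""
    return float(evalue_a) < float(evalue_b)


params = build_params("My Pfamscan Analysis", evalue="1e-5")
print(params)
print(is_stricter("1e-5", "10"))
```

Keeping the E-value options as strings preserves the exact values offered in the form; they are converted to numbers only when two cutoffs are compared.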

4️⃣ Submit Your Job

  • Once your sequence is entered and parameters are set, click the Submit or Run button.

  • Your job will be dispatched to the EMBL-EBI Web Service. You will be automatically redirected to a Job Status page to monitor its progress.
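If you script your submissions instead of watching the Job Status page, the monitoring step amounts to a polling loop. The sketch below is an illustration only: wait_for_job is a hypothetical helper, the status strings "QUEUED", "RUNNING", and "FINISHED" are assumptions about what a status check returns, and the status lookup is passed in as a function so no particular web service API is implied.

```python
import time


def wait_for_job(get_status, job_id, poll_seconds=5.0, max_polls=120,
                 sleep=time.sleep):
    """Call get_status(job_id) repeatedly until the job stops waiting.

    get_status is any callable returning a status string for a job ID.
    Returns the first status that is not a waiting state.
    """
    waiting_states = {"QUEUED", "RUNNING"}  # assumed status names
    for _ in range(max_polls):
        status = get_status(job_id)
        if status not in waiting_states:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} still waiting after {max_polls} polls")


# Demonstration with a fake status source instead of a real service.
fake_statuses = iter(["QUEUED", "RUNNING", "RUNNING", "FINISHED"])
final = wait_for_job(lambda job_id: next(fake_statuses), "example-job-id",
                     sleep=lambda seconds: None)
print(final)
```

Passing the status lookup and the sleep function as arguments keeps the loop easy to test and lets you plug in whichever client you use to query job status.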

5️⃣ Interpret Results

  • On the results page, you will find a list of identified Pfam domains, their locations on your sequence, E-values, and links to the Pfam database for detailed information on each domain.

  • If you selected "JSON" output, the results are returned in a machine-readable format that scripts can parse directly.

  • ⭐ Tip: Focus on matches with low E-values, as these indicate more significant and reliable domain predictions.
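With JSON output, the tip about low E-values can be applied programmatically. The following sketch assumes the results are a JSON list of match records with the fields "name", "acc", "evalue", and "seq" (holding "from" and "to" positions). These field names are assumptions about the output layout, so check them against your own results. The sample records are made-up placeholders, not real Pfam entries.

```python
import json


def significant_hits(json_text, max_evalue=1e-5):
    """Return matches with E-value <= max_evalue, most significant first."""
    hits = json.loads(json_text)
    kept = [hit for hit in hits if float(hit["evalue"]) <= max_evalue]
    return sorted(kept, key=lambda hit: float(hit["evalue"]))


# Placeholder records for illustration only; names and accessions are invented.
sample = json.dumps([
    {"name": "placeholder_domain_2", "acc": "PF_PLACEHOLDER_2",
     "evalue": "0.5", "seq": {"from": "90", "to": "140"}},
    {"name": "placeholder_domain_1", "acc": "PF_PLACEHOLDER_1",
     "evalue": "3e-20", "seq": {"from": "5", "to": "80"}},
])

for hit in significant_hits(sample):
    print(hit["name"], hit["acc"], hit["seq"]["from"], hit["seq"]["to"], hit["evalue"])
```

The cutoff of 1e-5 used here is an arbitrary illustration; choose a threshold that suits your analysis.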

🧪 Example Usage

Input Sequence (FASTA - Protein Example):

    >sample_protein_sequence
    MAMALASLASLASLASLASLAMALASLASLASLASLASLAMALASLASLASLASLASLAM