About
FlatProt is a Python package for protein structure and sequence analysis. It produces standardized 2D visualizations that make proteins easier to compare, combining efficient processing with user-friendly output.
Standard Workflow on the command line
FlatProt provides four main commands for protein structure analysis: project, overlay, align, and split.
Note
The following examples assume you've installed FlatProt with uv tool install FlatProt. If you're using the no-install option, replace flatprot with uvx flatprot in all commands.
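To avoid editing every command by hand, the choice between the two invocations can be wrapped in a small shell function. The following is a minimal sketch, not part of FlatProt: the names flatprot_cmd and fp are hypothetical helpers introduced here for illustration.

```shell
# Print the command prefix to use for FlatProt:
# the installed `flatprot` if it can be found, otherwise `uvx flatprot`.
flatprot_cmd() {
    if command -v flatprot >/dev/null 2>&1; then
        echo "flatprot"
    else
        echo "uvx flatprot"
    fi
}

# Hypothetical wrapper: forwards all arguments to whichever variant is available.
fp() {
    # Left unquoted on purpose so that "uvx flatprot" splits into two words.
    $(flatprot_cmd) "$@"
}

# Usage (not executed here):
#   fp project structure.cif -o projection.svg
```

With this in your shell profile, fp can stand in for flatprot in the examples that follow.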
Single Structure Projection
Generate 2D projections of individual protein structures in three simple steps:
- Obtain the protein structure file (CIF or PDB format)
  - From databases like PDB, AlphaFold, or your own modeling tools
  - CIF format is recommended for better handling of complex structures
- Calculate secondary structure using DSSP
  - Run:
    mkdssp -i structure.cif -o structure.cif
  - This identifies α-helices, β-sheets, and other structural elements
- Generate the 2D projection with FlatProt
  - Basic usage:
    flatprot project structure.cif -o projection.svg
  - With a PDB file and a separate DSSP file:
    flatprot project structure.pdb --dssp structure.dssp -o projection.svg
  - Customize sizing with options such as --canvas-width 500 --canvas-height 400
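The DSSP and projection steps above can be combined into one function for repeated use. This is a sketch under stated assumptions: the function name project_structure and the output directory layout are hypothetical, and unlike the command shown above it writes the DSSP-annotated CIF to a separate file so the input is not overwritten. Only the mkdssp and flatprot flags that appear in this document are used.

```shell
# Annotate one CIF file with DSSP, then project it to SVG.
# Arguments: input CIF path, optional output directory (default: projections).
project_structure() {
    local cif="$1" outdir="${2:-projections}"
    local name
    name="$(basename "$cif" .cif)"

    if [ ! -f "$cif" ]; then
        echo "error: $cif not found" >&2
        return 1
    fi
    mkdir -p "$outdir" || return 1

    # Secondary structure assignment; the annotated copy goes to the
    # output directory (an assumption of this sketch, to keep the input intact).
    mkdssp -i "$cif" -o "$outdir/$name.cif" || return 1

    # 2D projection of the annotated structure.
    flatprot project "$outdir/$name.cif" -o "$outdir/$name.svg"
}

# Usage (not executed here):
#   for f in structures/*.cif; do project_structure "$f" projections; done
```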
Multi-Structure Comparison
Create overlay visualizations to compare multiple related structures:
- Collect related structure files (same family or similar proteins)
- Generate family-aligned overlay:
flatprot overlay "structures/*.cif" -o comparison.png
- For consistent size comparison:
flatprot overlay "structures/*.cif" --disable-scaling -o overlay.png
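Because the glob pattern is passed to FlatProt in quotes, a typo in the pattern is only noticed once the command runs. A quick pre-check can catch that earlier. The sketch below is illustrative: count_matches and overlay_if_enough are hypothetical helper names, and the minimum of two structures is a choice made for this example, not a documented FlatProt requirement.

```shell
# Count existing files matching a glob pattern.
# The pattern is passed quoted and expanded inside the function,
# so paths containing spaces are not supported by this sketch.
count_matches() {
    local pattern="$1" n=0 f
    for f in $pattern; do
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Run the overlay only if at least two structures match the pattern.
overlay_if_enough() {
    local pattern="$1" out="$2"
    if [ "$(count_matches "$pattern")" -lt 2 ]; then
        echo "error: need at least two structures matching $pattern" >&2
        return 1
    fi
    flatprot overlay "$pattern" -o "$out"
}

# Usage (not executed here):
#   overlay_if_enough "structures/*.cif" comparison.png
```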
Structure Alignment
Align structures to known protein families:
- Find best matching family:
flatprot align structure.cif -i alignment_info.json
- Use alignment in projections:
flatprot project structure.cif --matrix alignment_matrix.npy -o aligned.svg
Region-Based Analysis
Extract and visualize specific structural regions with comparative alignment:
- Define regions of interest:
flatprot split protein.cif --regions "A:1-100,A:150-250" -o regions.svg
- With database alignment:
flatprot split protein.cif --regions "A:1-100,A:150-250" --show-database-alignment -o aligned_regions.svg
- Progressive gap positioning:
flatprot split protein.cif --regions "A:1-100,A:150-250" --gap-x 150 --gap-y 100 -o positioned.svg
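The --regions value in these commands follows a CHAIN:START-END pattern, with entries joined by commas. When region boundaries come from another tool, building the string programmatically avoids formatting slips. The helper below is a hypothetical sketch (make_regions is not part of FlatProt); it only checks the START-END shape and does not verify that START is smaller than END.

```shell
# Build a --regions value such as "A:1-100,A:150-250" from a chain ID
# followed by one or more START-END ranges.
make_regions() {
    local chain="$1" range out=""
    shift
    for range in "$@"; do
        case "$range" in
            "" | *[!0-9-]* | -* | *- | *-*-*)
                echo "error: bad range '$range' (expected START-END)" >&2
                return 1 ;;
            *-*) ;;  # digits, exactly one dash, digits: accepted
            *)
                echo "error: bad range '$range' (expected START-END)" >&2
                return 1 ;;
        esac
        # Prefix a comma only when something is already in the list.
        out="${out:+$out,}$chain:$range"
    done
    echo "$out"
}

# Usage (not executed here):
#   flatprot split protein.cif --regions "$(make_regions A 1-100 150-250)" -o regions.svg
```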
Common Command Chaining Workflows
Family Comparison with Optimal Conservation
For optimal conservation across multiple structures, align all to the same family reference:
# 1. Find optimal alignment for family
flatprot align reference_protein.cif -i family_info.json
# 2. Extract family ID from results
family_id=$(jq -r '.best_hit.target_id' family_info.json)
# 3. Create conserved overlay using fixed family ID
flatprot overlay "family_proteins/*.cif" --family "$family_id" -o family_overlay.png
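One weak point in the chain above is step 2: jq -r prints the literal string null when the requested key is absent, and that string would then be passed on as a family ID. A guarded variant might look like the sketch below. The function names are hypothetical; the JSON path and the --family flag are taken from the commands above.

```shell
# Succeed only if the extracted value looks like a usable ID.
is_valid_family_id() {
    [ -n "$1" ] && [ "$1" != "null" ]
}

# Read the family ID from an alignment info file and run the overlay,
# stopping early if no ID could be extracted.
overlay_with_family() {
    local info="$1" pattern="$2" out="$3" family_id
    family_id="$(jq -r '.best_hit.target_id' "$info")" || return 1
    if ! is_valid_family_id "$family_id"; then
        echo "error: no family ID found in $info" >&2
        return 1
    fi
    flatprot overlay "$pattern" --family "$family_id" -o "$out"
}

# Usage (not executed here):
#   flatprot align reference_protein.cif -i family_info.json
#   overlay_with_family family_info.json "family_proteins/*.cif" family_overlay.png
```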
Aligned Structure Visualization
Create family-aligned 2D projections of individual structures:
# 1. Find optimal family alignment
flatprot align protein.cif -m alignment_matrix.npy
# 2. Create aligned 2D projection
flatprot project protein.cif --matrix alignment_matrix.npy -o aligned_protein.svg
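When this two-step workflow is applied to several proteins, reusing a single alignment_matrix.npy file would let each run overwrite the previous matrix. One way around that is to name the matrix after the structure. The function below is a sketch with a hypothetical name and directory layout; it uses only the -m, --matrix, and -o flags shown above.

```shell
# Align one structure and project it with its own matrix file.
# Arguments: input CIF path, optional output directory (default: aligned).
align_and_project() {
    local cif="$1" outdir="${2:-aligned}"
    local name
    name="$(basename "$cif" .cif)"
    mkdir -p "$outdir" || return 1

    # One matrix per structure so batch runs do not overwrite each other.
    flatprot align "$cif" -m "$outdir/$name.npy" || return 1
    flatprot project "$cif" --matrix "$outdir/$name.npy" -o "$outdir/$name.svg"
}

# Usage (not executed here):
#   for f in proteins/*.cif; do
#       align_and_project "$f" aligned || echo "skipped $f" >&2
#   done
```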
Comprehensive Domain Analysis
Combine full structure and domain-specific views:
# 1. Create full structure view
flatprot project protein.cif -o full_structure.svg
# 2. Extract and align specific domains
flatprot split protein.cif --regions "A:1-100,A:150-250" --show-database-alignment -o domains.svg
The resulting files contain clean, publication-ready 2D representations that can be viewed in any web browser or vector graphics editor.
Key Features
- Single Structure Visualization: Standardized 2D projections of individual protein structures
- Multi-Structure Overlays: Compare multiple structures with automatic clustering and alignment
- Region-Based Analysis: Extract and visualize specific domains, motifs, or binding sites with comparative alignment
- Family-Based Alignment: Integration with Foldseek alignments for protein family mapping with rotation-only transformations
- Database Annotations: Automatic SCOP family identification and alignment probability display
- Batch Processing: Efficient processing of proteins across varying sizes
- High Performance: Fast response times even for large, complex structures
- Scalable: Support for proteins up to 3,000 residues and datasets with 50+ structures
- Publication Ready: Clear and consistent visualizations for structure analysis
Visualization Options
FlatProt offers multiple visualization types to suit different analysis needs:
- Normal: Standard visualization with detailed secondary structure representation
- Simple-coil: Simplified coil representation while maintaining detailed secondary structure elements
- Only-path: Minimalist path representation of the protein structure
- Special: Enhanced visualization with customizable features for specific analysis needs
Secondary Structure Representation
- Helices: Visualized in red (#F05039), with options for simple or detailed representation
- Sheets: Shown in blue (#1F449C) with arrow representations
- Coils: Displayed in black with adjustable simplification levels
Advanced Features
- Multi-chain Support: Automatic handling and visualization of multiple protein chains
- Region-Specific Extraction: Precise extraction of structural domains, motifs, and binding sites
- Database-Driven Alignment: Integration with Foldseek for family-specific structural alignment
- Rotation-Only Transformations: Align region orientations while preserving spatial relationships
- SCOP Family Annotations: Automatic identification and display of structural classification annotations
- Alignment Quality Metrics: Display of alignment probabilities and confidence scores
- Progressive Gap Positioning: Flexible domain spacing with configurable horizontal and vertical gaps for comparative visualization
- lDDT Score Visualization: Color-coded representation of local Distance Difference Test (lDDT) scores
- Residue Annotations: Optional display of amino acid annotations and chain positions
- Custom Highlighting: Support for manual residue and bond highlighting through JSON configuration
Citation & Data
If you use FlatProt in your research, please cite our preprint:
Tobias Olenyi, Constantin Carl, Tobias Senoner, Ivan Koludarov, Burkhard Rost. FlatProt: 2D visualization eases protein structure comparison. bioRxiv 2025.04.22.650077; doi: https://doi.org/10.1101/2025.04.22.650077
Datasets and supplementary data are available on Zenodo: