About the tool:

A large and rapidly growing number of predicted structures is now available in the AlphaFold Protein Structure Database (AlphaFold DB), which has brought a paradigm shift in the analysis and interpretation of protein function. Access to the structure files in PDB and mmCIF formats makes it easy for structural biologists and bioinformaticians to perform analyses. AlphaFold DB offers bulk structure downloads for 48 organisms (model organisms and global health proteomes), and bulk download was very recently enabled for up to 100 structures. However, there is no direct way to selectively download all structures of interest at once. In addition, many bioinformaticians work with sequence data from NCBI and have various protein sequence identifiers at hand, but cannot retrieve structures using those identifiers.

To bridge these gaps, we developed AlphaFoldDBExtractor, a tool that lets users download all available predicted structures from AlphaFold DB for their dataset of interest using just a list of taxonomy IDs or protein accessions.


Input:

AlphaFoldDBExtractor accepts input in several formats:

List of specific proteins:
AlphaFold IDs
UniProt IDs
Locus tags
Old locus tags
RefSeq Protein IDs


All proteins of an organism:
NCBI TaxID
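
The input formats listed above differ mainly in their textual shape, so a rough first guess at the type of an identifier can be made with pattern matching. The sketch below is illustrative only and is not necessarily how AlphaFoldDBExtractor detects its input. The function name `guess_id_type` and the returned labels are hypothetical. The UniProt pattern follows the accession format published in the UniProt help pages, the RefSeq prefixes are the common protein prefixes, and locus tags (old or new) have no fixed format, so anything unmatched is only reported as a candidate.

```python
import re

# Accession format as published in the UniProt help pages.
UNIPROT = r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"

# Order matters: more specific patterns are tried first.
PATTERNS = [
    # AlphaFold DB entry IDs wrap a UniProt accession, e.g. AF-P12345-F1.
    ("alphafold_id", re.compile(rf"AF-(?:{UNIPROT})-F\d+")),
    ("uniprot_accession", re.compile(UNIPROT)),
    # Common RefSeq protein prefixes (assumption: this list is not exhaustive).
    ("refseq_protein", re.compile(r"(?:AP|NP|XP|YP|WP)_\d+(?:\.\d+)?")),
    # NCBI TaxIDs are plain integers.
    ("ncbi_taxid", re.compile(r"\d+")),
]


def guess_id_type(identifier: str) -> str:
    """Return a best-guess label for one input identifier (hypothetical helper)."""
    token = identifier.strip()
    for name, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return name
    # Locus tags and old locus tags cannot be told apart by shape alone.
    return "locus_tag_candidate"
```

A guess like this can be wrong, for example when a locus tag happens to look like a UniProt accession, so asking the user to state the input type explicitly is the safer design.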



Workflow:

Structures are downloaded from the FTP page of AlphaFold DB using UniProt accessions, so the core step of this tool is mapping the other ID formats to UniProt accessions. The mapping works as follows:

[Image: workflow for mapping input IDs to UniProt accessions]
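
Once an identifier has been mapped to a UniProt accession, fetching the structure is essentially a matter of building the file name. The sketch below shows one way to do this in Python, under stated assumptions: the base URL and the `AF-<accession>-F1-model_v<version>` naming are taken from the publicly visible per-entry file links of AlphaFold DB, and the model version (4 here) changes between database releases, so both should be checked against the current AlphaFold DB documentation. The function names are hypothetical and do not represent AlphaFoldDBExtractor's own code.

```python
import urllib.request
from pathlib import Path

# Assumption: per-entry files are served under this base URL.
AFDB_FILES_BASE = "https://alphafold.ebi.ac.uk/files"


def alphafold_model_url(accession, fmt="pdb", version=4, fragment=1):
    """Build the expected model file URL for one UniProt accession."""
    if fmt not in ("pdb", "cif"):
        raise ValueError("fmt must be 'pdb' or 'cif'")
    acc = accession.strip().upper()
    return f"{AFDB_FILES_BASE}/AF-{acc}-F{fragment}-model_v{version}.{fmt}"


def download_model(accession, out_dir=".", fmt="pdb", version=4):
    """Download one model into out_dir and return the local path.

    urlopen raises urllib.error.HTTPError if no file exists at the URL,
    for example when the accession has no predicted structure.
    """
    url = alphafold_model_url(accession, fmt=fmt, version=version)
    target = Path(out_dir) / url.rsplit("/", 1)[-1]
    with urllib.request.urlopen(url) as response:
        target.write_bytes(response.read())
    return target
```

For a list of mapped accessions, `download_model` can be called in a loop, catching `urllib.error.HTTPError` to skip proteins that have no predicted structure.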