AF3 parse Function Explained
- Software development
- 2025-09-05 16:54:02

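The parse function in AlphaFold3's mmcif_parsing module parses the contents of an mmCIF file, extracts the protein structure information, converts it into an MmcifObject, and returns the result as a ParsingResult.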
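The calling contract described above can be illustrated with a small stand-in. This is a sketch, not AlphaFold code: ParsingResultSketch and parse_sketch are hypothetical names, the "parsing" step is a placeholder that only checks that the string is non-empty, and storing the exception object under the key (file_id, "") is an assumption. It shows only the two behaviours implied by the signature and docstring in the source below: results are memoized with functools.lru_cache, and catch_all_errors decides whether an exception is returned in the errors mapping or allowed to propagate.

```python
import dataclasses
import functools
from typing import Any, Mapping, Optional, Tuple


@dataclasses.dataclass
class ParsingResultSketch:
    """Hypothetical stand-in for ParsingResult: a parsed object plus errors."""
    mmcif_object: Optional[Any]
    errors: Mapping[Tuple[str, str], Any]


@functools.lru_cache(16, typed=False)
def parse_sketch(*, file_id: str, mmcif_string: str,
                 catch_all_errors: bool = True) -> ParsingResultSketch:
    """Toy version of the entry point; all arguments are keyword-only."""
    errors = {}
    try:
        # Placeholder for the real parsing work (assumption for illustration).
        if not mmcif_string.strip():
            raise ValueError("empty mmCIF string")
        return ParsingResultSketch(mmcif_object={"file_id": file_id},
                                   errors=errors)
    except Exception as e:  # Broad on purpose, mirroring "catch all errors".
        errors[(file_id, "")] = e
        if not catch_all_errors:
            raise
        return ParsingResultSketch(mmcif_object=None, errors=errors)


ok = parse_sketch(file_id="1abc", mmcif_string="data_1ABC")
bad = parse_sketch(file_id="empty", mmcif_string="")
```

Because the cache key is built from the keyword arguments, a second call with the same file_id and mmcif_string returns the cached result object instead of parsing again. Calls that raise are not cached.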
Source code:

@functools.lru_cache(16, typed=False)
def parse(
    *, file_id: str, mmcif_string: str, catch_all_errors: bool = True
) -> ParsingResult:
  """Entry point, parses an mmcif_string.

  Args:
    file_id: A string identifier for this file. Should be unique within the
      collection of files being processed.
    mmcif_string: Contents of an mmCIF file.
    catch_all_errors: If True, all exceptions are caught and error messages are
      returned as part of the ParsingResult. If False exceptions will be allowed
      to propagate.

  Returns:
    A ParsingResult.
  """
  errors = {}
  try:
    parser = PDB.MMCIFParser(QUIET=True)
    handle = io.StringIO(mmcif_string)
    full_structure = parser.get_structure("", handle)
    first_model_structure = _get_first_model(full_structure)
    # Extract the _mmcif_dict from the parser, which contains useful fields not
    # reflected in the Biopython structure.
    parsed_info = parser._mmcif_dict  # pylint:disable=protected-access

    # Ensure all values are lists, even if singletons.
    for key, value in parsed_info.items():
      if not isinstance(value, list):
        parsed_info[key] = [value]

    header = _get_header(parsed_info)

    # Determine the protein chains, and their start numbers according to the
    # internal mmCIF numbering scheme (likely but not guaranteed to be 1).
    valid_chains = _get_protein_chains(parsed_info=parsed_info)
    if not valid_chains:
      return ParsingResult(
          None, {(file_id, ""): "No protein chains found in this file."}
      )
    seq_start_num = {
        chain_id: min([monomer.num for monomer in seq])
        for chain_id, seq in valid_chains.items()
    }

    # Loop over the atoms for which we have coordinates. Populate two mappings:
    # -mmcif_to_author_chain_id (maps internal mmCIF chain ids to chain ids used
    # the authors / Biopython).
    # -seq_to_structure_mappings (maps idx into sequence to ResidueAtPosition).
    mmcif_to_author_chain_id = {}
    seq_to_structure_mappings = {}
    for atom in _get_atom_site_list(parsed_info):
      if atom.model_num != "1":
        # We only process the first model at the moment.
        continue

      mmcif_to_author_chain_id[atom.mmcif_chain_id] = atom.author_chain_id

      if atom.mmcif_chain_id in valid_chains:
        hetflag = " "
        if atom.hetatm_atom == "HETATM":
          # Water atoms are assigned a special hetflag
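The data-handling steps in the listing above can be tried in isolation on toy data. The sketch below is not AlphaFold code: Monomer and AtomSite are simplified stand-ins declared here (only the fields the listing reads are included), and every sample value is invented. It walks through the singleton-to-list normalization, the per-chain seq_start_num computation, and the first-model filter that fills mmcif_to_author_chain_id.

```python
import collections

# Simplified stand-ins (assumptions): only the fields read in the listing.
Monomer = collections.namedtuple("Monomer", ["id", "num"])
AtomSite = collections.namedtuple(
    "AtomSite", ["mmcif_chain_id", "author_chain_id", "model_num"])

# 1. Ensure all values are lists, even if singletons.
#    Only values of existing keys are replaced, so the dict size is
#    unchanged and iterating over items() stays safe.
parsed_info = {
    "_entry.id": "1ABC",                          # singleton string
    "_atom_site.label_asym_id": ["A", "A", "B"],  # already a list
}
for key, value in parsed_info.items():
    if not isinstance(value, list):
        parsed_info[key] = [value]

# 2. Start number of each chain in the internal mmCIF numbering.
valid_chains = {
    "A": [Monomer("MET", 1), Monomer("LYS", 2)],
    "B": [Monomer("GLY", 5), Monomer("ALA", 6)],
}
seq_start_num = {
    chain_id: min(monomer.num for monomer in seq)
    for chain_id, seq in valid_chains.items()
}

# 3. Map internal mmCIF chain ids to author chain ids, first model only.
atoms = [
    AtomSite("A", "H", "1"),
    AtomSite("B", "L", "1"),
    AtomSite("C", "X", "2"),  # belongs to model 2, so it is skipped
]
mmcif_to_author_chain_id = {}
for atom in atoms:
    if atom.model_num != "1":
        continue
    mmcif_to_author_chain_id[atom.mmcif_chain_id] = atom.author_chain_id
```

Note that the model number is compared against the string "1", not the integer 1, as in the listing. In this toy data chain B starts at 5, which illustrates why the listing computes the start number per chain instead of assuming 1.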