The Deep Green list is based on the identification and curation of conserved unannotated proteins in three green lineage (Viridiplantae) model organisms: Arabidopsis thaliana, Chlamydomonas reinhardtii, and Setaria viridis. Deep Green proteins and genes were preliminarily characterized using various informatics tools and published data sets, as presented in Knoshaug, Sun, et al. (2023, submitted). The structures of these unannotated proteins were also predicted using AlphaFold (Jumper et al., 2021). The data deposited here are the AlphaFold structural predictions with the highest pLDDT score, identified as the best folded structures (ranked_0). These data enable others to perform in-depth structural characterization to aid functional characterization, leading to a deeper understanding of plant biology.
References:
Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., Petersen, S., Reiman, D., Clancy, E., Zielinski, M., Steinegger, M., Pacholska, M., Berghammer, T., Bodenstein, S., Silver, D., Vinyals, O., Senior, A. W., Kavukcuoglu, K., Kohli, P. and Hassabis, D. (2021) Highly accurate protein structure prediction with AlphaFold. Nature, 596:583-589.
Knoshaug, E. P., Sun, P., Nag, A., Nguyen, H., Mattoon, E. M., Zhang, N., Liu, J., Chen, C., Cheng, J., Zhang, R., St. John, P., and Umen, J. (submitted) Identification and preliminary characterization of conserved uncharacterized proteins from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Setaria viridis.
4 Data Resources
Name | Size | Type | Resource Description | History |
---|---|---|---|---|
Deep Green folded structures.zip | 118.65 MB | Archive | Predicted structures of the Deep Green protein set | |
A.thaliana_unannotated structures.zip | 154.13 MB | Archive | Predicted structures for Arabidopsis thaliana unannotated proteins | |
C.reinhardtii_unannotated structures.zip | 592.15 MB | Archive | Predicted structures for Chlamydomonas reinhardtii unannotated proteins | |
S.viridis_unannotated structures.zip | 367.09 MB | Archive | Predicted structures for Setaria viridis unannotated proteins | |
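As a starting point for working with these archives, the sketch below reads per-residue pLDDT scores from a predicted structure. It assumes the archives contain PDB-format files, which is how AlphaFold normally writes its ranked models; AlphaFold stores pLDDT (on a 0 to 100 scale) in the B-factor column of each atom record. The function names `residue_plddt` and `mean_plddt` are illustrative and are not part of this dataset or of AlphaFold.

```python
from statistics import mean


def residue_plddt(pdb_text):
    """Return one pLDDT value per residue, read from the CA atom records.

    Assumption: the input is PDB-format text from an AlphaFold prediction,
    where the B-factor field (columns 61-66) holds the pLDDT score.
    """
    scores = []
    for line in pdb_text.splitlines():
        # PDB is a fixed-column format: atom name sits in columns 13-16.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def mean_plddt(pdb_text):
    """Return the average per-residue pLDDT of a predicted structure."""
    scores = residue_plddt(pdb_text)
    if not scores:
        raise ValueError("no CA atom records found")
    return mean(scores)
```

To use it, extract one structure file from an archive, read it as text, and pass the contents to `mean_plddt`. Reading only the CA atoms gives one score per residue, which avoids weighting residues by their atom count.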
Keywords
Submitted
April 19, 2023
Biosciences Center
Cite This Dataset
Knoshaug, Eric, Peipei Sun, Ambarish Nag, Huong Nguyen, Erin Mattoon, Ningning Zhang, Jian Liu, Chen Chen, Jianlin Cheng, Ru Zhang, Peter St. John, and James Umen. 2023. "Deep Green Unannotated Protein Structures." NREL Data Catalog. Golden, CO: National Renewable Energy Laboratory. Last updated: December 12, 2023. DOI: 10.7799/1970473.
About This Dataset
216
10.7799/1970473
Public
12/12/2023
DOE Project
Deep Green: Structural and Functional Genomic Characterization of Conserved Unannotated Green Lineage Proteins
Facilities
High Performance Computing Center (HPC)
Funding Organization
Department of Energy (DOE)
Sponsoring Organization
USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23)
Research Areas
Bioenergy
Digital Object Identifier
10.7799/1970473