Machine learning (ML) has found wide application in materials science. A model developed by ML is expected to capture the common trends in the data and therefore to reflect the relationship between structure and property, a relationship that should hold for most compounds. By training ML models on existing databases, important properties of compounds can be predicted ahead of time-consuming experiments or calculations, which greatly speeds up the design of new materials.
While tremendously useful, these models do not directly reveal the rules and physics underlying the relationship between structure and property. Despite their decent overall performance, there will always be exceptions where ML models fail to give accurate predictions. Very often, it is these exceptions that offer new insights into the underlying physics and open up new frontiers in science.
A research group led by Prof. Feng Pan, the founding dean of the School of Advanced Materials, Peking University Shenzhen Graduate School, has recently shown that these models are valuable not only when they succeed in predicting properties accurately, but also when they fail. In their work, a model was built to predict the HSE band gaps of compounds from their atomic structures alone, based on a high-throughput calculation database constructed by the school itself. The R² of the model is 0.89, comparable with similar works. The researchers then singled out the structures with a prediction error larger than 2 eV and examined them carefully. Many of these structures contain unusual structural units or show other abnormalities relative to similar compounds, such as relatively large band gaps or a different phase. Among these unusual structures, AgO₂F attracted particular interest, and a detailed analysis is given. It was found that Ag³⁺ and O₂²⁻ coexist in this compound and that, while the Ag ions are in square planar coordination, there is little hybridization between the orbitals of Ag and O. States near the band edges are contributed mainly by O-2p orbitals, and the band gap is much smaller than in other compounds with Ag³⁺ ions. This offers a new example of anionic redox behavior, a hot topic in the investigation of Li-excess electrode materials. These results demonstrate how unusual structures can be discovered from the exceptions in machine learning, which can help researchers investigate new physics and novel structural units in existing databases.
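
The screening step described above, scoring a model by its R² and then pulling out the compounds it predicts badly, can be illustrated with a minimal Python sketch. This is not the group's code: the record type, the function names, and the sample values are hypothetical placeholders, and the "prediction error" is assumed to be the absolute difference between the reference and predicted band gaps.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """One compound with its reference and predicted band gap (hypothetical record type)."""
    formula: str
    gap_true: float  # reference band gap in eV (e.g. from a calculation database)
    gap_pred: float  # band gap predicted by the ML model, in eV


def r_squared(records):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(r.gap_true for r in records) / len(records)
    ss_res = sum((r.gap_true - r.gap_pred) ** 2 for r in records)
    ss_tot = sum((r.gap_true - mean_true) ** 2 for r in records)
    return 1.0 - ss_res / ss_tot


def find_exceptions(records, threshold_ev=2.0):
    """Return records whose absolute error exceeds the threshold, largest error first."""
    flagged = [r for r in records if abs(r.gap_true - r.gap_pred) > threshold_ev]
    return sorted(flagged, key=lambda r: abs(r.gap_true - r.gap_pred), reverse=True)


# Made-up placeholder values for illustration only; these are not data from the study.
sample_records = [
    Prediction("compound_A", 1.0, 1.2),
    Prediction("compound_B", 3.0, 2.7),
    Prediction("compound_C", 5.0, 5.4),
    Prediction("compound_D", 0.5, 3.1),  # badly overestimated
    Prediction("compound_E", 6.0, 3.5),  # badly underestimated
]

print(f"R^2 on the sample: {r_squared(sample_records):.2f}")
for r in find_exceptions(sample_records):
    error = abs(r.gap_true - r.gap_pred)
    print(f"{r.formula}: absolute error {error:.1f} eV, worth a closer look")
```

In a workflow like the one described, the flagged compounds would not be discarded as failures but passed on for manual inspection of their structural units and electronic structure.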