A new artificial intelligence system is able to recognize higher-level patterns that are consistent across different recipes for producing particular types of materials.
A team of materials scientists from the Massachusetts Institute of Technology (MIT) developed the new system, which can identify correlations between the precursor chemicals used in materials recipes and the crystal structures of the resulting products.
The system uses statistical methods that provide a natural mechanism for generating original recipes. The researchers used this mechanism to suggest alternative recipes for known materials, and the suggestions accord well with real recipes.
The new system is a neural network, a type of model that learns to perform computational tasks by analyzing large sets of training data.
Traditionally, attempts to use neural networks to generate materials recipes have had problems with sparsity and scarcity. A system ideally needs to be trained on a huge number of examples in which the parameters are varied.
“People think that with machine learning, you need a lot of data, and if it’s sparse, you need more data,” Edward Kim, a graduate student in materials science and engineering at MIT, said in a statement. “When you’re trying to focus on a very specific system, where you’re forced to use high-dimensional data but you don’t have a lot of it, can you still use these neural machine-learning techniques?”
Neural networks are typically arranged into layers, each consisting of thousands of simple processing units, or nodes. Each node is connected to several nodes in the layers above and below. Data is fed into the bottom layer, which manipulates it and passes it to the next layer, where the process continues.
During training, the connections between nodes are constantly readjusted until the output of the final layer consistently approximates the result of some computation.
The problem with sparse, high-dimensional data is that for any given training example, most nodes in the bottom layer receive no data. It takes a prohibitively large training set to ensure that the network as a whole sees enough data to learn to make reliable generalizations.
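To make the sparsity problem concrete, the short Python sketch below encodes a single recipe as an input vector with one slot per possible precursor. The vocabulary size, the precursor names, and the `encode` helper are all hypothetical and chosen only for illustration; they are not taken from the MIT system.

```python
# Hypothetical vocabulary of 1,000 precursor chemicals (names are placeholders).
vocabulary = [f"precursor_{i}" for i in range(1000)]
index = {name: i for i, name in enumerate(vocabulary)}


def encode(recipe):
    """Return a vector with 1.0 at the slot of every precursor the recipe uses."""
    vec = [0.0] * len(vocabulary)
    for name in recipe:
        vec[index[name]] = 1.0
    return vec


# An illustrative recipe that uses only three of the 1,000 possible precursors.
recipe = ["precursor_3", "precursor_42", "precursor_977"]
vector = encode(recipe)

# Fraction of bottom-layer nodes that would receive any data for this example.
nonzero_fraction = sum(1 for v in vector if v != 0.0) / len(vector)
print(f"{nonzero_fraction:.1%} of input nodes receive data")
```

With numbers like these, almost every input node sees a zero for any given example, which is why a network trained directly on such vectors would need a very large training set.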
The new network distills input vectors into much smaller vectors, all of whose numbers are meaningful for every input. The middle layer of the network has just a few nodes in it, as few as two in some experiments.
The aim of training is to configure the network so that its output is as close as possible to its input. If training is successful, then the handful of nodes in the middle layer must somehow represent most of the information contained in the input vector, but in a much more compressed form, which compensates for sparsity.
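A network trained to reproduce its own input through a narrow middle layer is commonly called an autoencoder. The sketch below is a deliberately simplified, linear version written with NumPy, trained on synthetic sparse data. The data, layer sizes, learning rate, and variable names are all assumptions made for illustration, and the code should not be read as the researchers' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic sparse data (an assumption): 200 examples, 50 features, about 6% nonzero.
n_samples, n_features, n_hidden = 200, 50, 2
X = (rng.random((n_samples, n_features)) < 0.06).astype(float)

# Encoder maps 50 inputs to a 2-node middle layer; decoder maps back to 50 outputs.
W_enc = rng.normal(0.0, 0.1, (n_features, n_hidden))
W_dec = rng.normal(0.0, 0.1, (n_hidden, n_features))


def reconstruct(data):
    """Compress each row to n_hidden numbers, then expand it back."""
    code = data @ W_enc
    return code, code @ W_dec


def mse(data):
    """Mean squared difference between the network's output and its input."""
    _, data_hat = reconstruct(data)
    return float(np.mean((data_hat - data) ** 2))


initial_loss = mse(X)
learning_rate = 0.1
for _ in range(1000):
    code, X_hat = reconstruct(X)
    err = (X_hat - X) / n_samples        # gradient of the per-sample squared error
    grad_dec = code.T @ err              # gradient with respect to decoder weights
    grad_enc = X.T @ (err @ W_dec.T)     # gradient with respect to encoder weights
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc
final_loss = mse(X)

print(f"reconstruction error: {initial_loss:.4f} -> {final_loss:.4f}")
```

After training, `code` holds a two-number summary of each 50-number input. A practical system would add nonlinear layers and real recipe data; the linear form is used here only to keep the gradient steps easy to follow.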
The scientists also trained the network not only on recipes for producing particular materials, but also on recipes for producing very similar materials. They used three measures of similarity, one of which seeks to minimize the number of differences between materials (substituting, say, just one atom for another) while preserving crystal structure.
During training, the network is evaluated not only on how well its outputs match its inputs, but also on how well the values taken on by the middle layer accord with some statistical model.
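One common way to combine the two evaluation criteria is the loss used in a variational autoencoder, in which the middle layer outputs a mean and a log-variance and is penalized for straying from a standard normal distribution. The article does not say which statistical model the researchers used, so the standard normal prior, the `beta` weighting, and the function names below are assumptions for illustration.

```python
import numpy as np


def reconstruction_error(x, x_hat):
    """How well the output matches the input (mean squared error)."""
    return float(np.mean((x - x_hat) ** 2))


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over dimensions."""
    return float(0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var))


def total_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction term plus a weighted penalty on the middle-layer statistics."""
    return reconstruction_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)


# Illustrative values: a perfect reconstruction whose middle layer is off-center.
x = np.array([0.0, 1.0, 0.0, 0.0])
mu = np.array([1.0, 0.0])
log_var = np.zeros(2)
loss = total_loss(x, x.copy(), mu, log_var)
print(loss)
```

In this example the reconstruction is exact, so the whole loss comes from the penalty on the middle layer. Training against both terms pushes the compressed representation toward the chosen distribution, which is what makes it possible to sample from that distribution later.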