In a breakthrough for computer vision and for bird-watching, researchers and bird enthusiasts have enabled computers to achieve a task that stumps most humans: identifying hundreds of bird species pictured in photos. The bird photo identifier, developed by the Visipedia research project in collaboration with the Cornell Lab of Ornithology, is available for free. An overview of the project will be presented by researchers from Cornell Tech and the California Institute of Technology at the Computer Vision and Pattern Recognition (CVPR) conference in Boston June 8.
Called Merlin Bird Photo ID, the identifier is capable of recognizing 400 of the most commonly encountered birds in the United States and Canada.
“It gets the bird right in the top three results about 90 percent of the time, and it’s designed to keep improving the more people use it,” said Jessie Barry, the Merlin Project Leader at the Cornell Lab of Ornithology. “That’s truly amazing, considering that the computer vision community started working on the challenge of bird identification only a few years ago.”
To see if Merlin can identify the bird in a photo, users upload an image and tell Merlin where and when the photo was taken. To orient Merlin, users draw a box around the bird and click on its bill, eye and tail. Merlin does the rest. Within seconds, it looks at the pixels and combines powerful artificial intelligence techniques with millions of data points from humans, then presents the most likely species, including photos and sounds.
“Computers can process images much more efficiently than humans — they can organize, index and match vast constellations of visual information, such as the colors of the feathers and shapes of the bill,” said Serge Belongie, a professor of computer science at Cornell Tech. “The state-of-the-art in computer vision is rapidly approaching that of human perception, and with a little help from the user, we can close the remaining gap and deliver a surprisingly accurate solution.”
Merlin’s success relies on collaboration between computers and humans. The computer learns to recognize each species from tens of thousands of images identified and labeled by bird enthusiasts. It also taps into more than 70 million sightings recorded by birders in the eBird.org database, narrowing its search to the species found at the location and time of year when the photo was taken.
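For readers curious how such narrowing might work, here is a minimal sketch of the idea described above: image-based scores are restricted to the species plausible at the photo's place and date, then re-ranked. This is not Merlin's actual code. The function `rank_species`, its parameters and the example scores are hypothetical, and the set of plausible species is assumed to have already been looked up from sightings records.

```python
from typing import Dict, List, Set, Tuple


def rank_species(
    vision_scores: Dict[str, float],
    plausible_species: Set[str],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Filter image-based scores by a location/date checklist and return the top_k.

    vision_scores: hypothetical per-species scores from an image classifier.
    plausible_species: species assumed to occur at the photo's place and time of year.
    """
    # Drop any species not expected at this place and time of year.
    filtered = {
        species: score
        for species, score in vision_scores.items()
        if species in plausible_species and score > 0
    }
    total = sum(filtered.values())
    if total == 0:
        return []
    # Renormalize so the remaining scores sum to 1, then sort best-first.
    ranked = sorted(
        ((species, score / total) for species, score in filtered.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top_k]


# Made-up example: three look-alike jays, one of which is not found locally.
scores = {"Blue Jay": 0.5, "Steller's Jay": 0.3, "Florida Scrub-Jay": 0.2}
plausible = {"Blue Jay", "Florida Scrub-Jay"}
result = rank_species(scores, plausible)
```

In this made-up example, the Steller's Jay is removed because it is not on the local list, leaving the Blue Jay as the top suggestion. A real system would likely weight species by how often they are reported rather than applying a strict yes-or-no filter.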
Because the photo identifier uses machine-learning techniques, it has the potential to improve the more people use it. After it can reliably identify photos taken with smartphones, the team will add it to the Merlin Bird ID app, a free app that has helped users with more than 1 million bird identifications by asking them five questions.
Merlin’s computer vision system was developed by Steve Branson and Grant Van Horn of the Visipedia project, led by professors Pietro Perona at the California Institute of Technology and Belongie. Their work was made possible with support from Google, the Jacobs Technion-Cornell Institute at Cornell Tech and the National Science Foundation.