Computer scientists at the University of California San Diego have developed a new technology that can encode, transform and edit video faster, by several orders of magnitude, than the current state of the art.
They presented their work at the ACM Symposium on Cloud Computing, Oct. 11 to 13 in Carlsbad, Calif.
The system, called Sprocket, was made possible by an innovative process that breaks down video files into extremely small pieces and then moves these pieces between thousands of servers every few thousandths of a second for processing. All this happens in the cloud and allows researchers to harness a large amount of computing power in a very short amount of time. Sprocket was developed and written by CSE graduate students Lixiang Ao and Liz Izhikevich (now a PhD student at Stanford).
Sprocket doesn’t just cut down the amount of time needed to process video; it is also extremely cheap. For example, two hours of video can be processed in 30 seconds with the system, instead of tens of minutes with other methods, for a cost of less than $1.
“Before, you could get access to a server for a few hours. Now, with cloud computing, anyone can have access to thousands of servers, for fractions of a second, for just a few dollars,” said George Porter, an associate professor in the Department of Computer Science and Engineering here at UC San Diego, who led the project together with computer science professor Geoff Voelker.
This type of parallel computing in the cloud is offered by several big companies, including Amazon, Microsoft and Google.
Sprocket is particularly well suited for image searches within videos. For example, a user could edit three hours of video from their summer vacation in just a few seconds so that it includes only the footage featuring a certain person.
(An early demo of the technology consisted of editing down the “Infinity War” trailer so it would only feature Thor.)
Sprocket can do this because it is extremely efficient at moving tiny fractions of video between servers and making sure they’re processed right away. It also makes sure that algorithms have enough context to process each specific video frame.
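For readers who want a feel for the idea, here is a minimal sketch in Python of splitting a video into small chunks, giving each chunk a little surrounding context, and processing the chunks in parallel. It is an illustration only, not Sprocket's code: the function names, the chunk and context sizes, the use of local threads in place of cloud servers, and the placeholder "algorithm" (averaging each frame with its immediate neighbors) are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_with_context(num_frames, chunk_size, context):
    """Return (start, end, ctx_start, ctx_end) frame-index tuples, one per chunk.

    Frames in [start, end) are the ones a worker must produce output for;
    frames in [ctx_start, ctx_end) are the ones it is allowed to read.
    """
    chunks = []
    for start in range(0, num_frames, chunk_size):
        end = min(start + chunk_size, num_frames)
        ctx_start = max(0, start - context)
        ctx_end = min(num_frames, end + context)
        chunks.append((start, end, ctx_start, ctx_end))
    return chunks


def process_chunk(frames, chunk):
    """Placeholder per-frame algorithm: average each frame with its neighbors.

    Hypothetical stand-in for a real filter or detector. It only reads
    frames inside the chunk's context window.
    """
    start, end, ctx_start, ctx_end = chunk
    out = []
    for i in range(start, end):
        lo = max(ctx_start, i - 1)
        hi = min(ctx_end, i + 2)
        neighborhood = frames[lo:hi]
        out.append(sum(neighborhood) / len(neighborhood))
    return out


def process_video(frames, chunk_size=4, context=1, workers=8):
    """Process all chunks in parallel and stitch the results back in order."""
    chunks = split_with_context(len(frames), chunk_size, context)
    # Threads stand in for the many short-lived cloud workers described above.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda c: process_chunk(frames, c), chunks)
        return [value for chunk_out in results for value in chunk_out]


if __name__ == "__main__":
    # Each "frame" is a single number here to keep the example small.
    frames = [float(i * i) for i in range(10)]
    print(process_video(frames, chunk_size=3, context=1))
```

Because every chunk carries one frame of context on each side, the chunked result matches what a single worker would produce over the whole video, which is the property the paragraph above describes. A real pipeline would need a context window sized to the algorithm being run.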