New York University’s Center for Data Science is at the cutting edge of fields with revolutionary implications such as machine learning, natural language processing, computer vision and intelligent machines.
Because computing speed is critical to accelerating experimentation and advancing research, the center’s Computational Intelligence, Learning, Vision and Robotics (CILVR) lab recently acquired an NVIDIA DGX-1 AI supercomputer to fuel this work like never before.
The CILVR lab focuses on “unsupervised learning.” The lab’s faculty, research scientists and graduate students are developing techniques that allow machines to learn from raw, unlabeled data by, for example, observing video, looking at images or listening to speech.
These techniques are then applied to areas like self-driving cars that can understand the environment around them, medical image analysis that can detect tumors or disease earlier and more accurately than traditional methods, and natural language processing that can translate languages, answer questions or hold a dialogue with people.
“The DGX-1 is going to be used in just about every research project we have here,” said Yann LeCun, founding director of the NYU Center for Data Science and a pioneer in the field of AI. “The students here can’t wait to get their hands on it.”
The unsupervised learning algorithms the CILVR lab is working on require an immense amount of computation because researchers have to try many different versions of them to figure out which ones work best. Thousands of experiments are run in parallel. And the faster the results, the faster researchers can determine whether the settings and tuning were right.
“Having a fast machine is really crucial to be able to succeed,” said LeCun.
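To make the idea of running many experiments side by side concrete, here is a minimal sketch of a parallel settings sweep. It is not the CILVR lab's code: the `run_experiment` function, its toy loss formula, and the learning-rate and width values are all hypothetical stand-ins for a real training run, and threads are used only to keep the example small.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_experiment(config):
    """Hypothetical stand-in for one training run; returns a validation loss."""
    lr, width = config
    # Toy loss surface with its minimum at lr=0.01, width=256 (illustrative only).
    return (lr - 0.01) ** 2 * 1e4 + ((width - 256) / 256) ** 2


def sweep(learning_rates, widths, max_workers=4):
    """Run every (learning rate, width) combination concurrently and rank the results."""
    configs = list(product(learning_rates, widths))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        losses = list(pool.map(run_experiment, configs))
    # Lowest loss first.
    return sorted(zip(losses, configs))


results = sweep([0.001, 0.01, 0.1], [128, 256, 512])
best_loss, best_config = results[0]
print(best_config)
```

In a real setting each call would be a full training job, typically dispatched to separate processes or GPUs, so the time to finish the whole sweep depends directly on how fast the hardware completes each run.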
The DGX-1 has no peer. It delivers up to 170 teraflops of performance — equivalent to 250 conventional servers — in a box the size of a slim suitcase.
The Power of Predictive Analytics
Among other fields, the CILVR lab is breaking ground in predictive analytics as applied to areas such as urban systems and sports, led by researcher Claudio Silva.
With every taxicab in major cities now tracked via GPS, driver survey data is being superseded by machine learning models. These are built on actual traffic patterns and driver behavior across thousands of cabs and weeks of driving. The results reflect the reality of how traffic works.
In sports, Silva has been using deep learning to make sense of unprecedented amounts of detail captured on video of baseball players’ behavior. He’s already worked with Major League Baseball on its system for tracking every single movement of every player on the field, every pitch and every batter, as well as the ball.
But the amount of data is vast: one season covers about 700,000 at-bats, or roughly a terabyte and a half.
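As a rough sense of scale, the two figures above imply an average data volume per at-bat. The short calculation below is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes) and that the data is spread evenly across at-bats, neither of which the figures themselves confirm.

```python
AT_BATS_PER_SEASON = 700_000  # figure quoted in the article
SEASON_BYTES = 1.5e12         # "a terabyte and a half", assuming 1 TB = 10**12 bytes

# Average size of the data recorded for a single at-bat.
bytes_per_at_bat = SEASON_BYTES / AT_BATS_PER_SEASON
mb_per_at_bat = bytes_per_at_bat / 1e6

print(f"{mb_per_at_bat:.1f} MB per at-bat")  # prints "2.1 MB per at-bat"
```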
With bigger models, researchers like Silva can ask bigger and better questions, such as what exactly makes a certain pitch effective, which physical mechanics a player needs to change in his swing, and whether a player is likely to get injured.
“There is no way for us to effectively do machine learning on this without GPUs. It would take too long,” said Silva. “But systems like DGX-1 are going to enable us to go through all that data and create predictive models.”