20x Performance for the Same HW Price: GPGPU-based Machine Learning at Yandex
Russian search-engine Yandex has disclosed some details about their machine-learning framework, FML. The most interesting detail is that it runs on 80 TFLOPS cluster powered by GPGPU. This is quite unusual application for GPU, as ML algorithms are usually hard to be paralleled. However they have managed to adapt their decision tree algorithm for high-level of parallelism. As a result Yandex has achieved more than 20x speed-up for the same hardware price.
They are going to upgrade their cluster to 300 TFLOPS. Yandex expects its cluster to be in the list of top 100 most powerful supercomputers in the world after that upgrade.