A new paradigm for high performance data analytics
Conventional computer design has changed little over the last fifty years. Almost all commercial computers employ CPUs connected to data caches, which are in turn connected to large memory subsystems. As problems grow, many copies of these CPU-memory combinations are interconnected by some form of high-speed network to compose a larger system. Such systems work extremely well, provided that the majority of memory accesses hit in the caches and that the vast majority of references are to data in local memory and therefore do not require moving data across the interconnect.
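How strongly a cache-based node depends on its access pattern can be seen with a short experiment. The sketch below is ordinary C for a single conventional machine, not code for the Emu system; the array size, the shuffled index vector, and the timing approach are illustrative choices. It sums the same large array twice: once in address order, where caches and prefetchers do their job, and once through a random permutation, where nearly every reference misses.

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <time.h>

    #define N (1u << 24)   /* 16M 8-byte elements: far larger than any on-chip cache */

    static uint64_t rng = 0x9E3779B97F4A7C15ull;
    static uint64_t next_rand(void) {            /* xorshift64 PRNG for the shuffle */
        rng ^= rng << 13; rng ^= rng >> 7; rng ^= rng << 17;
        return rng;
    }

    static double seconds(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + 1e-9 * ts.tv_nsec;
    }

    int main(void) {
        double *a    = malloc(N * sizeof *a);
        size_t *perm = malloc(N * sizeof *perm);
        if (!a || !perm) return 1;

        for (size_t i = 0; i < N; i++) { a[i] = 1.0; perm[i] = i; }

        /* Fisher-Yates shuffle: turns the index vector into a random permutation. */
        for (size_t i = N - 1; i > 0; i--) {
            size_t j = (size_t)(next_rand() % (i + 1));
            size_t t = perm[i]; perm[i] = perm[j]; perm[j] = t;
        }

        /* Pass 1: address order -- cache lines are fully used and prefetched. */
        double t0 = seconds(), s1 = 0.0;
        for (size_t i = 0; i < N; i++) s1 += a[i];
        double t1 = seconds();

        /* Pass 2: the same work through the permutation -- almost every
           reference misses in cache and pays full memory latency. */
        double s2 = 0.0;
        for (size_t i = 0; i < N; i++) s2 += a[perm[i]];
        double t2 = seconds();

        printf("sequential: %.3f s  (sum %.0f)\n", t1 - t0, s1);
        printf("shuffled:   %.3f s  (sum %.0f)\n", t2 - t1, s2);
        free(a);
        free(perm);
        return 0;
    }

On a typical cache-based machine the shuffled pass runs several times slower than the sequential one, even though it performs exactly the same arithmetic; that gap is what the locality conditions above are meant to avoid.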
Unfortunately, many Big Data problems do not meet these conditions for efficient operation. The data is too large to fit in a single memory, and, because the goal of data analytics is typically to detect and analyze relationships between data elements spread throughout the entire database, the targets of memory references are effectively random, so the vast majority of references must cross the interconnect network. Conventional, cache-based computers rely on strong data locality for performance, and applications built on large graphs or very sparse matrices, where such locality cannot be assumed, will continue to be problematic for them. The Emu system instead gains performance from weak locality and has no reliance on data adjacency. Using a new paradigm based on Migratory Threads and Memory-side Processing, Emu effectively ‘brings the man to the mountain [of data]’, thereby avoiding the bandwidth and latency limitations that choke today’s HPC systems.
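The contrast between the two paradigms can be sketched with a toy message-counting model. The code below is not Emu’s programming interface or a description of its hardware; it simply compares, under an assumed pointer-chasing access pattern, how many one-way network messages are generated when data is always fetched back to a fixed thread versus when a small thread context migrates to the node that owns the data. The node count, hop count, co-location probability, and the assumption that a migration costs a single one-way message are all chosen purely for illustration.

    #include <stdio.h>
    #include <stdint.h>

    /*
     * Toy model of pointer chasing over data scattered across NODES memory nodes.
     *
     * "Data to compute": the thread stays on its home node, so every element that
     * lives on another node costs a request message out and a response back.
     *
     * "Compute to data" (migratory-thread style): the thread context hops to the
     * node that owns the element and reads it there, so each hop is a single
     * one-way message, and no message at all when the next element is co-located.
     *
     * NODES, HOPS, and CO_LOCATED_PERCENT are illustrative parameters only.
     */

    #define NODES              64
    #define HOPS               1000000
    #define CO_LOCATED_PERCENT 25   /* chance the next element shares a node */

    static uint64_t rng = 0x9E3779B97F4A7C15ull;
    static uint64_t next_rand(void) {            /* xorshift64 PRNG */
        rng ^= rng << 13; rng ^= rng >> 7; rng ^= rng << 17;
        return rng;
    }

    int main(void) {
        unsigned home = 0;          /* fixed location of the conventional thread */
        unsigned data = 0;          /* node owning the current list element      */
        unsigned here = 0;          /* current location of the migrating thread  */
        uint64_t msgs_fetch = 0;    /* one-way messages, data-to-compute scheme  */
        uint64_t msgs_migrate = 0;  /* one-way messages, compute-to-data scheme  */

        for (unsigned i = 0; i < HOPS; i++) {
            /* Where does the next element live?  Mostly random, with a modest
               chance it shares a node with the current one (weak locality). */
            if (next_rand() % 100 >= CO_LOCATED_PERCENT)
                data = (unsigned)(next_rand() % NODES);

            /* Conventional: remote read = request out + data back. */
            if (data != home) msgs_fetch += 2;

            /* Migratory: move the thread context to the data, one way. */
            if (data != here) { msgs_migrate += 1; here = data; }
        }

        printf("one-way network messages, data to compute:  %llu\n",
               (unsigned long long)msgs_fetch);
        printf("one-way network messages, compute to data:  %llu\n",
               (unsigned long long)msgs_migrate);
        return 0;
    }

In this model the conventional scheme pays a request and a response for nearly every reference, while the migrating thread pays one message per hop and nothing at all when consecutive references land on the same node; that modest co-location is the weak locality from which the architecture draws its performance.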
Emu’s revolutionary approach enables huge performance gains in extracting knowledge from diverse, unstructured data sets. Problems that today’s supercomputers do not solve well are effectively addressed, including quantitative financial modeling and analytics, social media analysis, non-obvious relationship awareness (NORA), fraud detection, optimization, and genomics, among others.