ColTraIn HBFP Training Emulator

Co-located deep neural network training and inference

HBFP is a hybrid Block Floating-Point (BFP) and Floating-Point (FP) number representation for DNN training, introduced by the ColTraIn (Co-located DNN Training and Inference) team of PARSA and MLO at EPFL. HBFP offers the best of both worlds: the high accuracy of floating-point with the superior hardware density of fixed-point, achieved by performing all dot products in BFP and all other operations in FP32. For a wide variety of models, HBFP matches floating-point accuracy while enabling hardware implementations that deliver up to 8.5x higher throughput. This repository hosts ongoing research on training DNNs with HBFP.
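
The split between BFP dot products and floating-point everything else can be illustrated with a minimal sketch. It is not the repository's code or API: the function names `bfp_quantize` and `hbfp_dot`, the `mantissa_bits` parameter, and the choice of one shared exponent per whole vector are assumptions made for illustration only. Each input vector is reduced to integer mantissas with a single shared exponent, the multiply-accumulate runs on integers, and the result is converted back to floating point.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Quantize a list of floats to signed integer mantissas plus one shared exponent.

    Hypothetical helper: each value is approximated as mantissa * 2**scale_exp.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0] * len(block), 0
    # frexp returns (m, e) with max_abs == m * 2**e and 0.5 <= m < 1.
    _, shared_exp = math.frexp(max_abs)
    # Leave one bit for the sign; the rest hold the magnitude.
    scale_exp = shared_exp - (mantissa_bits - 1)
    limit = 2 ** (mantissa_bits - 1) - 1
    mantissas = []
    for x in block:
        q = round(math.ldexp(x, -scale_exp))
        mantissas.append(max(-limit, min(limit, q)))  # saturate on overflow
    return mantissas, scale_exp


def hbfp_dot(a, b, mantissa_bits=8):
    """Dot product with integer multiply-accumulate and a floating-point result."""
    ma, ea = bfp_quantize(a, mantissa_bits)
    mb, eb = bfp_quantize(b, mantissa_bits)
    acc = sum(x * y for x, y in zip(ma, mb))  # fixed-point style arithmetic
    return math.ldexp(float(acc), ea + eb)    # rescale back to floating point


if __name__ == "__main__":
    print(hbfp_dot([0.1, 0.2, 0.3], [0.4, 0.5, 0.6]))  # close to the exact value 0.32
```

Because the exponent is shared across the block, the inner loop needs only integer multipliers and adders, which is where the hardware density benefit described above comes from. Narrower mantissas trade accuracy for density.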

Deep Neural Networks
Key facts
Maturity
Support
C4DT: Inactive
Lab: Unknown
  • Technical
  • Research papers

Parallel Systems Architecture Lab

Prof. Babak Falsafi

The Parallel Systems Architecture Lab observes that information technology has undergone a major paradigm shift: sensors, embedded devices, and mobile devices generate massive amounts of data, which backend cloud services augment for an enhanced experience. Data has emerged as a currency for modern society. Datacenters are now the backbone of IT, offering large-scale cloud services at low cost by exploiting economies of scale. With silicon efficiency scaling having dwindled since 2004 and silicon density scaling slowing down, future digital platforms will rely on heterogeneous logic and memory to allow for IT scalability. Meanwhile, the demand for large-scale cloud services has grown dramatically faster than conventional silicon scaling, making IT platform scalability a grand challenge. Future platforms will need hand-in-hand collaboration between application domain experts and platform designers to improve scalability. With many online services running in memory and the minimum communication latency between the farthest nodes in a 20MW datacenter being on the order of microseconds, future server platforms will go through revolutionary changes in architecture to enable seamless aggregation of logic and memory resources across nodes, breaking the conventional system abstraction layers.

This page was last edited on 2024-03-15.