Voltage Noise

2009
M. S. Gupta, V. J. Reddi, G. Holloway, G. - Y. Wei, and D. M. Brooks, “An Event-Guided Approach to Reducing Voltage Noise in Processors,” in Design, Automation & Test in Europe Conference & Exhibition, 2009. DATE'09. 2009, pp. 160–165. Publisher's Version
V. J. Reddi, M. S. Gupta, M. D. Smith, G. - Y. Wei, D. Brooks, and S. Campanoni, “Software-Assisted Hardware Reliability: Abstracting Circuit-Level Challenges to the Software Stack,” in Proceedings of the 46th Annual Design Automation Conference, 2009, pp. 788–793. Publisher's VersionAbstract

Power constrained designs are becoming increasingly sensitive to supply voltage noise. We propose a hardware-software collaborative approach to enable aggressive operating margins: a checkpoint-recovery mechanism corrects margin violations, while a run-time software layer reschedules the program’s instruction stream to prevent recurring margin crossings at the same program location. The run-time layer removes 60% of these events with minimal overhead, thereby significantly improving overall performance.

Categories and Subject Descriptors

C.0 [Computer Systems Organization]: General— Hardware/Software interfaces and System architectures.

General Terms

Performance, Reliability.

Keywords

Runtime Optimization, Hardware Software Co-Design.

Paper
V. J. Reddi, M. S. Gupta, G. Holloway, G. - Y. Wei, M. D. Smith, and D. Brooks, “Voltage Emergency Prediction: Using Signatures to Reduce Operating Margins,” in High Performance Computer Architecture, 2009. HPCA 2009. IEEE 15th International Symposium on, 2009, pp. 18–29. Publisher's VersionAbstract

Inductive noise forces microprocessor designers to sacrifice performance in order to ensure correct and reliable operation of their designs. The possibility of wide fluctuations in supply voltage means that timing margins throughout the processor must be set pessimistically to protect against worst-case droops and surges. While sensor-based reactive schemes have been proposed to deal with voltage noise, inherent sensor delays limit their effectiveness. Instead, this paper describes a voltage emergency predictor that learns the signatures of voltage emergencies (the combinations of control flow and microarchitectural events leading up to them) and uses these signatures to prevent recurrence of the corresponding emergencies. In simulations of a representative superscalar microprocessor in which fluctuations beyond 4% of nominal voltage are treated as emergencies (an aggressive configuration), these signatures can pinpoint the likelihood of an emergency some 16 cycles ahead of time with 90% accuracy. This lead time allows machines to operate with much tighter voltage margins (4% instead of 13%) and up to 13.5% higher performance, which closely approaches the 14.2% performance improvement possible with an ideal oracle-based predictor.

Paper
V. J. Reddi, et al., “Voltage Noise: Why It’s Bad, and What To Do About It,” in 5th IEEE Workshop on Silicon Errors in Logic-System Effects (SELSE), Palo Alto, CA, 2009.Abstract

Power constrained designs are becoming increasingly sensitive to supply voltage noise. We propose hardware-software collaboration to enable aggressive voltage margins: a fail-safe hardware mechanism tolerates margin violations in order to train a run-time software layer that reschedules instructions to avoid recurring violations. Additionally, the software controls an emergency signature-based predictor that throttles to suppress emergencies that code rescheduling cannot eliminate.

Paper

Pages