General-purpose GPUs (GPGPUs) are becoming prevalent in mainstream computing, and performance per watt has emerged as a more crucial evaluation metric than peak performance. As such, GPU architects require robust tools that will enable them to quickly explore new ways to optimize GPGPUs for energy efficiency. We propose a new GPGPU power model that is configurable, capable of cycle-level calculations, and carefully validated against real hardware measurements. To achieve configurability, we use a bottom-up methodology and abstract parameters from the microarchitectural components as the model’s inputs. We developed a rigorous suite of 80 microbenchmarks that we use to bound any modeling uncertainties and inaccuracies. The power model is comprehensively validated against measurements of two commercially available GPUs, and the measured error is within 9.9% and 13.4% for the two target GPUs (GTX 480 and Quadro FX5600). The model also accurately tracks the power consumption trend over time. We integrated the power model with the cycle-level simulator GPGPU-Sim and demonstrate the energy savings by utilizing dynamic voltage and frequency scaling (DVFS) and clock gating. Traditional DVFS reduces GPU energy consumption by 14.4% by leveraging within-kernel runtime variations. More finer-grained SM cluster-level DVFS improves the energy savings from 6.6% to 13.6% for those benchmarks that show clustered execution behavior. We also show that clock gating inactive lanes during divergence reduces dynamic power by 11.2%.
Internet web browsing has reached a critical tipping point. Increasingly, users rely more on mobile web browsers to access the Internet than desktop browsers. Meanwhile, webpages over the past decade have grown in complexity by more than tenfold. The fast penetration of mobile browsing and everricher webpages implies a growing need for high-performance mobile devices in the future to ensure continued end-user browsing experience. Failing to deliver webpages meeting hard cut-off constraints could directly translate to webpage abandonment or, for e-commerce websites, great revenue loss. However, mobile devices’ limited battery capacity limits the degree of performance that mobile web browsing can achieve. In this paper, we demonstrate the benefits of heterogeneous systems with big/little cores each with different frequencies to achieve the ideal trade-off between high performance and energy efficiency. Through detailed characterizations of different webpage primitives based on the hottest 5,000 webpages, we build statistical inference models that estimate webpage load time and energy consumption. We show that leveraging such predictive models lets us identify and schedule webpages using the ideal core and frequency configuration that minimizes energy consumption while still meeting stringent cut-off constraints. Real hardware and software evaluations show that our scheduling scheme achieves 83.0% energy savings, while only violating the cut-off latency for 4.1% more webpages as compared with a performance-oriented hardware strategy. Against a more intelligent, OS-driven, dynamic voltage and frequency scaling scheme, it achieves 8.6% energy savings and 4.0% performance improvement simultaneously.
In this paper, we characterize the impact of compiler optimizations on voltage noise. While intuition may suggest that the better processor utilization ensured by optimizing compilers results in a small amount of voltage variation, our measurements on a IntelR CoreTM2 Duo processor show the opposite – the majority of SPEC 2006 benchmarks exhibit more voltage droops when aggressively optimized. We show that this increase in noise could be sufficient for a net performance decrease in a typicalcase, resilient design.
Shrinking feature size and diminishing supply voltage are making circuits sensitive to supply voltage fluctuations within the microprocessor, caused by normal workload activity changes. If left unattended,voltage fluctuations can lead to timing violations or even transistor lifetime issues that degrade processor robustness. Mechanisms that learn to tolerate, avoid, and eliminate voltage fluctuations based on program and microarchitectural events can help steer the processor clear of danger, thus enabling tighter voltage margins that improve performance or lower power consumption.We describe the problem of voltage variation and the factors that influence this variation during processor design and operation. We also describe a variety of runtime hardware and software mitigation techniques that either tolerate, avoid, and/or eliminate voltage violations.We hope processor architects will find the information useful since tolerance, avoidance, and elimination are generalizable constructs that can serve as a basis for addressing other reliability challenges as well.
voltage noise, voltage smoothing, di dt , inductive noise, voltage emergencies, error detection, error correction, error recovery, transient errors, power supply noise, power delivery networks