Summary
Generative AI’s energy requirements are massive. To reduce the energy footprint of AI, organizations can put practices into place that are not only better for the planet but also enable them to operate more efficiently and cost-effectively.
Generative AI technology is still relatively new, but its impact on energy consumption is already soaring. Per recent reports and studies:
- Integrating large language models (LLMs) into search engines could mean a fivefold increase in computing power.
- One assessment suggests that ChatGPT already consumes as much energy as 33,000 homes.
- The training of Hugging Face’s LLM BLOOM led to 25 metric tons of carbon dioxide emissions.
That’s a lot of energy usage for a technology that’s just getting started and only recently became part of everyday life. Without question, AI comes with a hefty carbon footprint, which is why world and technology leaders are now weighing in on AI sustainability and responsibility, both of which were prominent discussion points at this year’s World Economic Forum.
While many, if not most, organizations have made pledges around sustainability and minimizing e-waste, data center demand could triple by 2028, and it’s data centers that provide the cloud computing that powers AI.
Per Nokia CEO Pekka Lundmark: “As new AI-embedded consumer devices and industrial applications come to market, cloud providers will need to continue adding capacity, minimizing latency, and ensuring reliable services, all while trying to make their operations more sustainable.”
Notably, and to be fair, AI also promises to increase efficiency and could help mitigate 5% to 10% of greenhouse gas emissions by 2030, according to Boston Consulting Group. But paradoxically, as AI becomes more efficient, demand for it grows even faster, so it ends up using more power, not less.
Let’s look at the five best ways to reduce your AI energy footprint.
1. Optimize AI Algorithms
Data centers use most of their energy to operate processors and chips. Like other computer systems, AI systems process information via zeros and ones. Every time a bit changes between one and zero, it consumes electricity and generates heat. Roughly 40% of a data center’s electricity usage goes toward massive air conditioners that keep servers cool so they can keep functioning.
Optimizing AI algorithms means creating more efficient AI models, which means fewer parameters for them to process, which means fewer changes from zero to one, which means less energy consumption, which means less heat, which means less energy required to keep servers cool.
It all adds up to better use of AI data and less money spent.
Techniques for optimizing AI algorithms include:
Model Pruning
Pruning involves removing less important neurons or connections in a neural network to reduce its size and computational load without significantly impacting performance. This can be done through techniques like weight pruning (removing weights with small magnitudes) or neuron pruning (removing entire neurons). As an example, pruning 90% of AlexNet’s parameters reduced the model size by 9x with only a small drop in accuracy. A related approach is the use of small language models (SLMs), which are easier to train and use less power per computation than large language models, often with comparable accuracy on focused, domain-specific tasks.
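As a concrete illustration, here is a minimal sketch of magnitude-based weight pruning using PyTorch’s built-in torch.nn.utils.prune utilities; the toy network and the 90% sparsity target are illustrative, not a recommendation for any particular model.

```python
# A minimal sketch of weight pruning with PyTorch's built-in utilities.
# The network and sparsity level below are illustrative placeholders.
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small example network standing in for a real model.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Zero out the 90% of weights with the smallest magnitudes in each
# linear layer (L1 unstructured pruning).
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.9)
        # Make the pruning permanent by removing the reparameterization.
        prune.remove(module, "weight")

# Confirm the resulting sparsity of the first layer.
first = model[0].weight
sparsity = (first == 0).float().mean().item()
print(f"First-layer sparsity: {sparsity:.0%}")
```

Keep in mind that zeroed weights translate into energy savings only when the runtime or hardware can exploit sparsity; the sketch shows the mechanics, not a guaranteed speedup.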
Quantization
Quantization reduces the precision of the numbers used to represent the model’s parameters. Instead of using 32-bit floating-point numbers, 16-bit, 8-bit, or even lower precision can be used.
Post-training quantization and quantization-aware training are common methods. The former applies quantization after training, while the latter integrates it during the training process. An example would be TensorFlow Lite’s quantization, which can reduce a model’s size by up to 4x and improve inference speed by up to 3x with minimal impact on accuracy.
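To make this concrete, here is a minimal sketch of post-training quantization with the TensorFlow Lite converter mentioned above; the saved-model path is a placeholder for a model you have already trained and exported.

```python
# A minimal sketch of post-training quantization with TensorFlow Lite.
# "saved_model_dir" is a placeholder for your exported SavedModel.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")

# Enable the default optimization, which quantizes weights to 8-bit.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()

# The quantized model is typically ~4x smaller than the float32 original.
with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)
```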
Compression
Compression reduces the storage space required for a model by using methods such as parameter sharing, low-rank factorization, or encoding techniques. Methods like Huffman coding, weight sharing, or singular value decomposition (SVD) can be employed to compress neural networks. The Deep Compression framework compressed deep neural networks by 35x to 49x without loss of accuracy through pruning, quantization, and Huffman coding.
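As an illustration of low-rank factorization, here is a small sketch that compresses a single dense weight matrix with truncated SVD; the matrix size and rank are arbitrary, and in practice the rank is chosen (often per layer) to keep accuracy loss within budget.

```python
# A minimal sketch of low-rank factorization with truncated SVD.
# Matrix size and rank are illustrative, not tuned values.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 1024))  # stand-in for a dense weight matrix

# Truncated SVD: keep only the top-k singular values/vectors.
k = 64
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :k] * S[:k]   # 1024 x 64
B = Vt[:k, :]          # 64 x 1024

# Storage drops from 1024*1024 to 2*(1024*64) parameters (~8x smaller),
# and W @ x can be computed as A @ (B @ x) with far fewer operations.
compression = W.size / (A.size + B.size)
print(f"Compression ratio: {compression:.1f}x")
```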
2. Use the Best Possible Hardware
Having the most efficient hardware possible becomes increasingly important as AI models grow in complexity since hardware choice can significantly influence power efficiency, speed, and overall cost of AI operations.
AI model computation often requires high-performance CPUs, GPUs, and specialized AI accelerators (e.g., TPUs), but these powerful processors consume massive amounts of energy and can create bottlenecks that ultimately affect a model’s performance. The amount and type of memory (RAM, VRAM) also influence energy consumption: more memory and faster access speeds typically require more power, and high-power components generate more heat, necessitating efficient cooling solutions.
SSDs are generally more energy-efficient than HDDs and ultimately less costly, and all-flash arrays are changing the data storage game by maximizing speed, performance, and flexibility.
For CPUs:
- AMD EPYC and Intel Xeon are server-grade processors that offer high performance with power efficiency, making them suitable for AI workloads in data centers
- ARM-based processors are known for their energy efficiency and are becoming popular for AI inference tasks, especially in edge and mobile applications
For GPUs:
- NVIDIA A100 is designed for AI training and inference, offering very high performance per watt thanks to its Ampere architecture
- AMD Instinct MI100 is optimized for AI and machine learning workloads with a focus on energy efficiency
- NVIDIA’s Jetson series of edge modules provides powerful AI capabilities with low power consumption
As for AI accelerators, consider:
- Google TPUs, which are custom-developed for AI workloads
- Edge TPUs, which are designed for on-device AI processing
For data storage, NVMe (non-volatile memory express) SSDs provide fast data access with lower power consumption compared to traditional HDDs.
3. Make Your Data Center as Efficient as Possible
Data centers consume 1%-2% of the world’s total electricity. As already mentioned, AI-specific operations contribute significantly to this consumption.
The good news is that there are many ways to make your data centers more efficient:
Cooling Optimization
Advanced cooling techniques such as liquid cooling, free cooling (using outside air), and hot aisle/cold aisle containment can significantly reduce energy consumption. Companies like Google use AI to optimize cooling in real time, adjusting settings dynamically based on current conditions to minimize energy use.
Server Consolidation
You can consolidate servers by running multiple applications on fewer machines, often via virtualization or containerization, which we’ll discuss more below. This reduces the energy footprint by decreasing the number of servers that need to be powered and cooled, leading to lower overall energy consumption and improved utilization of resources.
Energy-efficient Data Center Design
Modular data centers allow for better scalability and energy management by expanding capacity as needed without overbuilding. Using either renewable power supplies or power supplies with high-efficiency ratings (80 Plus Gold or Platinum) reduces energy losses during conversion. Implementing systems that monitor and manage energy usage throughout the data center can also optimize power consumption. Another part of energy-efficient design is building data centers in places with immediate access to renewable energy, as NVIDIA is beginning to do.
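One widely used yardstick for this kind of energy monitoring is power usage effectiveness (PUE), a standard industry metric defined as total facility power divided by IT equipment power. Here is a trivial sketch of the calculation; the readings are illustrative placeholders for values you would pull from facility meters or DCIM software.

```python
# A rough sketch of tracking Power Usage Effectiveness (PUE):
# total facility power divided by IT equipment power (ideal is 1.0).
# The readings below are illustrative placeholders.
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """PUE = total facility power / IT equipment power."""
    return total_facility_kw / it_equipment_kw

# Example: 1,500 kW total draw, 1,000 kW of it reaching servers/storage.
print(pue(1500.0, 1000.0))  # 1.5 -> 0.5 W of overhead (largely cooling)
                            # for every watt of compute
```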
4. Use Cloud Computing and Virtualization
Cloud computing can significantly reduce the AI energy footprint through several mechanisms:
- Resource sharing: Cloud providers use multi-tenant environments where multiple customers share the same physical resources. This leads to better use of servers and reduces the need for excess capacity, thereby saving energy.
- Efficient data centers: Cloud providers invest heavily in optimizing their data centers for energy efficiency. They often use advanced cooling techniques, efficient hardware, and renewable energy sources, which can be more efficient than on-premises data centers.
- Scalability: Cloud platforms allow for dynamic scaling, meaning that resources can be allocated based on demand. This prevents overprovisioning and underutilization, which are common in on-premises setups.
- Geographic optimization: Cloud providers have data centers around the world. Workloads can be moved to locations where energy is cheaper and greener, further reducing the carbon footprint.
Virtualization also plays a crucial role in reducing energy consumption via:
- Server consolidation: Virtualization allows multiple virtual machines (VMs) to run on a single physical server. This reduces the number of physical servers needed, leading to lower energy consumption (see the sketch after this list).
- Dynamic resource allocation: Virtual machines can be migrated between physical servers to balance loads and ensure efficient resource use. This can minimize the number of active servers, thus saving energy.
- Idle resource reduction: Virtualization technologies can power down idle resources or consolidate workloads during off-peak times, reducing unnecessary energy use.
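To make the server consolidation idea concrete, here is a toy first-fit bin-packing sketch that places VM loads onto as few hosts as possible; real schedulers (e.g., VMware DRS) weigh many more factors, and the capacities and loads here are invented.

```python
# A toy sketch of the consolidation idea behind VM placement:
# first-fit bin packing of VM CPU loads onto as few hosts as possible.
def consolidate(vm_loads, host_capacity):
    """Pack VM CPU loads onto hosts first-fit; return per-host loads."""
    hosts = []
    for load in sorted(vm_loads, reverse=True):  # place largest first
        for host in hosts:
            if sum(host) + load <= host_capacity:
                host.append(load)
                break
        else:
            hosts.append([load])  # power on a new host only when needed
    return hosts

vms = [30, 20, 50, 10, 40, 25, 15]  # CPU demand per VM (%), made up
hosts = consolidate(vms, host_capacity=100)
print(f"{len(vms)} VMs fit on {len(hosts)} hosts: {hosts}")
```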
5. Use Energy Monitoring and Management Best Practices
A key part of minimizing the environmental footprint of AI operations is knowing how your energy is being used so you can maximize efficiency.
There are various energy monitoring tools you can use, including:
- Cloud provider tools: Major cloud providers like AWS, Google Cloud, and Microsoft Azure offer built-in monitoring tools (e.g., AWS CloudWatch, Google Cloud Monitoring, Azure Monitor) that track resource usage and energy consumption of cloud-based workloads.
- Specialized energy monitoring software: Tools like Intel Power Gadget, NVIDIA’s nvidia-smi, and AMD’s rocm-smi provide detailed insights into the energy usage of specific hardware components (see the sketch after this list).
- Data center management software: Tools like VMware vRealize Operations, Schneider Electric EcoStruxure, and Cisco Data Center Analytics help monitor and manage the energy consumption of on-premises data centers.
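As a quick example of the hardware-level tools above, here is a small sketch that polls GPU power draw via nvidia-smi; it assumes an NVIDIA driver and nvidia-smi are installed on the host and uses only standard nvidia-smi query flags.

```python
# A small sketch of polling GPU power draw with nvidia-smi.
# Assumes the NVIDIA driver and nvidia-smi are installed.
import subprocess

result = subprocess.run(
    ["nvidia-smi", "--query-gpu=index,power.draw,power.limit",
     "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
# One line per GPU, e.g.: "0, 212.45 W, 300.00 W"
for line in result.stdout.strip().splitlines():
    print(line)
```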
Techniques for managing energy consumption include:
- Energy profiling: Measure and analyze the energy consumption of different components (CPU, GPU, memory, storage) to identify the most energy-intensive parts of the AI workload.
- Power capping: Implement power capping techniques to limit the maximum power consumption of hardware, ensuring that it operates within a specified energy budget (see the sketch after this list).
- Dynamic voltage and frequency scaling (DVFS): Adjust the voltage and frequency of processors based on the workload demand to optimize energy usage.
- Off-peak scheduling: Schedule energy-intensive AI tasks during off-peak hours when energy costs and demand are lower. This can also help balance the load on the power grid.
- Batch processing: Aggregate smaller tasks into larger batches to optimize resource utilization and reduce the energy overhead associated with frequent task switching.
- Dynamic load balancing: Use dynamic load balancing techniques to distribute workloads evenly across available resources. This prevents overloading individual servers and ensures more efficient energy use.
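As an example of power capping, here is a minimal sketch that sets a per-GPU power limit with nvidia-smi; setting the limit typically requires administrator privileges, and the 250 W budget is an arbitrary illustration. Once capped, the driver throttles clocks to stay within the budget, which is also how DVFS shows up in practice on GPUs.

```python
# A minimal sketch of power capping on NVIDIA GPUs via nvidia-smi's
# power-limit flag (-pl). Usually requires root privileges; the 250 W
# budget is an arbitrary illustrative value.
import subprocess

POWER_BUDGET_WATTS = 250  # per-GPU cap for this workload (illustrative)

subprocess.run(
    ["nvidia-smi", "-pl", str(POWER_BUDGET_WATTS)],
    check=True,
)
# The driver will now throttle clocks (a form of DVFS) as needed to
# keep each GPU at or under the budget.
```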
Conclusion
We’re at an AI and sustainability inflection point. While most companies don’t yet fully understand AI’s impact, Pure Storage does. That’s why we’ve prioritized ESG and sustainability as part of our core mission.
By prioritizing the monitoring and management of AI-related energy consumption, organizations can achieve a balance between performance, cost efficiency, and environmental sustainability. These practices not only contribute to a greener planet but also enhance the overall efficiency and reliability of AI operations.
Learn more about how Pure Storage helps you fully capitalize on the AI opportunity without having to compromise.
