Rising energy prices directly impact cloud builders like Managed Services Providers, Hosting Services Providers, Cloud Services Providers, enterprises and SaaS vendors. Since many of these companies operate their own on-premises cloud infrastructures, they are taking a hit as their overhead bills rise and derail their total cost of ownership projections.
This trend is forcing many companies to review and optimize power use by turning off sections of their infrastructure and seeking other cost-optimization tactics. With this article, we provide an overview of how cloud providers can optimize their operating costs from a hardware perspective.
Hardware components are often considered “cash consumers” because, in addition to their upfront cost, each one requires operating expenses like electricity. So, doing the same job with less hardware can significantly reduce costs and free up cash for other purposes.
While dual-socket servers (servers that support two CPUs) are commonplace in data centers, if a single-socket server can do the job, there is no need for a dual-socket one. A second CPU mostly just increases power consumption (by the CPU itself, the links between the two CPUs, etc.) and can quickly eat into the per-rack power budget.
However, if the use case requires more CPU cores than a single unit can provide, the dual-socket architecture is probably more power-efficient than two single-socket servers, because the dual-socket node still uses a single motherboard chipset, the same NIC(s), GPUs, coolers, etc.
Having two running nodes is pure spending without value unless disaster recovery RPO and RTO objectives mandate it. Because they duplicate hardware components (power supplies, coolers, NICs, etc.), two half-loaded physical servers consume more power than one loaded at 85% to 90%. If the workloads and software allow it, the second node can be powered off and stay ready to return to work if the first one fails, needs maintenance, or if additional resources are needed to handle peak loads. Of course, there is a tradeoff: the boot time of an offline node versus the power the same node uses to stay idle.
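As a back-of-the-envelope illustration of this point, the sketch below compares one well-loaded server against two half-loaded ones using a simple linear power model. All wattage figures are assumptions chosen for illustration, not measurements from any particular server:

```python
# Back-of-the-envelope comparison: one server at ~85% load versus two
# half-loaded servers doing the same total work. The idle and maximum
# wattages below are illustrative assumptions, not vendor figures.

def server_power_watts(load: float, idle_w: float = 120.0, max_w: float = 400.0) -> float:
    """Approximate a server's draw with a simple linear power model."""
    return idle_w + load * (max_w - idle_w)

one_loaded = server_power_watts(0.85)        # a single node at 85% load
two_half = 2 * server_power_watts(0.425)     # the same work split across two nodes

hours_per_year = 24 * 365
kwh_saved_per_year = (two_half - one_loaded) * hours_per_year / 1000
```

Under these assumed figures, consolidation saves roughly a thousand kWh per year per consolidated pair, simply because the second node's idle floor disappears.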
For larger deployments, a chassis with a modular system design provides high density, performance, efficiency and cost-effectiveness. Such systems share some hardware (e.g., power supplies) and save rack space (e.g., eight servers in a 4U chassis).
An integral part of offering hardware planning is helping customers design their cloud with the right hardware for their specific use cases. In the early stage of each project, always ask which components the customer plans to use for their particular use case (e.g., databases, IaaS services, web hosting), not only to confirm hardware compatibility or check for known issues but also to help with the component choice. By achieving their goals with less, right-sized hardware, users ultimately improve their power savings as well.
With the current shortages in the components market and significantly longer delivery times, obtaining new chips, compute, network and storage resources to meet hardware-planning objectives can be time-consuming. This is time during which a cloud builder’s business suffers or cannot grow because of a lack of resources.
This market situation forces companies to rethink and optimize their hardware utilization to achieve their power-saving goals with what they already have. “Hardware packing” is a set of approaches that aim to run the maximum valuable workload on the minimum amount of hardware. There are three general points of consideration.
• Get rid of unused hardware components. Why spend power and money on a RAID controller when the server has only NVMe devices in PCIe slots or just a single boot device? Why buy and “feed” a dual-port 100 GbE NIC when the motherboard’s integrated 2×10 GbE NIC is enough for the job? If the 100 GbE NIC is unavoidable (e.g., already bought or integrated into the motherboard), connecting it to a 10 or 25 Gbps network will use less power. The difference is small for a single server, but it becomes significant for a fleet of hundreds of nodes.
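To make the fleet-scale effect concrete, here is a rough estimate of the extra power a high-end NIC can draw across a few hundred nodes. The per-card wattages and fleet size are hypothetical assumptions, not vendor specifications:

```python
# Rough fleet-level estimate of the extra draw of a dual-port 100 GbE NIC
# versus an integrated 2x10 GbE controller. All wattages are assumed
# round numbers for illustration, not datasheet values.

NIC_100G_W = 20.0   # assumed draw of a dual-port 100 GbE add-in card
NIC_10G_W = 5.0     # assumed draw of an integrated 2x10 GbE controller
NODES = 300         # a hypothetical mid-sized fleet

extra_watts = (NIC_100G_W - NIC_10G_W) * NODES
extra_kwh_per_year = extra_watts * 24 * 365 / 1000
```

A 15 W per-node difference is invisible on one server, but across 300 nodes it adds up to several kilowatts of continuous draw, i.e., tens of thousands of kWh per year.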
• Concentrate workloads on fewer servers. Servers without bottlenecks can take on additional tasks, so workloads in the cloud can be concentrated on a subset of nodes. Load-free servers use less power while idle, or can even be powered off until they are needed to absorb peak loads. If resources allow, it is more efficient to pack 20 VMs onto a single hypervisor than to disperse them over two, three or more nodes.
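Consolidating VMs onto as few hypervisors as possible is essentially a bin-packing problem. The sketch below uses the classic first-fit-decreasing heuristic on vCPU demand only; real placement would also weigh RAM, storage and affinity rules, and the capacities here are invented examples:

```python
# Minimal first-fit-decreasing packer: place VMs (by vCPU demand) onto
# as few hypervisors as possible so the remaining nodes can be powered
# down. Core counts and capacities are hypothetical examples.

def pack_vms(vm_cores: list[int], node_capacity: int) -> list[list[int]]:
    """Return a list of nodes, each a list of the VM core counts placed on it."""
    nodes: list[list[int]] = []
    free: list[int] = []
    for cores in sorted(vm_cores, reverse=True):  # largest VMs first
        for i, spare in enumerate(free):
            if spare >= cores:                    # first node with room wins
                nodes[i].append(cores)
                free[i] -= cores
                break
        else:                                     # no existing node fits: open one
            nodes.append([cores])
            free.append(node_capacity - cores)
    return nodes

# Twenty small 2-core VMs fit on one 64-core hypervisor instead of several.
placement = pack_vms([2] * 20, node_capacity=64)
```

Every node that ends up empty after such a pass is a candidate for standby or power-off.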
• Make a schedule of the workloads and plan the resources they need. During weekdays, developers and QA engineers may need thousands of VMs or containers for their regular jobs, but there is no need for such a fleet to stay online and consume power over the weekend. Nightly builds and tests could run on the same groups of nodes where daily reports are generated during working hours; there is no value in keeping spare resources online for both 24/7. The dynamic scheduling features of the Cloud Management Platform in use can help here. Depending on the level of automation, these schedulers can stop unnecessary workloads and free the resources they use based on various criteria. Ideally, they can automatically migrate and pack workloads onto only a few servers, freeing the rest and putting the off-loaded nodes to sleep.
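A schedule-driven power policy of this kind can be sketched in a few lines. The function, its name, and the day/night shares below are all invented for illustration; a real Cloud Management Platform would expose its own scheduling hooks:

```python
# Sketch of a schedule-driven power policy: outside working hours, keep
# only the share of the fleet needed for night workloads online. The
# function name, hours, and shares are hypothetical illustrations.

from datetime import datetime

def nodes_to_keep_online(now: datetime, total_nodes: int,
                         day_share: float = 1.0, night_share: float = 0.25) -> int:
    """Return how many nodes should stay powered on at this moment."""
    is_working_hours = now.weekday() < 5 and 9 <= now.hour < 18
    share = day_share if is_working_hours else night_share
    return max(1, round(total_nodes * share))

# On a weekend, most of a hypothetical 40-node fleet can sleep.
weekend = nodes_to_keep_online(datetime(2022, 10, 8, 12), total_nodes=40)
```

A scheduler built around such a policy would combine it with live migration: pack the remaining workloads onto the nodes that stay up, then suspend the rest.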
In the past, the approach to building cloud infrastructure was often just to buy a ton of RAM, CPU cores and storage devices. Nowadays, this is no longer so easy. With hardware difficult to source and electricity prices rising, building a cloud to be efficient from inception is a more cost-effective way to go. Working with the right provider to build right-sized clouds capable of running demanding applications is an ideal approach to reining in costs driven by hyperinflation or constrained energy supplies.
A shorter version of this blog post was originally published by Boyan Ivanov, CEO at StorPool Storage and Forbes Councils Member, on Forbes.com a month ago.
Boyan Ivanov, Co-Founder and Chief Executive Officer at StorPool Storage. Boyan started programming at the age of 10. At the same age, Boyan started his first venture, and the latter stuck. Boyan has versatile experience in the enterprise and SMB worlds and has also been part of several startups. Now he is focused on helping companies implement best-in-class primary storage solutions and achieve performance and efficiency for their clouds.