3 Secrets to Right-Sizing a Server

I’ve grown accustomed to the stares of disbelief. It usually starts like the conversation I had the other day with some folks from a leading North American insurance company. They were planning to roll out an advanced new analytic model. Trouble was, they had no way to predict how much compute or memory capacity they’d need.

The business unit wanted the IT team to get the biggest system they could lay their hands on, short of a supercomputer. It’s a critical application and they felt they needed the margin of safety. The workload was likely to have big peaks now and then. The IT lead on this project had heard this before. Over-buying capacity once in a while might not be bad, but when you’re supporting thousands of applications, that wasted resource adds up. It always winds up being wasted.

When I explained to them that they probably already had everything they needed to deliver the right-sized system, I saw the first hint of that look. I explained how TidalScale enables them to create servers of virtually any size on the fly. And that when you can create servers of any size, you can do a lot of things faster and easier. You can flexibly define a server with enough RAM to run big, data-intensive applications fully in-memory. You can harness hundreds of CPU cores on a single system for compute-intensive problems. You can do this with your existing hardware, your current operating systems and, most important, your existing applications and tools.

At this point the look has transformed to a stare, as if to say, “Tom, you’re full of something and it isn’t pasta.” 

A Software-Defined Server implemented with TidalScale’s HyperKernel software offers several revelations. TidalScale can combine the cores, memory and I/O of one or more commodity Intel-based servers to create any number of Software-Defined Servers to satisfy any assortment of workloads. Our technology allows you to run a single instance of unmodified, unpatched releases of popular enterprise operating systems like Red Hat Enterprise Linux, Ubuntu and CentOS. And you don’t even need proprietary hardware or special interconnects designed for high-performance computing. In fact, TidalScale's HyperKernel software is ideally suited to work with 10Gb Ethernet and the kind of industry-standard “sweet spot” servers that populate today’s data centers. These are servers acquired primarily for their price/performance advantages and their ability to handle most enterprise workloads, but they aren’t built for the kind of outlier workloads that are increasingly found in data centers of all kinds.

Delivering that level of flexibility to data centers takes a lot of innovation. Here’s a peek at three keys that enable you to have the right size server with the right performance profile, on the fly, the TidalScale way.

  • Mobilizing virtual CPU cores. TidalScale presents virtualized CPU cores (vCPUs) to the guest operating system. At any point in time, these virtual CPUs are executing on a physical CPU core counterpart. Taking advantage of VT-x and VT-d in modern Intel processors, TidalScale can move a vCPU from one physical server node to another. Moving the vCPU to a different node makes sense when certain workloads require it. That migration happens automatically, and the action of moving a vCPU only requires two CPU instructions, while consuming only one jumbo frame on Ethernet—not counting a few related data pages that must go along for the ride.
  • Using machine learning to automatically move virtual resources to where they’re needed most. A vCPU is only one of the compute resources that TidalScale can migrate back and forth between nodes. TidalScale can also move or copy pages of memory from one node to another over a 10GbE network-based interconnect. And we can move virtual I/O devices around the physical cluster. Migrating these resources is only effective if combined with a strategy to optimize their placement continuously. So TidalScale employs machine learning within our HyperKernel software to evaluate multiple parameters of the system. In most cases, the result is that a vCPU runs on the same node as its related data for reasonably long periods of time (in the case of servers, “long periods” may still be measured in milliseconds). The goal of keeping the processing and data together follows the concept of locality. By properly implementing the locality principle, we can run compute operations essentially at full hardware speeds, without losing performance to overhead.
  • Minimizing latency when virtual resources must move to a different node. When an operation needs data pages that sit on a different node from the vCPU, TidalScale’s machine learning assesses the overall performance cost of its options and then generally decides one of two things: either transfer or copy the needed data pages across the network to where the vCPU is running, or (in situations where a lot of data is involved) migrate the vCPU to where the data resides. To achieve this without great cost to application performance, TidalScale uses its own low-level protocol over physical 10GbE. Moving a page of memory across the TidalScale resource interconnect takes 3 microseconds (µsec) compared with 50-150 nanoseconds for local DRAM access. While this step is roughly 20x to 60x slower than local DRAM access, it’s still 1000x faster than accessing that page from the fastest NVMe SSD. Moving a vCPU takes two processor instructions and 6 µsec, which is still much faster than the fastest SSD.
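
To make the trade-off in the last point concrete, here is a minimal Python sketch. It is not TidalScale's algorithm, which weighs multiple parameters with machine learning. It is a deliberately naive cost comparison that uses only the two latencies quoted above (3 µsec per page, 6 µsec per vCPU move). The function names, the assumption that pages transfer one after another, and the tie-breaking rule are all hypothetical.

```python
# Illustrative cost model only; NOT TidalScale's actual placement logic.
# The two constants are the latencies quoted in the article; everything
# else (names, sequential-transfer assumption, tie-breaking) is assumed.
PAGE_MOVE_US = 3.0   # time to move one memory page between nodes
VCPU_MOVE_US = 6.0   # time to migrate one vCPU between nodes


def cheaper_action(remote_pages_needed: int) -> str:
    """Pick the option with the lower estimated latency.

    Assumes pages are moved one after another and ignores the few
    data pages that travel along with a migrating vCPU.
    """
    if remote_pages_needed < 0:
        raise ValueError("remote_pages_needed must be non-negative")
    page_cost_us = remote_pages_needed * PAGE_MOVE_US
    # On a tie, prefer moving pages so the vCPU stays where it is.
    return "move_pages" if page_cost_us <= VCPU_MOVE_US else "move_vcpu"


def slowdown(remote_us: float, local_ns: float) -> float:
    """Ratio of a remote access time (µsec) to a local one (nsec)."""
    return (remote_us * 1000.0) / local_ns


if __name__ == "__main__":
    for pages in (1, 2, 3, 100):
        print(pages, "remote page(s):", cheaper_action(pages))
    # Page move versus the slow and fast ends of the DRAM range.
    print("vs 150 ns DRAM:", slowdown(PAGE_MOVE_US, 150))
    print("vs 50 ns DRAM:", slowdown(PAGE_MOVE_US, 50))
```

Under these assumptions, fetching one or two pages is cheaper than migrating the vCPU, while three or more pages tip the balance toward moving the vCPU to the data. That matches the rule of thumb that the vCPU moves when a lot of data is involved.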

The combination of these three factors enables TidalScale to deliver consistently good, predictable performance, even as workloads grow. For organizations that want more flexibility from their data center and maximum performance from shifting, ever-growing workloads, TidalScale provides a cost-effective path to success.

Still find it unbelievable? Let’s run your real-world analytic workload on TidalScale. Contact me for a Test-Drive.


Topics: software-defined server, in-memory performance