Last week, I looked at some of the compelling reasons for transforming a set of commodity servers into a big flexible computer, or BFC. At TidalScale, we call this a Software-Defined Server -- a single virtual machine that operates across multiple nodes and makes all the aggregated resources available to the application. But for today’s blog, it’s BFC all the way.
If you’re familiar at all with TidalScale, then you know we believe people should fit the computer to the problem, rather than the other way around. We believe in new technologies that can be adopted easily, in leveraging advances in cost-effective hardware, and in automation. We believe you shouldn’t have to invest in new hardware to solve large or difficult computational problems. We believe commodity, industry-standard technologies hold remarkable power and possibilities that are just waiting to be tapped.
Today, let’s look at how simple, straightforward and transparent Software-Defined Servers are.
You may have seen last week’s announcement that TidalScale was named an IDC Innovator in a recent report on software-defined solutions in the data center. IDC Innovators: Virtualizing Infrastructure with Software-Defined Compute, 2017 (March 2017) calls out TidalScale for allowing enterprises to “reuse commodity servers currently in service as workload demands arise.” That’s a gloriously concise way to bottom-line the…
In Gary Smerdon's last post, he listed eight ways Software-Defined Servers can help reduce OpEx and CapEx, while helping data center managers extract maximum use and value from existing IT resources.
As vital as these benefits are to IT, operations, finance and other areas, the ability to scale your system to the size of your problem is just as beneficial to scientists and analysts – the people on the front lines of big data analytics. If you fall into that camp, then you’re probably familiar with the dreaded “memory cliff.”
Cloud infrastructure providers today don’t have much flexibility when it comes to the systems they use. Resources devoted to running specific applications and workloads are generally confined to the limits of a single system, typically a “sweet spot” server that perhaps offers 24 cores and a few hundred gigs of memory. It's a matter of economics, really:
Writing quality code can be a challenge for any organization. At TidalScale, we go to great effort not to write terrible code. And while that might seem absurdly obvious, in a fast-growth environment, it doesn't exactly come easy. Here's some of what we do to make sure we come up with the good stuff.
One word can kill a startup:
Most data center managers – and even many end users – are familiar with Software-Defined Networking and Software-Defined Storage. These battle-tested approaches to virtualizing existing assets make it easier for resources to zig when workloads zag. They introduce significant flexibility into the data center, which is a win for practically everyone involved.
But one piece has been conspicuously missing from the software-defined puzzle: the server.
When the demands of Big Data analytics surpass the core count and memory available on your biggest server, you’re usually left with three dismal options: spend money you don’t have on new hardware; devote time you can’t spare rewriting code to run across clusters; or delay insights you can’t put off by shrinking the size of your problems to fit the limits of your hardware.
I’ve known Ike Nassi since we both worked at Digital Equipment back in the good old days, and I’ve always enjoyed talking to Ike about computer architecture. In some ways, what Ike’s doing at TidalScale feels like déjà vu: it echoes what DEC did back then when it introduced its first minicomputer with real virtual memory – the VAX-11/780. Virtual memory is an abstraction – let’s pretend we have a lot of physical memory even though we don’t; TidalScale is an abstraction – let’s pretend we have a big, powerful computer, even though we don’t. In both cases the idea might seem highly questionable to someone whose job is to squeeze the last iota of performance out of a computer.
We recently came across a problem that illustrates how software might be reconsidered in this new, software-defined-server environment.
Customer Problem Statement:
- Consider two tables of historical data for each of, say, 3,000 securities. One table is called “Left” and the other “Right”.
- Each table for each security has a column of timestamps, a column containing the name of the security it represents (e.g. “AAPL”), and additional data. The Left table might have, for example, 150 additional columns of data, and the Right table might have, for example, 100 additional columns of data.
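To make the layout above concrete, here is a minimal Python sketch of the two per-security tables. Everything beyond the column counts and the “AAPL” example is an assumption made for illustration: the helper `make_table`, the column names `left_0` ... and `right_0` ..., the extra symbols, the timestamps and the placeholder values are all hypothetical, and only a handful of rows and securities are generated rather than the full 3,000.

```python
from datetime import datetime, timedelta


def make_table(symbol, n_rows, n_extra_cols, prefix):
    """Build one per-security table as a list of row dictionaries.

    Each row has a timestamp, the security name, and n_extra_cols
    additional columns filled with a placeholder value.
    """
    start = datetime(2017, 1, 3, 9, 30)  # arbitrary, hypothetical start time
    rows = []
    for i in range(n_rows):
        row = {"timestamp": start + timedelta(seconds=i), "symbol": symbol}
        for c in range(n_extra_cols):
            row[f"{prefix}_{c}"] = 0.0  # placeholder for real data
        rows.append(row)
    return rows


# A tiny stand-in for the 3,000 securities in the problem statement.
symbols = ["AAPL", "MSFT", "IBM"]

# One Left table (150 additional columns) and one Right table
# (100 additional columns) per security.
left = {s: make_table(s, n_rows=5, n_extra_cols=150, prefix="left") for s in symbols}
right = {s: make_table(s, n_rows=5, n_extra_cols=100, prefix="right") for s in symbols}

left_width = len(left["AAPL"][0])    # timestamp + symbol + 150 columns
right_width = len(right["AAPL"][0])  # timestamp + symbol + 100 columns
print(left_width, right_width)
```

Keying the tables by security name mirrors the "one table per security" wording. A real workload would more likely hold this data in a columnar structure.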
When I talk with people about the rapidly increasing volumes of data they rely on to run their business, I describe data growth in terms of the cost of sending a kid to college. Today, tuition and fees at an out-of-state public school average nearly $25,000 a year. If those costs grow at the rate that data is growing – at 62 percent CAGR – then by the time your newborn daughter heads to college, her freshman year alone will cost more than $200 million!
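The arithmetic behind the tuition comparison is ordinary compound growth, and it can be checked with a short Python sketch. The function name `project_cost` and the choice of 17 to 19 years of compounding are assumptions made for illustration. The post itself gives only the $25,000 starting figure and the 62 percent rate.

```python
def project_cost(initial_cost, annual_growth, years):
    """Compound an initial cost at a fixed annual growth rate."""
    return initial_cost * (1.0 + annual_growth) ** years


# Starting cost and growth rate taken from the text; the year counts
# are assumptions about when a newborn starts college.
for years in (17, 18, 19):
    cost = project_cost(25_000, 0.62, years)
    print(f"{years} years: ${cost / 1e6:,.1f} million")
```

Under these assumptions the projected cost is roughly $148 million after 18 years of compounding and roughly $239 million after 19, so it crosses the $200 million mark during the nineteenth year.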
Computer Science is obsessed with “negative results” and “limits.” We seem to delight in pinpointing a terminus for virtually any technology or architecture – to map the place where the party ends.
Take, for example, Amdahl’s Law, which seems to suggest that beyond a certain point, adding parallelism doesn’t improve performance. Amdahl’s Law has kept many from believing that a market exists for bigger single systems, since the law leads us to conclude that larger multicore systems won’t solve today’s problems any faster.
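For reference, the standard textbook form of Amdahl’s Law gives the speedup on n processors as 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. The Python sketch below evaluates that formula. The function name `amdahl_speedup` and the sample value p = 0.95 are assumptions chosen for illustration and do not come from the post.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Speedup predicted by Amdahl's Law for a fixed-size problem."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


# With 95% of the work parallelizable (an assumed value), the speedup
# levels off below 1 / (1 - 0.95) = 20 however many processors are added.
for n in (1, 8, 64, 1024):
    print(n, round(amdahl_speedup(0.95, n), 2))
```

The ceiling of 1 / (1 - p) is the “limit” that makes the law read as an argument against larger systems. Note that the formula holds the problem size fixed.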
Beware the Intuitively Obvious
The reasoning behind Amdahl’s Law turns on an assumption that all the parts of the problem must interact in such a way that
TidalScale has built and operates very large (1.5TB - 3.0TB RAM) TidalPods so customers can test and evaluate their applications on a single, very large virtualized system. We call these PoC (Proof of Concept) systems. We are making them available to you for a limited time, at no charge, to test your application(s) in a large-scale environment. The basic testing program:
INTRODUCTION - R IN-MEMORY
When we started searching for large-scale Open R benchmarks, we were surprised to find few good workloads for multi-terabyte TidalScale systems. We ended up writing our own R benchmark, which lets us scale R workloads to arbitrarily large in-memory sizes. In the process, we learned a few tips and tricks for running large workloads with Open Source R that we thought we'd share.
Like many statistical analytic tools, R can be incredibly memory intensive. A simple GAM (generalized additive model) or K-nearest neighbor routine can devour many times the memory occupied by the starting dataset. And R doesn't always behave nicely when it runs out of memory.