Cloud infrastructure providers today don’t have much flexibility when it comes to the systems they use. Resources devoted to running specific applications and workloads are generally confined to the limits of a single system, typically a “sweet spot” server that perhaps offers 24 cores and a few hundred gigs of memory. It's a matter of economics, really:
Writing quality code can be a challenge for any organization. At TidalScale, we go to great effort not to write terrible code. And while that might seem absurdly obvious, in a fast-growth environment, it doesn't exactly come easy. Here's some of what we do to make sure we come up with the good stuff.
One word can kill a startup:
Most data center managers – and even many end users – are familiar with Software-Defined Networking and Software-Defined Storage. These battle-tested approaches to virtualizing existing assets make it easier for resources to zig when workloads zag. They introduce significant flexibility into the data center, which is a win for practically everyone involved.
But one piece has been conspicuously missing from the software-defined puzzle: the server.
When the demands of Big Data analytics surpass the core count and memory available on your biggest server, you’re usually left with three dismal options: spend money you don’t have on new hardware; devote time you can’t spare rewriting code to run across clusters; or delay insights you can’t put off by shrinking the size of your problems to fit the limits of your hardware.
I’ve known Ike Nassi since we both worked at Digital Equipment back in the good old days, and I’ve always enjoyed talking to Ike about computer architecture. In some ways, what Ike’s doing at TidalScale feels like déjà vu: it echoes what DEC did in that timeframe when it introduced its first minicomputer with real virtual memory – the VAX-11/780. Virtual memory is an abstraction – let’s pretend we have a lot of physical memory even though we don’t; TidalScale is an abstraction – let’s pretend we have a big, powerful computer, even though we don’t. In both cases the idea might seem highly questionable to someone whose job is to squeeze the last iota of performance out of a computer.
We recently came across a problem that illustrates how software might be reconsidered in this new, software-defined-server environment.
Customer Problem Statement:
- Consider two tables of historical data for each of, say, 3,000 securities. One table is called “Left” and one “Right”.
- Each table for each security has a column of timestamps, a column containing the name of the security it represents (e.g. “AAPL”), and additional data. The Left table might have, for example, 150 additional columns of data, and the Right table might have, for example, 100 additional columns of data.
When I talk with people about the rapidly increasing volumes of data they rely on to run their business, I describe data growth in terms of the cost of sending a kid to college. Today, tuition and fees at an out-of-state public school average nearly $25,000 a year. If those costs grow at the rate that data is growing – at 62 percent CAGR – then by the time your newborn daughter heads to college, her freshman year alone will cost more than $200 million!
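The compounding behind this comparison can be checked with a short sketch. It assumes a starting cost of $25,000 and a constant 62 percent annual growth rate, both taken from the paragraph above. The function name and the number of years until college are assumptions made for illustration. The result is very sensitive to that horizon: roughly $148 million after 18 years and roughly $239 million after 19.

```python
def projected_cost(initial_cost: float, annual_growth_rate: float, years: int) -> float:
    """Compound an initial cost at a constant annual growth rate (CAGR)."""
    return initial_cost * (1.0 + annual_growth_rate) ** years


# Figures from the text: about $25,000 today, growing at 62 percent a year.
# The horizons (18 and 19 years) are assumptions, not values from the post.
for years in (18, 19):
    cost = projected_cost(25_000, 0.62, years)
    print(f"After {years} years: ${cost:,.0f}")
```

One extra year of growth at this rate multiplies the figure by 1.62, which is why the horizon matters so much.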
Computer science is obsessed with "negative results" and "limits." We seem to delight in pinpointing a terminus for virtually any technology or architecture – to map the place where the party ends.
Take, for example, Amdahl's Law, which seems to suggest that beyond a certain point, adding more parallelism stops improving performance. Amdahl’s Law has kept many from believing that a market exists for bigger single systems, since the law leads us to conclude that larger multicore systems won't solve today's problems any faster.
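For reference, the standard statement of Amdahl's Law gives the speedup on n processors as 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. The sketch below evaluates that formula. The function names and the 95 percent parallel fraction are assumptions chosen for illustration, not figures from this post.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Speedup predicted by Amdahl's Law for a fixed-size problem."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the processor count grows without bound."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


# Hypothetical workload: 95 percent of the work parallelizes.
for n in (24, 96, 384, 1536):
    print(f"{n:>5} cores: {amdahl_speedup(0.95, n):.2f}x")
print(f"limit: {amdahl_limit(0.95):.0f}x")
```

Under these assumptions the speedup flattens toward 20x no matter how many cores are added, which is the intuition the law is usually taken to support.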
Beware the Intuitively Obvious
The reasoning behind Amdahl’s Law turns on an assumption that all the parts of the problem must interact in such a way that
TidalScale has built and operates very large (1.5 TB to 3.0 TB of RAM) TidalPods so customers can test and evaluate their applications on a single, very large virtualized system. We call these PoC (Proof of Concept) systems. We are making these systems available to you for a limited time, at no charge, to test your application(s) in a large-scale environment. The basic testing program:
INTRODUCTION - R IN-MEMORY
When we started searching for large-scale Open R benchmarks, we were surprised to find few good workloads for multi-terabyte TidalScale systems. We ended up writing our own R benchmark that allowed us to scale R workloads to arbitrarily large in-memory sizes. In the process we learned a few tips and tricks for running large workloads with open source R that we thought we'd share.
Like many statistical analytic tools, R can be incredibly memory intensive. A simple GAM (generalized additive model) or K-nearest neighbor routine can devour many times the memory occupied by the starting dataset. And R doesn't always behave nicely when it runs out of memory.
TidalScale, Powering Software-Defined Servers: Executing and analyzing large, complex models poses unique computing challenges. And these challenges are only growing as companies face the need to process, analyze, and act on ever-increasing amounts of data. Typically, IT professionals are faced with two options: scale up by purchasing extremely expensive specialized computers, or scale out by rewriting applications using complex distributed algorithms for running on clusters of standard hardware. One costs money, the other costs time. And in today’s budget-conscious, real-time world, few organizations can afford either.