The following text is a transcript of the audio you can listen to in the player above.
Welcome to Briefings In Brief, an audio digest of IT news and information from the Packet Pushers, including vendor briefings, industry research, and commentary.
I’m Ethan Banks, it’s November 27, 2018, and here’s what’s happening. I had a briefing with HammerSpace last month. HammerSpace is a newly launched company that helps you manage your data in shops that operate both in the public cloud and on-premises.
If you said, “Oh, so HammerSpace is a storage company,” you’d be wrong about that. HammerSpace is about managing your data, no matter where it’s stored. In fact, a big part of the HammerSpace value proposition is that you don’t have to care about where the data is stored anymore. You can just get on with using it.
In this briefing, HammerSpace introduced their company, discussed some use cases, and dove deeply into their technology.
Understanding HammerSpace
For storage admins, the simplest way to describe HammerSpace is that they are an abstraction layer between data and metadata. David Flynn, CEO, described it like this: “We have introduced an abstraction which separates the data consumer side and the storage infrastructure side. And that’s really what’s necessary before we can manage data in the cloud.”
HammerSpace becomes your point of entry to the data in your organization. HammerSpace indexes all the metadata, and presents a file system view via NFS or SMB. HammerSpace also manages access to the data, based on the policy that’s been defined.
They described this approach as sort of like DNS. HammerSpace clients talk to the metadata service, obtaining routing information for the file needed. The client then accesses the file directly. This means that HammerSpace’s metadata control plane is not in the data path. Storage platforms and data consumers no longer have to be tightly coupled in this model, as HammerSpace is the abstraction layer in the middle.
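To make the DNS comparison concrete, here is a minimal Python sketch of a lookup that stays out of the data path. It is an illustration only, not HammerSpace's implementation or API: the `MetadataService`, `Location`, and `Client` classes, the store names, and the file path are all hypothetical, and the "storage systems" are plain in-memory dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    store: str  # name of the storage system holding the bytes
    key: str    # where the file lives inside that store


class MetadataService:
    """Hypothetical control plane: maps file paths to locations only."""

    def __init__(self):
        self._index = {}
        self.bytes_served = 0  # never incremented: no file data flows through here

    def register(self, path, location):
        self._index[path] = location

    def lookup(self, path):
        return self._index[path]


class Client:
    """Hypothetical client: asks where a file is, then reads it directly."""

    def __init__(self, metadata, stores):
        self.metadata = metadata
        self.stores = stores  # store name -> {key: bytes}

    def read(self, path):
        loc = self.metadata.lookup(path)        # step 1: ask the metadata service
        return self.stores[loc.store][loc.key]  # step 2: fetch from storage directly


# Made-up storage back ends and one made-up file.
stores = {"onprem-nas": {"vol1/report.csv": b"a,b\n1,2\n"}, "cloud-bucket": {}}
metadata = MetadataService()
metadata.register("/projects/report.csv", Location("onprem-nas", "vol1/report.csv"))

client = Client(metadata, stores)
data = client.read("/projects/report.csv")
```

The client only ever names the file by its path. Which storage system answers the read is decided by the lookup, much as a DNS name resolves to whatever address currently serves it.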
The Parallel NFS Link
If you’re wondering how HammerSpace gets out of the data path, the answer is parallel NFS. Quoting from pnfs.com, “Parallel NFS removes the performance bottleneck in traditional NAS systems by allowing the compute clients to read and write data directly and in parallel, to and from the physical storage devices. The NFS server is used only to control metadata and coordinate access, allowing incredibly fast access to very large data sets from many clients.”
HammerSpace is using this technology, and even has Trond Myklebust, Chief Linux Kernel Maintainer for NFS, on the HammerSpace team as CTO.
For clients without parallel NFS support, HammerSpace offers a Data Service Node to act as a proxy.
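The parallel NFS idea quoted above can be illustrated with a toy Python sketch. This is not the pNFS protocol or any real NFS client code. It only mimics the split of roles: a metadata step hands out a layout, and the client then reads the pieces directly from several data servers at once. The server names, stripe contents, and the `get_layout`, `read_stripe`, and `read_file` functions are all made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "data servers", each holding some stripes of one file.
data_servers = {
    "ds1": {0: b"Hello, ", 2: b"NFS "},
    "ds2": {1: b"parallel ", 3: b"world"},
}


def get_layout(path):
    """Metadata server role: say which server holds each stripe, in file order."""
    return [("ds1", 0), ("ds2", 1), ("ds1", 2), ("ds2", 3)]


def read_stripe(entry):
    """Data path: fetch one stripe straight from the server that holds it."""
    server, stripe = entry
    return data_servers[server][stripe]


def read_file(path):
    layout = get_layout(path)  # one small metadata request
    with ThreadPoolExecutor(max_workers=4) as pool:
        # The reads run concurrently; map() still returns results in layout order.
        return b"".join(pool.map(read_stripe, layout))


content = read_file("/data/greeting.txt")
```

The metadata step returns only a small layout, while the bulk of the bytes comes from the data servers in parallel, which is the bottleneck removal the pnfs.com quote describes.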
The Automated, Policy-Governed Data Mover
All of this technology is lovely, but you might be wondering how this helps with hybrid cloud operations. The other major piece of the HammerSpace puzzle is that HammerSpace can move data around for you, wherever you need it, without operator intervention beyond policy definition.
David Flynn explained it this way. “Once you’re able to present data through the lens of the metadata and manage it through the lens of the metadata, then you can think of it as truly, ‘Data like air. Simply everywhere.’ As a matter of fact, the metadata can be aggressively replicated across the globe at different data centers so that you can have the view of it and when you go to access it, it can get the data to where you need it on an on-demand basis.”
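As a rough illustration of policy-governed data movement, the Python sketch below moves files between two made-up storage locations based on a policy function and updates the metadata index to match. Everything here is an assumption for the sake of the example: the `apply_policy` and `analytics_in_cloud` functions, the store names, and the paths are hypothetical, and nothing in the briefing describes how HammerSpace actually expresses or evaluates policy.

```python
def apply_policy(index, stores, policy):
    """Move each file to the store the policy selects, then update the index.

    index:  path -> (store name, key)   (the metadata view)
    stores: store name -> {key: bytes}  (the storage back ends)
    policy: function taking a path and returning the desired store name
    """
    moved = []
    for path, (store, key) in list(index.items()):
        target = policy(path)
        if target != store:
            stores[target][key] = stores[store].pop(key)  # move the bytes
            index[path] = (target, key)                   # path stays the same
            moved.append(path)
    return moved


def analytics_in_cloud(path):
    """Hypothetical policy: analytics data belongs in the cloud, the rest on-premises."""
    return "cloud" if path.startswith("/analytics/") else "onprem"


stores = {"onprem": {"f1": b"sales", "f2": b"logs"}, "cloud": {}}
index = {
    "/analytics/sales.csv": ("onprem", "f1"),
    "/archive/logs.txt": ("onprem", "f2"),
}

moved = apply_policy(index, stores, analytics_in_cloud)
```

The operator's only input is the policy function. The file keeps its path after the move, so anything that refers to `/analytics/sales.csv` does not need to know the bytes now live somewhere else.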
HammerSpace Use Cases
HammerSpace cited “Cloud Analytics With On-Premises Data” as their main use case currently. The idea is that you have data in-house, and you need to mine that data. However, you want to use a cutting-edge cloud-based tool to do the analysis. Uh-oh. Normally,