Tech Bytes

BiB 058: Build Workflows Around Unstructured Data With Igneous



The following is a transcript of the audio file you can listen to in the player above.
Welcome to Briefings In Brief, an audio digest of IT news and information from the Packet Pushers, including vendor briefings, industry research, and commentary.
I’m Ethan Banks, it’s November 19th, 2018, and here’s what’s happening:
* I had a briefing with Igneous earlier this month. Who’s Igneous? Igneous is focused on providing “as-a-service” solutions for unstructured data, including storage, backup & archival, global metadata indexing, and data workflow management. In this briefing, Igneous discussed recent announcements around their DataProtect, DataDiscover, and DataFlow services.
* Before we get to that, what is unstructured data? Unstructured data is, more or less, data that’s not in a highly structured form such as what you’d find in a relational database. That doesn’t mean unstructured data doesn’t have any structure at all, but rather that the structure is not in a row/column format that lends itself well to carefully constructed queries. You know, like what SQL (Structured Query Language) is all about.
* It’s the lack of nicely “queryable” structure that makes the unstructured data problem an interesting one to solve, as most data found at organizations is, in fact, unstructured. This is true even for big data shops, where high performance computing clusters run Hadoop and similar tools against massive unstructured datasets. In this unstructured data context, you find Igneous solving problems for companies. Let’s walk through the products they discussed with me in this briefing.
* First, DataProtect. DataProtect isn’t completely new, as it is the culmination of Igneous’ archive and backup offerings. The tool does what you’d expect. It’s about protecting your data in the form of backups, archival, and tiering of data to any public cloud, all delivered “as-a-service,” meaning that Igneous takes care of the DataProtect platform itself for you remotely.
* Second, DataDiscover. DataDiscover is about figuring out what’s interesting about your vast swaths of unstructured data by indexing all of the metadata that describes it. This service, by itself, is not about data movement or storage of the data. As the name suggests, DataDiscover wants to help you know what data you’ve got. Perhaps some of your data is old. Maybe there’s a huge volume of datasets, in which case DataDiscover helps you operate at scale. DataDiscover helps data managers understand what’s going on in their data repositories and answer questions such as: Is there data that should be archived? Where are the biggest files? Which files are the oldest? Etc.
* Third, DataFlow. DataFlow is about enabling end user workflows–making the lives of data consumers easier. In other words, this is less about IT folks who are storage professionals, and more about those who use the massive piles of data filling up all the storage volumes. The workflows around datasets tend to follow a lifecycle of capturing, processing, and analyzing data. Let’s give a simple high performance computing example. Imagine that, first, data is created by software. Then, the data is moved into an HPC cluster. The cluster runs software to process the data, generating more data in the form of results. That “results” dataset is copied to a directory for final analysis. Igneous DataFlow helps workflows around processes such as our little HPC example to be accomplished via APIs. The Igneous customer success team will work with the end user to make sure that metadata tagging, Python code, API calls, and overall workflow are functioning as well as possible. Again, remember that the DataFlow product is about the end user perspective. Igneous offered Paige.AI, a cancer research company, as a customer example for DataFlow…
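To make the DataDiscover idea a little more concrete, here is a minimal Python sketch of what "indexing the metadata, not the data" can look like on an ordinary filesystem. To be clear, this is not Igneous' code or API, which the briefing did not cover in detail. The names `FileRecord`, `build_index`, `biggest`, `oldest`, and `archive_candidates` are hypothetical, made up for illustration, and the sketch relies only on the Python standard library. It answers the same kinds of questions mentioned above: where are the biggest files, which are the oldest, and what might be ready for archiving.

```python
import os
import time
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Hypothetical metadata record: describes a file without holding its contents."""
    path: str
    size_bytes: int
    modified: float  # last-modified time in seconds since the epoch


def build_index(root):
    """Walk a directory tree and collect metadata only; file contents are never read."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                st = os.stat(full_path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            records.append(FileRecord(full_path, st.st_size, st.st_mtime))
    return records


def biggest(records, n=3):
    """Return the n largest files by size."""
    return sorted(records, key=lambda r: r.size_bytes, reverse=True)[:n]


def oldest(records, n=3):
    """Return the n files with the earliest modification time."""
    return sorted(records, key=lambda r: r.modified)[:n]


def archive_candidates(records, cutoff):
    """Return files not modified since `cutoff` (epoch seconds)."""
    return [r for r in records if r.modified < cutoff]


if __name__ == "__main__":
    index = build_index(".")
    one_year_ago = time.time() - 365 * 24 * 60 * 60
    print("Biggest:", [r.path for r in biggest(index)])
    print("Oldest:", [r.path for r in oldest(index)])
    print("Archive candidates:", len(archive_candidates(index, one_year_ago)))
```

A flat in-memory list is fine for a single directory tree, but it would not hold up at the scale described in the briefing. The point is only that every question is answered from the index, and the underlying data is never moved or opened.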