Storage Unpacked Podcast

#65 – Challenges in Managing Unstructured Data with Shirish Phatak



In this week’s podcast we focus on the issues of managing unstructured data in a distributed world.  Chris and Martin are joined by Shirish Phatak, CEO at Talon Storage.
It’s interesting that “unstructured” proves to have a moveable definition, depending on what you want to include.  While we traditionally think of files and objects as unstructured, these so-called binary pieces of content typically do have structure within them.  Conversely, databases can be made up of unstructured pieces (files, for example) that together take a structured form.
Getting past the definition, we find that data growth depends heavily on the industry, with a minimum of 20% annually, rising to as much as 100%.  As Martin points out, his company assumes that 80% of storage will be full within 6 months of deployment.
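To put those growth rates in perspective, here is a minimal sketch of the compounding arithmetic. The function names and the 100 TB starting figure are illustrative assumptions, not numbers from the episode.

```python
import math


def projected_capacity(initial_tb: float, annual_growth: float, years: int) -> float:
    """Capacity after compounding a fixed annual growth rate (0.2 means 20%)."""
    return initial_tb * (1 + annual_growth) ** years


def years_to_double(annual_growth: float) -> float:
    """Years needed for the data set to double at a fixed annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


# Hypothetical 100 TB estate at the low and high ends of the quoted range.
for rate in (0.20, 1.00):
    print(
        f"{rate:.0%} growth: "
        f"{projected_capacity(100, rate, 3):.1f} TB after 3 years, "
        f"doubles in {years_to_double(rate):.1f} years"
    )
```

At 20% a year the data set takes a little under four years to double. At 100% it doubles every year, which is why the two ends of the range call for very different planning.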
With distributed data, we see processing at the edge and data management at the core.  In practical terms though, this can also mean moving data into the core for more analytics or batch processing.  The conversation highlights how important data consistency and concurrency are in a distributed environment.  It’s easy for users to simply copy and rename a file, throwing data management processes into confusion.
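One common way to spot copy-and-rename duplicates is to compare files by content rather than by name. The following is a minimal sketch of that idea, not something described in the episode or attributed to Talon; `content_digest` and `find_duplicates` are hypothetical helper names.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def content_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's bytes, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are identical, ignoring names."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[content_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

A renamed copy hashes to the same digest as the original, so it shows up in the same group. This only catches byte-identical copies. A file that was copied and then edited needs a different approach, such as tracking versions.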
Finally, the conversation moves to the public cloud, which at present seems to be acting simply as a large, easy-to-use repository.
You can find Talon Storage at https://www.talonstorage.com/ and Shirish on LinkedIn.
Elapsed Time: 00:31:51
Timeline

* 00:00:00 – Intros
* 00:01:00 – What is unstructured data?
* 00:04:30 – Why is unstructured the source of new data growth?
* 00:06:30 – Automated/background tasks creating data
* 00:07:00 – To centralise or not centralise?  What data is actually useful?
* 00:09:00 – How can you define security rules outside the data centre?
* 00:12:00 – Increased volumes of data result in policies, not active management
* 00:13:30 – Consistency and concurrency – enemies of distributed data
* 00:16:00 – One true copy – but at the risk of performance?
* 00:17:30 – You can’t fix stupid users!
* 00:19:00 – Are filesystems at fault?  Do we need ILM (again)?
* 00:22:30 – How is public cloud helping manage data?
* 00:28:30 – Are there any standards or best practices we can follow?
* 00:30:30 – Wrap up

Related Podcasts & Blogs

* #62 – The Future of Data Infrastructure with Scott Hamilton
* #60 – New Data Economy with Derek Dicker
* #57 – Storage on the Edge with Scott Shadley
* DataGravity Pointed The Way to Data Rather than Storage Management
* Conflating Data Protection and Data Mobility
* Technology Choices for Data Mobility in Hybrid Cloud

Shirish’s Bio
Shirish Phatak is the Founder and CEO of Talon. Shirish has over 15 years of experience building scalable, high-performance systems that solve mission-critical information technology challenges.  Shirish was Chairman of the Board and Co-founder of Velocius Networks, a creator of network performance management solutions.