Querying hundreds of petabytes of data demands fast query performance, especially as data accumulates over time. Keeping queries efficient is a challenge because, over time, tables tend to collect many small files and the data may no longer be optimally organized.
In this video, Dipankar will cover:
Apache Iceberg table format
Problems in the data lake: small files, unorganized files
Techniques such as partitioning, compaction, and metrics filtering
The overlapping metrics problem
Solving it with sorting and Z-order clustering (see the sketch after this list)
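
To give a flavor of what these techniques look like in practice, here is a minimal sketch using PySpark with Iceberg's Spark SQL extensions. It assumes the Iceberg Spark runtime jar is on the classpath; the catalog name (demo), warehouse path, and table/column names (db.events, event_ts, account_id, region) are hypothetical placeholders, not from the episode.

```python
from pyspark.sql import SparkSession

# Sketch only: a local Hadoop-type Iceberg catalog named "demo".
spark = (
    SparkSession.builder
    .appName("iceberg-table-optimization")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.demo.type", "hadoop")
    .config("spark.sql.catalog.demo.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)

# Partitioning: lay data out by a transform of the timestamp column so
# time-filtered queries only scan matching partitions.
spark.sql("""
    CREATE TABLE IF NOT EXISTS demo.db.events (
        event_ts   TIMESTAMP,
        account_id BIGINT,
        region     STRING,
        payload    STRING
    )
    USING iceberg
    PARTITIONED BY (days(event_ts))
""")

# Compaction: rewrite many small data files into fewer, larger ones
# (bin-pack is the default strategy of rewrite_data_files).
spark.sql("""
    CALL demo.system.rewrite_data_files(table => 'db.events')
""")

# Sorting / Z-order clustering: co-locate related values so per-file
# min/max metrics overlap less and metrics filtering can prune more files.
spark.sql("""
    CALL demo.system.rewrite_data_files(
        table => 'db.events',
        strategy => 'sort',
        sort_order => 'zorder(account_id, region)'
    )
""")
```

The Z-ordered rewrite is what addresses the overlapping metrics problem discussed in the episode: once values of the clustered columns are grouped within files, each file's column statistics cover narrower, less overlapping ranges, so the query engine can skip far more files.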
See all upcoming episodes: https://www.dremio.com/gnarly-data-wa...
Connect with us!
Twitter: https://bit.ly/30pcpE1
LinkedIn: https://bit.ly/2PoqsDq
Facebook: https://bit.ly/2BV881V
Community Forum: https://bit.ly/2ELXT0W
Github: https://bit.ly/3go4dcM
Blog: https://bit.ly/2DgyR9B
Questions?: https://bit.ly/30oi8tX
Website: https://bit.ly/2XmtEnN
#datalakehouse #analytics #datawarehouse #datalake #opendatalakehouse #gnarlydatawaves #apacheiceberg #dremio #dremioarctic #datamesh #metadata #modernization #datasharing #datagovernance #ETL #datasilos #datagrowth #selfservice #compliance #arctic #dataascode #branches #tags #optimized #automates #datamovement #zorder #clustering #metrics #filtering #partitioning #sorting #tableformat