DataOps Unleashed | Free Virtual Sessions on March 17, 2021

Data Teams Summit 2021 Speakers & Sessions

Priya Vijayarajendran

Vice President Data & AI

Maxime Beauchemin

CEO & Founder

Kunal Agarwal

Co-Founder & CEO

Abe Gong

Co-Founder & CEO

James Fielder

Senior Data Engineer

Srinivasa Gajula

Lead Engineer

Jeff Lambert

Vice President of Data Solutions

Angelo Carvalho

Principal Solutions Architect

Matthew Carroll

CEO

Shirshanka Das

Co-Founder & CTO

Kevin Davis

Application Engineering Manager

Kumar Menon

SVP Data Fabric & Decision Science Technology

Sarah Gadd

Global Head of Data and Artificial Intelligence Solutions

Sanjeev Mohan

Research Vice President, Big Data & Advanced Analytics

Wayne Eckerson

President

Paloma González Martínez

Chief Data Officer

Patrick Druley

Senior Solution Engineer

Shivnath Babu

Co-Founder & Chief Technology Officer

David Lloyd

Chief Data Officer

David Bath

Vice President of Platforms

Chinmay Sagade

Principal Engineer

Christopher Bergh

CEO, Founder, Head Chef

Suresh Devarakonda

Lead Database Engineer

Justin Borgman

Chairman & CEO

Sandeep Uttamchandani

CDO & VP of Eng.

Ry Walker

Founder & CTO

Vijay Kiran

Head of Data Engineering

Guy Adams

CTO

Nick Acosta

Developer Advocate

Stephen Bailey

Director of Applied Data Science

Tristan Spaulding

Senior Director of Product Management

Andrew Gelinas

Co-Founder

Watch All of the 2021 Sessions On-Demand

Welcome | Andrew Gelinas, Co-Founder @ Solution Monday

Opening Keynote: Unleashing DataOps | Kunal Agarwal, Co-Founder & CEO @ Unravel

DataOps empowers data teams to cost-effectively deliver high-quality data products, with the increasing use of AI and machine learning. With DataOps, teams go deep on specific technologies, such as Hadoop, Hive, Spark, Presto, Kafka, Databricks, and Snowflake. But they also maintain and manage across technologies, from data ingest through data pipelines to on-time delivery of analytics and results. Join DataOps innovator Kunal Agarwal, CEO of Unravel Data, as he describes how companies large and small are using DataOps to make their technology stacks hum, get more done at a lower cost, and improve both customer experience and the bottom line.

A journey to the cloud for Adobe’s corporate data platform | Kevin Davis, Application Engineering Manager @ Adobe

Adobe has just embarked on a multi-year journey to transition their on-premises Hadoop data platform to the cloud. With thousands of users, petabytes of data, and millions of monthly job executions, transitioning to the cloud will be a tremendously challenging task. Join Kevin Davis as he shares the catalysts that started Adobe on this journey, the processes being employed to ensure key customer challenges are addressed in the new environment, and other tools and strategies that are helping along the way. If your organization is contemplating a move to the cloud, this session will provide key insights into the early stages of Adobe’s transition that will help you plan your initiative.

Data Quality in DataOps | Abe Gong, Co-Founder & CEO @ Superconductive

As the world’s leading tool for data quality, Great Expectations occupies a unique position in the DataOps ecosystem. Over the last year, thousands of data scientists, engineers, and analysts have joined the Great Expectations community, making it one of the fastest-growing data communities in the world. In addition, Great Expectations integrates with many other DataOps tools, giving our developers a unique perspective on how the ecosystem is developing.

This presentation will share examples, patterns, and emerging best practices for data quality from the Great Expectations community. The first half of the talk will focus on nuts-and-bolts engineering, including common use cases and deployment patterns for data quality. The second half of the talk will share learnings for how data quality and DataOps are reshaping data workflows and collaboration. Together this presentation will give you a clear view into how to get started with data quality, and where the field is going as a whole.

DataOps automation and orchestration with Fivetran, dbt, and the modern data stack | Nick Acosta, Developer Advocate @ Fivetran

Many organizations struggle with creating repeatable and standardized processes for their data pipeline. Fivetran reduces pipeline complexity by fully managing the extraction and loading of data from a source to a destination and orchestrating transformations in the warehouse.

This talk will explain and evaluate the various benefits currently available using a DataOps approach with Fivetran and the rest of the modern data stack.

DataOps principles and practices | Vijay Kiran, Head of Data Engineering @ Soda

DataOps has grown because of the need to support execution at scale in the data management space. During this session, Vijay Kiran, Head of Data Engineering at Soda Data, will explain why the practice of DataOps is fundamental to monitoring and managing data as it moves across the stack, from source to data product, so that the business receives trusted data and can transform how data analytics works. Attendees will walk away with DataOps principles and practices that deliver value from data.

Observability of Apache Airflow | Ry Walker, Founder & CTO @ Astronomer

Apache Airflow has become an important tool in the modern data stack. We will explore the current state of observability of Airflow, common pitfalls if you haven't planned for observability, and chart a course for where we can take it going forward.

The evolution of a data platform | James Fielder, Senior Data Engineer @ Cox Automotive

Designing a data platform is no easy task, particularly when there are new technologies, techniques, and approaches appearing every week. At Cox Auto UK we have been on a journey from manually deployed Hadoop clusters to a full platform-as-a-service setup using Azure Databricks. This journey hasn’t always been smooth, however, and we’ve learned some things along the way! In this talk, we will examine how we have made design choices while evolving our platform, our decision to open source some of our work, and what our past, present, and future look like.

A modern data estate is the key to data-led digital transformation | Priya Vijayarajendran, VP Data & AI @ Microsoft

At Microsoft, we believe that a modern, cloud-based data strategy is the foundation of successful digital transformation. Microsoft has helped innumerable organizations with that journey and we know that making the shift to the cloud or embracing a hybrid approach can be challenging. In this talk, Microsoft VP of Data and AI Priya Vijayarajendran will describe the benefits of data-led digital transformation and share her insights on what it takes to build a data-first culture. She’ll share how a modern data estate, powered by the right operational and analytics databases and employing AI and ML, can create cloud-native user experiences that give your enterprise a competitive edge. If your organization is ready to embrace the cloud but unsure of how to take the first steps, this session will provide the inspiration, insights, and strategies to help you succeed.

How to Find a Misbehaving Model | Tristan Spaulding, Senior Director of Product Management @ DataRobot

Monitoring machine learning models once they are deployed can make the difference between creating a competitive advantage with ML and suffering setbacks that erode trust with your users and customers. But measuring ML model quality in production environments requires a different perspective and toolbox than monitoring normal software applications. In this talk, I share some practical techniques for identifying decaying models, along with strategies for providing this protection at scale in large organizations.

DataOps for the new data stack | Shivnath Babu, Co-Founder & CTO @ Unravel

This talk demystifies the new data stack that thousands of companies are deploying to convert data into insights continuously and with high agility. This stack continues to evolve with the emergence of new data roles like analytics engineers and ML engineers as well as new data technologies like lake houses and data validation. A new wave of operational challenges has emerged with this stack that, unless addressed from day one, will derail its success. Shivnath will discuss these DataOps challenges and the best practices to address them. The talk will be accompanied by a brief demonstration.

Driving DataOps Culture with LinkedIn DataHub | Shirshanka Das, Co-Founder & CTO @ Acryl Data

Your data is not changing slowly, so why should your metadata?

LinkedIn DataHub was open-sourced to enable other organizations to harness the power of metadata and unleash excellent DataOps practices. Doing DataOps well requires bringing together multiple disciplines of data science, data analytics, and data engineering into a cohesive unit. However, this is complicated, because there are a wide variety of data tools that are in use by these different tribes. Shirshanka, who founded and architected DataHub at LinkedIn, will describe its journey in enabling DataOps use-cases on top of the metadata platform. He will also showcase the latest integrations and features in the tool and share the roadmap for the project.

ELT-G: Locating governance in the modern data stack | Stephen Bailey, Director of Applied Data Science @ Immuta

ELT is a data ingestion pattern that promotes an "extract first, model later" approach to building data workflows. While it saved time for data teams and enabled more agile development, the buzz about ELT has not given proper credit to its silent G: governance. Unlike transformations, proper governance, and in particular, securing access to data, cannot be deferred until later, and requires clear, consistent principles to be implemented by data teams.

During his talk, Stephen will provide a framework for thinking about data governance in an ELT landscape, introduce policy-based access controls, and provide some suggestions for data teams to get started with better governance today.

Best practices for optimizing your big data costs with Amazon EMR | Angelo Carvalho, Principal Solutions Architect @ AWS

As data volumes increase, so do the costs of processing it. We’ll review several best practices and new features that enable you to cut operating costs and create efficiencies when processing vast amounts of data using Amazon EMR. Session attendees will be able to walk away with a solid understanding of Managed Scaling, improving Apache Spark performance to help lower their Amazon EMR costs, and monitoring, tuning, and troubleshooting solutions for big data workloads on Amazon EMR.

The state of DataOps | Wayne Eckerson, President @ Eckerson Group and Kunal Agarwal, Co-Founder & CEO @ Unravel

There is a lot of interest in DataOps, but many people are confused about what it is and isn't. Is DataOps a methodology for building pipelines? A set of development and execution tools? Or a process for continuous improvement? This session will clear up the confusion and help align our understanding before we dive into details during breakout sessions.

To start this fireside chat, veteran data and analytics thought leader Wayne Eckerson will deliver a short presentation based on three years of research that describes the core principles and components of DataOps. He will then sit down with Unravel CEO Kunal Agarwal to discuss the trends, challenges, and best practices required to succeed with DataOps. The goal is to give attendees a clear, concise, and unbiased understanding of DataOps with guidance about where to start and how to implement it.

Panel: Creating a data-driven culture | Moderated by Sanjeev Mohan, Research Vice President, Big Data & Advanced Analytics @ Gartner

Panelists:

Sarah Gadd, Global Head of Data and Artificial Intelligence Solutions @ Credit Suisse

Kumar Menon, SVP Data Fabric & Decision Science Technology @ Equifax

David Lloyd, Chief Data Officer @ Ceridian

Paloma González Martínez, Chief Data Officer @ AlphaCredit

More and more companies are adding a Chief Data Officer (CDO) and other leadership roles to the executive suite, often leading an organization-wide transformation to a data-driven culture. CDOs and other senior technologists face a wide range of challenges. At the same time, the progress of data technology - especially cloud services, AI, and machine learning - is opening up new opportunities.

Our panelists will share their insights and describe the strategies they’re using to move data-driven decision-making to the core of organizational processes, products, and services.

Things you may not know about Apache Kafka but should | Patrick Druley, Senior Solution Engineer @ Confluent

In this session, you will learn about some of the common misconceptions, best practices, and little-known facts about Apache Kafka. Event streaming has changed the way businesses think about data movement and integration. Whether you are new to Kafka or have been creating topics and developing clients for years, there's something for everyone in this fun and informative session.

Apache Superset for Data Engineers | Maxime Beauchemin, CEO & Founder @ Preset

Superset is the leading open source data exploration and visualization platform. In this talk, we’ll be presenting Superset with a focus on advanced topics that are most relevant to Data Engineers. The presentation will include a live demo of the product, and dive into advanced topics including the alert & report framework, the REST API, and building custom visualization plugins.

Building Checkpoints in your DataOps | Sandeep Uttamchandani, CDO & VP of Engineering @ Unravel

Behind every successful insight (BI analytics or ML model) is a reliable data pipeline! These pipelines are planned, implemented, deployed, and monitored in an ongoing fashion referred to as the DataOps infinity loop (similar to CI/CD for traditional software). This talk covers battle scars in managing DataOps at scale, and how building checkpoints in the DataOps loop can reduce missed SLAs, cost outages, escalation from data users, and most importantly avoid data pipeline surprises!

Universal Data Authorization for Your Data Platform: What is It and Why Now? | Mary Flynn, Senior Director of Product Marketing @ Okera

With all the advances in DataOps, many data-driven initiatives still fail. Why? Because organizations still struggle to resolve two problems as old as data itself: people can retrieve and use data they should not have access to, and other people cannot access data for legitimate business purposes.

In this session, you’ll learn what Universal Data Authorization is and how adding it to your modern technology stack brings clarity and appropriate control across your entire data platform. You’ll learn why fine-grained access control and de-identification techniques are the new table stakes, and why success at enterprise scale is only achieved through an API-first platform approach with delegated stewardship and full visibility for audit and reporting. Join this session to learn how your company can accelerate business agility, minimize data security risks, and demonstrate regulatory compliance.

How 84.51° Slashed Operational Costs & Improved DataOps Efficiency by Solving Problems with Small Files | Jeff Lambert, Vice President of Data Solutions @ Kroger/84.51° and Suresh Devarakonda, Lead Database Engineer @ Kroger/84.51°

Hear from 84.51° as they give a 30,000 ft view into their management of YARN and Impala. They will share how they solved challenges associated with small files and used a centralized DataOps approach to troubleshoot issues with their big data pipelines. 84.51° will also draw on their executive dashboards to share key learnings that can help your business improve efficiency and reduce operational costs.

Improving platform resiliency by detecting harmful workloads | Chinmay Sagade, Principal BizOps Engineer @ Mastercard and Srinivasa Gajula, Lead Engineer @ Mastercard

Big Data unlocks tremendous opportunities. The distributed platforms which enable this capability are dependent on optimized workloads to make efficient use of available resources. In this talk, we will be presenting an application monitoring system created at Mastercard, which detects harmful workloads and helps maintain the business goals on resiliency and latency.

Founders' Roundtable | Panel Discussion

Moderator: Wayne Eckerson, President @ Eckerson Group

Kunal Agarwal, Co-Founder & CEO @ Unravel

Matthew Carroll, CEO @ Immuta

Christopher Bergh, CEO, Founder, and Head Chef @ DataKitchen

Ry Walker, Founder & CTO @ Astronomer

Justin Borgman, Chairman & CEO @ Starburst

Join Wayne Eckerson and Unravel’s Kunal Agarwal for a Founders' Roundtable with Astronomer's Ry Walker, DataKitchen's Christopher Bergh, Immuta's Matthew Carroll, and Starburst's Justin Borgman as we look at the evolution of data and the rapid adoption of DataOps.

Our participants join us to share their perspectives on their place in the fast-changing DataOps world. They will share their thoughts on why this community and these conversations are so important today and give real-life examples of how they are working to help organizations realize value from their data.

This roundtable is sure to be lively. Each of our panelists is an innovator in their respective realm; we invite you to attend and help us tap their expertise for innovative DataOps strategies.

What is the cost to attend and watch the virtual sessions?

Data Teams Summit is always free and open for all to attend.

When is Data Teams Summit 2024?

Data Teams Summit 2024 was held on January 24, 2024.

What is Data Teams Summit?

Data Teams Summit is an annual peer-to-peer day of empowerment for data teams that reflects our focus on the teams and individuals running, managing, and monitoring data pipelines.

Data Teams Summit is a full-day virtual conference in which real-world data practitioners and leaders at future-forward organizations share how they're establishing predictability, increasing reliability, and creating economic efficiencies with their data.

Who comes to the Data Teams Summit?

Data professionals and experts including data engineers, administrators, architects, analysts, AI/ML professionals, and relevant data technology leadership.