2023 Data Teams Summit Speakers & Sessions

Matthew Weingarten

Senior Data Engineer

Jonathan Neo

Data Engineer

Nikolas Schriefer

Director of Product - Global AI

Andrew Jones

Tech Lead & Senior Engineer

Julia King

VP, Data & Analytics

Sarah Floris, MS

Senior Data & ML Engineer

Loris Marini

Data Scientist, Founder & Host

Benjamin Rogojan

Data Consultant

Pravin Kedia

Chief Data Architect

Sanjeev Mohan

Principal

Nirmal Budhathoki

Senior Data & Applied Scientist

Swati Vishwanathan

Data Engineer

Mark Freeman II

Senior Data Scientist

Tobias Zwingmann

Data Science Mentor

Matthew Blasa

Data Scientist & YouTuber

Kunal Agarwal

Co-founder & CEO

Elliot Shmukler

Co-Founder & CEO

Stephan Claus

Director of Data Analytics

Preeti Hemant

Manager, Data Science & Machine Learning

Pragyansmita Nayak

Chief Data Scientist

Brandon Beidel

Director of Product Management

Ben Doremus

Chief Technology Officer

Vishal Ramrakhyani

VP Engineering

Alejandra Cabrera

Data Product Manager

Charles Boicey

CIO & Co-Founder

Mona Rakibe

Co-Founder & CEO

Puppy Tsai

Associate Product Manager

Ali Khalid

Manager, Enterprise Data & Analytics

Monica Kay Royal

Founder & Chief Data Enthusiast

Rob Albritton

VP, AI Practice Lead

Thiago Gil

Ambassador

Shane Murray

Field CTO

Anais Dotis-Georgiou

Lead Developer Advocate

Cristian Barca

Principal Data Engineer

Ted Sfikas

Senior Director of Digital Strategy & Value Engineering, Americas

Richad Nieves-Becker

Sr. Assoc. VP, Data Science

Zach Wright

Solution Engineer

Eric Callahan

Sr. Data Consultant

Andrew Gelinas

Co-Founder

Clinton Ford

Director of Product Marketing

Data Teams Summit 2023 | Session Recordings

Panel: Winning strategies to unleash your data team

Sanjeev Mohan, Principal @ SanjMo

Benjamin Rogojan, Data Consultant @ Seattle Data Guy

Kunal Agarwal, Co-Founder & CEO @ Unravel Data

Great data outcomes depend on successful data teams. Every single day, data teams deal with hundreds of different problems arising from the volume, velocity, variety—and complexity—of the modern data stack.

Learn best practices and winning strategies for what works (and what doesn’t) to help data teams tackle the top day-to-day challenges and unleash innovation.

Panel: Building flexible data teams to improve product delivery

Julia King, VP, Data & Analytics @ Carta

Elliot Shmukler, Co-Founder & CEO @ Anomalo

Stephan Claus, Director of Data Analytics @ Home to Go

Who should you hire to build your data product? A data scientist, data engineer, MLOps, or data analyst, and don’t forget the good old DBA. With many people specializing and marketing teams pushing new labels on us, it is hard to know how to organize your data team.

Learn from three leaders who have been a part of building data products at companies of all sizes on how they have seen data teams evolve and some ways to build flexible teams to improve data product delivery.

Breakout Session: Becoming a data engineering team lead

Matthew Weingarten, Senior Data Engineer @ Disney Streaming

As you progress up the career ladder for data engineering, responsibilities shift as you start to become more hands-off and look at the overall picture rather than a project in particular.

How do you ensure your team's success? It starts with focusing on the team members themselves.

In this talk, Matt Weingarten, a lead Data Engineer at Disney Streaming, will walk through some of his suggestions and best practices for how to be a leader in the data engineering world.

Breakout Session: The future of data orchestration: asset-based orchestration

Jonathan Neo, Data Engineer @ Canva

Data orchestration is a core component for any batch data processing platform and we’ve been using patterns that haven't changed since the 1980s.

In this talk, I’ll be introducing a new pattern and way of thinking for data orchestration known as asset-based orchestration, with data freshness sensors to trigger pipelines. I will demo this new pattern using popular tools of the modern data stack - dbt, Airbyte, and Dagster.

Breakout Session: How to build data quality as a product

Alejandra Cabrera, Data Product Manager @ Clearbit

Mona Rakibe, Co-Founder & CEO @ Telmai

With dozens of new data sources and a drive to experiment with data, many data engineers and data product managers are tackling data quality issues on a daily basis.

In this talk, Clearbit’s data product manager, Ale Cabrera, and Telmai’s CEO/Co-Founder, Mona Rakibe, will talk about how to solve this universal problem with a new, pragmatic approach.

In treating data quality as a first-class citizen and applying the same rigor and principles as any other product, Clearbit has been able to scale the data engineering and data science teams and accelerate product time to market - a best practice that you can learn from.

In this talk, you will learn how to:

Treat any data quality issue or feature enhancement as a product
Define requirements and prioritize engineering work
Build a strategy for proper design, engineering, testing, and user adoption
Create data quality health KPIs to measure the success of your work

Breakout Session: Driving data culture change with data contracts

Andrew Jones, Tech Lead & Senior Engineer @ GoCardless

At GoCardless we’ve been implementing data contracts since 2021, using them as our vehicle to increase the quality of data and drive a culture change in order to become a true data-driven organisation - one that really values its data and uses it to drive our products and business.

Following this talk you’ll understand the problems we’re trying to solve with data contracts, and how through both the tooling and by working with people across the organisation we’re changing our data culture to one where we’re much more deliberate about the data we produce, manage, and consume, leading to better data-driven outcomes across the business.

Breakout Session: Going from DevOps to DataOps

Ali Khalid, Manager, Enterprise Data & Analytics

DevOps has had a massive impact on the web services world. Learn how to leverage those lessons and take them further to improve the quality and speed of delivery for analytics solutions.

Ali's talk will serve as a blueprint for the fundamentals of implementing DataOps, laying out some principles to follow from the DevOps world, and importantly adding subject areas required to get to DataOps - which participants can take back and apply to their teams.

Breakout Session: How data mesh unlocks your next growth chapter

Nikolas Schriefer, Director of Product - Global AI @ Hello Fresh

Do you experience ever-growing volumes of data leading to fragile architectures, increasing load times, exploding BI requests, SLA violations, and long times to insight? Do your data analysts spend lots of time on data wrangling due to missing stakeholder discoverability and accessibility? Are your data engineers fixing data downtime issues because of a lack of standardized data governance, operations, or infrastructure?

In this talk, Nikolas describes how the data mesh helped music marketing platform and publisher network Linkfire.com to overcome these growing pains by establishing a decentralized, domain-driven product and data organization with clear ownership and responsibilities while ensuring global standards and a shared infrastructure to maximize effectiveness and efficiency.

Panel Discussion: What habits do successful data engineers have?

Moderator: Loris Marini, Data Scientist, Founder & Host @ Discovering Data Podcast

Swati Vishwanathan, Data Engineer @ Swinerton

Tobias Zwingmann, Data Science Mentor @ Springboard

Vishal Ramrakhyani, VP Engineering @ Zoomcar

Panel Discussion: Bridging the gap: going from individual contributor to manager

Moderator: Matthew Blasa, Data Scientist & YouTuber @ Datalife 360

Monica Kay Royal, Founder & Chief Data Enthusiast @ Nerd Nourishment

Puppy Tsai, Associate Product Manager @ Coach Art

Rob Albritton, VP, AI Practice Lead @ Octo

Breakout Session: Data team leadership - What types of skills do new senior data scientists need to lead a team?

Matthew Blasa, Data Scientist & YouTuber @ Datalife 360

Learning to lead as a senior data science contributor is a challenge. It's a role that requires individual contributor skills while you learn data leadership. It's an important stepping stone that teaches you the fundamentals for a larger leadership role. This session will discuss the key skills you will need to grow, thrive, and lead a team.

This session will cover:

The responsibilities expected of senior data contributors
Leadership skills needed at the senior data contributor level
How mentoring your reports improves quality, builds their morale, and gets things done
Communication
How to delegate and plan your projects better
Steps you can take now to learn these skills

Breakout Session: Evolution of Analytics at Wattpad

Preeti Hemant, Manager, Data Science & Machine Learning @ Wattpad

Join Preeti as she walks through the structure of Wattpad’s data organization as well as the story of their analytics engagement model and the nature of their data tasks.

Preeti will share issues encountered along the way and how the causes were diagnosed, before covering the proposed solutions and their implementations.

Attendees will walk away understanding how to look at the current state of the data org and observe shifts.

Breakout Session: Maximize business results with FinOps

Thiago Gil, Ambassador @ FinOps Foundation

Clinton Ford, Director of Product Marketing @ Unravel Data

As organizations run more data applications and pipelines in the cloud, they look for ways to avoid the hidden costs of cloud adoption and migration. Teams seek to maximize business results through cost visibility, forecast accuracy, and financial predictability.

In this session, learn why observability matters and how a FinOps approach empowers DataOps and business teams to collaboratively achieve shared business goals. This approach uses the FinOps Framework, taking advantage of the cloud’s variable cost model, and distributing ownership and decision-making through shared visibility to get the biggest return on their modern data stack investments.

See how organizations apply agile and lean principles using the FinOps framework to boost efficiency, productivity, and innovation.

Breakout Panel: What's in store for the future of data engineering?

Eric Callahan, Sr. Data Consultant @ Pickaxe

Brandon Beidel, Director of Product Management @ Red Ventures

Shane Murray, Field CTO @ Monte Carlo

With budgets tightening and data use cases skyrocketing, how can data engineering teams set themselves up for success in 2023?

In this panel conversation, Shane Murray, Field CTO at Monte Carlo, Brandon Beidel, Director of Product Management, Red Ventures, and Eric Callahan, Sr. Data Consultant at Pickaxe Foundry and former Head of Data at Understood, will discuss their predictions for the year’s top challenges - and opportunities - facing data engineering teams in the New Year.

Their conversation will include the future of BI tooling, operationalizing distributed environments like data mesh, implementing data quality initiatives like data contracts, making the case for data with your finance team, and other pressing considerations.

Breakout Session: Self-service data governance

Pravin Kedia, Chief Data Architect @ IBM

Join Pravin for a session focused on data governance, data quality, and DataOps for end customers. He’ll walk through real-world examples using Watson Knowledge Catalog and IBM Cloud Pak for Data, showing how governance has been improved and self-service enabled for business users without data quality and profiling interventions from the IT team.

Attendees will walk away understanding how DataOps teams can finally bridge data governance gaps between business users and IT teams.

Breakout Session: The mesh inside the mesh

Cristian Barca, Principal Data Engineer @ Elsevier

This session covers the challenging journey of migrating a traditional ETL architecture to a domain-oriented decentralized design for processing research data harvested from thousands of open data repositories.

The new architecture enabled the DataMonitor team (a research-data domain and product team within Elsevier) to perform cross-domain data processing, analysis, and linking of tens of millions of records in a semi-streaming fashion! This also allowed them to shorten the update cycle and save infrastructure costs.

Attendees will learn from DataMonitor's experience in how to develop a state-of-the-art data-domain platform as part of their organizational data-mesh architecture to deliver maximum value to the business.

Breakout Session: Unlocking better data quality with machine learning & visualization

Zach Wright, Solution Engineer @ Anomalo

Technologies in the modern data stack enable enterprises to store, transform, and analyze their data. But the existing rule and metric approaches to monitoring the quality of this data are tedious to set up and maintain, fail to catch unexpected issues, and generate false-positive alerts that lead to alert fatigue.

In this session, Zach will dive into how data quality can be automatically monitored with machine learning to discover anomalies and find their root causes at scale. He will cover how visualization can enhance data quality monitoring by making it simple enough for all data consumers to use - from data engineers to data scientists to analysts.

Participants will gain an understanding of how intelligent data quality monitoring is emerging as the new pillar in the modern data stack.

Breakout Session: Methods and tools for time series data science problems with InfluxDB

Anais Dotis-Georgiou, Lead Developer Advocate @ InfluxData

In this talk, we’ll explore the ways a time series platform supports data scientists. We’ll learn how you could use the Telegraf open source collection agent to perform forecasting at the edge.

We’ll explore how you can use the Flux query language to prepare and clean your data and to perform some preliminary data analysis.

Next, we’ll learn about integrations with Jupyter and Zeppelin notebooks that currently exist and the future ML integrations that InfluxData will support.

Finally, we’ll learn about some of InfluxData’s customers in the ML space and dive into how they use InfluxDB to build their own ML solutions.

Breakout Session: Building the best data team is not rocket science

Nirmal Budhathoki, Senior Data & Applied Scientist @ Microsoft

Nirmal’s approach to the right data team focuses on aligning the team's responsibilities with the skillsets you’re recruiting for.

In his talk, he’ll cover what type of data team is best for each scenario and discuss the data verticals data teams can support.

Attendees will walk away understanding the differences between centralized, decentralized, and federated data teams as well as ways each type of team can support data strategy, data governance, and data analytics.

Breakout Session: Designing a modern data center of excellence

Ted Sfikas, Senior Director of Digital Strategy & Value Engineering, Americas @ Tealium

Organizations globally have amassed an enormous amount of customer data that now requires careful oversight and timely activation in order to meet modern expectations. Doing so will bring brands as close as possible to the customer and automate some of the best revenue-generating processes available. You’re going to need some technology automation to help. You will increase revenue streams and optimize operational processes. So how can companies unlock this customer data? Who holds the key? Creating a Data Center of Excellence charts the path to successful data management in every business – large or small.

In this session, you'll learn which major stakeholders benefit from increased revenue and operational excellence, what a Data Center of Excellence (DCoE) is, and business practices and topics in scope for optimization by a DCoE.

Breakout Session: DataOps for digital twins

Pragyansmita Nayak, Chief Data Scientist @ Hitachi Vantara Federal

First mentioned in the early 2000s in the context of the manufacturing domain, the Digital Twins (DT) concept is gaining wider acceptance today as the definite next step in leveraging data as both a strategic and tactical asset. Real-time data is the key requirement for continuously improving and highly sustainable DT applications. This has been made possible largely by the recent advances in data and computing – including, but not limited to, performant data pipelines, big data architectures, data management, cloud-based as-a-service models, artificial intelligence, and advanced analytics (predictive and prescriptive). This advent of analytics and numerical simulations in the operational and monitoring phase of a product lifecycle reinforces the need for trusted real-time data.

This talk will focus on how data management plays a critical role in the core functions of DTs and helps refine the quality of the data assets. The trust and quality of the data need to be maintained and tracked over time by establishing data governance policies and processes. The need for scalable solutions will be stressed – to be specific, the accelerating need for any organization to include a data catalog with automated metadata management as well as lineage for dependency analysis as part of its data strategy.

Panel Discussion: What happens when your infrastructure doesn't scale anymore?

Moderator: Mark Freeman II, Senior Data Scientist @ On the Mark Data

Richad Nieves-Becker, Sr. Assoc. VP, Data Science @ Revantage

Ben Doremus, Chief Technology Officer @ Magenta

Sarah Floris, MS, Senior Data & ML Engineer @ Zwift

Breakout Session: Situational awareness in a technology ecosystem

Charles Boicey, CIO & Co-Founder @ Clearsense

Today’s technology ecosystems are varied and complex. Whether they run on-premises, in the public cloud, or in a combination of environments (multi-cloud), there is a need for continuous 24/7 synthesis of the environment.

Charles will explore the various components of a healthcare-centric data ecosystem and how situational awareness in the clinical environment has been transferred to the technical realm to ensure maximum uptime and efficiency.

Fireside Chat: Enabling strong engineering practices at Maersk

Mark Sear, Head of Data Platform Optimization @ Maersk

Kunal Agarwal, Co-Founder & CEO @ Unravel Data

As DataOps moves along the maturity curve, many organizations are deciphering how to best balance the success of running critical jobs with optimized time and cost governance.

In this fireside chat, Mark Sear, head of data platform optimization for Maersk, shares how his team is driving towards enabling strong engineering practices, design tenets, and culture at one of the largest shipping and logistics companies in the world.

What is the cost to attend and watch the virtual sessions?

Data Teams Summit is always free and open for all to attend.

When is Data Teams Summit 2024?

Data Teams Summit 2024 was held on January 24, 2024.

What is Data Teams Summit?

Data Teams Summit is an annual peer-to-peer day of empowerment for data teams that reflects our focus on the teams and individuals running, managing, and monitoring data pipelines.

Data Teams Summit is a full-day virtual conference led by real-world data practitioners and leaders at future-forward organizations, who share how they're establishing predictability, increasing reliability, and creating economic efficiencies with their data.

Who comes to the Data Teams Summit?

Data professionals and experts including data engineers, administrators, architects, analysts, AI/ML professionals, and relevant data technology leadership.