Let's empower data teams, together
Live virtual peer-to-peer sessions
Jan 25, 2023
Data Teams Summit 2023 Speakers

Matthew Weingarten
Senior Data Engineer
Benjamin Rogojan
Data Consultant

Jonathan Neo
Data Engineer


Sanjeev Mohan
Principal

Julia King
VP, Data & Analytics

Kunal Agarwal
Co-founder & CEO


Andrew Jones
Tech Lead & Senior Engineer


Nikolas Schriefer
Director of Product - Global AI


Tobias Zwingmann
Data Science Mentor


Matthew Blasa
Data Scientist & YouTuber


Elliot Shmukler
Co-Founder & CEO

Stephan Claus
Director of Data Analytics


Preeti Hemant
Manager, Data Science & Machine Learning


Mark Freeman II
Senior Data Scientist

Brandon Beidel
Director of Product Management


Pravin Kedia
Chief Data Architect

Vishal Ramrakhyani
VP Engineering


Alejandra Cabrera
Data Product Manager


Swati Vishwanathan
Data Engineer

Loris Marini
Data Scientist, Founder & Host


Puppy Tsai
Associate Product Manager


Ali Khalid
Manager, Enterprise Data & Analytics

Mona Rakibe
Co-Founder & CEO


Rob Albritton
VP, AI Practice Lead


Monica Kay Royal
Founder & Chief Data Enthusiast

Clinton Ford
Director of Product Marketing


Nirmal Budhathoki
Senior Data & Applied Scientist


Pragyansmita Nayak
Chief Data Scientist


Sarah Floris, MS
Senior Data & ML Engineer


Thiago Gil
Ambassador


Anais Dotis-Georgiou
Lead Developer Advocate

Cristian Barca
Principal Data Engineer


Shane Murray
Field CTO


Richad Nieves-Becker
Sr. Assoc. VP, Data Science


Ben Doremus
Chief Technology Officer


Zach Wright
Solution Engineer

Eric Callahan
Sr. Data Consultant


Ted Sfikas
Senior Director of Digital Strategy & Value Engineering, Americas


Charles Boicey
CIO & Co-Founder

Andrew Gelinas
Co-Founder

Official Peer-Built Agenda | Data Teams Summit | Session Recordings
Keynote Panel: Winning strategies to unleash your data team
Sanjeev Mohan, Principal @ SanjMo
Benjamin Rogojan, Data Consultant @ Seattle Data Guy
Kunal Agarwal, Co-Founder & CEO @ Unravel Data
Great data outcomes depend on successful data teams. Every single day, data teams deal with hundreds of different problems arising from the volume, velocity, variety—and complexity—of the modern data stack.
Learn best practices and winning strategies for what works (and what doesn’t) to help data teams tackle the top day-to-day challenges and unleash innovation.
Keynote Panel: Building flexible data teams to improve product delivery
Julia King, VP, Data & Analytics @ Carta
Elliot Shmukler, Co-Founder & CEO @ Anomalo
Stephan Claus, Director of Data Analytics @ HomeToGo
Who should you hire to build your data product? A data scientist, data engineer, MLOps, or data analyst, and don’t forget the good old DBA. With many people specializing and marketing teams pushing new labels on us, it is hard to know how to organize your data team.
Learn from three leaders who have been a part of building data products at companies of all sizes on how they have seen data teams evolve and some ways to build flexible teams to improve data product delivery.
Breakout Session: Becoming a data engineering team lead
Matthew Weingarten, Senior Data Engineer @ Disney Streaming
As you progress up the career ladder for data engineering, responsibilities shift as you start to become more hands-off and look at the overall picture rather than a project in particular.
How do you ensure your team's success? It starts with focusing on the team members themselves.
In this talk, Matt Weingarten, a lead Data Engineer at Disney Streaming, will walk through some of his suggestions and best practices for how to be a leader in the data engineering world.
Breakout Session: The future of data orchestration: asset-based orchestration
Jonathan Neo, Data Engineer @ Canva
Data orchestration is a core component for any batch data processing platform and we’ve been using patterns that haven't changed since the 1980s.
In this talk, I’ll introduce a new pattern and way of thinking for data orchestration known as asset-based orchestration, with data freshness sensors to trigger pipelines. I will demo this new pattern using popular tools of the modern data stack - dbt, Airbyte, and Dagster.
Breakout Session: How to build data quality as a product
Alejandra Cabrera, Data Product Manager @ Clearbit
Mona Rakibe, Co-Founder & CEO @ Telmai
With dozens of new data sources and a drive to experiment with data, many data engineers and data product managers are tackling data quality issues on a daily basis.
In this talk, Clearbit’s data product manager, Ale Cabrera, and Telmai’s CEO/Co-Founder, Mona Rakibe, will talk about how to solve this universal problem with a new, pragmatic approach.
In treating data quality as a first-class citizen and applying the same rigor and principles as any other product, Clearbit has been able to scale the data engineering and data science teams and accelerate product time to market - a best practice that you can learn from.
In this talk, you will learn how to:
- Treat any data quality issue or feature enhancement as a product
- Define requirements and prioritize engineering work
- Build a strategy for proper design, engineering, testing, and user adoption
- Create data quality health KPIs to measure the success of your work
Breakout Session: Driving data culture change with data contracts
Andrew Jones, Tech Lead & Senior Engineer @ GoCardless
At GoCardless we’ve been implementing data contracts since 2021, using them as our vehicle to increase the quality of data and drive a culture change in order to become a true data-driven organisation - one that really values its data and uses it to drive our products and business.
Following this talk you’ll understand the problems we’re trying to solve with data contracts, and how through both the tooling and by working with people across the organisation we’re changing our data culture to one where we’re much more deliberate about the data we produce, manage, and consume, leading to better data-driven outcomes across the business.
Breakout Session: Going from DevOps to DataOps
Ali Khalid, Manager, Enterprise Data & Analytics
DevOps has had a massive impact on the web services world. Learn how to leverage those lessons and take them further to improve the quality and speed of delivery for analytics solutions.
Ali's talk will serve as a blueprint for the fundamentals of implementing DataOps, laying out some principles to follow from the DevOps world, and importantly adding subject areas required to get to DataOps - which participants can take back and apply to their teams.
Breakout Session: How data mesh unlocks your next growth chapter
Nikolas Schriefer, Director of Product - Global AI @ HelloFresh
Do you experience ever-growing volumes of data leading to fragile architectures, increasing load times, exploding BI requests, SLA violations, and long times to insight? Do your data analysts spend lots of time on data wrangling due to missing stakeholder discoverability and accessibility? Are your data engineers fixing data downtime issues because of a lack of standardized data governance, operations, or infrastructure?
In this talk, Nikolas describes how the data mesh helped music marketing platform and publisher network Linkfire.com to overcome these growing pains by establishing a decentralized, domain-driven product and data organization with clear ownership and responsibilities while ensuring global standards and a shared infrastructure to maximize effectiveness and efficiency.
Panel Discussion: What habits do successful data engineers have?
Moderator: Loris Marini, Data Scientist, Founder & Host @ Discovering Data Podcast
Swati Vishwanathan, Data Engineer @ Swinerton
Tobias Zwingmann, Data Science Mentor @ Springboard
Vishal Ramrakhyani, VP Engineering @ Zoomcar
Panel Discussion: Bridging the gap: going from individual contributor to manager
Moderator: Matthew Blasa, Data Scientist & YouTuber @ Datalife 360
Monica Kay Royal, Founder & Chief Data Enthusiast @ Nerd Nourishment
Puppy Tsai, Associate Product Manager @ Coach Art
Rob Albritton, VP, AI Practice Lead @ Octo
Breakout Session: Data team leadership - What types of skills do new senior data scientists need to lead a team?
Matthew Blasa, Data Scientist & YouTuber @ Datalife 360
Learning to lead as a senior data science contributor is a challenge. It's a role that requires both individual contributor skills and a growing command of data leadership. It's an important stepping stone that teaches you the fundamentals for a larger leadership role. This session will discuss the key skills you will need to grow, thrive, and lead a team.
This session will cover:
- The responsibilities expected of senior data contributors
- Leadership skills needed at the senior data contributor level
- How mentoring your reports improves quality, builds their morale, and gets things done
- Communication
- How to delegate and plan your projects better
- Steps you can take now to learn these skills
Breakout Session: Evolution of Analytics at Wattpad
Preeti Hemant, Manager, Data Science & Machine Learning @ Wattpad
Join Preeti as she walks through the structure of Wattpad’s data organization as well as the story of their analytics engagement model and the nature of their data tasks.
Preeti will share issues encountered along the way and how the causes were diagnosed, before covering the proposed solutions and their implementations.
Attendees will walk away understanding how to look at the current state of the data org and observe shifts.
Breakout Session: Maximize business results with FinOps
Thiago Gil, Ambassador @ FinOps Foundation
Clinton Ford, Director of Product Marketing @ Unravel Data
As organizations run more data applications and pipelines in the cloud, they look for ways to avoid the hidden costs of cloud adoption and migration. Teams seek to maximize business results through cost visibility, forecast accuracy, and financial predictability.
In this session, learn why observability matters and how a FinOps approach empowers DataOps and business teams to collaboratively achieve shared business goals. This approach uses the FinOps Framework, taking advantage of the cloud’s variable cost model, and distributing ownership and decision-making through shared visibility to get the biggest return on their modern data stack investments.
See how organizations apply agile and lean principles using the FinOps framework to boost efficiency, productivity, and innovation.
Breakout Panel: What's in store for the future of data engineering?
Eric Callahan, Sr. Data Consultant @ Pickaxe
Brandon Beidel, Director of Product Management @ Red Ventures
Shane Murray, Field CTO @ Monte Carlo
With budgets tightening and data use cases skyrocketing, how can data engineering teams set themselves up for success in 2023?
In this panel conversation, Shane Murray, Field CTO at Monte Carlo, Brandon Beidel, Director of Product Management at Red Ventures, and Eric Callahan, Sr. Data Consultant at Pickaxe Foundry and former Head of Data at Understood, will discuss their predictions for the year’s top challenges - and opportunities - facing data engineering teams in the New Year.
Their conversation will include the future of BI tooling, operationalizing distributed environments like data mesh, implementing data quality initiatives like data contracts, making the case for data with your finance team, and other pressing considerations.
Breakout Session: Self-service data governance
Pravin Kedia, Chief Data Architect @ IBM
Join Pravin for a session focused on data governance, data quality, and DataOps for end customers. He’ll walk through real-world examples using Watson Knowledge Catalog and IBM Cloud Pak for Data, showing how improved governance has enabled self-service for business users without data quality and profiling interventions from the IT team.
Attendees will walk away understanding how DataOps teams can finally bridge data governance gaps between business users and IT teams.
Breakout Session: The mesh inside the mesh
Cristian Barca, Principal Data Engineer @ Elsevier
The challenging journey of migrating a traditional ETL architecture to a domain-oriented decentralized design for processing research data harvested from thousands of open data repositories.
The new architecture enabled the DataMonitor team (a research-data domain and product team within Elsevier) to perform cross-domain data processing, analysis, and linking of tens of millions of records in a semi-streaming fashion! This also allowed them to speed up the update time period and save infrastructure costs.
Attendees will learn from DataMonitor's experience how to develop a state-of-the-art data-domain platform as part of their organizational data-mesh architecture to deliver maximum value to the business.
Breakout Session: Unlocking better data quality with machine learning & visualization
Zach Wright, Solution Engineer @ Anomalo
Technologies in the modern data stack enable enterprises to store, transform, and analyze their data. But the existing rule and metric approaches to monitoring the quality of this data are tedious to set up and maintain, fail to catch unexpected issues, and generate false-positive alerts that lead to alert fatigue.
In this session, Zach will dive into how data quality can be automatically monitored with machine learning to discover anomalies and find their root causes at scale. He will cover how visualization can enhance data quality monitoring by making it simple enough for all data consumers to use - from data engineers to data scientists to analysts.
Participants will gain an understanding of how intelligent data quality monitoring is emerging as the new pillar in the modern data stack.
Breakout Session: Methods and tools for time series data science problems with InfluxDB
Anais Dotis-Georgiou, Lead Developer Advocate @ InfluxData
In this talk, we’ll explore the ways a time series platform supports data scientists. We’ll learn how you could use the Telegraf open source collection agent to perform forecasting at the edge.
We’ll explore how you can use the Flux query language to prepare and clean your data, as well as to perform some preliminary data analysis.
Next, we’ll learn about integrations with Jupyter and Zeppelin notebooks that currently exist and the future ML integrations that InfluxData will support.
Finally, we’ll learn about some of InfluxData’s customers in the ML space and dive into how they use InfluxDB to build their own ML solutions.
Breakout Session: Building the best data team is not rocket science
Nirmal Budhathoki, Senior Data & Applied Scientist @ Microsoft
Nirmal’s approach to the right data team focuses on aligning the team's responsibilities with the skillsets you’re recruiting for.
In his talk, he’ll cover what type of data team is best for each scenario and discuss the data verticals data teams can support.
Attendees will walk away understanding the differences between centralized, decentralized, and federated data teams as well as ways each type of team can support data strategy, data governance, and data analytics.
Breakout Session: Designing a modern data center of excellence
Ted Sfikas, Senior Director of Digital Strategy & Value Engineering, Americas @ Tealium
Organizations globally have amassed an enormous amount of customer data that now requires careful oversight and timely activation in order to meet modern expectations. Doing so will bring brands as close as possible to the customer and automate some of the best revenue-generating processes available. You’re going to need some technology automation to help. You will increase revenue streams and optimize operational processes. So how can companies unlock this customer data? Who holds the key? Creating a Data Center of Excellence charts the path to successful data management in every business – large or small.
In this session, you'll learn which major stakeholders benefit from increased revenue and operational excellence, what a Data Center of Excellence (DCoE) is, and business practices and topics in scope for optimization by a DCoE.
Breakout Session: DataOps for digital twins
Pragyansmita Nayak, Chief Data Scientist @ Hitachi Vantara Federal
First mentioned in the early 2000s, in the context of the manufacturing domain, the Digital Twins (DT) concept is gaining wider acceptance today as the definite next step in leveraging data as both a strategic and tactical asset. Real-time data is the key requirement for continuously improving and highly sustainable DT applications. This has been made possible largely by the recent advances in data and computing – including, but not limited to, performant data pipelines, big data architectures, data management, cloud-based as-a-service models, artificial intelligence, and advanced analytics (predictive and prescriptive). This advent of analytics and numerical simulations in the operational and monitoring phase of a product lifecycle reinforces the need for trusted real-time data.
This talk will focus on how data management plays a critical role in the core functions of DTs and helps refine the quality of the data assets. The trust and quality of the data need to be maintained and tracked over time by establishing data governance policies and processes. The need for scalable solutions will be stressed – to be specific, the accelerating need for any organization to include a data catalog with automated metadata management as well as lineage for dependency analysis as part of its data strategy.
Panel Discussion: What happens when your infrastructure doesn't scale anymore?
Moderator: Mark Freeman II, Senior Data Scientist @ On the Mark Data
Richad Nieves-Becker, Sr. Assoc. VP, Data Science @ Revantage
Ben Doremus, Chief Technology Officer @ Magenta
Sarah Floris, MS, Senior Data & ML Engineer @ Zwift
Breakout Session: Situational awareness in a technology ecosystem
Charles Boicey, CIO & Co-Founder @ Clearsense
Today’s technology ecosystems are varied and complex. Whether they run on-premises, in a public cloud, or in a combination of all (multi-cloud), there is a need for continuous 24/7 synthesis of the environment.
Charles will explore the various components of a healthcare-centric data ecosystem and how situational awareness in the clinical environment has been transferred to the technical realm to ensure maximum uptime and efficiency.
Fireside Chat: Enabling strong engineering practices at Maersk
Mark Sear, Head of Data Platform Optimization @ Maersk
Kunal Agarwal, Co-Founder & CEO @ Unravel Data
As DataOps moves along the maturity curve, many organizations are deciphering how to best balance the success of running critical jobs with optimized time and cost governance.
In this fireside chat, Mark Sear, head of data platform optimization for Maersk, shares how his team is driving towards enabling strong engineering practices, design tenets, and culture at one of the largest shipping and logistics companies in the world.
What is the cost to attend and watch the virtual sessions?
Data Teams Summit is always free and open for all to attend.
What is Data Teams Summit?
This year, we've taken the peer-to-peer empowerment of data teams one step further and formally transformed DataOps Unleashed into Data Teams Summit to better reflect our focus on the teams and individuals running, managing, and monitoring data pipelines.
Data Teams Summit is an annual, full-day virtual conference led by data rockstars at future-forward organizations, who share how they're establishing predictability, increasing reliability, and creating economic efficiencies with their data pipelines.
We'll be back again in 2024 and look forward to another round of data-team-focused conversations.
Who comes to Data Teams Summit?
Data professionals and experts including data engineers, administrators, architects, analysts, AI/ML professionals, and relevant data technology leadership.
Join us for sessions on:
- Data teams & best practices
- Data pipelines & applications
- DataOps observability
- Data quality & data governance
- Operations observability
- MLOps
- Data modernization & architecture
- Biz/FinOps observability
Interested in speaking at Data Teams Summit 2024?
Submit your talk proposal at datateamssummit.com/cfp/ or send us a note to astronaut@solutionmonday.com.
Thank you to our 2023 sponsors who made this event possible
Interested in speaking at Data Teams Summit or participating as a sponsor?
Please contact mike@solutionmonday.com.