What Could Go Wrong with a Kafka JDBC Connector?
Java Database Connectivity (JDBC) is the Java API used to connect to a database. The JDBC connector is one of the most popular Kafka connectors, so it's important to prevent issues with your integrations.
In this episode, we'll cover how a JDBC connection works and the common issues you can run into with your database connection.
Why the Kafka JDBC Connector?
When it comes to streaming database events into Apache Kafka®, the JDBC connector is usually the first choice because of its flexibility and its ability to support a wide variety of databases without requiring custom code. An experienced data analyst, Francesco Tisiot (Senior Developer Advocate, Aiven) shares his experience of building streaming data pipelines into Kafka with the JDBC source connector and explains what could go wrong. He also discusses alternatives that avoid these problems, including the Debezium source connector for real-time change data capture.
The JDBC connector is a Kafka Connect connector, built on the Java JDBC API, that streams data between databases and Kafka. If you want to stream data from a relational database into Kafka once per day or every two hours, the JDBC connector is a simple batch-processing connector to use. You tell the JDBC connector which query to execute against the database, and the connector loads the results into Kafka.
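As a rough illustration of the "query every two hours" setup described above, the sketch below builds the JSON payload you might send to the Kafka Connect REST API. It assumes the Confluent JDBC source connector and its documented property names; the connector name, connection URL, credentials, query, and topic name are all hypothetical placeholders and do not come from the episode.

```python
import json

TWO_HOURS_MS = 2 * 60 * 60 * 1000  # polling interval in milliseconds

# Hypothetical connector definition: every value below is a placeholder.
payload = {
    "name": "orders-jdbc-source",
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:postgresql://localhost:5432/shop",
        "connection.user": "connect_user",
        "connection.password": "change-me",
        # "bulk" mode re-runs the whole query on every poll (batch style).
        "mode": "bulk",
        "query": "SELECT id, status, updated_at FROM orders",
        "poll.interval.ms": str(TWO_HOURS_MS),
        # With a custom query, the prefix is used as the full topic name.
        "topic.prefix": "orders_snapshot",
    },
}

# The serialized form is what would be POSTed to the Connect REST API.
body = json.dumps(payload, indent=2)
print(body)
```

Bulk mode keeps the example close to the batch behaviour described in the text; the incrementing and timestamp modes trade that simplicity for smaller polls.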
The connector works well with basic, out-of-the-box data types. Database-specific data types, however, such as geometric columns and array columns in PostgreSQL, are not represented well by the JDBC connector. In some cases you might get no results in Kafka at all because the column type is outside what the connector supports. Francesco shares other cases that can cause the JDBC connector to go wrong, such as:
- Infrequent snapshot times
- Out-of-order events
- Non-incremental sequences
- Hard deletes
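Two of the failure modes listed above can be reproduced without Kafka at all. The sketch below is a simplified simulation, not the connector's actual code: it uses an in-memory SQLite table (the `orders` table and its columns are made up for illustration) and a hand-written `poll` function that mimics a timestamp-based polling query.

```python
import sqlite3

def poll(conn, last_seen):
    """Return rows changed since last_seen, mimicking a timestamp-mode query."""
    rows = conn.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # Advance the offset only when the poll actually returned rows.
    new_last_seen = rows[-1][2] if rows else last_seen
    return rows, new_last_seen

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at INTEGER)"
)

# t=1: a row is inserted, then the first poll runs.
conn.execute("INSERT INTO orders VALUES (1, 'created', 1)")
first_poll, last_seen = poll(conn, 0)

# t=2 and t=3: two updates land between polls; only the final state is seen.
conn.execute("UPDATE orders SET status = 'paid', updated_at = 2 WHERE id = 1")
conn.execute("UPDATE orders SET status = 'shipped', updated_at = 3 WHERE id = 1")
second_poll, last_seen = poll(conn, last_seen)

# t=4: a hard delete removes the row, so the next poll has nothing to report.
conn.execute("DELETE FROM orders WHERE id = 1")
third_poll, last_seen = poll(conn, last_seen)

print(first_poll)   # the insert is captured
print(second_poll)  # the intermediate 'paid' state is lost
print(third_poll)   # the delete is invisible to the query
```

The second poll never sees the `paid` state, and the third poll returns nothing, so a downstream consumer has no way to learn that the row was deleted.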
To avoid these problems and set up a reliable source of events for your real-time streaming pipeline, Francesco suggests other approaches, such as the Debezium source connector for real-time change data capture. The Debezium connector provides enhanced metadata, the timestamp of each operation, access to all logs, and sequence numbers, so you can speak the language of a DBA.
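To make the "enhanced metadata" point concrete, the sketch below reads a change event shaped like the Debezium envelope (`before`, `after`, `source`, `op`, `ts_ms`). The sample event is hand-written for illustration, with made-up database, table, and log sequence number values, and the `summarize` helper is hypothetical rather than part of Debezium.

```python
# Debezium operation codes mapped to readable names.
OP_NAMES = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot read"}

def summarize(event):
    """Pull the operation type and source metadata out of a change event."""
    source = event["source"]
    return {
        "operation": OP_NAMES.get(event["op"], event["op"]),
        "table": f'{source["schema"]}.{source["table"]}',
        "committed_at_ms": source["ts_ms"],  # when the change hit the database
        "lsn": source.get("lsn"),            # position in the PostgreSQL log
        "before": event["before"],
        "after": event["after"],
    }

# Hand-written sample of a delete event; all values are illustrative.
sample_delete = {
    "before": {"id": 1},
    "after": None,
    "source": {
        "connector": "postgresql",
        "db": "shop",
        "schema": "public",
        "table": "orders",
        "ts_ms": 1700000000000,
        "lsn": 24023128,
    },
    "op": "d",
    "ts_ms": 1700000000123,
}

summary = summarize(sample_delete)
print(summary)
```

Unlike a polling query, the event carries the delete explicitly, along with the commit timestamp and log position needed to order it against other changes.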
They also talk about the governance tool that Francesco has been building, and how streaming Game of Thrones sentiment analysis with Kafka led to his current role as a developer advocate.
EPISODE LINKS
- Kafka Connect Deep Dive – JDBC Source Connector
- JDBC Source Connector: What could go wrong?
- Metadata parser
- Debezium Documentation
- Database Migration with Apache Kafka and Apache Kafka Connect
- Watch the video version of this podcast
- Francesco Tisiot’s Twitter
- Kris Jenkins’ Twitter
- Streaming Audio Playlist
- Join the Confluent Community
- Learn more on Confluent Developer
Chapters
1. Intro (00:00:00)
2. Game of Thrones Sentiment Analysis (00:06:48)
3. Kafka Integration with JDBC Connector (00:11:34)
4. JDBC Connector – Polling Time (00:16:28)
5. Change Data Capture with Debezium (00:20:18)
6. Manage Data Flows with ksqlDB (00:30:01)
7. metadata-parser (00:32:41)
8. Tips on Getting Started with Debezium (00:34:54)
9. It's a wrap (00:39:22)
265 episodes
All episodes
- Migrate Your Kafka Cluster with Minimal Downtime (1:01:30)
- Top 6 Worst Apache Kafka JIRA Bugs (1:10:58)
- Optimizing Apache JVMs for Apache Kafka (1:11:42)
- International Podcast Day - Apache Kafka Edition | Streaming Audio Special (1:02:22)
- Capacity Planning Your Apache Kafka Cluster (1:01:54)
- Streaming Analytics and Real-Time Signal Processing with Apache Kafka (1:06:33)
- Common Apache Kafka Mistakes to Avoid (1:09:43)
- Scaling an Apache Kafka Based Architecture at Therapie Clinic (1:10:56)
- The Evolution of Apache Kafka: From In-House Infrastructure to Managed Cloud Service ft. Jay Kreps (46:32)
- Expanding Apache Kafka Multi-Tenancy for Cloud-Native Systems ft. Anna Povzner and Anastasia Vela (31:01)
- From Batch to Real-Time: Tips for Streaming Data Pipelines with Apache Kafka ft. Danica Fine (29:50)
- How to Build a Strong Developer Community with Global Engagement ft. Robin Moffatt and Ale Murray (35:18)
- Collecting Data with a Custom SIEM System Built on Apache Kafka and Kafka Connect ft. Vitalii Rudenskyi (25:14)
- Engaging Database Partials with Apache Kafka for Distributed System Consistency ft. Pat Helland (42:09)
- The Truth About ZooKeeper Removal and the KIP-500 Release in Apache Kafka ft. Jason Gustafson and Colin McCabe (31:50)
- Building a Microservices Architecture with Apache Kafka at Nationwide Building Society ft. Rob Jackson (48:54)
- Event Streaming Trends and Predictions for 2021 ft. Gwen Shapira, Ben Stopford, and Michael Noll (44:34)
- Mastering DevOps with Apache Kafka, Kubernetes, and Confluent Cloud ft. Rick Spurgeon and Allison Walther (46:18)
- Tales from the Frontline of Apache Kafka DevOps ft. Jason Bell (1:00:25)
- Using Apache Kafka as the Event-Driven System for 1,500 Microservices at Wix ft. Natan Silnitsky (49:12)
- Disaster Recovery with Multi-Region Clusters in Confluent Platform ft. Anna McDonald and Mitch Henderson (43:04)
- IoT Integration and Real-Time Data Correlation with Kafka Connect and Kafka Streams ft. Kai Waehner (40:55)