Flink vs Kafka Streams/ksqlDB: Comparing Stream Processing Tools
Stream processing can be hard or easy depending on the approach you take, and the tools you choose. This sentiment is at the heart of the discussion with Matthias J. Sax (Apache Kafka® PMC member; Software Engineer, ksqlDB and Kafka Streams, Confluent) and Jeff Bean (Sr. Technical Marketing Manager, Confluent). With immense collective experience in Kafka, ksqlDB, Kafka Streams, and Apache Flink®, they delve into the types of stream processing operations and explain the different ways of solving for their respective issues.
The stream processing tools they consider are Flink and the options from the Kafka ecosystem: the Java-based Kafka Streams and its SQL-wrapped variant, ksqlDB. Flink and ksqlDB tend to be used by different types of teams, since they differ in both design and philosophy.
Why Use Apache Flink?
The teams using Flink are often highly specialized, with deep expertise, and with an absolute focus on stream processing. They tend to be responsible for unusually large, industry-outlying amounts of both state and scale, and they usually require complex aggregations. Flink can excel in these use cases, which potentially makes the difficulty of its learning curve and implementation worthwhile.
Why Use ksqlDB/Kafka Streams?
Conversely, teams employing ksqlDB/Kafka Streams need less expertise to get started, and less expertise and time to manage their solutions. Jeff notes that in some cases the skills of a developer may not even be needed: those of a data analyst may suffice. ksqlDB and Kafka Streams integrate seamlessly with Kafka itself, as well as with external systems through Kafka Connect. In addition to being easy to adopt, ksqlDB is also deployed in production stream processing applications that require large scale and state.
There are also other considerations beyond the strictly architectural. Local support availability, the administrative overhead of using a library versus a separate framework, and the availability of stream processing as a fully managed service all matter.
Choosing a stream processing tool is a fraught decision, partly because switching between them isn't trivial: the frameworks are different, the APIs are different, and the interfaces are different. In addition to the high-level discussion, Jeff and Matthias also share lots of details you can use to understand the options, covering deployment models, transactions, batching, and parallelism, as well as a few interesting tangential topics along the way, such as the tyranny of state and the Turing completeness of SQL.
EPISODE LINKS
- The Future of SQL: Databases Meet Stream Processing
- Building Real-Time Event Streams in the Cloud, On Premises
- Kafka Streams 101 course
- ksqlDB 101 course
- Watch the video version of this podcast
- Kris Jenkins’ Twitter
- Streaming Audio Playlist
- Join the Confluent Community
- Learn more on Confluent Developer
- Use PODCAST100 for additional $100 of Confluent Cloud usage (details)
CHAPTERS
1. Intro (00:00:00)
2. The world of stream processing (00:02:06)
3. Flink vs ksqlDB (00:06:26)
4. Example use case (00:18:34)
5. SQL was built for static data (00:20:03)
6. Concept of event time (00:25:51)
7. Session based window joins (00:29:30)
8. Processing streaming data with SQL (00:35:47)
9. Scaling Kafka Streams/ksqlDB (00:39:47)
10. Exactly-once semantics (00:45:39)
11. Choosing stream processing tools (00:48:15)
12. It's a wrap (00:53:52)