Artwork

محتوای ارائه شده توسط Confluent, founded by the original creators of Apache Kafka® and Founded by the original creators of Apache Kafka®. تمام محتوای پادکست شامل قسمت‌ها، گرافیک‌ها و توضیحات پادکست مستقیماً توسط Confluent, founded by the original creators of Apache Kafka® and Founded by the original creators of Apache Kafka® یا شریک پلتفرم پادکست آن‌ها آپلود و ارائه می‌شوند. اگر فکر می‌کنید شخصی بدون اجازه شما از اثر دارای حق نسخه‌برداری شما استفاده می‌کند، می‌توانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
Player FM - برنامه پادکست
با برنامه Player FM !

Scaling an Apache Kafka Based Architecture at Therapie Clinic

1:10:56
 
اشتراک گذاری
 

Manage episode 424666761 series 2510642
محتوای ارائه شده توسط Confluent, founded by the original creators of Apache Kafka® and Founded by the original creators of Apache Kafka®. تمام محتوای پادکست شامل قسمت‌ها، گرافیک‌ها و توضیحات پادکست مستقیماً توسط Confluent, founded by the original creators of Apache Kafka® and Founded by the original creators of Apache Kafka® یا شریک پلتفرم پادکست آن‌ها آپلود و ارائه می‌شوند. اگر فکر می‌کنید شخصی بدون اجازه شما از اثر دارای حق نسخه‌برداری شما استفاده می‌کند، می‌توانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal

Scaling Apache Kafka® can be tricky, let alone scaling a team. When he was first hired, Domenico Fioravanti of Therapie Clinic was given the challenging task of assembling a sizable tech team from scratch, while simultaneously building a scalable and decoupled architecture from the ground up. In addition, he wanted to deliver value to the company from day one. One way that Domenico ultimately accomplished these goals was by focusing on managed solutions in order to avoid large investments in engineering know-how. Another way was to deliver quickly to production by using the existing knowledge of his team.

Domenico's biggest initial priority was to make a real-time reporting dashboard that collated data generated by third-party systems, such as call centers and front-of-house software solutions that managed bookings and transactions. (Before Domenico's arrival, all reporting had been done by aggregating data from different sources through an expensive, manual, error-prone, and slow process—which tended to result in late and incomplete insights.)

Establishing an initial stack with AWS and a BI/analytics tool only took a month and required minimal DevOps resources, but Domenico's team ended up wanting to leverage their efforts to free up third-party data for more than just the reporting/data insights use case.

So they began considering Apache Kafka® as a central repository for their data. For Kafka itself, they investigated Amazon MSK vs. Confluent, carefully weighing setup and time costs, maintenance costs, limitations, security, availability, risks, migration costs, Kafka updates frequency, observability, and errors and troubleshooting needs.

Domenico's team settled on Confluent Cloud and built the following stack:

  • AWS AppSync, a managed GraphQL layer to interact with and abstract third-party APIs (data sources)
  • AWS Lambdas for extracting data and producing to Kafka topics
  • Kafka topics for the raw as well as transformed data
  • Kafka Streams for data transformation
  • Kafka Redshift sink connector for loading data
  • ​​AWS Redshift as the destination cloud data warehouse
  • Looker for business intelligence and big data analytics

This stack allowed the company's data to be consumed by multiple teams in a scalable way. Eventually, DynamoDB was added and by the end of a year, along with a scalable architecture, Domenico had successfully grown his staff to 45 members on six teams.
EPISODE LINKS

  continue reading

فصل ها

1. Intro (00:00:00)

2. Build an internal tech team from scratch (00:01:21)

3. Get to know Therapie Clinic (00:05:12)

4. Reduce manual data processes (00:07:50)

5. Building a data pipeline with Apache Kafka (00:10:52)

6. Kafka Connect (00:15:28)

7. Build a dedicated pipeline in Python on serverless (00:21:33)

8. From custom-built to Kafka-based data pipelines (00:29:03)

9. Cloud providers (00:33:32)

10. Managed cloud services (00:37:35)

11. Event sourcing (00:53:32)

12. It's a wrap! (01:08:25)

265 قسمت

Artwork
iconاشتراک گذاری
 
Manage episode 424666761 series 2510642
محتوای ارائه شده توسط Confluent, founded by the original creators of Apache Kafka® and Founded by the original creators of Apache Kafka®. تمام محتوای پادکست شامل قسمت‌ها، گرافیک‌ها و توضیحات پادکست مستقیماً توسط Confluent, founded by the original creators of Apache Kafka® and Founded by the original creators of Apache Kafka® یا شریک پلتفرم پادکست آن‌ها آپلود و ارائه می‌شوند. اگر فکر می‌کنید شخصی بدون اجازه شما از اثر دارای حق نسخه‌برداری شما استفاده می‌کند، می‌توانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal

Scaling Apache Kafka® can be tricky, let alone scaling a team. When he was first hired, Domenico Fioravanti of Therapie Clinic was given the challenging task of assembling a sizable tech team from scratch, while simultaneously building a scalable and decoupled architecture from the ground up. In addition, he wanted to deliver value to the company from day one. One way that Domenico ultimately accomplished these goals was by focusing on managed solutions in order to avoid large investments in engineering know-how. Another way was to deliver quickly to production by using the existing knowledge of his team.

Domenico's biggest initial priority was to make a real-time reporting dashboard that collated data generated by third-party systems, such as call centers and front-of-house software solutions that managed bookings and transactions. (Before Domenico's arrival, all reporting had been done by aggregating data from different sources through an expensive, manual, error-prone, and slow process—which tended to result in late and incomplete insights.)

Establishing an initial stack with AWS and a BI/analytics tool only took a month and required minimal DevOps resources, but Domenico's team ended up wanting to leverage their efforts to free up third-party data for more than just the reporting/data insights use case.

So they began considering Apache Kafka® as a central repository for their data. For Kafka itself, they investigated Amazon MSK vs. Confluent, carefully weighing setup and time costs, maintenance costs, limitations, security, availability, risks, migration costs, Kafka updates frequency, observability, and errors and troubleshooting needs.

Domenico's team settled on Confluent Cloud and built the following stack:

  • AWS AppSync, a managed GraphQL layer to interact with and abstract third-party APIs (data sources)
  • AWS Lambdas for extracting data and producing to Kafka topics
  • Kafka topics for the raw as well as transformed data
  • Kafka Streams for data transformation
  • Kafka Redshift sink connector for loading data
  • ​​AWS Redshift as the destination cloud data warehouse
  • Looker for business intelligence and big data analytics

This stack allowed the company's data to be consumed by multiple teams in a scalable way. Eventually, DynamoDB was added and by the end of a year, along with a scalable architecture, Domenico had successfully grown his staff to 45 members on six teams.
EPISODE LINKS

  continue reading

فصل ها

1. Intro (00:00:00)

2. Build an internal tech team from scratch (00:01:21)

3. Get to know Therapie Clinic (00:05:12)

4. Reduce manual data processes (00:07:50)

5. Building a data pipeline with Apache Kafka (00:10:52)

6. Kafka Connect (00:15:28)

7. Build a dedicated pipeline in Python on serverless (00:21:33)

8. From custom-built to Kafka-based data pipelines (00:29:03)

9. Cloud providers (00:33:32)

10. Managed cloud services (00:37:35)

11. Event sourcing (00:53:32)

12. It's a wrap! (01:08:25)

265 قسمت

همه قسمت ها

×
 
Loading …

به Player FM خوش آمدید!

Player FM در سراسر وب را برای یافتن پادکست های با کیفیت اسکن می کند تا همین الان لذت ببرید. این بهترین برنامه ی پادکست است که در اندروید، آیفون و وب کار می کند. ثبت نام کنید تا اشتراک های شما در بین دستگاه های مختلف همگام سازی شود.

 

راهنمای مرجع سریع

در حین کاوش به این نمایش گوش دهید
پخش