32 subscribers
با برنامه Player FM !
پادکست هایی که ارزش شنیدن دارند
حمایت شده


Real-time Threat Detection Using Machine Learning and Apache Kafka
Manage episode 424666723 series 2510642
Can we use machine learning to detect security threats in real-time? As organizations increasingly rely on distributed systems, it is becoming more important to analyze the traffic that passes through those systems quickly. Confluent Hackathon ’22 finalist, Géraud Dugé de Bernonville (Data Consultant, Zenika Bordeaux), shares how his team used TensorFlow (machine learning) and Neo4j (graph database) to analyze and detect network traffic data in real-time. What started as a research and development exercise turned into ZIEM, a full-blown internal project using ksqlDB to manipulate, export, and visualize data from Apache Kafka®.
Géraud and his team noticed that large amounts of data passed through their network, and they were curious to see if they could detect threats as they happened. As a hackathon project, they built ZIEM, a network mapping and intrusion detection platform that quickly generates network diagrams. Using Kafka, the system captures network packets, processes the data in ksqlDB, and uses a Neo4j Sink Connector to send it to a Neo4j instance. Using the Neo4j browser, users can see instant network diagrams showing who's on the network, allowing them to detect anomalies quickly in real time.
The Ziem project was initially conceived as an experiment to explore the potential of using Kafka for data processing and manipulation. However, it soon became apparent that there was great potential for broader applications (banking, security, etc.). As a result, the focus shifted to developing a tool for exporting data from Kafka, which is helpful in transforming data for deeper analysis, moving it from one database to another, or creating powerful visualizations.
Géraud goes on to talk about how the success of this project has helped them better understand the potential of using Kafka for data processing. Zenika plans to continue working to build a pipeline that can handle more robust visualizations, expose more learning opportunities, and detect patterns.
EPISODE LINKS
- Ziem Project on GitHub
- ksqlDB 101 course
- ksqlDB Fundamentals: How Apache Kafka, SQL, and ksqlDB Work together ft. Simon Aubury
- Real-Time Stream Processing, Monitoring, and Analytics with Apache Kafka
- Application Data Streaming with Apache Kafka and Swim
- Watch the video version of this podcast
- Kris Jenkins’ Twitter
- Streaming Audio Playlist
- Join the Confluent Community
- Learn more with Kafka tutorials, resources, and guides at Confluent Developer
- Live demo: Intro to Event-Driven Microservices with Confluent
- Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)
فصل ها
1. Intro (00:00:00)
2. What is the Ziem Project? (00:05:03)
3. How do you use ksqlDB? (00:12:15)
4. Creating network visualizations with Neo4j and Neovis.js (00:14:00)
5. Machine learning plans with Ziem (00:17:50)
6. Supervised vs. non-supervised machine learning (00:20:07)
7. Future use cases for Ziem (00:22:30)
8. How to get started with TensorFlow (00:24:29)
9. It's a wrap! (00:27:33)
265 قسمت
Manage episode 424666723 series 2510642
Can we use machine learning to detect security threats in real-time? As organizations increasingly rely on distributed systems, it is becoming more important to analyze the traffic that passes through those systems quickly. Confluent Hackathon ’22 finalist, Géraud Dugé de Bernonville (Data Consultant, Zenika Bordeaux), shares how his team used TensorFlow (machine learning) and Neo4j (graph database) to analyze and detect network traffic data in real-time. What started as a research and development exercise turned into ZIEM, a full-blown internal project using ksqlDB to manipulate, export, and visualize data from Apache Kafka®.
Géraud and his team noticed that large amounts of data passed through their network, and they were curious to see if they could detect threats as they happened. As a hackathon project, they built ZIEM, a network mapping and intrusion detection platform that quickly generates network diagrams. Using Kafka, the system captures network packets, processes the data in ksqlDB, and uses a Neo4j Sink Connector to send it to a Neo4j instance. Using the Neo4j browser, users can see instant network diagrams showing who's on the network, allowing them to detect anomalies quickly in real time.
The Ziem project was initially conceived as an experiment to explore the potential of using Kafka for data processing and manipulation. However, it soon became apparent that there was great potential for broader applications (banking, security, etc.). As a result, the focus shifted to developing a tool for exporting data from Kafka, which is helpful in transforming data for deeper analysis, moving it from one database to another, or creating powerful visualizations.
Géraud goes on to talk about how the success of this project has helped them better understand the potential of using Kafka for data processing. Zenika plans to continue working to build a pipeline that can handle more robust visualizations, expose more learning opportunities, and detect patterns.
EPISODE LINKS
- Ziem Project on GitHub
- ksqlDB 101 course
- ksqlDB Fundamentals: How Apache Kafka, SQL, and ksqlDB Work together ft. Simon Aubury
- Real-Time Stream Processing, Monitoring, and Analytics with Apache Kafka
- Application Data Streaming with Apache Kafka and Swim
- Watch the video version of this podcast
- Kris Jenkins’ Twitter
- Streaming Audio Playlist
- Join the Confluent Community
- Learn more with Kafka tutorials, resources, and guides at Confluent Developer
- Live demo: Intro to Event-Driven Microservices with Confluent
- Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)
فصل ها
1. Intro (00:00:00)
2. What is the Ziem Project? (00:05:03)
3. How do you use ksqlDB? (00:12:15)
4. Creating network visualizations with Neo4j and Neovis.js (00:14:00)
5. Machine learning plans with Ziem (00:17:50)
6. Supervised vs. non-supervised machine learning (00:20:07)
7. Future use cases for Ziem (00:22:30)
8. How to get started with TensorFlow (00:24:29)
9. It's a wrap! (00:27:33)
265 قسمت
Tutti gli episodi
×



1 Migrate Your Kafka Cluster with Minimal Downtime 1:01:30









1 Top 6 Worst Apache Kafka JIRA Bugs 1:10:58









1 Optimizing Apache JVMs for Apache Kafka 1:11:42



1 International Podcast Day - Apache Kafka Edition | Streaming Audio Special 1:02:22




1 Capacity Planning Your Apache Kafka Cluster 1:01:54




1 Streaming Analytics and Real-Time Signal Processing with Apache Kafka 1:06:33



1 Common Apache Kafka Mistakes to Avoid 1:09:43













1 Scaling an Apache Kafka Based Architecture at Therapie Clinic 1:10:56





1 The Evolution of Apache Kafka: From In-House Infrastructure to Managed Cloud Service ft. Jay Kreps 46:32



1 Expanding Apache Kafka Multi-Tenancy for Cloud-Native Systems ft. Anna Povzner and Anastasia Vela 31:01



1 From Batch to Real-Time: Tips for Streaming Data Pipelines with Apache Kafka ft. Danica Fine 29:50


















1 How to Build a Strong Developer Community with Global Engagement ft. Robin Moffatt and Ale Murray 35:18







1 Collecting Data with a Custom SIEM System Built on Apache Kafka and Kafka Connect ft. Vitalii Rudenskyi 25:14










1 Engaging Database Partials with Apache Kafka for Distributed System Consistency ft. Pat Helland 42:09

1 The Truth About ZooKeeper Removal and the KIP-500 Release in Apache Kafka ft. Jason Gustafson and Colin McCabe 31:50
















1 Building a Microservices Architecture with Apache Kafka at Nationwide Building Society ft. Rob Jackson 48:54





1 Event Streaming Trends and Predictions for 2021 ft. Gwen Shapira, Ben Stopford, and Michael Noll 44:34


1 Mastering DevOps with Apache Kafka, Kubernetes, and Confluent Cloud ft. Rick Spurgeon and Allison Walther 46:18




1 Tales from the Frontline of Apache Kafka DevOps ft. Jason Bell 1:00:25












1 Using Apache Kafka as the Event-Driven System for 1,500 Microservices at Wix ft. Natan Silnitsky 49:12




1 Disaster Recovery with Multi-Region Clusters in Confluent Platform ft. Anna McDonald and Mitch Henderson 43:04


















1 IoT Integration and Real-Time Data Correlation with Kafka Connect and Kafka Streams ft. Kai Waehner 40:55

















































































به Player FM خوش آمدید!
Player FM در سراسر وب را برای یافتن پادکست های با کیفیت اسکن می کند تا همین الان لذت ببرید. این بهترین برنامه ی پادکست است که در اندروید، آیفون و وب کار می کند. ثبت نام کنید تا اشتراک های شما در بین دستگاه های مختلف همگام سازی شود.