
Content provided by Christoph Neumann and Nate Jones. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Christoph Neumann and Nate Jones or their podcast platform partner. If you believe someone is using your copyrighted work without permission, you can follow the process described at https://fa.player.fm/legal

Ep 029: Problem Unknown: Log Lines

35:06
 
 

Manage episode 233999238 series 2463849

Nate is dropped in the middle of a huge log file and hunts for the source of the errors.

  • We dig into the world of DonutGram.
  • "We are the devops."
  • Problem: we start seeing errors in the log.
  • "The kind of error that makes total sense to whoever wrote the log line."
  • (04:10) We want to use Clojure to characterize the errors.
  • Why not grep?
  • Well... we grep out the lines and count them: 10,351 matches. Is that a lot? Hard to say. We need more info.
  • (05:50) Developers think it's safe to ignore, but it's bugging us.
  • We want to come up with an incident count by user.
  • "Important thing we do a lot in the devops area: we see data and then we have to figure out, what does it mean?"
  • "What story is the data telling us?"
  • The "forensic data" versus "operating data" (Ep 022, 027)
  • "With forensic data, you're trying to save the state of the imprint of the known universe over time."
  • (07:40) The log file is the most common forensic data for applications.
  • To characterize, we need to manipulate the data at a higher level than grep.
  • First goal: make the data useful in Clojure
  • Get a nice sample from the production log, put it in a file, load up the REPL, and start writing code to parse it.
  • Need to parse out the data from the log file and turn it into something more structured.
  • (12:05) Let's make a function, parse-line, that separates out all the common elements of each line:
    • timestamp
    • log level
    • thread name
    • package name
    • freeform "message"
  • "A message has arrived from the developers, lovingly bundled in a long log line."
  • (13:10) We can parse every line generically, but we need to parse the message and make sense of that.
  • "We want to lift up the unstructured data into structured data that we can map, filter, and reduce on."
  • We will have different kinds of log lines, so we need to detect them and parse appropriately.
  • We want to amplify the map to include the details for the specific kind of log line.
  • We'll use a kind field to identify which kind of log line it is.
  • There are two steps:
    1. recognize which kind of log line the "message" is for
    2. parse the data out of that message
  • "Maybe we'll have a podcast that's constantly about constantly. It's not just juxt."
  • (18:45) How do we do this concisely?
  • Let's use cond:
    • flatten out all the cases
    • code to detect kind (the condition)
    • code to parse the details (the expression)
  • Can use includes? in the condition to detect the kind and re-matches to parse.
  • (20:00) Why not use the regex to detect the kind too?
  • Can avoid writing the regex twice by using a def.
  • It's less code, but now we're running the regex twice: once in the condition and once to parse.
  • We can't capture the result of the test in cond. No "cond-let" macro.
  • We could write our own cond-let macro, but should we?
  • "When you feel like you should write a macro, you should step back and assess your current state of being. Am I tired? Did I have a bad day? Do I really need this?"
  • (24:05) New goals for our problem:
    • one regex literal
    • only run the regex once
    • keep the code that uses the matches close to the regex
  • Similar problem to "routes" for web endpoints: want the route definition next to the code that executes using the data for that route.
  • Feels like an index or table of options.
  • (25:20) Let's make a "table". A vector of regex and handler-code pairs.
  • We need a coffee mug: "Clojure. It's just a list."
  • The code can be a function literal that takes the matches and returns a map of parsed data.
  • Write a function parse-details which goes through each regex until one matches and then invokes the handler for that one. (See below.)
  • (30:15) Once we have higher-level data, it's straightforward to filter and group-by to get our user count.
  • Once again, the goal is to take unstructured data and turn it into structured data.
  • "You have just up-leveled unstructured information into a sequence of structured information."
  • Can slurp, split, map, filter, and then aggregate.
  • (32:10) What happens when we try to open a 10 GB log file?
  • Sounds like a problem for next week.
  • "Fixing production is always a problem for right now."
  • "If a server falls over and no one outside of the devops team knows about it, did the server really fall over?"
  • "Who watches the watchers?"
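The cond approach discussed at (18:45) can be sketched like this. A minimal sketch: parse-message and code-357-re are illustrative names, and the code-357 regex is the one from the code sample below.

```clojure
;; Sketch of the cond approach: clojure.string/includes? detects the kind
;; of log line, then re-matches pulls out the details. Defining the regex
;; once with def avoids writing the literal twice.
(require '[clojure.string :as string])

(def code-357-re
  #"transaction failed while updating user ([^:]+): code 357")

(defn parse-message
  [message]
  (cond
    (string/includes? message "code 357")
    (let [[_whole user] (re-matches code-357-re message)]
      {:kind :code-357 :code-357/user user})

    ;; ...more kinds of log lines would go here...

    :else nil))
```

This flattens out all the cases, but the detection test and the parsing code remain two separate pieces, which is what motivates the table of regex/handler pairs.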


Clojure in this episode:

  • #""
  • re-matches
  • case
  • cond
  • if-let
  • ->>
  • slurp
  • filter
  • group-by
  • def
  • constantly
  • juxt
  • clojure.string/
    • includes?
    • split


Code sample from this episode:

(ns devops.week-01
  (:require
   [clojure.java.io :as io]
   [clojure.string :as string]))

;; General parsing

(def general-re
  #"(\d\d\d\d-\d\d-\d\d)\s+(\d\d:\d\d:\d\d)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s(.*)")

(defn parse-line
  [line]
  (when-let [[whole dt tm thread-name level ns message] (re-matches general-re line)]
    {:raw/line      whole
     :log/date      dt
     :log/time      tm
     :log/thread    thread-name
     :log/level     level
     :log/namespace ns
     :log/message   message}))

(defn general-parse
  [lines]
  (->> lines
       (map parse-line)
       (filter some?)))

;; Detailed parsing

(def detail-specs
  [[#"transaction failed while updating user ([^:]+): code 357"
    (fn [[_whole user]] {:kind :code-357 :code-357/user user})]])

(defn try-detail-spec
  [message [re handler]]
  (when-some [matches (re-matches re message)]
    (handler matches)))

(defn parse-details
  [entry]
  (let [{:keys [log/message]} entry]
    (if-some [extra (->> detail-specs
                         (map (partial try-detail-spec message))
                         (filter some?)
                         (first))]
      (merge entry extra)
      entry)))

;; Log analysis

(defn lines
  [filename]
  (->> (slurp filename)
       (string/split-lines)))

(defn summarize
  [filename calc]
  (->> (lines filename)
       (general-parse)
       (map parse-details)
       (calc)))

;; Data summarizing

(defn code-357-by-user
  [entries]
  (->> entries
       (filter #(= :code-357 (:kind %)))
       (map :code-357/user)
       (frequencies)))

(comment
  (summarize "sample.log" code-357-by-user))
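The filter and group-by aggregation mentioned in the notes (30:15) can be sketched as an alternative to the frequencies version above. code-357-counts is an illustrative name, assuming entries shaped like the output of parse-details.

```clojure
;; Alternative to code-357-by-user: count incidents per user with
;; group-by instead of frequencies. Assumes entries are maps produced
;; by parse-details, e.g. {:kind :code-357 :code-357/user "joe" ...}.
(defn code-357-counts
  [entries]
  (->> entries
       (filter #(= :code-357 (:kind %)))
       (group-by :code-357/user)
       (map (fn [[user hits]] [user (count hits)]))
       (into {})))
```

frequencies is the shorter spelling when all we need is a count; group-by keeps the matching entries around in case we want more than a tally.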

Log file sample:

2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user joe: code 357
2019-05-14 16:48:56 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user sally: code 357
2019-05-14 16:48:57 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user joe: code 357
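As a quick REPL check, parse-line from the code sample above turns the first sample line into a structured map (the defs are repeated here so the snippet stands alone):

```clojure
;; parse-line applied to one sample line (defs repeated from the code
;; sample above so this snippet is self-contained).
(def general-re
  #"(\d\d\d\d-\d\d-\d\d)\s+(\d\d:\d\d:\d\d)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s(.*)")

(defn parse-line
  [line]
  (when-let [[whole dt tm thread-name level ns message]
             (re-matches general-re line)]
    {:raw/line whole
     :log/date dt
     :log/time tm
     :log/thread thread-name
     :log/level level
     :log/namespace ns
     :log/message message}))

(parse-line "2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user joe: code 357")
;; => {:raw/line "2019-05-14 16:48:55 | process-Poster | ..."
;;     :log/date "2019-05-14"
;;     :log/time "16:48:55"
;;     :log/thread "process-Poster"
;;     :log/level "INFO"
;;     :log/namespace "com.donutgram.poster"
;;     :log/message "transaction failed while updating user joe: code 357"}
```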

118 episodes
