Talk Python to Me is a weekly podcast hosted by developer and entrepreneur Michael Kennedy. We dive deep into the popular packages and software developers, data scientists, and incredible hobbyists doing amazing things with Python. If you're new to Python, you'll quickly learn the ins and outs of the community by hearing from the leaders. And if you've been Pythoning for years, you'll learn about your favorite packages and the hot new ones coming out of open source.
…
continue reading
محتوای ارائه شده توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones. تمام محتوای پادکست شامل قسمتها، گرافیکها و توضیحات پادکست مستقیماً توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones یا شریک پلتفرم پادکست آنها آپلود و ارائه میشوند. اگر فکر میکنید شخصی بدون اجازه شما از اثر دارای حق نسخهبرداری شما استفاده میکند، میتوانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
Player FM - برنامه پادکست
با برنامه Player FM !
با برنامه Player FM !
Ep 034: Break the Mold
Manage episode 236576529 series 2463849
محتوای ارائه شده توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones. تمام محتوای پادکست شامل قسمتها، گرافیکها و توضیحات پادکست مستقیماً توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones یا شریک پلتفرم پادکست آنها آپلود و ارائه میشوند. اگر فکر میکنید شخصی بدون اجازه شما از اثر دارای حق نسخهبرداری شما استفاده میکند، میتوانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
Christoph finds exceptional log lines and takes a more literal approach.
- Previously, we upgraded our log parsing to handle finding two errors instead of just one.
- "It's amazing what you don't find when you're not looking."
- We ended up with a set of functions that can parse out multiple errors.
- The result is a nice mixed sequence of errors that we can aggregate and pivot with Clojure core.
- "By the power of Clojure core, I will
map
andreduce
!" - (02:43) New Problem: there are exceptions in the log, and they span multiple lines!
- The exception continuation lines don't parse.
- They don't have a date, a log level, or anything else that regular log lines have.
- How do we collect up all those lines? They are all part of one logical error.
- Our error parsers don't get a chance to see these lines, because the general line parser threw them away.
- (06:27) Solution 1: Don't pre-parse the lines, but instead have each error parsing function do the general parse.
- Each parsing function would receive the raw, unparsed, line as a string.
- Each function would have to run
parse-line
on the inputs before doing their specific parsing. - Now each function in our inventory needs to re-do the general parsing of the line.
- This ends up being much more inefficient.
- (10:13) Solution 2: Relax the general parsing.
- What is the general parser to do with a line that doesn't match the regexp?
- "How about if the general parse does the best job it can, and whatever it can't do, it doesn't do?"
- One of the keys in the data returned from
parse-line
is:raw/line
, which is the entire line, and that's always there. - So, if the regexp fails, we can always return at least that.
- The map doesn't have the keys that are not found.
- Now,
parse-line
will always return a map for each line. - We've uplifted the data, at least slightly, from a string to a map.
- We can know two things
- Every map will have
:raw/line
- If a map doesn't have the other keys (like
:log/date
), that means it didn't parse, and is probably a continuation of a previous line.
- Every map will have
- This is us expanding our program's view of reality to more closely match actual reality.
- We left out part of reality, and got stuck because of it.
- Being more literal to the data source gives us more flexibility.
- Clojure's dynamic nature shines in this situation. We don't have to have all the keys all the time.
- "A map is a bucket of dimensions."
- Each function that operates on the data will inspect the data and see if it has all the keys it needs. It doesn't matter if there are other keys.
- Namespaced keys really help in this situation, each new level of parsing can add keys without overwriting keys from other dimensions.
- How does this impact our parsing functions?
- They are unchanged, because they grab the message out using
some->>
, which shortcuts onnil
. - (19:08) Our exception error parsing function detect the start in a number of ways.
- The next line is a bare line?
- The first line ends in an opening curly brace?
- Then, to find the end of the exception, it can use
take-while
to find all the lines that are bare. - To assist in finding bare lines, we can introduce a predicate function instead of embedding that logic.
- When all lines are found, it then combines them into a new
:log/message
key. - When updating the error map in our parsing functions, we are careful to only operate on the data we know is there, so that other keys are not impacted.
Related episodes:
- 028: Fail Donut
- 029: Problem Unknown: Log Lines
- 030: Lazy Does It
- 031: Eager Abstraction
- 032: Call Me Lazy
- 033: Cake or Ice Cream? Yes!
Clojure in this episode:
some->>
,take-while
Code sample from this episode:
(ns devops.week-06 (:require [clojure.string :as string] [devops.week-02 :refer [process-log]] [devops.week-05 :refer [parse-357-error parse-sprinkle]] )) (def general-re #"(\d\d\d\d-\d\d-\d\d)\s+(\d\d:\d\d:\d\d)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s(.*)") (defn parse-line [line] (if-let [[whole dt tm thread-name level ns message] (re-matches general-re line)] {:raw/line whole :log/date dt :log/time tm :log/thread thread-name :log/level level :log/namespace ns :log/message message} {:raw/line line :log/message line})) (defn bare-line? [line] (nil? (:log/date line))) (defn parse-exception-info [lines] (let [first-line (first lines) [_whole classname] (some->> first-line :log/message (re-matches #"([^ ]+) #error \{"))] (when classname (let [error-lines (cons first-line (take-while bare-line? (rest lines))) error-str (string/join "\n" (map :log/message error-lines))] (merge first-line {:kind :error :error/class classname :log/message error-str}))))) (defn parse-next [lines] (or (parse-357-error lines) (parse-sprinkle lines) (parse-exception-info lines))) (defn parse-all [lines] (lazy-seq (when (seq lines) (if-some [found (parse-next lines)] (cons found (parse-all (rest lines))) (parse-all (rest lines)))))) (comment (process-log "sample.log" #(->> % (map parse-line) parse-all doall)) )
Log file sample:
2019-05-14 16:48:57 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user joe: code 357 2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | failed to add sprinkle to donut 50493 2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | sprinkle fail reason: unknown state 2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | Poster #error { :cause "Failed to lock the synchronizer" :data {} :via [{:type clojure.lang.ExceptionInfo :message "Failed to lock the synchronizer" :data {} :at [process.poster$eval50560 invokeStatic "poster.clj" 40]}] :trace [[process.poster$eval50560 invokeStatic "poster.clj" 40] [clojure.lang.AFn run "AFn.java" 22] [java.lang.Thread run "Thread.java" 748]]}
118 قسمت
Manage episode 236576529 series 2463849
محتوای ارائه شده توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones. تمام محتوای پادکست شامل قسمتها، گرافیکها و توضیحات پادکست مستقیماً توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones یا شریک پلتفرم پادکست آنها آپلود و ارائه میشوند. اگر فکر میکنید شخصی بدون اجازه شما از اثر دارای حق نسخهبرداری شما استفاده میکند، میتوانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
Christoph finds exceptional log lines and takes a more literal approach.
- Previously, we upgraded our log parsing to handle finding two errors instead of just one.
- "It's amazing what you don't find when you're not looking."
- We ended up with a set of functions that can parse out multiple errors.
- The result is a nice mixed sequence of errors that we can aggregate and pivot with Clojure core.
- "By the power of Clojure core, I will
map
andreduce
!" - (02:43) New Problem: there are exceptions in the log, and they span multiple lines!
- The exception continuation lines don't parse.
- They don't have a date, a log level, or anything else that regular log lines have.
- How do we collect up all those lines? They are all part of one logical error.
- Our error parsers don't get a chance to see these lines, because the general line parser threw them away.
- (06:27) Solution 1: Don't pre-parse the lines, but instead have each error parsing function do the general parse.
- Each parsing function would receive the raw, unparsed, line as a string.
- Each function would have to run
parse-line
on the inputs before doing their specific parsing. - Now each function in our inventory needs to re-do the general parsing of the line.
- This ends up being much more inefficient.
- (10:13) Solution 2: Relax the general parsing.
- What is the general parser to do with a line that doesn't match the regexp?
- "How about if the general parse does the best job it can, and whatever it can't do, it doesn't do?"
- One of the keys in the data returned from
parse-line
is:raw/line
, which is the entire line, and that's always there. - So, if the regexp fails, we can always return at least that.
- The map doesn't have the keys that are not found.
- Now,
parse-line
will always return a map for each line. - We've uplifted the data, at least slightly, from a string to a map.
- We can know two things
- Every map will have
:raw/line
- If a map doesn't have the other keys (like
:log/date
), that means it didn't parse, and is probably a continuation of a previous line.
- Every map will have
- This is us expanding our program's view of reality to more closely match actual reality.
- We left out part of reality, and got stuck because of it.
- Being more literal to the data source gives us more flexibility.
- Clojure's dynamic nature shines in this situation. We don't have to have all the keys all the time.
- "A map is a bucket of dimensions."
- Each function that operates on the data will inspect the data and see if it has all the keys it needs. It doesn't matter if there are other keys.
- Namespaced keys really help in this situation, each new level of parsing can add keys without overwriting keys from other dimensions.
- How does this impact our parsing functions?
- They are unchanged, because they grab the message out using
some->>
, which shortcuts onnil
. - (19:08) Our exception error parsing function detect the start in a number of ways.
- The next line is a bare line?
- The first line ends in an opening curly brace?
- Then, to find the end of the exception, it can use
take-while
to find all the lines that are bare. - To assist in finding bare lines, we can introduce a predicate function instead of embedding that logic.
- When all lines are found, it then combines them into a new
:log/message
key. - When updating the error map in our parsing functions, we are careful to only operate on the data we know is there, so that other keys are not impacted.
Related episodes:
- 028: Fail Donut
- 029: Problem Unknown: Log Lines
- 030: Lazy Does It
- 031: Eager Abstraction
- 032: Call Me Lazy
- 033: Cake or Ice Cream? Yes!
Clojure in this episode:
some->>
,take-while
Code sample from this episode:
(ns devops.week-06 (:require [clojure.string :as string] [devops.week-02 :refer [process-log]] [devops.week-05 :refer [parse-357-error parse-sprinkle]] )) (def general-re #"(\d\d\d\d-\d\d-\d\d)\s+(\d\d:\d\d:\d\d)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s(.*)") (defn parse-line [line] (if-let [[whole dt tm thread-name level ns message] (re-matches general-re line)] {:raw/line whole :log/date dt :log/time tm :log/thread thread-name :log/level level :log/namespace ns :log/message message} {:raw/line line :log/message line})) (defn bare-line? [line] (nil? (:log/date line))) (defn parse-exception-info [lines] (let [first-line (first lines) [_whole classname] (some->> first-line :log/message (re-matches #"([^ ]+) #error \{"))] (when classname (let [error-lines (cons first-line (take-while bare-line? (rest lines))) error-str (string/join "\n" (map :log/message error-lines))] (merge first-line {:kind :error :error/class classname :log/message error-str}))))) (defn parse-next [lines] (or (parse-357-error lines) (parse-sprinkle lines) (parse-exception-info lines))) (defn parse-all [lines] (lazy-seq (when (seq lines) (if-some [found (parse-next lines)] (cons found (parse-all (rest lines))) (parse-all (rest lines)))))) (comment (process-log "sample.log" #(->> % (map parse-line) parse-all doall)) )
Log file sample:
2019-05-14 16:48:57 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user joe: code 357 2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | failed to add sprinkle to donut 50493 2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | sprinkle fail reason: unknown state 2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | Poster #error { :cause "Failed to lock the synchronizer" :data {} :via [{:type clojure.lang.ExceptionInfo :message "Failed to lock the synchronizer" :data {} :at [process.poster$eval50560 invokeStatic "poster.clj" 40]}] :trace [[process.poster$eval50560 invokeStatic "poster.clj" 40] [clojure.lang.AFn run "AFn.java" 22] [java.lang.Thread run "Thread.java" 748]]}
118 قسمت
All episodes
×به Player FM خوش آمدید!
Player FM در سراسر وب را برای یافتن پادکست های با کیفیت اسکن می کند تا همین الان لذت ببرید. این بهترین برنامه ی پادکست است که در اندروید، آیفون و وب کار می کند. ثبت نام کنید تا اشتراک های شما در بین دستگاه های مختلف همگام سازی شود.