A podcast for developers interested in building great software products. Every episode, Adam Wathan is joined by a guest to talk about everything from product design and user experience to unit testing and system administration.
…
continue reading
محتوای ارائه شده توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones. تمام محتوای پادکست شامل قسمتها، گرافیکها و توضیحات پادکست مستقیماً توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones یا شریک پلتفرم پادکست آنها آپلود و ارائه میشوند. اگر فکر میکنید شخصی بدون اجازه شما از اثر دارای حق نسخهبرداری شما استفاده میکند، میتوانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
Player FM - برنامه پادکست
با برنامه Player FM !
با برنامه Player FM !
Ep 029: Problem Unknown: Log Lines
Manage episode 233999238 series 2463849
محتوای ارائه شده توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones. تمام محتوای پادکست شامل قسمتها، گرافیکها و توضیحات پادکست مستقیماً توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones یا شریک پلتفرم پادکست آنها آپلود و ارائه میشوند. اگر فکر میکنید شخصی بدون اجازه شما از اثر دارای حق نسخهبرداری شما استفاده میکند، میتوانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
Nate is dropped in the middle of a huge log file and hunts for the source of the errors.
- We dig into the world of DonutGram.
- "We are the devops."
- Problem: we start seeing errors in the log.
- "The kind of error that makes total sense to whoever wrote the log line."
- (04:10) We want to use Clojure to characterize the errors.
- Why not grep?
- Well...we grep out the lines, count them, and we get 10,351 times. Is that a lot? Hard to say. We need more info.
- (05:50) Developers think it's safe to ignore, but it's bugging us.
- We want to come up with an incident count by user.
- "Important thing we do a lot in the devops area: we see data and then we have to figure out, what does it mean?"
- "What story is the data telling us?"
- The "forensic data" versus "operating data" (Ep 022, 027)
- "With forensic data, you're trying to save the state of the imprint of the known universe over time."
- (07:40) The log file is the most common forensic data for applications.
- To characterize, we need to manipulate the data at a higher level than grep.
- First goal: make the data useful in Clojure
- Get a nice sample from the production log, put it in a file, load up the REPL, and start writing code to parse it.
- Need to parse out the data from the log file and turn it into something more structured.
- (12:05) Let's make a function,
parse-line
that separates out all the common elements for each line:- timestamp
- log level
- thread name
- package name
- freeform "message"
- "A message has arrived from the developers, lovingly bundled in a long log line."
- (13:10) We can parse every line generically, but we need to parse the message and make sense of that.
- "We want to lift up the unstructured data into structured data that we can map, filter, and reduce on."
- We will have different kinds of log lines, so we need to detect them and parse appropriately.
- We want to amplify the map to include the details for the specific kind of log line.
- We'll use a
kind
field to identify which kind of log line it is. - There are two steps:
- recognize which kind of log line the "message" is for
- parse the data out of that message
- "Maybe we'll have a podcast that's constantly about
constantly
. It's not justjuxt
." - (18:45) How do we do this concisely?
- Let's use
cond
:- flatten out all the cases
- code to detect kind (the condition)
- code to parse the details (the expression)
- Can use
includes?
in the condition to detect the kind andre-matches
to parse. - (20:00) Why not use the regex to detect the kind too?
- Can avoid writing the regex twice by using a
def
. - It's less code, but now we're running the regex twice: once in the condition and once to parse.
- We can't capture the result of the test in
cond
. No "cond-let" macro. - We could write our own
cond-let
macro, but should we? - "When you feel like you should write a macro, you should step back and assess your current state of being. Am I tired? Did I have a bad day? Do I really need this?"
- (24:05) New goals for our problem:
- one regex literal
- only run the regex once
- keep the code that uses the matches close to the regex
- Similar problem to "routes" for web endpoints: want the route definition next to the code that executes using the data for that route.
- Feels like an index or table of options.
- (25:20) Let's make a "table". A vector of regex and handler-code pairs.
- We need a coffee mug: "Clojure. It's just a list."
- The code can be a function literal that takes the matches and returns a map of parsed data.
- Write a function
parse-details
which goes through each regex until one matches and then invokes the handler for that one. (See below.) - (30:15) Once we have higher-level data, it's straight forward to
filter
andgroup-by
to get our user count. - Once again, the goal is to take unstructured data and turn it into structured data.
- "You have just up-leveled unstructured information into a sequence of structured information."
- Can
slurp
,split
,map
,filter
, and then aggregate. - (32:10) What happens when we try to open a 10 GB log file?
- Sounds like a problem for next week.
- "Fixing production is always a problem for right now."
- "If a server falls over and no one outside of the devops team knows about it, did the server really fall over?"
- "Who watches the watchers?"
Related episodes:
Clojure in this episode:
#""
re-matches
case
cond
if-let
->>
slurp
filter
group-by
def
constantly
juxt
clojure.string/
includes?
split
Related links:
Code sample from this episode:
(ns devops.week-01 (:require [clojure.java.io :as io] [clojure.string :as string] )) ;; General parsing (def general-re #"(\d\d\d\d-\d\d-\d\d)\s+(\d\d:\d\d:\d\d)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s(.*)") (defn parse-line [line] (when-let [[whole dt tm thread-name level ns message] (re-matches general-re line)] {:raw/line whole :log/date dt :log/time tm :log/thread thread-name :log/level level :log/namespace ns :log/message message })) (defn general-parse [lines] (->> lines (map parse-line) (filter some?))) ;; Detailed parsing (def detail-specs [[#"transaction failed while updating user ([^:]+): code 357" (fn [[_whole user]] {:kind :code-357 :code-357/user user})] ]) (defn try-detail-spec [message [re fn]] (when-some [matches (re-matches re message)] (fn matches))) (defn parse-details [entry] (let [{:keys [log/message]} entry] (if-some [extra (->> detail-specs (map (partial try-detail-spec message)) (filter some?) (first))] (merge entry extra) entry))) ;; Log analysis (defn lines [filename] (->> (slurp filename) (string/split-lines))) (defn summarize [filename calc] (->> (lines filename) (general-parse) (map parse-details) (calc))) ;; data summarizing (defn code-357-by-user [entries] (->> entries (filter #(= :code-357 (:kind %))) (map :code-357/user) (frequencies))) (comment (summarize "sample.log" code-357-by-user) )
Log file sample:
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user joe: code 357 2019-05-14 16:48:56 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user sally: code 357 2019-05-14 16:48:57 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user joe: code 357
118 قسمت
Manage episode 233999238 series 2463849
محتوای ارائه شده توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones. تمام محتوای پادکست شامل قسمتها، گرافیکها و توضیحات پادکست مستقیماً توسط Christoph Neumann and Nate Jones, Christoph Neumann, and Nate Jones یا شریک پلتفرم پادکست آنها آپلود و ارائه میشوند. اگر فکر میکنید شخصی بدون اجازه شما از اثر دارای حق نسخهبرداری شما استفاده میکند، میتوانید روندی که در اینجا شرح داده شده است را دنبال کنید.https://fa.player.fm/legal
Nate is dropped in the middle of a huge log file and hunts for the source of the errors.
- We dig into the world of DonutGram.
- "We are the devops."
- Problem: we start seeing errors in the log.
- "The kind of error that makes total sense to whoever wrote the log line."
- (04:10) We want to use Clojure to characterize the errors.
- Why not grep?
- Well...we grep out the lines, count them, and we get 10,351 times. Is that a lot? Hard to say. We need more info.
- (05:50) Developers think it's safe to ignore, but it's bugging us.
- We want to come up with an incident count by user.
- "Important thing we do a lot in the devops area: we see data and then we have to figure out, what does it mean?"
- "What story is the data telling us?"
- The "forensic data" versus "operating data" (Ep 022, 027)
- "With forensic data, you're trying to save the state of the imprint of the known universe over time."
- (07:40) The log file is the most common forensic data for applications.
- To characterize, we need to manipulate the data at a higher level than grep.
- First goal: make the data useful in Clojure
- Get a nice sample from the production log, put it in a file, load up the REPL, and start writing code to parse it.
- Need to parse out the data from the log file and turn it into something more structured.
- (12:05) Let's make a function,
parse-line
that separates out all the common elements for each line:- timestamp
- log level
- thread name
- package name
- freeform "message"
- "A message has arrived from the developers, lovingly bundled in a long log line."
- (13:10) We can parse every line generically, but we need to parse the message and make sense of that.
- "We want to lift up the unstructured data into structured data that we can map, filter, and reduce on."
- We will have different kinds of log lines, so we need to detect them and parse appropriately.
- We want to amplify the map to include the details for the specific kind of log line.
- We'll use a
kind
field to identify which kind of log line it is. - There are two steps:
- recognize which kind of log line the "message" is for
- parse the data out of that message
- "Maybe we'll have a podcast that's constantly about
constantly
. It's not justjuxt
." - (18:45) How do we do this concisely?
- Let's use
cond
:- flatten out all the cases
- code to detect kind (the condition)
- code to parse the details (the expression)
- Can use
includes?
in the condition to detect the kind andre-matches
to parse. - (20:00) Why not use the regex to detect the kind too?
- Can avoid writing the regex twice by using a
def
. - It's less code, but now we're running the regex twice: once in the condition and once to parse.
- We can't capture the result of the test in
cond
. No "cond-let" macro. - We could write our own
cond-let
macro, but should we? - "When you feel like you should write a macro, you should step back and assess your current state of being. Am I tired? Did I have a bad day? Do I really need this?"
- (24:05) New goals for our problem:
- one regex literal
- only run the regex once
- keep the code that uses the matches close to the regex
- Similar problem to "routes" for web endpoints: want the route definition next to the code that executes using the data for that route.
- Feels like an index or table of options.
- (25:20) Let's make a "table". A vector of regex and handler-code pairs.
- We need a coffee mug: "Clojure. It's just a list."
- The code can be a function literal that takes the matches and returns a map of parsed data.
- Write a function
parse-details
which goes through each regex until one matches and then invokes the handler for that one. (See below.) - (30:15) Once we have higher-level data, it's straight forward to
filter
andgroup-by
to get our user count. - Once again, the goal is to take unstructured data and turn it into structured data.
- "You have just up-leveled unstructured information into a sequence of structured information."
- Can
slurp
,split
,map
,filter
, and then aggregate. - (32:10) What happens when we try to open a 10 GB log file?
- Sounds like a problem for next week.
- "Fixing production is always a problem for right now."
- "If a server falls over and no one outside of the devops team knows about it, did the server really fall over?"
- "Who watches the watchers?"
Related episodes:
Clojure in this episode:
#""
re-matches
case
cond
if-let
->>
slurp
filter
group-by
def
constantly
juxt
clojure.string/
includes?
split
Related links:
Code sample from this episode:
(ns devops.week-01 (:require [clojure.java.io :as io] [clojure.string :as string] )) ;; General parsing (def general-re #"(\d\d\d\d-\d\d-\d\d)\s+(\d\d:\d\d:\d\d)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s+(\S+)\s+\|\s(.*)") (defn parse-line [line] (when-let [[whole dt tm thread-name level ns message] (re-matches general-re line)] {:raw/line whole :log/date dt :log/time tm :log/thread thread-name :log/level level :log/namespace ns :log/message message })) (defn general-parse [lines] (->> lines (map parse-line) (filter some?))) ;; Detailed parsing (def detail-specs [[#"transaction failed while updating user ([^:]+): code 357" (fn [[_whole user]] {:kind :code-357 :code-357/user user})] ]) (defn try-detail-spec [message [re fn]] (when-some [matches (re-matches re message)] (fn matches))) (defn parse-details [entry] (let [{:keys [log/message]} entry] (if-some [extra (->> detail-specs (map (partial try-detail-spec message)) (filter some?) (first))] (merge entry extra) entry))) ;; Log analysis (defn lines [filename] (->> (slurp filename) (string/split-lines))) (defn summarize [filename calc] (->> (lines filename) (general-parse) (map parse-details) (calc))) ;; data summarizing (defn code-357-by-user [entries] (->> entries (filter #(= :code-357 (:kind %))) (map :code-357/user) (frequencies))) (comment (summarize "sample.log" code-357-by-user) )
Log file sample:
2019-05-14 16:48:55 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user joe: code 357 2019-05-14 16:48:56 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user sally: code 357 2019-05-14 16:48:57 | process-Poster | INFO | com.donutgram.poster | transaction failed while updating user joe: code 357
118 قسمت
All episodes
×به Player FM خوش آمدید!
Player FM در سراسر وب را برای یافتن پادکست های با کیفیت اسکن می کند تا همین الان لذت ببرید. این بهترین برنامه ی پادکست است که در اندروید، آیفون و وب کار می کند. ثبت نام کنید تا اشتراک های شما در بین دستگاه های مختلف همگام سازی شود.