Content provided by ELC and The Engineering Leadership Community (ELC). All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by ELC and The Engineering Leadership Community (ELC) or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://fa.player.fm/legal

Reflections on Incidents & Resilience with Nick Rockwell #42

42:07
 

Nick Rockwell, SVP of Engineering & Infrastructure @ Fastly, shares his recent reflections on incidents, resiliency, blamelessness, and accountability. You’ll hear why the heroic model of incident response is unsustainable, how to improve reliability by closing the long feedback loop, plus opportunities to maximize post-mortems for process improvement AND emotional processing.

"We started doing a biweekly meeting. We talk about resilience. We revisit everything that has not been closed, whether it's a year old, or it's a day old, we're forced to keep coming back to it. So how to move away from that incident-based post-mortem to something that's more like a continual revisiting of every thread or pathway that's been opened until they're not even open anymore. So that's the lines I'm thinking along."
ABOUT NICK ROCKWELL

Nick Rockwell is SVP of Engineering & Infrastructure @ Fastly, helping build the next-generation edge infrastructure for a faster, safer, more resilient Internet. Nick was formerly Chief Technology Officer at The New York Times, overseeing product engineering, infrastructure and R&D. Previously he was Chief Technology Officer of Condé Nast, and Digital CTO at MTV Networks. Throughout his career, Nick has worked at the intersection of media and the Internet, building digital products at scale. Nick graduated from Yale in 1990 with a B.A. in Literary Theory.

SHOWNOTES

  • Nick’s story of why incidents, resiliency, accountability & blamelessness are top of mind (2:20)
  • The “heroic model” of incident mitigation and its emotional impact (6:41)
  • Building a resilient system & transitioning away from heroics to a more mechanistic incident management model (12:12)
  • “The long feedback loop” of incidents (15:57)
  • Grappling with the risks of a more process-driven, mechanistic model of incident management (21:27)
  • Dedicated vs. distributed incident response teams & how incident management evolves over time (24:43)
  • Balancing individual accountability and a culture of blamelessness (28:37)
  • Why you need to talk about incidents and process their residual emotions (33:12)
  • On maximizing post-mortems for process improvement & emotional processing (37:01)
  • Takeaways (40:15)

Special thanks to our exclusive accessibility partner Mesmer! Mesmer's AI-bots automate mobile app accessibility testing to ensure your app is always accessible to everybody.

To jump-start your accessibility and inclusion initiative, visit mesmerhq.com/ELC


Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.


244 episodes
