Performance Engineering — The Reliability Edition


Can we improve the reliability of a system by applying performance engineering techniques at different stages of the development process?

  • A mechanism to run load against an application or system
  • A way of measuring how it performed
  • A way of comparing the results against what we believe is the ideal state
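
The three pieces above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `target` is a hypothetical stand-in for a request to the system under test, and the 50 ms p95 limit is an arbitrary example of an "ideal state", not a figure from this article.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def target():
    """Hypothetical stand-in for one request to the system under test."""
    time.sleep(0.001)


def timed_call(fn):
    """Return how long a single call to fn takes, in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def run_load(fn, requests=50, concurrency=5):
    """1. Run load: issue `requests` calls spread over `concurrency` threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda _: timed_call(fn), range(requests)))


def summarize(latencies):
    """2. Measure: reduce the raw timings to a few summary statistics."""
    cuts = statistics.quantiles(latencies, n=100)  # 99 percentile cut points
    return {
        "mean": statistics.mean(latencies),
        "p95": cuts[94],  # index 94 is the 95th percentile
        "max": max(latencies),
    }


def meets_target(summary, p95_limit_s):
    """3. Compare: check the measured p95 against the assumed ideal state."""
    return summary["p95"] <= p95_limit_s


if __name__ == "__main__":
    summary = summarize(run_load(target))
    verdict = "OK" if meets_target(summary, p95_limit_s=0.050) else "TOO SLOW"
    print(summary, verdict)
```

A real load tool adds ramp-up, think time and distributed generators, but the same three responsibilities remain.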

In a Nutshell

Each performance test execution and analysis should be guided by the Engineering Efficiency, DevOps and Reliability principles that apply to software development.

  • Reliability Engineering (RE) attempts to predict and prevent the risk of failure, whether of a single component or of an entire system of services
  • Performance Engineering (PE) states that we should start earlier in the SDLC to get faster feedback, but it also extends into Operations and Support, using real-world data to build and update the performance models (scripts and analysis)
  • Performance Testing (PT) is all about determining what the performance of an application is (baselining), or comparing it to how you believe it should be (delta analysis), under various conditions and situations in the ‘test’ environment
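
To make the baselining and delta analysis terms concrete, here is a minimal sketch. The function names and the 10% tolerance are assumptions chosen for illustration; they are not part of any particular tool.

```python
import statistics


def baseline(latencies_ms):
    """Baselining: record what the performance currently is."""
    cuts = statistics.quantiles(latencies_ms, n=100)  # 99 percentile cut points
    return {"p50": statistics.median(latencies_ms), "p95": cuts[94]}


def delta(baseline_stats, current_stats):
    """Delta analysis: percentage change of each statistic versus the baseline."""
    return {
        name: (current_stats[name] - base) / base * 100.0
        for name, base in baseline_stats.items()
    }


def regressions(deltas, tolerance_pct=10.0):
    """Statistics that got slower by more than the assumed tolerance."""
    return sorted(name for name, pct in deltas.items() if pct > tolerance_pct)


if __name__ == "__main__":
    # Illustrative numbers only: a stored baseline and a newer test run.
    stored = {"p50": 100.0, "p95": 200.0}
    latest = {"p50": 105.0, "p95": 260.0}
    changes = delta(stored, latest)
    print(changes, regressions(changes))
```

With these example numbers the median moves by about 5% and stays inside the tolerance, while the p95 moves by about 30% and is flagged.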

A Look at Performance Engineering

PE looks to incorporate the methodologies of ‘Agile’ and use them in conjunction with ‘DevOps’ ideals to provide an improved approach that adds value rather than one that hinders delivery velocity.

The Performance Engineering Model

PE is all about applying processes and strategies at each step of the SDLC. The following are example actions/options that can be applied within each vertical.

Traditional Performance Testing

This is quite often done within the “test phase” and entails a big-bang approach that uses many pods/VMs to generate load against an application/system.

Shift Left Approach

A “Shift Left” approach shortens the SDLC feedback loop so that potential system and environment issues are uncovered and rectified early.
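
One way to shift left is to give individual units of code a time budget that is checked alongside the ordinary unit tests. The sketch below assumes a hypothetical `build_report` function and a deliberately generous 0.5 s budget; both are placeholders, and real budgets would come from your own measurements.

```python
import time


def build_report(n):
    """Hypothetical unit of application code whose speed we care about."""
    return sorted(str(i) for i in range(n))


def best_of(fn, repeats=5):
    """Best wall-clock time over several runs, to damp scheduler noise."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def within_budget(fn, budget_s, repeats=5):
    """True if the fastest observed run of fn fits inside the budget."""
    return best_of(fn, repeats) <= budget_s


def test_build_report_within_budget():
    # Written as a plain test function so a runner such as pytest can collect it.
    assert within_budget(lambda: build_report(10_000), budget_s=0.5)
```

Timing checks on shared build agents are noisy, so taking the best of several runs and keeping the budget loose helps avoid false alarms. The aim is to catch order-of-magnitude regressions early, not to replace a full load test.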

Move Right Approach

A “Move Right” approach extends testing to include user feedback and metrics from your production environment. These can then be used to update the performance model.
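
As a sketch of how production data might feed back into the performance model, the example below derives a traffic mix from observed requests and turns it into per-endpoint rates for the next test run. The endpoint names and the 200 requests-per-second target are made up for illustration.

```python
from collections import Counter


def workload_mix(production_requests):
    """Share of traffic per endpoint, derived from observed production requests."""
    counts = Counter(production_requests)
    total = sum(counts.values())
    return {endpoint: count / total for endpoint, count in counts.items()}


def plan_load(mix, total_rps):
    """Translate the mix into per-endpoint request rates for the next test run."""
    return {endpoint: share * total_rps for endpoint, share in mix.items()}


if __name__ == "__main__":
    # Hypothetical sample of endpoints taken from production access logs.
    observed = ["/search"] * 6 + ["/cart"] * 3 + ["/checkout"] * 1
    mix = workload_mix(observed)
    print(mix)
    print(plan_load(mix, total_rps=200))
```

Re-deriving the mix on a regular schedule keeps the test scripts aligned with how users actually behave, rather than with how the system was expected to be used when the scripts were first written.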

Measurements and Observability

Performance metrics from each environment (Dev/Test/Prod) are used to determine whether the system is operating within its SLO limits. Examples include:

  • API / UI response times
  • DB transaction times
  • Pod / VM scaling events
  • CPU use / Network activity / Memory usage
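
A check of measured metrics against SLO limits could look like the sketch below. The metric names, limits and per-environment readings are all hypothetical; in practice they would come from your monitoring stack and your agreed SLOs.

```python
# Assumed SLO limits (upper bounds); names and values are illustrative only.
SLO_LIMITS = {
    "api_p95_ms": 300.0,
    "db_txn_p95_ms": 50.0,
    "cpu_utilisation_pct": 80.0,
}


def slo_breaches(measured, slo_limits):
    """Return {metric: (measured value, limit)} for each metric above its limit."""
    return {
        name: (measured[name], limit)
        for name, limit in slo_limits.items()
        if name in measured and measured[name] > limit
    }


def check_environments(metrics_by_env, slo_limits):
    """Apply the same SLO limits to every environment's metrics."""
    return {
        env: slo_breaches(measured, slo_limits)
        for env, measured in metrics_by_env.items()
    }


if __name__ == "__main__":
    # Hypothetical readings per environment.
    readings = {
        "dev": {"api_p95_ms": 120.0, "db_txn_p95_ms": 20.0, "cpu_utilisation_pct": 35.0},
        "test": {"api_p95_ms": 340.0, "db_txn_p95_ms": 45.0, "cpu_utilisation_pct": 70.0},
        "prod": {"api_p95_ms": 280.0, "db_txn_p95_ms": 65.0, "cpu_utilisation_pct": 85.0},
    }
    for env, breaches in check_environments(readings, SLO_LIMITS).items():
        print(env, breaches or "within SLO limits")
```

Using the same limits in every environment makes it easier to see where a breach first appears as a change moves from Dev through Test to Prod.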




Scott Griffiths

Engineer, Consultant, Trainer, Learner, SRE, DevOps and Hiker