APM Will Rock for You
Application performance management (APM) has historically been an afterthought. Why? We have application testing, DevOps, development, infrastructure, networking, and scrum teams that all have different roles and responsibilities. Typically, no single team has an end-to-end view of the IT landscape from a performance perspective; various teams are responsible for monitoring HTTP requests, dependency calls to external services or databases, network latency, and system resource utilization. Having these teams collaborate to identify the top KPIs across the IT landscape allows us to link accurate metrics to performance or scalability issues. For example, does end-user response time grow linearly as the number of users increases, or does it shoot up like a hockey stick? APM is necessary and should be used consistently across our Test and Production environments.
Here are three reasons why:
1. Process before tools. “If you quit on the process, you are quitting on the result” (Idowu Koyenikan). Frequently, as IT professionals, we get fixated on searching for, finding, and implementing the newest tool because we believe it has EVERYTHING we are looking for; we expect tools to fix our process. May I suggest, however, that we use our expertise to solidify our processes before trying tools and shortcuts. Then we can influence and partner with IT stakeholders and end-users to further refine those processes. For example, an IT landscape diagram digitizes tribal knowledge and allows our key stakeholders to better understand and validate the environment. Other process examples are end-user workflow, application architecture, integration, performance testing, and unit testing.
2. KPIs are your friend. Now we can assign KPIs to the objects/components in our process diagrams. Peter Drucker has said, “If you can’t measure it, you can’t improve it.” Leading development and infrastructure teams, I have learned many times that his statement is true, yet measurement is one of the toughest conversations to have in technology. When you ask an IT professional which KPIs they prioritize, the classic response is “There are hundreds” or “It depends.” That response is vague, and we must be results-oriented. It has worked for me to frame the question this way: “Which top KPIs, if a threshold is exceeded, would you investigate immediately?” Although some KPI thresholds are not known up front, defining baselines and deviations during testing is helpful for production monitoring. Identifying the most relevant KPIs can be tedious, so categorize them:
• Performance - transaction throughput, application response time, latency, query performance
• Resource Utilization - CPU, memory, disk
• Operational - error code logging (infrastructure and application)
• Availability - up/down monitoring of system and critical services or processes
This information will help us with capacity planning, performance-minded coding practices, and understanding our other APM strengths and weaknesses. Every company has a different culture and the conversations will vary in difficulty, but be persistent because these conversations are essential to our team members, technology vendors, and end-users.
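To make the idea of baselines and deviations concrete, here is a minimal Python sketch. It assumes a handful of baseline samples per KPI captured during testing and flags a production reading that sits well above them. The KPI names, the sample values, and the three-sigma rule are all illustrative assumptions, not a recommendation for any particular APM product.

```python
from statistics import mean, stdev

# Hypothetical baseline samples captured during a test run.
# KPI names and values are illustrative only, not real measurements.
BASELINES = {
    "response_time_ms": [210, 198, 225, 205, 215],
    "cpu_percent": [41, 39, 44, 40, 42],
}


def is_deviation(kpi, value, baselines=BASELINES, max_sigma=3.0):
    """Return True if value exceeds the baseline mean by more than
    max_sigma sample standard deviations (an assumed rule of thumb)."""
    samples = baselines[kpi]
    threshold = mean(samples) + max_sigma * stdev(samples)
    return value > threshold


if __name__ == "__main__":
    # A production reading close to the baseline, then one far above it.
    for reading in (230, 400):
        flagged = is_deviation("response_time_ms", reading)
        print(f"response_time_ms={reading} investigate={flagged}")
```

A check like this answers the "would you investigate immediately?" question mechanically: the team agrees on the KPI and the tolerance once, and the comparison runs on every reading.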
3. Automate. Automate. Automate. Once we have our methodology and procedures, finding efficient tools to meet our requirements will be easier. Plus, we will have more buy-in from our key stakeholders after getting their input. The tooling and monitoring should be so easy for developers and other technical teams to use that deviating from them is actually the harder path. Key capabilities to consider are an at-a-glance dashboard, correlated KPI metrics, graphical trends, “drill-down” triage and diagnosis, synthetic transactions, and proactive alerting. Design and implement “APM Will Rock for You” training sessions, and we will have a bunch of ninjas who can detect scalability and performance issues at the speed of light (well, maybe not that fast).
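As a rough illustration of two of those capabilities, synthetic transactions and proactive alerting, the Python sketch below times a scripted transaction and raises an alert when it fails or runs slower than an agreed threshold. The function name `run_synthetic_transaction`, the `alert` callback, and the stand-in login step are hypothetical; a real setup would call the actual application and route alerts to the team's paging or chat tool.

```python
import time


def run_synthetic_transaction(transaction, threshold_s, alert):
    """Run one scripted transaction, time it, and call alert(message)
    if it raises an exception or takes longer than threshold_s seconds."""
    start = time.perf_counter()
    try:
        transaction()
        ok = True
    except Exception as exc:  # any failure of the scripted step counts
        ok = False
        alert(f"synthetic transaction failed: {exc}")
    elapsed = time.perf_counter() - start
    if ok and elapsed > threshold_s:
        alert(f"synthetic transaction slow: {elapsed:.3f}s > {threshold_s}s")
    return ok, elapsed


if __name__ == "__main__":
    def fake_login():
        # Stand-in for a real end-user workflow step (assumption).
        time.sleep(0.05)

    # Threshold is deliberately lower than the simulated delay,
    # so this run reports a "slow" alert via print.
    run_synthetic_transaction(fake_login, threshold_s=0.01, alert=print)
```

Scheduling a check like this at a fixed interval means a slowdown can be noticed before end-users report it.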
This is a tough nut to crack that requires determination and patience, yet the result will help you meet end-user performance expectations consistently while giving your company a competitive advantage.