Root cause analysis
Root cause analysis (RCA) is a structured technique used to identify the underlying reasons a problem, defect, or unexpected outcome occurred. It focuses on causes rather than symptoms to prevent recurrence and improve performance.
Key Points
- RCA looks beyond immediate symptoms to find underlying, systemic causes.
- Common tools include 5 Whys, fishbone (Ishikawa) diagrams, Pareto analysis, and fault tree analysis.
- It is collaborative and evidence-driven, using data and stakeholder input.
- Results inform corrective and preventive actions to stop recurrence.
- It applies across domains to defects, delays, cost variances, risks, and incidents.
- It often triggers updates to plans, processes, and the lessons learned repository.
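Of the tools listed above, Pareto analysis is the most direct to express in code: count how often each suspected cause appears, sort the causes from most to least frequent, and see which few account for most occurrences. The following is a minimal sketch, not a prescribed method. The function name `pareto`, the 80% threshold, and the defect log are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def pareto(causes, threshold=0.8):
    """Rank causes by frequency and pick the 'vital few'.

    Returns (rows, vital_few), where rows holds
    (cause, count, cumulative_share) tuples in descending order.
    """
    counts = Counter(causes)
    total = sum(counts.values())
    rows, vital_few, cumulative = [], [], 0
    for cause, count in counts.most_common():
        # Keep adding causes until the running share reaches the threshold.
        if cumulative / total < threshold:
            vital_few.append(cause)
        cumulative += count
        rows.append((cause, count, cumulative / total))
    return rows, vital_few


# Hypothetical defect log: one entry per defect, tagged with a suspected cause.
defect_log = (
    ["unclear requirements"] * 12
    + ["no review checklist"] * 5
    + ["reviewer overallocated"] * 2
    + ["tooling outage"] * 1
)

rows, vital_few = pareto(defect_log)
for cause, count, share in rows:
    print(f"{cause:<24} {count:>3} {share:>6.0%}")
print("Vital few:", vital_few)
```

With this made-up log, two of the four causes cover 17 of 20 defects, so they would be the first candidates for deeper analysis. The frequency ranking only shows where to look; it does not by itself confirm a root cause.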
Purpose of Analysis
- Prevent repeat issues by addressing real causes rather than treating symptoms.
- Improve quality, reliability, and flow of work across the system.
- Reduce waste and cost associated with rework, delays, and defects.
- Enable informed decision-making on corrective actions and risk responses.
- Strengthen organizational learning through documented insights.
Method Steps
- Define the problem clearly: what happened, where, when, and with what impact.
- Gather evidence: data, logs, metrics, observations, and stakeholder input.
- Map the process or workflow to see where the issue manifests.
- Identify possible causes using brainstorming and a cause-and-effect (fishbone) diagram.
- Drill down with 5 Whys (or similar) to trace symptoms to deeper causes.
- Analyze data to validate suspected causes; look for patterns and correlations.
- Confirm root causes with the team and, if possible, test or replicate findings.
- Develop and prioritize corrective and preventive actions addressing root causes.
- Implement actions, assign owners and due dates, and track effectiveness.
- Document results and update lessons learned and relevant plans.
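The 5 Whys drill-down in the steps above can be recorded as a simple chain in which each answer becomes the subject of the next "why". The sketch below is one possible way to keep such a record. The class names `WhyStep` and `FiveWhys`, the `evidence` field, and the sample answers are hypothetical. The point it illustrates is that an answer without evidence is still only a candidate cause.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WhyStep:
    question: str
    answer: str
    evidence: str = ""  # empty string means the answer is not yet validated


@dataclass
class FiveWhys:
    problem: str
    steps: List[WhyStep] = field(default_factory=list)

    def ask(self, answer: str, evidence: str = "") -> "FiveWhys":
        # The next "why" is asked about the previous answer (or the problem).
        prior = self.steps[-1].answer if self.steps else self.problem
        self.steps.append(WhyStep(f"Why? ({prior})", answer, evidence))
        return self

    def candidate_root_cause(self) -> Optional[str]:
        # The deepest answer reached so far; a candidate until validated.
        return self.steps[-1].answer if self.steps else None

    def unverified(self) -> List[str]:
        # Answers that still need supporting data.
        return [s.answer for s in self.steps if not s.evidence]


# Hypothetical chain for a late-handoff problem.
chain = FiveWhys("Design-to-development handoffs are late")
chain.ask("Designs are reworked several times", evidence="issue log")
chain.ask("Requirements are unclear at review", evidence="review notes")
chain.ask("Acceptance criteria are vague")  # no evidence gathered yet

print("Candidate root cause:", chain.candidate_root_cause())
print("Still to validate:", chain.unverified())
```

Keeping the evidence next to each answer makes the later validation step explicit: the chain is not finished until `unverified()` returns an empty list.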
Inputs Needed
- Clear problem statement and acceptance/definition-of-done criteria.
- Performance data: metrics, logs, defect reports, incident tickets, and trend charts.
- Process artifacts: process maps, SOPs, checklists, and work instructions.
- Stakeholder insights: interviews, observations, and team feedback.
- Project documents: risk register, issue log, change log, and assumptions/constraints.
- Historical information and lessons learned from similar work.
Outputs Produced
- Validated root cause statements and supporting evidence.
- Cause-and-effect diagrams, 5 Whys records, and analysis notes.
- Recommended corrective and preventive actions with owners and timelines.
- Change requests or updates to plans, processes, and checklists.
- Updates to the risk register, issue log, and lessons learned repository.
- Follow-up measures and metrics to verify effectiveness.
Interpretation Tips
- Differentiate between contributing factors and true root causes; a problem may have more than one root cause.
- Validate causes with data; avoid relying solely on opinions or anecdotes.
- Look for systemic issues in process, tools, environment, and governance, not just people.
- Ensure identified causes are actionable and within the team’s influence; escalate those that are not.
- Test the logic: if the cause were removed, would the problem be unlikely to recur?
- Reassess after actions; lack of improvement may indicate missed or deeper causes.
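The reassessment in the last tip can be as simple as comparing the same metric before and after the corrective actions. The sketch below assumes a lower-is-better metric, such as late handoffs per sprint. The function name `improvement`, the 25% minimum drop, and the sample numbers are hypothetical; a real team would choose its own metric and threshold.

```python
from statistics import mean


def improvement(before, after, min_relative_drop=0.25):
    """Compare a lower-is-better metric before and after corrective actions."""
    if not before or not after:
        raise ValueError("need at least one observation in each period")
    base, current = mean(before), mean(after)
    # Relative drop of the average; zero if there was nothing to reduce.
    drop = (base - current) / base if base else 0.0
    return {
        "before": base,
        "after": current,
        "relative_drop": drop,
        "effective": drop >= min_relative_drop,
    }


# Hypothetical counts of late handoffs per sprint.
late_before = [4, 5, 3, 4]
late_after = [1, 0, 1, 2]

result = improvement(late_before, late_after)
print(result)
```

If `effective` comes back false, that is the signal described above: the actions may have addressed a contributing factor rather than the root cause. A comparison of averages over a few sprints is only a rough check and does not rule out coincidence.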
Example
A project experiences repeated late handoffs between design and development. Initial fixes (adding reminders) do not help. The team conducts RCA.
They create a fishbone diagram and apply 5 Whys. Evidence shows frequent rework due to unclear requirements and overallocated reviewers. Root causes include vague acceptance criteria, no standard review checklist, and conflicting resource assignments.
Actions: define acceptance criteria templates, introduce a review checklist, adjust resource allocation, and add a WIP limit. Subsequent sprints show on-time handoffs with fewer defects.
Pitfalls
- Jumping to solutions before fully understanding the problem.
- Stopping at the first apparent cause and not probing deeper.
- Blaming individuals instead of examining processes and systems.
- Ignoring data that contradicts assumptions (confirmation bias).
- Conducting RCA without the right stakeholders or process owners.
- Failing to verify effectiveness of actions and capture lessons learned.
PMP Example Question
A team fixes the same defect type across several iterations, but it keeps returning. What should the project manager do next?
A. Increase testing effort and add more testers to catch defects earlier.
B. Conduct a root cause analysis with stakeholders using tools like 5 Whys and a fishbone diagram.
C. Escalate to the sponsor to request additional budget for rework.
D. Retrain the developer responsible for the most recent defect.
Correct Answer: B — Conduct a root cause analysis with stakeholders using tools like 5 Whys and a fishbone diagram.
Explanation: RCA targets underlying causes to prevent recurrence, which is more effective than adding tests, escalating, or retraining without evidence of the true cause.