
Troubleshooting a problem

From something's wrong to fixed or escalated.

The case

Something is wrong. You don’t know exactly what, you’re not sure how bad it is, and whatever you do next will either solve it or send you down a path that wastes an hour. Most troubleshooting goes wrong not because people are bad at fixing things, but because they don’t slow down enough at the start to describe what’s actually happening.

The instinct is to start trying things immediately. Restart it. Reinstall it. Google the symptom before you’ve fully articulated the symptom. This works sometimes, by luck, and when it doesn’t you’ve introduced new variables into an already unclear situation.

A troubleshooting routine is really just a decision tree with a consistent entry point. You describe the problem precisely, you try to reproduce it, you check what changed, you search deliberately, you try the best options in order. At each branch, you move forward, loop back, or skip ahead. If you reach the end without a fix, you escalate, but you escalate with something useful rather than a shrug.

The other thing a routine gives you is a stopping condition. Without one, troubleshooting tends to expand to fill whatever time is available. The decision point at step #12 forces the question: is this still a good use of your time, or is this someone else’s problem now?

Troubleshooting a Problem

  1. Write down the steps you took, what you expected to happen, and what actually happened.
  2. Try to reproduce the problem by following your list exactly. If you can't reproduce it, return to @1 — your list isn't complete yet.
  3. Check what changed recently. Software update, new input, configuration change, anything. If nothing changed, skip to @7.
  4. Undo the change. Revert to the previous state or version. If you managed to undo it, go back to @2 to see if the problem still exists.
  5. Isolate the change. Disable, disconnect, or remove only the thing that changed and test again. The goal is to confirm whether that change is the cause before doing anything else. If this fixes it, skip to @13.
  6. Reapply the change. Isolating it didn't fix the problem, so it isn't the culprit. Note that down and continue to @7.
  7. Search for the exact error message or symptom. Copy the error verbatim if there is one. Vague searches return vague results.
  8. Note which fixes appear most often in the results. Frequency is a starting point, not proof.
  9. Check the date on anything you're about to try. A fix from five years ago may no longer apply.
  10. Try the most credible fix. If that fixes it, skip to @13.
  11. Try the next most credible fix. If that fixes it, skip to @13.
  12. Decide: keep trying or escalate. How long have you spent? How critical is this? If escalating, skip to @14. If continuing, go back to @10.
  13. Document what fixed it. One sentence. You will forget, and you will hit this again.
  14. Write up the problem for whoever's taking it on. What you tried, what you ruled out, what you think it might be. Hand it over cleanly.
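
One way to see the routine as a decision tree is to write its jumps down as a table. The sketch below is a minimal illustration, not part of the routine itself: the outcome labels ("reproduced", "not_fixed" and so on) are invented for the example, and it only covers the path where nothing changed recently (steps 1 to 3, then 7 to 14). The change-related branch in steps 4 to 6 is left out to keep it short.

```python
# Sketch only: outcome labels are invented, and steps 4-6 are left out.
ROUTINE = {
    1: {"written": 2},
    2: {"reproduced": 3, "not_reproduced": 1},
    3: {"nothing_changed": 7},
    7: {"searched": 8},
    8: {"noted": 9},
    9: {"checked": 10},
    10: {"fixed": 13, "not_fixed": 11},
    11: {"fixed": 13, "not_fixed": 12},
    12: {"continue": 10, "escalate": 14},
    13: {},  # end: document the fix
    14: {},  # end: write the handoff
}


def walk(outcomes, start=1):
    """Follow the routine from `start`, consuming one outcome per step.

    Returns the list of steps visited. Stops at an end step or when the
    outcomes run out.
    """
    path = [start]
    step = start
    for outcome in outcomes:
        branches = ROUTINE[step]
        if not branches:
            break  # reached step 13 or 14
        if outcome not in branches:
            raise ValueError(f"step {step} has no branch {outcome!r}")
        step = branches[outcome]
        path.append(step)
    return path


if __name__ == "__main__":
    # The first reproduction attempt fails; the first fix fails, the next works.
    print(walk([
        "written", "not_reproduced", "written", "reproduced",
        "nothing_changed", "searched", "noted", "checked",
        "not_fixed", "fixed",
    ]))
    # prints [1, 2, 1, 2, 3, 7, 8, 9, 10, 11, 13]
```

Every path through the table ends at step 13 or step 14, which is the point of the routine: you finish with either a documented fix or a clean handoff.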

Make it yours

The most important step is #13, and it's the one most people skip. Fixing a problem and documenting the fix are not the same task — the second one is what stops you spending forty minutes on the same thing six months from now. One sentence is enough.

Steps #10 and #11 are placeholders for a loop that could run longer. Some problems take ten attempts before something sticks. Add steps as needed, and make the call at #12 about whether to keep going or hand it off.
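
If your fixes can be scripted, the loop through #10, #11 and #12 might look something like the sketch below. It assumes a fixed time budget as the stopping condition, which is a simplification: the real decision at #12 also weighs how critical the problem is. The function name `try_fixes` and the example fixes are hypothetical.

```python
import time


def try_fixes(fixes, budget_seconds, clock=time.monotonic):
    """Try candidate fixes in order until one works or the budget runs out.

    `fixes` is a list of (name, attempt) pairs, most credible first. Each
    `attempt` is a callable that returns True if the problem went away.
    Returns ("fixed", name, tried) or ("escalate", None, tried).
    """
    started = clock()
    tried = []
    for name, attempt in fixes:
        if clock() - started > budget_seconds:
            break  # stand-in for the decision at step 12: stop and hand off
        tried.append(name)
        if attempt():
            return "fixed", name, tried
    return "escalate", None, tried


if __name__ == "__main__":
    # Invented example fixes; real ones would do actual work.
    outcome = try_fixes(
        [("clear the cache", lambda: False),
         ("roll back the driver", lambda: True)],
        budget_seconds=30 * 60,
    )
    print(outcome)
```

The `tried` list comes back in both cases, so it can feed the one-sentence note at #13 or the write-up at #14.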

The escalation path at #14 is worth taking seriously. A clear handoff — what you tried, what you ruled out — is genuinely useful to whoever picks it up. Arriving with "it's broken" is not.
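
If it helps to have a fixed shape for that handoff, here is a minimal sketch of one. The field names and the plain-text layout are assumptions for illustration; the only thing taken from the routine is the content: what you tried, what you ruled out, and what you think it might be.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """A hypothetical container for the write-up in step 14."""
    problem: str
    tried: list = field(default_factory=list)
    ruled_out: list = field(default_factory=list)
    suspicion: str = "unknown"

    def render(self):
        """Return the handoff as plain text, one item per line."""
        lines = [f"Problem: {self.problem}", "Tried:"]
        lines += [f"  - {t}" for t in self.tried] or ["  - nothing yet"]
        lines.append("Ruled out:")
        lines += [f"  - {r}" for r in self.ruled_out] or ["  - nothing yet"]
        lines.append(f"Best guess: {self.suspicion}")
        return "\n".join(lines)


if __name__ == "__main__":
    # Invented example content.
    note = Handoff(
        problem="Export to PDF fails after the second page",
        tried=["restarted the app", "exported a shorter document"],
        ruled_out=["disk space"],
        suspicion="an embedded image on page three",
    )
    print(note.render())
```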

If you're using this for a specific domain (software, hardware, a piece of equipment), customise the middle steps to match your usual diagnostic sequence. The shape stays the same; the specifics are yours to fill in.