Usability testing is straightforward: give people realistic tasks and watch what happens. Then fix what hurts, and watch your profits grow from a better design that meets your business goals. We’re in the insights business, and user testing is the best way to gain these insights.
Don’t get intimidated by the 12 steps discussed in this article. At heart, usability testing has two components: (1) watch users while they (2) perform tasks. Everything else is padding to maximize your ROI from these two. (GPT Image-1)
The difference between opinion soup and a study that yields durable, defensible decisions lies in the process. A systematic sequence gives you three compounding advantages.
Do not fish for design insights in the opinion soup. Run a study to catch some real fish. (GPT Image-1)
First, it controls bias. Clear goals, stable tasks, consistent facilitation, and a predeclared analysis plan reduce the all‑too‑human tendencies to see what you expect, over-help participants, or cherry-pick clips that support pet solutions.
Second, a sequence of defined steps creates traceability. When stakeholders ask, “Why should we change this flow?” you can trace the recommendation back through severity, frequency, and impact to the tasks and scenarios, and finally to the business objective the study was designed to address. That chain only exists if you start with a precise problem statement and end with an action plan and follow-through.
Third, predefined steps protect velocity. Pilot sessions catch brittle scripts and flaky tech before your calendar is on fire. Prebuilt note templates, observer debriefs, and a prioritization rubric speed analysis and decision‑making. Practices drawn from both industry and academia let you summarize the “how usable” story quickly without hand-waving.
Prevent surprises: conduct frequent user research. (GPT Image-1)
Finally, a sequence turns testing from a one-off ceremony into a habit. After reporting study findings, you still have the last two steps of my process: ensure changes are implemented and schedule the next research iteration. Teams that make user contact routine (weekly or per sprint) and manage research operations intentionally ship better experiences, faster. Continuous discovery and ResearchOps are not fads; they are the scaffolding that keeps good research happening as products evolve.
In short: if you want user insights that withstand scrutiny and drive real product change, treat usability testing like any other engineering discipline with structured inputs, controlled execution, and explicit outputs. Follow the 12 steps I outline in this article for the best results. Once you have completed multiple projects according to the process, many of the steps can be abbreviated, partly by reusing materials from previous rounds.
Here’s an overview of the 12 steps:
My whole process contains 12 steps: this infographic summarizes the most important ones (steps 2, 4, 5, 8, and 10). It really is an easy-enough progression. (GPT Image-1)
Step 1. Define the Problem, the Context, and Existing Data
Defining the problem is the most critical step in usability testing. Think of it as laying the foundation for a house: skip this step, and everything else becomes a wobbly tower of wasted time. Without a clear, evidence-backed problem statement, your study risks solving phantom problems or chasing low-priority concerns. This leads to wasted resources, irrelevant findings, and researchers reaching for resume-polishing websites.
A well-defined problem statement serves as your project’s North Star, ensuring every decision aligns with a single purpose. It prevents teams from prescribing solutions before diagnosing diseases (AKA, “solutionitis”). An effective problem statement isn’t speculation or wishful thinking; it’s a precise articulation of actual user struggles, grounded in genuine data.
This process also performs organizational therapy by forcing departments to share their data treasures. When UX researchers synthesize insights from support tickets, analytics dashboards, and customer feedback, they create a unified view of the user experience. Teams that struggle to access or share data reveal deeper dysfunction. Successfully completing this step often sparks a shared understanding of users across the entire organization, like a company-wide “aha” moment.
First, synthesize all existing data before commissioning new research. This prevents redundant work and ensures you’re addressing real knowledge gaps rather than rehashing known issues. Key sources include:
Second, craft a user-centric problem statement using your synthesized data. Strong problem statements must be:
Use frameworks like the Five Ws: Who is affected? What is the problem? Where does it occur? When does it occur? Why is it important?
Be a wise owl and use the 5 Ws as one possible structuring framework when defining your research problem statement. (GPT Image-1)
This problem-definition stage is complete when:
Remember: proper problem definition prevents pathetic performance.
Step 2. Define Clear Research Goals, Objectives, and Scope
If the problem statement (from Step 1) tells us “why” we’re researching, then goals, objectives, and scope reveal the “what” and “how.” This step transforms fuzzy, user-centric problems into concrete, manageable research projects. Without clear goals, studies drift into unfocused fishing expeditions that catch interesting but inedible insights.
Well-defined objectives provide specific questions research must answer, ensuring collected data actually addresses the problem rather than wandering into fascinating but futile territories.
Equally critical is defining project scope. Scope creep (that sneaky, silent study saboteur) remains a primary project killer. It springs from stakeholder ambiguity, poor communication, or “just one more thing” syndrome, where tiny requests gradually transform your trim study into a bloated behemoth. A formal scope document acts as a contract between researchers and stakeholders, establishing boundaries about what will and won’t be tested. This clarity prevents misunderstandings, manages expectations, and protects your timeline from well-meaning meddlers.
It’s easy to fall victim to scope creep, because each extra question seems innocent enough. However, the cumulative effect is to divert your research from the main goal. (GPT Image-1)
Defining scope transcends procedural paperwork; it becomes a powerful prioritization tool. When stakeholders review scope documents, they confront trade-offs inherent in resource-limited projects. The document transforms wish lists into priority rankings. When someone suggests adding research questions, you can respond strategically: “Excellent idea! To accommodate it within our timeline, we’ll need to remove another objective. Which existing goal should go?” This elevates researchers from order-takers to strategic partners, facilitating focused decisions about resource allocation.
First, translate your problem statement into clear research goals. These broad statements describe what your study aims to learn. Keep it to three to five goals to maintain focus. Goals must remain neutral, avoiding language that assumes outcomes.
Second, develop specific research questions for each goal. These granular, observable questions bridge high-level goals and actual test tasks. For example, if one goal concerns newsletter signup, questions might include:
Third, establish and document project scope. Create a formal document that stakeholders review and approve, explicitly stating:
This stage succeeds when you’ve produced:
Remember: scope documents save sanity, schedules, and stakeholder relationships.
Step 3. Choose Appropriate Usability Testing Methods, Tools, and Metrics
Selecting research methods, tools, and metrics is like choosing the right utensils for dinner. You wouldn’t eat soup with a fork (unless you enjoy frustration), and you shouldn’t use A/B tests to understand emotional responses. There’s no universal “best” method; the right choice depends on your research goals, product stage, and available resources.
Choosing quantitative methods when you need the “why” behind behavior yields statistically significant but contextually starved results. Using qualitative interviews to measure task success rates across hundreds of users is like counting grains of sand with tweezers: technically possible but terrifically tedious. Wrong methods mean wrong answers, wasted work, and wondering why your research flopped.
Tools matter too. Using basic video conferencing for moderated testing without proper recording capabilities turns your study into a memory test for researchers. Without predefined metrics, analysis becomes subjective storytelling rather than systematic science. Setting specific targets (like completion rates or usability scores) transforms fuzzy feelings into factual findings that even spreadsheet-loving executives appreciate.
First, select your testing method by considering three key dimensions:
Qualitative vs. Quantitative: This fundamental dichotomy determines everything.
Moderated vs. Unmoderated:
Remote vs. In-person:
Remote studies enable you to include a much broader user population than if you were testing locally in a single facility. (GPT Image-1)
Second, choose tools that actually work for your method. Evaluate platforms based on:
Third, define metrics covering both performance and perception:
Behavioral Metrics (what users do):
Attitudinal Metrics (what users feel):
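As one concrete example on the attitudinal side, here is a minimal scoring sketch for the System Usability Scale (SUS), a widely used standardized questionnaire. This article doesn’t prescribe a specific instrument, so treat SUS as an assumed example rather than the required metric:

```python
def sus_score(responses):
    """Compute a System Usability Scale score from ten 1-5 ratings.

    Odd-numbered items are positively worded (contribution = rating - 1);
    even-numbered items are negatively worded (contribution = 5 - rating).
    The summed contributions are scaled by 2.5 to yield a 0-100 score.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs exactly ten ratings on a 1-5 scale")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0 = item 1 (an odd item)
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5

# Example: one participant's ratings for items 1-10
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # -> 85.0
```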
This stage succeeds when:
Remember: methods maketh the study. Choose wisely, and your research will reward you with revelations rather than regrets.
There is no single user research method that’s perfect for everything. We have many choices, and even though making these choices may seem complicated, don’t despair: there should always be another study later, where you can use a different methodology and approach the problem from a new angle. (GPT Image-1)
Step 4. Identify and Recruit Representative Test Participants
The validity of your usability study hinges entirely on who participates. Testing with the wrong people is like asking vegetarians to review a steakhouse: you’ll get feedback, but it won’t be particularly useful. If participants don’t represent your actual audience, the data collected ranges from misleading to completely invalid. Test a complex financial analysis tool with novice investors, and you’ll discover “problems” that are really just knowledge gaps. The resulting product dumbing-down will frustrate your actual users (expert analysts who know their derivatives from their dividends).
Rigorous recruitment requires creating detailed personas, crafting clever screening questions, and selecting suitable sourcing channels. Relying solely on your power users provides feedback as balanced as a unicycle. Poorly worded screeners attract professional study participants rather than genuine users, people who’ve perfected the art of qualifying for questionnaires.
Unfortunately, the growth of research panels has brought an increasing number of “professional participants” who sign up for as many studies as possible to cash in on the incentives. A good screener should attempt to weed out these people. (GPT Image-1)
Sample size isn’t arbitrary either. Qualitative studies seeking deep insights can strike gold with just five participants, uncovering about 85% of common usability issues. Quantitative studies chasing statistical significance need 40 or more participants to ensure results aren’t just random noise masquerading as meaningful metrics.
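The “about 85%” figure follows from a simple discovery model: if each participant has probability L of exposing any given problem (L ≈ 31%, the average observed in the Nielsen–Landauer studies behind this guideline), the expected share of problems found by n users is 1 − (1 − L)ⁿ. A quick sketch:

```python
# Expected share of usability problems found with n participants,
# using the curve found(n) = 1 - (1 - L)**n, where L is the probability
# that a single user exposes a given problem (L ≈ 0.31 on average).
L = 0.31

def problems_found(n: int, l: float = L) -> float:
    return 1 - (1 - l) ** n

for n in (1, 3, 5, 10, 15):
    print(f"{n:2d} users -> {problems_found(n):.0%} of problems")
# 5 users -> ~84%, matching the "about 85%" figure cited above
```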
First, define your target audience beyond basic demographics. Create recruitment personas including:
Of these characteristics, user behavior is overwhelmingly the most important in defining your study participants.
Transform these characteristics into 1–3 concise recruitment personas serving as participant profiles.
Second, write screening questions that separate suitable subjects from survey surfing specialists:
Ask elimination questions first to quickly disqualify unsuitable candidates. Avoid yes/no questions that participants can easily game. Instead of “Do you use our banking app?” ask “Which banking apps have you used this month?” and include yours among competitors plus “None of the above.”
Prioritize behavioral questions over attitudinal ones. “How many times did you order food online last week?” beats “How often do you plan to order food online?” every time. Include open-ended questions like “Describe your vacation planning process” to identify articulate participants who provide thoughtful feedback.
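If your screener runs through an online form, the filtering logic can be automated. Here is a minimal sketch, with entirely hypothetical product names, field names, and thresholds, of the pattern described above: eliminate early, prefer behavioral questions, and use the open-ended answer as a crude articulateness check:

```python
# Hypothetical screener logic: eliminate early, never reveal the "right" answer.
def screen(candidate: dict) -> bool:
    # Elimination first: a behavioral question whose options include competitors
    apps_used = set(candidate.get("banking_apps_this_month", []))
    if "OurBank" not in apps_used:  # hypothetical product name
        return False
    # Behavioral frequency beats stated intent
    if candidate.get("online_orders_last_week", 0) < 1:
        return False
    # Open-ended answer should show some articulateness (crude proxy: length)
    if len(candidate.get("vacation_planning_description", "").split()) < 20:
        return False
    return True
```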
Third, choose recruitment channels matching your audience and budget:
Guerrilla studies grab people where they are. You can only run short sessions this way, and the sampling will be rather arbitrary, but it’s fast, easy, and cheap. Intercepting employees in the company cafeteria is particularly useful for intranet usability studies. (GPT Image-1)
Fourth, determine appropriate sample sizes:
This stage succeeds when:
Remember: recruiting the right participants prevents research regret.
Step 5. Develop the Usability Test Plan, Scenarios, Tasks, and Script
The usability test plan is your study’s supreme scripture, the single source of truth detailing every aspect from goals to logistics. Creating it isn’t paperwork purgatory; it’s a critical act of synthesis and alignment. A comprehensive test plan forces teams to think through every detail in advance, ensuring consistent, efficient, and rigorous research. It provides a roadmap for researchers and sets stakeholder expectations about purpose, methodology, and outcomes. Without a formal plan, studies drift into improvisation island, where inconsistency reigns and crucial data disappears.
Developing realistic scenarios and goal-oriented tasks forms the heart of your test plan. This transforms abstract research questions into concrete actions participants actually perform.
Task quality directly determines insight quality. Poorly designed tasks (too vague, too prescriptive, or disconnected from reality) produce artificial behavior and misleading data. Well-crafted scenarios provide just enough context to make tasks relatable without revealing solutions. This focus on realistic goals rather than feature instructions allows researchers to observe natural behavior and uncover genuine usability issues.
The moderator script ensures every session runs like a well-rehearsed recipe. This consistency is crucial for reliable findings, minimizing facilitator variation and ensuring all participants receive identical instructions. The script includes more than just tasks: it covers rapport-building introductions, think-aloud instructions, questions, and debriefing prompts. A solid script guarantees consistency while helping moderators maintain neutrality and systematically cover all objectives.
First, create your comprehensive usability test plan, consolidating all previous decisions. Essential components include:
Second, develop realistic scenarios and goal-oriented tasks. This is where rubber meets road:
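As a concrete illustration of the difference between a goal-based scenario and a feature instruction, here is a hypothetical example (the product, dates, and success criterion are invented for illustration):

```python
# A goal-based scenario gives context; the task states a goal, not a UI path.
scenario = (
    "You're planning a weekend trip and want to stay within a $150/night budget."
)
task = {
    "instruction": "Find a place to stay that fits your budget for March 14-16.",
    "success": "Participant reaches a booking page for a listing <= $150/night",
    "avoid": "Naming UI features ('use the filter sidebar') in the instruction",
}
```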
Third, write your complete moderator script guiding facilitators from start to finish:
This planning phase is complete when:
Remember: proper planning prevents particularly poor performance. Your test plan transforms good intentions into great insights.
Step 6. Prepare the Test Environment and Essential Materials
Meticulous preparation of your test environment and materials is fundamental for smooth, professional, and effective usability studies. This step controls variables and eliminates potential failure points before participants arrive or log on. Poor preparation compromises data quality directly. In-person tests suffer when recording equipment glitches, outside noise intrudes, or consent forms go missing. These distractions break session flow and corrupt data collection. Remote tests fail when internet connections falter, software refuses to cooperate, or setup instructions confuse participants. Sessions stall before they even start, wasting time and testing everyone’s patience.
Beyond technical troubles, well-prepared environments foster professionalism and participant peace of mind. When participants enter organized spaces or join sessions with calm, prepared facilitators, trust builds and honest feedback flows freely. Chaotic settings create anxiety and suggest their time isn’t treasured, potentially poisoning their responses.
Assembling materials in advance ensures consistency and efficiency. Having scripts, note-taking templates, and scenarios ready lets facilitators focus fully on participant behavior rather than fumbling through folders. This preparation transcends logistics; it’s research rigor in action. Standardizing environments and materials minimizes variables, ensuring observed differences stem from product interaction, not process inconsistencies.
Physical setup proves paramount. Create a testing room for participants and facilitators, plus an observation room for stakeholders.
Testing Room Setup:
To guard against interruptions, post a sign on the testing room door while sessions are in progress. And do have a door: testing in cubicles or open office layouts compromises participant privacy and is too disruptive. (GPT Image-1)
Observation Room Setup:
Physical Materials:
For remote sessions, focus on stable technical setups and crystal-clear communication.
Facilitator’s Setup:
Participant’s Setup:
Preparation is complete when:
Environment Ready:
Materials Assembled: All documents printed or bookmarked and organized for instant access, including forms, scripts, templates, and incentives
Team Confirmed: Everyone has calendar invitations with correct times, locations, or links, plus necessary materials like observer guidelines
Remember: perfect preparation prevents particularly problematic performances. Your organized environment encourages authentic answers while chaos creates confused contributions.
Human factors research shows that using checklists ensures consistent procedures. Even people who know better often miss a step when proceeding without a checklist. Even though a qualitative usability study is not a science study (and you should not intimidate the participant by wearing a lab coat), it’s still best to write (and use) checklists for repeatable steps, such as the test setup or the introduction you give to the participant. (GPT Image-1)
Step 7. Conduct Pilot Testing for Procedural Refinement
Pilot tests are your study’s dress rehearsal, a chance to practice before the performance that counts. Their primary purpose is testing your research plan’s feasibility in a low-stakes setting. Skipping this step is like proposing without practicing: technically possible but potentially disastrous. Better to discover problems with one practice participant whose data gets discarded than during your first real session, when stakes are higher and budgets are watching.
Pilot tests serve several critical functions. First, they validate whether your materials make sense to actual humans. Tasks that seem crystal clear to researchers might confuse real users like assembly instructions written in ancient Greek. Pilots reveal if instructions mislead, if task wording accidentally reveals answers, or if questions fail to generate useful feedback.
Second, they test timing and flow. Pilots provide realistic estimates of task duration, essential for ensuring sessions fit scheduled slots and helping prioritize if pruning proves necessary.
Third, they uncover technical troubles before they become tragedies. Links break, features fail, recording software rebels. These gremlins must be banished before the real research begins.
Finally, pilots give facilitators practice opportunities, helping them refine questions and become comfortable with scripts, leading to smoother sessions when it matters most.
Ideally, plan on more than one pilot test session, and allow time between the first pilot and the second to make the inevitable changes in your test plan and materials. In essence, the first session is the pilot for the second session. (GPT Image-1)
First, recruit one or two pilot participants matching your target user profile. You can use co-workers for pilot sessions, since the goal is to test the study plan and not the user interface, but there is always a risk that they will not behave in the same way as the real users. Best practice is to recruit one or two extra qualified participants and schedule them two days before official sessions begin. This provides one day in the middle (between pilots and real testing) for necessary adjustments.
Second, conduct the complete test session exactly as planned, from welcome through wrap-up. Note-takers and observers should participate normally. Watch vigilantly for:
Third, gather feedback and refine ruthlessly. Hold immediate team debriefs discussing observations. Ask pilot participants directly: “Were instructions clear?” “Did you feel rushed?” Based on feedback, make concrete revisions: reword confusing tasks, remove lengthy segments, add probing questions, or fix critical bugs.
If the pilot sessions proceed perfectly, their data can be included in the final analysis, but don’t plan on this, since you will almost certainly want to make changes to the test plan after each pilot user. The goal of a pilot is to test the test, not to test the design.
Pilot testing succeeds when:
The team should feel confident that the study will run smoothly and generate valid, valuable data.
Remember: a pilot today avoids panic tomorrow. Your dress rehearsal determines whether opening night delights or disappoints.
Step 8. Facilitate the Usability Test Sessions
Facilitation is where planning pays off and data collection comes alive. The quality of gathered data depends on facilitator skill. Skilled facilitators create comfortable, neutral environments encouraging participants to behave naturally and share thoughts honestly. They guide without leading, observe without influencing, and probe without pushing bias into the conversation. Great facilitators transform standard sessions into insight goldmines. Poor facilitators accidentally influence behavior, ask leading questions, or miss critical moments, producing shallow, skewed data that undermines everything.
The think-aloud protocol, qualitative testing’s crown jewel, requires careful cultivation. Participants verbalize thoughts, feelings, and reasoning while using the product, providing windows into their mental machinery. However, thinking aloud feels unnatural to most humans. People fall silent when concentrating, confused, or simply forgetting to narrate their neural activity. Facilitators must gently prompt participants to keep talking using neutral nudges like “What’s going through your mind?” or “Tell me what you’re seeing.”
People usually keep their thoughts to themselves, and it is unnatural to keep up a thinking-aloud stream of verbalized thoughts. This is why you need instructions for study participants on how to do this and why the facilitator usually needs to remind users several times during a session to keep saying what they are thinking in the moment (rather than storing up their thoughts for retrospective reporting, which is usually biased by subsequent events in the study). (Imagen 4 Ultra)
Asking effective, non-leading questions separates mediocre moderators from magnificent ones. When participants exhibit unexpected behavior or make intriguing comments, well-timed questions uncover underlying understanding. Leading questions like “Was that confusing?” poison the well with suggested answers. Neutral probes like “What did you expect would happen?” open doors for authentic explanations yielding valuable, unbiased insights. Disciplined facilitation combined with systematic note-taking captures rich, nuanced data from every session.
Leading questions, such as directing the user’s attention to a particular UI element, can doom a usability study. (GPT Image-1)
Effective facilitation strikes a balance between script adherence and responsive moderation across three phases: setting the stage, conducting tasks, and wrapping up.
First, set the stage and build rapport (5 minutes):
Welcome participants warmly by name and engage in light conversation to melt initial ice. Use your script’s introduction to explain the session’s purpose. Crucially, emphasize you’re testing the product, not the person. This simple statement slays performance anxiety.
Explain what will happen, how long it takes, and what’s expected. Clearly describe thinking aloud with an example: “As you work, say whatever crosses your mind. You might say things like ‘I’m looking for the menu... This button seems odd... I thought this would take me home.’”
Obtain informed consent verbally, confirming recording permission and data usage understanding, even if forms were previously signed.
Second, conduct tasks and facilitate observation (30–40 minutes):
Read scenarios and instructions verbatim from your script, ensuring session consistency. Maintain strict neutrality: avoid reacting to participant actions or comments. Use neutral acknowledgments like “thank you” or “that’s helpful.” When participants ask how something works, respond with “What would you try if you were alone?”
A study facilitator should maintain a neutral, poker-faced demeanor, like this capybara, regardless of what the test user does. (GPT Image-1)
When silence stretches beyond 15 seconds, provide gentle prompts:
When observing critical incidents (hesitation, errors, frustrated sighs, surprising statements), deploy open-ended questions after task completion:
Keep one eye on the clock to ensure critical tasks get coverage. If participants struggle excessively, gently guide them forward.
Third, conduct post-test interviews and wrap-up (5–10 minutes):
Ask scripted questions gathering overall impressions and reflections. Administer any final questionnaire. Allow participants to ask their own questions about the product or study. Thank them sincerely for valuable feedback, explain payment procedures, and end positively.
Throughout sessions, note-takers document observations chronologically, capturing quotes verbatim, noting specific behaviors (“squinted at screen, clicked back button three times”), and recording timestamps for critical moments needing later review. Observers should record facts, not interpretations, using structured templates.
Observers are invaluable, both for building team buy-in during the research process and for identifying issues that may have escaped the facilitator’s notice. However, observers should remain quiet and refrain from influencing the test participant. Having a dedicated study facility with an observation room and a one-way mirror is one way to guard against observer interference. (GPT Image-1)
Sessions succeed when:
Remember: fantastic facilitators find the fine balance between friendly and formal, guiding gracefully while gathering genuine gems of insight.
Step 9. Analyze and Synthesize the Findings
The analysis and synthesis stage transforms raw, random recordings into actionable answers. This is arguably the most mentally demanding part of research. Simply collecting countless clips and copious notes isn’t enough; value emerges only when this raw material is systematically sorted, scrutinized, and shaped into sensible stories. Without rigorous analysis, findings become anecdotal accidents based on whoever complained loudest rather than weighted evidence across all sessions. Structured approaches ensure conclusions are credible, defensible, and directly tied to observed data.
Synthesis differs from analysis like cooking differs from chopping. Analysis breaks data into bite-sized bits: individual observations, problems, and quotes. Synthesis reassembles these pieces into a palatable whole, connecting scattered sightings to form fuller findings and constructing narratives explaining why problems persist. This process elevates individual incidents into insights that inform strategic solutions.
Critical to this stage is prioritizing problems properly. Not all usability issues are created equal. Some are minor mosquito bites while others are massive migraines preventing users from completing critical tasks. A systematic prioritization framework considering frequency, impact, and severity ensures teams focus finite resources on fixes that matter most. Without prioritization, teams risk polishing doorknobs while the foundation crumbles.
Analysis and synthesis should start immediately after session one and continue throughout testing. The process moves from individual observations to prioritized, actionable insights.
First, conduct post-session debriefs. Immediately after each test, gather your team for a 15-minute memory dump while details remain fresh. Focus on significant struggles participants faced and surprising statements they made. This informal synthesis helps teams spot patterns promptly and build shared understanding.
During the test sessions, as many team members as possible should observe and take notes. An immediate observer debrief, while each session is fresh in the observers’ minds, is a great way to quickly capture insights that can be fed into later analysis. (GPT Image-1)
Second, organize and analyze raw data. This core work involves processing qualitative content to identify patterns.
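When notes are captured digitally, the affinity-grouping step can be as simple as tagging each observation with a theme and grouping by tag. A minimal sketch, with hypothetical tags and notes:

```python
from collections import defaultdict

# Minimal affinity-grouping sketch: cluster observation notes by a theme tag
# that the team assigns during analysis (tags and notes here are hypothetical).
notes = [
    ("navigation", "P1 looked for Help under 'Resources'"),
    ("navigation", "P3 used browser back instead of breadcrumbs"),
    ("terminology", "P2: 'What does Archive mean? I just want to save.'"),
]

themes = defaultdict(list)
for tag, note in notes:
    themes[tag].append(note)

# Print the biggest clusters first, since they suggest recurring problems
for tag, grouped in sorted(themes.items(), key=lambda t: -len(t[1])):
    print(f"{tag} ({len(grouped)} notes)")
    for note in grouped:
        print(f"  - {note}")
```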
Affinity diagrams are traditionally made by moving sticky notes around on a large wall, but can also be done by moving colored squares around in cloud-based software like Miro, which supports remote team collaboration. (GPT Image-1)
Third, create a prioritized problem list. Rate each issue on three factors:
Assign severity scores (0=no problem to 4=catastrophic calamity). This provides objective ordering for tackling troubles, ensuring critical issues get attention before cosmetic concerns.
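A spreadsheet is usually enough for the prioritized problem list, but the same rubric is easy to express in code. This sketch assumes a simple ordering by severity and then frequency; the exact weighting of the three factors is a team decision, not a prescription from this article:

```python
from dataclasses import dataclass

@dataclass
class Issue:
    description: str
    frequency: float  # share of participants who hit the problem, 0.0-1.0
    impact: int       # how badly it hurts task completion, 0-4
    severity: int     # overall rating, 0 = no problem .. 4 = catastrophic

issues = [
    Issue("'Save' button labeled 'Archive'", frequency=0.8, impact=4, severity=4),
    Issue("Low-contrast footer links", frequency=0.4, impact=1, severity=1),
    Issue("Coupon field rejects valid codes", frequency=0.6, impact=3, severity=3),
]

# Order the problem list: severity first, then how many users were affected.
for issue in sorted(issues, key=lambda i: (i.severity, i.frequency), reverse=True):
    print(f"sev {issue.severity} | {issue.frequency:.0%} of users | {issue.description}")
```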
Analysis succeeds when you’ve produced:
Remember: successful synthesis transforms scattered snippets into stories, random reactions into reliable recommendations, and countless comments into clear conclusions that compel constructive change.
Step 10. Report Findings and Actionable Recommendations to Stakeholders
The final report is your research’s grand finale, the vehicle that transforms testing into tangible change. Research is only as impactful as its ability to be understood, believed, and acted upon. A poorly presented report can cause brilliant breakthroughs to be buried or butchered, rendering weeks of work worthless. Your goal isn’t simply presenting data; it’s telling a compelling, evidence-based story that generates empathy, provides direction, and influences decisions.
Effective reporting requires reading the room and recognizing your readers. Executives are time-starved and focused on business benefits. They need concise summaries highlighting critical findings and bottom-line impacts. Designers and developers need detailed directions showing exactly where users struggled and why. A single serving won’t satisfy these different dietary needs. Successful communication often involves multiple meals: executive appetizers, detailed main courses for core teams, and raw ingredients (video clips, data) for those wanting to cook from scratch.
Showing video clips of real user behavior is one of the best ways to generate stakeholder buy-in. (GPT Image-1)
Reports must propose clear, specific solutions, not just identify issues. “Users find navigation confusing” is an observation floating in space. “Change ‘Resources’ to ‘Help Center’ since 80% of users looked there for support” is an actionable answer. Recommendations should link directly to findings, be practical to implement, and be prioritized by problem severity. This focus on solutions transforms research from diagnosis into treatment.
First, structure your story sensibly. Whether writing documents or designing decks, narratives need natural flow:
Second, show rather than simply saying. Evidence elevates everything:
Third, frame fixes that are actually actionable:
Finally, present with pizzazz. Keep presentations under 30 minutes. Use storytelling to frame user journeys. Leave time for discussion, questions, and collaborative planning.
Reporting succeeds when:
Remember: powerful presentations persuade people, brilliant briefings build buy-in, and compelling communication converts skeptics into champions who crave constructive change.
Step 11. Follow Up on Implementation of Findings to Ensure Changes are Made
Conducting research and delivering dazzling reports is only half the battle. The ultimate goal isn’t producing papers but improving products. The follow-up stage is where research reaches reality. Without systematic tracking of recommendations, even brilliant insights become buried beneath backlogs, technical debt, and shifting strategies. This step closes the loop, ensuring your investment in understanding users translates into tangible improvements. It bridges insight and impact.
Effective follow-up requires researchers to transform from analysts into advocates. They must clarify recommendations, provide context, and help developers understand the “why” behind changes. This prevents fixes that technically tick boxes but miss the mark. If users can’t find the save button and you recommend “making it more visible,” developers might simply make it bigger. But perhaps the real problem is that it’s labeled “Archive” when users expect “Save.” Continuous collaboration ensures solutions actually solve problems.
Tracking impact is essential for demonstrating research ROI and fostering data-driven decisions. When you show that checkout completion jumped 30% after implementing your recommendations, you provide undeniable proof that usability work works. This justifies current spending and builds bulletproof business cases for future research.
First, translate recommendations into trackable tasks. Work with product managers to convert your prioritized findings into tickets in their tracking tools (Jira, Asana, or whatever alphabet soup they prefer).
Create detailed tickets including:
Assign priorities matching your severity scores. Attend planning meetings to advocate for fixes and answer questions.
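What a “detailed ticket” contains will vary by tracker, but the shape is consistent. A hypothetical example (all field names and contents are invented for illustration):

```python
# Hypothetical usability-fix ticket, mirroring the prioritized findings.
ticket = {
    "title": "Rename 'Archive' button to 'Save' on the editor toolbar",
    "severity": 4,  # from the 0-4 scale used during analysis
    "evidence": "4/5 participants failed Task 3; clip at 12:40 in session 2",
    "recommendation": "Relabel control; keep archive action in overflow menu",
    "acceptance_criteria": "First-time users locate Save within 10 seconds",
}
```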
Second, collaborate closely during implementation. Your involvement shouldn’t stop at report delivery.
Provide context explaining why changes matter. When developers understand that users expect “Cart” not “Basket” because they’re shopping, not picnicking, implementation improves.
Review mockups before coding begins. Confirm proposed solutions solve actual problems. Discuss technical constraints and brainstorm alternatives when ideal solutions prove impossible. Perform final reviews in staging environments to ensure implementations match intentions.
Third, measure and report impact. After fixes go live, measure their effectiveness.
Rerun usability tests using original tasks and metrics. Compare the before-and-after data to demonstrate clear improvement. Monitor key analytics, such as completion rates and error frequencies. Create follow-up presentations showcasing success. Nothing beats a slide showing “Task success rate: 45% → 82%” for securing future funding.
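If the rerun is quantitative, a quick significance check keeps you honest about whether the jump is real. A minimal sketch of a two-proportion z-test, assuming (hypothetically) 40 participants per round behind the 45% → 82% example:

```python
from math import sqrt, erf

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two task-success rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided
    return z, p_value

# Hypothetical counts matching the 45% -> 82% example: 18/40 before, 33/40 after
z, p = two_proportion_z(18, 40, 33, 40)
print(f"z = {z:.2f}, p = {p:.4f}")  # p well below 0.05: the improvement is real
```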
Follow-up succeeds when:
Remember: fantastic follow-through transforms findings into features, ensuring excellent efforts evolve into enhanced experiences that excite everyone.
Step 12. Plan for the Next Iteration and Future Usability Studies
This final step transforms usability testing from a one-time task into a continuous cycle of curiosity and improvement. The goal isn’t achieving the mythical unicorn of “perfect” usability, but fostering an iterative loop of designing, testing, and learning. Findings from one study should feed the next, creating a virtuous vortex of user-centered development. Planning ahead ensures momentum maintains itself. It embeds user feedback into product development’s daily dance, making research proactive rather than panic-driven.
User research should keep ticking like a metronome. Ideally, pre-schedule user sessions for a specific day every week (say, Wednesday). When that day approaches, you can be sure that your team will have something they want to learn. If not, take the opportunity to conduct bottom-up exploratory research or a competitive study. (GPT Image-1)
Creating a research roadmap is your formal framework for continuity. This strategic document outlines planned research activities over coming quarters. It aligns research with product plans and business objectives, ensuring studies happen when insights can actually influence decisions. Schedule exploratory interviews before designers start sketching. Plan usability tests before developers write code. Timing is everything, and everything needs timing.
Strategic planning provides powerful perks. It gives stakeholders visibility into research direction, creating shared vision for how insights shape products. It forces prioritization, focusing finite resources on questions with maximum impact. It helps with resource allocation, allowing leaders to plan budgets, tools, and teams in advance. By treating research as a strategic program rather than sporadic scrambles, organizations mature from assumption-based to evidence-embraced decision making.
User research isn’t a one-time deal. You should research before designing anything, and multiple times throughout the prototyping and iterative design phases. (GPT Image-1)
First, identify and document new research questions. Every study spawns fresh mysteries.
Review “parking lot” items: those interesting issues that fell outside your recent study’s scope. Consolidate these curious questions for future exploration.
A parking lot is helpful in any team meeting or discussion: it’s a place to “park” items that can’t be handled right now (and would be distracting if discussed further). Parking something ensures that it’s not forgotten and can be revisited when appropriate, and also prevents the person making that suggestion from feeling unheard. (GPT Image-1)
Analyze findings for deeper dilemmas. If users struggle with search, the deeper question might be “What are users actually trying to accomplish?”
Solicit stakeholder suggestions. Meet with teams to discuss findings and brainstorm emerging uncertainties needing investigation.
Second, create or update your research roadmap. This living document deserves quarterly reviews.
Structure by time: “Now” (current quarter), “Next” (following quarter), “Later” (beyond six months). This provides perspective without premature promises.
Frame initiatives as questions, not methods. Write “Understanding why users abandon carts” not “Conducting cart analysis.”
Link each initiative to product goals. Show how research delivers insights when decisions need them.
Include essential details: brief descriptions, proposed methods, priority levels, current status, and involved teams.
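If the roadmap lives in a structured document or tool, each initiative can follow a consistent schema. A hypothetical sketch of the Now/Next/Later structure (all entries invented):

```python
# Hypothetical roadmap entries following the Now/Next/Later structure above.
roadmap = {
    "Now": [  # current quarter
        {
            "question": "Why do users abandon carts at the shipping step?",
            "method": "Moderated remote usability test",
            "priority": "High",
            "status": "Recruiting",
            "teams": ["Checkout", "Design"],
        },
    ],
    "Next": [  # following quarter
        {
            "question": "What do first-time visitors expect from search?",
            "method": "Unmoderated task-based study",
            "priority": "Medium",
            "status": "Planned",
            "teams": ["Search"],
        },
    ],
    "Later": [],  # beyond six months: keep loose, no premature promises
}
```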
A research roadmap keeps you on track to continuously move forward. (GPT Image-1)
Third, share and socialize your roadmap. Communication creates collaboration.
Present the roadmap to stakeholders, explaining priorities and connections to company goals. Gather feedback ensuring alignment with organizational objectives. Store it somewhere accessible so anyone can see what’s planned.
This final phase succeeds when:
Remember: persistent planning produces powerful progress, roadmaps reap recurring rewards, and iterative insights inspire infinite improvements.
Conclusion
The path to a usable product is paved with empirical evidence. I’ve given you the steps to gather such evidence with the number-one user research method: user testing. Other research methods can provide even more insights, but they are not suitable for companies with low UX maturity. Start with user testing, and once you have mastered this method and proven the value of user research, you will likely gain the staff, budget, and experience to employ additional research methods.
Usability testing works because it puts reality between your team and your assumptions. But reality is messy. A systematic sequence makes testing manageable and repeatable, especially in companies that lack a deep tradition of user research.
This is how you turn a usability test from a show-and-tell into a decision engine. The sequence reduces bias, increases traceability, and speeds iteration. It also builds trust: when product leads can see how the next UI change will be tested and when they’ll know if it worked, they commit. When engineers receive clear issues with evidence and acceptance criteria, they fix. When leadership sees before/after curves and fewer support tickets, they fund the next round. Each usability project feeds the next, creating cycles of continuous improvement.
The systematic process allows leadership to trace how redesign recommendations originated from their business objectives, making it profitable for the company to implement usability improvements. (GPT Image-1)
Keep the 12 steps lightweight but disciplined. Your users will get faster, easier paths; your team will ship with more confidence; and your organization will gradually replace opinion battles with evidence-based decisions, which are the true mark of mature product practice.
Remember: persistent processes produce powerful products, systematic studies spawn spectacular solutions, iterative insights inspire incredible interfaces, and continuous cycles create customer-centered cultures that consistently conquer confusion through collaborative, coordinated, and courageous commitment to understanding users utterly.
Summary of the 12 Steps to Running a Usability Study
The 12 steps recommended in this article. (Napkin)
Step 1. Define the Problem, the Context, and Existing Data
Synthesize existing data to create an evidence-based, user-centric problem statement. Avoid prescribing solutions. Get stakeholder consensus before proceeding.
Step 2. Define Clear Research Goals, Objectives, and Scope
Set 3–5 research goals with specific questions. Document scope boundaries to prevent mission creep. Include a change-control process.
Step 3. Choose Appropriate Usability Testing Methods, Tools, and Metrics
Match methods to goals: qualitative for insights, quantitative for metrics. Select appropriate tools and define success metrics with benchmarks.
Step 4. Identify and Recruit Representative Test Participants
Define target users by behaviors. Create ungameable screeners. Recruit 5–8 for qualitative, 40+ for quantitative studies. Include backups.
Step 5. Develop the Usability Test Plan, Scenarios, Tasks, and Script
Document everything in a test plan. Create goal-based scenarios with non-leading tasks. Write complete scripts ensuring session consistency.
Step 6. Prepare the Test Environment and Essential Materials
Prepare distraction-free environments with tested technology. Organize all materials including scripts, forms, and payments. Professional setup ensures quality.
Step 7. Conduct Pilot Testing for Procedural Refinement
Test everything with 1–2 pilot participants. Identify and fix confusing tasks, timing issues, and technical problems before starting.
Step 8. Facilitate the Usability Test Sessions
Facilitate neutrally while encouraging think-aloud. Ask open-ended questions. Follow scripts consistently. Document everything with timestamps and direct quotes.
Step 9. Analyze and Synthesize the Findings
Analyze data systematically to identify themes. Prioritize problems by frequency, impact, and severity. Support findings with concrete evidence.
Step 10. Report Findings and Actionable Recommendations to Stakeholders
Tailor reports to audiences. Use compelling evidence like videos and quotes. Provide specific, prioritized recommendations tied to findings.
Step 11. Follow Up on Implementation of Findings to Ensure Changes are Made
Convert recommendations into development tickets. Collaborate during implementation. Measure impact with follow-up tests. Demonstrate ROI to stakeholders.
Step 12. Plan for the Next Iteration and Future Usability Studies
Document emerging questions. Create research roadmaps aligned with product plans. Update quarterly. Make research a continuous, proactive cycle.
In combination, these 12 steps become an evidence-in, action-out machine. (GPT Image-1)
About the Author
Jakob Nielsen, Ph.D., is a usability pioneer with 42 years of experience in UX and the Founder of UX Tigers. He founded the discount usability movement for fast and cheap iterative design, including heuristic evaluation and the 10 usability heuristics. He formulated the eponymous Jakob’s Law of the Internet User Experience. He has been named “the king of usability” by Internet Magazine, “the guru of Web page usability” by The New York Times, and “the next best thing to a true time machine” by USA Today.
Previously, Dr. Nielsen was a Sun Microsystems Distinguished Engineer and a Member of Research Staff at Bell Communications Research, the branch of Bell Labs owned by the Regional Bell Operating Companies. He is the author of 8 books, including the best-selling Designing Web Usability: The Practice of Simplicity (published in 22 languages), the foundational Usability Engineering (28,897 citations in Google Scholar), and the pioneering Hypertext and Hypermedia (published two years before the Web launched).
Dr. Nielsen holds 79 United States patents, mainly on making the Internet easier to use. He received the Lifetime Achievement Award for Human–Computer Interaction Practice from ACM SIGCHI and was named a “Titan of Human Factors” by the Human Factors and Ergonomics Society.