Usability testing is straightforward: give people realistic tasks and watch what happens. Then fix what hurts, and watch your profits grow from a better design that meets your business goals. We’re in the insights business, and user testing is the best way to gain these insights.
Don’t get intimidated by the 12 steps discussed in this article. At heart, usability testing has two components: (1) watch users while they (2) perform tasks. Everything else is padding to maximize your ROI from these two. (GPT Image-1)
The difference between opinion soup and a study that yields durable, defensible decisions lies in the process. A systematic sequence gives you three compounding advantages.
Do not fish for design insights in the opinion soup. Run a study to catch some real fish. (GPT Image-1)
First, it controls bias. Clear goals, stable tasks, consistent facilitation, and a predeclared analysis plan reduce the all‑too‑human tendencies to see what you expect, over-help participants, or cherry-pick clips that support pet solutions.
Second, a sequence of defined steps creates traceability. When stakeholders ask, “Why should we change this flow?” you can trace the recommendation back through severity, frequency, and impact to the tasks and scenarios, and finally to the business objective the study was designed to address. That chain only exists if you start with a precise problem statement and end with an action plan and follow-through.
Third, predefined steps protect velocity. Pilot sessions catch brittle scripts and flaky tech before your calendar is on fire. Prebuilt note templates, observer debriefs, and a prioritization rubric speed analysis and decision‑making. Practices drawn from both industry and academia let you summarize the “how usable” story quickly without hand-waving.
Prevent surprises: conduct frequent user research. (GPT Image-1)
Finally, a sequence turns testing from a one-off ceremony into a habit. After reporting study findings, you still have the last two steps of my process: ensure changes are implemented and schedule the next research iteration. Teams that make user contact routine (weekly or per sprint) and manage research operations intentionally ship better experiences, faster. Continuous discovery and ResearchOps are not fads; they are the scaffolding that keeps good research happening as products evolve.
In short: if you want user insights that withstand scrutiny and drive real product change, treat usability testing like any other engineering discipline with structured inputs, controlled execution, and explicit outputs. Follow the 12 steps I outline in this article for the best results. Once you have completed multiple projects according to the process, many of the steps can be abbreviated, partly by reusing materials from previous rounds.
Here’s an overview of the 12 steps:
My whole process contains 12 steps: this infographic summarizes the most important ones (steps 2, 4, 5, 8, and 10). It really is an easy-enough progression. (GPT Image-1)
Step 1. Define the Problem, the Context, and Existing Data
Defining the problem is the most critical step in usability testing. Think of it as laying the foundation for a house: skip this step, and everything else becomes a wobbly tower of wasted time. Without a clear, evidence-backed problem statement, your study risks solving phantom problems or chasing low-priority concerns. This leads to wasted resources, irrelevant findings, and researchers reaching for resume-polishing websites.
A well-defined problem statement serves as your project’s North Star, ensuring every decision aligns with a single purpose. It prevents teams from prescribing solutions before diagnosing diseases (AKA, “solutionitis”). An effective problem statement isn’t speculation or wishful thinking; it’s a precise articulation of actual user struggles, grounded in genuine data.
This process also performs organizational therapy by forcing departments to share their data treasures. When UX researchers synthesize insights from support tickets, analytics dashboards, and customer feedback, they create a unified view of the user experience. Teams that struggle to access or share data reveal deeper dysfunction. Successfully completing this step often sparks a shared understanding of users across the entire organization, like a company-wide “aha” moment.
First, synthesize all existing data before commissioning new research. This prevents redundant work and ensures you’re addressing real knowledge gaps rather than rehashing known issues. Key sources include:
Second, craft a user-centric problem statement using your synthesized data. Strong problem statements must be:
Use frameworks like the Five Ws: Who is affected? What is the problem? Where does it occur? When does it occur? Why is it important?
Be a wise owl and use the 5 Ws as one possible structuring framework when defining your research problem statement. (GPT Image-1)
This problem-definition stage is complete when:
Remember: proper problem definition prevents pathetic performance.
Step 2. Define Clear Research Goals, Objectives, and Scope
If the problem statement (from Step 1) tells us “why” we’re researching, then goals, objectives, and scope reveal the “what” and “how.” This step transforms fuzzy, user-centric problems into concrete, manageable research projects. Without clear goals, studies drift into unfocused fishing expeditions that catch interesting but inedible insights.
Well-defined objectives provide specific questions research must answer, ensuring collected data actually addresses the problem rather than wandering into fascinating but futile territories.
Equally critical is defining project scope. Scope creep (that sneaky, silent study saboteur) remains a primary project killer. It springs from stakeholder ambiguity, poor communication, or “just one more thing” syndrome, where tiny requests gradually transform your trim study into a bloated behemoth. A formal scope document acts as a contract between researchers and stakeholders, establishing boundaries about what will and won’t be tested. This clarity prevents misunderstandings, manages expectations, and protects your timeline from well-meaning meddlers.
It’s easy to fall victim to scope creep, because each extra question seems innocent enough. However, the cumulative effect is to divert your research from the main goal. (GPT Image-1)
Defining scope transcends procedural paperwork; it becomes a powerful prioritization tool. When stakeholders review scope documents, they confront trade-offs inherent in resource-limited projects. The document transforms wish lists into priority rankings. When someone suggests adding research questions, you can respond strategically: “Excellent idea! To accommodate it within our timeline, we’ll need to remove another objective. Which existing goal should go?” This elevates researchers from order-takers to strategic partners, facilitating focused decisions about resource allocation.
First, translate your problem statement into clear research goals. These broad statements describe what your study aims to learn. Keep it to three to five goals to maintain focus. Goals must remain neutral, avoiding language that assumes outcomes.
Second, develop specific research questions for each goal. These granular, observable questions bridge high-level goals and actual test tasks. For example, if one goal concerns newsletter signup, questions might include:
Third, establish and document project scope. Create a formal document that stakeholders review and approve, explicitly stating:
This stage succeeds when you’ve produced:
Remember: scope documents save sanity, schedules, and stakeholder relationships.
Step 3. Choose Appropriate Usability Testing Methods, Tools, and Metrics
Selecting research methods, tools, and metrics is like choosing the right utensils for dinner. You wouldn’t eat soup with a fork (unless you enjoy frustration), and you shouldn’t use A/B tests to understand emotional responses. There’s no universal “best” method; the right choice depends on your research goals, product stage, and available resources.
Choosing quantitative methods when you need the “why” behind behavior yields statistically significant but contextually starved results. Using qualitative interviews to measure task success rates across hundreds of users is like counting grains of sand with tweezers: technically possible but terrifically tedious. Wrong methods mean wrong answers, wasted work, and wondering why your research flopped.
Tools matter too. Using basic video conferencing for moderated testing without proper recording capabilities turns your study into a memory test for researchers. Without predefined metrics, analysis becomes subjective storytelling rather than systematic science. Setting specific targets (like completion rates or usability scores) transforms fuzzy feelings into factual findings that even spreadsheet-loving executives appreciate.
First, select your testing method by considering three key dimensions:
Qualitative vs. Quantitative: This fundamental dichotomy determines everything.
Moderated vs. Unmoderated:
Remote vs. In-person:
Remote studies enable you to include a much broader user population than if you were testing locally in a single facility. (GPT Image-1)
Second, choose tools that actually work for your method. Evaluate platforms based on:
Third, define metrics covering both performance and perception:
Behavioral Metrics (what users do):
Attitudinal Metrics (what users feel):
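As one concrete example on the attitudinal side, here is a minimal scoring sketch for the System Usability Scale (SUS), a widely used standardized questionnaire. This article doesn’t prescribe a specific instrument, so treat SUS as an assumed example rather than the required metric:

```python
def sus_score(responses):
    """Compute a System Usability Scale score from ten 1-5 ratings.

    Odd-numbered items are positively worded (contribution = rating - 1);
    even-numbered items are negatively worded (contribution = 5 - rating).
    The summed contributions are scaled by 2.5 to yield a 0-100 score.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs exactly ten ratings on a 1-5 scale")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0 = item 1 (an odd item)
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5

# Example: one participant's ratings for items 1-10
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # -> 85.0
```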
This stage succeeds when:
Remember: methods maketh the study. Choose wisely, and your research will reward you with revelations rather than regrets.
There is no single user research method that’s perfect for everything. We have many choices, and even though making these choices may seem complicated, don’t despair: there should always be another study later, where you can use a different methodology and approach the problem from a new angle. (GPT Image-1)
Step 4. Identify and Recruit Representative Test Participants
The validity of your usability study hinges entirely on who participates. Testing with the wrong people is like asking vegetarians to review a steakhouse: you’ll get feedback, but it won’t be particularly useful. If participants don’t represent your actual audience, the data collected ranges from misleading to completely invalid. Test a complex financial analysis tool with novice investors, and you’ll discover “problems” that are really just knowledge gaps. The resulting product dumbing-down will frustrate your actual users (expert analysts who know their derivatives from their dividends).
Rigorous recruitment requires creating detailed personas, crafting clever screening questions, and selecting suitable sourcing channels. Relying solely on your power users provides feedback as balanced as a unicycle. Poorly worded screeners attract professional study participants rather than genuine users, people who’ve perfected the art of qualifying for questionnaires.
Unfortunately, the growth of research panels has brought an increasing number of “professional participants” who sign up for as many studies as possible to cash in on the incentives. A good screener should attempt to weed out these people. (GPT Image-1)
Sample size isn’t arbitrary either. Qualitative studies seeking deep insights can strike gold with just five participants, uncovering about 85% of common usability issues. Quantitative studies chasing statistical significance need 40 or more participants to ensure results aren’t just random noise masquerading as meaningful metrics.
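The “about 85%” figure follows from a simple discovery model: if each participant has probability L of exposing any given problem (L ≈ 31%, the average observed in the Nielsen–Landauer studies behind this guideline), the expected share of problems found by n users is 1 − (1 − L)ⁿ. A quick sketch:

```python
# Expected share of usability problems found with n participants,
# using the curve found(n) = 1 - (1 - L)**n, where L is the probability
# that a single user exposes a given problem (L ≈ 0.31 on average).
L = 0.31

def problems_found(n: int, l: float = L) -> float:
    return 1 - (1 - l) ** n

for n in (1, 3, 5, 10, 15):
    print(f"{n:2d} users -> {problems_found(n):.0%} of problems")
# 5 users -> ~84%, matching the "about 85%" figure cited above
```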
First, define your target audience beyond basic demographics. Create recruitment personas including:
Of these characteristics, user behavior is overwhelmingly the most important in defining your study participants.
Transform these characteristics into 1–3 concise recruitment personas serving as participant profiles.
Second, write screening questions that separate suitable subjects from survey surfing specialists:
Ask elimination questions first to quickly disqualify unsuitable candidates. Avoid yes/no questions that participants can easily game. Instead of “Do you use our banking app?” ask “Which banking apps have you used this month?” and include yours among competitors plus “None of the above.”
Prioritize behavioral questions over attitudinal ones. “How many times did you order food online last week?” beats “How often do you plan to order food online?” every time. Include open-ended questions like “Describe your vacation planning process” to identify articulate participants who provide thoughtful feedback.
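If your screener runs through an online form, the filtering logic can be automated. Here is a minimal sketch, with entirely hypothetical product names, field names, and thresholds, of the pattern described above: eliminate early, prefer behavioral questions, and use the open-ended answer as a crude articulateness check:

```python
# Hypothetical screener logic: eliminate early, never reveal the "right" answer.
def screen(candidate: dict) -> bool:
    # Elimination first: a behavioral question whose options include competitors
    apps_used = set(candidate.get("banking_apps_this_month", []))
    if "OurBank" not in apps_used:  # hypothetical product name
        return False
    # Behavioral frequency beats stated intent
    if candidate.get("online_orders_last_week", 0) < 1:
        return False
    # Open-ended answer should show some articulateness (crude proxy: length)
    if len(candidate.get("vacation_planning_description", "").split()) < 20:
        return False
    return True
```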
Third, choose recruitment channels matching your audience and budget:
Guerrilla studies grab people where they are. You can only run short sessions this way, and the sampling will be rather arbitrary, but it’s fast, easy, and cheap. Intercepting employees in the company cafeteria is particularly useful for intranet usability studies. (GPT Image-1)
Fourth, determine appropriate sample sizes:
This stage succeeds when:
Remember: recruiting the right participants prevents research regret.
Step 5. Develop the Usability Test Plan, Scenarios, Tasks, and Script
The usability test plan is your study’s supreme scripture, the single source of truth detailing every aspect from goals to logistics. Creating it isn’t paperwork purgatory; it’s a critical act of synthesis and alignment. A comprehensive test plan forces teams to think through every detail in advance, ensuring consistent, efficient, and rigorous research. It provides a roadmap for researchers and sets stakeholder expectations about purpose, methodology, and outcomes. Without a formal plan, studies drift into improvisation island, where inconsistency reigns and crucial data disappears.
Developing realistic scenarios and goal-oriented tasks forms the heart of your test plan. This transforms abstract research questions into concrete actions participants actually perform.
Task quality directly determines insight quality. Poorly designed tasks (too vague, too prescriptive, or disconnected from reality) produce artificial behavior and misleading data. Well-crafted scenarios provide just enough context to make tasks relatable without revealing solutions. This focus on realistic goals rather than feature instructions allows researchers to observe natural behavior and uncover genuine usability issues.
The moderator script ensures every session runs like a well-rehearsed recipe. This consistency is crucial for reliable findings, minimizing facilitator variation and ensuring all participants receive identical instructions. The script includes more than just tasks: it covers rapport-building introductions, think-aloud instructions, questions, and debriefing prompts. A solid script guarantees consistency while helping moderators maintain neutrality and systematically cover all objectives.
First, create your comprehensive usability test plan, consolidating all previous decisions. Essential components include:
Second, develop realistic scenarios and goal-oriented tasks. This is where rubber meets road:
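As a concrete illustration of the difference between a goal-based scenario and a feature instruction, here is a hypothetical example (the product, dates, and success criterion are invented for illustration):

```python
# A goal-based scenario gives context; the task states a goal, not a UI path.
scenario = (
    "You're planning a weekend trip and want to stay within a $150/night budget."
)
task = {
    "instruction": "Find a place to stay that fits your budget for March 14-16.",
    "success": "Participant reaches a booking page for a listing <= $150/night",
    "avoid": "Naming UI features ('use the filter sidebar') in the instruction",
}
```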
Third, write your complete moderator script guiding facilitators from start to finish:
This planning phase is complete when:
Remember: proper planning prevents particularly poor performance. Your test plan transforms good intentions into great insights.
Step 6. Prepare the Test Environment and Essential Materials
Meticulous preparation of your test environment and materials is fundamental for smooth, professional, and effective usability studies. This step controls variables and eliminates potential failure points before participants arrive or log on. Poor preparation compromises data quality directly. In-person tests suffer when recording equipment glitches, outside noise intrudes, or consent forms go missing. These distractions break session flow and corrupt data collection. Remote tests fail when internet connections falter, software refuses to cooperate, or setup instructions confuse participants. Sessions stall before they even start, wasting time and testing everyone’s patience.
Beyond technical troubles, well-prepared environments foster professionalism and participant peace of mind. When participants enter organized spaces or join sessions with calm, prepared facilitators, trust builds and honest feedback flows freely. Chaotic settings create anxiety and suggest their time isn’t treasured, potentially poisoning their responses.
Assembling materials in advance ensures consistency and efficiency. Having scripts, note-taking templates, and scenarios ready lets facilitators focus fully on participant behavior rather than fumbling through folders. This preparation transcends logistics; it’s research rigor in action. Standardizing environments and materials minimizes variables, ensuring observed differences stem from product interaction, not process inconsistencies.
Physical setup proves paramount. Create a testing room for participants and facilitators, plus an observation room for stakeholders.
Testing Room Setup:
To guard against interruptions, post a sign on the testing room door while sessions are in progress. And do have a door: testing in cubicles or open office layouts compromises participant privacy and is too disruptive. (GPT Image-1)
Observation Room Setup:
Physical Materials:
For remote sessions, focus on stable technical setups and crystal-clear communication.
Facilitator’s Setup:
Participant’s Setup:
Preparation is complete when:
Environment Ready:
Materials Assembled: All documents printed or bookmarked and organized for instant access, including forms, scripts, templates, and incentives
Team Confirmed: Everyone has calendar invitations with correct times, locations, or links, plus necessary materials like observer guidelines
Remember: perfect preparation prevents particularly problematic performances. Your organized environment encourages authentic answers while chaos creates confused contributions.
Human factors research shows that using checklists ensures consistent procedures. Even people who know better often miss a step when proceeding without a checklist. Even though a qualitative usability study is not a science study (and you should not intimidate the participant by wearing a lab coat), it’s still best to write (and use) checklists for repeatable steps, such as the test setup or the introduction you give to the participant. (GPT Image-1)
Step 7. Conduct Pilot Testing for Procedural Refinement
Pilot tests are your study’s dress rehearsal, a chance to practice before the performance that counts. Their primary purpose is testing your research plan’s feasibility in a low-stakes setting. Skipping this step is like proposing without practicing: technically possible but potentially disastrous. Better to discover problems with one practice participant whose data gets discarded than during your first real session, when stakes are higher and budgets are watching.
Pilot tests serve several critical functions. First, they validate whether your materials make sense to actual humans. Tasks that seem crystal clear to researchers might confuse real users like assembly instructions written in ancient Greek. Pilots reveal if instructions mislead, if task wording accidentally reveals answers, or if questions fail to generate useful feedback.
Second, they test timing and flow. Pilots provide realistic estimates of task duration, essential for ensuring sessions fit scheduled slots and helping prioritize if pruning proves necessary.
Third, they uncover technical troubles before they become tragedies. Links break, features fail, recording software rebels. These gremlins must be banished before the real research begins.
Finally, pilots give facilitators practice opportunities, helping them refine questions and become comfortable with scripts, leading to smoother sessions when it matters most.
Ideally, plan on more than one pilot test session, and allow time between the first pilot and the second to make the inevitable changes in your test plan and materials. In essence, the first session is the pilot for the second session. (GPT Image-1)
First, recruit one or two pilot participants matching your target user profile. You can use co-workers for pilot sessions, since the goal is to test the study plan and not the user interface, but there is always a risk that they will not behave in the same way as the real users. Best practice is to recruit one or two extra qualified participants and schedule them two days before official sessions begin. This provides one day in the middle (between pilots and real testing) for necessary adjustments.
Second, conduct the complete test session exactly as planned, from welcome through wrap-up. Note-takers and observers should participate normally. Watch vigilantly for:
Third, gather feedback and refine ruthlessly. Hold immediate team debriefs discussing observations. Ask pilot participants directly: “Were instructions clear?” “Did you feel rushed?” Based on feedback, make concrete revisions: reword confusing tasks, remove lengthy segments, add probing questions, or fix critical bugs.
If the pilot sessions proceed perfectly, their data can be included in the final analysis, but don’t plan on this, since you will almost certainly want to make changes to the test plan after each pilot user. The goal of a pilot is to test the test, not to test the design.
Pilot testing succeeds when:
The team should feel confident that the study will run smoothly and generate valid, valuable data.
Remember: a pilot today avoids panic tomorrow. Your dress rehearsal determines whether opening night delights or disappoints.
Step 8. Facilitate the Usability Test Sessions
Facilitation is where planning pays off and data collection comes alive. The quality of gathered data depends on facilitator skill. Skilled facilitators create comfortable, neutral environments encouraging participants to behave naturally and share thoughts honestly. They guide without leading, observe without influencing, and probe without pushing bias into the conversation. Great facilitators transform standard sessions into insight goldmines. Poor facilitators accidentally influence behavior, ask leading questions, or miss critical moments, producing shallow, skewed data that undermines everything.
The think-aloud protocol, qualitative testing’s crown jewel, requires careful cultivation. Participants verbalize thoughts, feelings, and reasoning while using the product, providing windows into their mental machinery. However, thinking aloud feels unnatural to most humans. People fall silent when concentrating, confused, or simply forgetting to narrate their neural activity. Facilitators must gently prompt participants to keep talking using neutral nudges like “What’s going through your mind?” or “Tell me what you’re seeing.”
People usually keep their thoughts to themselves, and it is unnatural to keep up a thinking-aloud stream of verbalized thoughts. This is why you need instructions for study participants on how to do this and why the facilitator usually needs to remind users several times during a session to keep saying what they are thinking in the moment (rather than storing up their thoughts for retrospective reporting, which is usually biased by subsequent events in the study). (Imagen 4 Ultra)
Asking effective, non-leading questions separates mediocre moderators from magnificent ones. When participants exhibit unexpected behavior or make intriguing comments, well-timed questions uncover underlying understanding. Leading questions like “Was that confusing?” poison the well with suggested answers. Neutral probes like “What did you expect would happen?” open doors for authentic explanations yielding valuable, unbiased insights. Disciplined facilitation combined with systematic note-taking captures rich, nuanced data from every session.
Leading questions, such as directing the user’s attention to a particular UI element, can doom a usability study. (GPT Image-1)
Effective facilitation strikes a balance between script adherence and responsive moderation across three phases: setting the stage, conducting tasks, and wrapping up.
First, set the stage and build rapport (5 minutes):
Welcome participants warmly by name and engage in light conversation to melt initial ice. Use your script’s introduction to explain the session’s purpose. Crucially, emphasize you’re testing the product, not the person. This simple statement slays performance anxiety.
Explain what will happen, how long it takes, and what’s expected. Clearly describe thinking aloud with an example: “As you work, say whatever crosses your mind. You might say things like ‘I’m looking for the menu... This button seems odd... I thought this would take me home.’”
Obtain informed consent verbally, confirming recording permission and data usage understanding, even if forms were previously signed.
Second, conduct tasks and facilitate observation (30–40 minutes):
Read scenarios and instructions verbatim from your script, ensuring session consistency. Maintain strict neutrality: avoid reacting to participant actions or comments. Use neutral acknowledgments like “thank you” or “that’s helpful.” When participants ask how something works, respond with “What would you try if you were alone?”
A study facilitator should maintain a neutral, poker-faced demeanor, like this capybara, regardless of what the test user does. (GPT Image-1)
When silence stretches beyond 15 seconds, provide gentle prompts:
When observing critical incidents (hesitation, errors, frustrated sighs, surprising statements), deploy open-ended questions after task completion:
Keep one eye on the clock to ensure critical tasks get coverage. If participants struggle excessively, gently guide them forward.
Third, conduct post-test interviews and wrap-up (5–10 minutes):
Ask scripted questions gathering overall impressions and reflections. Administer any final questionnaire. Allow participants to ask their own questions about the product or study. Thank them sincerely for valuable feedback, explain payment procedures, and end positively.
Throughout sessions, note-takers document observations chronologically, capturing quotes verbatim, noting specific behaviors (“squinted at screen, clicked back button three times”), and recording timestamps for critical moments needing later review. Observers should record facts, not interpretations, using structured templates.
Observers are invaluable, both for building team buy-in during the research process and for identifying issues that may have escaped the facilitator’s notice. However, observers should remain quiet and refrain from influencing the test participant. Having a dedicated study facility with an observation room and a one-way mirror is one way to guard against observer interference. (GPT Image-1)
Sessions succeed when:
Remember: fantastic facilitators find the fine balance between friendly and formal, guiding gracefully while gathering genuine gems of insight.
Step 9. Analyze and Synthesize the Findings
The analysis and synthesis stage transforms raw, random recordings into actionable answers. This is arguably the most mentally demanding part of research. Simply collecting countless clips and copious notes isn’t enough; value emerges only when this raw material is systematically sorted, scrutinized, and shaped into sensible stories. Without rigorous analysis, findings become anecdotal accidents based on whoever complained loudest rather than weighted evidence across all sessions. Structured approaches ensure conclusions are credible, defensible, and directly tied to observed data.
Synthesis differs from analysis like cooking differs from chopping. Analysis breaks data into bite-sized bits: individual observations, problems, and quotes. Synthesis reassembles these pieces into a palatable whole, connecting scattered sightings to form fuller findings and constructing narratives explaining why problems persist. This process elevates individual incidents into insights that inform strategic solutions.
Critical to this stage is prioritizing problems properly. Not all usability issues are created equal. Some are minor mosquito bites while others are massive migraines preventing users from completing critical tasks. A systematic prioritization framework considering frequency, impact, and severity ensures teams focus finite resources on fixes that matter most. Without prioritization, teams risk polishing doorknobs while the foundation crumbles.
Analysis and synthesis should start immediately after session one and continue throughout testing. The process moves from individual observations to prioritized, actionable insights.
First, conduct post-session debriefs. Immediately after each test, gather your team for a 15-minute memory dump while details remain fresh. Focus on significant struggles participants faced and surprising statements they made. This informal synthesis helps teams spot patterns promptly and build shared understanding.
During the test sessions, as many team members as possible should observe and take notes. An immediate observer debrief, while each session is fresh in the observers’ minds, is a great way to quickly capture insights that can be fed into later analysis. (GPT Image-1)
Second, organize and analyze raw data. This core work involves processing qualitative content to identify patterns.
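When notes are captured digitally, the affinity-grouping step can be as simple as tagging each observation with a theme and grouping by tag. A minimal sketch, with hypothetical tags and notes:

```python
from collections import defaultdict

# Minimal affinity-grouping sketch: cluster observation notes by a theme tag
# that the team assigns during analysis (tags and notes here are hypothetical).
notes = [
    ("navigation", "P1 looked for Help under 'Resources'"),
    ("navigation", "P3 used browser back instead of breadcrumbs"),
    ("terminology", "P2: 'What does Archive mean? I just want to save.'"),
]

themes = defaultdict(list)
for tag, note in notes:
    themes[tag].append(note)

# Print the biggest clusters first, since they suggest recurring problems
for tag, grouped in sorted(themes.items(), key=lambda t: -len(t[1])):
    print(f"{tag} ({len(grouped)} notes)")
    for note in grouped:
        print(f"  - {note}")
```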
Affinity diagrams are traditionally made by moving sticky notes around on a large wall, but can also be done by moving colored squares around in cloud-based software like Miro, which supports remote team collaboration. (GPT Image-1)
Third, create a prioritized problem list. Rate each issue on three factors:
Assign severity scores (0=no problem to 4=catastrophic calamity). This provides objective ordering for tackling troubles, ensuring critical issues get attention before cosmetic concerns.
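A spreadsheet is usually enough for the prioritized problem list, but the same rubric is easy to express in code. This sketch assumes a simple ordering by severity and then frequency; the exact weighting of the three factors is a team decision, not a prescription from this article:

```python
from dataclasses import dataclass

@dataclass
class Issue:
    description: str
    frequency: float  # share of participants who hit the problem, 0.0-1.0
    impact: int       # how badly it hurts task completion, 0-4
    severity: int     # overall rating, 0 = no problem .. 4 = catastrophic

issues = [
    Issue("'Save' button labeled 'Archive'", frequency=0.8, impact=4, severity=4),
    Issue("Low-contrast footer links", frequency=0.4, impact=1, severity=1),
    Issue("Coupon field rejects valid codes", frequency=0.6, impact=3, severity=3),
]

# Order the problem list: severity first, then how many users were affected.
for issue in sorted(issues, key=lambda i: (i.severity, i.frequency), reverse=True):
    print(f"sev {issue.severity} | {issue.frequency:.0%} of users | {issue.description}")
```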
Analysis succeeds when you’ve produced:
Remember: successful synthesis transforms scattered snippets into stories, random reactions into reliable recommendations, and countless comments into clear conclusions that compel constructive change.
Step 10. Report Findings and Actionable Recommendations to Stakeholders
The final report is your research’s grand finale, the vehicle that transforms testing into tangible change. Research is only as impactful as its ability to be understood, believed, and acted upon. A poorly presented report can cause brilliant breakthroughs to be buried or butchered, rendering weeks of work worthless. Your goal isn’t simply presenting data; it’s telling a compelling, evidence-based story that generates empathy, provides direction, and influences decisions.
Effective reporting requires reading the room and recognizing your readers. Executives are time-starved and focused on business benefits. They need concise summaries highlighting critical findings and bottom-line impacts. Designers and developers need detailed directions showing exactly where users struggled and why. A single serving won’t satisfy these different dietary needs. Successful communication often involves multiple meals: executive appetizers, detailed main courses for core teams, and raw ingredients (video clips, data) for those wanting to cook from scratch.
Showing video clips of real user behavior is one of the best ways to generate stakeholder buy-in. (GPT Image-1)
Reports must propose clear, specific solutions, not just identify issues. “Users find navigation confusing” is an observation floating in space. “Change ‘Resources’ to ‘Help Center’ since 80% of users looked there for support” is an actionable answer. Recommendations should link directly to findings, be practical to implement, and be prioritized by problem severity. This focus on solutions transforms research from diagnosis into treatment.
First, structure your story sensibly. Whether writing documents or designing decks, narratives need natural flow:
Second, show rather than simply saying. Evidence elevates everything:
Third, frame fixes that are actually actionable:
Finally, present with pizzazz. Keep presentations under 30 minutes. Use storytelling to frame user journeys. Leave time for discussion, questions, and collaborative planning.
Reporting succeeds when:
Remember: powerful presentations persuade people, brilliant briefings build buy-in, and compelling communication converts skeptics into champions who crave constructive change.
Step 11. Follow Up on Implementation of Findings to Ensure Changes are Made
Conducting research and delivering dazzling reports is only half the battle. The ultimate goal isn’t producing papers but improving products. The follow-up stage is where research reaches reality. Without systematic tracking of recommendations, even brilliant insights become buried beneath backlogs, technical debt, and shifting strategies. This step closes the loop, ensuring your investment in understanding users translates into tangible improvements. It bridges insight and impact.
Effective follow-up requires researchers to transform from analysts into advocates. They must clarify recommendations, provide context, and help developers understand the “why” behind changes. This prevents fixes that technically tick boxes but miss the mark. If users can’t find the save button and you recommend “making it more visible,” developers might simply make it bigger. But perhaps the real problem is that it’s labeled “Archive” when users expect “Save.” Continuous collaboration ensures solutions actually solve problems.
Tracking impact is essential for demonstrating research ROI and fostering data-driven decisions. When you show that checkout completion jumped 30% after implementing your recommendations, you provide undeniable proof that usability work works. This justifies current spending and builds bulletproof business cases for future research.
First, translate recommendations into trackable tasks. Work with product managers to convert your prioritized findings into tickets in their tracking tools (Jira, Asana, or whatever alphabet soup they prefer).
Create detailed tickets including:
Assign priorities matching your severity scores. Attend planning meetings to advocate for fixes and answer questions.
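What a “detailed ticket” contains will vary by tracker, but the shape is consistent. A hypothetical example (all field names and contents are invented for illustration):

```python
# Hypothetical usability-fix ticket, mirroring the prioritized findings.
ticket = {
    "title": "Rename 'Archive' button to 'Save' on the editor toolbar",
    "severity": 4,  # from the 0-4 scale used during analysis
    "evidence": "4/5 participants failed Task 3; clip at 12:40 in session 2",
    "recommendation": "Relabel control; keep archive action in overflow menu",
    "acceptance_criteria": "First-time users locate Save within 10 seconds",
}
```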
Second, collaborate closely during implementation. Your involvement shouldn’t stop at report delivery.
Provide context explaining why changes matter. When developers understand that users expect “Cart” not “Basket” because they’re shopping, not picnicking, implementation improves.
Review mockups before coding begins. Confirm proposed solutions solve actual problems. Discuss technical constraints and brainstorm alternatives when ideal solutions prove impossible. Perform final reviews in staging environments to ensure implementations match intentions.
Third, measure and report impact. After fixes go live, measure their effectiveness.
Rerun usability tests using original tasks and metrics. Compare the before-and-after data to demonstrate clear improvement. Monitor key analytics, such as completion rates and error frequencies. Create follow-up presentations showcasing success. Nothing beats a slide showing “Task success rate: 45% → 82%” for securing future funding.
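If the rerun is quantitative, a quick significance check keeps you honest about whether the jump is real. A minimal sketch of a two-proportion z-test, assuming (hypothetically) 40 participants per round behind the 45% → 82% example:

```python
from math import sqrt, erf

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two task-success rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided
    return z, p_value

# Hypothetical counts matching the 45% -> 82% example: 18/40 before, 33/40 after
z, p = two_proportion_z(18, 40, 33, 40)
print(f"z = {z:.2f}, p = {p:.4f}")  # p well below 0.05: the improvement is real
```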
Follow-up succeeds when:
Remember: fantastic follow-through transforms findings into features, ensuring excellent efforts evolve into enhanced experiences that excite everyone.
Step 12. Plan for the Next Iteration and Future Usability Studies
This final step transforms usability testing from a one-time task into a continuous cycle of curiosity and improvement. The goal isn’t achieving the mythical unicorn of “perfect” usability, but fostering an iterative loop of designing, testing, and learning. Findings from one study should feed the next, creating a virtuous vortex of user-centered development. Planning ahead ensures momentum maintains itself. It embeds user feedback into product development’s daily dance, making research proactive rather than panic-driven.
User research should keep ticking like a metronome. Ideally, pre-schedule user sessions for a specific day every week (say, Wednesday). When that day approaches, you can be sure that your team will have something they want to learn. If not, take the opportunity to conduct bottom-up exploratory research or a competitive study. (GPT Image-1)
Creating a research roadmap is your formal framework for continuity. This strategic document outlines planned research activities over coming quarters. It aligns research with product plans and business objectives, ensuring studies happen when insights can actually influence decisions. Schedule exploratory interviews before designers start sketching. Plan usability tests before developers write code. Timing is everything, and everything needs timing.
Strategic planning provides powerful perks. It gives stakeholders visibility into research direction, creating shared vision for how insights shape products. It forces prioritization, focusing finite resources on questions with maximum impact. It helps with resource allocation, allowing leaders to plan budgets, tools, and teams in advance. By treating research as a strategic program rather than sporadic scrambles, organizations mature from assumption-based to evidence-embraced decision making.
User research isn’t a one-time deal. You should research before designing anything, and multiple times throughout the prototyping and iterative design phases. (GPT Image-1)
First, identify and document new research questions. Every study spawns fresh mysteries.
Review “parking lot” items: those interesting issues that fell outside your recent study’s scope. Consolidate these curious questions for future exploration.
A parking lot is helpful in any team meeting or discussion: it’s a place to “park” items that can’t be handled right now (and would be distracting if discussed further). Parking something ensures that it’s not forgotten and can be revisited when appropriate, and also prevents the person making that suggestion from feeling unheard. (GPT Image-1)
Analyze findings for deeper dilemmas. If users struggle with search, the deeper question might be “What are users actually trying to accomplish?”
Solicit stakeholder suggestions. Meet with teams to discuss findings and brainstorm emerging uncertainties needing investigation.
Second, create or update your research roadmap. This living document deserves quarterly reviews.
Structure by time: “Now” (current quarter), “Next” (following quarter), “Later” (beyond six months). This provides perspective without premature promises.
Frame initiatives as questions, not methods. Write “Understanding why users abandon carts” not “Conducting cart analysis.”
Link each initiative to product goals. Show how research delivers insights when decisions need them.
Include essential details: brief descriptions, proposed methods, priority levels, current status, and involved teams.
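If the roadmap lives in a structured document or tool, each initiative can follow a consistent schema. A hypothetical sketch of the Now/Next/Later structure (all entries invented):

```python
# Hypothetical roadmap entries following the Now/Next/Later structure above.
roadmap = {
    "Now": [  # current quarter
        {
            "question": "Why do users abandon carts at the shipping step?",
            "method": "Moderated remote usability test",
            "priority": "High",
            "status": "Recruiting",
            "teams": ["Checkout", "Design"],
        },
    ],
    "Next": [  # following quarter
        {
            "question": "What do first-time visitors expect from search?",
            "method": "Unmoderated task-based study",
            "priority": "Medium",
            "status": "Planned",
            "teams": ["Search"],
        },
    ],
    "Later": [],  # beyond six months: keep loose, no premature promises
}
```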
A research roadmap keeps you on track to continuously move forward. (GPT Image-1)
Third, share and socialize your roadmap. Communication creates collaboration.
Present the roadmap to stakeholders, explaining priorities and connections to company goals. Gather feedback ensuring alignment with organizational objectives. Store it somewhere accessible so anyone can see what’s planned.
This final phase succeeds when:
Remember: persistent planning produces powerful progress, roadmaps reap recurring rewards, and iterative insights inspire infinite improvements.
Conclusion
The path to a usable product is paved with empirical evidence. I’ve given you the steps to gather such evidence with the number-one user research method: user testing. Other research methods can provide even more insights, but they are not suitable for companies with low UX maturity. Start with user testing, and once you have mastered this method and proven the value of user research, you will likely gain the staff, budget, and experience to employ additional research methods.
Usability testing works because it puts reality between your team and your assumptions. But reality is messy. A systematic sequence makes testing manageable and repeatable, especially in companies that lack a deep tradition of user research.
This is how you turn a usability test from a show-and-tell into a decision engine. The sequence reduces bias, increases traceability, and speeds iteration. It also builds trust: when product leads can see how the next UI change will be tested and when they’ll know if it worked, they commit. When engineers receive clear issues with evidence and acceptance criteria, they fix. When leadership sees before/after curves and fewer support tickets, they fund the next round. Each usability project feeds the next, creating cycles of continuous improvement.
The systematic process allows leadership to trace how redesign recommendations originated from their business objectives, making it profitable for the company to implement usability improvements. (GPT Image-1)
Keep the 12 steps lightweight but disciplined. Your users will get faster, easier paths; your team will ship with more confidence; and your organization will gradually replace opinion battles with evidence-based decisions, which are the true mark of mature product practice.
Remember: persistent processes produce powerful products, systematic studies spawn spectacular solutions, iterative insights inspire incredible interfaces, and continuous cycles create customer-centered cultures that consistently conquer confusion through collaborative, coordinated, and courageous commitment to understanding users utterly.
Summary of the 12 Steps to Running a Usability Study
The 12 steps recommended in this article. (Napkin)
Step 1. Define the Problem, the Context, and Existing Data
Synthesize existing data to create an evidence-based, user-centric problem statement. Avoid prescribing solutions. Get stakeholder consensus before proceeding.
Step 2. Define Clear Research Goals, Objectives, and Scope
Set 3–5 research goals with specific questions. Document scope boundaries to prevent mission creep. Include a change-control process.
Step 3. Choose Appropriate Usability Testing Methods, Tools, and Metrics
Match methods to goals: qualitative for insights, quantitative for metrics. Select appropriate tools and define success metrics with benchmarks.
Step 4. Identify and Recruit Representative Test Participants
Define target users by behaviors. Create ungameable screeners. Recruit 5–8 for qualitative, 40+ for quantitative studies. Include backups.
Step 5. Develop the Usability Test Plan, Scenarios, Tasks, and Script
Document everything in a test plan. Create goal-based scenarios with non-leading tasks. Write complete scripts ensuring session consistency.
Step 6. Prepare the Test Environment and Essential Materials
Prepare distraction-free environments with tested technology. Organize all materials including scripts, forms, and payments. Professional setup ensures quality.
Step 7. Conduct Pilot Testing for Procedural Refinement
Test everything with 1–2 pilot participants. Identify and fix confusing tasks, timing issues, and technical problems before starting.
Step 8. Facilitate the Usability Test Sessions
Facilitate neutrally while encouraging think-aloud. Ask open-ended questions. Follow scripts consistently. Document everything with timestamps and direct quotes.
Step 9. Analyze and Synthesize the Findings
Analyze data systematically to identify themes. Prioritize problems by frequency, impact, and severity. Support findings with concrete evidence.
Step 10. Report Findings and Actionable Recommendations to Stakeholders
Tailor reports to audiences. Use compelling evidence like videos and quotes. Provide specific, prioritized recommendations tied to findings.
Step 11. Follow Up on Implementation of Findings to Ensure Changes are Made
Convert recommendations into development tickets. Collaborate during implementation. Measure impact with follow-up tests. Demonstrate ROI to stakeholders.
Step 12. Plan for the Next Iteration and Future Usability Studies
Document emerging questions. Create research roadmaps aligned with product plans. Update quarterly. Make research a continuous, proactive cycle.
In combination, these 12 steps become an evidence-in, action-out machine. (GPT Image-1)
About the Author
Jakob Nielsen, Ph.D., is a usability pioneer with 42 years of experience in UX and the Founder of UX Tigers. He founded the discount usability movement for fast and cheap iterative design, including heuristic evaluation and the 10 usability heuristics. He formulated the eponymous Jakob’s Law of the Internet User Experience. He has been named “the king of usability” by Internet Magazine, “the guru of Web page usability” by The New York Times, and “the next best thing to a true time machine” by USA Today.
Previously, Dr. Nielsen was a Sun Microsystems Distinguished Engineer and a Member of Research Staff at Bell Communications Research, the branch of Bell Labs owned by the Regional Bell Operating Companies. He is the author of 8 books, including the best-selling Designing Web Usability: The Practice of Simplicity (published in 22 languages), the foundational Usability Engineering (28,897 citations in Google Scholar), and the pioneering Hypertext and Hypermedia (published two years before the Web launched).
Dr. Nielsen holds 79 United States patents, mainly on making the Internet easier to use. He received the Lifetime Achievement Award for Human–Computer Interaction Practice from ACM SIGCHI and was named a “Titan of Human Factors” by the Human Factors and Ergonomics Society.