Curated Resource

2025 in AI Video (Featuring My Top 10)

my notes

I thought that 2024 was the year of AI video and that 2025 would be the year of AI agents. There certainly was strong growth in AI agents in 2025, but they’re not quite ready for mainstream use yet, even as enterprise use cases are starting to emerge.

AI video did indeed do well in 2024, as shown by my list of top 10 videos of 2024 and particularly my music video “Top 10 UX Articles of 2024” made in late December 2024. But the 2024 progress in AI video was nothing compared to the leaps in AI video capabilities realized in 2025. In retrospect, that December 2024 video is quite primitive compared to the videos I released in December 2025 (for example, “Old Workers Stay Creative With AI”).

To demonstrate the progress in AI video this year, I have made a highlights reel with clips from my best music videos released from January to December 2025. The clips are presented in chronological sequence, allowing you to judge the improvements.

I judge the rate of improvement to be as follows for the different components that go into making AI videos:

  • Speech synthesis: only modest improvements, because speech was already good by late 2024. The biggest improvement has been in models like ElevenLabs v3, which use a language model to understand the meaning of the words they are asked to speak, allowing them to adjust the emotional delivery of their lines. (For an example, watch my explainer video “Slow AI: User Control for Long Tasks Explained in 5 Minutes.”)
  • Songs and music: only modest improvements, because they were also already good by late 2024. One first: I was finally able to make an operatic aria that sounds reasonably good (no Mozart, though): “Direct Manipulation.” Even as recently as the summer of 2025, all my attempts at opera sounded like a bad Broadway musical, not like a proper opera, and therefore I never published them.
  • Avatar animation: big gains, with models like HeyGen Avatar IV now capable of fairly high fidelity, as long as we stay with talking head close-up views. Quality drops for full-figure presenters. (As an example, watch my music video “Creation by Discovery: Navigating Latent Design Space,” and observe how much better the singer performs in the closeup cuts.)

Current AI avatar models are notably better at rendering close-up views than full-figure avatars. (Seedream 4.5)

  • Dance and movement animation: much better now, but still far from what’s needed. For example, watch the dance sequences in “Creation by Discovery: Navigating Latent Design Space,” which was my recent attempt at creating a K-pop performance. Individual dance moves are usually good, but the overall choreography is poor and not at all up to JYP standards. My guess is that the video models are trained on numerous K-pop performances, so they have the individual moves down pat and can perform a synchronized group dance well. But they have no concept of coordinating the dance with the music. The same is true for the singers’ movements and the musicians’ use of their instruments: everything looks fine if the music is muted, but when you play the song as intended, you can see that the image and the audio aren’t coordinated, except for the lip sync.
  • Native audio with video generation: A huge leap forward with Veo 3 (and now 3.1), but the model only generates 8 seconds of video, meaning that it’s only useful for B-roll, not for complete movies, music videos, or explainers. As an experiment, I did edit together a sequence of 8-second clips for my AI rendition of Shakespeare’s most obscure play, “Pericles, Prince of Tyre.” The resulting mini-movie does allow you to get the gist of the plot, but I will be the first to admit that we need much better for AI movies to be enjoyable for their storytelling, rather than appreciated as demos of the technology.

AI video trends showed marked improvements in all areas. (Nano Banana Pro)

Video in 2026

Contrary to what some AI influencers say, I don’t think that “Hollywood is cooked” yet, and it won’t be cooked in 2026 either. We’ll definitely see bigger and bigger components of mainstream feature films and TV shows made with AI, but probably not fully made with AI until maybe 2028. For now, the best use of significant AI by a legacy studio seems to be Amazon’s series House of David, particularly in battle scenes and other special effects. (Interesting that I refer to Amazon as a legacy studio, but after their acquisition of MGM and continued releases of quite traditional films and TV shows with a very traditional “studio”-like production process, I think this is appropriate, especially to distinguish Amazon from the emerging independent creators.)

Traditional Hollywood film production is not cooked yet, but by 2028, I expect all major studios to either make the transition to AI or actually become cooked. (Seedream 4.5)

I do expect strong improvements in all parts of the AI video pipeline, especially in the native audio segment, where you get a full clip in one shot rather than having to generate the image and audio sides separately. 30-second clips are a realistic possibility by the end of 2026.

Character consistency from one clip to the next is already possible for the character’s visual appearance by uploading still images as references for the video model. (See my video “Aphrodite Explains Usability” for an example made with Veo 3.1.) The character’s voice still shifts from one clip to the next, meaning that we can’t suspend disbelief and experience a sequence of cuts as a single progression, the way we have in manually made movies ever since D. W. Griffith systematized cross-cutting and continuity editing into a coherent “film language” that lets long, complex narratives hold together instead of being perceived as disconnected short scenes. AI voice consistency in 2026? I think it’s more likely than not, given the demand.

Currently, the same character sounds different across clips, even when generated by the same AI video model. (Nano Banana Pro)

Music is needed both for music videos and to enrich the soundtrack of other videos. One-shot songs from Suno 5 are already good. I certainly tap my feet more to my own songs (made with Suno) than I do when listening to any chart-topper. And I rewatch my own music videos with pleasure many times more than I watch any million-view commercial music video, even from my favorite K-pop label (JYP Entertainment). Of course, this is because I make exactly what I like, but that’s the point about AI-supported creation: we’re no longer limited by corporate mass media and its bland taste. Instead, everybody can pursue their own individual vision and create what they like.

Music models still provide a fairly limited ability to navigate the latent design space. Suno recently added equalizer (EQ) editing to music tracks, but this feature seems like a very limiting pre-AI approach to music creation: you can vary the intensity of the sound in different frequency bands. Instead, we need semantic editing features: at a minimum, the ability to request things like making the drums softer or emphasizing the woodwinds, as a symphony orchestra conductor might do to achieve a spooky or supernatural effect for a certain scene in a ballet. (Thinking of you, Giselle.) However, true semantic editing would allow the user to specify a higher level of intent (e.g., spooky vs. romantic mood) and navigate the latent space along meaningful vectors. This example is about music, because I want to vent about Suno’s misguided new feature, but the same point applies to any of the constituent media forms that come together in a full video: speech, movement, dance, action scenes, general acting, and so on.
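To make “navigating the latent space along meaningful vectors” concrete, here is a minimal, purely illustrative Python sketch. The embed() function is a made-up stand-in (it does not call Suno or any real music model); the mechanism it illustrates is deriving a “mood” direction from two contrasting descriptions and nudging a track’s latent representation along that direction, instead of adjusting per-band EQ levels.

```python
import numpy as np

def embed(description: str) -> np.ndarray:
    """Stand-in for a real model's encoder: maps a mood description to a fake 512-d latent."""
    rng = np.random.default_rng(abs(hash(description)) % (2**32))
    return rng.standard_normal(512)

# Derive a semantic direction by contrasting two moods.
spooky = embed("spooky, supernatural, hushed strings")
romantic = embed("romantic, warm, lyrical")
mood_axis = spooky - romantic
mood_axis /= np.linalg.norm(mood_axis)  # unit-length "romantic -> spooky" direction

# Start from the latent for an existing track and move it along the mood axis.
track = embed("Direct Manipulation, rock arrangement")
slightly_spookier = track + 0.5 * mood_axis   # small semantic step
much_spookier = track + 1.5 * mood_axis       # larger step, more risk of artifacts

# How far each edit moved along the mood axis (positive = toward "spooky").
print(float((slightly_spookier - track) @ mood_axis))  # 0.5
print(float((much_spookier - track) @ mood_axis))      # 1.5
```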

In sum: strong advances in AI video in 2025, but it’s not there yet for ambitious projects. We can taste the future, though. (Nano Banana Pro)

Judging Video Popularity Metrics

In deciding the list of my “top” videos for 2025, I considered several metrics:

  • Clickthrough rate. This is more about the thumbnail design than about the quality of the video at the other end of the click. Clickthrough rates are the lifeblood of creators who live by their views, because no video view happens unless the user clicks the thumbnail (or other link) first. However, even though I like the thumbnails I have been making lately, I don’t score super-high clickthrough rates because I refuse to engage in overused YouTube tropes, such as the exaggeratedly astonished face closeup. (These thumbnail styles are overused exactly because they work and deliver clicks: human brains have an enormous number of neurons devoted to face recognition and emotion recognition.)

90% of YouTube thumbnails seem to have come from the same casting call that rejected anybody who didn’t have the same signature look of extreme astonishment. They are designed that way because this look works. (Seedream 4.5)

  • View count. This is a more meaningful metric, because the various services usually only record a “view” if the user doesn’t just click through from the thumbnail but also continues watching the video for some time (even if not necessarily to the end). A terrible video with a compelling thumbnail will have a high click count but a low view count. I used views as a major component of my decision process, with the following modification: I summed views across all the platforms where I publish my videos (currently YouTube, LinkedIn, Instagram, and X), but with an age-related adjustment to the YouTube numbers. The older the video, the more time it has had to rack up YouTube views, whereas the social media platforms only expose a video for a few days, meaning that old videos don’t accumulate more views than new videos there. (One way such an adjustment could be computed is sketched after this list.)
  • View duration. This is a prime indicator of video quality: if people watch longer, they must like it more.
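The exact age adjustment isn’t spelled out above, so the following Python sketch shows just one plausible interpretation, with invented numbers: scale each video’s YouTube views down to a common exposure window so an older video doesn’t get credit merely for having been online longer, then add the social-platform counts as-is.

```python
from datetime import date

COMMON_WINDOW_DAYS = 60          # hypothetical normalization window
TODAY = date(2025, 12, 31)       # end of the period being ranked

def adjusted_total_views(published: date, youtube: int, social: dict[str, int]) -> float:
    """Cross-platform view total with an age adjustment applied to the YouTube count."""
    days_live = max((TODAY - published).days, 1)
    # Downscale YouTube views for videos that have been live longer than the window,
    # so a January release isn't rewarded merely for its extra months online.
    youtube_adjusted = youtube * min(1.0, COMMON_WINDOW_DAYS / days_live)
    return youtube_adjusted + sum(social.values())

# Illustrative numbers only; these are not real statistics from the article.
print(adjusted_total_views(
    published=date(2025, 2, 1),
    youtube=12_000,
    social={"linkedin": 3_000, "instagram": 1_500, "x": 900},
))
```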

View duration can be measured in many ways. YouTube reports two metrics: average view length and percentage of the video watched. Neither is a great metric. The average length can only ever be high for long videos. For example, my video “Aphrodite Explains Usability” only runs for 41 seconds, making it impossible for it to even score a full minute in average viewing time, even if people like it enough to watch the entire thing. On the other hand, “Transformative AI (TIA): Scenarios for Our Future Economy” is an 8-minute video, and it’s empirically proven that very few online users continue watching a video for that long, even if it’s good. Its average view duration is 4 minutes 3 seconds, so it handily beats Aphrodite, who never had a chance.

I prefer a third metric: the drop-off rate. YouTube tells you how many viewers are still watching after 30 seconds: 49% for Aphrodite and 68% for Transformative AI. Now we’re talking: the second video wins on this fair metric where any-length video (above 30 seconds, that is) has an equal chance.

Comparing the viewing retention curves for two versions of my Direct Manipulation song: opera version left and rock version right. People watched the rock version longer. (Of course, I know that the mainstream audience doesn’t like opera, but I do, so I reserve the right to make more operas in the future.)

By default, YouTube reports retention at 30 seconds, but you can get the number at other points by dragging along the retention curve. I also like to calculate what share of those people who watched at the 0:30 mark are still watching after a full minute: those are the real fans who truly enjoy that video.

In the example I showed here, these numbers are:

  • Opera version: 53% watching after 30 secs, 38% watching after 1 minute. Thus, 72% of the half-minute viewers stuck with the video for a full minute.
  • Rock version: 64% watching after 30 secs, 49% watching after 1 minute. Thus, 77% of the half-minute viewers continued watching for a full minute.

Both metrics show that people preferred the rock version.
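As a quick check on the arithmetic, this Python snippet recomputes the “real fan” ratio (the share of half-minute viewers still watching at the one-minute mark) from the retention figures quoted above.

```python
# Retention figures quoted above: fraction of starting viewers still watching
# at 0:30 and at 1:00 for the two versions of the Direct Manipulation song.
versions = {
    "opera": {"at_30s": 0.53, "at_60s": 0.38},
    "rock":  {"at_30s": 0.64, "at_60s": 0.49},
}

for name, r in versions.items():
    fan_ratio = r["at_60s"] / r["at_30s"]  # share of 0:30 viewers who reach 1:00
    print(f"{name}: {r['at_30s']:.0%} at 0:30, {r['at_60s']:.0%} at 1:00, "
          f"fan ratio {fan_ratio:.0%}")
# Prints roughly 72% for the opera version and 77% for the rock version.
```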

In 2026, AI video will likely realize more of its potential to empower independent creators. The longer-term future could be more revolutionary, pivoting from linear storytelling to worldbuilding, in which users shift from passive observers to participants immersed in an environment specified by the creator. (Nano Banana Pro)

My Top 10 Videos of 2025

1. No More User Interface

AI products have evolved from enhancing classical user interfaces to becoming the primary means for users to engage with digital features and content. This may mean the end of UI design in the traditional sense, refocusing designers’ work on orchestrating the experience at a deeper level, especially as AI agents do more of the work.

2. Service as Software

(Also as a music video)

Over the upcoming decade, AI-provisioned intelligence will become almost free and instantly available. AI won’t just assist professionals; it will take over much of the work as a packaged, instant service provider. Welcome to the age of boundless skill scalability, where services transform into software and economies grow beyond human imagination.

3. Making a Usability Action Figure

All in good fun. The action figure was first created as a still image using ChatGPT’s native image model, then animated with Kling 1.6. The video version seems more tangible than the still image, thanks to the 3D animation. I am pleased that computers and AI can help humans have fun and enjoy themselves. It doesn’t always have to be so serious.

4. Vibe Coding and Vibe Design

(Also as a music video)

AI transforms software development and UX design through natural language intent specification. This shift accelerates prototyping, broadens participation, and redefines roles in product creation. Human expertise remains essential for understanding user needs and ensuring quality outcomes, balancing technological innovation with professional insight.

5. UX in 2025: Jakob Nielsen’s 6 Big Themes

(Also as a music video)

The UX field is in transition. This avatar video explains Jakob Nielsen’s predictions for the 6 themes for the user experience profession in 2025.

6. UI vs. UX: Jakob Nielsen Explains the Difference

The key differences between User Interface (UI) and User Experience (UX). UI consists of the tangible elements users interact with, such as buttons, menus, and layouts, which could be graphical, gestural, or auditory. UX, however, encompasses a user’s overall feelings, interpretations, and satisfaction when using a product. Although UX is shaped by UI, it’s not directly designed but influenced through UI choices, product names, and marketing messages. Ultimately, AI is predicted to automate most UI design tasks, enabling human designers to focus more strategically on enhancing UX.

7. Recognition Rather than Recall (Jakob Nielsen’s Usability Heuristic 6)

Jakob Nielsen’s 6th usability heuristic, “Recognition Rather than Recall,” advises designers to minimize users’ memory load by making key information and options visible or easily accessible. In other words, interfaces should allow users to recognize elements (by seeing cues or prompts) instead of forcing them to recall information from memory.

8. Pivot Your UX Career for the AI Age

UX professionals only have a few years to adapt before AI reshapes everything. Legacy skills like wireframing are becoming obsolete as AI handles technical tasks. The future demands uniquely human abilities: Agency (taking initiative), Judgment (choosing from AI options), and Persuasion (selling ideas). Best join an AI-native company and abandon resistance to change. This transition period isn’t doom; it’s opportunity. We’re actively inventing UX’s future through experimentation. Time to trade legacy expertise for future relevance.

9. Error Prevention Explained by Vikings (Jakob Nielsen’s Usability Heuristic 5)

Prevent errors over recovery. Design with constraints (e.g., date pickers, dropdowns, smart defaults), not free text. Use inline validation and autocomplete as invisible guides. Confirm only destructive actions. Scale friction to risk: nudges for minor issues, speed bumps for irreversible ones. Prevention cuts support costs and builds user confidence through invisible craftsmanship.

10. AI Going Mainstream: Crossing the “Chasm” to Early Majority Users

(Also as an explainer video)

AI is surging from early-adopter novelty to everyday utility. Yet this transition is uneven across countries and use cases. In some cases, AI has already crossed Geoffrey Moore’s famed chasm between visionary early adopters and the pragmatic early majority, whereas in other cases, AI diffusion is much slower.

Bonus Videos: AI Helps Old Users Stay Creative

Available as both a music video and an avatar explainer. (I am particularly pleased with the music version: this is a case where the song lyrics explain the story better than the prose narration.)

The human aging process is detrimental to the brain, particularly degrading fluid intelligence, which peaks around age 20. Luckily, crystallized intelligence continues to increase, and the combination of the two means that creative knowledge workers are at their best around age 40. Then, downhill! AI changes this age-old picture by augmenting older creative professionals’ declining fluid intelligence, allowing them to make full productive use of their superior crystallized intelligence. AI extends the creative careers of older professionals by several decades.

(Sadly, my analysis of how AI helps seniors wasn’t very popular, but I think it’s important enough that I’ll list it here anyway! It’s my newsletter and my rules. YouTube’s analytics indicate that most of my audience is in the 25–44-year range, which may be why they don’t care about the user experience of older adults. Just you wait, brain decay is coming for you sooner than you think.)

Main Takeaway: UX Is Evolving. Are You?

There are many details in these videos, but my main message can be summarized in this infographic (NotebookLM):

About the Author

Jakob Nielsen, Ph.D., is a usability pioneer with 42 years of experience in UX and the Founder of UX Tigers. He founded the discount usability movement for fast and cheap iterative design, including heuristic evaluation and the 10 usability heuristics. He formulated the eponymous Jakob’s Law of the Internet User Experience. He was named “the king of usability” by Internet Magazine, “the guru of Web page usability” by The New York Times, and “the next best thing to a true time machine” by USA Today.

Previously, Dr. Nielsen was a Sun Microsystems Distinguished Engineer and a Member of Research Staff at Bell Communications Research, the branch of Bell Labs owned by the Regional Bell Operating Companies. He is the author of 8 books, including the best-selling Designing Web Usability: The Practice of Simplicity (published in 22 languages), the foundational Usability Engineering (29,538 citations in Google Scholar), and the pioneering Hypertext and Hypermedia (published two years before the Web launched).

Dr. Nielsen holds 79 United States patents, mainly on making the Internet easier to use. He received the Lifetime Achievement Award for Human–Computer Interaction Practice from ACM SIGCHI and was named a “Titan of Human Factors” by the Human Factors and Ergonomics Society.

Read the Full Post

The above notes were curated from the full post jakobnielsenphd.substack.com/p/2025-videos.

