Google Veo 3 is coming to Canva: generative AI is adding sound to your videos

A historic step forward for the audiovisual industry

In May 2025, DeepMind (a subsidiary of Google) unveiled Veo 3, an AI-based video generation model. Capable of producing short 4K video clips with integrated audio (voice, sound effects, and music), Veo 3 represents a technological breakthrough in the audiovisual field¹. Within a few weeks, traffic on specialized platforms surged by 162%, a sign of the creative community’s immediate and massive interest in this new capability². The release marks the end of the era of silent AI-generated videos and paves the way for more immersive and accessible audiovisual creation.

Multimodal technology: text, images, video, and audio

Veo 3 is based on a hybrid diffusion-transformer architecture, optimized to maintain visual consistency across long sequences. One of the model’s major strengths is its multimodal capability: it accepts text prompts, as well as still images or video clips as input, enabling it to reproduce a specific style or mood. Veo 3 also integrates camera controls (zoom, pan, drone) and advanced physics simulation (light, shadows, fluids, and textures), ensuring realistic, professional rendering³.

Specific, concrete applications

Veo 3 applications can be found in several sectors:

Film & advertising: ultra-realistic 4K VFX production, at up to 99% lower cost than traditional methods, lets directors and advertisers create prototypes and teasers quickly⁴.
Video games: Veo 3 simplifies the creation of immersive cinematics for trailers or intros, reducing production costs and speeding up time-to-market.
Social media: Creators can now produce short videos with narration, boosting engagement by 30%, which demonstrates the added audiovisual value on platforms like Instagram and TikTok⁵.
Education & e-learning: Veo 3 enables the creation of multimodal educational content (voice-over animations, animated scientific demonstrations), making learning more visual and auditory, and therefore more effective.
E-commerce & branding: Companies can quickly create animated product videos with narration, boosting conversion rates through more engaging content.

Technical limitations and ethical challenges

Despite its advances, Veo 3 has certain limitations:

Limited video duration (approx. 8 seconds at 720p) in the basic package. Longer 4K versions are still in development and are reserved for Gemini Ultra subscribers or users of the Vertex AI API⁶.
Audio synthesis is still imperfect, particularly in terms of natural intonation, lip-syncing, and complex emotions, often requiring post-production adjustments⁷.
The risk of deepfakes: the ease with which realistic visuals can be generated raises ethical questions. Google offers an invisible SynthID watermark and moderation tools, but potential abuses require legal and technical vigilance⁸.
High cost and limited accessibility: The Gemini Ultra subscription, priced at $249 per month, is realistically within reach only for studios and large companies, leaving independent creators waiting for more affordable options.
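
To make the duration limit concrete, here is a minimal Python sketch of how a longer sequence might be planned around it. It assumes the cap of roughly 8 seconds per clip quoted above. The function name plan_clips and the idea of stitching the clips together in post-production are illustrative assumptions, not part of any Google tool.

```python
import math

# Assumption: per-clip cap of about 8 seconds, as quoted in this article.
MAX_CLIP_SECONDS = 8.0


def plan_clips(target_seconds: float, max_clip: float = MAX_CLIP_SECONDS) -> list[float]:
    """Split a target duration into equal clip lengths that each fit under the cap."""
    if target_seconds <= 0 or max_clip <= 0:
        raise ValueError("durations must be positive")
    # Smallest number of clips that keeps every clip at or below the cap.
    count = math.ceil(target_seconds / max_clip)
    # Spread the duration evenly so no clip is a tiny leftover.
    return [target_seconds / count] * count


if __name__ == "__main__":
    clips = plan_clips(30)
    print(len(clips), clips[0])
```

Under these assumptions, a 30-second teaser would need four generated clips of 7.5 seconds each, which then have to be assembled in an editing tool.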

Tomorrow’s Skills for Designers

With the arrival of Veo 3, the video industry is changing, and designers will need new skills:

Creative prompt design: write a clear, visual brief to guide the AI toward the desired outcome.
Video and audio post-production: adjust the generated sequences (editing, color correction, lip-syncing) for professional rendering.
Technical understanding: Understand AI mechanisms (pipeline, format management, watermarking) to better integrate the tool into the workflow.
Ethics and regulation: master the legal principles governing image rights, privacy protection, and the responsible use of audiovisual content.

These hybrid skills, which bridge the gap between art, digital technology, and ethics, are essential for Veo 3 to reach its full potential.
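
As an illustration of what a clear, visual brief can look like in practice, the following Python sketch structures a brief into fields and renders it as a single text prompt. The VideoBrief class, its field names, and the output format are hypothetical, chosen for this example only. They are not an official Veo 3 schema, and the wording a given model responds to best may differ.

```python
from dataclasses import dataclass, field


@dataclass
class VideoBrief:
    """Hypothetical structure for a text-to-video brief (not an official schema)."""
    subject: str
    setting: str
    camera: str = "static shot"  # e.g. zoom, pan, drone
    audio: list[str] = field(default_factory=list)  # voice, sound effects, music
    style: str = ""

    def to_prompt(self) -> str:
        """Render the brief as one plain-text prompt."""
        parts = [f"{self.subject} in {self.setting}", f"Camera: {self.camera}"]
        if self.style:
            parts.append(f"Style: {self.style}")
        if self.audio:
            parts.append("Audio: " + ", ".join(self.audio))
        return ". ".join(parts) + "."


if __name__ == "__main__":
    brief = VideoBrief(
        subject="A barista pouring latte art",
        setting="a sunlit cafe",
        camera="slow zoom in",
        audio=["soft jazz", "steam hiss"],
    )
    print(brief.to_prompt())
```

Keeping the subject, camera, style, and audio in separate fields makes it easier to change one element of the brief at a time and compare the generated results.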

Veo 3: Toward Hybrid and Collaborative Professions

By 2030, audiovisual production will rely on hybrid teams with a wide range of skills:

The executive producer, who guides the vision and ensures narrative consistency.
The prompt engineer, trained in AI language models to guide multimodal content creation.
The AI sound designer, who is responsible for sound quality and lip-syncing.
The content ethicist, who ensures the responsible use of images and data.
The AI technician, who is responsible for model integration, deployment, and maintenance.

This organization will foster a creative synergy that is faster, more collaborative—and above all, more human.

Ethics & Responsibility: A Competitive Advantage

More than just a technical issue, ethics is becoming a key factor in building trust:

Content traceability: The SynthID watermark identifies the origin of generated videos.
Transparency and control: managing prompts and the AI pipeline ensures a controlled, compliant narrative.
Combating misinformation: by combining watermarking, moderation, and contextual verification, technology can limit the spread of deepfakes.
Inclusive creation: Veo 3 democratizes access to professional-quality content, promoting diversity of voices and styles in audiovisual production.

These measures position Veo 3 and its creators as responsible stewards of the future of content.

People are still in control

Veo 3 does not mark the end of the director’s or designer’s career; on the contrary, it enhances it. By automating technical tasks, AI offers gains in time, creativity, and precision.
For this transformation to be successful, several conditions must be met:

A clear, ethical framework that includes watermarking, traceability, and up-to-date regulations.
Upskilling for everyone involved in audiovisual production.
An ongoing dialogue among technicians, lawyers, artists, and audiences.

In this way, AI becomes a partner, not a substitute—ensuring creativity that is enhanced, responsible, and rooted in human intent.

References

1. Wikipedia. (2025). Veo (text-to-video model).
https://fr.wikipedia.org/wiki/Veo_%28mod%C3%A8le_texte-vid%C3%A9o%29

2. Reuters. (2025). Veo 3 generates a 162% spike in traffic.
https://www.aibase.com/news/19041

3. DeepMind Blog. (2025). Veo 3: integrated audio and 4K rendering.
https://veo3.im/blog/deepmind-veo3

4. Veo3.io. (2025). Cinema & advertising use.
https://www.veo3.io/fr

5. Veo3.io. (2025). Cinema & advertising use.
https://www.veo3.io/fr

6. Tom’s Guide. (2025). Limited duration in standard version.
https://www.tomsguide.com/

7. Medium. (2025). Audio synthesis: progress and limits.
https://medium.com/

8. The Verge. (2025). SynthID & the fight against deepfakes.
https://www.theverge.com/
