The AI wars escalated sharply this week. OpenAI launched Sora 2 with a standalone social app, positioning video generation not as a creative tool but as a new medium for communication itself. The model’s ability to accurately simulate real-world physics marks what OpenAI calls “the GPT-3.5 moment for video.” But the company can’t claim the spotlight alone: Canada’s government is racing to finalize a national AI roadmap in just 30 days, Microsoft is rolling out Agent Mode that automates entire spreadsheets and documents, and Anthropic just shipped what may be the strongest coding model yet. No single announcement dominates. Instead, the week underscores how quickly the battleground is expanding across consumer apps, enterprise productivity, national infrastructure, and developer tooling.
Listen to the AI-Powered Audio Recap
This AI-generated podcast is based on our editor team’s AI This Week posts. We use advanced tools like Google Notebook LM, Descript, and ElevenLabs to turn written insights into an engaging audio experience. While the process is AI-assisted, our team ensures each episode meets our quality standards. We’d love your feedback—let us know how we can make it even better.
🇨🇦 Canadian Innovation Spotlight
Canada’s New AI Task Force
Canada’s AI strategy is about to get a rapid refresh. Minister Evan Solomon formally unveiled the federal government’s new AI Strategy Task Force, a 26-member group of academics, entrepreneurs, and industry leaders who now have just 30 days to shape the country’s next national roadmap.
The roster pulls from across Canada’s AI ecosystem: Cohere’s Joelle Pineau, Inovia Capital’s Patrick Pichette, Build Canada’s Dan Debow, and Council of Canadian Innovators president Ben Bergen join names from academia like Michael Bowling of Google DeepMind and the University of Alberta, Gail Murphy of UBC, and Mary Wells of Waterloo. Representation also stretches into policy, security, and corporate innovation, signalling an effort to balance perspectives.
Solomon’s message at Toronto’s Empire Club and earlier at the ALL IN conference was clear: AI is “the second great technological revolution in the last quarter-century,” and Canada needs to move quickly if it wants to lead. The group’s accelerated timeline reflects that urgency. The task force will focus on research, commercialization, safety, education, infrastructure, and security, feeding into a strategy the minister pledged to table before year-end — a full year ahead of the original schedule.
Beyond the task force, the government also opened public consultations on October 1, extending the conversation to Canadians outside the room. With $2.4 billion in AI funding already committed in the 2024 budget, the pressure now falls on this group to provide the policy framework that can channel investment into compute, capital, and trust.
💼 Productivity & Consumer Tools
Microsoft Brings “Vibe Working” to Office
Microsoft is extending the idea of vibe coding into everyday productivity. The company announced a new Agent Mode for Excel and Word that can generate full spreadsheets and documents from a single prompt, alongside an Office Agent in Copilot chat that produces entire slide decks and reports on demand.
Agent Mode works by breaking down a task into smaller, auditable steps that play out in real time in the sidebar, more like watching an automated macro than a one-off AI answer. In Excel, that means building complex sheets that remain refreshable and verifiable, while in Word, it turns drafting into an ongoing dialogue where Copilot refines, clarifies, and suggests changes as you go. Early benchmarks place Agent Mode ahead of most competitors in accuracy, though still trailing human performance.

The bigger reveal may be Office Agent inside Copilot chat. Powered by Anthropic’s models, it can spin up entire PowerPoint decks or Word documents while pulling in live research. This move underscores Microsoft’s willingness to layer in more than just OpenAI’s models, mixing strengths from different providers to build its ecosystem.
For now, Agent Mode is available in the web versions of Excel and Word through the Microsoft 365 Copilot Frontier program, with desktop support on the way. Office Agent is also rolling out in the US to Copilot users, signalling Microsoft’s push to keep Office ahead in a crowded field of AI productivity tools.
Opera Launches AI-Centric Neon Browser
Opera has introduced Neon, a subscription-based browser built around AI from the ground up. For $19.99/month, Neon goes beyond traditional search and offers features like Neon Do, which can automate tasks such as summarizing an article and posting it to Slack, or pulling details from content you browsed days earlier.
One of Neon’s standout elements is Cards, which are reusable prompts that function like modular commands. Similar to The Browser Company’s “Skills,” cards let you chain together actions like building comparison tables or extracting details across tabs, creating an IFTTT-style workflow for AI tasks.
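To picture how reusable prompt cards might chain into a workflow, here is a minimal sketch; the card definitions and the run_prompt helper are hypothetical stand-ins, not Opera’s actual Neon interface.

```python
def run_prompt(prompt: str, context: str) -> str:
    """Placeholder for the browser's AI call; Neon's real interface isn't public here."""
    return f"[AI output for {prompt!r} over {len(context)} chars of context]"

# Each "card" is just a reusable prompt template.
CARDS = {
    "extract_specs": "Pull the key specs from each open product tab.",
    "compare_table": "Build a comparison table from the extracted specs.",
    "summarize": "Summarize the comparison in three bullet points.",
}

def run_chain(card_names, context):
    """Chain cards IFTTT-style: each card's output becomes the next card's context."""
    for name in card_names:
        context = run_prompt(CARDS[name], context)
    return context

print(run_chain(["extract_specs", "compare_table", "summarize"], "open tab contents..."))
```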
The browser also introduces Tasks, self-contained workspaces that combine tabs with AI chats, echoing Arc Browser’s workspaces but with agentic capabilities layered in. Neon can even generate code snippets, turning browsing sessions into mini app-building environments.
With these features, Opera positions Neon as a power-user tool, setting it apart from free AI experiments in Chrome, Edge, or Arc. The challenge now will be proving its value beyond demos and convincing users to pay for a browser in a crowded, competitive space.
YouTube Music Experiments with AI Hosts
YouTube Music is testing AI hosts that layer trivia, backstories, and commentary on top of the tracks you’re listening to. It’s an experiment that echoes Spotify’s AI DJ, but with a stronger emphasis on fan knowledge and contextual storytelling.
The feature is part of YouTube Labs, a new hub where users can try early-stage AI products. Unlike Google Labs, which tends to focus on standalone tools, YouTube Labs is centred on integrating AI into the music and video experience. Access is limited to U.S. participants for now, but it doesn’t require a Premium subscription, and anyone can sign up if invited.
This latest test follows a string of recent AI additions: conversational radio that builds stations from natural-language prompts, generative tools for Shorts, and an AI-powered search carousel for discovery. At the same time, YouTube has been updating its policies to curb “AI slop,” restricting monetization of repetitive, low-quality generated content.
Taken together, YouTube is pushing AI deeper into how people find, enjoy, and interact with music, while signalling that the platform wants creativity and commentary, not mass automation, to shape its AI future.
🤖 OpenAI’s Expanding Agenda
OpenAI Launches Sora 2
OpenAI’s biggest release of the year is here: Sora 2, a next-generation video and audio model, paired with a new social platform that rivals TikTok.
Sora 2 represents a major step forward in realism and control. Earlier models often bent reality to satisfy prompts, with teleporting basketballs or impossible acrobatics. Sora 2 can now model both success and failure. Missed shots bounce off the backboard, paddleboards react with real buoyancy, and Olympic-level stunts carry convincing physics. The model also follows complex instructions across multiple shots and generates dialogue, background sound, and effects with cinematic quality.
The Sora app is launching alongside the model. Available on iOS in the United States and Canada, it introduces a TikTok-style feed where users can generate, remix, and share AI videos. Its standout feature is cameos, which allow you to upload a short video and audio clip to capture your likeness, then drop yourself into any Sora scene. You can also share your cameo with friends so they can create videos that include you together.
OpenAI says the app is designed to prioritize creation over passive scrolling. Users can personalize their feeds with natural language instructions, and parental controls allow families to set limits on infinite scroll, algorithmic personalization, and messaging. At launch, the app is free, with paid options only appearing if demand exceeds available compute.
The combination of generative video and a social platform raises important safety questions. OpenAI allows users to revoke cameo permissions at any time, but likeness-sharing could be misused to generate harmful or deceptive content. As with ChatGPT, the company will need to show that its safeguards can keep pace with rapid adoption.
Together, Sora 2 and the Sora app mark OpenAI’s clearest step beyond chat.
OpenAI Introduces ChatGPT Pulse
OpenAI is rolling out ChatGPT Pulse, a feature that flips the chatbot dynamic on its head by generating personalized morning briefs while you sleep. Instead of waiting for prompts, Pulse delivers five to ten reports that range from curated news to trip itineraries, with the goal of making ChatGPT the first app you check in the morning.
Pulse is part of OpenAI’s shift toward asynchronous AI products that act more like personal assistants than reactive chatbots. Early demos showed a wide range of briefs: a soccer news roundup, Halloween costume suggestions, even family travel itineraries. Each appears as a card with AI-generated visuals and source links, and users can query ChatGPT further for context. Importantly, Pulse is designed with a stopping point (“Great, that’s it for today”), a deliberate counter to infinite-scroll feeds.

For now, the feature is exclusive to the $200/month Pro tier, reflecting its heavy compute demands. OpenAI says Plus subscribers will follow once the system becomes more efficient, and a broader rollout will depend on scaling new AI data centers with partners like Oracle and SoftBank. Pulse also integrates with Connectors like Gmail and Google Calendar, pulling in emails, agendas, and personal context from ChatGPT’s memory to shape more relevant updates.
The bigger question is whether Pulse edges into the territory of existing news and productivity apps. By citing sources and staying tightly scoped, OpenAI is trying to frame it less as a replacement and more as a personalized layer. Long-term ambitions hint at more agentic capabilities such as handling reservations or drafting emails, but for now, Pulse is an experiment in making ChatGPT not just reactive, but proactive.
GPT-5 Put to the Test Against Human Experts
OpenAI has introduced GDPval, a new benchmark designed to gauge how its models stack up against professionals across key industries. The first version, GDPval-v0, tested 44 occupations in nine sectors that make up the bulk of the U.S. economy, from healthcare and finance to manufacturing and government.
The setup was simple: experienced professionals compared reports written by humans and AI models without knowing which was which, then chose the stronger result. In these head-to-head trials, GPT-5-high, the configuration that spends extra inference compute on reasoning, matched or outperformed experts in about 40% of cases. Anthropic’s Claude Opus 4.1 edged slightly higher at 49%, though OpenAI argued that Claude’s tendency to produce visually polished outputs may have boosted its score.
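To make the scoring concrete, here is a minimal sketch of how a win-or-tie rate can be tallied from this kind of blinded pairwise grading; the task labels and helper function are illustrative assumptions, not OpenAI’s actual GDPval tooling.

```python
from collections import Counter

# Each record is one blinded comparison: an expert grader saw a human-written
# report and a model-written report in random order, then picked the stronger
# one or called it a tie. The labels below are illustrative, not GDPval's schema.
judgments = [
    {"task": "financial summary", "verdict": "model"},
    {"task": "patient intake report", "verdict": "human"},
    {"task": "manufacturing QA memo", "verdict": "tie"},
    {"task": "grant proposal draft", "verdict": "model"},
]

def win_or_tie_rate(records):
    """Share of comparisons where the model matched or beat the expert,
    i.e. the metric behind figures like GPT-5-high's ~40%."""
    counts = Counter(r["verdict"] for r in records)
    return (counts["model"] + counts["tie"]) / len(records)

print(f"Model matched or outperformed experts in {win_or_tie_rate(judgments):.0%} of cases")
```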
For perspective, GPT-4o, which was released just 15 months earlier, managed only 13.7% on the same test. The leap underscores how quickly large models are closing the gap in tasks that involve structured analysis and written reporting. Still, OpenAI is quick to note that GDPval today measures only a narrow slice of work: creating reports, not managing teams, interacting with clients, or handling the unpredictable edge cases that make up much of real-world jobs.
Even with those limits, OpenAI sees GDPval as a step toward its mission of measuring progress toward AGI. Its chief economist, Aaron Chatterji, framed the results as a productivity signal: if models can already take on some research and reporting, professionals may be freed to focus on higher-value responsibilities. Evaluations lead Tejal Patwardhan pointed out that the pace of improvement suggests this trend will only accelerate, and that new, more comprehensive benchmarks will be needed as models expand into more interactive, real-world workflows.
OpenAI Tests Safety Routing and Parental Controls
OpenAI is introducing new safeguards in ChatGPT, rolling out a safety routing system and parental controls that have sparked both praise and criticism. The changes come amid heightened scrutiny, including a wrongful death lawsuit alleging that the chatbot reinforced a teenager’s harmful delusions before his suicide.
The routing system detects emotionally sensitive conversations and temporarily switches responses to GPT-5-thinking, a model variant trained with “safe completions” that aim to respond more responsibly to high-stakes prompts. Unlike earlier models, particularly GPT-4o, which became known for its overly agreeable style, GPT-5-thinking is designed to provide supportive but grounded answers. Users can see which model is active if they ask, and OpenAI has set a 120-day window to refine the system based on real-world feedback.
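As a rough illustration of how per-message routing like this could work, here is a minimal sketch; the keyword heuristic and model identifiers are placeholders, not OpenAI’s actual classifier or API values.

```python
DEFAULT_MODEL = "gpt-5"           # placeholder identifiers, not official API values
SAFETY_MODEL = "gpt-5-thinking"   # the variant trained with "safe completions"

SENSITIVE_MARKERS = ("hopeless", "self-harm", "hurt myself")  # toy heuristic only

def looks_sensitive(message: str) -> bool:
    """Stand-in for a real classifier that flags emotionally sensitive content."""
    text = message.lower()
    return any(marker in text for marker in SENSITIVE_MARKERS)

def route(message: str) -> str:
    """Pick a model per message, so only high-stakes turns hit the safety model."""
    return SAFETY_MODEL if looks_sensitive(message) else DEFAULT_MODEL

print(route("Can you summarize this article?"))            # gpt-5
print(route("I feel hopeless and don't know what to do"))  # gpt-5-thinking
```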
Alongside routing, parental controls are now available for teen accounts. Parents can set quiet hours, disable voice mode and memory, restrict image generation, and opt out of model training. The controls also introduce additional content filters and even a detection system that flags signs of possible self-harm. If risk signals appear, OpenAI may alert parents directly and, in extreme cases, escalate to emergency services.
The reception has been divided. Some users and safety advocates welcome the move toward stronger protections, while others argue that the measures risk infantilizing adults and diminishing the fluidity that made earlier models appealing. For OpenAI, the challenge will be balancing these competing expectations while ensuring the system does more good than harm.
⚙️ New Models & Capabilities
Anthropic Debuts Claude Sonnet 4.5
Anthropic has released Claude Sonnet 4.5, positioning it as the company’s strongest model yet for software engineering. Unlike earlier iterations that often felt best suited for prototypes, this version is pitched as capable of building production-ready applications, from writing code to standing up infrastructure and even performing compliance checks.
The model is already drawing endorsements from developer-focused companies. Cursor’s CEO called it state-of-the-art on long-horizon coding tasks, while Windsurf’s leadership described it as a new generation of coding models. Early enterprise trials reportedly saw Claude run autonomously for up to 30 hours, handling tasks like database setup, domain registration, and security audits.
On benchmarks, Claude Sonnet 4.5 delivers top results on SWE-bench Verified and other coding evaluations, though Anthropic notes that standardized tests can’t fully capture its longer-run reliability. Pricing remains the same as Claude Sonnet 4, keeping it accessible to developers at $3 per million input tokens and $15 per million output tokens.
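To put that pricing in concrete terms, here is a quick cost sketch at the listed $3 and $15 per-million-token rates; the token counts are hypothetical usage figures, not Anthropic numbers.

```python
# Published Claude Sonnet 4.5 rates (USD per million tokens).
INPUT_RATE = 3.00
OUTPUT_RATE = 15.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single API call at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_RATE + (output_tokens / 1_000_000) * OUTPUT_RATE

# Hypothetical coding-agent call: a large codebase in context, a long patch out.
print(f"${request_cost(200_000, 20_000):.2f} per request")        # $0.90
# A month of 1,000 such requests.
print(f"${request_cost(200_000, 20_000) * 1_000:.2f} per month")  # $900.00
```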
The release also introduces the Claude Agent SDK, opening up the same infrastructure that powers Claude Code to outside developers. And for Max subscribers, Anthropic is previewing “Imagine with Claude,” a research demo that generates software live in response to user requests. Combined, these moves signal Anthropic’s bid to reinforce its reputation as the coding-first AI company, even as OpenAI’s GPT-5 encroaches on that territory.
Keep ahead of the curve – join our community today!
Follow us for the latest discoveries, innovations, and discussions that shape the world of artificial intelligence.