AI This Week: Five Technologies That Do More Than Just Assist

March begins with a clear trend: AI stepping beyond advice into action. Opera’s new Browser Operator navigates websites and fills shopping carts while you focus elsewhere. Google brings data science automation directly into Colab notebooks. Deutsche Telekom’s upcoming voice-controlled phone aims to eliminate app-switching altogether. OpenAI’s surprisingly modest GPT-4.5 launch hints at bigger priorities on the horizon, and Podcastle enters the voice synthesis market with pricing that undercuts established rivals.

Opera Debuts AI Agent That Completes Web Tasks Automatically

Opera has introduced “Browser Operator,” an AI assistant that carries out online tasks based on user instructions. Users can direct the AI to find flights matching their requirements or handle other tedious online activities while they attend to other matters. When the AI finishes its assignment, it places selected items in the shopping cart for the user’s final review and payment.

The Browser Operator understands natural language instructions using the client’s local resources. It processes webpage information through the DOM Tree and browser layout data rather than capturing screenshots or videos of the browsing session. This approach allows it to access the entire page at once without needing to scroll through content, significantly reducing completion time.
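To illustrate why reading the DOM avoids scrolling, here is a minimal sketch in Python. It is not Opera's implementation: the class `ActionableElements`, the helper `list_actionable`, and the sample page are all hypothetical, and the standard-library `html.parser` stands in for a real browser's DOM. The point is only that a structural parse sees every link, button, and input in one pass, including elements far below the visible part of the page.

```python
from html.parser import HTMLParser


class ActionableElements(HTMLParser):
    """Collect links, buttons, and form controls from an HTML document."""

    TAGS = {"a", "button", "input", "select"}
    TEXT_TAGS = {"a", "button"}  # elements whose inner text we record

    def __init__(self):
        super().__init__()
        self.elements = []  # one dict per element found
        self._open = None   # element whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        if tag in self.TAGS:
            entry = {"tag": tag, "attrs": dict(attrs), "text": ""}
            self.elements.append(entry)
            self._open = entry if tag in self.TEXT_TAGS else None

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data.strip()

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None


# Hypothetical page: the button sits 4000px down, far outside any screenshot
# of the initial viewport, but the parser still finds it immediately.
PAGE = """
<html><body>
  <h1>Flights</h1>
  <a href="/flight/123">Berlin to Oslo, 9:40</a>
  <div style="margin-top: 4000px">
    <button id="add-to-cart">Add to cart</button>
    <input name="passenger" type="text">
  </div>
</body></html>
"""


def list_actionable(html):
    """Return every actionable element in the document, in source order."""
    parser = ActionableElements()
    parser.feed(html)
    parser.close()
    return parser.elements


if __name__ == "__main__":
    for element in list_actionable(PAGE):
        print(element["tag"], element["attrs"], element["text"])
```

A screenshot-based agent would need to scroll and capture several images to discover the same three elements, which is the time saving the paragraph above describes.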

Featured Image: Opera

The browser has incorporated essential protection features. When users need to input sensitive information like payment details, the Browser Operator pauses and allows direct interaction without processing this data through AI. Users maintain control throughout the process, with the ability to monitor progress, provide additional input when needed, or cancel operations at any point.

Opera intends to release the complete version in the coming months as part of its AI feature drop program, furthering its commitment to AI advancement following its fully AI-enabled browser launch in 2023 and its early adoption of large language models in the browser environment.

Deutsche Telekom Partners with Perplexity for New “AI Phone”

T-Mobile’s parent company, Deutsche Telekom, is entering the AI-focused smartphone market with plans to launch a dedicated “AI Phone” powered by Perplexity Assistant. The companies announced their partnership at the Mobile World Congress (MWC) in Barcelona, revealing that the new device will be released later this year under the “Magenta AI” ecosystem.

The AI phone was first previewed at MWC 2024 as an “appless” device that primarily relies on voice control for everyday tasks. According to Deutsche Telekom board member Claudia Nemat, the forthcoming device will enable users to book taxis and complete shopping without switching between different applications.

Beyond Perplexity’s AI-powered search capabilities, the Magenta AI ecosystem will integrate several other AI tools and services, including Google Cloud AI, ElevenLabs, and Picsart.

For existing smartphone users, Deutsche Telekom plans to make some of these AI features available through its MeinMagenta app. This will allow customers to access Perplexity’s AI assistant and other tools without purchasing the dedicated hardware. The company expects to roll out these app features this summer, ahead of the AI phone’s launch in the second half of 2025.

Featured Image: Deutsche Telekom

OpenAI Releases GPT-4.5 With Tempered Expectations

OpenAI has launched GPT-4.5, its newest and largest AI language model, which was initially available as a research preview for ChatGPT Pro subscribers. While the company describes it as its “most knowledgeable model yet,” it has taken the unusual step of downplaying expectations by clarifying that GPT-4.5 is not considered a frontier model compared to its specialized reasoning models like o1 and o3-mini.

According to OpenAI, GPT-4.5 offers enhanced writing capabilities, improved world knowledge, and what they describe as a “refined personality” that makes interactions feel more natural. The model reportedly excels at recognizing patterns and drawing connections, making it particularly effective for writing, programming, and practical problem-solving tasks.

A key technical achievement highlighted by OpenAI is GPT-4.5’s computational efficiency, which the company says improves on GPT-4’s by more than 10x. The company developed the model using new supervision techniques alongside traditional methods like supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).

GPT-4.5 also hallucinates less often than GPT-4o and slightly less often than the o1 model. On social media, OpenAI CEO Sam Altman candidly described GPT-4.5 as a “giant, expensive model” while acknowledging it “won’t crush benchmarks.” This measured approach to the release suggests the company is setting realistic expectations while reserving more significant advancements for the anticipated GPT-5 release.

This week, OpenAI has begun rolling out GPT-4.5 to ChatGPT Plus subscribers, following last week’s initial release to users of ChatGPT Pro ($200/month). The deployment will take 1-3 days. The model is also immediately available through Microsoft’s Azure AI Foundry platform.

Google Integrates Data Science Agent into Colab

Google has enhanced its cloud-based notebook tool, Google Colab, by adding a new AI agent called Data Science Agent. The integration aims to streamline data analysis workflows by helping users clean data, create visualizations, and extract insights from their datasets with minimal manual coding.

While Data Science Agent was initially launched as a standalone project following its announcement at Google’s I/O developer conference last year, the company has now decided to incorporate it directly into the Colab environment. According to Kathy Korevec, director of product at Google Labs, this integration allows users to access the agent’s capabilities within their Colab notebooks seamlessly.

The agent is powered by Google’s Gemini 2.0 AI model family and leverages specialized reasoning tools for data science tasks like feature engineering and data cleaning. It currently supports CSV, JSON, and .txt files under 1GB and can process approximately 120,000 tokens in a single prompt (by the common rule of thumb for English text, roughly 90,000 words or about 480,000 characters).
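For a sense of scale, a token limit can be converted into rough word and character counts. The sketch below assumes the widely quoted averages for English text of about 4 characters and 0.75 words per token. These ratios are assumptions, not figures from Google, and real counts vary with the tokenizer and the content. The function name `estimate_capacity` is hypothetical.

```python
CHARS_PER_TOKEN = 4     # assumption: rough average for English text
WORDS_PER_TOKEN = 0.75  # assumption: rough average for English text


def estimate_capacity(tokens):
    """Convert a token budget into approximate character and word counts."""
    return {
        "characters": tokens * CHARS_PER_TOKEN,
        "words": int(tokens * WORDS_PER_TOKEN),
    }


if __name__ == "__main__":
    # 120,000 tokens is the per-prompt figure quoted for Data Science Agent.
    print(estimate_capacity(120_000))
    # {'characters': 480000, 'words': 90000}
```

Data files such as CSV or JSON tend to tokenize less efficiently than prose, so the practical limit for tabular data may be lower than these estimates suggest.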

Users upload their data and ask questions in natural language. The Data Science Agent creates fully functional, executable notebooks that users can modify, extend, and share with teammates using standard Colab sharing features. This allows researchers and data scientists to focus on deriving insights rather than wrestling with setup and boilerplate code. The agent can help with tasks beyond standard data analysis, including identifying API anomalies, analyzing customer data, and generating SQL code.

Data Science Agent is free to all Colab users, though those on the free tier will face Colab’s standard computational limitations. For those requiring more processing power, Google offers paid Colab plans starting at $9.99 per month.

Weekly Tool Highlight: Podcastle

Podcastle has launched Asyncflow v1.0, its entry into the AI text-to-speech market. This new model powers an extensive library of AI voices and comes with developer tools for seamless integration.

Featured Image: Podcastle

Key Features

– Library of 450+ AI voices
– Developer API for application integration
– Cost-efficient training and inference architecture
– Competitive pricing at $40 per 500 minutes (compared to ElevenLabs’ $99)
– Streamlined voice cloning requiring only seconds of audio
– Integration with Podcastle’s Magic Dust AI for improved audio quality
– Unified platform for audio, video, podcasting, and AI narration tools
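The pricing comparison in the list above is easier to judge as a per-minute rate. The short calculation below assumes that the ElevenLabs figure of $99 covers the same 500 minutes, which the comparison implies but does not state outright. The helper `price_per_minute` is a hypothetical name used for illustration.

```python
def price_per_minute(price_usd, minutes):
    """Return the cost in US dollars of one minute of generated speech."""
    return price_usd / minutes


podcastle = price_per_minute(40, 500)
elevenlabs = price_per_minute(99, 500)  # assumption: same 500-minute volume
savings = 1 - podcastle / elevenlabs

if __name__ == "__main__":
    print(f"Podcastle:  ${podcastle:.3f} per minute")
    print(f"ElevenLabs: ${elevenlabs:.3f} per minute")
    print(f"Savings:    {savings:.0%}")
```

Under that assumption, Podcastle works out to $0.08 per minute against roughly $0.20, a discount of about 60%.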

Podcastle joins companies like ElevenLabs, Speechify, and WellSaid in the text-to-speech market, which serves diverse applications, including marketing, advertising, content creation, education, and corporate training.
