AI This Week: New Developments in Voice Agents, Reasoning Models, and Regulation

This week’s AI roundup features Google’s “Daily Listen” for personalized podcast news feeds; NovaSky’s cost-effective AI model, Sky-T1; ChatGPT’s new functionality that enables users to set reminders and schedule recurring tasks; OpenAI’s proposed economic blueprint for AI regulation; and Deepgram’s Voice Agent API, which allows developers to create sophisticated voice AI interactions. Unpack these stories to understand the ongoing evolution of AI and its expanding role in different sectors.

Google Introduces Daily Listen

Google is testing its new “Daily Listen” feature, redefining how we stay updated on our favourite topics. This innovative tool generates personalized, AI-powered podcasts directly from your Google Discover feed, ensuring you’re always in the loop with news tailored just for you. Daily Listen is designed for those deeply invested in their interests who want their news delivered quickly and engagingly.

Available exclusively to U.S. users who are part of Google’s Search Labs experiment, this feature crafts episodes up to five minutes long. Each podcast summarizes the stories users are most interested in and links to related articles. Integrating with Google Discover means the content is finely tuned to the listener’s preferences, making every podcast feel like it’s made just for them.

Like NotebookLM’s Audio Overviews, Daily Listen uses AI to create a seamless audio experience. It includes a rolling written transcript that users can scroll through as they listen. Users can access the feature from a new “Daily Listen” section on their Google app home screen, which shows a play button and the episode duration.

This feature promises a unique blend of convenience and customization for those with access, transforming how we consume news.

NovaSky’s Sky-T1 Model Sets New Affordability Standard

Researchers from UC Berkeley’s Sky Computing Lab have introduced Sky-T1-32B-Preview, a reasoning AI model that challenges traditional cost barriers. Developed under the banner of NovaSky, the model is touted as the first reasoning model to be fully open-source, because both the dataset and the training code needed to replicate it have been released.

What sets Sky-T1 apart is its accessibility and cost-effectiveness. The model was trained for less than $450, a fraction of the millions typically required to train comparable models, largely because it relied on synthetic training data.

Sky-T1 is designed for high-level reasoning in science and mathematics. The model self-verifies its reasoning, adding a layer of trustworthiness often absent in conventional AI. The 32-billion-parameter model was trained on 8 Nvidia H100 GPUs over 19 hours, showcasing its efficiency.
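
As a rough back-of-the-envelope check, the figures above can be combined to see what hourly GPU price the sub-$450 budget implies. This sketch assumes the $450 figure covers only GPU rental for the 19-hour run, which the article does not state, and the variable names are purely illustrative.

```python
# Figures quoted in the article.
GPUS = 8            # Nvidia H100 GPUs used for training
HOURS = 19          # reported training duration in hours
BUDGET_USD = 450.0  # reported upper bound on training cost

# Assumption: the whole budget went to GPU rental and nothing else.
gpu_hours = GPUS * HOURS
implied_rate = BUDGET_USD / gpu_hours

print(f"Total GPU-hours: {gpu_hours}")
print(f"Implied ceiling: ${implied_rate:.2f} per GPU-hour")
```

Under that assumption, the run amounts to 152 GPU-hours, so the budget only holds if each H100 cost less than about $2.96 per hour.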

In performance tests, Sky-T1 outperformed an early version of OpenAI’s o1 in solving competition-level math problems and complex coding challenges from LiveCodeBench. However, it still trails behind in areas requiring deep physics, biology, and chemistry knowledge.

The NovaSky team is committed to further enhancing the efficiency and accuracy of Sky-T1, promising more breakthroughs in developing cost-effective, powerful reasoning models. 

ChatGPT Adds Task Scheduling to Its Capabilities

OpenAI has introduced a significant update to ChatGPT, enabling paying users to schedule reminders and set recurring tasks. This new functionality, part of a beta feature rollout called “tasks,” is available globally to ChatGPT Plus, Team, and Pro users.

What You Can Do with Tasks

  • Set Reminders: Users can schedule reminders for upcoming meetings or appointments. For example, they could set a task like “Remind me about my weekly team meeting every Tuesday at 10 AM,” which the AI will flag via push notifications on enabled platforms.
  • Recurring Requests: ChatGPT can now handle requests like providing weekly weekend plans or daily news briefings at specified times tailored to the user’s location and preferences.

The tasks feature is accessible through the “4o with scheduled tasks” option in the ChatGPT dropdown menu. Users can manage their tasks directly through a chat interface or a dedicated tasks manager tab available only in the web app. While currently limited to non-purchase actions, this update marks a significant step towards more autonomous AI agents.
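
To make the idea of a recurring task concrete, a reminder such as “every Tuesday at 10 AM” comes down to computing the next matching dates. The sketch below is a purely illustrative calculation using Python’s standard library. It is not OpenAI’s implementation or API, the function name is hypothetical, and it ignores time zones for simplicity.

```python
from datetime import datetime, time, timedelta


def next_weekly_occurrences(start: datetime, weekday: int, at: time, count: int) -> list[datetime]:
    """Return the next `count` weekly occurrences strictly after `start`.

    `weekday` follows datetime.weekday(): Monday is 0, Tuesday is 1, and so on.
    """
    days_ahead = (weekday - start.weekday()) % 7
    first = datetime.combine(start.date() + timedelta(days=days_ahead), at)
    if first <= start:
        # Today is the right weekday but the time has already passed.
        first += timedelta(days=7)
    return [first + timedelta(weeks=i) for i in range(count)]


TUESDAY = 1
now = datetime(2025, 1, 15, 9, 0)  # an arbitrary example "current time"
upcoming = next_weekly_occurrences(now, TUESDAY, time(10, 0), 3)
for moment in upcoming:
    print(moment.strftime("%A %Y-%m-%d %H:%M"))
```

A real scheduler would also need to handle the user’s time zone and daylight saving changes, which this sketch deliberately leaves out.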

During this beta phase, OpenAI aims to gather user feedback on the tasks feature to refine and expand its functionality. The company is also preparing to introduce more sophisticated systems, such as the Operator agent, which could manage tasks like writing code and booking travel.

OpenAI’s Blueprint for AI Regulation: A Call for Collaborative Policy Development

OpenAI has recently unveiled its “economic blueprint,” a comprehensive proposal outlining the policies the company believes are essential for the U.S. to maintain its leadership in AI innovation while safeguarding national security. This blueprint, presented as a document, seeks collaborative efforts with the U.S. government and its allies to establish a cohesive regulatory framework for AI development.

Key Aspects of the Blueprint

  • Investment in Infrastructure: The blueprint emphasizes the need for substantial investments in chips, data centres, energy, and talent to bolster the AI sector significantly.
  • State vs. Federal Regulation: It highlights the challenges of disparate state-level AI regulations, which often conflict, making a strong case for a unified federal strategy.
  • International Collaboration and Export Controls: OpenAI proposes developing best practices for AI model deployment and streamlining engagement with national security agencies. It also suggests establishing export controls that foster collaboration with allies while restricting access to adversarial nations.
  • Innovation and Copyright: The document addresses the contentious issue of copyright in AI development, advocating for policies that allow AI to utilize publicly available information while protecting creators’ rights against unauthorized use.
  • Voluntary Pathways and Partnerships: OpenAI recommends creating voluntary pathways for AI companies to collaborate with the government on model evaluations and safety standards, echoing approaches taken in recent executive orders.

The blueprint also reflects OpenAI’s ongoing efforts to influence policy through increased lobbying and strategic hires from government sectors. As the company expands its lobbying and forms key partnerships, such as those with the Pentagon and startups like Anduril, it reinforces its position as a pivotal player in shaping the future of AI governance.

Weekly Tool Highlight: Deepgram’s Voice Agent API

This week, we’re featuring Deepgram’s Voice Agent API, a platform that lets developers integrate advanced voice AI into their applications. Deepgram offers APIs for speech-to-text, text-to-speech, and language understanding, supporting a range of applications from medical transcription to fully autonomous voice agents.

Key Features of Deepgram’s Voice Agent API:

  • Unified Voice-to-Voice API: Enables natural-sounding conversations between humans and machines, ideal for creating responsive voice AI experiences.
  • Speech to Text: Provides transcription services with unmatched accuracy, offering fast response times at competitive costs.
  • Audio Intelligence: Delivers advanced insights with enterprise-scale audio analysis, enhancing interaction quality and data extraction.
  • Text to Speech: Features lightning-fast, humanlike voice synthesis suitable for real-time AI interactions and high-throughput applications.
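
For a sense of what calling the speech-to-text feature might look like, here is a minimal sketch that builds (but does not send) a transcription request. It assumes Deepgram’s pre-recorded audio endpoint at `https://api.deepgram.com/v1/listen`, a `Token` authorization header, and a `smart_format` query option. Treat these as assumptions to verify against Deepgram’s current documentation. The helper name, API key, and audio URL are placeholders, and this is not the Voice Agent API itself.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for pre-recorded speech-to-text; check Deepgram's docs.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_transcription_request(api_key: str, audio_url: str, **options: str) -> Request:
    """Build a POST request asking Deepgram to transcribe a hosted audio file."""
    query = urlencode(options)
    url = f"{DEEPGRAM_LISTEN_URL}?{query}" if query else DEEPGRAM_LISTEN_URL
    body = json.dumps({"url": audio_url}).encode("utf-8")
    headers = {
        "Authorization": f"Token {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


# Placeholder key and audio URL, for illustration only.
request = build_transcription_request(
    "YOUR_API_KEY",
    "https://example.com/audio.wav",
    smart_format="true",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` with a valid key would send it. The sketch stops short of that so it runs without network access or credentials.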

Explore and Experiment:

  • Playground: Test the capabilities of Deepgram’s APIs with sample audio files or your own text to see how the audio understanding models perform in real-world scenarios.
  • Community Support: Join a thriving community of over 2,000 members to get insights, support, and answers to more than 1,300 questions.
