Gemini 1. 5 pro

Gemini 1.5 Pro update makes AI listen and generate content

João Lucas Batista avatar
With the new update, Google's artificial intelligence now understands and analyzes audio. Imagen 2 can now add and remove image elements and create 4-second GIFs

A Google Artificial Intelligence: Gemini, received a new and innovative update this Tuesday, April 9th, during the Google Cloud Next. With the upgrade, the Gemini 1.5 Pro, received the ability to understand and analyze audio and video lines, producing content based on what is entered into the application.

Initially, the new feature is only available to users of the Vertex A.I, software aimed at machine learning (machine learning), used by programmers and scientists to develop new AIs.

What does Gemini, Google's AI, do?

Gemini 1. 5 pro update makes ia listen and generate content. With the new update, Google's artificial intelligence now understands and analyzes audio. Imagen 2 can now add and remove image elements and create 4-second gifs
Gemini interface. Image: Lucas Gomes/ Showmetech

O Gemini, Google's Artificial Intelligence, was launched in December last year, replacing Bard, and stands out for its ability to handle highly complex tasks, from coding to refined logical reasoning.

Artificial Intelligence has functions that can assist the user in the most diverse tasks, manipulating a wide variety of files, with the purpose of combining different types of information, in addition to organizing them. Its capabilities allow you to use different content formats, including:

  • texts;
  • images;
  • audios;
  • videos; It is
  • programming languages.

O Gemini It has three operating modes, with different specificities:

  • Gemini Ultra — larger and more capable for highly complex tasks;
  • Gemini Pro — best for scaling a wide variety of tasks;
  • Gemini Dwarf — more efficient for mobile tasks.

Gemini 1.5 Pro update and its new functions

Gemini 1. 5 pro update makes ia listen and generate content. With the new update, Google's artificial intelligence now understands and analyzes audio. Imagen 2 can now add and remove image elements and create 4-second gifs
Gemini IA. Image: rafares/Shutterstock)

This new model, presented by the technology giant, represents a significant advance compared to the previous one, with improvements in performance and understanding of long contexts. The Gemini 1.5 Pro, which is the initial version made available for testing, is optimized for a variety of tasks and is more efficient in terms of computation, being a more robust version and capable of meeting the requirements of even more complex activities.

Furthermore, the Gemini 1.5 Pro contains an experimental resource, which, in theory, could process up to 1 million tokens for large-scale base models, which will be revolutionary. According to Google, this immense amount represents 700.000 words and 30.000 lines of code, which is equivalent to one hour of video ou 11 hours of audio.

New tools in Gemini 1.5 Pro allow the application to reason between images (frames) and audio (speech) for videos uploaded on the Google AI Studio, which will facilitate content production. As per official information, Google's AI update is available in more than 180 countries through the Gemini API (Application Programming Interface, in Portuguese), with an unprecedented native ability to understand audio and a new API that facilitates file handling.

The release also features new system instructions and mode features. JSON (lightweight data format for exchanging information between computer systems). Believing in the potential of the new update, Google promises that the text embedding model outperforms competitors with similar functions.

O Gemini 1.5 Pro is currently only available through Vertex AI.

Imagen 2 can create GIFs

During the event Google Cloud Next, another important announcement was made by Google: the AI ​​model Imagen 2, which can generate images and short videos from prompts of text. With this, it is possible to create GIFs of up to four seconds from different camera angles and also show movement.

The difference with this tool is precisely the possibility of exploring different angles, with more dynamism in the scenes, far beyond AI videos generated with static photos and limited movements.

Example of creations from Imagen 2 in Vertex IA. Video: Google Cloud/ YouTube

O Imagen 2 has the ability to produce video clips, also known as live images, at a low resolution 640 x 360. Furthermore, Google is using its technique SynthID to apply a invisible watermark in AI-generated clips and images. The company claims that the SynthID can support edits and even compression, measures that aim to promote data security.

To date, the resources of Imagen 2 are only available through Vertex AI, which now includes support for internal and external painting, as well as the ability to edit images using AI, allowing you to expand the borders or add/remove specific parts of the image. Tools aimed at marketing professionals and content creation for campaigns, among other advertising pieces and video platforms.

Vertex AI

Gemini 1. 5 pro update makes ia listen and generate content. With the new update, Google's artificial intelligence now understands and analyzes audio. Imagen 2 can now add and remove image elements and create 4-second gifs
Vertex AI Platform. Image: Google/Reproduction

O Vertex AI is a platform for machine learning (ML) that enables the training and deployment of AI tools and applications, including the customization of large language models (LLMs) for use in AI-powered applications.

The platform compiles Google's diverse capabilities and applications, integrating data engineering, data science, and data engineering workflows. machine learning, enabling collaboration between teams through a common set of tools, as well as scaling applications with the benefits of Google Cloud.

A Vertex A.I offers several options for training and deploying models:

  • AutoML allows you to train tabular, image, text or video data without the need to write code, or prepare data splits.
  • Personalized training gives you full control over the training process, including the use of framework preferred ML code, own training coding, and selection of hyperparameter tuning options.
  • model garden Enables discovery, testing, customization, and deployment of Vertex AI models, including model selection and open source resources (OSS).
  • A Generative AI offers access to Google's large generative AI models in multiple modalities (text, code, images, speech). You can tune Google LLMs to meet your needs and deploy them for use in your AI-powered applications.

Source: The Verge, Google for Developers, Tom's guide, Beebom, GoogleCloud.

See also:

reviewed by Glaucon Vital in 10 / 4 / 24.

Sign up to receive our news:

Leave a comment

Your email address will not be published. Required fields are marked with *

Related Posts