Llama 3.2 brings features like lip-sync video translation, celebrity voices and more to Meta AI

By Lucas Gomes
Now multimodal, Meta's artificial intelligence can see and speak, understanding images, tables, and graphs, as well as conversing naturally with the user.

Today (the 25th), Meta Connect 2024 took place: an event held by the company behind Facebook, Instagram, and WhatsApp to announce its latest technological innovations, presented by Mark Zuckerberg himself. Beyond the Orion holographic glasses, we also saw news about Llama 3.2 and Meta AI, which is integrated into all of the company's social networks. Here are the highlights of Meta's artificial intelligence.

What's new with Llama 3.2

Small and mid-sized vision LLMs (11B and 90B) and lightweight text-only models (1B and 3B) that adapt to mobile devices. Image: Meta

The two largest new models in the Llama 3.2 collection, with 11B and 90B (11 and 90 billion) parameters respectively, stand out for their support for visual reasoning tasks, such as understanding complex documents, including tables and graphs, as well as captioning images and identifying objects in visual scenes based on natural language descriptions.

A practical example involves Llama 3.2's ability to analyze graphs to quickly answer questions about a company's sales performance in a given month. In another case, the model might interpret maps, indicating when a trail becomes steeper or the distance of a specific route. These advanced models also connect vision and language, being able to extract details from an image and generate captions to describe the scene.
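For readers who want to experiment with this kind of visual reasoning themselves, here is a minimal sketch using the Ollama runtime (one of the partners Meta credits below) and its Python client. The "llama3.2-vision" model tag and the image filename are illustrative assumptions, not part of Meta's announcement.

```python
# Minimal sketch: asking a Llama 3.2 vision model a question about a
# sales chart, in the spirit of the example above. Assumes the Ollama
# runtime is installed and the 11B vision model has been pulled with
# `ollama pull llama3.2-vision`; the image path is hypothetical.
import ollama

response = ollama.chat(
    model="llama3.2-vision",
    messages=[{
        "role": "user",
        "content": "Which month had the highest sales in this chart?",
        "images": ["sales_chart.png"],  # hypothetical local file
    }],
)
print(response["message"]["content"])
```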

Meta also announced lighter models, with 1B and 3B parameters, for smaller devices such as smartphones and smart glasses. These were built for multilingual text generation and automated tool calling. They enable the development of customizable applications that run directly on devices, ensuring complete privacy, since data is not sent to the cloud. These applications can, for example, summarize incoming messages and identify important items in order to send calendar invites directly, using the models' tool-calling functionality.

Running models locally has two main advantages: almost instantaneous responses, thanks to direct on-device processing, and greater privacy, by avoiding sending sensitive data to the cloud. This makes it possible to control, clearly and securely, which queries stay on the device and which are handed off to larger models in the cloud.
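As a rough sketch of what this looks like in practice, the snippet below runs the lightweight 3B text model entirely on-device via the Ollama runtime and asks it to summarize a message and pull out calendar details, mirroring the use case described above. The "llama3.2" tag and the prompt are assumptions for illustration.

```python
# Minimal sketch: on-device message summarization with the 3B model.
# Assumes the Ollama runtime and `ollama pull llama3.2`; since the
# model runs locally, the message text never leaves the device.
import ollama

message = "Hey! Dinner at my place on Friday at 7pm, can you make it?"

response = ollama.chat(
    model="llama3.2",
    messages=[{
        "role": "user",
        "content": (
            "Summarize this message and extract any event details "
            f"suitable for a calendar invite:\n{message}"
        ),
    }],
)
print(response["message"]["content"])
```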

This work was supported by our partners across the AI community. We would like to thank and acknowledge (in alphabetical order): Accenture, AMD, Arm, AWS, Cloudflare, Databricks, Dell, Deloitte, Fireworks.ai, Google Cloud, Groq, Hugging Face, IBM watsonx, Infosys, Intel, Kaggle, Lenovo, LMSYS, MediaTek, Microsoft Azure, NVIDIA, OctoAI, Ollama, Oracle Cloud, PwC, Qualcomm, Sarvam AI, Scale AI, Snowflake, Together AI, and UC Berkeley – vLLM Project.

Meta's acknowledgment of partners on its website

New Meta AI Features

And the news doesn't stop there! Meta AI will benefit from the following new features:

Voices on WhatsApp, Instagram, Facebook and Messenger

Meta invites its users to test out the new AI voices of celebrities. Image: Meta

Mark Zuckerberg announced a new update to Meta's AI assistants, which will now feature celebrity voices such as Dame Judi Dench, John Cena, Awkwafina, Keegan-Michael Key, and Kristen Bell. The idea is to make interaction more natural and fun, offering a personalized experience on platforms such as Facebook, Messenger, WhatsApp, and Instagram.

In addition to the new voices, one of the most significant innovations is the ability of the AI models to interpret photos and other visual information from users, expanding the ways they can interact and offering even more contextual and relevant responses for each user.

View, explain and edit images

Users will be able to upload photos and request edits from the AI. Image: Meta

Meta AI's capabilities have also been expanded to process visual information. It will now be possible to take a photo of a flower on a walk and ask Meta AI to identify it or explain more about it, or to upload an image of a dish and receive the corresponding recipe.

Users will also be able to make detailed edits to their real photos using everyday language commands, such as adding or removing elements. Previously, this feature only worked on images generated by Meta AI, but it is now available for photos taken by users, making custom adjustments easier.

With Meta AI's Imagine feature, it will be possible to insert yourself into Stories, feed posts, and even your profile picture on Facebook and Instagram, sharing AI-generated selfies interactively. The AI can also suggest captions for your Instagram and Facebook Stories: simply choose an image, and Meta AI will suggest several caption options, making it easy to pick the one that best suits your post.

Lip-sync dubbing in Reels

Dubbing is still restricted for now. Image: Meta

Meta is also currently testing automatic lip-synced video dubbing on Instagram and Facebook Reels, starting with English and Spanish. This feature will let users watch content in their native language, making it easier to understand and engage with.

The feature is initially available to a small group of creators, but there are plans to expand it to more creators and add other languages soon. This advancement has the potential to significantly increase the reach of content creators, allowing their productions to transcend language barriers and connect with a global audience, regardless of the language spoken.

Availability

AI voice in Australia, Canada, New Zealand, and the US in English only. Image: Meta

The company says Llama 3.2 is now available across Meta's platforms, with the exception of the Meta AI voice updates, which are available in Australia, Canada, New Zealand, and the US, in English only.

And you, what did you think of the news? Tell us in the comments!

See also:

Meta shows off the Orion holographic glasses, which display images on their lenses.

With information from: Meta [1] and [2].

Text proofread by: Daniel Coutinho (25/09/24)

