Meta’s Movie Gen model puts out realistic videos with sound
OpenAI introduced Canvas for ChatGPT
In today’s email:
🔥 There’s a New Hit Podcast That Will Blow Your Mind
🛸 This Homemade AI Drone Software Finds People When Search and Rescue Teams Can’t
🌓 Google's AI thinks I left a Gatorade bottle on the moon
🧰 11 new AI-powered tools and resources. Make sure to check the online version for the full list of tools.
Meta has introduced Movie Gen, a new generative video model capable of creating realistic videos with sound from text prompts. Unlike similar efforts from companies like Runway and OpenAI, Movie Gen isn't available to the public yet, which Meta attributes to safety considerations. The model is built from several foundation models and generates videos up to 16 seconds long at a width of 768 pixels, later upscaled to 1080p. Its most notable feature is the ability to produce audio matching the video content, such as engine noises, natural sounds, or even background music. However, it currently does not generate voices, likely due to both technical challenges and concerns over misuse.
One of Movie Gen's strengths is its text-based editing functionality, allowing users to modify existing video scenes by simply specifying changes in natural language, like altering someone's clothing or changing the background. This aims to address a common issue with generative video models, where making slight adjustments can result in vastly different outputs. Camera movements can also be defined in prompts, though the control is still fairly basic.
Movie Gen’s model was trained on a mix of licensed, publicly available, and proprietary data, likely drawing from Meta's vast library of videos from platforms like Facebook and Instagram. Meta's focus seems to be on developing a practical tool that can create polished final products from simple prompts, unlike some of its competitors that are more focused on proving technical prowess. However, concerns about deepfake misuse and the difficulties of matching speech to facial movements mean that voice synthesis has been deliberately excluded for now.
The fastest way to build AI apps
Writer Framework: build Python apps with drag-and-drop UI
API and SDKs to integrate into your codebase
Intuitive no-code tools for business users
Fei-Fei Li, known as the "godmother of AI," expressed uncertainty about the concept of artificial general intelligence (AGI) during a recent discussion at Credo AI’s responsible AI leadership summit. Despite her extensive background in AI, Li admitted that she doesn't know what AGI is, comparing it to something that "you know when you see it." Instead, she prefers to focus on practical, evidence-based AI advancements that could benefit society.
Li, who created ImageNet and served as Chief Scientist of AI/ML at Google Cloud, now leads the Stanford Human-Centered AI Institute and her startup, World Labs. She described the mission of her startup as building "large world models" aimed at achieving "spatial intelligence" – enabling computers not only to see but to understand and navigate the 3D world like humans do. According to Li, the challenge goes beyond simply recognizing objects to understanding how they interact in real-world environments.
During the summit, Li also spoke about her advisory role in the aftermath of California Governor Gavin Newsom’s veto of AI bill SB 1047. She argued that penalizing technologists for potential misuse of AI is akin to punishing car engineers for accidents involving vehicles. Instead, she advocates for an improved regulatory framework and continuing innovation in safety measures.
Li is also vocal about the need for diversity in AI development. As one of the few women leading an AI lab at the cutting edge of research, she believes diverse human intelligence will lead to more effective and inclusive artificial intelligence. She remains optimistic about building technology that benefits humanity and emphasizes the importance of combining human and machine intelligence to create impactful solutions.
This week, OpenAI introduced Canvas, a new feature for ChatGPT designed to enhance writing and coding projects. Canvas opens a dedicated workspace alongside the traditional chat window, allowing users to generate content and make edits directly. Users can highlight specific parts of their writing or code and prompt the model to edit just that section, without regenerating the entire response. Currently in beta for ChatGPT Plus and Teams users, Canvas will be available for Enterprise and Edu users next week.
Canvas represents OpenAI's move to match similar features offered by competitors like Anthropic's Artifacts and the coding assistant Cursor, both of which emphasize editable workspaces for a more interactive experience. The new interface aims to address the challenge that chatbots face in handling large projects from a single prompt. With editable workspaces like Canvas, users can modify sections of generated content more naturally and effectively.
During a recent demonstration, OpenAI product manager Daniel Levine showcased how Canvas can help with various tasks. For example, users can write an email in the Canvas window and adjust its length or tone using a simple slider, or highlight specific sentences for targeted changes. On the coding side, Canvas lets users prompt ChatGPT for code and add in-line comments or explanations. New features also include a "review code" button, which suggests edits for generated or user-written code.
Once the beta phase is complete, OpenAI plans to make Canvas available to free users, positioning it as a user-friendly tool for both writing and coding projects. This feature aims to simplify collaboration with ChatGPT, offering a more seamless approach to generating and refining content.
Other stuff
There’s a New Hit Podcast That Will Blow Your Mind 🔥🔥
This Homemade AI Drone Software Finds People When Search and Rescue Teams Can’t 🔥
The racist AI deepfake that fooled and divided a community 🔥
Another OpenAI founder moves to arch-rival Anthropic
Future of Programming with AI - Lex Fridman talks with the Cursor Team 🔥
NVLM 1.0 - Open frontier-class multimodal LLMs
CERN trains AI models to revolutionize cancer treatment
Almost all pictures of “Baby Peacock” on Google are AI-generated
Students who use AI as a crutch don’t learn anything
Imagining faces in tree trunks and your morning eggs? AI can see them, too
Google's AI thinks I left a Gatorade bottle on the moon 🔥
All your ChatGPT images in one place 🎉
You can now search for images, see their prompts, and download all images in one place.
Bolt.new - What if AI dev products (Claude, v0, etc) let you install packages, run backends & edit code?
OpenBB Terminal - Make smarter investment decisions with AI-driven research
GPT Pilot doesn't just generate code, it builds apps!
FacePoke - Move the face landmarks
Theneo 3.0 - AI-powered API docs
Dashworks Bots - Create AI assistants that answer your team's questions
Firebender - Android Studio's most powerful AI assistant
Text Behind Image - Create stunning text-behind-image designs easily
Open Agent Cloud - Generate no-code automation agents from screen recordings
Podial - Generate engaging educational podcasts from any documents
Graphy AI - Turn your data into stories with AI
Unclassified 🌀
Meet PodSnacks, your time-saving podcast companion! Receive AI-powered episode summaries. Supercharge your podcast game! Try PodSnacks Free!
How did you like today’s newsletter?
Help share Superpower
⚡️ Be the Highlight of Someone's Day - Think a friend would enjoy this? Go ahead and forward it. They'll thank you for it!
Hope you enjoyed today's newsletter.
Did you know you can add Superpower Daily to your RSS feed? https://rss.beehiiv.com/feeds/GcFiF2T4I5.xml
⚡️ Join over 200,000 people using the Superpower ChatGPT extension on Chrome and Firefox.