Superpower Daily

A completely open-source AI Wearable device

ChatGPT should now be much less lazy

Saeed Ezzati
February 05, 2024

In today’s email:

👼 This AI learned language by seeing the world through a baby’s eyes
🎼 Meta found a way to secretly watermark deep fake audio
🤗 Hugging Face launches open-source AI assistant maker to rival OpenAI’s custom GPTs
🧰 13 new AI-powered tools and resources. Make sure to check the online version for the full list of tools.

This AI learned language by seeing the world through a baby’s eyes

An AI model has learned to recognize words like 'crib' and 'ball' by studying videos recorded from the perspective of an 18-month-old baby named Sam. This unique approach, detailed in a study published in Science on February 1, 2024, utilized 61 hours of footage from a head-mounted camera to train a neural network. This method of learning through the visual and auditory experiences of a baby provides new insights into human language acquisition, challenging the notion that innate knowledge about language is necessary for learning.

The AI was exposed to 250,000 words and corresponding images from Sam's daily activities, learning through a process called contrastive learning to associate specific words with images. In tests, the AI successfully matched words to images with a 62% success rate, which is significantly higher than chance and comparable to other AI models trained on much larger datasets. However, its ability to generalize this knowledge to unseen examples varied, performing best with objects that appeared frequently in the training data and had consistent appearances.
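To make the idea of contrastive learning concrete, here is a minimal sketch in plain Python. It is not the study's model: the two-dimensional vectors, the vocabulary, and the function names (contrastive_loss, match_word) are all made up for illustration. The loss is low when each image embedding sits closest to the embedding of the word heard alongside it, and high when the pairs are mismatched.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(image_vecs, word_vecs, temperature=0.1):
    """Symmetric InfoNCE-style loss; pair i is (image_vecs[i], word_vecs[i])."""
    n = len(image_vecs)
    # sims[i][j]: scaled similarity of image i to word j.
    sims = [[cosine(img, w) / temperature for w in word_vecs] for img in image_vecs]

    def cross_entropy(rows):
        # Softmax cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_denom = m + math.log(sum(math.exp(s - m) for s in row))
            total += log_denom - row[i]
        return total / n

    columns = [list(col) for col in zip(*sims)]  # word-to-image direction
    return 0.5 * (cross_entropy(sims) + cross_entropy(columns))

def match_word(image_vec, word_vecs, vocab):
    """Pick the vocabulary word whose embedding is most similar to the image."""
    scores = [cosine(image_vec, w) for w in word_vecs]
    return vocab[scores.index(max(scores))]

# Hypothetical toy embeddings (assumed for illustration only).
vocab = ["ball", "crib"]
word_vecs = [[1.0, 0.0], [0.0, 1.0]]
aligned_images = [[0.9, 0.1], [0.2, 0.8]]     # image i matches word i
mismatched_images = [[0.2, 0.8], [0.9, 0.1]]  # pairs swapped

loss_aligned = contrastive_loss(aligned_images, word_vecs)
loss_mismatched = contrastive_loss(mismatched_images, word_vecs)
guess = match_word([0.9, 0.1], word_vecs, vocab)
```

Training would adjust the embeddings to push this loss down. The word-matching test described above then amounts to something like match_word: pick the word whose embedding is nearest to the image.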

The study's findings offer evidence against theories that language acquisition requires special mechanisms beyond the ability to form associations between different sensory inputs. However, the reliance on data from a single child's experiences raises questions about how well the results generalize. Nonetheless, the research highlights the potential of AI to mimic human learning processes and shed light on them, while acknowledging limitations such as the AI's inability to replicate the tactile and interactive experiences vital to a real baby's learning.

Meta found a way to secretly watermark deep fake audio

AudioSeal is a new system from Meta's research team designed to detect AI-generated deepfake audio through imperceptible watermarking. As AI voice synthesis improves, distinguishing real human speech from fake has become difficult, opening the door to misuse such as voice cloning and deepfakes. AudioSeal addresses this by embedding watermarks in synthetic speech so that AI-generated or manipulated audio can be identified later.

Traditional passive detection methods, which look for telltale artifacts in finished audio, are becoming ineffective against advanced synthesis technologies. In contrast, AudioSeal takes an active approach: it watermarks speech when it is generated, so detection does not have to keep pace with each improvement in synthesis quality. It comprises a generator that embeds the watermark and a detector that identifies watermarked audio, offering precise localization of watermarked segments and robustness against audio distortions.
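The generator/detector split can be illustrated with a toy example. To be clear, this is not AudioSeal's method: AudioSeal uses trained neural networks, while the sketch below swaps in a classical additive pseudo-random watermark with a correlation detector. The function names (make_key, embed, detect), the strength value, and the test tone are assumptions for illustration, and the watermark here is not tuned to be imperceptible.

```python
import math
import random

def make_key(n, seed):
    """Secret pseudo-random sequence of +1/-1 values shared by both sides."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(signal, key, strength=0.05):
    """'Generator' side: add a scaled copy of the key to the audio samples."""
    return [s + strength * k for s, k in zip(signal, key)]

def detect(signal, key, strength=0.05):
    """'Detector' side: correlate with the key and compare to a threshold.

    Unmarked audio correlates with the key at roughly zero; marked audio
    correlates at roughly `strength`, so the threshold sits halfway.
    """
    corr = sum(s * k for s, k in zip(signal, key)) / len(signal)
    return corr > strength / 2

# Hypothetical input: a 220-cycle sine tone over 16,000 samples.
n = 16000
clean = [0.5 * math.sin(2 * math.pi * 220 * t / n) for t in range(n)]
key = make_key(n, seed=42)
marked = embed(clean, key)

marked_detected = detect(marked, key)                    # key matches
clean_detected = detect(clean, key)                      # no watermark present
wrong_key_detected = detect(marked, make_key(n, seed=7)) # different key
```

Running the same correlation over short windows instead of the whole clip would give a rough version of localization, flagging which stretches of audio carry the watermark.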

AudioSeal surpasses previous models in detection speed and efficiency, and offers generalizability, precise localization of manipulated segments, and robustness against editing techniques. However, it also raises open questions around confidentiality, ethics, standardization, and user consent. AudioSeal is one part of a broader effort to tackle the evolving challenges of synthetic media, an effort that emphasizes authenticity and ethical practices.

Read the full paper here.

A completely open-source AI Wearable device

This is Adeus, the open-source AI wearable device. In this repo, you will be guided on how to set up your own, from buying the hardware (~$100, and it will be cheaper once we finish the Raspberry Pi Zero version) to setting up the backend and the software and starting to use your wearable!

Adeus consists of 3 parts:

A mobile / web app: an interface that lets the user interact with their assistant and data via chat.
Hardware device (currently Coral AI, but soon a Raspberry Pi Zero W worth $15): this will be the wearable that records everything and sends it to the backend to be processed.
Supabase: our backend and database, where we process and store data and interact with LLMs. Supabase is an open-source Firebase alternative, a "backend-as-a-service" that lets you set up a Postgres database, authentication, Edge Functions, vector embeddings, and more, for free (at first) and with extreme ease!

Hugging Face launches open-source AI assistant maker to rival OpenAI’s custom GPTs

Hugging Face, a New York City-based startup known for its open-source AI code repository, has launched a new product called Hugging Chat Assistants. This product allows users to create their own AI chatbots for free, offering a similar service to OpenAI's custom GPT Builder but without the associated costs. Unlike OpenAI's offerings, which rely on proprietary models like GPT-4, Hugging Chat Assistants can be powered by a variety of open-source large language models (LLMs), such as Mistral’s Mixtral or Meta’s Llama 2. Hugging Face's approach emphasizes user customization and the freedom to choose from a wide range of models. The company has also created a central repository where users can share and access customized Assistants.

While some in the AI community see Hugging Chat Assistants as superior to GPTs because they are customizable and free, they have limitations: they lack web search and automatic logo generation, features that OpenAI's GPTs currently offer. The launch is seen as a significant step in the open-source community's efforts to offer competitive alternatives to proprietary AI models like those of OpenAI.