Researchers use AI chatbots against themselves to 'jailbreak' each other

New 'Mind-Reading' AI Translates Thoughts Directly From Brainwaves

In today’s email:

  • 📚 Pushing ChatGPT's Structured Data Support To Its Limits

  • 🗺️ Artificial intelligence can find your location in photos

  • 🤯 A New Kind of AI Copy Can Fully Replicate Famous People.

  • 🧰 8 new AI-powered tools and resources. Make sure to check the online version for the full list of tools.

Top News

Researchers from Nanyang Technological University in Singapore have developed a method for compromising various AI chatbots, including ChatGPT, Google Bard, and Microsoft Bing Chat, in a process known as "jailbreaking." This involves exploiting flaws in the chatbots' software to make them produce content that goes against their developers' guidelines.

The researchers trained a large language model (LLM) on a database of prompts that had previously been successful in jailbreaking these chatbots. This new LLM is capable of automatically generating prompts to jailbreak other chatbots, exploiting their weaknesses.

LLMs, which are the core of AI chatbots, enable them to generate human-like text for various tasks. The NTU researchers' work demonstrates how these LLMs can be manipulated to produce content that is normally restricted, such as violent or unethical material.

Their method, named "Masterkey," first involves reverse-engineering how LLMs detect and defend against malicious queries. Then, they use this knowledge to teach an LLM to produce prompts that can bypass other LLMs' defenses. This process can be automated, allowing the creation of new jailbreak prompts even after developers update their LLMs' security.

This work has been accepted for presentation at a major security forum and highlights the vulnerabilities in AI chatbots. The researchers also propose using their method to help developers strengthen their LLMs against such attacks.

Researchers from Nanyang Technological University, Singapore (NTU Singapore), have developed a method to "jailbreak" AI chatbots, including ChatGPT, Google Bard, and Microsoft Bing Chat. This jailbreaking involves exploiting flaws in the chatbots' software to make them produce content against their developers' guidelines. The term "jailbreaking" in computer security refers to hackers finding and exploiting system vulnerabilities to bypass restrictions.

The team trained a large language model (LLM) on a database of successful jailbreak prompts, creating an AI capable of generating new prompts to jailbreak other chatbots. LLMs, which power AI chatbots, can process human inputs and generate human-like text. The NTU research shows that an LLM can itself be turned into a jailbreaking tool, which highlights the weaknesses and limitations of current AI chatbots and should prompt developers to strengthen their defenses.

The NTU team's approach, named "Masterkey," involves reverse-engineering LLMs to understand their defense mechanisms and then training an LLM to bypass these defenses. This method proved to be three times more effective than previous methods, and it can adapt by generating new prompts even after developers patch their systems.

The research aims to expose vulnerabilities that leave AI chatbots susceptible to producing unethical or criminal content. The findings, accepted for presentation at a leading security forum, could help developers fortify their AI against such attacks. This escalating arms race between hackers and developers underscores the ongoing challenge of securing AI systems against misuse.

Midjourney, known for its image generation tool within a Discord server, is expanding into video generation. The CEO, David Holz, announced plans to train video models starting in January, aiming for a release in the coming months. This marks a significant step for Midjourney, which is moving from a mature image model into the competitive generative video market. The company also plans to refine its manga/anime generator model, V6 Niji, and make consistency fixes for the official release of Midjourney V6.

Midjourney has typically prioritized quality and user experience over speed, introducing features like inpainting and outpainting later than competitors. This approach contrasts with other platforms like Stable Diffusion, DALL-E 3, SDXL, Ideogram, and IF, which have already ventured into generating text within images and other advanced features.

The move into video generation follows similar advancements by competitors like Stability AI's Stable Video Diffusion, Meta's Emu Video generator, Pika, Runway ML, and Leonardo AI. With its recent V6 update, Midjourney aims to stay competitive in the rapidly evolving AI landscape, emphasizing improved prompt adherence and realistic image generation. This venture into AI-generated video content holds significant implications for the creative and media industries, potentially changing how we produce, manipulate, and perceive video content.

Other stuff

Superpower ChatGPT now supports voice 🎉

Text-to-Speech and Speech-to-Text. Easily have a conversation with ChatGPT on your computer.

Superpower ChatGPT Extension on Chrome

Superpower ChatGPT Extension on Firefox

 

Tools & Links
Editor's Pick ✨

OpenVoice - Instant voice cloning that requires only a short audio clip to replicate voice and generate speech in multiple languages and tones

ShotSolve is a Mac menubar app that allows you to take a screenshot and then uses GPT-4 Vision to solve any question you have in mind.

Robin AI - The AI Copilot for your Contracts

DreamTalk - When Expressive Talking Head Generation Meets Diffusion Probabilistic Models

Plot Chat - Talk With Data

crewAI - Platform for building sophisticated multi-agent interactions

Ferret: Refer and Ground Anything Anywhere at Any Granularity

FrequentlyAskedAI - Build an AI FAQ, and answer any customer question in seconds

How to create multiple custom instruction profiles on ChatGPT



Help share Superpower

⚡️ Be the Highlight of Someone's Day - Think a friend would enjoy this? Go ahead and forward it. They'll thank you for it!

Hope you enjoyed today's newsletter

Follow me on Twitter and LinkedIn for more AI news and resources.

Did you know you can add Superpower Daily to your RSS feed? https://rss.beehiiv.com/feeds/GcFiF2T4I5.xml

⚡️ Join over 200,000 people using the Superpower ChatGPT extension on Chrome and Firefox.
