AI now outperforms physician "experts" on medical knowledge

A playground to compare AI image models side-by-side

Otter AI - Make Your Conversations & Meetings Better. Take Your Conversations Everywhere. Real-Time Transcripts For Collaboration, Productivity, & More.

New! Send an Otter Assistant to attend your Zoom meetings. It automatically records, takes notes, and shares them with all meeting attendees. Try Otter AI FREE Today!

sponsored

Highlights💡 

Zoo is a playground to compare AI image models side-by-side [Try it here]

Mina built an AI wearable on Raspberry Pi that can see the world and talk about it [Project page]

Google is planning to add AI-powered coding assistance to Colab, free of charge [Link]

  • Colab will add AI coding features like code completions, text-to-code generation, and even a code-assisting chatbot.

  • Colab will use Codey, a family of code models built on PaLM 2.

  • Fine-tuned on a large dataset of high-quality code from external sources

  • Customized especially for Python and for Colab-specific uses

TUTORIAL: How to make a comic book using Midjourney, ChatGPT, and Comic Life 3

Med-PaLM 2 now outperforms physician "experts" on medical question answering [Paper]

Recent artificial intelligence (AI) systems have reached milestones in "grand challenges" ranging from Go to protein-folding. The capability to retrieve medical knowledge, reason over it, and answer medical questions comparably to physicians has long been viewed as one such grand challenge.

Large language models (LLMs) have catalyzed significant progress in medical question answering; Med-PaLM was the first model to exceed a "passing" score in US Medical Licensing Examination (USMLE) style questions with a score of 67.2% on the MedQA dataset. However, this and other prior work suggested significant room for improvement, especially when models' answers were compared to clinicians' answers. Here we present Med-PaLM 2, which bridges these gaps by leveraging a combination of base LLM improvements (PaLM 2), medical domain finetuning, and prompting strategies including a novel ensemble refinement approach.
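For readers curious what an "ensemble refinement" prompting strategy might look like in practice, here is a minimal sketch under stated assumptions: the model first samples several draft answers, is then shown its own drafts and asked for a refined answer a few times, and the most common refined answer wins. The function names, parameters, and prompt wording are illustrative and not taken from the paper, and `sample` is a hypothetical stand-in for a real model call.

```python
from collections import Counter
from typing import Callable, Tuple

# Hypothetical interface: a sampler takes a prompt and returns one
# (explanation, answer) pair. A real system would call an LLM with
# temperature sampling here.
Sampler = Callable[[str], Tuple[str, str]]


def ensemble_refinement(question: str, sample: Sampler,
                        n_first: int = 5, n_second: int = 3) -> str:
    # Stage 1: draw several independent draft answers.
    drafts = [sample(question) for _ in range(n_first)]

    # Stage 2: show the model its own drafts and ask for a refined answer.
    context = "\n".join(f"Draft {i + 1}: {expl} => {ans}"
                        for i, (expl, ans) in enumerate(drafts))
    refine_prompt = (f"{question}\n\nEarlier drafts:\n{context}"
                     "\n\nRefined answer:")
    refined = [sample(refine_prompt)[1] for _ in range(n_second)]

    # Final answer: plurality vote over the refined answers.
    return Counter(refined).most_common(1)[0][0]


def make_stub_sampler() -> Sampler:
    # Deterministic stand-in for a model, used only to exercise the code.
    first_stage = iter(["A", "B", "B", "C", "B"])

    def sample(prompt: str) -> Tuple[str, str]:
        if "Earlier drafts:" in prompt:
            # "Refine" by picking the answer most drafts agreed on.
            votes = Counter(line.rsplit("=> ", 1)[1]
                            for line in prompt.splitlines()
                            if line.startswith("Draft "))
            return ("refined reasoning", votes.most_common(1)[0][0])
        return ("draft reasoning", next(first_stage))

    return sample


answer = ensemble_refinement("Which option is correct?", make_stub_sampler())
```

With the stub above, three of the five drafts say "B", so the refined vote settles on "B". Swapping the stub for a real model call is the only change a real experiment would need, though the actual prompts and sample counts used for Med-PaLM 2 may differ.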

Med-PaLM 2 scored up to 86.5% on the MedQA dataset, improving upon Med-PaLM by over 19% and setting a new state-of-the-art. We also observed performance approaching or exceeding state-of-the-art across MedMCQA, PubMedQA, and MMLU clinical topics datasets.
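A quick check of the "over 19%" figure, using only the two MedQA scores quoted above: it is a gap in percentage points, and the relative improvement works out larger.

```latex
\[
86.5\% - 67.2\% = 19.3 \text{ percentage points},
\qquad
\frac{86.5 - 67.2}{67.2} \approx 0.287 \quad (\text{about } 28.7\% \text{ relative})
\]
```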

We performed detailed human evaluations on long-form questions along multiple axes relevant to clinical applications. In a pairwise comparative ranking of 1066 consumer medical questions, physicians preferred Med-PaLM 2 answers to those produced by physicians on eight of nine axes pertaining to clinical utility (p < 0.001). We also observed significant improvements compared to Med-PaLM on every evaluation axis (p < 0.001) on newly introduced datasets of 240 long-form "adversarial" questions to probe LLM limitations.

While further studies are necessary to validate the efficacy of these models in real-world settings, these results highlight rapid progress toward physician-level performance in medical question answering. Read the full paper here.

Brilliant - Learn AI like it’s 2023 (sponsored)

AI won’t take your job. Someone using AI will. Best time to level up? Yesterday. Second best? Right now.

Luckily, there’s Brilliant — the interactive app that makes it easy to master concepts in math, data, and computer science in just minutes a day.

Here’s how:

  • They have thousands of lessons on tons of topics — from AI and neural networks to data science.

  • They break down complex concepts into digestible building blocks that stick.

  • Their interactive style keeps you engaged, so it’s easy to build a daily habit.

Join over 10 million people around the world and start building skills in minutes a day. You can try everything Brilliant has to offer for free for a full 30 days. Plus, right now you can get an exclusive 20% off an annual premium membership.

Tools & Links 🛠️

  • AI Agent - A web app that makes choices and performs tasks on its own, based on the goals set by you [Link]

  • Rebuff - The more you attack this prompt injection detector, the stronger it gets [Link]

  • DayZero - An AI engine that converts your ideas into execution within 6 minutes. [Link]

  • Muze - Your Personalized AI Music Curator [Link]

  • AI Diary - A New Way to Document Your Journey [Link]

  • Recap - Easily summarize any portion of any webpage with ChatGPT [Link]

  • ChatAll - Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, and more [GitHub]

  • DailyBot - Bring ChatGPT and AI Workflows to your Work Chat. [Link]

  • Nexus - Navigate your entire network using AI [Link], [Announcement]

  • Text Blaze - Eliminate repetitive typing and mistakes. [Link]

  • Eightify - YouTube summaries powered by ChatGPT [Link]

  • TimeMaster - Automatically detects what you are working on, categorizes your activities, tags projects, and even writes time logs on your behalf. [Link]

  • Kive AI - Let AI sort your visual libraries with auto-tagging [Link]

  • Etched AI chips only do one thing - run large language models. [Link]

  • Tutorial: How to give your LangChain Agent a voice on Telegram. [Link]

  • StableStudio - Stability AI Releases the Open-Source Future of DreamStudio [Announcement], [GitHub]

  • The Dangers of Google’s .zip TLD [Link]

Unclassified🌀 

  • Morning Brew - Get smarter in 5 minutes (it's free) [Link]

  • Beehiiv - The newsletter platform built for growth [Link]

  • Instapage - Introducing the AI Content Generator [Link]

  • Fiverr - Find the right freelance service, right away [Link]

  • WFH.Team - Work from anywhere in the world [Link]

Hope you enjoyed today's newsletter!

Brought to you by eeeziii

⚡️ Check out our Superpower ChatGPT extension on Chrome Web Store and Mozilla Add-Ons Page.