This week might go down in history, or at the very least broke some kind of record, because Tuesday really was something else. For whatever reason, this Tuesday every AI company decided to announce something, and big things at that, making it feel more like a month or a year of progress all culminating in a single day.
In case you missed it, here's what happened, not this week, not this month, but on Tuesday of this week.
✨ Anthropic released a "Computer Use" API - a foundational breakthrough that allows Claude to operate a computer just like a human
We've built an API that allows Claude to perceive and interact with computer interfaces.
— Anthropic (@AnthropicAI) October 22, 2024
This API enables Claude to translate prompts into computer commands. Developers can use it to automate repetitive tasks, conduct testing and QA, and perform open-ended research. pic.twitter.com/eK0UCGEozm
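To make the "prompts into computer commands" idea concrete, here's a minimal sketch of what a request enabling the computer-use tool looks like, based on the beta Anthropic announced. The model ID, tool type, and beta header come from their announcement; the prompt and screen dimensions below are just placeholders for illustration.

```python
# Sketch of a "Computer Use" request payload for Anthropic's Messages API,
# per the beta announced October 22, 2024. Display size and prompt are
# illustrative placeholders, not required values.

def build_computer_use_request(prompt: str) -> dict:
    """Assemble a Messages API payload that enables the computer-use tool."""
    return {
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        "tools": [
            {
                "type": "computer_20241022",  # beta tool type from the announcement
                "name": "computer",
                "display_width_px": 1280,     # placeholder screen size
                "display_height_px": 800,
            }
        ],
        "messages": [{"role": "user", "content": prompt}],
    }

# This payload would be sent to the Messages API along with the HTTP header
# "anthropic-beta: computer-use-2024-10-22"; Claude then responds with tool
# calls (screenshots, cursor moves, clicks, keystrokes) that your code executes.
request = build_computer_use_request("Open the browser and check today's weather.")
```

The key shift is the loop: your code takes the actions Claude requests, sends back screenshots, and Claude decides the next step, which is what makes open-ended automation possible.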
✨ Genmo announced a new AI video generator called Mochi 1 that looks like it can produce true production-quality video with AI
Introducing Mochi 1 preview. A new SOTA in open-source video generation. Apache 2.0.
— Genmo (@genmoai) October 22, 2024
magnet:?xt=urn:btih:441da1af7a16bcaa4f556964f8028d7113d21cbb&dn=weights&tr=udp://tracker.opentrackr.org:1337/announce pic.twitter.com/YzmLQ9g103
✨ Perplexity introduced "Reasoning Mode" so you can ask multi-layered questions
Pro Search is now more powerful. Introducing Reasoning Mode!
— Perplexity (@perplexity_ai) October 22, 2024
Challenge your own curiosity. Ask multi-layered questions. Perplexity will adapt.
Try it yourself (sample queries in thread)👇 pic.twitter.com/NHlxA34nLd
✨ Oh and speaking of Anthropic, along with "Computer Use" they also released new versions of their Sonnet and Haiku models
Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use.
— Anthropic (@AnthropicAI) October 22, 2024
Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text. pic.twitter.com/ZlywNPVIJP
✨ Runway was all the rage yesterday with their announcement and demo of Act-One - a new way to generate expressive character performances (seriously, you need to see the demo - it is wild)
Introducing, Act-One. A new way to generate expressive character performances inside Gen-3 Alpha using a single driving video and character image. No motion capture or rigging required.
— Runway (@runwayml) October 22, 2024
Learn more about Act-One below.
(1/7) pic.twitter.com/p1Q8lR8K7G
✨ Ideogram launched Canvas, an infinite creative board for organizing, generating, editing, and combining images
Today, we’re introducing Ideogram Canvas, an infinite creative board for organizing, generating, editing, and combining images.
— Ideogram (@ideogram_ai) October 22, 2024
Bring your face or brand visuals to Ideogram Canvas and use industry-leading Magic Fill and Extend to blend them with creative, AI-generated content. pic.twitter.com/m2yjulvmE2
So yeah, quite the week. I'm still diving into all of these with most of my time spent on Anthropic's "Computer Use" as I think this is a major paradigm shift in how we'll all use not just LLMs, but our computers...or more like, how we won't use our computers 👀