1 broad agent beats 6 specialist ones

plus the Google I/O drop, xAI's bet on Hermes, and Higgsfield's new Supercomputer.

Remy Gaskell
May 22, 2026

welcome back beautiful people,

I crashed out a bit today. i’m in Bali, the internet has been out at the villa for the last 2 days, and I’ve had a to-do list coming out of my ass.

went and sat by the beach and drank a coconut, calmed myself down, now we’re sitting here writing this bad boy…

📌 TL;DR

Google I/O → Gemini Omni (Nano Banana for video), Spark (Workspace agent), Antigravity 2.0 (Google's answer to Claude Code), and Gemini baked into YouTube, Maps, and Search.
Hermes Agent + xAI Grok → xAI shipped an official first-class Grok integration into Hermes.
Higgsfield Supercomputer → all their image and video models plus Claude, GPT, and Gemini behind one chat agent. Also: i officially landed my first brand deal with Higgsfield this week.
Builder's notes → broad agents with niche skills beat a fleet of single-purpose ones. The models are good enough now.

Google's I/O drop

Google had I/O this week and emptied the clip. Here's everything actually worth your attention.

Gemini 3.5 Flash

Their new fast, cheap workhorse model. Google's claiming it runs 12x faster than other frontier models inside Antigravity, while outperforming their previous Gemini 3.1 Pro on key coding and agentic benchmarks.

Gemini Spark

Spark is Google's new agent (think Claude Cowork x OpenClaw), baked directly into Google Workspace.

You can give it a task from the Gemini app, your email, or chat. it runs in the background on Google Cloud (close your laptop and it keeps going) and knocks through Workspace tasks: creating docs, sending emails, scheduling meetings, hunting through your inbox.

In the keynote demo, the presenter hit Spark with:

"find every upcoming meeting with Sundar and turn them hot pink, write a note to my new neighbour inviting his family to the block party, and create a doc with everything we need to do for the kids before end of school year."

My take: it's just another agent harness. Claude Code, Codex, OpenClaw, Hermes, Cowork, and now Spark. they're all just different cars. Claude Code is a Ferrari. Codex is a G-Wagon. Spark is a Range Rover.

Learn how the steering, brakes, and ignition work (context, skills, MCPs) in one of them and you can hop in any other.

Gemini Omni

Omni is Google's new video model. their own framing: "Nano Banana, but for video".

The long-term pitch is "any input, into any output", but what launched today is video generation.

Feed Omni any combination of text, images, audio, and existing clips, and it gives you back one finished video.


What you can practically do:

Multi-reference generation → drop in up to 5 photos (a character, a product, a location) and Omni keeps them visually consistent across the entire clip.
Conversational editing → upload a real video and prompt the changes: "remove the violin", "add smoke effects to the skater", "make the room dark".
Native audio → sound effects and voice are generated in the same pass as the video.
AI avatars → create a reusable digital version of yourself and drop it into any video. Omni voices it natively too.

Available now on Google AI Plus or higher, developer API in a few weeks. clips max out at 10 seconds, and every output gets a SynthID watermark (metadata flagging AI use).

Antigravity 2.0

Antigravity is Google's coding agent platform. think Claude Code or Codex, but Google's version.

1.0 launched November 2025 as a desktop app. if you're familiar with VS Code, it was just a forked version of that. files on the left, editor in the middle, Gemini on the right.

2.0 (this week) is more interesting than a normal version bump. Google split the product into two separate apps:

Antigravity 2.0 → a brand new desktop app (see image below). it's literally just the Code tab in the Claude desktop app or the Codex app, but you can only use Gemini models in it.
Antigravity IDE → the original 1.0 experience continuing on as its own separate download. the editor is still there if you want it.

Google's framing on stage: "what we needed to do is separate out the agent-first surface into its own standalone application so you can just deal with the agents."

big announcement, but in reality they've just caught up to Codex and Claude Code with an app and the same set of features.

They also dropped the Antigravity CLI alongside 2.0.

Gemini EVERYWHERE

Google is also baking Gemini directly into the other Google apps:

YouTube got Ask YouTube: search any tutorial and get a written step-by-step guide that jumps you straight to the relevant part of the video.
Search got improved, plus background agents that monitor the web for you (e.g. "ping me when this item goes on sale").
Maps got Ask Maps, the one i'm most excited about. demo: "my kid just fell into the duck pond and the wedding starts in 30 minutes. where can i walk and buy her a new dress?"

Hermes Agent is officially popping off

A few weeks ago it was just a popular agent framework on GitHub. Now Hermes Agent is a serious project i'm using every day, and the release pace is nuts.

The biggest news this week: xAI shipped an official Grok integration on May 15.

Your existing Grok subscription plugs straight into Hermes over OAuth (no separate API key).

It brings Grok 4.3 (which dropped earlier this month with a 1M context window and tops the agentic tool-calling leaderboards), Grok TTS, and Grok Imagine in as native model backends, plus a real-time X search tool.

This is the first time a major model lab has built a first-class integration into someone else's open-source agent instead of launching a closed competitor.

My honest prediction: xAI ends up acquiring Nous Research. Watch this space.

Higgsfield put their whole studio behind one chat

Higgsfield Supercomputer is here.

They've put all their image and video models (Soul, Seedance, Nano Banana, Kling) plus Claude Opus 4.7, GPT-5.5, and Gemini 3.1 Pro behind a single chat agent.

You describe what you want, the agent picks the right model, generates, and delivers.

Cinematic ads, UGC at scale, full mini-films, all from one prompt. Skills (slash-command workflows), Connectors (Slack, Drive, Notion, Gmail), and scheduled tasks all baked in.

Fun fact: it's running on an enhanced Hermes Agent under the hood.

And on a personal note: i officially landed my first brand deal with Higgsfield this week. Pretty stoked because i've been using them every day anyway and genuinely love the product.

My first sponsored video with them is probably already live on Instagram by the time you're reading this. If you could head over and leave a like or a comment, it would absolutely mean the world. I'll actually come and kiss you.

Also this week...

Anthropic passed OpenAI in business adoption → Ramp’s latest AI Index has Anthropic at 34.4% of US business AI adoption vs OpenAI at 32.3%. Within hours, OpenAI offered companies 2 months of free Codex to switch from Claude Code, then Anthropic fired back with 50% higher Claude Code weekly limits through July 13. They’re using the drug dealer method to try and get you hooked.
ChatGPT can now connect to your bank accounts → OpenAI launched personal finance tools. ChatGPT can now show spending, subscriptions, upcoming payments, and portfolio performance from your actual financial data. Early preview, US-only, but this will be quite life-changing for your average family.

💡 Builder’s notes

I’ve built enough agents now to confidently say…

Broad agents, niche skills is the meta

If you’re trying to automate content, you probably don’t need:

1 Twitter agent
1 LinkedIn agent
1 YouTube agent
1 copywriting agent
1 thumbnail agent
1 newsletter agent

That sounds organised on paper, but in practice it’s a building and maintenance nightmare.

Now you’ve got 6 prompts to maintain, 6 sets of tool permissions, 6 different “personalities” to debug, and 6 places where things can go wrong (and they will).

The better pattern is:

1 Head of Content agent that owns EVERYTHING content related

Then a library of properly sanded-down skills.

X post writing
LinkedIn writing
Reel scripting
YouTube thumbnails
YouTube descriptions
Blog repurposing
Newsletter writing (just kidding on that one)

The models are good enough now that over-specialising the agents often just makes the whole system more fragile. You don’t need 40 tiny agents all pretending to be departments. You need a few capable agents, then really sharp skills they can pick up when the job changes.
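
If it helps to see the shape of that, here's a rough sketch in Python. Everything in it is made up for illustration: the Skill and Agent classes, the keyword matching, and the stub functions are all hypothetical, and this is not how Claude Code, Hermes, or any real harness implements skills. It only shows the structure: one agent, one role, and a list of skills it can pick from.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Skill:
    """A narrow, reusable workflow the agent can pick up (hypothetical shape)."""
    name: str
    keywords: Tuple[str, ...]
    run: Callable[[str], str]


@dataclass
class Agent:
    """One broad agent: a single role and a single place to debug."""
    role: str
    skills: List[Skill] = field(default_factory=list)

    def pick_skill(self, task: str) -> Optional[Skill]:
        # Toy routing: return the first skill whose keyword appears in the task.
        # A real agent would let the model choose based on skill descriptions.
        lowered = task.lower()
        for skill in self.skills:
            if any(keyword in lowered for keyword in skill.keywords):
                return skill
        return None

    def handle(self, task: str) -> str:
        skill = self.pick_skill(task)
        if skill is None:
            return f"[{self.role}] no skill matched, handling directly: {task}"
        return f"[{self.role} using {skill.name}] {skill.run(task)}"


# Stub skills standing in for real prompt files or workflows.
def write_x_post(task: str) -> str:
    return f"draft X post for: {task}"


def script_reel(task: str) -> str:
    return f"draft reel script for: {task}"


head_of_content = Agent(
    role="Head of Content",
    skills=[
        Skill("x-post-writing", ("x post", "tweet"), write_x_post),
        Skill("reel-scripting", ("reel",), script_reel),
    ],
)

print(head_of_content.handle("Write an X post about Google I/O"))
print(head_of_content.handle("Script a reel on Hermes Agent"))
```

The point of the layout: adding a new channel means appending one more Skill to the list, not standing up a seventh agent with its own prompt and permissions.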

How I actually use Claude skills → my latest YouTube video.

Side note… I find the Claude desktop app to be completely unusable.

It’s so buggy, crashes a lot, and is really slow.

I highly recommend downloading VS Code, it’s a free code editing app (that Cursor and Antigravity are built off), and then installing the Claude Code extension in there.

It works so well, the user interface is really nice, I never have bugs, and it’s way quicker than the desktop app.

btw you don’t need to be coding to appreciate this. you can open docs and work on them in there too, with your nice claude window on the right :)

🧰 Tools to try

OmniSocials → social media manager inside Claude. Draft, schedule, and publish across 10 platforms.
Marketing Skills v2.0 → Corey’s viral marketing skills pack got a big upgrade. i covered the original version ages ago, but v2 feels worth re-mentioning. This is exactly the “broad agent, niche skills” thing I was talking about.

🥣 Brain food

How I actually use Claude skills → my latest YouTube video. Skills are probably the most underrated part of building useful agents, so i broke down how i actually use them, when to make one, and more :)
Google I/O release recap → highlights from Google’s I/O event product launches
49 Minutes On Sobriety → absolutely nothing to do with AI, and not really relevant to me personally, but Blake Rocha is the goat and this is just a great video.

I accidentally left the Airbnb TV in Los Angeles logged into my youtube account, my watch history is… interesting you could say.

anyways,