DigDeep Tech Podcast

ChatGPT Agent - What can I actually do with this?



Recently, I gave ChatGPT's new "Agent" mode a straightforward task:

Find a tote bag to gift my wife.

Simple enough, right?

Here's what actually happened:

* It opened up a browser container (good start!).

* Attempted to log in to Amazon but couldn't get past authentication.

* Struggled with dynamically loaded sites - simply couldn't handle modern, JS-heavy pages.

* Lost track of the original intent halfway through.

* Ended up suggesting obscure luxury handbags priced at around AED 4,000.

* After more than five minutes of watching it flail, I manually intervened and stopped the process.

Just for comparison, I gave GPT-4o the exact same prompt:

* Quick web search → clear, relevant results → done in 23 seconds.

The contrast was stark.

Why is this happening?

Despite its impressive-sounding name and sleek UI, ChatGPT Agent still requires continuous manual supervision. It often feels like babysitting rather than automating.

Instead of true autonomy, I got:

* Repetitive loops

* Frequent breakdowns on interactive sites

* A constant need to step in and redirect or correct the task

Here's what I'd ideally want a real "agent" to do:

* System-level automations: Ability to run local scripts, manage files, and adjust settings.

* Context-aware recommendations: Observing my habits, identifying recurring workflows, and suggesting intelligent automations.

* Persistent memory: Remember context and user preferences across multiple tasks and sessions.

* Robust error handling: Automatically retry, replan, and recover from failures.

* API integration: Reliably switch to APIs when UI-based interactions fail.

* Transparency: Clear tracking of state, actions, token usage, and auditability.
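To make the error-handling, API-fallback, and transparency items a bit more concrete, here is a minimal Python sketch of what I have in mind. Everything in it is hypothetical: `run_with_fallback`, `AuditLog`, `flaky_ui_search`, and `api_search` are names I made up for illustration, not anything ChatGPT Agent exposes. It only shows the shape of the idea: retry the UI path a bounded number of times, switch to an API when that keeps failing, and record every step so the run can be audited afterwards.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AuditLog:
    """Records every attempt so the run can be inspected afterwards."""
    entries: List[str] = field(default_factory=list)

    def record(self, message: str) -> None:
        self.entries.append(message)


def run_with_fallback(
    ui_action: Callable[[], str],
    api_action: Callable[[], str],
    log: AuditLog,
    max_ui_attempts: int = 2,
) -> str:
    """Try the UI path a bounded number of times, then switch to the API path."""
    for attempt in range(1, max_ui_attempts + 1):
        try:
            result = ui_action()
            log.record(f"ui attempt {attempt}: ok")
            return result
        except Exception as exc:  # a real agent would catch narrower error types
            log.record(f"ui attempt {attempt}: failed ({exc})")
    # The UI path is exhausted; an API failure here propagates to the caller.
    result = api_action()
    log.record("api fallback: ok")
    return result


# Hypothetical stand-ins for a flaky browser step and a search API call.
def flaky_ui_search() -> str:
    raise RuntimeError("login wall")


def api_search() -> str:
    return "3 tote bags found"


log = AuditLog()
outcome = run_with_fallback(flaky_ui_search, api_search, log)
print(outcome)
for entry in log.entries:
    print(entry)
```

The bounded retry count is the important design choice: it is what prevents the repetitive loops described earlier, and the log is what lets you see why the agent gave up on the UI. This sketch logs actions only; tracking token usage and replanning would need more than this.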

Conclusion

Right now, ChatGPT Agent feels more like a glorified macro in a sandbox than a true autonomous assistant. It's early days, and perhaps expectations should be tempered accordingly. But let's not confuse a slick interface and containerized control with genuine agentic capability.

True AI agents must do more than click buttons. They need to think, learn, adapt, and collaborate.

We're not there yet.



This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit digdeeptech.substack.com

DigDeep Tech Podcast, by Ajay Cyril