Dude, ChatGPT just solved an Erdős problem a few days ago, and Mythos is exploiting decade-old undiscovered 0-days in OSes and is capable of pivoting 0-day Firefox bugs into full-blown root access.
Yeah, I get that the viral “how many r’s are in strawberry” stuff is funny, but the idea that historical issues with transformers are preventing them from accelerating peak capabilities way beyond what most experts thought was possible just a few years ago is borderline delusional.
The field is moving so fast at this point that if you are basing any sense of limitations on sampling that is even ~2 months old, your conclusions are likely out of date.
They aren’t a silver bullet for everything (yet), but at the things transformers are starting to be specialized for, they are already well past the average practitioner.
I’ve been writing software for well over a decade and the modern agents do a better job than I would around 90% of the time. Yes, I’ll occasionally need to bring up issues with their work, but I’d say at this point around 50% of the time I think they made a mistake, I was actually the one who was wrong.
It’s only been like this for around the last 3-4 months.
Oh, did it solve it? You didn’t really provide any sources, so I had to look it up myself.
And in the example from 2 days ago, it just applied an existing formula in a different context.
Which is helpful for sure, but I wouldn’t say it solved it.