The new Gemini 2.0 Pro Experimental sucks donkey balls.

Wow. Last night, after a long coding bender, I heard the great news that Google was releasing some new Gemini models. I woke up this morning super excited to try them.

My first attempt was a quick OCR with Gemini 2.0 Flash-Lite and I was super impressed with the speed. This thing is going to make complex OCR an absolute breeze. I cannot wait to incorporate this into my apps. I reckon it's going to cut the processing times in half. (Christmas came early.)

Then I moved on to testing Gemini 2.0 Pro Experimental.

How disappointing... This is such a regression from 1206. I could immediately see the drop in quality on the tasks I work on daily, like coding.

It makes shit tons of mistakes. The HTML it outputs isn't even valid (a super basic task), and it keeps wanting to jump in and refactor code without permission.

I don't know what the fuck these people are doing. Every single release, it's like this. They just can't seem to get it right. 1206 has been a great model, and I've been using it as my daily driver for quite some time. I was actually very impressed with it, and had they just released 1206 as Gemini 2.0 Pro Exp, I would have been stoked. This is an absolute regression.

I have seen this multiple times now with Google products. Last time, the same thing happened with 0827 and then Gemini 002.

For some reason, at that time they chose to force concise answers on everything, making it basically impossible to get full, lengthy responses. Even with system prompts, it would just keep shortening code, adding comments everywhere and pushing this dogshit concise-mode behavior.

Now they've managed to do it again. This model is NOT better than 1206. The benchmarks, or whatever these people are aiming to beat, are just an illusion. If your model cannot do simple tasks like outputting valid code without trying to force refactoring, it is just a hot mess.

Why can't they get this right? They seem to regress a lot on updates. I've had discussions with people in the know, and apparently it's difficult to juggle the needs of all the different types of users. Some might like lengthy, thorough answers, for example, while others find that annoying and "too verbose". So basically we get stuck with these half-arsed models that don't seem to excel at anything in particular.

I use these models for coding and for writing, and always have. I might be in the minority of users and just be too entitled about this. But Jesus, what a disappointment.

I am not shitting you when I say I would rather use DeepSeek than whatever this is. Its ability to give long, thorough answers without unintentionally changing parts of the code is extremely valuable for my use cases.

Google is the biggest and most reliable when it comes to serving their models though, and I absolutely love the Flash models for building apps. So you could say I am a major lover and hater of them. It's always felt this way. A genuine love-hate relationship. I am secretly rooting for their success, but I absolutely loathe some of the things they do and am really surprised they haven't surpassed ChatGPT/Claude yet. Like, how the fuck?

Maybe it's time to outsource their LLM production to CHHHIIIIINNAAAA. Just like everything else. Hahahaa