One LLM to rule them all (PERFECT!)

#4
by Pink-Elephant - opened

I have been playing with this model for several days now. At least for my use cases, which are general stuff and NSFW storytelling/roleplay, I find that I literally don't need any other models anymore. First off, Gemma 4 is just amazing. Gemma 3 was very good, but this one is head and shoulders above it. And once again, HauHau's technique to uncensor models is perfect. It doesn't require any custom prompting, and just works. The output doesn't feel toned down either. The "Abliterated" techniques can't come close. And just as HauHau suggests, the "Balanced" release responds perfectly to my use cases.

As for the NSFW content, I've never seen any other model come close to how good this one is. Qwen can't touch it, nor can anything else. It almost always just "gets" what you want, and at most just needs a little shove in the right direction sometimes. It doesn't flood the output with adverbs/adjectives, but rather creates truly good responses that will elaborate on what you want, creatively, and without going off on some weird tangent. It can be as creative, or as dirty, as you want it to be. Also, it can progress a story without constant micro-direction. I've never seen anything this good before.

"Thinking" mode isn't on by default, but it isn't generally needed either. You can easily add it however. In "LM Studio", go to the "Inference" tab of the model's settings, and under "Reasoning Parsing", enter "<|channel>thought" (no quotes) for "Start String", and "<channel|>" for "End String". Then add the line "{%- set enable_thinking = true %}" to the top of the jinja template.

Now you can easily toggle "Thinking" off or on by changing enable_thinking to false or true. You don't even need to reload the model after. This method doesn't add a toggle switch to the user interface as many people would like, but it's a proper and safe method that doesn't use any trickery.
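For anyone who would rather script this than click through the UI, here is a rough Python sketch of the two pieces involved. It only illustrates the idea and is not how LM Studio works internally: both function names are made up for this example, the "{{ bos_token }}" template in the usage note is a placeholder rather than Gemma 4's real template, and I'm assuming the raw output wraps its reasoning between the same start and end strings entered above.

```python
import re

# Start/end strings from the "Reasoning Parsing" settings described above.
START = "<|channel>thought"
END = "<channel|>"

# Matches an existing "{%- set enable_thinking = true/false %}" line.
FLAG_RE = re.compile(
    r"\{%-?\s*set\s+enable_thinking\s*=\s*(true|false)\s*-?%\}"
)


def set_enable_thinking(template: str, enabled: bool) -> str:
    """Hypothetical helper: set the flag in a Jinja template string.

    Rewrites the first existing flag line, or prepends one if none is found.
    """
    line = "{%- set enable_thinking = " + ("true" if enabled else "false") + " %}"
    if FLAG_RE.search(template):
        # Use a function replacement so the text is inserted literally.
        return FLAG_RE.sub(lambda _m: line, template, count=1)
    return line + "\n" + template


def split_reasoning(text: str, start: str = START, end: str = END):
    """Hypothetical helper: return (reasoning, answer) from raw model output."""
    begin = text.find(start)
    if begin == -1:
        # No thinking block at all: everything is the answer.
        return "", text.strip()
    stop = text.find(end, begin + len(start))
    if stop == -1:
        # Unterminated block: treat the remainder as reasoning.
        return text[begin + len(start):].strip(), text[:begin].strip()
    reasoning = text[begin + len(start):stop].strip()
    answer = (text[:begin] + text[stop + len(end):]).strip()
    return reasoning, answer
```

Calling set_enable_thinking("{{ bos_token }}", True) prepends the flag line to that placeholder template, and calling it again with False flips the same line instead of adding a second one. split_reasoning just shows what the start/end strings are for: whatever sits between them is treated as thinking, and the rest is the actual reply.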

It's worth trying out "Thinking" mode with Gemma 4, as it's far better than it is with Qwen. Qwen will ramble on forever during thinking, and sometimes get stuck in an infinite loop. Gemma 4's thinking is sensible, and doesn't take very long either.

I mean, its own model card details directly contradict the no-refusals claim. If you have to re-ask something, that is a refusal. So if that 0/465 refusal claim, which can NOT be compared with the usual benchmarks that everyone else runs, is not just copy/pasted into every hauhaucs model card, that benchmark cannot be realistic. It makes no sense in combination with the details in the lower parts of the card. ONE of the two must be wrong, by simple logic.

And of course there are none of the usual benchmarks to back up the various bold "100%", "perfect", and "just as good as the original" claims.
Most of what you praise is just Gemma-4 in general.

I recently saw a deep dive and dissection of one of the hauhaucs Qwen models (27B), which carried the same bold perfection claims. It (and the method used) was shown to be good in some aspects, in most really, but it's not magically "perfect" or "just like the original".

And I was looking for something like that because everything around the hauhaucs models, the uncensoring AND the benchmarking (if any), is quite secretive, and tbh quite marketing-shouty, without any of the regular benchmarks. If it is as perfect as claimed, why not just run them?

I've never had to re-ask it anything, and I've tested with some pretty wild and intense stuff. He does state "edge-case prompts", so maybe I'm just not being extreme enough. As I said, it's been perfect for my use cases. I'm just providing some actual real world feedback, not any in-depth analysis.

He does say that an Aggressive variant is coming, but I worry that it could be too accepting of given situations, without the checks and balances you'd expect in real-life responses. (I just don't know.) Since I love the behavior I've experienced with this Balanced variant, I personally don't expect anything better from the Aggressive variant.

And yes, most of my praise is for Gemma 4 in general, which is why I led off with "Gemma 4 is just amazing". However, I have found HauHau's uncensor technique to be extraordinarily good versus other techniques, as they often require custom prompting, fail to uncensor correctly, and/or have watered down responses. HauHau's just works. If he wants to keep his method secret that's his business.

I appreciate your views, and you certainly have valid points. However, most people just want a model that's good for their needs, and don't care about anything else. I've provided my use case experience, and I think many people will appreciate that over charts of numbers and technical data.
