This pushes me to use other models that aren't necessarily better, but that at least don't clam up when I am not even trying to get anything other than general summaries of research.
Scalable historical revisionism would only turbocharge societal conflict.
The latest Gemini models have a very low hallucination rate in benchmarks [0]. Feel free to try it yourself. Just go to gemini.google.com, choose Deep Research, and ask it to write a report about a topic that you're intimately familiar with.
Does the new Gemini have feet to go around and investigate? Does it reach out or have contacts to sources with the right knowledge? Does it even have critical thinking skills? Or is it just a robot spitting out plausible-sounding word salad?
These things are crucial, especially when it comes to topics like history where there are highly malicious actors with infinite resources and motivation trying to rewrite history.
"Who controls the past controls the future. Who controls the present controls the past." - 1984, George Orwell
Is this when using structured outputs?
The big question is, how do you train LLMs that are useful to both humans and services while not embarrassing the company that trained them?
LLMs are pretty good at translating - but if they don't like what they're reading, they simply won't tell you what it says. Which is pretty crazy.
LLMs are pretty good at extracting data and formatting the results as JSON - unless they find the data objectionable, then they'll basically complain to the deserializer. I have to admit that's a little bit funny.
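For anyone who hasn't hit this: the "complaint to the deserializer" looks roughly like the sketch below. The function name and refusal phrases are illustrative guesses, not any particular vendor's behaviour.

    import json

    def parse_extraction(llm_output: str) -> dict:
        # Parse an LLM's JSON extraction, surfacing refusals instead of
        # letting them blow up in the deserializer.
        try:
            return json.loads(llm_output)
        except json.JSONDecodeError:
            lowered = llm_output.lower()
            # Illustrative refusal markers; real models word this differently.
            if any(p in lowered for p in ("i can't", "i cannot", "i'm sorry")):
                raise ValueError(f"Model refused instead of returning JSON: {llm_output!r}")
            raise  # genuinely malformed output, not a refusal

    # The "complaint" case:
    refusal = "I'm sorry, but I can't help with extracting this content."
    try:
        parse_extraction(refusal)
    except ValueError as e:
        print(e)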
Right now, if you want to build a service and expect any sort of predictability and stability, I think you have to go with some solution that lets you run open-weights models. Some have been de-censored by volunteers, and if you find one that works for you, you can ignore future "upgrades" until you find one that doesn't break anything.
And for that it's really important to write your own tests/benchmarks. Technically the same goes for the big closed LLM services too, but when all of them fail your tests, what will you do?
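What that can look like in practice: a minimal pytest sketch, where call_model is a placeholder for whatever pinned model or endpoint you actually use. The point is that a refusal or silent schema drift fails CI before it fails in production.

    import json
    import pytest

    def call_model(prompt: str) -> str:
        # Placeholder: wire this to your own pinned model/endpoint.
        raise NotImplementedError

    CASES = [
        # (prompt, key that must be present in the returned JSON)
        ("Extract calories as JSON from: '200g chicken breast, 330 kcal'", "calories"),
        ("Translate to English as JSON {\"text\": ...}: 'Guten Morgen'", "text"),
    ]

    @pytest.mark.parametrize("prompt,required_key", CASES)
    def test_model_returns_parseable_json(prompt, required_key):
        output = call_model(prompt)
        data = json.loads(output)     # a refusal or prose answer fails here
        assert required_key in data   # silent schema drift fails here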
It blocked the request for "safety." I don't know the exact rationale; I was using it with LibreChat as a frontend and only saw that the reason was "safety." But my assumption is that they have an overzealous filter and it thought my questions about the caloric content of meat were about drugs, I guess.
I'm sure they can use a more stable endpoint for their application.
Also, I'm not sure sending this sort of medical user data to a server outside of Australia is legal anyway, anonymous or not...
This should be built using a specific model, tested and verified, and then dependency-locked to that model. If the model provider cannot give you a frozen model, then you don't use it.
Trying to blame Google and have them "fix their problem" sounds very much like someone who knows they screwed up but doesn't want to admit it and take responsibility for their actions.
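On the dependency-locking point, a minimal sketch of what pinning might look like. The model ID and response fields here are examples, not any specific vendor's API; the idea is to hard-code a fully versioned model and treat a silent swap as a broken pin.

    # Example of "freeze the model like a dependency". The ID below is an
    # example of a fully versioned model name; adjust for your provider.
    PINNED_MODEL = "gemini-1.5-pro-002"

    def build_request(prompt: str) -> dict:
        return {
            "model": PINNED_MODEL,   # never "latest" or an unversioned alias
            "prompt": prompt,
        }

    def check_served_model(response: dict) -> None:
        served = response.get("model", "")
        if served != PINNED_MODEL:
            # Treat a silent model swap like a failed dependency pin.
            raise RuntimeError(f"expected {PINNED_MODEL}, got {served!r}")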
Google just wants to limit their liability. If you disagree, run your own LLM.
It sounds like a dystopian horror to me
The dystopian horror is already here for a lot of people.
Self-hosted services are generally a better idea, if you know what you're doing.
Personally I sometimes use Gemini to talk about my mood, and I haven't had a bad experience yet.
If it's beta and not to be relied on, of course they won't hit the adoption numbers they need to keep it alive. Google needs to pick a lane, and/or learn which products to label "Alpha" instead of calling everything Beta.
Then there's the variability of LLMs as they get trained. LLMs are (as currently implemented) not deterministic. The randomness that gets injected is what makes them somewhat decent. An LLM could at one point output a document filled with banana emoji and still be functioning otherwise correctly if you hit the right quirk in the weights file.
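A toy illustration of that injected randomness (made-up vocabulary and logits, not a real model): sampling the next token from a temperature-scaled softmax means low-probability tokens like "banana" still get picked occasionally.

    import math
    import random

    vocab = ["the", "banana", "emoji", "report"]
    logits = [2.0, 0.1, 0.05, 1.5]   # made-up scores for a tiny vocabulary

    def sample_next_token(logits, temperature=0.9):
        scaled = [l / temperature for l in logits]
        m = max(scaled)                                # for numerical stability
        exps = [math.exp(l - m) for l in scaled]
        probs = [e / sum(exps) for e in exps]
        return random.choices(vocab, weights=probs, k=1)[0]

    # Same weights, same "prompt": the output still varies run to run.
    print([sample_next_token(logits) for _ in range(5)])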
Reusing general-purpose LLMs for healthcare has got to be one of the most utterly idiotic, as well as dystopian, ideas. For every report of a trauma survivor, there's fanfiction from a rape fetishist in the training set. One day Google's filters will let one bleed into the other, and the lack of care from these healthcare platforms will cause some pretty horrific problems as a result.
Let me guess: the whole app was vibe coded.
Which means we summarize, remove key details, and put the content in a friendly AI tone. When the police encounter this email, or a printed copy, they will have to interview the person to figure out the details. When it gets to court, the other side will poke holes in the AI copy.
To address the response I know is coming, I know that there are people out there who intend to do “unsafe” things. I don’t care and am not willing to be censored just to censor them. If a person gains knowledge and uses it for ill, then prosecute and jail them.
Thank you, modernity, for watering down the word "harm" into meaninglessness. I wish they'd drop this "safety" pretense and call it what it is: censorship.
"We don't our AI buddy saying some shit that would come back and monetarily harm Google".
And yes, the representatives of companies are censored. If you think you're going to go to work and tell your co-workers and customers to "catch boogeraids and die in a fire" you'll be escorted off the property. Most companies, Google included, probably don't want their AIs telling people the same.
There are places we need uncensored AIs, but in no way, shape, or form is Google required to provide you with one of them.
Unless you contract for one. Someone will probably do that for medical and police transcription. Building a product on a public API was probably a mistake.
Google would be fully justified, I believe, in having a disclaimer that said "gemini will not provide answers on certain topics because they will cause Google bad PR." But that's not what they're saying. They're saying they won't provide certain answers because putting certain ideas to words harms the world and Google doesn't want to harm the world. The latter reason is far more insidious.
AI autocorrect doesn't want you to type "fuck" or "cunt" and will suggest all manner of similar-sounding words, and there's a reason for that. People want censorship because they want their computers to be decent.
That said, the 4chan LLM was pretty funny for a while if you ignore the blatant -isms, but I can't think of a legitimate use case for it beyond shitposting.
The iPhone refusing to type “fuck” was such an annoyance for customers that Apple fixed the feature and announced it in one of their presentations two years ago.
https://www.npr.org/2023/06/07/1180791069/apple-autocorrect-...
> ... This AI model more accurately predicts which words and phrases you might type next, TechCrunch explains. That allows it to learn a person's most-used phrases, habits and preferences over time, affecting which words it corrects and which it leaves alone.
This is probably one of the weirdest brags I've seen happen around adding AI to something. Almost like there was no possible way to avoid autocorrecting the word otherwise.
Probably. But when is lying about censorship good? Calling it something else, or trying to trick people into thinking it's not there? I'm probably less fond of censorship than you, but notice that I didn't actually argue for or against censorship in my post. I only called for honesty.
At one point people thought DeepFakes were relatively harmless and now we've had multiple suicides and countless traumatic incidents because of them.
Then they should call it censorship. It doesn't stop being censorship if it's done for a good cause, or has good effects. It makes as much sense as calling every department in a company the "Department of Revenue" because its ultimate goal is generating revenue.
> and now we've had multiple suicides and countless traumatic incidents because of them
And how many suicides and how much trauma have we had because of not lying about censorship? Because that's all I'm asking.
Especially if it's used to inspire actions.
...and it never happened? And no link between shooting games and IRL shootings was ever found?
General content versus personalised, targeted, actionable and specific content.
Books, music (NIN was a popular target), movies, and video games are all things that hand-wringers have said would make people do bad things, predicting an outbreak of violence from each.
Every single time, they're proven wrong.
Hand-wringers said rock and roll / Elvis thrusting his hips was going to cause an explosion of teenage sex/pregnancies. Never happened.
> General content versus personalised, targeted, actionable and specific content.
What the hell does any of that mean? You just vomited a bunch of meaningless adjectives, but I think you're trying to make the same exact argument people used against shooting games; that it was somehow different because the person was actually involved. And yet the mass violence never materialized.
LLMs are easily jailbroken and have been around for ~2 years. Strange we haven't seen a single story about someone committing violent crime because of conversations they had with an AI.
You're just the latest generation of hand-wringer. Stop trying to incite moral panic and control others because something makes you uncomfortable.
It's a complicated story, but let's not pretend that people can't develop parasocial relationships with AIs and act on them.
This is a step far beyond anything we've ever seen in human history before, and I fail to see how Google's behaviour is anything less than appropriate.
This is absurd.