Price Per Token

Salesforce cut 4,000 support roles using AI agents. Then admitted the AI had reliability problems significant enough to warrant a strategic pivot.

I have said this multiple times and received a lot of pushback. But this Salesforce story makes it clearer than anything I could write.

You cannot deploy AI in production workflows without infrastructure governing how it executes. Salesforce just figured that out. The hard way.

They deployed Agentforce across their own help site, handling over 1.5 million customer conversations. Cut 4,000 support roles in the process. Then their SVP of Product Marketing said: "All of us were more confident about large language models a year ago."

One customer found satisfaction surveys were randomly not being sent despite clear instructions. The fix was deterministic triggers. Another name for what should have been enforced from the start.
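To make that concrete: a deterministic trigger is not exotic machinery. Here is a minimal sketch in Python, with illustrative names rather than anything Salesforce ships, of what it means for the survey send to be owned by code instead of by the model's reading of its instructions:

```python
from dataclasses import dataclass

@dataclass
class Case:
    id: str
    customer_email: str
    status: str
    survey_sent: bool = False

def send_survey(email: str, case_id: str) -> None:
    # Stand-in for the real delivery call (email service, queue, whatever).
    print(f"survey for case {case_id} sent to {email}")

def on_case_closed(case: Case) -> None:
    # The model can draft the conversational reply, but whether the survey
    # goes out is decided here, by a rule that cannot drift or lose focus.
    if case.status == "closed" and not case.survey_sent:
        send_survey(case.customer_email, case.id)
        case.survey_sent = True
```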

Human agents had to step in to correct AI-generated responses. That is the babysitting problem. The same one developers describe when they say half their time goes into debugging the agent's reasoning instead of the output.

They could have added LLM-as-judge. A verification protocol. Some other mitigation. But all of that is post-hoc. It satisfies the engineering checklist. It does not satisfy the user who already got a wrong answer and moved on. A frustrated customer does not give you a second chance to get it right.

They have now added Agent Script, a rule-based scripting layer that forces step-by-step logic so the AI behaves predictably. Their product head wrote publicly about AI drift: agents losing focus on their primary objectives as context accumulates. Stock is down 34% from its peak.

The model was not the problem. Agentforce runs on capable LLMs. What failed was the system around them. No enforcement before steps executed. No constraint persistence across turns. No verification that instructions were actually followed before the next action ran.
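None of those three missing pieces requires anything exotic. A minimal sketch, again in Python with placeholder names and a deliberately trivial verifier, of what they look like wrapped around an arbitrary model call:

```python
from typing import Callable

CONSTRAINTS = [
    "Never promise a refund.",
    "Always include the case number.",
]

def violates(output: str) -> bool:
    # Placeholder verifier. A real one would be rule checks, schema checks,
    # or a second model acting as judge; this one only catches the obvious.
    promises_refund = "we will refund" in output.lower()
    missing_case_number = "case #" not in output.lower()
    return promises_refund or missing_case_number

def run_step(llm: Callable[[str], str], task: str, history: list[str]) -> str:
    # Enforcement before the step: constraints are re-injected on every turn
    # instead of trusting the model to remember them from turn one.
    prompt = "\n".join(CONSTRAINTS) + "\n\nTask: " + task + "\n" + "\n".join(history)
    output = llm(prompt)
    # Verification before the next action: a failed check stops the chain
    # instead of letting the bad step execute and moving on.
    if violates(output):
        raise RuntimeError("constraint violated; step blocked")
    history.append(output)
    return output
```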

They are now building what should have been there before the 4,000 roles were cut. Deterministic logic for business-critical processes, LLMs for the conversational layer.
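Roughly this shape, with hypothetical names: the business decision is computed, and the model only writes the words around it.

```python
def handle_ticket(ticket: dict, llm) -> dict:
    # Business-critical path: refund eligibility is computed from data,
    # never inferred by the model. (Illustrative rule, not Salesforce's.)
    refund_due = ticket["reason"] == "duplicate charge" and ticket["amount"] <= 50

    # Conversational layer: the model writes the message around a decision
    # it did not make and cannot override.
    decision = "a refund has been issued" if refund_due else "no refund applies"
    reply = llm(f"Write a short, polite support reply explaining that {decision} "
                f"for case {ticket['id']}.")
    return {"refund": refund_due, "reply": reply}
```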

That is not a new architecture. That is the enforcement layer. Arrived at the hard way.


6 comments

Give it a few years and we won't notice the difference anymore. We're all so fucked.

1 pt

the drift point is the one that doesn't get solved by adding another enforcement layer. context accumulation drift happens when the agent's operating context is outdated, not just when its instructions are unclear. scripted steps constrain behavior, but the retrieved context the agent acts on can still be from two product releases ago. enforcement layer above, context staleness below.

1 pt

The enforcement layer I am describing owns context assembly, not just constraint enforcement on top of whatever context exists.

The constraints the model sees at step 8 are identical to what it saw at step 1 because something outside the model constructs context deterministically before every invocation. That addresses accumulation drift.

The retrieval layer underneath still needs to be accurate, but the assembly layer ensures whatever is there gets used consistently, not probabilistically.
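A rough sketch of what I mean by deterministic assembly, with placeholder names for the retrieval and budget pieces:

```python
def assemble_context(constraints: list[str], retrieve, query: str,
                     budget: int = 4000) -> str:
    # Deterministic assembly: same constraints, same ordering, same
    # truncation rule on every invocation. What the model sees at step 8
    # is built the same way as what it saw at step 1.
    docs = sorted(retrieve(query), key=lambda d: d["id"])  # stable order
    context = "\n".join(constraints) + "\n---\n"
    for doc in docs:
        if len(context) + len(doc["text"]) > budget:
            break
        context += doc["text"] + "\n"
    return context
```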

1 pt

fair point on deterministic assembly. the distinction i'd draw is assembly consistency vs content freshness. your enforcement layer can guarantee consistent construction from what's in the retrieval layer, but if the retrieval layer feeds in context from two product releases ago, the assembly is consistent and wrong. two separate problems. the enforcement layer solves the first. freshness of the underlying content is a different maintenance question.

1 pt

You are right to separate the two. To be clear, assembly consistency is what CL owns today. Content freshness enforcement is on the roadmap and will be part of the next major update.

1 pt
saijanai·21d ago

It is possible that management didn't predict this.

it is also quite possible that they did and were using the AI thing to justify downsizing.

Firing people because the AI would replace them and then "suddenly discovering" that their business model doesn't work "because of the AI" is a great excuse for downsizing and sounds better than "our business model never worked anyway."

Yes, you have to eat the cost of the AI for the length of the contract, but it is probably cheaper than going through the hassle of justifying the downsizing to upper management/board of directors because you were incompetent, with or without AI in the mix.


Alternatively: they really ARE that stupid.

1 pt