I’m tired of LLM bullshitting. So I fixed it.

@PolarKraken@lemmy.dbzer0.com

This sounds really interesting, I’m looking forward to reading the comments here in detail and looking at the project, might even end up incorporating it into my own!

I’m working on something that addresses the same problem in a different way, the problem of constraining or delineating the specifically non-deterministic behavior one wants to involve in a complex workflow. Your approach is interesting and has a lot of conceptual overlap with mine, regarding things like strictly defining compliance criteria and rejecting noncompliant outputs, and chaining discrete steps into a packaged kind of “super step” that integrates non-deterministic substeps into a somewhat more deterministic output, etc.

How involved was it to build it to comply with the OpenAI API format? I haven’t looked into that myself but may.

@SuspciousCarrot78@lemmy.world

Cheers!

Re: OpenAI API format: 3.6 - not great, not terrible :)

In practice I only had to implement a thin subset: POST /v1/chat/completions + GET /v1/models (most UIs just need those). The payload is basically {model, messages, temperature, stream…} and you return a choices[] with an assistant message. The annoying bits are the edge cases: streaming/SSE if you want it, matching the error shapes UIs expect, and being consistent about model IDs so clients don’t scream “model not found”. Which is actually a bug I still need to squash some more for OWUI 0.7.2. It likes to have its little conniptions.

But TL;DR: more plumbing than rocket science. The real pain was sitting down with pen and paper and drawing what went where and what wasn’t allowed to do what. Because I knew I’d eventually fuck something up (I did, many times), I needed a thing that told me “no, that’s not what this is designed to do. Do not pass go. Do not collect $200”.

shrug I tried.

@PolarKraken@lemmy.dbzer0.com

The very hardest part of designing software, and especially designing abstractions that aim to streamline use of other tools, is deciding exactly where you draw the line(s) between intended flexibility (user should be able and find it easy to do what they want), and opinionated “do it my way here, and I’ll constrain options for doing otherwise”.

You have very clear and thoughtful lines drawn here, about where the flexibility starts and ends, and where the opinionated “this is the point of the package/approach, so do it this way” parts are, too.

Sincerely that’s a big compliment and something I see as a strong signal about your software design instincts. Well done! (I haven’t played with it yet, to be clear, lol)

@SuspciousCarrot78@lemmy.world

Thank you for saying that and for noticing it! Seeing you were kind enough to say that, I’d like to say a few things about how/why I made this stupid thing. It might be of interest to people. Or not LOL.

To begin with, when I say I’m not a coder, I really mean it. It’s not false modesty. I taught myself this much over the course of a year and the reactivation of some very old skills (30 years hence). When I decided to do this, it wasn’t from any school of thought or design principle. I don’t know how CS professionals build things. The last time I looked at an IDE was Turbo Pascal. (Yes, I’m that many years old. I think it probably shows, what with the >> ?? !! ## all over the place. I stopped IT-ing when Pascal, Amiga and BBS were still the hot new things)

What I do know is - what was the problem I was trying to solve?

IF the following are true;

I have ASD. If you tell me a thing, I assume your telling me a thing. I don’t assume you’re telling me one thing but mean something else.
A LLM could “lie” to me, and I would believe it, because I’m not a subject matter expert on the thing (usually). Also see point 1.
I want to believe it, because why would a tool say X but mean Y? See point 1.
A LLM could lie to me in a way that is undetectable, because I have no idea what it’s reasoning over, how it’s reasoning over it. It’s literally a black box. I ask a Question—>MAGIC WIRES---->Answer.

AND

“The first principle is that you must not fool yourself and you are the easiest person to fool”

THEN

STOP.

I’m fucked. This problem is unsolvable.

Assuming LLMs are inherently hallucinatory within bounds (AFAIK, the current iterations all are), if there’s even a 1% chance that it will fuck me over (it has), then for my own sanity, I have to assume that such an outcome is a mathematical certainty. I cannot operate in this environment.

PROBLEM: How do I interact with a system that is dangerously mimetic and dangerously opaque? What levers can I pull? Or do I just need to walk away?

Unchangeable. Eat shit, BobbyLLM. Ok.
I can do something about that…or at least, I can verify what’s being said, if the process isn’t too mentally taxing. Hmm. How?
Fine, I want to believe it…but, do I have to believe it blindly? How about a defensive position - “Trust but verify”?. Hmm. How?
Why does it HAVE to be opaque? If I build it, why do I have to hide the workings? I want to know how it works, breaks, and what it can do.

Everything else flowed from those ideas. I actually came up with a design document (list of invariants). It’s about 1200 words or so, and unashamedly inspired by Asimov :)

MoA / Llama-swap System

System Invariants

0. What an invariant is (binding)

An invariant is a rule that:

Must always hold, regardless of refactor, feature, or model choice
Must not be violated temporarily, even internally. The system must not fuck me over silently.
Overrides convenience, performance, and cleverness.

If a feature conflicts with an invariant, the feature is wrong. Do not add.

1. Global system invariant rules:

1.1 Determinism over cleverness

Given the same inputs and state, the system must behave predictably.
No component may:
- infer hidden intent,
- rely on emergent LLM behavior
- or silently adapt across turns without explicit user action.

1.2 Explicit beats implicit

Any influence on an answer must be inspectable and user-controllable.
This includes:
- memory,
- retrieval,
- reasoning mode,
- style transformation.

If something affects the output, the user must be able to:

enable it,
disable it,
and see that it ran.

Assume system is going to lie. Make its lies loud and obvious.

On and on it drones LOL. I spent a good 4-5 months just revising a tighter and tighter series of constraints, so that 1) it would be less likely to break 2) if it did break, it do in a loud, obvious way.

What you see on the repo is the best I could do, with what I had.

I hope it’s something and I didn’t GIGO myself into stupid. But no promises :)

I’m tired of LLM bullshitting. So I fixed it.

I’m tired of LLM bullshitting. So I fixed it.

llama-conductor

The thing: llama-conductor

1) KB mechanics that don’t suck (1990s engineering: markdown, JSON, checksums)

2) Mentats: proof-or-refusal mode (Vault-only)

3) Vodka: deterministic memory on a potato budget

Privacy

A place to discuss privacy and freedom in the digital world.

Some Rules

Related communities

I’m tired of LLM bullshitting. So I fixed it.plus-square

I’m tired of LLM bullshitting. So I fixed it.plus-square

llama-conductor

The thing: llama-conductor

1) KB mechanics that don’t suck (1990s engineering: markdown, JSON, checksums)

2) Mentats: proof-or-refusal mode (Vault-only)

3) Vodka: deterministic memory on a potato budget

Privacy

A place to discuss privacy and freedom in the digital world.

Some Rules

Related communities

I’m tired of LLM bullshitting. So I fixed it.

I’m tired of LLM bullshitting. So I fixed it.