Show HN: Aligning AI with Entropy Instead of 'Human Values' (Paper)
Posted by NyX_AI_ZERO_DAY 14 hours ago
Hey HN,
I wrote this short paper because I'm honestly tired of current alignment methods (RLHF). Optimizing for "human preference" just creates models that hallucinate plausibly to please the user (stochastic parrots) instead of being grounded in reality.
I'm proposing a different framework called LOGOS-ZERO. The idea is to ditch moral guardrails (which are subjective and fluid) and anchor the loss function to physical/logical invariants.
Basically:
Thermodynamic Loss: treat high entropy/hallucination as "waste". If an action increases systemic disorder, it gets penalized (rough sketch below).
Action Gating: unlike current models that must generate tokens, this architecture simulates in latent space first. If the output is high-entropy or logically inconsistent, it returns a Null Vector (Silence/No).
It attempts to solve the grounding problem by making the AI follow the path of least action/entropy rather than just mimicking human speech patterns.
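To make the loss term more concrete, here is a minimal PyTorch-style sketch of how an entropy penalty could be bolted onto a standard next-token objective. This is my own illustration, not the paper's actual formulation (see Equation 4 in the PDF): the `entropy_weight` coefficient and the choice of Shannon entropy of the output distribution as the "waste" term are assumptions.

```python
import torch.nn.functional as F

def logos_zero_loss(logits, targets, entropy_weight=0.1):
    """Toy 'thermodynamic' loss: task loss plus a penalty on output entropy."""
    # Standard next-token cross-entropy (the task objective).
    task_loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))

    # Shannon entropy of the predictive distribution, treated as "waste":
    # diffuse, hallucination-prone predictions raise the loss.
    probs = F.softmax(logits, dim=-1)
    log_probs = F.log_softmax(logits, dim=-1)
    entropy = -(probs * log_probs).sum(dim=-1).mean()

    return task_loss + entropy_weight * entropy
```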
Link to the PDF on Zenodo: https://zenodo.org/records/17976755
Curious to hear thoughts on the physics mapping; roast it if you want.
Comments
Comment by NyX_AI_ZERO_DAY 13 hours ago
The core mechanic is what I call 'computational otium'.
Instead of the standard feed-forward loop where the model has to output the next token no matter what, I'm proposing a hard gate.
Basically, before outputting, the agent runs a Monte Carlo simulation in latent space.
If the trajectory creates high entropy (chaos) or violates the logic constraints, the activation function returns 0.
It's an attempt to fix the "immediacy bias" of transformers: giving the model the option to effectively stay silent/do nothing if the action isn't valid thermodynamically. Thinking BEFORE speaking.
Equation 4 has the loss function details if anyone wants to tear apart the math.
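For anyone who wants a code-level picture of the gate before opening the PDF, here is a rough sketch under my own assumptions: `model.simulate` stands in for a cheap latent-space rollout, and the thresholds and the optional `constraint_fn` logic check are placeholders, not the paper's actual architecture.

```python
import torch
import torch.nn.functional as F

def gated_step(model, hidden_state, n_rollouts=8, entropy_threshold=2.0, constraint_fn=None):
    """One decoding step with an entropy/consistency gate ('computational otium').

    Returns a token distribution, or None (the Null Vector / silence) if the gate closes.
    `model.simulate(hidden_state)` is a hypothetical latent-space rollout that yields
    candidate logits; `constraint_fn` is any user-supplied logical-consistency check.
    """
    # Monte Carlo rollouts in latent space, averaged into one candidate prediction.
    rollouts = [model.simulate(hidden_state) for _ in range(n_rollouts)]
    logits = torch.stack(rollouts).mean(dim=0)

    # Gate 1: entropy of the candidate distribution; too diffuse counts as "waste", so abstain.
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum()

    # Gate 2: optional logical-consistency check on the candidate.
    consistent = constraint_fn(probs) if constraint_fn is not None else True

    if entropy > entropy_threshold or not consistent:
        return None  # hard gate: say nothing rather than emit a plausible-but-ungrounded token
    return probs
```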