Surface Tension of Software

Posted by i8s 2 days ago

Comments

Comment by yuchi 2 days ago

While the reasoning holds generally, that specific example is wrong. The type the author presents is not the “User Profile” but a “User Profile Load Resource” (or something along those lines).

When you actually design interfaces you discover that there are way more states to keep in mind when implementing asynchronous loading.

1. There’s an initial state, where fetching has not happened yet

2. There may be initial cached (stale or not) data

3. Once loaded the data could be revalidated / refreshed

So the assumption that you either are loading XOR have data XOR have an error does not hold. You could have data, an error from the last revalidation, and be loading (revalidating).
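
A minimal sketch of what such a "load resource" could look like, in Rust. The names `Resource`, `FetchStatus` and the methods are hypothetical and not from the article; the point is only that data, last error and activity are independent axes, so all three can be present at once.

```rust
/// What the loader is currently doing, independent of what it already holds.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FetchStatus {
    Idle,     // no request in flight (this includes "never fetched yet")
    Fetching, // a first load or a revalidation is running
}

/// Hypothetical "load resource" (all names are assumptions for illustration).
#[derive(Debug, Clone, PartialEq)]
struct Resource<T> {
    data: Option<T>,            // cached or freshly loaded value, possibly stale
    last_error: Option<String>, // error from the most recent failed attempt
    status: FetchStatus,
}

impl<T> Resource<T> {
    /// Initial state: nothing fetched yet.
    fn new() -> Self {
        Resource { data: None, last_error: None, status: FetchStatus::Idle }
    }

    /// Start from cached data that may be stale.
    fn with_cached(data: T) -> Self {
        Resource { data: Some(data), last_error: None, status: FetchStatus::Idle }
    }

    /// Begin a load or revalidation; existing data and error are kept.
    fn start_fetch(mut self) -> Self {
        self.status = FetchStatus::Fetching;
        self
    }

    /// A fetch succeeded: replace the data and clear the old error.
    fn succeed(mut self, data: T) -> Self {
        self.data = Some(data);
        self.last_error = None;
        self.status = FetchStatus::Idle;
        self
    }

    /// A fetch failed: keep the stale data and remember the error.
    fn fail(mut self, error: &str) -> Self {
        self.last_error = Some(error.to_string());
        self.status = FetchStatus::Idle;
        self
    }
}
```

Starting from cached data, a failed revalidation followed by a second attempt leaves the value with data, an error and `Fetching` all at the same time, which is exactly the combination a three-way XOR cannot express.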

Comment by jstimpfle 2 days ago

Yup, it's quite rare that ADTs (or Rust enums) are so clear cut and obvious.

The idea that the data model looks like

   enum XYZ {
      case A(B, C, D, E);
      case F(G, H, I, J, K);
      case L();
      case M(N, O);
   }

is just not true in practice.

I think messaging is one case where it can happen, but even there it's often good to combine fields and share them (and common code) over multiple types of messages. If Messages A and B both have a field "creationTime" or whatever (with identical semantics), it's probably a bad idea to model them as separate fields, because that leads to code duplication, which is unmaintainable.
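
To illustrate the shared-field point, here is a rough Rust sketch. The message kinds and the field name `creation_time` are made up for the example; only the shape matters: fields with identical semantics sit once in a common struct, and the tagged union holds only what really differs.

```rust
/// Fields with identical semantics for every message live in one place.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    creation_time: u64, // e.g. seconds since the epoch; shared by all kinds
    body: Body,
}

/// Only the parts that really differ go into the tagged union.
/// These three kinds are hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Body {
    Text { content: String },
    Join { user: String },
    Ping,
}

/// Code that only needs the shared field never has to match on the kind.
fn newest(messages: &[Message]) -> Option<&Message> {
    messages.iter().max_by_key(|m| m.creation_time)
}

/// Code that needs the specifics matches on the body.
fn describe(message: &Message) -> String {
    match &message.body {
        Body::Text { content } => format!("[{}] text: {}", message.creation_time, content),
        Body::Join { user } => format!("[{}] {} joined", message.creation_time, user),
        Body::Ping => format!("[{}] ping", message.creation_time),
    }
}
```

Had `creation_time` been repeated inside every variant, `newest` would need one match arm per kind just to read it.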

Maybe I can be more precise by proclaiming that ADTs can be good for making clear what's "there", so they can be used to "send" information. But to write any useful usage code, typically a different representation that folds common things into common places is needed. And it might just happen that field F is valid in cases A and B but not C. That's life! Reality is not a tree but a graph.

That's why it's a bad idea to try and model the exact set of possible states and rule out everything else completely in the type system. Even the most complicated type systems can only deal with the simple cases, and will fail and make a horrible mess at the more complicated ones.

So I'm saying, there is value to preventing some misuse and preventing invalid states, but it comes at a cost that one has to understand. As so often, it's all about moderation. One should avoid fancy type system things in general because those create dependency chains throughout the codebase. Data, and consequently data types (including function signatures), are visible at the intersection of modules, so that's why it's so easy to create unmaintainable messes by relying on type systems too much. When it's possible to make a usable API with simple types (almost always), that's what you should do.

Comment by mrkeen 2 days ago

> that specific example is wrong

This will be discovered at compile time if you use immutability & types like the article suggests.

> So the assumption that you either are loading XOR have data XOR have an error does not hold.

If you design render() to take actual Userdata, not LoadingUserdata, then you simply cannot call render if it's not loaded.

The way to produce unspecified behaviour is to enable nulls and mutability. Userdata{..} changing from unloaded to loaded means your render() is now dynamically typed and does indeed hit way more states than anticipated.
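
A small Rust sketch of that idea. The field names and the `UserLoad` and `view` names are assumptions for illustration, not taken from the article: `render` only accepts a fully loaded `UserData`, and a separate type describes the loading process.

```rust
/// Fully loaded profile; the field names are made up for illustration.
#[derive(Debug, Clone, PartialEq)]
struct UserData {
    name: String,
    email: String,
}

/// The loading process is a separate type that may or may not hold a UserData.
#[derive(Debug, Clone, PartialEq)]
enum UserLoad {
    Loading,
    Failed(String),
    Loaded(UserData),
}

/// Accepts only loaded data, so "render before load" cannot be written.
fn render(user: &UserData) -> String {
    format!("{} <{}>", user.name, user.email)
}

/// The one place that has to decide what to show for each loading state.
fn view(load: &UserLoad) -> String {
    match load {
        UserLoad::Loading => "Loading...".to_string(),
        UserLoad::Failed(reason) => format!("Error: {}", reason),
        UserLoad::Loaded(user) => render(user),
    }
}
```

The compiler rejects a `match` that forgets one of the variants, and there is no way to reach `render` without a `UserData` in hand.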

Comment by codeflo 2 days ago

Once you notice the pattern, you see it everywhere:

> Stability isn’t declared; it emerges from the sum of small, consistent forces.

> These aren’t academic exercises — they’re physics that prevent the impossible.

> You don’t defend against the impossible. You design a world where the impossible has no syntax.

> They don’t restrain motion; they guide it.

I don't just ignore this article. I flag it.

Comment by htk 2 days ago

Please elaborate. I thought the article was interesting and would love a contrasting take.

Edit: thank you for the answers, I don't know how I missed that em dash clue.

Comment by Jordan-117 2 days ago

Not just the em dash, but the pervasive "not X, but Y" construction.

(oh no am i one of them?)

Comment by polytely 2 days ago

it has a very LLM style of writing

Comment by jiggawatts 2 days ago

It is a sign of ChatGPT's style.

Comment by BSTRhino 2 days ago

Make your invalid states unrepresentable

Comment by beagle3 2 days ago

Indeed. But ... do not confuse your model with reality.

There's a folk story - I don't remember where I read it - about a genealogy database that made it impossible to e.g. have someone be both the father and the grandfather of the same person. Which worked well until they had to put in details about a person who had fathered a child with his own daughter - and was thus both the father and the grandfather of that child. (Sad as it might be, it is something that can, in fact, happen in reality, and unfortunately does).

While that was probably just database constraints of some sort which could easily be relaxed, and not strictly "unrepresentable" like in the example in the article - it is easy to paint yourself into a corner by making a possible state of the world, which your mental model deems impossible, unrepresentable.

Comment by louthy 2 days ago

Your example doesn’t validate your point. That’s a valid state made unrepresentable, not an invalid state made unrepresentable. Your example simply demonstrates a poorly architected set of constraints.

The critical thing with state and constraints is knowing at what level the constraint should be. This is what trips up most people, especially when designing relational database schemas.

Comment by jacquesm 2 days ago

Any assumption made in order to ship a product on time will eventually be found to have been incorrect and will cause 10x the cost that it would have taken to properly design the thing in the first place. The problem is that if you do that proper design you never survive to the stage where you have that problem.

I think the solution to that is to continuously refactor, and to spell out very clearly what your assumptions are when you are writing the code (which is an excellent use for comments).

Comment by louthy 1 day ago

Continuous refactoring is much easier with well constrained data/type schemas. There are fewer edge cases to consider, which means any refactoring or data migration processes are simpler.

The trick is to make the schema represent what you need - right now - and no more. Which is the point of the “Make your invalid states unrepresentable” comment.

Comment by watt 2 days ago

I do see how it does, in a way. Something the designer thought was an "invalid state" turns out to be a valid and possible state in the real world. In terms of UI/UX, it's the uncomfortable long pause before something happens and the screen renders (lack of feedback, feeling that the system hangs). Or content flicker when a window is resized or dragged. Just because somebody thought "oh, this clearly is an invalid state and can be ignored".

The real world and user experience requirements have a way of intruding on these underspecified models of how the world "should" be.

Comment by louthy 1 day ago

That’s still a poorly designed system. For UI there should be a ‘view model’ that augments your model. That view model should be able to represent every state your UI can be in, which includes any ‘intermediate’ states. If you don’t do this with a concrete and well constrained model then you’re still doing it, but with arbitrary UI logic and other ad-hoc state that is much harder to understand and manage.

Ultimately you need to make your own pragmatic decisions about where you think that state should be and how it should be managed. But the ad-hoc approach is more open to inconsistencies and therefore bugs.

Comment by flir 2 days ago

> at what level the constraint should be

Hi, can you give an example? Not sure I understand what you're getting at there.

(My tuppence: "the map is not the territory", "untruths programmers believe about...", "Those drawn with a very fine camel's hair brush", etc etc.

All models are wrong, and that's inevitable/fine, as long as the model can be altered without pain. Focus on ease of improving the model (eg can we do rollbacks?) is more valuable than getting the model "right").

Comment by louthy 1 day ago

> Hi, can you give an example? Not sure I understand what you're getting at there.

An utterly trivial example is constraining the day-field in a date structure. If your constraint is at the level of the field then it can’t make a decision as to whether 31 is a good day-value or not, but if the constraint is at the record-structure level then it can use the month-value in its predicate and that allows us to constrain the data correctly.

When it comes to schema design it always helps to think about how to ‘step up’ to see if there’s a way of representing a constraint that seems impossible at ‘smaller’ schema units.
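
The day-field example can be sketched in Rust like this. It is one possible shape, assuming the Gregorian calendar; the `Date` type and helper names are made up. The check lives in the constructor of the whole record, because only there are the month and year available.

```rust
/// A date whose fields are meant to be set together, through `Date::new`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Date {
    year: i32,
    month: u8,
    day: u8,
}

/// Gregorian leap year rule.
fn is_leap_year(year: i32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Number of days in the given month; None if the month itself is invalid.
fn days_in_month(year: i32, month: u8) -> Option<u8> {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => Some(31),
        4 | 6 | 9 | 11 => Some(30),
        2 => Some(if is_leap_year(year) { 29 } else { 28 }),
        _ => None,
    }
}

impl Date {
    /// Record-level constraint: the day is checked against month and year.
    /// A field-level check could only say "1..=31" and would accept April 31.
    fn new(year: i32, month: u8, day: u8) -> Option<Date> {
        let max_day = days_in_month(year, month)?;
        if (1..=max_day).contains(&day) {
            Some(Date { year, month, day })
        } else {
            None
        }
    }
}
```

For the constraint to actually hold, `Date` would live in its own module so that the private fields force callers through `Date::new`.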

Comment by flir 1 day ago

I get it - thanks.

Comment by incognito124 2 days ago

That sounds like this (in)famous Stack Overflow question: https://stackoverflow.com/a/6198257

Comment by ahoka 2 days ago

I wonder how old this advice is. Must be at least 20 years?

Comment by iamcalledrob 2 days ago

In theory, I think the article is correct.

Yet in practice, I've found that it's easy to over-model things that just don't matter. The number of potential states can balloon until you have very complex types to handle states that you know won't really occur.

And besides, even a well modelled set of types isn't going to save you from a number that's out of the expected range, for example.

I think Golang is a good counter example here -- it's highly productive, makes it easy to write very reliable code, yet makes it near impossible to do what the author is suggesting.

Properly considered error handling (which Go encourages) is much more important.

Comment by apples_oranges 2 days ago

Good type definitions are the foundation

Comment by ttoinou 2 days ago

> What does it mean when loading is false, error is Some, and data is also Some? The type allows nonsense. You write defensive checks everywhere. Every render must guess which combination is real.

How about you never write the wrong state in the first place? Then you have nothing to take care of.

Comment by wvbdmp 2 days ago

Then you have to take care of nobody ever writing the wrong state, which is increasingly annoying the more people work on the thing.

Comment by qezz 2 days ago

> How about you never write the wrong state in the first place?

Indeed, and tagged unions (enums in Rust) explicitly allow you to avoid creating invalid state.

Comment by bpavuk 2 days ago

this explains virtually everything:

- that's why OOP failed - side effects, software too liquid for its complexity

- that's why functional and generic programming are on the rise - good FP implementations are natively immutable, generic programming makes FP practical.

- that's why Kotlin and Rust are in a position to purge Java and C, philosophically speaking - the only things that remain are technical concerns, such as JetBrains' IDEA lock-in (that's basically the only place where you can do proper Kotlin work) as well as Rust's "hostility" to other bare-metal languages, embedded performance, and compiler speed.

Comment by i_am_a_peasant 2 days ago

what is it with this take that oop is dead… even the linux kernel heavily uses OOP.

inheritance is what has been changing in scope, where it’s discouraged to base your inheritance structure on your design domain.

Comment by jstimpfle 2 days ago

It always depends on the definition of OOP. Typical enterprise OOP is grounded in the idea of creating black boxes that you can't misuse. That creates silos that are hard to understand externally (and often internally as well, because their implementation tends to be composed of smaller black box objects). That practice may prevent some misuse but it creates a lot of problems globally because nobody understands what's happening anymore on the global scale. This leads to inefficiencies, both performance wise as well as development wise. Even with some understanding, there is typically so much boilerplate that changing things around becomes extremely tedious.

Actually, I have some similar concerns about powerful type systems in general -- not just OOP. Obsessing about expression and enforcement of invariants on the small scale can make it hard to make changes, and to make improvements on the large scale.

Instead of creating towers of abstraction, what can work better often is to try and structure things as a loosely coupled set of smaller components -- bungalows when possible. Interaction points should be limited. There is little point in building up abstraction to prevent every possible misuse, when dependencies are kept in check, so module 15 is only used by 11 and 24. The callers can easily be checked when making changes to 15.

But yeah -- I tend to agree with GP that immutability is a big one. Writing things once, and making copies to avoid ownership problems (deleting an object is mutation too), that prevents a lot of bugs. And there are so many more ways to realize things with immutable objects than people knew some time ago. The typical OOP codebase from the 90s and 00s is chock-full with unnecessary mutation.

Comment by ksclk 2 days ago

> the idea of creating black boxes that you can't misuse

Could you please expand upon your idea, particularly the idea that creating (from what I understood) a hierarchical structure of "blackboxes" (abstractions) is bad, and perhaps provide some examples? As far as I understand, the idea that you compose lower level bricks (e.g. classes or functions that encapsulate some lower level logic and data, whether it's technical details or business stuff) into higher level bricks, was what I was taught to be a fundamental idea in software development that helps manage complexity.

> structure things as a loosely coupled set of smaller components

Mind elaborating upon this as well, pretty please?

Comment by jstimpfle 2 days ago

> Could you please expand upon your idea that [..] a hierarchical structure of "blackboxes" [...] is bad?

You'll notice yourself when you try to actually apply this idea in practice. But a possible analogy is: How many tall buildings are around your place, what was their cost, how groundbreaking are they? Chances are, most buildings around you are quite low. Low buildings have a higher overhead in space cost, so especially in denser cities, there is a force to make buildings with more levels.

But after some levels, there are diminishing returns from going even higher, compared to just creating an additional building of the same size. And overhead is increasing. Higher up levels are more costly to construct, and they require a better foundation. We can see that most higher buildings are quite boring: how to construct them is well-understood, there isn't much novelty. There just aren't that many types of buildings that have all these properties: 1) tall/many levels 2) low overall cost of creation and maintenance 3) practical 4) novel.

With software components it's similar. There are a couple of ideas that work well enough that you can stack them on top of each other (say, CPU code on top of CPUs on top of silicon, userspace I/O on top of filesystems on top of hard drives, TCP sockets on top of network adapters...). That allows you to make things that are well enough understood and robust enough, and it's really economical to scale out on top of them.

But also, there isn't much novelty in these abstractions. Don't underestimate the cost in creating a new CPU or a new OS, or new software components, and maintaining them!

When you create your own software abstractions, those just aren't going to be that useful, they are not going to be rock-solid and well tested. They aren't even going to be that stable -- soon a stakeholder might change requirements and you will have to change that component.

So, in software development, it's not like you come up with rock-solid abstractions and combine 5 of those to create something new that solves all your business needs and is understandable and maintainable. The opposite is the case. The general, pre-made things don't quite fit your specific problem. They were not designed with your specific goal in mind. The more of them you combine, the less the solution fits and the less understandable it is and the more junk it contains. Also, combining is not free. You have to add a _lot_ of glue to even make it barely work. The glue itself is a liability.

But OOP, as I take it, is exactly that idea. That you're creating lots of perfect objects with a clear and defined purpose, and a perfect implementation. And you combine them to implement the functional requirements, even though each individual component knows only a small part of them, and is ideally reusable in your next project!

And this idea doesn't work out in practice. When trying to do it that way, we only pretend to abstract, we just pretend to reuse, and in the process we add a lot of unnecessary junk (each object/class has a tendency to be individually perfected and to be extended, often for imaginary requirements). And we add lots of glue and adapters, so the objects can even work together. All this junk makes everything harder and more costly to create.

> structure things as a loosely coupled set of smaller components

Don't build on top of shoddy abstractions. Understand what you _have_ to depend on, and understand the limitations of that. Build as "flat" as possible i.e. don't depend on things you don't understand.

Comment by ksclk 13 hours ago

Thanks a ton! While I don't have the experience to understand all of it, I appreciate your writing, like the sibling poster (and that you didn't delete your comment)!

It reminds me of huge enterprise-y tools, which in the long run often are more trouble than they're worth (and reimplementing just the subset you need perhaps would be better), and (the way you speak about OOP) bloated "enterprise" codebases with huge classes and tons of patterns, where I agree making things leaner and less generic would do a lot of good.

At first however I thought that you're against the idea of managing complexity by hierarchically splitting things into components (i.e. basically encapsulation), which is why I asked for clarification, because this idea seems fundamental to me, and seeing that someone is against it got me interested. I think now though that you're not against this idea, and you're against having overly generic abstractions (components? I'm not sure if I'm using the word "abstractions" correctly here) in your stack, because they're harder to understand, which I understand. I assume this is what blackbox means here.

Does it sound correct?

Comment by i_am_a_peasant 2 days ago

> When you create your own software abstractions, those just aren't going to be that useful, they are not going to be rock-solid and well tested. They aren't even going to be that stable -- soon a stakeholder might change requirements and you will have to change that component.

I also think it's about how many people you can get to buy-in on an abstraction. There probably are better ways of doing things than the unix-y way of having an OS, but so much stuff is built with the assumption of a unix-y interface that we just stick with it.

Like why can't I just write a string of text at offset 0x4100000 on my SSD? You could, but a file abstraction is a more manageable way of doing it. But there are other manageable ways of doing it, right? Why can't I just access my SSD contents like it's one big database? That would work too, right? Yeah, but we already have the file abstraction.

> But OOP, as I take it, is exactly that idea. That you're creating lots of perfect objects with a clear and defined purpose, and a perfect implementation. And you combine them to implement the functional requirements, even though each individual component knows only a small part of them, and is ideally reusable in your next project!

I think OOP makes sense when you constrain it to a single software component with well defined inputs and outputs. Like I'm sure many GoF-type patterns were used in implementing many STL components in C++. But you don't need to care about what patterns were used to implement anything in <algorithm> or <vector>. You just use these as components to build a larger component. When you don't have well defined components that just plug and play over the same software bus, no matter how good you are at design patterns it's gonna eventually turn into an incomprehensible spaghetti mess.

I'm really liking your writing style by the way, do you have a blog or something?

Comment by jstimpfle 1 day ago

I think I agree with your "buy-in idea", but adding that the Unix filesystem abstraction is almost as minimal as it gets, at least I'm not aware of a simpler approach in existence. Maybe subtract a couple small details that might have turned out as not optimal or useful. You can also in fact write a string to an offset on an SSD (open e.g. /dev/sda), you only need the necessary privileges (like for a file in a filesystem hierarchy too btw).
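
For the curious, writing at a byte offset looks roughly like this in Rust. This is a sketch using only the standard library; `write_at` and `read_at` are made-up helper names. The same calls work on a block device path such as /dev/sda given the privileges, but that overwrites whatever is stored there, so try it on a scratch file.

```rust
use std::fs::OpenOptions;
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Write `text` at byte `offset` of whatever `path` names.
/// For a regular file, a gap between the old end and `offset` reads back as zeros.
fn write_at(path: &Path, offset: u64, text: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)    // create a regular file if it does not exist yet
        .truncate(false) // keep existing contents; we only patch a range
        .open(path)?;
    file.seek(SeekFrom::Start(offset))?;
    file.write_all(text.as_bytes())
}

/// Read exactly `len` bytes starting at byte `offset`.
fn read_at(path: &Path, offset: u64, len: usize) -> io::Result<Vec<u8>> {
    let mut file = OpenOptions::new().read(true).open(path)?;
    file.seek(SeekFrom::Start(offset))?;
    let mut buffer = vec![0u8; len];
    file.read_exact(&mut buffer)?;
    Ok(buffer)
}
```

With a scratch file, `write_at(path, 0x4100000, "hello")` followed by `read_at(path, 0x4100000, 5)` should give the five bytes back. The filesystem adds nothing to the mechanics here; what it adds is the name and the agreement about who owns which byte range.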

A database would not work as mostly unstructured storage for uncoordinated processes. Databases are quite opinionated and require global maintenance and control, while filesystems are less obtrusive, they implement the idea of resource multiplexing using a hierarchy of names/paths. The hierarchy lets unrelated processes mostly coexist peacefully, while also allowing cooperation very easily. It's not perfect, it has some semantically awkward corner cases, but if all you need is multiplexing a set of byte-ranges onto a physical disk, then filesystems are a quite minimal and successful abstraction.

Regarding STL containers, I think they're useful and usable after a little bit of practice. They allow you to get something up and running quickly. But they're not without drawbacks and at some point it can definitely be worthwhile to implement custom versions that are more straightforward, more performant (avoiding allocation for example), have better debug performance, have less line noise in their error messages, and so on. For the most important STL containers it is quite easy to implement custom versions with fewer bells and whistles. The exception may be map (a red-black tree), which is not that easy to implement and is sometimes the right thing to use.

Comment by jstimpfle 1 day ago

> I'm really liking your writing style by the way, do you have a blog or something?

Thank you! I don't get to hear that often. I have to say I was almost going to delete that above comment because it's too long, the structure and build up is less than clear, there are a lot of "just" words in it and I couldn't edit anymore. I do invest a lot of time trying to write comments that make sense, but have never seen myself as a clear thinker or a good writer. To answer your question, earlier attempts to start a blog didn't go anywhere really... Your comment is encouraging though, so thanks again!

Comment by beagle3 2 days ago

OOP is different things to different people, see e.g. [0]. Many types of OOP that were popular in the past, are, indeed, dead. Many are still alive.

[0] https://paulgraham.com/reesoo.html

Comment by bpavuk 2 days ago

I'd personally declare dead everything except 3 and 4 because, unlike the rest, polymorphism is genuinely useful (e.g. Rust traits, Kotlin interfaces)

trivia: Kotlin interfaces were initially called "traits", but with Kotlin M12 release (2015), they renamed it to interfaces because Kotlin traits basically are Java interfaces. [0]

[0]: https://blog.jetbrains.com/kotlin/2015/05/kotlin-m12-is-out/...

Comment by i_am_a_peasant 2 days ago

1 is about encapsulation, that makes it really easy to unit test stuff. Say you need to access a file or a database in your test, you could write an abstraction on top of file or db access and mock that.

2 indeed never made sense to me since once everything is ASM "protected" means nothing, and if you can get a pointer to the right offset you can read "passwords". This claim of enforcing what can and cannot be reached from a subclass to help security never made sense to me.

3 i never liked function overloading, prefer optional arguments with default values.. if you need a function to work with multiple types of one parameter, make it a template and constrain what types can be passed

7 interfaces are a must have for when you want to add tests to a bunch of code that has no tests.

8 rust macros do this, and it's a great way to add functionality to your types without much hassle

9 idk what this is
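
To make point 1 concrete, here is a rough Rust sketch of putting an abstraction on top of file access and swapping in a fake for tests. The trait and type names (`FileSource`, `DiskFiles`, `FakeFiles`) are hypothetical.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;

/// Hypothetical abstraction over "read a text file".
trait FileSource {
    fn read_text(&self, path: &str) -> io::Result<String>;
}

/// Production implementation backed by the real filesystem.
struct DiskFiles;

impl FileSource for DiskFiles {
    fn read_text(&self, path: &str) -> io::Result<String> {
        fs::read_to_string(path)
    }
}

/// Test double that serves files from memory.
struct FakeFiles {
    files: HashMap<String, String>,
}

impl FileSource for FakeFiles {
    fn read_text(&self, path: &str) -> io::Result<String> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }
}

/// Code under test depends only on the trait, not on the disk.
fn count_lines(source: &dyn FileSource, path: &str) -> io::Result<usize> {
    Ok(source.read_text(path)?.lines().count())
}
```

A test builds a `FakeFiles` with a couple of entries and calls `count_lines` on it, while production code passes `&DiskFiles`. The same trait doubles as the "interface" from point 7 when retrofitting tests onto existing code.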

Comment by mrkeen 2 days ago

I'm all for FP, but there's no way the mainstream is taking side-effects seriously.

Neither Kotlin nor Rust cares about effects.

Switching to Kotlin/Rust for FP reasons (and then relying on programmer discipline to track effects) is like switching to C++ for RAII reasons.

Comment by bpavuk 2 days ago

side effects are not necessarily a bad thing. unintentional side effects are. with some exceptions, such as UI frameworks, I find it harder to unintentionally create a side effect. also, UI is basically one huge side effect.

Kotlin and Rust are just a lot more practical than, say, Clojure or Haskell, but they both take lessons from those languages.

Comment by mrkeen 2 days ago

> side effects are not necessarily a bad thing. unintentional side effects are

Right. Just like Strings aren't inherently bad. But languages where you can't tell whether data is a String or not are bad.

No language forbids effects, but some languages track effects.

Comment by rightbyte 2 days ago

I wouldn't say OOP failed in any general sense. There seem to be these strawman exceptions, and Conway's law is often what actually is the problem.

Comment by Ygg2 2 days ago

Having tried Kotlin in IDEA, I must admit their refactoring tools for Java are miles ahead of Kotlin.

I don't know how strong lock in is.

Comment by bpavuk 2 days ago

near-infinite until they finish Kotlin LSP alpha.

also, hot take: Kotlin simply does not need this many tools for refactoring, thanks in part to the first-class FP support. in fact, almost every non-Android Kotlin dev I have ever met would be totally happy with analysis and refactoring levels on par with Rust Analyzer.

but even with LSP, I would still need IDEA (at least Community) to perform Java -> Kotlin migration and smooth Java interoperability.

Comment by frizlab 1 day ago

That’s one of the reasons why I love Swift. Representing and using a state is easy and super readable. It is an incredibly good language.