Package management is a wicked problem

Posted by zdw 4 days ago

Comments

Comment by 8organicbits 8 hours ago

Andrew has been writing a ton of interesting blog posts related to package management (https://nesbitt.io/posts/). He's had some great ideas, like testing package managers similar to database Jepsen testing.

Comment by cbsmith 6 hours ago

Not to take credit away from Andrew for his ideas and writing, because at least he came up with the idea and wrote about it, but I don't understand how that idea of Jepsen style testing of package managers is a novel idea. Like... what testing would you want to do if you were building a package manager?

Comment by themafia 2 hours ago

Repositories require at least one but probably multiple additional semantic layers and client side filtering that can take advantage of it. Otherwise all you have is a large uncurated catalog with a "take it or leave it" strategy for clients.

There was a time when this was sufficient. We've moved well past that point.

Comment by mooracle 8 hours ago

cargo works because rust was young enough to be opinionated. try that with npm and enjoy your mass exodus to the next thing that will also betray you

"but bun!" — faster shovel, same hole

Comment by skrebbel 6 hours ago

NPM is plenty opinionated. For all its mistakes, it got lots of things uniquely right too. For example it’s very uncommon in JS land to have version conflicts (“dependency hell”). If two deps both need SuperFoo but different versions, NPM just installs both and things Generally Just Work. Exceptions are gross libraries with lots of global state (such as React) but fortunately those are very uncommon in JS land.

People love to complain about node_modules being a black hole but that size bought JS land an advantage that’s not very common among popular languages.
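A rough illustration of why two versions can coexist: Node looks for a dependency in the `node_modules` directory nearest to the requiring file and then walks up through parent directories. The sketch below models that lookup in Python over a made-up install tree. The package names (`dep-a`, `dep-b`, `superfoo`) and paths are hypothetical, and the real algorithm has more rules than shown here.

```python
from pathlib import PurePosixPath
from typing import Optional

# Toy install tree (hypothetical): directory path -> installed version.
# Two copies of "superfoo" live side by side.
INSTALLED = {
    "/app/node_modules/dep-a": "1.0.0",
    "/app/node_modules/dep-b": "1.0.0",
    "/app/node_modules/superfoo": "2.3.0",                     # top-level copy
    "/app/node_modules/dep-b/node_modules/superfoo": "1.4.0",  # nested copy
}


def resolve(from_dir: str, name: str) -> Optional[str]:
    """Walk up from from_dir, checking <dir>/node_modules/<name> at each level."""
    current = PurePosixPath(from_dir)
    while True:
        candidate = str(current / "node_modules" / name)
        if candidate in INSTALLED:
            return candidate
        if current == current.parent:  # reached the filesystem root
            return None
        current = current.parent


# dep-a falls through to the top-level copy; dep-b finds its own nested copy.
path_for_a = resolve("/app/node_modules/dep-a", "superfoo")
path_for_b = resolve("/app/node_modules/dep-b", "superfoo")
```

Each dependent gets the version it asked for, at the cost of storing the package twice.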

Comment by spankalee 6 hours ago

Yeah, npm never has "version lock" where it can't figure out a valid solution to the version constraints.

This is mostly good, but version lock does encourage packages to accept wide ranges of dependencies, and to update their dependency ranges frequently, instead of just sitting there on old versions.
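To make "version lock" concrete, here is a minimal sketch that assumes versions are plain integer tuples and ranges are half-open intervals. It is not how any real resolver is implemented. A resolver that must pick one version per package can come up empty, while a resolver that may pick one version per dependent cannot, as long as each range is satisfiable on its own. The function names and sample data are made up for illustration.

```python
from typing import List, Optional, Tuple

Version = Tuple[int, int, int]
Range = Tuple[Version, Version]  # (inclusive lower bound, exclusive upper bound)


def pick_single_version(available: List[Version], ranges: List[Range]) -> Optional[Version]:
    """Return the highest available version that lies inside every range, or None."""
    candidates = [v for v in available if all(lo <= v < hi for lo, hi in ranges)]
    return max(candidates, default=None)


def pick_per_dependent(available: List[Version], ranges: List[Range]) -> List[Optional[Version]]:
    """Choose a version separately for each requester (duplicates allowed)."""
    return [pick_single_version(available, [r]) for r in ranges]


# Hypothetical data: one dependent wants 2.x, another wants 1.x.
available = [(1, 4, 0), (2, 3, 0)]
wants_2x = ((2, 0, 0), (3, 0, 0))
wants_1x = ((1, 0, 0), (2, 0, 0))

flat = pick_single_version(available, [wants_2x, wants_1x])   # no single answer
nested = pick_per_dependent(available, [wants_2x, wants_1x])  # one answer each
```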

Comment by pjmlp 7 hours ago

And only to the extent it is a pure Rust codebase, add a few other languages to the mix, and it becomes a build.rs mess as well.

Comment by ragall 6 hours ago

Cargo doesn't work. I'm trying to use it in a monorepo and its caching story is horrible. The devs refused when I proposed to switch it to Bazel years ago and now they're regretting it.

Comment by finally7394 8 hours ago

I like that the author calls out the naming overloading, cause when I hear package management I think `pacman`, `winget`, and `apt`

Comment by pxc 8 hours ago

All three of those are "system package managers" (if you count winget as a package manager at all, which I would not). Pacman and APT are binary package managers while Homebrew is a source-based package manager. Cargo and NPM are language-specific package managers, which is a name I've settled on but don't love.

Imo there's an identifiable core common to all of these kinds of package managers, and it's not terribly hard to work out a reasonably good hierarchical ontology. I think OP's greater insight in this section is that internally, every package manager has its own ontology with its own semantics and lexicon:

> Even within a single ecosystem, the naming is contested: is the unit a package, a module, a crate, a distribution? These aren’t synonyms. They encode different assumptions about what gets versioned, what gets published, and what gets installed.

Comment by morpheuskafka 7 hours ago

The confusing part is that in many cases, end users are using NPM, pip, Go packaging, and to a lesser extent cargo etc to install finished end-user software. I've never written a line of JS but have installed all kinds of command line utilities with npm/npx.

Normally with a system package manager you would have a -lib package for using in your own code (or simply required by another package), a -src, and then a package without these suffixes would be some kind of executable binary.

But with npm and pip, I'm never sure whether a package installs binaries or not, and if it does, is it also usable as a library for other code or is it compiled? (Homebrew as you mentioned is source based but typically uses precompiled "bottles" in most cases, I think?) And then there is some stuff that's installed with npm but is not even javascript like font packages for webdev.

The other interesting thing about these language package managers is that they completely eliminate the role of the distribution in packaging a lot of end user software. Which is ironic, since in the oldest days you would download a source tarball and compile it yourself. So I guess it's just a return to that approach but with go or cargo replacing wget and make.

Comment by cozzyd 7 hours ago

And plenty of people use pip for programs not even written in python!

Comment by RetroTechie 7 hours ago

> Imo there's an identifiable core common to all of these kinds of package managers (..)

Indeed. It's hard to see why eg. a prog language would need its own package management system.

Separate mechanics from policy. Different groups of software components in a system could have different policies for when to update what, what repositories are allowed etc. But then use the same (1, the system's) package manager to do the work.

Comment by nacozarina 4 days ago

Naming things, cache invalidation, and off-by-one errors: package management heavily emphasizes the hardest ‘blue-collar’ problems in CS.

Comment by bradgessler 6 hours ago

Today, sales and marketing are the two hardest problems in computer science.

Comment by dizhn 8 hours ago

Feature creep and not invented here too. (Bikeshedding?)

Comment by taeric 7 hours ago

I confess "not invented here" is a problem I think too many people focus on. Lots of things are redone all of the time.

That said, feature creep is absolutely a killer. And it is easy to see how these will stack on each other where people will insist that for this project, they need to try and reinvent the state of the art in solvers to get a product out the door.

Comment by iberator 7 hours ago

This is a stupid and unproven quote. Citation needed. I hate that HN is repeating this over and over, and it's not even a real, funny, or new joke.

Try to say that at job interview if you don't believe

Comment by pixl97 7 hours ago

And to add further to the joke here the full saying goes more like

>There are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one errors.

And, if you actually work in software a very large portion of your hard to troubleshoot/fix issues are going to be the above.

Comment by troupo 6 hours ago

It's not DNS

It can't be DNS

There's no chance in hell it's DNS

...

It was DNS

Comment by swiftcoder 5 hours ago

DNS is a special hell: naming things and caching rolled into one!

Comment by AlotOfReading 7 hours ago

It's not to be taken as a serious assessment of actual "hardest problems", but they're all difficult. Naming things is obviously impossible. Everyone gets cache invalidation wrong at first, from Intel/AMD to your build system.

Comment by swiftcoder 7 hours ago

> Try to say that at job interview if you don't believe

If your interviewer doesn't at least crack a smile when you make the off-by-one joke, run, do not walk, to the nearest exit. You don't want to work with that dude

Comment by anyonecancode 6 hours ago

Well, there's the variation I heard recently:

There are only two problems in computer science. We only have one joke, and it's not very funny.

Comment by bena 7 hours ago

Naming things is one of the hardest problems we have. In general. Taxonomy is incredibly difficult because it is essentially classification.

And things never fit neatly into boxes. Giving us such bangers as: Tomatoes are fruit; Everything is a fish or nothing is a fish; and Trees aren't real.

Comment by lo_zamoyski 7 hours ago

To spell it out for you...

1. It's a joke. The hyperbole is intentional, but it does communicate something relatable.

2. You don't need a citation. Probably anyone with enough software development experience understands the substance of the claim and understands that it is (1).

Comment by razingeden 6 hours ago

In case you need to hear this again,

> “Sarcasm is difficult to grasp on the internet, but some people apparently have more visceral reactions to their misunderstanding than others.”

Comment by antonvs 5 hours ago

Yes, and we also need a citation about that quote about a horse and a duck walking into a bar. It doesn't sound very likely to me.

Martin Fowler has some history of this joke: https://martinfowler.com/bliki/TwoHardThings.html

Comment by DarkNova6 8 hours ago

Is it not curious that languages known for their rigor have solid package manager/build tools while the remaining languages do not?

This is not a technical problem. It’s a cultural one.

Comment by no_wizard 8 hours ago

I don’t think those have much to do with it.

Certainly Go is a more rigorous language than say JavaScript but its package management was abysmal for years. It’s not even all that great now.

C/C++ is the same deal. The way it handles anything resembling packages is quite dated (though I think Conan has attempted to solve at least some of this)

I think Cargo and others have the hindsight of their peers, rather than it being due to any rigorous attribution of the language

Comment by the__alchemist 5 hours ago

Concur: C and C++ are a great example of being both used for rigorous uses, but building/packaging being a mess. And I think the big adv Cargo/Rust has is learning from past mistakes, and taking good ideas that have come up; discarding bad.

Comment by pjmlp 7 hours ago

And vcpkg, not only Conan.

Comment by DarkNova6 2 hours ago

I was mostly having typical application programming languages in mind such as C# and Java. Go doesn't exactly fit that bill and I've seen it be used more for technical plumbing that needs a good concurrency model. And Maven isn't exactly new.

Frankly, PHP also has a very good package manager with Composer. In general, PHP has made surprisingly good and sane decisions for the language and has extremely solid support for static typing by now.

But yeah, Cargo definitely had the benefit of Hindsight.

Comment by AnthonyMouse 6 hours ago

Tacking package management onto a language is feature creep to begin with. You can pretty obviously have a program in one language that uses a library or other dependency written in a different one.

The real problem is that system package managers need to be made easier to use and have better documentation, so that everyone stops trying to reinvent the wheel.

Comment by bee_rider 7 hours ago

Yes, we can even see—the languages with the best culture and superior rigor have the best package manager: C and Fortran, which just use the filesystem and the user to manage their packages.

Comment by pklausler 6 hours ago

https://fpm.fortran-lang.org

Comment by DarkNova6 2 hours ago

I mean, those languages have the literal culture of "Skill Issues" baked in. I would be very careful with that statement.

Comment by meisel 6 hours ago

This all just sounds like problems we see when making new features, of any sort, for customers. A feature is never objectively done, there are many opinions on its goodness or badness, once it’s released its mistakes can last with it, etc.

If this is a wicked problem, then so is much of other real-world engineering.

Comment by fridder 6 hours ago

Honestly just look at the dismal history of Python and package management. easy_install, setuptools, pip(x), conda, poetry, uv. Hell I might even be missing one.

Comment by the__alchemist 6 hours ago

uv (and a similar tool I built earlier) does solve it. With the important note: this was made feasible by standardizing on pyproject.toml and wheel files, and by being able to compile a different wheel for each OS/arch combo and have the correct one downloaded and installed automatically. And in the case of Linux, the manylinux target. I think the old Python libs that did arbitrary things in setup.py were a lost cause.
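As a simplified sketch of the "correct wheel per OS/arch" idea: a wheel filename ends in three dash-separated tags (Python, ABI, platform), and an installer picks the first file matching a tag triple it supports, in preference order. The code below assumes uncompressed tag sets and a hypothetical package called `demo`. It is not how uv or pip implement selection, and the helper names are made up.

```python
from typing import List, Optional, Tuple

Tags = Tuple[str, str, str]  # (python tag, abi tag, platform tag)


def parse_wheel_tags(filename: str) -> Tags:
    """Return the last three dash-separated fields of a wheel filename."""
    stem = filename[: -len(".whl")]
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return (python_tag, abi_tag, platform_tag)


def choose_wheel(filenames: List[str], supported: List[Tags]) -> Optional[str]:
    """Return the first wheel matching `supported`, which is ordered best-first."""
    by_tags = {parse_wheel_tags(name): name for name in filenames}
    for tags in supported:
        if tags in by_tags:
            return by_tags[tags]
    return None


# Hypothetical index contents for one release of "demo".
wheels = [
    "demo-1.0-cp312-cp312-manylinux_2_17_x86_64.whl",
    "demo-1.0-cp312-cp312-macosx_11_0_arm64.whl",
    "demo-1.0-py3-none-any.whl",
]

# Assumed preference lists for two machines: native build first, pure Python last.
linux_x86 = [("cp312", "cp312", "manylinux_2_17_x86_64"), ("py3", "none", "any")]
windows_x64 = [("cp312", "cp312", "win_amd64"), ("py3", "none", "any")]

linux_choice = choose_wheel(wheels, linux_x86)
windows_choice = choose_wheel(wheels, windows_x64)
```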

Comment by fridder 6 hours ago

I hope it solves it, but I've seen that stated before

Comment by nylonstrung 1 hour ago

I think uv has genuinely permanently solved python package management as well as could be possible in 2026

None of the other pip replacements were actually good software like uv

Comment by the__alchemist 5 hours ago

Hah yea I agree with that mindset. Poetry, Pipenv, pyenv, venv and Conda were all fakers for me!

Comment by pxc 8 hours ago

All this, and yet package management is still so much better than managing software any other way, and there are continually real advancements both in foundations and in UX. It is indeed full of wicked problems in a way that suggests there can be no clear "endgame". But it's also a space where the tools and improvements to them regularly make huge positive differences in people's computing experiences.

The uneven terrain also makes package managers more interesting to compare to each other than many other kinds of software, imo.

Comment by mystraline 8 hours ago

It is and isn't.

Version hell is a thing. But Nix's solution is to trade storage space for solving the version problem.

And I think it's probably the right way to go.
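A toy sketch of the storage-for-isolation trade: if every package is installed under a directory named after a hash of its name, version, and inputs, then any number of versions can sit side by side without colliding. The `/store` prefix, the hashing scheme, and the package names below are assumptions for illustration and do not reproduce Nix's actual store path computation.

```python
import hashlib
from typing import Iterable


def store_path(name: str, version: str, inputs: Iterable[str] = ()) -> str:
    """Derive a store directory from a package's identity and its dependency paths."""
    digest = hashlib.sha256()
    for part in (name, version, *sorted(inputs)):
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")  # separator so ("ab", "c") and ("a", "bc") hash differently
    return f"/store/{digest.hexdigest()[:16]}-{name}-{version}"


# Two versions of a hypothetical library get two distinct directories.
foo_old = store_path("superfoo", "1.4.0")
foo_new = store_path("superfoo", "2.3.0")

# An app's path depends on which library build it was made against,
# so changing a dependency yields a new app path instead of overwriting the old one.
app_against_old = store_path("app", "1.0", [foo_old])
app_against_new = store_path("app", "1.0", [foo_new])
```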

Comment by nitwit-se 7 hours ago

Agreed - Nix feels very well thought through.

I found Eelco Dolstra's doctoral thesis (https://edolstra.github.io/pubs/phd-thesis.pdf) to be a great read and it certainly doesn’t paint the picture of a wicked problem.

Comment by tonyhart7 6 hours ago

so what is the "best" package manager humankind have right now ?????

Comment by nylonstrung 57 minutes ago

Nix

Comment by the__alchemist 6 hours ago

GPOS software: Static-linked executables

Programming languages: Cargo

Comment by pydry 8 hours ago

I don't really agree. Package management has a number of pretty well defined patterns (e.g. lockfiles, isolation, semver, transactionality, etc) which solve use cases that are largely common across package managers.
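To illustrate the lockfile pattern specifically, here is a minimal sketch that assumes a lockfile is just a mapping from package name to an exact version plus a content hash. The format and function names are made up and do not match any particular tool. An install is accepted only if both the version and the bytes match what was locked.

```python
import hashlib
from typing import Dict, Tuple

Lock = Dict[str, Dict[str, str]]


def make_lock(artifacts: Dict[str, Tuple[str, bytes]]) -> Lock:
    """Record the exact version and SHA-256 of every resolved artifact."""
    return {
        name: {"version": version, "sha256": hashlib.sha256(data).hexdigest()}
        for name, (version, data) in sorted(artifacts.items())
    }


def verify(lock: Lock, name: str, version: str, data: bytes) -> bool:
    """Accept a download only if it matches the locked version and hash."""
    entry = lock.get(name)
    if entry is None or entry["version"] != version:
        return False
    return entry["sha256"] == hashlib.sha256(data).hexdigest()


# Hypothetical resolved set: package name -> (version, archive bytes).
lock = make_lock({"superfoo": ("2.3.0", b"original archive bytes")})

same_bytes_ok = verify(lock, "superfoo", "2.3.0", b"original archive bytes")
tampered_ok = verify(lock, "superfoo", "2.3.0", b"republished archive bytes")
```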

It is unfortunately one of the most thankless tasks in software engineering, so these are not applied consistently.

This was symbolized quite nicely by google pushing out a steaming turd of a version 1 golang package management while simultaneously putting the creator of brew in the no hire pile coz he couldn't reverse a binary tree.

In this respect it is a bit like QA - neglected because it is disrespected.

What makes it seem like a wicked problem is probably that it is the tip of the software iceberg.

It is the front line for every security issue and/or bug, especially the nastiest class of bug - "no man's land" bugs where package A blames B for using it incorrectly and vice versa.

Comment by cxr 7 hours ago

Every package manager lock file format or requirements file is an inferior, ad hoc, informally-specified, error-prone, incompatible reimplementation of half of Git.

Supply chain vulnerabilities are a choice. It's a problem you have to opt in to.

<https://news.ycombinator.com/item?id=46008744>

Comment by spankalee 5 hours ago

There is actually a huge difference between checking in all of your dependencies and checking in a lock-file. Some people work with hundreds of repositories on their local machine and checking in dependencies would lead to massive bloat. It really only works if you primarily work in a single monorepo.

Comment by cxr 5 hours ago

> It really only works if you primarily work in a single monorepo.

That's simply not true; it doesn't come down to "monorepo-or-not?"

It comes down to whether or not the code size of an app's dependencies and transitive dependencies is still reasonable or has gotten out of control.

The trend of language package managers to store stuff out of repo (and their recent, reluctant adoption of lockfiles to mitigate the obvious problems this causes*) is and always has been designed to paper over the dependency-size-is-out-of-control problem—that's the reason that this package management strategy exists.

You can work on dozens of projects (unrelated; from disjoint domains) that you maintain or contribute to while having all the source for every library/subroutine that's needed to be able to build the app all right there, checked into source control—but it does mean actually having a handle on things instead of just throwing caution to the wind and sucking down a hundred megabytes or more of simultaneously over- and under-engineered third-party dependencies right before build time.

It's no different from, "Our app consumes way too much RAM", or, "We don't have a way to build the app aside from installing a monstrously large IDE" (both belonging to the category of, "We could do something about it if we cared to, but we don't.")

> There is actually a huge difference between checking in all of your dependencies and checking in a lock-file.

Yes, huge difference indeed: the hugeness of YOLO maintainers' dependency trees.

* what could possibly go wrong if we devise a scheme to subvert the operations of a tool where the entire purpose of it was to be able to unambiguously keep track of the revisions/content of the source tree at a given point in time?

Comment by hansvm 7 hours ago

Assuming the binary tree thing is the whole story, that still doesn't sound like a terrible choice on Google's part. Your first few years at Google you won't have enough leeway to do something like "make homebrew," and you will have to interact with an arcane codebase.

For tree reversal in particular, it shouldn't be any harder than:

1. If you don't know what a binary tree is then ask the interviewer (you probably _ought_ to know that Google asks you questions about those since their interview packet tells you as much, but let's assume you wanted to wing it instead).

2. Spend 5-10min exploring what that means with some small trees.

3. Then start somewhere and ask what needs to change. Clearly the bigger data needs to go left, and the smaller data needs to go right (using an ascending tree as whatever small example you're working on).

4. Examine what's left, and see what's out of order. Oh, interesting, I again need to swap left and right on this node. And this one. And this one.

5. Wait, does that actually work? Do I just swap left/right at every node? <5-10min of frantically trying to prove that to yourself in an interview>

6. Throw together the 1-5 lines of code implementing the algorithm.
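For reference, the swap-at-every-node idea from steps 3 to 6 can be sketched as follows, assuming a simple `Node` class (hypothetical, not from any interview packet). The in-order traversal is only there to check the result: inverting an ascending search tree should produce a descending one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def invert(node: Optional[Node]) -> Optional[Node]:
    """Swap the left and right children of every node, in place."""
    if node is None:
        return None
    node.left, node.right = invert(node.right), invert(node.left)
    return node


def in_order(node: Optional[Node]) -> List[int]:
    """Left subtree, then the node itself, then the right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


# A small ascending tree used as the worked example.
tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
before = in_order(tree)
after = in_order(invert(tree))
```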

It's a fizzbuzz problem, not a LeetCode Hard. Even with significant evidence to the contrary, I'd be skeptical of their potential next 1-3 years of SWE performance with just that interview to go off of.

That said, do they actually know that was the issue? With 4+ interviews I wouldn't ordinarily reject somebody just because of one algorithms brain-fart. As the interviewer I'd pivot to another question to try to get evidence of positive abilities, and as the hiring manager I'd consider strong evidence of positive abilities from other interviews much more highly than this one lack of evidence. My understanding is that Google (at least from their published performance research) behaves similarly.

Comment by iberator 7 hours ago

apt-get solved this 'problem' like 25 years ago.

Comment by EvanAnderson 7 hours ago

RPM "solved" it too.

I hate package management so much. I hate installing unnecessary cruft to get a box with what I want on it.

It makes me pine for tarballs built on boxes w/ compilers installed and deployed directly onto the filesystem of the target machines.

Edit: I'd love to see package management abstracted to a set of interfaces so I could use my OS package manager for all of the bespoke package management that every programming language seems hell-bent on re-implementing.

Comment by dzr0001 6 hours ago

I think there's a fundamental difference between programming language repos and package repositories like the official RPM, deb, and ports trees.

These (typically) operating system repos have oversight and are tested to work within a set of versions. Repositories with public contribution and publishing don't have any compatibility guarantees, so the cruft described in the article must be kept indefinitely.

Unfortunately, I don't think abstracting those repositories to work within the OS package ecosystem would solve that problem and I suspect the package manager SAT solvers would have a hard time calculating dependencies.

Comment by EvanAnderson 5 hours ago

I agree re: the fundamental difference when it comes to compiled languages. I wrote rashly and out of frustration without thinking about it too deeply.

re: interpreted languages, though, I think it's still a shit show. I don't want to run "composer" or "npm" or whatever the Ruby and Python equivalents are on my production environment. I just want packages analogous to binaries that I can cleanly deploy / remove with OS package management functionality.

Comment by themafia 2 hours ago

> It makes me pine for tarballs built on boxes w/ compilers installed and deployed directly onto the filesystem of the target machines.

You're effectively describing Gentoo.

Just a personal opinion but it's awesome.

Comment by Am4TIfIsER0ppos 7 hours ago

Isn't it `apt` these days?

Comment by droopyEyelids 6 hours ago

Your parent comment is referring to its inception, 25 years ago.