Typical Programming Usage
- Someone writes a program in language $foo.
- Language $foo has some concept of import statements, or another existing package manager, or both.
- Sometimes you can just take these as hints.
- Sometimes they're load bearing and contain additional information (not just content references alone) and can't be elided.
- Often these are unhashed (so, super problematic).
- We want to populate the formula's input map with minimal user effort, and relegate the $foo language's package manager to picking up the files we present it with.
- Why?
- Using Warpforge (WF) inputs means better universal caching. (At best, the package manager can match what we do. But we do it with a global, language-agnostic view, and many language package managers are actually quite sloppy.)
- Using WF inputs means we can enforce the cache being read-only and content-addressed (CA). We know it can't get tainted.
- Using WF inputs means we can use our dependency selection systems — including the cool features like minimum-version-sprawl-selection or downstream autopropagation and testing.
- And we want to do all this reading from the language $foo norms because user adoption will be "thhhtppbt" if it takes any more work than that.
- But this is tricky...
- We need to figure out where to put the behavior at all (see "Two Major Options", below).
- Figuring out how to read the $foo language systems requires custom code per $foo.
- Not every $foo language has a system with enough information for us to do this offline, and some don't have enough to do it deterministically at all.
- Arguably this means we're contributing even more value! But it also means we're doing more work, and it's more interesting, which means more can go "wrong".
- We're generally still going to have users experiencing the language $foo tooling inside the build environment container...
- Which means we have to splay out the input mounting points in exactly the same way as that third party tool expects them.
- Which means we need yet more code in our tooling to understand those layouts.
- Sometimes this is fairly textual and straightforward.
- Sometimes it involves dealing with the third-party tool's hashing system (!).
- We're going to be desperately hoping we never find something that has a dumb enough folder layout that we can't trivially emulate it with our mounts.
- Have to also pass some config to the build container, sometimes.
- Maybe all the mounts have to go in a certain place in the homedir. Okay, great. But we don't support `~` in mount paths, so that's fun if the build container specifies a non-default user name or `$HOME`.
- Maybe the container needs some config (e.g. `$GOPATH`) and that just has to agree with where we put the mounts.
- If anything needs more CLI flags for this, ugh. Let's hope nothing's that dumb.
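
To make the mount-layout problem above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the shape of the `pinned` data, the `ware:` reference prefix, the `name@version` folder layout, and the `FOO_PKG_ROOT` variable are hypothetical and are not Warpforge's actual schema or any real package manager's convention. It only shows the idea of turning already-extracted, hashed dependencies into absolute mount paths, plus the matching container config.

```python
import posixpath


def build_input_map(pinned, home, layout_root):
    """Return {absolute mount path: content reference} for pinned dependencies.

    pinned: {package name: (version, content hash)} (hypothetical shape).
    home: absolute home directory inside the build container.
    layout_root: directory, relative to home, where the third-party tool looks.
    """
    # '~' is not supported in mount paths, so insist on an explicit absolute home.
    if not posixpath.isabs(home):
        raise ValueError(f"home must be an absolute path, got {home!r}")
    inputs = {}
    for name, (version, content_hash) in sorted(pinned.items()):
        if not content_hash:
            # Unhashed references can't be content-addressed; refuse rather than guess.
            raise ValueError(f"{name}@{version} has no content hash")
        # Hypothetical layout: one folder per package, named "name@version".
        mount = posixpath.join(home, layout_root, f"{name}@{version}")
        inputs[mount] = f"ware:{content_hash}"
    return inputs


def container_env(home, layout_root):
    """Environment the container needs so the tool agrees with the mounts."""
    return {
        "HOME": home,
        # Hypothetical stand-in for a variable like $GOPATH.
        "FOO_PKG_ROOT": posixpath.join(home, layout_root),
    }
```

The point of deriving both the mounts and the environment from the same two arguments is that they cannot disagree; a real tool would still need per-language code to produce `pinned` in the first place.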
Two Major Options
(Spoiler: approach #1, the host-run tool, seems to be the pretty clear winner at the moment.)
- Make this a tool users run on their host.
- Pro: nobody's surprised if this needs network.
- Con: just mild "ugh", really.
- Running on the host. Squick.
- Actually, no. Realistically, I think we might do this in containers too. Just ones with some mostly prebaked formulas that we're not really showing to the user.
- Con: desync is easy.
- But it's not super clear if that's actually problematic. Sometimes it could even be intentional.
- We could do best-effort checking for it. (Though it would be very very low priority on the roadmap.)
- Pro: people will see this as transitional tooling. Which is good, because hopefully it is.
- Con: we usually try to stay out of writing "layer 3" stuff.
- I guess the time for this policy has simply ended.
- Figure out how to make it into a formula-that-generates-more-formulas.
- Con: I seriously doubt this is even the desired user flow, unless users really do want the input extraction re-run on every build and can live with a story somewhere between complicated and impossible for Warpforge propagation tools overriding the result.
- Con: this kind of graph expander sucks for static analysis.
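
The best-effort desync check mentioned under the first option could look roughly like the sketch below. It assumes the host-run tool records a digest of the language manifest it read alongside the inputs it generated; the JSON shape and the `source_digest` field name are hypothetical, not an existing Warpforge format.

```python
import hashlib
import json


def manifest_digest(manifest_text):
    """SHA-256 of the language manifest the inputs were extracted from."""
    return hashlib.sha256(manifest_text.encode("utf-8")).hexdigest()


def generate(manifest_text, inputs):
    """Bundle generated inputs with the digest of the manifest they came from."""
    record = {"source_digest": manifest_digest(manifest_text), "inputs": inputs}
    return json.dumps(record, sort_keys=True)


def is_in_sync(manifest_text, generated_json):
    """True if the generated record was produced from this exact manifest text."""
    recorded = json.loads(generated_json).get("source_digest")
    return recorded == manifest_digest(manifest_text)
```

A check like this only detects that the manifest changed after generation. It cannot say whether the drift matters, which fits the note above that desync may sometimes be intentional, so a warning is probably more appropriate than a hard failure.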