
I think you're dismissing a critical consideration:

There are systems that are impractical to formalize, either because they are too complex, too poorly-understood, too dynamic, too ephemeral, or too expensive to do so.

Your complaints about the words "architecture" and "design" sound like a cached rant. I agree that there's sometimes an uncomfortable amount of squishiness in the way we build technology systems, but I don't think that's always a bad thing! By and large, we are not specialists in our sciences; we are generalists working in the emergent systems created by the interactions of formal components. For many, this is the exciting part.

We borrow heavily (terminology and patterns) from building, manufacturing, various artistic disciplines, and from complex systems theory. This is inevitable of course -- but when I think of "systems design", I think more of Donella Meadows (Thinking in Systems) than Morris Mano (Computer Systems Architecture, Digital Design) ... though both are brilliant and responsible for hundreds of hours of my university career! (And the latter is a good example of why your objections to terminology are dead ends.)

Our industry builds squishy complex systems (products, applications, interfaces, software, firmware, some hardware) on top of rigid formalized primitives (algorithms, other hardware, physics). The area we inhabit is fertile ground for iteration, agility, and yes, charlatanism. I can understand why that might be bothersome, but I think it goes back to the old vitality-volatility relationship -- can't have one without the other. You may prefer a different part of the curve of course.

I have friends who graduated with Civil Engineering degrees, and work in their industry. Their projects are extremely formal and take decades to realize. This is appropriate!

I have other friends with Architecture degrees (the NAAB, Strength of Materials kind of architecture), who work in their industry. Their projects are a mix of formal and informal processes, and they take years to realize. This is also great.

Now obviously there's a lot of self-selection involved in these groups, but even still everyone has their set of frustrations within their industry. We in technology can iterate in an hour, or a few days for a new prototype PCB rev. This gives the industry amazing abilities, and I would never trade!



Except I never dismissed this point. The problem with your post is that you assume I categorize design as something useless. I have not. Obviously there are tons and tons of things within the universe where the only possible solution is design. I am not arguing against this.

I am talking about a very specific aspect of the usage of "design" within software. I am specifically complaining about the endless iterations of exposés on software design patterns and architecture. The trend where history continually repeats itself: FP becomes popular, then OOP becomes more popular than FP, then FP becomes popular again. Or the whole microservices/monoliths argument, where monoliths started out more popular, microservices overtook them, and now monoliths are coming back into vogue. Endless loops where nobody knows what the optimal solution is for a specific context.

These are all methods of software organization, AND my point is that THIS specific aspect of design is ripe for formalization, especially given the endless deluge of metaphor-drenched, pointless exposés on "design" inundating the HN front page. It's obviously an endless circle of history repeating itself. I am proposing a way to break the loop for a specific aspect of software by pointing out the distinction between "design" and "formal theory." We all know the loop exists because people confuse the two concepts and as a result don't even know what to do to optimize anything.

System architects are artisans, not scientists, and as a result they will suffer the same pointless shifts in artistic styles and trends, decade after decade and year after year, that their artisan peers do. It's totally ok for styles to shift, but in software our goal is also to converge on an optimum, and that's not currently happening in terms of software patterns and architecture.

The path out of this limbo is to definitively identify the method for optimization formally, not add to the teeming millions of articles talking about software design metaphors.


Any idea on how to formalize this area? Is anyone even trying to do that?


Personally, I'm nursing a thesis that the study of concurrency is fertile ground for a formalization of modular design. Where parallelism is the optimization of a software system by running parts of it simultaneously, concurrency has much more to do with the assumptions held by individual parts of the program, and how knowledge is communicated between them. Parallelism requires understanding these facets insofar as the assumptions need to be protected from foreign action -- or insofar as we try to reduce the need for those assumptions in the first place -- but I expect that concurrency goes much further.

Concurrent constraint programming is a nifty approach in this vein -- it builds on a logic programming foundation where knowledge only increases monotonically, and replaces get/set on registers with ask/tell on lattice-valued cells. LVars is a related (but much more recent) approach.
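
To make that concrete, here's a toy sketch of the ask/tell idea in Haskell (names and structure are my own, not the lvish/LVars API): tell joins new information into a grow-only cell, and ask blocks until a threshold of knowledge is reached. Because the contents only ever grow, anything ask observes stays true forever.

    import Control.Concurrent.STM
    import qualified Data.Set as Set

    -- A grow-only cell: its contents only move "up" the lattice
    -- (here, a set ordered by inclusion).
    newtype Cell a = Cell (TVar (Set.Set a))

    newCell :: IO (Cell a)
    newCell = Cell <$> newTVarIO Set.empty

    -- tell: join new information into the cell. Nothing is ever
    -- removed, so every past observation remains valid.
    tell :: Ord a => Cell a -> a -> IO ()
    tell (Cell v) x = atomically (modifyTVar' v (Set.insert x))

    -- ask: block until the cell has reached a threshold
    -- (here, "contains x"). Once this returns, it is true forever.
    ask :: Ord a => Cell a -> a -> IO ()
    ask (Cell v) x = atomically $ do
      s <- readTVar v
      if Set.member x s then pure () else retry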

A different approach, "session types", works at the type system level. Both ends of a half-duplex (i.e. turn-taking) channel have compatible (dual) signatures, such that one side may send when the other side may receive. Not everything can be modeled with half-duplex communications, but the ideas are pretty useful to keep in mind.
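
For a rough flavor of the duality idea, here's a type-level sketch I'm improvising (not any particular session-types library's encoding): a protocol is a list of steps, and the dual protocol flips every send into a receive and vice versa.

    {-# LANGUAGE DataKinds, TypeFamilies, KindSignatures #-}

    import Data.Kind (Type)

    -- A protocol is a sequence of steps; each step sends or receives a value.
    data Step = Send Type | Recv Type

    -- The dual protocol: wherever one side sends, the other receives.
    type family Dual (p :: [Step]) :: [Step] where
      Dual '[]               = '[]
      Dual ('Send a ': rest) = 'Recv a ': Dual rest
      Dual ('Recv a ': rest) = 'Send a ': Dual rest

    -- Example: a client sends a query and receives a count; the server's
    -- protocol is forced to be the mirror image.
    type ClientProto = '[ 'Send String, 'Recv Int ]
    type ServerProto = Dual ClientProto  -- '[ 'Recv String, 'Send Int ]

A real implementation would also tie these types to the channel operations, so that code which deviates from its protocol simply doesn't typecheck.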

I try to keep my software systems as functional as possible (where "functional" here means "no explicit state"). But there are always places where it makes sense to think in terms of state, and so I try to model that state monotonically whenever possible. At least subjectively, it's usually a lot simpler (and easier to follow) than unrestricted state.

(Note, of course, that local variables are local in the truest sense: other programmatic agents cannot make assumptions about them or change them. Short-lived, local state is as good as functional non-state in most cases.)


> I try to keep my software systems as functional as possible (where "functional" here means "no explicit state"). But there are always places where it makes sense to think in terms of state, and so I try to model that state monotonically whenever possible. At least subjectively, it's usually a lot simpler (and easier to follow) than unrestricted state.

Agreed. You mention LVars, so I'm curious what you think about MVars and STM in general. I've always been fond of STM because relational databases and their transactions are a familiar and well-understood concept the industry has historically used to keep state sane and maintain data integrity. SQLite is great, but having something that's even closer to the core language or standard library is even better.

It's part of why I like using SQL to do the heavy lifting when possible. I like that SQL is a purely functional language that naturally structures state mutations as transactions through the write-ahead log. My flavor of choice (Postgres) offers different trade-offs between read and write efficiency through its transaction isolation levels, which can give me up to full ACID consistency without having to reinvent the wheel with my own read and write semantics. If I structure my data model's keys, relations, and constraints properly, I get a production-strength implementation I can trust, with a lot of the nice properties you talk about, regardless of which service-layer language I stand up in front of it.

There's one exception in particular that I've seen begin to gain steam in the industry which I think is interesting, and that's Elixir. Because Elixir wraps around Erlang's venerable OTP (and its distributed database, mnesia), users can build on top of something that has already solved a lot of the hard distributed-systems problems in the wild, in a very challenging use case (telecom switches). Of course, mnesia has its own issues, so most of the folks I know using Elixir are using it with Phoenix + SQL. They seem to like it, but I worry about ecosystem collapse risk with any transpiled language -- no one wants to see another CoffeeScript.


I'm not especially familiar with either MVars or STM, so you'll have to make do with my first impressions...

MVars seem most useful for a token-passing / half-duplex form of communication between modules. I've implemented something very similar in Java, when using threads as coroutines. (Alas, Project Loom has not landed yet.) They don't seem to add a whole lot over a mutable cell paired with a binary semaphore. Probably the most valuable aspect is that you're forced to think about how you want your modules to coordinate, rather than starting with uncontrolled state and adding concurrency control after the fact.
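
For what it's worth, here's the shape I have in mind as a tiny Haskell sketch (improvised, just to show the token-passing flavor): two threads take turns, and the MVars act as both the mailbox and the turn marker.

    import Control.Concurrent
    import Control.Monad (forM_, replicateM_)

    main :: IO ()
    main = do
      request  <- newEmptyMVar   -- "your turn", plus the request payload
      response <- newEmptyMVar   -- "your turn again", plus the answer

      -- Worker: repeatedly take a request, reply, then wait for the next one.
      _ <- forkIO $ replicateM_ 3 $ do
             n <- takeMVar request        -- blocks until the caller hands over
             putMVar response (n * 2)     -- hand the token (and answer) back

      forM_ [1, 2, 3 :: Int] $ \n -> do
        putMVar request n                 -- give the worker the turn
        r <- takeMVar response            -- wait for the turn to come back
        print r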

STM seems very ambitious, but I struggle to imagine how to build systems using STM as a primary tool. Despite its advantages, it still feels like a low-level primitive. Once I leave a transaction, if I read from the database, there's no guarantee that what I knew before is true anymore. I still have to think about what the scope of a transaction ought to be.

Moreover, I get the impression that STM transactions are meant to be linearizable [1], which is a very strong consistency requirement. In particular, there are questions about determinism: if I have two simultaneous transactions, one of them must commit "first", before the other, and that choice is not only arbitrary, the program can evolve totally differently depending on that choice.
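
To illustrate that last point with a toy example (not how you'd structure real code): both transactions below are individually atomic, but the final value depends entirely on which one happens to commit first.

    import Control.Concurrent (forkIO, threadDelay)
    import Control.Concurrent.STM

    main :: IO ()
    main = do
      winner <- newTVarIO "nobody"

      -- Two transactions racing to claim an unclaimed slot. Each is
      -- atomic, but which one wins is arbitrary.
      let claim name = atomically $ do
            w <- readTVar winner
            if w == "nobody" then writeTVar winner name else pure ()

      _ <- forkIO (claim "A")
      _ <- forkIO (claim "B")

      threadDelay 100000                -- crude: give both threads time to run
      readTVarIO winner >>= putStrLn    -- prints "A" or "B" depending on the run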

There are some situations where this "competitive concurrency" is desirable, but I think most of the time, we want concurrency for the sake of modularity and efficiency, not as a source of nondeterminism. When using any concurrency primitive that allows nondeterminism, if you don't want that behavior, you have to very carefully avoid it. As such, I'm most (and mostly) interested in models of concurrency that guarantee deterministic behavior.

Both LVars and logic programming are founded on monotonic updates to a database. Monotonicity guarantees that if you "knew" something before, you "know" it forever -- there's nothing that can be done to invalidate knowledge you've obtained. This aspect isn't present in most other approaches to concurrency, be it STM or locks.

The CALM theorem [2] is a beautiful, relatively recent result identifying consistency of distributed systems with logical monotonicity, and I think the most significant fruits of CALM are yet to come. Here's hoping for a resurgence in logic programming research!

> There's one exception in particular that I've seen begin to gain steam in the industry which I think is interesting, and that's Elixir.

I've not used Elixir, but I very badly want to. It (and Erlang) has a very pleasant "functional core, imperative shell" flavor to it, and its "imperative shell" is like none other I've seen before.

[1] https://jepsen.io/consistency/models/linearizable

[2] https://rise.cs.berkeley.edu/blog/an-overview-of-the-calm-th...


There are many topics in this area. The ones that are well known in industry are algorithmic complexity theory and type theory. Less well known are the two resources below.

http://www4.di.uminho.pt/~jno/ps/pdbc.pdf

https://softwarefoundations.cis.upenn.edu

I suggest you get used to ML-style languages before diving into those two resources (Haskell is a good choice), as this stuff is not easy to learn, which I think is also part of the reason it hasn't been so popular in industry.

The first resource builds towards a Prolog-like programming style where you feed the computer a specification and the computer produces a program that fits the specification.

The second resource involves using a language with a type checker so powerful that the compiler can fully prove your program correct, beyond just its types.
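
As a small taste of the direction in Haskell (which only scratches the surface compared to what that book does in Coq), you can already push simple invariants into types so that whole classes of errors become unrepresentable:

    {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

    -- Natural numbers, promoted to the type level.
    data Nat = Z | S Nat

    -- A list that carries its length in its type.
    data Vec (n :: Nat) a where
      VNil  :: Vec 'Z a
      VCons :: a -> Vec n a -> Vec ('S n) a

    -- A total head: the type makes it impossible to call this on an
    -- empty vector, so there is no runtime check and no failure case.
    vhead :: Vec ('S n) a -> a
    vhead (VCons x _) = x

In Coq you go much further, proving arbitrary propositions about a program's behavior rather than just shape invariants like this.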

Both are far from the ideal the industry is searching for, but in terms of optimizing and formalizing design, these two resources are examples of the right approach to improving software design.

I haven't found anything specifically on the organization of software modules so as far as I know none exists. But given the wide scope of software research I'm sure at least one paper has talked about this concept.


> I haven't found anything specifically on the organization of software modules so as far as I know none exists.

Parnas (1971) is seminal on this topic. https://apps.dtic.mil/sti/pdfs/AD0773837.pdf


Sure this paper introduces the concept of modules in software back when modules were non-existent. As far as I know, there's nothing on the optimal way to organize (keyword) modules.


> Sure this paper introduces the concept of modules in software back when modules were non-existent.

No... that's not correct. The very first non-clerical page quotes from a textbook discussing modular systems, and Parnas himself notes a distinct lack of material on how to actually organize and break down the system into modules.

The paper is literally called "On the criteria to be used in decomposing systems into modules."

It is prudent to at least read the preexisting material if you are going to dismiss it.


My mistake, I skimmed it and saw assembly language and I assumed it was that really early seminal paper that introduced modules to the programming world. Obviously my guess was wrong.


> Your complaints about the words "architecture" and "design" sound like a cached rant.

To be completely fair, my notes were similarly cached, and it was a little poor of me not to respond more directly to the substance of the original post.


Mine wasn't. He's just making that up.



