busterarm · on Jan 4, 2018

> It is an incredible failure of modern version control systems that distributed branching means that they're impossible for novices to use.

No it isn't. This is entirely a sales problem.

You may not have noticed, but engineers typically can't use distributed branching either. The volume of tutorial and "i'm in trouble, plz hlp" material out there for Git (and really every other VCS) is substantial. Everyone has a story about how a certain workflow got them into trouble on a project.

The thing is, collectively we see a huge value in it, so we make the learning investment to use it. It's only after we learn it (and the pitfalls to avoid) that it's easy.

Novices don't use it because they haven't had the value proposition made to them yet. If you can reframe the argument for why, you'll get them using it.

crdoconnor · on Jan 4, 2018

It isn't a sales problem either. It's a tooling problem.

Git really ought to be treated as a low level tool that is called by a higher level tool that assembles content management workflows designed by developers that can be used by developers and non-developers alike.

Non-developers ought to be able to make 'commits' with git and pull requests without even realizing that git is the underlying tool.

Developers should just script their most common workflows and then use them 99% of the time.

Unfortunately the tooling that sits atop git is largely all shit and people have gotten it into their heads that one needs to have a deep understanding of git's quirks to be considered a 'serious' developer.

bonesss · on Jan 5, 2018

I agree with your premise, but don't understand your conclusion...

> Non-developers ought to be able to make 'commits' with git and pull requests without even realizing that git is the underlying tool.

Absolutely. But why 'ought'? What's stopping you?

Git has an API, and lots of tools use it. There are commercial products that wrap git for designers and such. Many of my systems are Git- and GitHub-enabled (Octokit), and the users are none the wiser.

> Developers should just script their most common workflows and then use them 99% of the time.

I am almost positive this already exists in many various forms online (on github in particular), for personal needs. I know a significant number of projects script checkins, builds, etc.

It breaks down, IMO, because of environmental constraints. You're trying to maintain alias sets across machines, on colleagues' machines, or handle tricky merge issues per-project. At the end of the day the API's breadth is required to handle corner cases, and familiarity with the breadth makes a lot of noise 'straightforward'.

crdoconnor · on Jan 5, 2018

>Absolutely. But why 'ought'? What's stopping you?

I don't know of any decent tools. Something that would give an end user a nice UI to let them edit a markdown or JSON/YAML file and issue a pull request following a team defined workflow, for example? Or a WYSIWYG markdown editor that can issue pull requests following a team defined workflow at the click of a button?

>Git has an API

I noticed.

>I am almost positive this already exists in many various forms online

Most teams I've worked with don't have anything. One of the first things I do when I join a team is figure out their workflows (typically baked into people's heads) and get them down in a script.

>It breaks down, IMO, because of environmental constraints. You're trying to maintain alias sets across machines

A) git aliases are a poor substitute for a decent workflow program.

B) A lot of workflows integrate a lot of different things - e.g. by convention, branch names on new tickets often contain the JIRA ticket number followed by a slugified name. That means to properly script a workflow you need to make an API call to get a ticket number and slugify the name and use it to create the new branch.
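As a rough illustration of point B, here is a minimal sketch of the branch-naming step. It assumes the ticket key and title have already been fetched from the tracker (that API call is left out), and the names `slugify`, `branch_name`, and `create_branch` are hypothetical helpers, not part of git or JIRA.

```python
import re
import subprocess
import unicodedata


def slugify(title: str, max_len: int = 40) -> str:
    """Lowercase, drop accents, and collapse non-alphanumerics into hyphens."""
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    # Truncate long titles, then drop any hyphen left dangling at the cut.
    return slug[:max_len].rstrip("-")


def branch_name(ticket_key: str, title: str) -> str:
    """Build '<TICKET>-<slug>'; the exact convention is an assumption."""
    return f"{ticket_key}-{slugify(title)}"


def create_branch(ticket_key: str, title: str) -> str:
    """Create and switch to the branch in the current repository."""
    name = branch_name(ticket_key, title)
    subprocess.run(["git", "checkout", "-b", name], check=True)
    return name
```

With this sketch, `branch_name("PROJ-123", "Fix login: timeout on Safari!")` gives `PROJ-123-fix-login-timeout-on-safari`. A real workflow script would wrap the tracker lookup and this step behind a single command.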

paulddraper · on Jan 7, 2018

> Git really ought to be treated as a low level tool that is called by a higher level tool

> Developers should just script their most common workflows and then use them 99% of the time.

FWIW, that is how git is organized: low-level "plumbing" commands for the content-addressable file system, and the high-level "porcelain" commands for a VCS. Moreover, most git commands have human and machine readable variants, to facilitate abstraction.
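To make the machine-readable point concrete, here is a small sketch that reads `git status --porcelain` (the stable, script-oriented output format) from a wrapper script. The `Change` type and both function names are hypothetical, and the parser assumes the simple v1 layout of two status characters, a space, and a path; renames and quoted paths are not handled.

```python
import subprocess
from typing import List, NamedTuple


class Change(NamedTuple):
    index: str     # status in the staging area (first column)
    worktree: str  # status in the working tree (second column)
    path: str


def parse_porcelain(output: str) -> List[Change]:
    """Parse 'XY path' lines as printed by `git status --porcelain`."""
    changes = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        changes.append(Change(line[0], line[1], line[3:]))
    return changes


def current_changes(repo_dir: str = ".") -> List[Change]:
    """Run git in repo_dir and return its pending changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)
```

A higher-level tool could call `current_changes()` and show non-developers a plain "3 files edited" summary without ever exposing git's own vocabulary.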

lurr · on Jan 6, 2018

or people could put in a little bit of effort to learn how to use their tool well enough.

I'm not against automating it, but introducing magic seems like the wrong approach to the problem.

(I mean for developers)

codemac · on Jan 4, 2018

Entirely a sales problem... but in your next line you mention that engineers have significant trouble using current tools.

Sounds like it's not just a sales problem to get novices to use distributed branch management correctly.. but maybe a design problem with the current tooling. If it requires significant experience to use a tool correctly, then it is impossible for novices to use as you must turn them into experienced users.

My original comment was just to point out that GitHub is another UI on git repos, and thus might work around some of these design flaws.

busterarm · on Jan 4, 2018

The point is that all tools are hard if you don't have context about why they work the way they do. All of the tutorials are aimed at engineers, walking them through problems that they have and workflows that they'll want to use. Git Immersion is probably the best one I've ever seen and it's still aimed at budding Ruby developers.

Has anyone thought about doing this for content/marketing/design people? The tools that they use are hard too, but they know how because they were sufficiently motivated to learn them.

It's a big mistake for engineers to think that their jobs and the tools they use are significantly harder than in other fields.

codemac · on Jan 5, 2018

> It's a big mistake for engineers to think that their jobs and the tools they use are significantly harder than in other fields.

I completely agree with everything you state in this comment, especially your last sentence. I have a strong sense that we have similar opinions but different ways of stating them.

busterarm · on Jan 5, 2018

Likewise, though I guess my style of commenting is a bit argumentative.

geebee · on Jan 5, 2018

> It's a big mistake for engineers to think that their jobs and the tools they use are significantly harder than in other fields.

Hmmm... ok, I'd agree it's a mistake to simply assume this. But I'm not going to dismiss the possibility, either. Consider this thesis: there is a constant degree of churn in software development that adds significant complexity to the task of software development relative to many other fields.

Is keeping up with javascript framework churn par for the course in other fields? It's very difficult to make that call, since few of us have worked in a wide variety of fields in a meaningful way (I worked in a law firm right after college, but I wouldn't say I can meaningfully comment on the practice of law). But I have a sneaking suspicion that our jobs actually are pretty difficult. We deal with what I feel comfortable calling a hellacious degree of technical complexity at times.

busterarm · on Jan 5, 2018

Outside of JavaScript frameworks, which is just one area of dev, there is _significantly less_ churn. You do not have to keep up with this churn to be a developer.

geebee · on Jan 6, 2018

There's been a decent amount of change on the backend, too. I hesitate to call it churn because not all of it deserves to be characterized this way. I started working with backend technologies in the late 90s, and there was a lot of talk about building data warehouses, mainly with distributed SQL databases, often in Oracle or another high cost private solution. Because my work involved industrial engineering, we also did a lot of analytics and prediction - collaborative filtering was one interesting approach to recommendation systems back then.

Since then, the "noSQL" movement occurred, and while I'm delighted to see a resurgence of interest in SQL, mongo, Cassandra, various graph databases, and other non-sql data stores do have valuable uses (I think some people just went too far against SQL and the relational model, which also remains exceptionally valuable). I didn't name all of them, but while there was value in there, some of it was churn and adopting technologies for the hell of it. But even when it wasn't churn and was legit, that's still a lot to learn.

Also, there's Spark, Hive, and Hadoop, and a lot of storage and infrastructure has moved to a cloud environment (AWS is the biggie, plenty of others). Again, not necessarily bad tech, but a lot to learn.

Also, in my own particular field (this may not apply to all backend developers), machine learning did become a big factor in work, especially prediction systems. I did a lot of math in undergrad and grad school, but there was still plenty to learn there, including dusting off the old math books and re-learning how gradient descent works for neural nets, logistic regression, and then other approaches like bayesian methods, random forests, and so forth.

Also, in this time, the middle tier technology you needed to know also changed - to a large extent for the better, but C++ gave way to Java, which grew into a mountain of a mess with EJB and other crap, then Ruby and Python, Rails and Django (for the record, I loved Ruby and Rails but don't work in it much anymore). If you want to do distributed computing, you can use Python, but you really might want to consider learning Scala.

Now, obviously you don't have to learn all this, you can specialize, and I wouldn't call it all churn (then again, I wouldn't call all the javascript framework stuff churn, either). But to work on the backend, you probably did have to keep up with a lot of that.

And like I said, I couldn't tell you what it's like to keep up with other fields... but yeah, I'm pretty comfortable saying that keeping up is pretty hard on the backend as well, and that learning these tools is no easy thing.

busterarm · on Jan 6, 2018

I don't see that the underlying concepts have changed very much from your examples, or that there are even many (any) new ideas (as in less than 25 years old) in there.

The tool isn't the skill, and honestly I don't want to work somewhere that values "x years using y" over "thoroughly understands concepts behind/around y".

I've been in IT since the late 90s, only development the last 5 years, but knowing how protocols work and what makes systems/architectures reliable gets way more play in my day to day job than knowing whatever the current tech is.

cat199 · on Jan 5, 2018

> It is an incredible failure of modern version control systems that distributed branching means that they're impossible for novices to use.

no, people are usually lazy about being extra precise about tracking transaction state of any sort (in conversation, finance, rational discussion, etc), and moving this imprecision from a linear sequence into a tree and from there into a forest only compounds things.

most of the problem with (d)VCS is that people are used to fuzzy transactional thinking, and furthermore, need to learn a new vocabulary to interact with it.

sarchertech · on Jan 4, 2018

They're still teaching them about branches, pull requests, commit messages, merging, and build statuses.

Not to mention that mixing data and code into the same versioning system adds complexity that I don't think is maintainable.

There is a reason we tend to version data separately from code.

rpedela · on Jan 4, 2018

What complexity is added by keeping data in the tree? If we are talking about GB+ datasets, then yes that shouldn't be stored in the tree. But a small list of products that needs to be manually updated regardless of where it is stored? That is perfectly fine to store in the tree and I think the author was talking about that sort of data.

As always, it depends on the data. Sometimes git is fine. Sometimes you need S3. Sometimes Postgres. There is no one size fits all solution.

sarchertech · on Jan 4, 2018

>What complexity is added by keeping data in the tree?

Polluting the commit logs with thousands of extra commit messages if you're dealing with fast changing data for one.

If a developer is working on a "data file" that happens to contain logic (it's going to happen no matter how diligent you are about keeping them separate), dealing with unnecessary merging from the constant edits is another.

>That is perfectly fine to store in the tree and I think the author was talking about that sort of data.

The individual datasets may be small, but from reading the author's comments here, the scope of what they're storing this way seems to be way beyond what I'd feel comfortable managing this way.

madeofpalk · on Jan 4, 2018

No, they’re not.

We’ve used this approach before we were able to integrate the required CMS, and it was fine. Our Product Owner would go into the YAML files in GitHub, click "Edit", make the changes, then hit the big green "Create Pull Request" button. GitHub actually makes it fairly easy to do this flow and comes with sane default commit messages.

sarchertech · on Jan 4, 2018

>No, they’re not.

The article explicitly mentions that they teach them these things.