Mapbox does this too, even as we've grown to 300 people. When I first joined, every single employee had to put themselves on the team page, going through the whole commit, test, and push process with the help of someone who knew how. We have repos for everything, and use issues for things most people would write an email about.
Not only is our website easier for non-technical folks to update, but it also means we have built-in transparency for much of the business. There are private repos for things that don't need to be aired company-wide, but public is the default.
> We have repos for everything, and use issues for things most people would write an email about.
I really like this.
When it comes to productivity in the enterprise, I'm quite fascinated by the tension between communication immediacy and persistence. We like things that are immediate when we speak, but like things that are persistent when we're digging around years later.
A searchable per-theme repo with searchable issues and dialogue sounds like a dream. I dunno if GitHub has this built-in, but their integration API is very nice, so direct email integration and notifications, or even local notification clients, should be highly solvable.
Of course, you have to work with people where reading/writing isn't an issue. I've worked in several environments where it seemed nobody could read, and nobody could fathom why what they wrote wasn't 100% clear.
Instead of teaching employees to use git, why don't you put all of your data files on S3 and enable versioning?
If the non-engineers are just using git to upload and version JSON files full of data without modifying the code that reads them, uploading to S3 seems like a much safer and easier way to accomplish the same goal (without building CRUD interfaces to handle it).
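A minimal sketch of that approach with boto3 (bucket and file names hypothetical):

    import boto3

    s3 = boto3.client("s3")

    # Versioning is a one-time bucket setting; after this, every PUT to an
    # existing key keeps the previous versions retrievable.
    s3.put_bucket_versioning(
        Bucket="my-data-bucket",
        VersioningConfiguration={"Status": "Enabled"},
    )

    # Re-uploading the same key now creates a new version rather than
    # overwriting; old versions stay listable and fetchable.
    s3.upload_file("products.json", "my-data-bucket", "data/products.json")
    versions = s3.list_object_versions(
        Bucket="my-data-bucket", Prefix="data/products.json"
    )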
> It is an incredible failure of modern version control systems that distributed branching means that they're impossible for novices to use.
No it isn't. This is entirely a sales problem.
You may not have noticed, but engineers typically can't use distributed branching either. The volume of tutorial and "i'm in trouble, plz hlp" material out there for Git (and really every other VCS) is substantial. Everyone has a story about how a certain workflow got them into trouble on a project.
The thing is, collectively we see a huge value in it, so we make the learning investment to use it. It's only after we learn it (and the pitfalls to avoid) that it's easy.
Novices don't use it because they haven't had the value proposition made to them yet. If you can reframe the argument for why, you'll get them using it.
It isn't a sales problem either. It's a tooling problem.
Git really ought to be treated as a low-level tool called by higher-level tools that assemble content management workflows: workflows designed by developers, usable by developers and non-developers alike.
Non-developers ought to be able to make 'commits' with git and pull requests without even realizing that git is the underlying tool.
Developers should just script their most common workflows and then use them 99% of the time.
Unfortunately the tooling that sits atop git is largely all shit and people have gotten it into their heads that one needs to have a deep understanding of git's quirks to be considered a 'serious' developer.
I agree with your premise, but don't understand your conclusion...
> Non-developers ought to be able to make 'commits' with git and pull requests without even realizing that git is the underlying tool.
Absolutely. But why 'ought'? What's stopping you?
Git has an API, and lots of tools use it. There are commercial products that wrap git for designers and such. Many of my systems are Git and GitHub enabled (via Octokit), and the users are none the wiser.
> Developers should just script their most common workflows and then use them 99% of the time.
I am almost positive this already exists in many various forms online (on github in particular), for personal needs. I know a significant number of projects script checkins, builds, etc.
It breaks down, IMO, because of environmental constraints. You're trying to maintain alias sets across machines, on colleagues' machines, or handle tricky merge issues per-project. At the end of the day the API's breadth is required to handle corner cases, and familiarity with that breadth makes a lot of the noise 'straightforward'.
I don't know of any decent tools. Something that would give an end user a nice UI to let them edit a markdown or JSON/YAML file and issue a pull request following a team defined workflow, for example? Or a WYSIWYG markdown editor that can issue pull requests following a team defined workflow at the click of a button?
>Git has an API
I noticed.
>I am almost positive this already exists in many various forms online
Most teams I've worked with don't have anything. One of the first things I do when I join a team is figure out their workflows (typically baked into people's heads) and get them down in a script.
>It breaks down, IMO, because of environmental constraints. You're trying to maintain alias sets across machines
A) git aliases are a poor substitute for a decent workflow program.
B) A lot of workflows integrate a lot of different things - e.g. by convention, branch names on new tickets often contain the JIRA ticket number followed by a slugified name. That means to properly script a workflow you need to make an API call to get a ticket number and slugify the name and use it to create the new branch.
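A rough sketch of the kind of workflow script being described (JIRA instance URL and function names hypothetical, error handling omitted):

    import re
    import subprocess

    import requests

    JIRA_URL = "https://yourcompany.atlassian.net"  # hypothetical instance


    def slugify(text):
        # Lowercase, collapse runs of non-alphanumerics into hyphens.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


    def start_ticket(ticket_key, auth):
        # Fetch the ticket summary via JIRA's REST API.
        issue = requests.get(
            f"{JIRA_URL}/rest/api/2/issue/{ticket_key}", auth=auth
        ).json()
        branch = f"{ticket_key}-{slugify(issue['fields']['summary'])}"
        # Branch off an up-to-date master, per the convention above.
        subprocess.run(["git", "checkout", "master"], check=True)
        subprocess.run(["git", "pull"], check=True)
        subprocess.run(["git", "checkout", "-b", branch], check=True)
        return branch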
> Git really ought to be treated as a low level tool that is called by a higher level tool
> Developers should just script their most common workflows and then use them 99% of the time.
FWIW, that is how git is organized: low-level "plumbing" commands for the content-addressable file system, and high-level "porcelain" commands for a VCS. Moreover, most git commands have human- and machine-readable variants, to facilitate abstraction.
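For instance, a quick sketch contrasting the two layers, driving git from Python:

    import subprocess


    def git(*args):
        return subprocess.run(
            ["git", *args], capture_output=True, text=True, check=True
        ).stdout

    # Porcelain: a human-oriented summary of the working tree.
    print(git("status"))

    # The machine-readable variant of the same command, meant for scripts
    # (the flag name is a historical joke).
    print(git("status", "--porcelain"))

    # Plumbing: resolve HEAD to a raw object id, with stable output for tooling.
    print(git("rev-parse", "HEAD"))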
Entirely a sales problem... but in your next line you mention that engineers have significant trouble using the current tools.
Sounds like it's not just a sales problem to get novices to use distributed branch management correctly, but maybe a design problem with the current tooling. If a tool requires significant experience to use correctly, then it is impossible for novices to use, since you must first turn them into experienced users.
My original comment was just to point out that GitHub is another UI on git repos, and thus might work around some of these design flaws.
The point is that all tools are hard if you don't have context about why they work the way they do. All of the tutorials are aimed at engineers, walking them through problems that they have and workflows that they'll want to use. Git Immersion is probably the best one I've ever seen and it's still aimed at budding Ruby developers.
Has anyone thought about doing this for content/marketing/design people? The tools that they use are hard too, but they know how because they were sufficiently motivated to learn them.
It's a big mistake for engineers to think that their jobs and the tools they use are significantly harder than in other fields.
> It's a big mistake for engineers to think that their jobs and the tools they use are significantly harder than in other fields.
I completely agree with everything you state in this comment, especially your last sentence. I have a strong sense that we have similar opinions but different ways of stating them.
> It's a big mistake for engineers to think that their jobs and the tools they use are significantly harder than in other fields.
Hmmm... ok, I'd agree it's a mistake to simply assume this. But I'm not going to dismiss the possibility, either. Consider this thesis: there is a constant degree of churn in software development that adds significant complexity to the task of software development relative to many other fields.
Is keeping up with javascript framework churn par for the course in other fields? It's very difficult to make that call, since few of us have worked in a wide variety of fields in a meaningful way (I worked in a law firm right after college, but I wouldn't say I can meaningfully comment on the practice of law). But I have a sneaking suspicion that our jobs actually are pretty difficult. We deal with what I feel comfortable calling a hellacious degree of technical complexity at times.
Outside of JavaScript frameworks, which is just one area of dev, there is _significantly less_ churn. You do not have to keep up with this churn to be a developer.
There's been a decent amount of change on the backend, too. I hesitate to call it churn because not all of it deserves to be characterized this way. I started working with backend technologies in the late 90s, and there was a lot of talk about building data warehouses, mainly with distributed SQL databases, often on Oracle or another high-cost proprietary solution. Because my work involved industrial engineering, we also did a lot of analytics and prediction; collaborative filtering was one interesting approach to recommendation systems back then.
Since then, the "noSQL" movement occurred, and while I'm delighted to see a resurgence of interest in SQL, mongo, Cassandra, various graph databases, and other non-sql data stores do have valuable uses (I think some people just went too far against SQL and the relational model, which also remains exceptionally valuable). I didn't name all of them, but while there was value in there, some of it was churn and adopting technologies for the hell of it. But even when it wasn't churn and was legit, that's still a lot to learn.
Also, Spark, Hive, Hadoop, and a lot of storage and infrastructure has moved to a cloud environment (AWS is the biggie, plenty of others). Again, not necessarily bad tech, but a lot to learn.
Also, in my own particular field (this may not apply to all backend developers), machine learning became a big factor in work, especially prediction systems. I did a lot of math in undergrad and grad school, but there was still plenty to learn, including dusting off the old math books and re-learning how gradient descent works for neural nets, logistic regression, and then other approaches like Bayesian methods, random forests, and so forth.
Also, in this time, the middle-tier technology you needed to know also changed, to a large extent for the better, but C++ gave way to Java, which grew into a mountain of a mess with EJB and other crap, then Ruby and Python, Rails and Django (for the record, I loved Ruby and Rails but don't work in them much anymore). If you want to do distributed computing, you can use Python, but you really might want to consider learning Scala.
Now, obviously you don't have to learn all this, you can specialize, and I wouldn't call it all churn (then again, I wouldn't call all the JavaScript framework stuff churn, either). But to work on the backend, you probably did have to keep up with a lot of that.
And like I said, I couldn't tell you what it's like to keep up with other fields... but yeah, I'm pretty comfortable saying that keeping up is pretty hard on the backend as well, and that learning these tools is no easy thing.
I don't see that the underlying concepts have changed very much in your examples, or that there are many (any) genuinely new ideas, say less than 25 years old, in there.
The tool isn't the skill, and honestly I don't want to work somewhere that values "x years using y" over "thoroughly understands concepts behind/around y".
I've been in IT since the late 90s, only development the last 5 years, but knowing how protocols work and what makes systems/architectures reliable gets way more play in my day to day job than knowing whatever the current tech is.
> It is an incredible failure of modern version control systems that distributed branching means that they're impossible for novices to use.
No, people are usually lazy about being precise when tracking transaction state of any sort (in conversation, finance, rational discussion, etc.), and moving this imprecision from a linear sequence into a tree, and from there into a forest, only compounds things.
Most of the problem with (d)VCS is that people are used to fuzzy transactional thinking, and furthermore need to learn a new vocabulary to interact with it.
What complexity is added by keeping data in the tree? If we are talking about GB+ datasets, then yes that shouldn't be stored in the tree. But a small list of products that needs to be manually updated regardless of where it is stored? That is perfectly fine to store in the tree and I think the author was talking about that sort of data.
As always, it depends on the data. Sometimes git is fine. Sometimes you need S3. Sometimes Postgres. There is no one size fits all solution.
> What complexity is added by keeping data in the tree?
Polluting the commit logs with thousands of extra commit messages if you're dealing with fast-changing data, for one.
If a developer is working on a "data file" that happens to contain logic (it's going to happen no matter how diligent you are about keeping them separate), dealing with the unnecessary merging caused by constant edits is another.
>That is perfectly fine to store in the tree and I think the author was talking about that sort of data.
The individual datasets may be small, but from reading the author's comments here, the scope of what they're storing this way seems to be way beyond what I'd feel comfortable managing like this.
We used this approach before we were able to integrate the CMS we needed, and it was fine. Our Product Owner would go into the YAML files on GitHub, click "Edit", make the changes, then hit the big green "Create Pull Request" button. GitHub actually makes this flow fairly easy and comes with sane default commit messages.
That's the niche that Box, Dropbox, GDrive, OneDrive and others are already living in. In fact, there's a new generation of tools offering far more value-adds for niche uses, like Zeplin for designers.
Branching, assuming that non-engineers are using a good GUI, is a really useful concept for people to be exposed to. It helps people take more intellectual risks safely and cheaply.
Because non-technical staff are now expected to access, interact with, and keep up to date with developer workflows in git. Seems like abstracting some of the higher-order functionality away, while adding project management amenities and making git friendly for those people, might be an opportunity.
You're essentially replacing a database and a CRUD interface, which doesn't have any of those features either.
I can't see how any of that is worth the extra time spent teaching git, or the extra complexity that comes along with merging your code and data into one repository.
A Python file with a list in it is significantly simpler than a remotely stored data file, which brings with it parsing, error handling, issues developing offline, and still requires an understanding of AWS permissions and some S3 teaching.
Git, and GitHub, at the level we teach them, are not very complex.
Many of these files in our case are imported in Python so need to be in the codebase and committed. Some of them are templates which also need to be in the repositories.
>Some of them are templates which also need to be in the repositories.
Letting people who are editing templates use git seems perfectly fine to me. Using git to version constantly changing data is what I think isn't maintainable.
I think at some point polluting the commit logs with data changes, dealing with merge issues, and training everyone who wants to change data on git imposes a hard limit on the scalability of this approach.
You may never hit that limit, but I don't think this would work for very large teams, and I suspect that if you get to the point where it stops working, it will be very difficult to sort out.
I agree, the point here is to push off the cost of building the interfaces and making the data fully dynamic until they are actually needed, because we often find they aren't.
As for the commit pollution, I don't see this as much of a problem. In the long term we're always going to have lots of commits!
Square peg in a round hole. It sounds like "non-engineers" means content managers, and they just invented a really crummy CMS. Surely they could at least use Jekyll or something and let the content managers use markdown. They're shifting the cost of dev time onto content management resources when they could just spend some cash on a commercial CMS like Contentful or Prismic that is easy to configure and far safer and more scalable.
We actually have various CMSs for content like blog posts and human readable text.
We use the solution in the article for things like the current delivery estimate, which changes twice a year and is a Python timedelta object (an example; the real one is more complex), or adding new tags to our support ticket system (not user facing). For those, we don’t want or need to spend the overhead of making an editing interface.
As for integrating another CMS, the whole point is that it’s just Python files in the repo, zero effort for developers, very little effort for everyone else, when editing is infrequent.
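For illustration, a simplified, hypothetical version of such a file:

    # delivery_estimates.py -- plain Python, imported directly by application
    # code, edited via GitHub's web editor roughly twice a year.
    from datetime import timedelta

    STANDARD_DELIVERY = timedelta(days=3)
    EXPRESS_DELIVERY = timedelta(days=1)

The application simply imports STANDARD_DELIVERY from it, so there's no parsing, no migration, and no editing UI to build.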
I find it interesting that you mention you are using Django, since my feeling has always been that easy/dumb/quick CMS CRUD interfaces has long been its bread-and-butter, especially the Django Admin.
That said, I don't question the benefits of version controlling data, and the usefulness of GitHub's web interface for quick and dirty commit work.
I am surprised that a quick search doesn't seem to turn up someone already trying to build a NetlifyCMS-like tool for inside the Django Admin. That seems like a natural fit for Django's CRUD philosophies.
We use a Django admin based CMS for part of our product, but most CMS solutions are about things like blog posts or knowledge bases, not for complex data models and changing configuration of an application (like our current shipping estimate). Building editing interfaces in Django is relatively straightforward, but by not doing it we can save hours, maybe a day, and that’s worth it if we’d only use the editing interface for 5 minutes a month, or are testing a new thing that we might throw away in a week.
We did exactly this and it's been great.
We started with our content editors just using WordPress and on one project we transitioned them to using Kramdown inside WordPress to build a Jekyll site. Once they were familiar with both, the next project we gave them Jekyll and they can enter content and meta info via Forestry.
Forestry has been a big help to us with support and quick turnaround for feature requests and it's been pretty smooth-sailing for our content team.
For internal documentation/knowledge-sharing, we started a MediaWiki.
Yep. We use MediaWiki for internal documentation, particularly for the support team, and we use FeinCMS (a Django CMS/blogging package) for our “tips” blog.
Unfortunately neither of these work for obscure internal data models that we have, or the time to integrate one isn’t worth it for an A/B test we might throw away next week (although if we decide to ship the A/B test we would take that time to integrate a CMS or add an editing interface on top of our DB).
I actually do like A/B tests in Git, especially if you have great commit messages and tags around this, because there really should be a historical record of all experiments that have been run.
Throwing them away is a tremendous waste of knowledge. We found this to be true with our marketing and that was the reason for the push to setting up a Wiki. Even if we don't retain code around an experiment, we have a documentation format for recording what we made, what our expectations were and our results.
I mean more about the data required for an A/B test.
For example, we want to test showing a set of images on a page. Some users get set A, some get set B, some get none, and images need to be changed weekly.
The list of image URLs might be in a file in git, and get changed by a product manager each week for the duration of the test.
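For illustration, that file might be as simple as this (names and URLs hypothetical):

    # homepage_image_test.py -- hypothetical data file for the experiment
    # above; the product manager edits the lists weekly via GitHub's editor.
    IMAGE_SETS = {
        "variant_a": [
            "https://cdn.example.com/looks/week-32/a-1.jpg",
            "https://cdn.example.com/looks/week-32/a-2.jpg",
        ],
        "variant_b": [
            "https://cdn.example.com/looks/week-32/b-1.jpg",
            "https://cdn.example.com/looks/week-32/b-2.jpg",
        ],
        "control": [],  # this group sees no images
    }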
The A/B test would be run by our A/B testing service, which stores historical tests and reports indefinitely.
We’d then come and move the URLs into the database and add an editing tool later on if we decided to ship the A/B test.
Yes, but the point is that now we don’t have to build a full CMS. Off the shelf ones won’t typically work for the sort of content being edited (i.e. it’s not blog posts etc).
But you do need an engineer to hit the merge button for you when a build is green.
This is okay now, but could become a major pain when your company grows and Nancy in Accounting who is an aspiring writer starts making _hundreds_ of copy changes with pull requests every week. You're going to really wonder where all your engineering time went when suddenly you have a dozen Nancy in Accountings...or a hundred.
Then somebody sits down with Nancy and reminds her what her job is. If she can't handle it still then you revoke her access or fire her for abusing business tools.
But she gets her job done and probably very well. The point is that you gave _everyone in the company_ the ability to contribute to the product.
If you do that, you are communicating to your employees that everyone is welcome to make contributions to the product. If not, why the f* would you do it?
Because you expect your employees to behave like adults? I assume you don't just give everyone access and set them free without any training or guidance. If someone can't handle a responsibility, you take away the responsibility and/or replace them.
I have access to pretty much every Confluence page at work, including ones other people set up for executives to look at, but I don't go around making changes and editing those pages just because I have access.
People aren't robots. You can hold them accountable for their actions and they can make their own (good or bad) choices.
Fair point. Currently we don’t encourage this across the whole company for copy changes. We mostly teach this to specific people who need to edit specific data.
Most CMSes suck at versioning, and in any case you create massive headaches for yourself by having multiple versioning systems that need synchronization on the same project.
Since you're using Python and you're trying to store configuration-style data in something that you might want a non-programmer to maintain, might I suggest storing the data as YAML and parsing it with my library?
I wrote it partly to make it easier for non-programmers to maintain structured data - which seems to be your use case. YAML is cleaner and easier to maintain than python dicts/lists/etc, and with a strictly validated schema you can trigger failures caused by accidental insertion of invalid data at an earlier stage.
Python has worked fairly well for us so far, though; we have tests that would detect invalid syntax, and importing a Python file is easier than parsing YAML/JSON and handling the errors.
Shouldn't be much more than 3-4 lines of code to import from YAML with this library, especially if you have a simple schema:
e.g.
    from pathlib import Path

    import strictyaml as sy

    MY_TRANSLATIONS_DICT = sy.load(
        Path("translations.yml").read_text(),
        sy.MapPattern(sy.Str(), sy.Map({"French": sy.Str(), "German": sy.Str()})),
    ).data
translations.yml:
    Hello:
      French: Bonjour
      German: Guten Tag
    Goodbye:
      French: Au revoir
      German: Auf wiedersehen
It would negate the need for most of those tests checking for invalid data, too (the above would raise an exception instantly if there is a YAML syntax error or if the data doesn't match the schema).
I convinced our BA to use git for internal documentation and meeting notes.
VSTS has a similar online text editor for editing and previewing markdown files.
I have a web hook that builds the markdown using mkdocs so we can have a nice web interface for others who do not have edit privileges.
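The hook itself can stay tiny; a minimal sketch using Flask (paths hypothetical, request verification omitted):

    import subprocess

    from flask import Flask

    app = Flask(__name__)


    @app.route("/webhook", methods=["POST"])
    def rebuild_docs():
        # Pull the latest markdown, then regenerate the static site.
        subprocess.run(["git", "-C", "/srv/docs-repo", "pull"], check=True)
        subprocess.run(["mkdocs", "build"], cwd="/srv/docs-repo", check=True)
        return "", 204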
VSTS recently released a wiki feature so I’m considering moving to that instead of mkdocs.
I think people see the value in wikis but it’s hard to sell git to someone who hasn’t used it before.
I’m always interested in this type of workflow, but the problem that no one seems to have solved is local editing of a draft document for someone who understands markdown but not git.
> ...the problem that no one seems to have solved is local editing of a draft document for someone who understands markdown but not git.
.bat or .sh files? :)
A decade ago I was transitioning a small dev shop to Subversion, and daily user #2 was a non-tech accountant who was a few years away from retirement...
I didn't teach her branching. I didn't teach her merging. I didn't teach her what a trunk was, or how slicing works. She got two batch files: "Get_Newest" and "Share_Work", and presto-chango she was contributing user documentation and keeping our help-files in order inside our development project.
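Roughly speaking, those two scripts were thin wrappers around two Subversion commands; a hypothetical reconstruction (in Python rather than the original batch files):

    import subprocess


    def get_newest():
        # "Get_Newest": pull everyone else's changes into the working copy.
        subprocess.run(["svn", "update"], check=True)


    def share_work():
        # "Share_Work": stage any new files, then commit everything.
        subprocess.run(["svn", "add", "--force", "."], check=True)
        subprocess.run(["svn", "commit", "-m", "Documentation update"], check=True)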
These days, UI competence willing, I'd just install VS Code or Atom with a markdown plugin and point it at the right GitHub repo. With smart branching and repo division I think it's a low-barrier way to get people committing via Git.
We tried the markdown support in TFS but it was not what we wanted and then we created a small tool using a static site generator. It builds the documentation site after someone pushes new content to the documentation folder. This worked well enough for all devs and some non-devs.
Don't be taking work out of my wheelhouse now! I learned django templates and how to make redundant rails models back in 2005 and I don't want the world to ever change, thank you very much!
While this is a much quicker solution in the short term, an engineer will have to context switch out of their work, watch the release go out and make sure nothing goes wrong—that all hurts productivity.
Perhaps it's less stressful and more natural for engineers to view deployment as part of "productivity." The assumption in these comments and many others I've seen across the industry is that engineers tend to view any work that requires context-switching away from coding as impeding their productivity. In startups you have to wear many hats even if you are an engineer, and it seems to me that maintaining the view that deployment is a drag on developer productivity is going to lead to a lot of unnecessary frustration. Why not just accept deployment as a part of "their work"?
>While this is a much quicker solution in the short term, an engineer will have to context switch out of their work, watch the release go out and make sure nothing goes wrong—that all hurts productivity.
Do you think this has caused a productivity increase? I would guess the additional overhead (unit tests for data files, training sessions for git, engineers doing code reviews, documentation, etc.) would cause more of a time suck.
I do like the idea of letting the stakeholder take care of it themselves on their schedule though without having to code up a GUI. The initial time investment for a GUI usually isn't that bad for us but the added maintenance is a huge PITA.
Take for example a scoreboard with targets on it. The simplest version is to have a dictionary of datetime to target. We would start with that, show the metric owner how to edit it, and let them maintain their targets. Won't need many tests, total time probably 30 mins max to get that solution set up (not including GitHub training, but that's amortized over multiple uses).
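Concretely, the whole 'backend' for that scoreboard can be a single file (values hypothetical, dates rather than datetimes for brevity):

    # targets.py -- edited directly on GitHub by the metric owner.
    from datetime import date

    WEEKLY_TARGETS = {
        date(2018, 1, 1): 120,
        date(2018, 1, 8): 135,
        date(2018, 1, 15): 150,
    }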
The alternative is building a data model, migration, editing interface which is probably at least 2 screens (list, edit) and 3 endpoints (list, modify, delete), needs validation on the input, needs to be put in a menu somewhere, might need a date input library on the frontend, etc.
This solution would probably end up being at least several hours all-in (code review, etc.), and that's for a simple data model that can easily be put in the database; there are others, with logic for example, that would need a lot more thought to get into the database. For an interface that might be used quarterly for 10 mins, it's not worth it.
Don't forget history, because someone will eventually make a mistake and a finger will need to be pointed :)
In my current org I would still lean on engineers making the config changes, but if we had a group of more tech-savvy stakeholders I could see this being useful. I hope you all post an update down the road on this process.
Nice to see something from Thread here. If anyone from the team is looking at the comments: my SO is eagerly awaiting the launch of the womenswear section.
I'm really digging all your engagement in these comments. I also like the attitude of the Thread engineering revealed in this piece. Very sensible dudes over there by the sounds of it. I'd savor the day when source control is common knowledge across many domains.
We also care deeply about diversity and are trying to improve our gender balance, so I'd strongly encourage any women or people of other groups traditionally under-represented in tech who might be reading this to apply, even if you don't feel you meet the requirements for the job.
GitHub needs a way to limit control to a directory or file in a repo. It's fine if a non-engineer changes text in a static HTML file, but they wouldn't know which files could possibly break tests.
It would also solve a lot of problems with mono-repos.
> GitHub needs a way to limit control to a directory or file in a repo.
Monolithic version control systems tend to do this very well.
In my head, a distributed system like Git really needs awesome, well-integrated submodule features to make this work. Last I tried, this was in-theory easy but in-practice hard with Git.
As-is I believe separate repos or custom commit scripts/hooks might get you farther :/
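As a sketch of the hook route: a client-side pre-commit hook can approximate per-path permissions (directory names hypothetical), though a client-side hook is only advisory; real enforcement needs a server-side pre-receive hook or separate repos.

    #!/usr/bin/env python
    # Reject commits that touch anything outside the allowed directories.
    import subprocess
    import sys

    ALLOWED_PREFIXES = ("content/", "data/")  # paths non-engineers may edit

    staged = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    blocked = [path for path in staged if not path.startswith(ALLOWED_PREFIXES)]
    if blocked:
        print("Off-limits paths in this commit:\n  " + "\n  ".join(blocked))
        sys.exit(1)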
All sorts. It could be a Python file with a dictionary of datetimes to targets for that week, it could be a file containing an enum of our suppliers, or (less a data file) it could be a template for a page on the site.
The point is that they are code, or a file directly loaded in, stored in source control, rather than dynamic data stored in a database that needs an editing interface.
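For instance, the supplier file might be nothing more than this (names hypothetical):

    # suppliers.py -- imported wherever supplier identity is needed; adding
    # a supplier is a one-line pull request.
    from enum import Enum


    class Supplier(Enum):
        ACME_APPAREL = "acme-apparel"
        NORTHWIND_SHOES = "northwind-shoes"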
This is a relatively common pattern I've come across over the years. Often used with translation files by people who are not coders. Sanity checking them in a pre-commit hook can be useful.
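A minimal sketch of such a hook, assuming a JSON translations file and a fixed set of languages:

    #!/usr/bin/env python
    # Pre-commit sanity check: every key must carry the same languages,
    # and every translation must be a string.
    import json
    import sys

    EXPECTED_LANGUAGES = {"French", "German"}

    with open("translations.json") as f:
        translations = json.load(f)

    for key, langs in translations.items():
        if set(langs) != EXPECTED_LANGUAGES:
            sys.exit(f"Missing/extra languages for {key!r}: {sorted(langs)}")
        if not all(isinstance(value, str) for value in langs.values()):
            sys.exit(f"Non-string translation under {key!r}")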