I don't know how many hours the author has spent/wasted on this topic (and how many people are now wasting their time coming up with other solutions that in the end do NOT fix the real problem), but...
Quite frankly, I find this whole "look i did a horrible hack and let's see who can make the best worst horrible hack" thing quite stupid and silly.
GNU Make is free software released under a free license; in my opinion, instead of doing that crazy thing, the author could have just written a patch for GNU Make to make it export a "JOBS" environment variable to all its child processes.
Oh but yes, "I felt this feature was missing so I added it" is way way way less cool than "geez the gnu make folks are insane lollerplex they have no way to know how many jobs they're running".
> I don't know how many hours the author has spent/wasted
> on this topic
The author, John Graham-Cumming, has been extremely active on the GNU Make mailing list, is the author of the "GNU Make Standard Library" of supplemental functions for Make, has written at least 2 books on GNU Make, and has developed commercial products that integrate with GNU Make (namely, "Electric Make").
That is to say: He is one of the world experts on GNU Make.
I'm confident that he did this more "for fun" than because he actually wanted to get the number of jobs.
Do you know if this person hangs out on IRC? I asked this precise question 10 days ago in #emacs, and I was curious to see my exact question answered, almost the way I phrased it.
Ignoring the fact that I doubt that the original post was serious:
Of course you can patch the software, and in the long term that will be the best solution. But it can take a long time until that patch actually makes it into the main repo, and even longer until it is in the default `make` on all major distributions. So if you want to use this feature right now, writing a patch is useless for you, unless you only ever use it on your own machine.
The big advantage of the workaround is that it actually works with current versions of make.
Oh, boy. Take a chill pill man! Is no one allowed to do anything for fun any more?
It took me about 15 minutes to come up with the solution. I wrote it up for fun because it illustrates some of the functionality of GNU make that people might not be aware of (order-only prerequisites, $(eval)) and using $(call) in a recursive fashion.
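For anyone curious, here is a tiny sketch of the features mentioned (not the post's actual makefile; the names are illustrative):

    # Recursive $(call): a function that calls itself to reverse a word list.
    reverse = $(if $(1),$(call reverse,$(wordlist 2,$(words $(1)),$(1))) $(firstword $(1)))
    $(info $(call reverse,a b c))        # prints "c b a" (give or take whitespace)

    # $(eval) creates a rule at parse time; the build directory after the '|'
    # is an order-only prerequisite, so its timestamp never forces a rebuild.
    $(eval out/result.txt: src.txt | out ; cp $$< $$@)

    out:
    	mkdir -p $@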
You can use relative paths in most places automake accepts file names. The only time I've run into trouble using non-recursive automake (1.13.4) is adding flex+bison source. This requires BUILT_SOURCES, which is confusing, and automake is very picky and non-obvious about the names of the generated source files. I think it may not have been updated to support relative paths yet? It required some heavy workarounds:
AM_YFLAGS = -d
# this was VERY important
# (must be abs_top_srcdir, not top_srcdir)
foo_LFLAGS = --header-file=$(abs_top_srcdir)/src/xyzzy_lexer.h
foo_SOURCES += src/xyzzy_lexer.l src/xyzzy_parser.y
EXTRA_DIST += src/xyzzy_lexer.h
# note the inconsistency here
BUILT_SOURCES += src/foo-xyzzy_lexer.c \
                 src/xyzzy_parser.h
# ...and this doesn't match BUILT_SOURCES
MAINTAINERCLEANFILES += src/foo-xyzzy_lexer.c \
                        src/xyzzy_lexer.h \
                        src/xyzzy_parser.h
Even with this annoyance, I would still highly recommend using this style instead of the traditional (recursive) automake. Autotools Mythbuster has more information on this and other modern autotools features.
"Why the tab in column 1? Yacc was new, Lex was brand new. I hadn't tried either, so I figured this would be a good excuse to learn. After getting myself snarled up with my first stab at Lex, I just did something simple with the pattern newline-tab. It worked, it stayed. And then a few weeks later I had a user population of about a dozen, most of them friends, and I didn't want to screw up my embedded base. The rest, sadly, is history."
If you don't care about compatibility with existing Makefiles, you can do this already with GNU make; just set ".RECIPEPREFIX" to some other character.
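For example (a small sketch; .RECIPEPREFIX appeared in GNU make 3.82):

    # Use '>' instead of a leading tab for recipe lines.
    .RECIPEPREFIX := >

    hello:
    > @echo "no tab required here"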
At least every 2nd C programmer working on a non-unix platform has been burned by this at least once.
It's like a 10ft deep pothole on Broadway that won't get fixed because traditionalists see it as a 'national monument'.
If you are willing to use /proc, you basically have Linux, and can assume just as well that your shell is bash.
If so, $$ is the easiest way to get the PID of the current shell. That works in several other shells, too, but I am not sure sh is guaranteed to have it (FreeBSD's sh has it. See http://www.freebsd.org/cgi/man.cgi?query=sh)
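For instance (a Linux-only sketch, assuming bash and /proc, run from a process whose parent is the make in question):

    # $$ is this shell's PID; $PPID is its parent, i.e. the make that spawned it.
    echo "shell pid: $$, parent pid: $PPID"
    # The parent's command line, NUL-separated in /proc:
    tr '\0' ' ' < "/proc/$PPID/cmdline"; echo    # e.g. "make -j4 all"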
It's an amusing exercise, but for actual practical use, the right solution is to share make's jobserver. If you have a build rule that wants to run multiple things in parallel, put + in front of the command, and then your command will receive the file descriptors for the jobserver pipe on the command line. Then, when you want to start a parallel job, do a blocking read of a byte from the read fd, and when you're done with that job, write a byte to the write fd.
This does not generalize to arbitrary purposes. If you just want to detect that the user has run more jobs than your build system can possibly benefit from and print a warning, there is no reason to start messing with the jobserver. This is not the Ockham solution :)
Why print such a warning? For most projects, make -j65536 will be effectively equivalent to make -j, both of which are perfectly fine. Suppose you're on a system with more CPUs than you have build rules in your build system; you shouldn't print a warning just because someone does make -j$(nproc).
$ cat Makefile
test:
+env | grep MAKEFLAGS
$ make -j4 test
env | grep MAKEFLAGS
MAKEFLAGS= -j --jobserver-fds=3,4
If you want to act like a sub-make, then parse MAKEFLAGS from your environment, and if you see -j and --jobserver-fds=R,W , then parse R and W as file descriptor numbers, block waiting for a byte from R before starting an additional parallel job, and write a byte to W when done with a parallel job.
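Roughly, a jobserver-aware script could look like this (a sketch only, assuming bash for the fd redirections and the --jobserver-fds form shown above; the recipe that runs it must be marked with '+', and error handling is omitted):

    #!/bin/bash
    # Pull the read/write fd numbers out of MAKEFLAGS.
    fds=$(sed -n 's/.*--jobserver-fds=\([0-9]*\),\([0-9]*\).*/\1 \2/p' <<< "$MAKEFLAGS")
    read -r R W <<< "$fds"

    if [ -n "$R" ]; then
        read -r -n 1 token <&"$R"   # block until a job slot is free
    fi

    run_one_extra_parallel_job      # hypothetical: whatever work you parallelize

    if [ -n "$R" ]; then
        printf '+' >&"$W"           # hand the token back when done
    fi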
As crufty as autotools is, nothing comes close to being a more robust build system. It does everything. Learning to use the autotools is difficult, but what's most frustrating is trying to build software that uses clever Makefiles with a bunch of hardcoded assumptions about the platform you are building on that you have to hack to get to work.
Because your average configure.in actually works on platforms the devs haven't tested it on, right?
Truth is most autoconf setups are just arbitrary pastiches of other autoconf setups that the devs didn't really understand. Half the time a bunch of the test results aren't even used, or are testing for something that isn't an issue on any platform it's ever going to be compiled on.
And when that makefile breaks? At least you have a hope in hell of understanding why it broke. Autoconf breaks? Good luck.
1000x this. Nothing is more "entertaining" than watching a configure script run for a full minute, testing for the presence of standard headers like stdio.h (like wtf are you going to do without that?), in order to build a makefile that can then actually compile the project in only four seconds.
This is probably also why libtool's configure probes no fewer than 26 different names for the Fortran compiler my system does not have, and then spends another 26 tests to find out if each of these nonexistent Fortran compilers supports the -g option.
I did this to see for myself if autotools bashing is justified and I can tell you this: If a configure script is that bad, then the project's developers did a poor job using autotools.
It's hard to believe the problem lies solely with the developers when every project is this bad. For fun, I just downloaded the source for GNU grep. You'd figure they might have some experience. Among the 600 tests configure performs is
checking whether mkfifoat is declared without a macro... yes
Why? Nowhere in grep is there a single call to mkfifoat. Why does it care?
I also liked this:
checking for MAP_ANONYMOUS... yes
checking for MAP_ANONYMOUS... yes
checking for MAP_ANONYMOUS... yes
checking for MAP_ANONYMOUS... yes
checking for MAP_ANONYMOUS... yes
checking for MAP_ANONYMOUS... yes
checking for MAP_ANONYMOUS... yes
The downside is that after you make it work there is no incentive/glamor in going back to clean it up. Thus, not enough people ever acquire even the minimum knowledge to fix things, and whatever worked risks breaking.
And the work has a bigger risk because you can't actually check on more than a couple of distributions before you ship.
I wish this were true. Getting cmake to do something off the wall like building static executable binaries is...frustrating.
I had high hopes for scons, but that turned into a quagmire. Any build system that's going to unseat autotools will need to obviate the need for almost all custom scripts. AKA, specify the 'what', and let the user apply a 'recipe' for the 'how'. Or something...if I knew what it was supposed to look like I'd build it.
CMake missed a huge opportunity when they developed their own syntax. Ironically, CMake was invented specifically[0] for a project that was embedding Tcl, which is what they should have done with CMake, too (embed Tcl). CMake can be really nice to use for simple projects, but for the cases where it only gets you 95% of the way to where you want to be... so frustrating.
The bigger problems with building a static binary on Linux are that:
1. Most Linux distros don't install static versions of libraries. You can't link in a static library dependency that doesn't exist.
2. glibc can't "really" be linked statically because of the way it's designed. See here for the gory details: https://gcc.gnu.org/ml/gcc/1998-12/msg00083.html
Naturally, you have problems #1 and #2 with autotools or any other build system as well, since they aren't CMake problems.
Out of curiosity, what are the quagmire-ish aspects of scons? I've been using it for a while, and it hasn't seemed too bad, which probably means that I haven't run into them yet.
Scons itself isn't bad, but its power is also its downfall: the build script is an unrestricted Python file. I almost always find that something 'clever' has been stashed in the SConscript. Plus, everybody writes wrapper functions for the env.* stuff, meaning just about every project has a second, hidden codebase that you need to learn first.
Ah, got it. I'll admit that I tend to make build scripts that are a bit too complicated, due to the freedom of having a full language, but it is quite nice to have when needed. It does end up with build scripts that are far more flexible, though; not having read build scripts that others have written, I don't know how much of a cost there is to it.
A lot of large projects* I use were formerly autotools and now use CMake. CMake is by far easier to maintain and makes it easier to integrate the code into other projects along with its dependencies. I will agree that the documentation isn't great; the only way I was able to use it meaningfully was by studying how larger, established projects use it.
I have a problem with make. It's super slow. And impossible to get right. Autotools is awful but better than hand-edited Makefiles. CMake is far far better than either of the previously mentioned options.
The last one I tried was drake, the "Make for data." I was really excited when I saw the demo, it was exactly what I needed!
And then I tried to run it. It took 6.5 seconds to run without any input. The "can't handle filenames with spaces" bug has been open for almost two years.
"Sometimes the modification time is not useful for deciding when a target and prerequisite are out of date. The P attribute replaces the default mechanism with the result of a command. The command immediately follows the attribute and is repeatedly executed with each target and each prerequisite as its arguments; if its exit status is non-zero, they are considered out of date and the recipe is executed."
When I want this in GNU Make I end up hacking together something resembling:
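(The snippet that followed isn't preserved here; below is a plausible sketch of the usual workaround, with hypothetical file names: a checksum stamp file that only changes when the content really changes, so mtime comparisons effectively become content comparisons.)

    # FORCE makes the stamp rule run every time, but the stamp file is only
    # rewritten when the checksum differs, so target.out rebuilds only on
    # real content changes in input.dat.
    target.out: input.stamp
    	expensive-tool input.dat > $@     # hypothetical command

    input.stamp: FORCE
    	@sha1sum input.dat | cmp -s - $@ 2>/dev/null || sha1sum input.dat > $@

    FORCE: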
i think the true insanity lies within the mind of the developer whose build system depends on the number of jobs being run in parallel..!
interesting academic exercise though. unless i'm mistaken, this will only detect up to 32 build jobs? i'd hate to see the Makefile that works for arbitrary values..
Say one of your build rules generates a data file by performing some expensive computations. The parallelism is internal to the code (say, it's a Monte-Carlo simulation), so it should run on a lot of cores, but if you're doing it multiple times (with, say, different parameters) in parallel, you don't want the total number of threads spawned to be more than the machine has.
Why would you want to do it multiple times with different parameters in parallel? If your code can already scale to multiple cores, then just have each parameter set run one at a time, and you're already maximizing your CPU use.
Or integrate with make's jobserver as someone else has described.
It makes some sense to want to know this. You might know that your build process doesn't benefit from more than 4 jobs, and want to warn a developer who tries to use more.
Please avoid such things. Useless warnings just create noise that masks real ones. Just this one may not matter, but the same rationale would apply to hundreds more.
tup [1], redo [2] and ninja [3] are probably the build systems with the most potential.
redo is the simplest, which is unsurprising given that it was originally proposed by djb. Alternatively, you could look into mk, which is the make replacement used by Plan 9, Inferno and derivatives.
I've used redo for a while but I didn't like the sprawl of shell script files for the build to work.
I switched to ninja mostly for more complex projects and it's very nice: you can do anything in the generation stage and get all the power you need, then run it through ninja and get the full speed benefit of native code for the most common operation (build it!).
In the past I used both scons and waf; they are in a way like ninja, but since the actual build process is still in Python, builds take longer. scons was a big pain in the ass for a large project and I regretted introducing it. waf was ok though.
I still do use make and makefiles for simple projects. It is usually simplest to get running and requires only a build stage, without the separate configuration/generation step that ninja needs. I also have the basic constructs hard-wired into my brain and can write that simple makefile faster than I can find the example code for ninja.
There are really only three sane choices for C/C++ build systems right now, in my opinion:
1. Traditional Make files. Keep it simple, don't feel compelled to use all the dark corners of the language, and accept that it will be a little more verbose than some other solutions. And don't expect portability to Windows, since you won't get it (cygwin doesn't count as Windows). There are some portability problems between UNIX variants, but actually I think it's pretty easy to deal with these in plain old Make.
2. CMake. It makes the simple things simple and the hard things possible. It has its own simple scripting language which build files are written in, so you don't need to tear out your hair worrying about whether the user's version of bash / python / perl / whatever matches yours. The language has the abstractions and functions that you need built-in, so most CMakeLists.txt files tend to look the same (less wheel reinvention than you would get just using a general purpose language like Python or Perl for Makefiles).
CMake is built with backwards compatibility in mind, so you can easily use old projects with newer versions of CMake. This is something you just don't get with autotools, where you have to have the correct version (not newer, not older) of autotools installed to build the project.
CMake detects which header files are needed to rebuild which .c or .cc files, something plain old Make doesn't do (without adding clunky extensions). CMake is also portable to Windows, which autotools is not (again, limping along under cygwin doesn't count). CMake can generate Visual Studio projects which can be used to directly build software.
3. Visual studio. If you only care about Windows, this is a sane choice.
I've gotten used to CMake and I kind of like it now. The language is atrociously ugly, but it's very functional at least and feature-rich. Ugly-but-functional is better than pretty-but-inherently-broken.
The tool has one critical function: to generate files that calculate dependencies properly.
It does not do this. Been using it since 2.6. Get one of their super-tweaky ever-changing commands wrong and it doesn't generate dependencies AT ALL on some platforms. Sure, your project builds. And it's also a poisoned dart trap for anyone who attempts to modify your code.
A quick skim of the bug list is enough to make any truly-fastidious engineer's sphincter clench. When you have over 1000 bugs that aren't even assigned to a dev ... in a dev tool ... hoo boy.
> A quick skim of the bug list is enough to make any truly-fastidious engineer's sphincter clench. When you have over 1000 bugs that aren't even assigned to a dev ... in a dev tool ... hoo boy.
Er, what? Chromium has over 50000 unassigned bugs. Any software that sees a lot of use is going to have lots of bugs, and depending on the workflow sometimes none of them will be assigned.
I asked you "how is it buggy?" and your answer essentially is "It's buggy, trust me."... weasel answers, much?
I'm by no means a fan of cmake. But if I use and trust it, yet can generate much better criticism than you do, how are you going to convince anyone they shouldn't be using it?
The bug list for a dev tool, that devs use, containing bugs meticulously entered by devs, carries weight. I also provided sample code above, so I clearly use the tool. And I've written and published scripts that show that it does not generate dependencies reliably.
"CMake is actually very reliable" --scrollaway
That's not criticism. That's just an opinion. No more or less valid than mine. But certainly less supported by evidence.
"Shake build systems are Haskell programs, but can be treated as a powerful version of make with slightly funny syntax. The build system requires no significant Haskell knowledge, and is designed so that most features are accessible by learning the "Shake syntax", without any appreciation of what the underlying Haskell means."
For C++, and if you don't need anything really crazy. It's very readable (JSON-like), and custom rules can be defined in Javascript instead of "learn another language". It's also very fast - kind of highlights how slow Make is! No build system should be slow.
It's usable now but still kind of a work in progress.
I always like waf. Runs a lot faster than scons and can actually handle a larger project.
It still has its own arcane syntax rather than being pythonic, which I think is its greatest shortcoming - it emulates make syntax too much, where I think it would be nicer to just be a good proper python library that specializes in build scripts.
I need more information as to why it's required to get the value of that parameter. Can't it simultaneously be passed in as an environment variable?
There is no reason your makefile should ever depend on the parallelism level, with the possible exception of requiring no parallelism at all (i.e. "-j1"). That's why make doesn't bother to expose it. This little hack is just for fun.
I was the one who asked the original question. My purpose was so that Mercurial's test suite could specify the amount of parallelism from the Makefile. The "test" rule in the Makefile invokes "run-tests.py". This latter script itself accepts a -j option. It's possible to invoke it directly, but this misses the second part of the "test" rule in the Makefile, which executes some tests that run-tests.py can't do.
This probably won't work if you just pass -j. If I needed to solve this problem, I would have just grepped the output of `ps' to get the make invocation. If I wasn't sure which invocation was correct, I would have used heuristics like the output of `pstree' to figure it out.
I would think a process launched by make can easily obtain the process ID of its parent process. If so, no heuristics are needed to find the correct instance of make.
On the other hand, reading the command line isn't sufficient. The -j argument does not require a job count, defaulting to "as many as possible".
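For what it's worth, a rough sketch of that approach (and note the caveat above about a bare -j):

    # From a process spawned by make, $PPID is make itself;
    # ps shows the arguments it was invoked with.
    ps -o args= -p "$PPID"    # e.g. "make -j8 all", or just "make -j"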
The real insanity with make is that it has no interface to check whether a Makefile contains a given target.
Best you can do is to run make -n target to probe, but it's possible to write Makefiles that run code even though -n is used, which will defeat such probes at distribution scale. It quickly becomes an exercise in heuristics and output parsing.
The "bash-completion" tab completion scripts for "make" do just that (run "make -npq") so presumably it's an acceptable thing to do for interactive use, that works reasonably well for some (most?) makefiles.
"At distribution scale"... judging by your username, I take it your use case is something like finding out if a source tarball provides the standard targets like "install" and "distcheck", for all packages in debian? I can see how that would be rather painful.
I wonder if there's room for adding a "--dry-run-yes-really-dont-run-any-recipes-at-all" flag to make. As far as I remember the main use case for the current behaviour is (a) to run recipes that create makefiles that are then included into the main makefile, used in build systems that automatically figure out header dependencies and things like that; and (b) process makefiles in subdirectories for "recursive make" build systems. I wonder if you don't do those steps, would you actually miss out on any significant information re. top-level targets? I imagine that use case (a) is used to add prerequisites to existing targets, rather than creating new targets.
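A minimal sketch of that dependency-tracking pattern, using the compiler's side-effect dependency output (GCC/Clang flags assumed); as suggested above, the included fragments only add prerequisites to existing targets rather than creating new ones:

    SRCS := main.c util.c
    OBJS := $(SRCS:.c=.o)
    DEPS := $(OBJS:.o=.d)

    prog: $(OBJS)
    	$(CC) $(LDFLAGS) -o $@ $^

    # -MMD writes a .d fragment listing the headers each .c actually included;
    # -MP adds dummy rules so deleted headers don't break the build.
    %.o: %.c
    	$(CC) $(CFLAGS) -MMD -MP -c -o $@ $<

    -include $(DEPS)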
Or maybe a more general solution, instead of "--dry-run-yes-really", would be providing access to make's parser as a library, instead of having to parse the output of "make -p". I'm sure IDEs could make good use of that. Who's got time to implement that though.
As shown in the post, targets are not always grep-able. The post's makefile contained the target "par-30" even though that string appears nowhere on the page or in the makefile.
You need to write a full parser to discover all the targets.
Running "make -p" is running a full parser which someone else wrote. Grep is not extracting the targets from a makefile there, it's extracting the targets from an intermediate representation of a makefile.
> The real insanity with make is that it has no interface to check whether a Makefile contains a given target.
>
> Best you can do is to run make -n target to probe, but it's possible to write Makefiles that run code even though -n is used, which will defeat such probes at distribution scale. It quickly becomes an exercise in heuristics and output parsing.
The comment I replied to, by chaosfactors, specifically suggested grepping the makefile. You're arguing that you can find all the targets, and I never said you can't. I simply said you can't find them all by only grepping the makefile.
I usually add two targets to my projects: 'help' and 'showconfig'. 'make help' explains which targets to use and/or how to combine them. 'make showconfig' shows what all the important internal vars and common vars (CFLAGS, LDFLAGS, etc) are set to.
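Something along these lines (a sketch; the variable and target names are whatever your project actually uses):

    .PHONY: help showconfig

    help:
    	@echo "Targets:"
    	@echo "  all         build everything"
    	@echo "  test        run the test suite"
    	@echo "  showconfig  print the effective build configuration"

    showconfig:
    	@echo "CC      = $(CC)"
    	@echo "CFLAGS  = $(CFLAGS)"
    	@echo "LDFLAGS = $(LDFLAGS)"
    	@echo "PREFIX  = $(PREFIX)"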
On a semi-related note... do any of the make gurus out there know if there's anything similar to "-j" that instead sends parallel jobs off to a cluster, e.g. Sun Grid?
Thanks for the thought. That sounds fine for simple compiler makefiles, although I'm using it for general purpose work (data analysis and visualization). I'm hoping for something that wouldn't require each command to be wrapped up, as there are many.
For that sort of thing, maybe roll your own script using something like Fabric (http://www.fabfile.org/) for doing all the communication with the remote hosts?
It will reduce total compile times for large projects by running make in parallel. Files don't compile faster, but make will act on (compile) N files at the same time.
ALSO: http://www.catb.org/esr/faqs/hacker-howto.html#believe2
Here is a patched version of GNU Make, providing a #J variable that holds the number of jobs as passed via -j/--jobs. As can be seen in the screenshot, it can be passed as an environment variable to programs called by make. Source: https://github.com/esantoro/make