Saturday 3 October 2020

Apple made system calls horribly slow in the 10.14.6 update. Thank you, Apple!

After installing the Mac OS 10.14.6 update from the end of September 2020, I noticed something wasn't right. A simple Perl script that scanned a directory for all files and invoked the stat system command for every file had become way slower than before. I'm not sure whether the same would have happened if it had relied on Perl's own stat() call, but this script needed the shell command because it can offer information that Perl's own stat can't provide.

Running a Homebrew update also seemed way slower than usual. It looked like Apple had messed something up with calling system commands from within scripts, and perhaps from other programs as well, although I didn't notice anything in regular apps. I knew this was likely to be reported by a gazillion developers and fixed soon, so I didn't bother reporting it myself. That horribly slow script of mine was annoying though, so I started looking into whether it could be improved.

The problem here was that the Perl script did the obvious thing of invoking a new instance of stat through backticks for every single file it encountered. That's a lot of overhead. Until now the overhead hadn't been bad enough to annoy me sufficiently, but Apple's “update” had pushed it way into the zone of bad words and gnashing teeth. The solution was pretty obvious: reduce the overhead by reducing the number of system commands. Luckily this could be done: the stat command accepts multiple files as arguments, and returns the results as multiple lines in the same order as the arguments.
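To illustrate the idea, here is a minimal sketch in Python rather than the original Perl (which isn't shown here). The function name `stat_many`, the `cmd` parameter and the size-only format string are assumptions made up for this example; the only thing it relies on is that stat prints one line per file, in argument order.

```python
import subprocess
import sys

# Assumption: BSD stat (macOS) takes "-f FORMAT", GNU stat (Linux) takes "-c FORMAT".
# Printing only the file size is an example; any one-line-per-file format works.
DEFAULT_STAT_CMD = (
    ["stat", "-f", "%z"] if sys.platform == "darwin" else ["stat", "-c", "%s"]
)


def stat_many(paths, cmd=None):
    """Run ONE command for all paths and pair each path with its output line."""
    paths = list(paths)
    if not paths:
        return []
    argv = list(cmd or DEFAULT_STAT_CMD) + paths
    # check=True raises CalledProcessError if the command exits non-zero,
    # for example when one of the files has disappeared in the meantime.
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    lines = result.stdout.splitlines()
    if len(lines) != len(paths):
        # Happens e.g. when a file name itself contains a newline.
        raise RuntimeError("expected one output line per path")
    return list(zip(paths, lines))
```

One process is spawned for the whole list instead of one per file. The sketch assumes that no path starts with a dash and that file names contain no newlines; a more careful version would have to deal with both.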

So, all I had to do was postpone the invocation of `stat`, build a queue of file paths, and execute the aggregated system command whenever the queue reached a certain size. In theory I could wait until the command line was about to exceed 262144 bytes (minus some margin to allow for the environment), which is the maximum combined size of the arguments and environment as reported by “getconf ARG_MAX”. In practice, I used a lower limit of 16 kB because above a certain threshold, the gain becomes pretty negligible anyway.
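The queueing logic can be sketched roughly as follows, again in Python as a stand-in for the Perl script. `BatchQueue`, `scan` and `run_batch` are hypothetical names for this example; `run_batch` is whatever function runs the aggregated command on a list of paths.

```python
import os


class BatchQueue:
    """Collect paths and pass them to run_batch in chunks of at most `limit` bytes."""

    def __init__(self, run_batch, limit=16 * 1024):
        self.run_batch = run_batch  # callable that receives a list of paths
        self.limit = limit          # byte budget per command line (16 kB here)
        self.pending = []
        self.pending_bytes = 0

    def add(self, path):
        # Each argument costs its encoded length plus one terminating byte.
        size = len(os.fsencode(path)) + 1
        # Flush first if this path would push the batch over the limit.
        if self.pending and self.pending_bytes + size > self.limit:
            self.flush()
        self.pending.append(path)
        self.pending_bytes += size

    def flush(self):
        if self.pending:
            batch, self.pending = self.pending, []
            self.pending_bytes = 0
            self.run_batch(batch)


def scan(root, run_batch, limit=16 * 1024):
    """Walk a directory tree and feed every file path through the queue."""
    queue = BatchQueue(run_batch, limit)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            queue.add(os.path.join(dirpath, name))
    queue.flush()  # the last, partially filled batch still has to run
```

Flushing before adding keeps every batch within the limit, unless a single path is longer than the limit by itself. The final `flush()` is the part that is easy to forget. In Python, `os.sysconf("SC_ARG_MAX")` should report the same value as `getconf ARG_MAX` if the limit is to be derived from the system rather than hard-coded.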

The result was that for one particular run, the script went from a 120-second runtime to 3 seconds. After Apple fixed the performance issue in a supplemental 10.14.6 update, the runtime became sub-second. So in the end I guess I should thank Apple for forcing me to refactor my script and make it way more efficient. Maybe every operating system should now and then introduce a temporary penalty on certain operations, or a limit on resources, just to force developers to be less lazy and actually improve their software instead of writing sloppy code and letting the machine just brute force it…
