Laravel Deep Dives: Queues and Async Design
Why is PHP so "bad"?
PHP has a bad reputation for not being performant, although most of that reputation is based on hearsay and misunderstandings.
For example, many people believe that Node is "all asynchronous" and therefore orders of magnitude faster than PHP. However, some devs might be surprised to learn that, like PHP (by default), Node is single-threaded. While its I/O is efficiently non-blocking, you can still run into exactly the same bottlenecks in many use cases: CPU-bound code is still blocking. That's the nature of JavaScript as of today.
Tech is not immune to the influence of culture and rumor, though. Software engineers might hate to admit it, but a lot of technical choices are driven by industry trends and emotional bias, not objectivity. That said, there's plenty of research to suggest that most big choices people make are driven by some level of emotion...we aren't robots, and remembering that helps us make better choices.
My point? It's important to remember these biases when evaluating technology. You can't avoid them, so you'd best understand them. I love JavaScript and Node -- I think there are many advantages to a full-JavaScript stack, and many cases where Node's non-blocking I/O is a big advantage. Still, I don't think the blanket statement that "Node is just better" is completely fair or true.
Don't get me wrong, I'm not saying the two are equal. Node does have advantages. Non-blocking I/O is important, and JavaScript just makes sense...not because the language itself is so great (ask anyone who's touched JavaScript about the word "null"), but because it has so much utility in front-end development. One language to rule them all. Network effect. Whatever you want to call it, the utility is real.
Caching is not Optional
As Phil Karlton famously said (and is often-quoted to the point of annoyance): "There are only two hard things in Computer Science: cache invalidation and naming things."
Personally, I like to focus on product; that is, I care about what I'm building, not just how I'm building it. There's an adage in game development that premature optimization is a great evil, because engineers can spend a mess of time optimizing every frame...for little actual benefit.
Especially for web applications, one of the most common and effective ways to improve server-side performance is to avoid running server-side code at all via caching.
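To make the idea concrete, here's a minimal cache-aside sketch in plain PHP. The function name, the in-process array, and the "rendering" are all hypothetical stand-ins; a real application would use Laravel's cache layer or Redis rather than an array, but the shape of the logic is the same.

```php
<?php
// Minimal cache-aside sketch (hypothetical names). The array stands in
// for a real cache store like Redis or Memcached.
$cache = [];

function renderPage(string $slug, array &$cache): string
{
    // Serve from cache when possible, skipping the expensive work entirely.
    if (isset($cache[$slug])) {
        return $cache[$slug];
    }

    // Simulate expensive server-side rendering (queries, templates, etc.).
    $html = "<h1>" . ucfirst($slug) . "</h1>";

    return $cache[$slug] = $html;
}

renderPage('home', $cache);        // slow path: renders and stores
echo renderPage('home', $cache);   // fast path: served straight from cache
```

The second call never touches the "rendering" code at all, which is exactly the benefit the higher-level caches discussed below provide for entire HTTP responses.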
Anyone who's worked on a site at scale knows that not all caching is equal, and it isn't always simple. Application-level caches (like those commonly found in WordPress) might not be as effective as higher-level caches like Varnish, which "skips" PHP entirely to achieve very good performance (at least for non-dynamic pages).
Combined with in-memory systems like Redis/ElastiCache or a Lucene-based search engine like Elasticsearch, PHP can scale very nicely. Don't get me wrong, PHP does have performance flaws (all platforms have tradeoffs when it comes to performance), but my point is that non-technical stakeholders shouldn't be nervous about embracing a LAMP stack. So long as you hire experts who understand the tech, it will absolutely scale.
It's important to consider the big picture. Engineers tend to be very concerned about optimizations, but there's a subtle difference between performance and scalability...it's entirely possible to scale an application that isn't especially efficient or performant.
Depending on the use case, the cost to add another node and scale horizontally might be negligible. The cost to squeeze more performance out of less hardware, however, might be more than stakeholders want to pay...and might not yield much actual benefit relative to the time.
Like with all things in life, it depends.
Asynchronous Design
Today, developers are expected to know how to use tools like queues to implement asynchronous logic. First, it's important to understand why queues give your application such a performance benefit.
Laravel's docs are fairly good; they mention that queues "run in the background", but what does that really mean? Especially if PHP is single-threaded?
Most developers are familiar with the concept of "workers": processes that "run in the background" and "do work" by pulling it off the queue. Laravel runs its workers in a separate PHP process, and a process is not the same as a thread.
A process represents an "instance of a program". Think of opening multiple programs on your desktop: each is a process, and each has its own isolated memory. Because Laravel runs workers as their own PHP processes, each worker is isolated in memory from the rest of your application. That means the main PHP processes serving your application can keep responding to requests without workers ever blocking them.
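A quick sketch of that isolation, assuming the `php` CLI binary is on the PATH: we spawn a second PHP process from the parent and show that it has its own PID and cannot see the parent's variables. (The `$secret` variable is a hypothetical example, of course.)

```php
<?php
// Each process has its own PID and its own memory. A variable set in the
// parent process is invisible to a freshly spawned child process.
$secret = 'only-in-parent';

// Spawn a second PHP process (assumes `php` is on the PATH).
$childPid = trim(shell_exec("php -r 'echo getmypid();'"));

// The child can't see the parent's variables at all; like a queue worker,
// it would have to receive data explicitly (e.g. via a serialized payload).
$childSees = trim(shell_exec("php -r 'echo isset(\$secret) ? \"yes\" : \"no\";'"));

echo "child pid: {$childPid}, child sees \$secret: {$childSees}\n";
```

This is exactly why a worker never shares (or corrupts) the memory of the processes serving web requests: job data has to travel through the queue payload.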
If you've ever written multi-threaded code, you know that shared memory is where much of the complexity and many of the bottlenecks come from. So you can already see why workers are beneficial, and how different they are from something like WordPress's default "cron" system, which runs in the same process as the main application.
Since worker processes don't share memory, they can run concurrently if your CPU has multiple cores (it surely does). Wait, doesn't that just mean each one runs in "another thread"? It doesn't: threads and processes are both ways to achieve concurrency, but they are not the same thing. The benefit is similar, though: multiple operations happening at the same time.
All this being said, modern versions of PHP can support actual multithreading too, via extensions like parallel. Processes, threads, concurrency...programming is fun because lower-level details like these can shape your approach to application design.
How Laravel's Queues Work
Now that I've rambled about performance for a while, let's look at what actually happens when you run a worker (php artisan queue:work). We can see the corresponding command in the WorkCommand class.
We can see the body of the handle method for more details:
```php
if ($this->downForMaintenance() && $this->option('once')) {
    return $this->worker->sleep($this->option('sleep'));
}

// We'll listen to the processed and failed events so we can write information
// to the console as jobs are processed, which will let the developer watch
// which jobs are coming through a queue and be informed on its progress.
$this->listenForEvents();

$connection = $this->argument('connection')
    ?: $this->laravel['config']['queue.default'];

// We need to get the right queue for the connection which is set in the queue
// configuration file for the application. We will pull it based on the set
// connection being run for the queue operation currently being executed.
$queue = $this->getQueue($connection);

if (Terminal::hasSttyAvailable()) {
    $this->components->info(
        sprintf('Processing jobs from the [%s] %s.', $queue, str('queue')->plural(explode(',', $queue)))
    );
}

return $this->runWorker(
    $connection, $queue
);
```
The first conditional handles "sleeping" the worker if the site is in maintenance mode. This code is fairly readable as-is, especially with some background on how Laravel bootstraps.
The "Terminal::hasSttyAvailable()" conditional might not be familiar, and it isn't super important to the topic of workers. This simply checks if the terminal has the "stty" command, which can be used to manipulate terminal output or features. Let's look at the "runWorker" method:
```php
return $this->worker
    ->setName($this->option('name'))
    ->setCache($this->cache)
    ->{$this->option('once') ? 'runNextJob' : 'daemon'}(
        $connection, $queue, $this->gatherWorkerOptions()
    );
```
Simple, eh? Similar to how the Router/Route dynamically invokes methods, this either runs the "runNextJob" method or the "daemon" method depending on the command line arguments. Since "runNextJob" is conceptually simple (running the job once), we can look at the Worker class's daemon method.
First, a tangent: it's pronounced the same as the word "demon". Trust someone more educated than me. I don't actually care, but I think it's more fun to say it the "right" way.
On to the "daemon" method. You will see the familiar while (true) construct: an infinite loop. We don't use these often in web development, but it makes sense in this context, since we want the daemon to run "forever".
```php
// First, we will attempt to get the next job off of the queue. We will also
// register the timeout handler and reset the alarm for this job so it is
// not stuck in a frozen state forever. Then, we can fire off this job.
$job = $this->getNextJob(
    $this->manager->connection($connectionName), $queue
);

if ($supportsAsyncSignals) {
    $this->registerTimeoutHandler($job, $options);
}

// If the daemon should run (not in maintenance mode, etc.), then we can run
// fire off this job for processing. Otherwise, we will need to sleep the
// worker so no more jobs are processed until they should be processed.
if ($job) {
    $jobsProcessed++;

    $this->runJob($job, $connectionName, $options);

    if ($options->rest > 0) {
        $this->sleep($options->rest);
    }
} else {
    $this->sleep($options->sleep);
}

if ($supportsAsyncSignals) {
    $this->resetTimeoutHandler();
}
```
As is often the case, the comments make it clear what's going on. You might be curious what this business about "$supportsAsyncSignals" is. It simply checks whether the pcntl extension is loaded, which provides process-control features (like signal handling) for multi-process PHP applications. As we know, processes and threads are not the same, but the "point" is the same: being able to "do multiple things at the same time".
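Stripped of Laravel's plumbing, the daemon pattern boils down to a few lines. This sketch (PHP 8+, hypothetical job names, an in-memory array standing in for a real queue) pulls jobs in a loop and catches exceptions so one bad job can't kill the worker; the break replaces the real worker's sleep-and-retry so the example terminates.

```php
<?php
// A tiny stand-in queue: each "job" is just a callable.
$queue = [
    fn () => 'sent welcome email',
    fn () => throw new RuntimeException('boom'),
    fn () => 'resized image',
];

$processed = [];

while (true) {
    $job = array_shift($queue);

    if ($job === null) {
        // A real worker would sleep and poll again; we stop so the
        // example finishes.
        break;
    }

    try {
        $processed[] = $job();
    } catch (Throwable $e) {
        // Catch and record the failure so the while (true) loop survives.
        $processed[] = 'failed: ' . $e->getMessage();
    }
}

print_r($processed);
```

The second job throws, but the third still runs, which is the whole reason the real Worker wraps job execution in a try/catch (as we'll see below in its "process" method).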
Also worth noting is the preference to let a worker stop if it runs into memory limits and rely on an external monitor to restart it:
```php
// Finally, we will check to see if we have exceeded our memory limits or if
// the queue should restart based on other indications. If so, we'll stop
// this worker and let whatever is "monitoring" it restart the process.
$status = $this->stopIfNecessary(
    $options, $lastRestart, $startTime, $jobsProcessed, $job
);

if (! is_null($status)) {
    return $this->stop($status, $options);
}
```
This is important to remember. Not every application will hit a memory limit right away, especially early in an application's life or when jobs aren't memory-intensive. That's why it's important to understand this logic in advance and put a monitor/supervisor in place before a worker stops. Just because it runs fine for a month or a year doesn't mean it won't eventually stop for some reason. The code makes the good practice clear: workers might stop, so something should be watching them.
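In practice, that "whatever is monitoring it" is usually Supervisor on Linux. A typical program definition looks something like the following sketch; the paths, log file, and process count are placeholders you'd adapt to your server.

```ini
[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
; Path to your application's artisan script is a placeholder here.
command=php /var/www/example.com/artisan queue:work --sleep=3 --tries=3
autostart=true
autorestart=true
numprocs=4
redirect_stderr=true
stdout_logfile=/var/www/example.com/storage/logs/worker.log
```

With autorestart enabled, a worker that stops itself (memory limit, queue:restart signal, crash) is simply started again, which is exactly the division of labor the Laravel source comment above describes.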
Now we see "runJob" -- and here is where the logic converges with a one-off job (runNextJob in the Worker class). This in turn calls the "process" method:
```php
try {
    // First we will raise the before job event and determine if the job has already run
    // over its maximum attempt limits, which could primarily happen when this job is
    // continually timing out and not actually throwing any exceptions from itself.
    $this->raiseBeforeJobEvent($connectionName, $job);

    $this->markJobAsFailedIfAlreadyExceedsMaxAttempts(
        $connectionName, $job, (int) $options->maxTries
    );

    if ($job->isDeleted()) {
        return $this->raiseAfterJobEvent($connectionName, $job);
    }

    // Here we will fire off the job and let it process. We will catch any exceptions, so
    // they can be reported to the developer's logs, etc. Once the job is finished the
    // proper events will be fired to let any listeners know this job has completed.
    $job->fire();

    $this->raiseAfterJobEvent($connectionName, $job);
} catch (Throwable $e) {
    $this->handleJobException($connectionName, $job, $options, $e);
}
```
Exception handling is important here because we don't want the while (true) loop to die just because a single job throws an exception. Inside an infinite loop, it makes sense to catch every exception and handle it explicitly in "handleJobException".
Note that it also fires events through Laravel's events system (which I may delve into next?) so that listeners can be informed about jobs about to execute (or jobs that just finished).
The abstract class "Job" implements the "fire" logic:
```php
$payload = $this->payload();

[$class, $method] = JobName::parse($payload['job']);

($this->instance = $this->resolve($class))->{$method}($this, $payload['data']);
```
Similar to routing, this finally invokes the underlying job's logic. The specific details (like the class and method) are parsed from the job's payload. It makes sense that this is JSON -- that's how Laravel serializes job data and how queue backends like SQS expect to receive it.
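Here's a simplified, self-contained sketch of that flow. The SendEmail class and the payload contents are hypothetical, and the parsing is reduced to a plain explode on "@" rather than Laravel's JobName::parse; the dynamic-invocation step is the part that mirrors the real fire() method.

```php
<?php
// Hypothetical job class standing in for a real queued job.
class SendEmail
{
    public function handle(array $data): string
    {
        return 'emailed ' . $data['to'];
    }
}

// A queue payload is JSON naming the job as "Class@method" plus its data.
$payload = json_decode(
    '{"job":"SendEmail@handle","data":{"to":"user@example.com"}}',
    true
);

// Simplified parse: split "Class@method" into its two parts.
[$class, $method] = explode('@', $payload['job']);

// Resolve the class and invoke the method dynamically, as fire() does.
$result = (new $class)->{$method}($payload['data']);

echo $result;
```

Because everything needed to run the job travels inside the JSON payload, any worker process (on any machine) can pick the job up and execute it, which is what makes horizontally scaling workers so straightforward.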
Conclusion
You can read many descriptions of Laravel's jobs that say they are good for performance, but it's nice to delve into the underlying details and see why. It's also important because asynchronous operations sometimes have a "bad reputation", maybe from experience with more low-level implementations like throwing things into a cronjob or (*gasp*) a WordPress cron.
Digging into the lower-level details, we can see why and how queues offer us performance improvements. Further, we can have confidence that this system does indeed work at scale. That isn't to say it can't fail or have issues...but for many use cases, asynchronous design is critical for scale.
Knowing how this works also helps us pick the right hardware for the job. For a Laravel application that needs a scaled-out worker pipeline, you want as many cores as is feasible to maximize concurrency, and enough memory to keep workers from constantly stopping.
At scale, you'd likely dedicate the worker application to handling jobs and nothing else, isolating it from the user-facing application -- which is easy, since it's merely a matter of letting a different application process the queue! You could even switch the worker backend to Go or Node if you need to squeeze every ounce of performance from the hardware. Maybe I'll explore those options in the future.