The event loop’s job is to look at the call stack and the task queue. If the stack is empty, it takes the first item on the queue and pushes it onto the stack, which effectively runs it. So once the stack is clear and there is a callback waiting on the task queue, the event loop pushes that callback onto the stack.
- Push main() onto the call stack.
- Push console.log() onto the call stack. It runs right away and gets popped.
- Push setTimeout(2000) onto the stack. setTimeout is a Node API: calling it registers an event-callback pair. The event is a 2000-millisecond timer, and the callback is the function to run when it fires.
- After registering it in the APIs, setTimeout(2000) gets popped from the call stack.
- The second call, setTimeout(0), gets registered in the same way. We now have two timers registered with the Node APIs, each waiting to fire.
- Once its 0 ms delay has elapsed, the setTimeout(0) callback gets moved to the callback queue. The same thing happens with the setTimeout(2000) callback once its 2000 ms have elapsed.
- In the callback queue, the functions wait for the call stack to be empty, because only one function can execute at a time. This is taken care of by the event loop.
- The last console.log() runs, and main() gets popped from the call stack.
- The event loop sees that the call stack is empty and the callback queue is not. So it moves the callbacks (in first-in-first-out order) to the call stack for execution.
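The steps above can be sketched as a small script. This is a minimal sketch rather than the code from the original walkthrough: the record helper and the order array are illustrative names added here so that the execution order can be inspected.

```javascript
// Illustrative helper (not part of Node): logs a message and remembers the order.
const order = [];
const record = (msg) => {
  order.push(msg);
  console.log(msg);
};

function main() {
  record('first console.log'); // runs right away and is popped
  setTimeout(() => record('timeout 2000 callback'), 2000); // registered, fires after ~2 s
  setTimeout(() => record('timeout 0 callback'), 0); // registered, queued almost at once
  record('last console.log'); // still runs before any timer callback
}

main();
```

Both synchronous messages are printed before either timer callback, even though one timer has a delay of 0: the callbacks have to wait in the queue until main() has been popped from the stack.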
Key points in Summary
When any of the web APIs (the Node APIs, in Node’s case) is done, it pushes its callback onto the task queue. The event loop is the simplest piece in this whole equation, and it has one very simple job: to look at the stack and the task queue. If the stack is empty, it takes the first thing on the task queue and pushes it onto the stack, which effectively runs it.
A side point: most of what you write in JavaScript that is not asynchronous code is synchronous, procedural code, read from top to bottom and executed on the single main thread of the JavaScript process. At its base, JavaScript is a synchronous, blocking, single-threaded language.
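A small sketch of what "blocking" means in practice, assuming a deliberately wasteful busy-wait loop (illustrative only; start and timerFiredAfterMs are names chosen for this example). The timer is due after 0 ms, but its callback cannot run until the synchronous loop has released the main thread.

```javascript
const start = Date.now();
let timerFiredAfterMs = null;

// Due "immediately", but it has to wait for the call stack to be empty.
setTimeout(() => {
  timerFiredAfterMs = Date.now() - start;
  console.log(`timer callback ran after ${timerFiredAfterMs} ms`);
}, 0);

// Synchronous busy-wait: keeps the single main thread occupied for about 200 ms.
while (Date.now() - start < 200) {
  // nothing else can run while this loop is spinning
}

console.log('synchronous work done');
```

The reported delay is at least 200 ms, because no callback can run while synchronous code is still on the stack.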
When Node is single-threaded, how does it handle concurrency?
The official docs have a very good explanation.
1> Some key points:
- A> The simple answer is: with callback functions and the event loop.
- B> With callback functions and the event loop, Node hands the callback (as in setTimeout, Promise, fs.readFile) over to a queue that is processed in FIFO order, while the rest of the code carries on executing. That is how Node is non-blocking. Whatever was transferred to the queue is executed on a FIFO basis: the callback that entered the queue first runs first, then the next one in the queue.
https://codeburst.io/how-node-js-single-thread-mechanism-work-understanding-event-loop-in-nodejs-230f7440b0ea
2> Some popular server-side technologies, such as PHP, ASP.NET, Ruby and Java servers, traditionally follow a multi-threaded model in which each client request results in a new thread or even a new process. In Node.js, requests are handled on the same thread.
3> Single Threaded Event Loop Model Processing Steps:
https://www.journaldev.com/7462/node-js-architecture-single-threaded-event-loop
- Clients send requests to the web server.
- The Node.js web server internally maintains a limited thread pool (4 threads by default) to provide services to client requests.
- The Node.js web server receives those requests and places them into a queue, known as the “Event Queue”.
- The Node.js web server internally has a component known as the “Event Loop”. It uses an indefinite loop to receive requests and process them.
- The Event Loop uses a single thread only. It is the heart of the Node.js architecture.
- The Event Loop checks whether any client request is placed in the Event Queue. If not, it waits for incoming requests indefinitely.
- If yes, it picks up one client request from the Event Queue.
- It starts processing that client request.
- If that client request does not require any blocking IO operations, it processes everything, prepares the response and sends it back to the client.
- If that client request requires some blocking IO operations, such as interacting with a database, the file system or external services, it follows a different approach, as below. (Strictly speaking, only operations the kernel cannot perform asynchronously, such as file system access, go to the thread pool; network IO is handed to the kernel, as explained further down.)
- A) It checks thread availability in the internal thread pool.
- B) It picks up one thread and assigns this client request to that thread.
- C) That thread is responsible for taking the request, processing it, performing the blocking IO operations, preparing the response and sending it back to the Event Loop. PAUL NOTE: THIS IS THE MOST IMPORTANT KEY POINT.
- D) The Event Loop, in turn, sends that response to the respective client.
ACTUAL IMPLEMENTATION OF THE ABOVE EVENT-LOOP-CALLBACK PROCESS
- Say “n” clients send requests to the web server. Let us assume they are accessing our web application concurrently.
- Let us assume our clients are Client-1, Client-2… and Client-n.
- The web server internally maintains a limited thread pool. Let us assume there are “m” threads in the thread pool.
- The Node.js web server receives the Client-1, Client-2… and Client-n requests and places them in the Event Queue.
- The Node.js Event Loop picks up those requests one by one.
- The Event Loop picks up Client-1 Request-1.
- It checks whether Client-1 Request-1 requires any blocking IO operations or takes more time for complex computation tasks.
- As this request is a simple computation and non-blocking IO task, it does not require a separate thread to process it.
- The Event Loop processes all steps provided in that Client-1 Request-1 operation (here “operations” means JavaScript functions) and prepares Response-1.
- The Event Loop sends Response-1 to Client-1.
- The Event Loop picks up Client-2 Request-2.
- It checks whether Client-2 Request-2 requires any blocking IO operations or takes more time for complex computation tasks.
- As this request is a simple computation and non-blocking IO task, it does not require a separate thread to process it.
- The Event Loop processes all steps provided in that Client-2 Request-2 operation and prepares Response-2.
- The Event Loop sends Response-2 to Client-2.
- The Event Loop picks up Client-n Request-n.
- It checks whether Client-n Request-n requires any blocking IO operations or takes more time for complex computation tasks.
- As this request is a very complex computation or blocking IO task, the Event Loop does not process this request itself. (Note that plain JavaScript computation is not moved off the main thread automatically; only operations implemented on top of the thread pool, or code explicitly run in worker threads, are.)
- The Event Loop picks up Thread T-1 from the internal thread pool and assigns this Client-n Request-n to Thread T-1.
- Thread T-1 reads and processes Request-n, performs the necessary blocking IO or computation task, and finally prepares Response-n.
- Thread T-1 sends this Response-n to the Event Loop.
- The Event Loop, in turn, sends this Response-n to Client-n.
Here a client request is a call to one or more JavaScript functions. JavaScript functions may call other functions or may make use of callbacks.
So each client request looks like this:
function (other-function, callback-function)
So in a nutshell: the event loop allows Node.js to perform non-blocking I/O operations, despite the fact that JavaScript is single-threaded, by offloading operations to the system kernel whenever possible. So the question is: if Node pushes all those responsibilities down to the kernel, why would a thread pool be needed? Because the kernel doesn’t support doing everything asynchronously. In those cases Node has to occupy a thread from the pool for the duration of the operation, so that the event loop can continue executing without blocking.
Official Doc – Since most modern kernels are multi-threaded, they can handle multiple operations executing in the background. When one of these operations completes, the kernel tells Node.js so that the appropriate callback may be added to the poll queue to eventually be executed.
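A minimal sketch of work being handed to the thread pool. It uses the asynchronous crypto.pbkdf2, which the Node.js documentation describes as running on libuv's thread pool; the password, salt, iteration count and the events array are arbitrary values chosen for illustration.

```javascript
const crypto = require('crypto');

const events = []; // illustrative: records what happened, in order

// Start two key derivations. Each one runs on a thread-pool thread,
// so the main thread is not blocked while they are being computed.
for (let i = 1; i <= 2; i++) {
  crypto.pbkdf2('secret', 'salt', 10000, 32, 'sha256', (err, key) => {
    if (err) throw err;
    events.push(`hash ${i} done`);
    console.log(`hash ${i} done:`, key.toString('hex').slice(0, 16));
  });
}

// Runs before either callback, because the hashing happens off the main thread.
events.push('main thread continues');
console.log('main thread continues');
```

The callbacks are only invoked once a pool thread reports back and the event loop picks up the completed work, which is the hand-off described above.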
https://www.fpcomplete.com/blog/2016/12/concurrency-and-node
2> When Node.js first came onto the scene it successfully popularized the event loop. Ryan Dahl correctly identified a serious problem with the way that I/O is generally handled in concurrent environments. Many web servers, for example, achieve concurrency by creating a new thread for every connection. On most platforms, this comes at a substantial cost. Taking a default thread stack size of 512KB in Java (the actual default varies by JVM and platform), 1000 concurrent connections would consume about half a gigabyte of memory just for stack space. Additionally, creating threads costs a significant amount of time on most systems, as does performing a context switch between two threads.
To address these issues, Node.js uses a single thread with an event loop. In this way, Node can handle thousands of concurrent connections without the traditional costs associated with threads. There is very little memory overhead per connection, and there is no context switching between per-connection threads.
General Read
There is only one thread that executes JavaScript code, and this is the thread where the event loop is running. The execution of callbacks (all userland code in a running Node.js application is ultimately run as a callback) is done by the event loop.
2> Great example – https://codeburst.io/how-node-js-single-thread-mechanism-work-understanding-event-loop-in-nodejs-230f7440b0ea
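var sockets = require('websocket.io');
var httpServer = sockets.listen(4000); // start a WebSocket server on port 4000

// websocket.io emits 'connection' with the socket for the newly connected client
httpServer.on('connection', function(socket) {
  console.log('connected...');
  socket.send('Web socket connected.');
  // per-client events are registered on the socket, not on the server
  socket.on('message', function(data) {
    console.log('message received:', data);
  });
  socket.on('close', function() {
    console.log('socket closed!');
  });
});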
Here, when sockets.listen(4000) executes, a WebSocket server is created on a single thread, the event loop, which listens continuously on port 4000. When a web or app client connects to it, a connection event fires; the loop picks it up, runs the registered callback and is then ready to receive the next request. This is the main difference between Node.js-based servers and IIS/Apache-based servers: Node.js does not create a new thread for every connection request. Instead it receives all requests on a single thread and delegates the slow work to background workers as required. The libuv library handles these workers in collaboration with the OS kernel. Libuv is the library that handles the queueing and processing of asynchronous events using the kernel. Most modern kernels are multi-threaded, so they can handle multiple operations executing in the background. When one of these operations completes, the kernel tells Node.js so that the appropriate callback may be added to the poll queue to eventually be executed.
Node has a thread pool, and you may be wondering: if Node pushes all those responsibilities down to the kernel, why would a thread pool be needed? Because the kernel doesn’t support doing everything asynchronously. In those cases Node has to occupy a pool thread for the duration of the operation so that the event loop can continue executing without blocking.
3> How does Node.js handle thousands of concurrent requests per second when writing them to Mongo?
Generally the web server and the database server are two different machines. Because of the async nature of Node, the event loop is free again as soon as it has forwarded the read/write request to the database server. That is why a Node.js HTTP server can handle a large number of requests while complex read/write operations are still in progress on the database server(s).
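To illustrate the point without a real database, here is a sketch in which fakeDbWrite is a hypothetical stand-in for a database driver call (it simply resolves after a timer fires). The names handleRequest, started and finished are also made up for this example.

```javascript
// Hypothetical stand-in for a database write: resolves after `ms` milliseconds.
const fakeDbWrite = (doc, ms) =>
  new Promise((resolve) => setTimeout(() => resolve({ saved: doc }), ms));

const started = [];
const finished = [];

async function handleRequest(id) {
  started.push(id); // the request is picked up by the event loop
  const result = await fakeDbWrite({ id }, 50); // forwarded; the event loop is free again
  finished.push(id); // resumes once the "database" has answered
  return result;
}

// Three "requests" arrive back to back.
const allHandled = Promise.all([1, 2, 3].map((id) => handleRequest(id)));

// All three have been started before any of them has finished.
console.log('started:', started, 'finished:', finished);

allHandled.then((results) => console.log('all saved:', results.length));
```

All three requests are in flight at the same time on one thread; the waiting happens in the (simulated) database, not in JavaScript.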
- Further Resources – [Very famous video](https://youtu.be/8aGhZQkoFbQ)
Callbacks – How exactly the flow of non-blocking code works in Node
Callbacks are functions that are executed asynchronously, or at a later time. Instead of the code reading top to bottom procedurally, async programs may execute different functions at different times based on the order and speed that earlier functions like http requests or file system reads happen.
The difference can be confusing since determining if a function is asynchronous or not depends a lot on context. Here is a simple synchronous example, meaning you can read the code top to bottom just like a book:
var myNumber = 1
function addOne() { myNumber++ } // define the function
addOne() // run the function
console.log(myNumber) // logs out 2
The code here defines a function and then on the next line calls that function, without waiting for anything. When the function is called it immediately adds 1 to the number, so we can expect that after we call the function the number should be 2. This is the expectation of synchronous code – it sequentially runs top to bottom.
Check the .js file in ./code/Non-blocking-mechanism.js
Node, however, uses mostly asynchronous code. Let’s use node to read our number from a file called number.txt:
const fs = require('fs'); // built into Node, so no separate package.json entry is needed
let myNumber; // undefined until number.txt has been read

const addOne = callbackFunction => {
  fs.readFile('number.txt', function doneReading(err, fileContent) {
    if (err) return console.error(err);
    myNumber = parseInt(fileContent, 10);
    myNumber++;
    callbackFunction();
  });
};

const logMyNumberFromCallback = () => {
  console.log(myNumber);
};

addOne(logMyNumberFromCallback); // => 2 (assuming number.txt contains 1)

// The line below executes first (before readFile is done) and logs 'undefined', even though
// it is placed after addOne() in the top-down flow of this file. readFile() is non-blocking:
// while it is reading number.txt, the code after it continues to execute.
console.log(myNumber); // => undefined
Why do we get undefined when we log out the number this time? In this code we use the fs.readFile method, which happens to be an asynchronous method. Usually things that have to talk to hard drives or networks will be asynchronous. If they just have to access things in memory or do some work on the CPU they will be synchronous. The reason for this is that I/O is reallyyy reallyyy sloowwww. A ballpark figure would be that talking to a hard drive is about 100,000 times slower than talking to memory (e.g. RAM).
When we run this program all of the functions are immediately defined, but they don’t all execute immediately. This is a fundamental thing to understand about async programming. When addOne is called it kicks off a readFile and then moves on to the next thing that is ready to execute. If there is nothing to execute node will either wait for pending fs/network operations to finish or it will stop running and exit to the command line. When readFile is done reading the file (this may take anywhere from milliseconds to seconds to minutes depending on how fast the hard drive is) it will run the doneReading function and give it an error (if there was an error) and the file contents. The reason we got undefined above is that nowhere in our code exists logic that tells the console.log statement to wait until the readFile statement finishes before it prints out the number.
If you have some code that you want to be able to execute over and over again, or at a later time, the first step is to put that code inside a function. Then you can call the function whenever you want to run your code. It helps to give your functions descriptive names.
Callbacks are just functions that get executed at some later time. The key to understanding callbacks is to realize that they are used when you don’t know when some async operation will complete, but you do know where the operation will complete — the last line of the async function! The top-to-bottom order that you declare callbacks does not necessarily matter, only the logical/hierarchical nesting of them. First you split your code up into functions, and then use callbacks to declare if one function depends on another function finishing.
The fs.readFile method is provided by node, is asynchronous, and happens to take a long time to finish. Consider what it does: it has to go to the operating system, which in turn has to go to the file system, which lives on a hard drive that may or may not be spinning at thousands of revolutions per minute. Then it has to use a magnetic head to read data and send it back up through the layers back into your javascript program. You give readFile a function (known as a callback) that it will call after it has retrieved the data from the file system. It puts the data it retrieved into a javascript variable and calls your function (callback) with that variable. In this case the variable is called fileContent because it contains the contents of the file that was read.
The logMyNumberFromCallback function gets passed in as an argument that becomes the callbackFunction variable inside the addOne function. ONLY after readFile is done will the callbackFunction variable be invoked (the variable being a function here, hence callbackFunction()). Only functions can be invoked, so if you pass in anything other than a function it will cause an error.
callbackFunction is actually logMyNumberFromCallback.
To break down this example even more:
- addOne will first run the asynchronous fs.readFile function. This part of the program takes a while to finish.
- When readFile finishes it executes its callback, doneReading, which parses fileContent for an integer called myNumber, increments myNumber and then immediately invokes the function that was passed in to addOne (its callback), logMyNumberFromCallback, which in turn just logs out the incremented myNumber.
Perhaps the most confusing part of programming with callbacks is how functions are just objects that can be stored in variables and passed around with different names.
This is an example of ‘evented programming’ or the ‘event loop’. These terms refer to the way that readFile is implemented. Node first dispatches the readFile operation and then waits for readFile to send it an event that it has completed. While it is waiting node can go check on other things.
Inside node there is a list of things that are dispatched but haven’t reported back yet, so node loops over the list again and again checking to see if they are finished. After they finish they get ‘processed’, e.g. any callbacks that depended on them finishing will get invoked. Here is a pseudocode version of the above example:
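// waitAMinuteAsync stands in for any async operation; here it is sketched with setTimeout
function waitAMinuteAsync(callback) {
  setTimeout(callback, 60 * 1000)
}

function addOne(thenRunThisFunction) {
  waitAMinuteAsync(function waitedAMinute() {
    thenRunThisFunction()
  })
}

addOne(function thisGetsRunAfterAddOneFinishes() {})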
Imagine you had 3 async functions a, b and c. Each one takes 1 minute to run and after it finishes it calls a callback (that gets passed in the first argument). If you wanted to tell node ‘start running a, then run b after a finishes, and then run c after b finishes’ it would look like this:
a(function() {
  b(function() {
    c()
  })
})
When this code gets executed, a will immediately start running, then a minute later it will finish and call b, then a minute later it will finish and call c and finally 3 minutes later node will stop running since there would be nothing more to do. There are definitely more elegant ways to write the above example, but the point is that if you have code that has to wait for some other async code to finish then you express that dependency by putting your code in functions that get passed around as callbacks.
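One of the more elegant ways is to use promises with async/await. The sketch below assumes a, b and c return promises instead of taking callbacks; the wait and step helpers are made up for the example, and the delays are shortened from one minute to a few milliseconds so it finishes quickly.

```javascript
// Hypothetical helper: wraps setTimeout in a Promise.
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const order = []; // records which step finished, in order

// Builds an async function that "works" for `ms` milliseconds, then records its name.
const step = (name, ms) => async () => {
  await wait(ms);
  order.push(name);
};

// a is the slowest and c the fastest, yet they still finish in a, b, c order below.
const a = step('a', 30);
const b = step('b', 20);
const c = step('c', 10);

async function runInSequence() {
  await a(); // b does not start until a has finished
  await b(); // c does not start until b has finished
  await c();
  return order;
}

const sequenceDone = runInSequence();
sequenceDone.then((result) => console.log('finished in order:', result.join(', ')));
```

The dependency between the steps is the same as in the nested version; await just lets it be written top to bottom instead of as nested callbacks.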
The design of node requires you to think non-linearly. Consider this list of operations:
read a file
process that file
If you were to turn this into pseudocode you would end up with this:
var file = readFile()
processFile(file)
This kind of linear (step-by-step, in order) code isn’t the way that node works. If this code were to get executed, processFile would be called immediately, before readFile had finished reading. This doesn’t make sense since readFile will take a while to complete. Instead you need to express that processFile depends on readFile finishing. This is exactly what callbacks are for! And because of the way that JavaScript works you can write this dependency many different ways:
var fs = require('fs')
fs.readFile('movie.mp4', function finishedReading(error, movieData) {
  if (error) return console.error(error)
  // do something with the movieData
})
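Two other ways of writing the same dependency, as a sketch. So that it runs anywhere, a small temporary text file stands in for movie.mp4; the file name callback-demo.txt and the function names finishedReading and readAndProcess are assumptions made for this example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Assumption: a small temp file stands in for movie.mp4.
const filePath = path.join(os.tmpdir(), 'callback-demo.txt');
fs.writeFileSync(filePath, 'hello');

// Way 1: declare the callback separately and pass it by name.
function finishedReading(error, data) {
  if (error) return console.error(error);
  console.log('callback style read', data.length, 'bytes');
}
fs.readFile(filePath, finishedReading);

// Way 2: the promise-based fs API with async/await.
async function readAndProcess(file) {
  const data = await fs.promises.readFile(file); // waits here without blocking the event loop
  return data.length; // "process" the file: here, just count its bytes
}

const bytesRead = readAndProcess(filePath);
bytesRead.then((n) => console.log('promise style read', n, 'bytes'));
```

In both cases the processing step only runs once the read has finished; the difference is purely in how that dependency is spelled out.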