The first is Parallel.js, which uses web workers (similar to threads, but without control over how many threads are used and without the mutex locks and joins that pthreads provide). It has only four functions: spawn, which spawns a worker to do a task; map, which spawns a worker to apply a function to each element in a list; reduce, which applies a function to the elements of a list and returns a scalar value; and require, which lets the workers share some state, such as a function that they will all use.
This example finds the Fibonacci numbers (1, 1, 2, 3, 5, 8, 13, ...) at the 40th, 41st and 42nd indices. The serial version takes the list 40, 41, 42 and applies the function to each element in turn. It takes 10.423 seconds:
When done in parallel, by calling map on the list and then printing the result, it takes only 6.369 seconds:
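As a rough illustration of the serial version, here is a minimal sketch. The function names `fib` and `serialFib` are assumptions, as is the 0-based indexing (so that index 0 and index 1 both give 1); small indices are used so it finishes quickly.

```javascript
// Naive recursive Fibonacci; deliberately slow so the work is CPU-bound.
// Assumption: the sequence is 0-indexed, so fib(0) = fib(1) = 1.
function fib(n) {
  return n < 2 ? 1 : fib(n - 1) + fib(n - 2);
}

// Apply fib to each index one after another on the main thread.
function serialFib(indices) {
  return indices.map(fib);
}

// Small indices keep the sketch fast; the text above uses [40, 41, 42].
console.log(serialFib([10, 11, 12])); // prints [ 89, 144, 233 ]
```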
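The same map-over-a-list idea can be sketched without Parallel.js. The version below swaps in Node's built-in worker_threads module, so it is not the Parallel.js API; `fibInWorker` and `parallelFib` are hypothetical helper names, and one worker per list element is an assumption made for simplicity.

```javascript
const { Worker } = require('worker_threads');

// Worker script kept as a string and run with `eval: true` so the sketch
// stays in one file. Each worker computes one Fibonacci number.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  function fib(n) { return n < 2 ? 1 : fib(n - 1) + fib(n - 2); }
  parentPort.postMessage(fib(workerData));
`;

// Hypothetical helper: start one worker for index n and resolve with its result.
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Promise.all keeps the results in the same order as the input list.
function parallelFib(indices) {
  return Promise.all(indices.map(fibInWorker));
}

// Small indices keep the sketch fast; the text above uses [40, 41, 42].
parallelFib([20, 21, 22]).then((results) => console.log(results));
```

Because each element runs in its own worker, the total time is bounded by the slowest element plus worker startup cost, rather than by the sum of all three.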
The second package is child_process, which is mostly used to allow responses to user input (like requests to a server) to happen in parallel with the main program. For example, in this program, the parent process is constantly listening for messages from the child process:
When user input arrives, the parent can send a message to the child:
The child then executes whatever action must take place (which in some cases could be a very long process) and sends its output in a message back to the parent:
Thus, the parent can keep listening for user input rather than waiting for the response to finish. You can see in the output that the parent keeps executing, stopping only to print the message from the child.
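The parent/child exchange described above can be sketched in a single file as follows. This is an illustration under assumptions, not the original program: the child script is written to a temporary file so the example is self-contained, the 200 ms delay stands in for a long task, and `runRequest` and the message fields `id`, `text` and `result` are hypothetical names.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Child script: waits for a message, simulates slow work, then replies.
const childSource = `
  process.on('message', (msg) => {
    setTimeout(() => {
      process.send({ id: msg.id, result: msg.text.toUpperCase() });
    }, 200);
  });
`;
const childPath = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'child-')), 'child.js');
fs.writeFileSync(childPath, childSource);

// Hypothetical helper: hand one request to a child and resolve with its reply.
function runRequest(text) {
  return new Promise((resolve, reject) => {
    const child = fork(childPath);
    child.once('error', reject);
    // The parent is not blocked: this timer keeps firing while the child works.
    const ticker = setInterval(() => console.log('parent: still running'), 50);
    child.once('message', (reply) => {
      clearInterval(ticker);
      child.kill(); // close the IPC channel so the parent can exit
      resolve(reply);
    });
    child.send({ id: 1, text });
  });
}

runRequest('hello').then((reply) => console.log('parent: child replied', reply));
```

Running it prints several "still running" lines from the parent before the child's reply appears, which is the interleaving the text describes.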
Going beyond these, there are three main projects in the works.
Nashorn, which would be a fully multi-threaded data model that can run multiple Java threads in a single program. The programmer would need to be extra careful that there are no race conditions in the underlying Java threads. The author of the paper I was reading thinks this is too big of a step.
The last, and in my opinion the most realistic, is the Shared Array Buffer (SharedArrayBuffer), which allows the workers to have access to shared memory with the additional ability to lock data while it is being used. This is not full threading, but it is closer, and it is much easier to implement.
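To make the shared-memory idea concrete, here is a minimal sketch using Node's worker_threads with a SharedArrayBuffer. It uses `Atomics.add` for safe concurrent updates rather than an explicit lock, which is a simplification; `countInParallel` and the `shared` and `increments` fields are hypothetical names.

```javascript
const { Worker } = require('worker_threads');

// Worker script: increments a counter that lives in memory shared with the parent.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    Atomics.add(counter, 0, 1); // atomic read-modify-write, so no update is lost
  }
  parentPort.postMessage('done');
`;

// Hypothetical helper: run several workers against one shared counter.
function countInParallel(workerCount, increments) {
  const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
  const counter = new Int32Array(shared);
  const jobs = [];
  for (let w = 0; w < workerCount; w++) {
    jobs.push(new Promise((resolve, reject) => {
      // A SharedArrayBuffer in workerData is shared with the worker, not copied.
      const worker = new Worker(workerSource, {
        eval: true,
        workerData: { shared, increments },
      });
      worker.once('message', resolve);
      worker.once('error', reject);
    }));
  }
  return Promise.all(jobs).then(() => Atomics.load(counter, 0));
}

countInParallel(4, 10000).then((total) => console.log(total)); // prints 40000
```

With a plain `counter[0]++` instead of `Atomics.add`, concurrent workers could overwrite each other's updates and the total could come out lower.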
1. I completed task 1 of part 1. Similar to how I parallelized quicksort in Ruby, I used p.spawn to parallelize the process:
When a Parallel object is created, it takes in an argument. The functions that are called inside the quicksort function, partition and compare, need to be passed to require. When the spawn function is called, it takes quicksort as an argument and calls it, then logs the arguments once it is done. This works for two recursive steps and then fails with an error saying that a message argument was not given to the function, so it cannot finish. The documentation for Parallel.js is not great, so I could not figure out how to fix this problem.
2. The second extension that I completed was to experiment with the child_process package to parallelize the process of modifying an image. The parent process reads in a PNG image, loops over its pixels and, for each pixel, sends a message with that pixel's information to the child process:
The child process then modifies the pixel and sends it back to the parent:
The result is then written to an output PNG.
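The per-pixel message flow can be sketched as below. This is not the original program: PNG decoding and encoding are left out, so the image is a raw RGBA byte buffer; color inversion is an assumed stand-in for the actual modification; and `invertPixels` and the `index` field are hypothetical names.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Child script: receives one pixel, inverts its color channels, sends it back.
const childSource = `
  process.on('message', ({ index, r, g, b, a }) => {
    process.send({ index, r: 255 - r, g: 255 - g, b: 255 - b, a });
  });
`;
const childPath = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'pixels-')), 'child.js');
fs.writeFileSync(childPath, childSource);

// Hypothetical helper: pixels is a non-empty Buffer of RGBA bytes, 4 per pixel.
function invertPixels(pixels) {
  return new Promise((resolve, reject) => {
    const out = Buffer.alloc(pixels.length);
    const total = pixels.length / 4;
    let received = 0;
    const child = fork(childPath);
    child.once('error', reject);
    // Write each returned pixel into the output at its original position.
    child.on('message', ({ index, r, g, b, a }) => {
      out.set([r, g, b, a], index * 4);
      received += 1;
      if (received === total) {
        child.kill();
        resolve(out);
      }
    });
    // One message per pixel, tagged with its index.
    for (let i = 0; i < total; i++) {
      const [r, g, b, a] = pixels.subarray(i * 4, i * 4 + 4);
      child.send({ index: i, r, g, b, a });
    }
  });
}

// Two pixels: opaque red and an opaque light blue.
const image = Buffer.from([255, 0, 0, 255, 0, 128, 255, 255]);
invertPixels(image).then((out) => console.log([...out]));
```

Sending one message per pixel keeps the example close to the description, but each message carries serialization overhead, so batching rows or tiles per message would likely be faster on a real image.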