There are two main options for parallelism in Ruby: threads and forking.
Threads are very similar to pthreads in C: they are used to parallely or concurrently run code. In Ruby, rather than passing in a worker function like in C, a do block is used to tell the thread what it needs to do. For example:
This adds a new thread to the threads list that will execute the quickSort algorithm on the arguments given. A caution is that there are only a limited number of threads that can be used by a Ruby file, so creating too many threads will throw an error.
Similar to C, threads need to be joined to get them to all complete their actions, so
Will join all of the threads in the threads list.
Also similar to C, Ruby has the ability to mutex lock resources to eliminate race conditions. For example, I was having problems with this section of code executing correctly because the threads.do block were not getting the correct value of start and ending:
But, adding the mutex lock to synchronize the threads eliminated the problem because they all ran with the correct j, start, and end values.
The second option for parallelism in Ruby is using Process.fork to fork threads to execute different sections of code. For example:
This does something very similar to the example above: it forks a process to run the quickSort function on a different process, parallelizing the program. There is also a need to be sure not to create too many forked processes because it will not only crash the program, but there is no way to close the once the program crashes, so a computer restart is required. Also note: THIS DOES NOT WORK. It took me a while to figure this out, but the fork cannot share variables outside of its scope, so it cannot return the updated list, which means this does not sort the list. Testing is important, people!
There is much less control over forks in Ruby than over threads. There is no list of threads that can be mapped and reused - the forks operate independently from this program so control over than is lost until they finish.
I completed tasks 1 and 2 from part 1 in Ruby.
For task 1, I wrote a quick sort algorithm that breaks an array into the number of threads parts, then uses my quick sort algorithm from project 4 to sort each part. Then I merge these sorted lists into an overall sorted list to get the whole sorted list using this loop:
I experimented with this algorithm to see how the the time it takes to sort an array with 500,000 elements changes with the number of threads used:
Since my computer has two cores, it can run two threads truly in parallel, so going from one to two threads sees the biggest speed up. After that there is a small speed up when using 4 threads, and then the time actually increases with 8 and 16 threads.
I then looked if I used two threads how the time changes with the number of items in the list. This is what I found:
Thus, as the number of items in the list doubles the time increases by about double and when the size increases by a factor of 5, the time increases by approximately a power of 5. Thus, the time it takes is proportional to the number of items in the list.
For task2, I first needed to determine how to modify images in Ruby at the pixel level. I found the pnm package of ruby, which reads in an image using its read function, then the object that it returns has a data field that contains an array of arrays for each pixel, which is an array with the red, green, blue and alpha values. I could loop over each pixel to change its rgb value similar to in colorize.c:
Then, the write function will write an image to a file.
To parallelize this program, I made use of threading in Ruby in a more explicit way. I divided the image by rows into the number of threads and used a thread to modify a part of the image. This section of the code is in an image above because I ran into a race condition that was only modifying some parts of the image and skipping some sections. Therefore, I added a semaphore object, which locks critical section of code. The entire image is modified under these conditions. Here is the picture before:
When this process is run serially, it takes 1.243734 seconds. When it is run with 1 thread, it takes 1.2295 seconds. When it is run with two threads, it takes 1.158219 seconds. When it is run with 4 threads, it takes 1.053219 seconds. If it is run with 8 threads, it takes 1.042291 seconds. And when it is run with 16 threads, it takes 1.074798 seconds. Since my machine has 2 cores that can run programs truly parallely there is a big speed up from 1 to 2 threads. There is a moderate speed up from 2 to 4 threads. Then after that there is no speed up from using more cores. The best option for using threads is 4 threads because it runs fast without using a lot of CPU power to manage a lot of threads.
1) The first extension that I completed is explained above - I implemented the pixel modification of an image in such a way that it uses a semaphore/mutex lock to synchronize the critical section of code.
2) The second extension that I completed was to write a haiku in Ruby regarding how forks are declared and then destroyed in a Ruby program:
This shows that the process is automatically joined back to the main thread when it hits the end of the do block - the user does not manually have to join it like a thread.
3) The third extension that I completed was to write a haiku in Ruby regarding how threads are declared and then destroyed in Ruby:
This shows how even after the end of the block for the thread, there is still a need to join the thread back to the main thread. Otherwise, the main thread will not wait for the child thread to finish executing and could, therefore, terminate before the child process is done.
4) The fourth extension that I completed was to create a second image filter in Ruby. I created an image filter that swaps the red and blue color values. This results in a very purple image, which I love because I love purple:
5) The fifth extension that I completed was to implement another useful parallel algorithm: sum, which sums all of the values in a list of random integers. I used a list with 500,000 elements in the list and ran the function with various numbers of threads, which resulted in the following output:
As usual, there is the greatest speed up from one thread to two threads, then there is a small speed up from 2 threads to 4 threads, then an increase in the time it takes with 8 threads and 16 threads. Thus, the optimal number of threads to use in this case is 2.