The goal of this project is to explore file I/O and functions in Ruby, and to create a word counter.
Task 1: File I/O
The first task required me to explore how to interact with standard input and standard output, and how to read and write files in Ruby. Ruby has built-in support for reading files, and there is also a csv library that can be required to make reading CSV files easier (for example, it has built-in functions that read each line in as an array). Ruby can read various kinds of files, for example, text files (where filename is the name of a text file given on the command line):
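A minimal sketch of reading a text file line by line. The file name sample.txt and its contents are made up so that the example runs on its own; in the project the name would come from the command line (ARGV).

```ruby
require "tmpdir"

# Create a small text file so the example is self-contained
# (sample.txt is a hypothetical name, not a file from the project).
path = File.join(Dir.tmpdir, "sample.txt")
File.write(path, "first line\nsecond line\n")

# Open the file for reading; the block form closes it automatically.
lines = []
File.open(path, "r") do |file|
  file.each_line do |line|
    lines << line.chomp # chomp strips the trailing newline
  end
end

lines.each { |line| puts line }
```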
Binary files, like PDFs:
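A sketch of reading a file in binary mode. Instead of a real PDF, it writes a few raw bytes to a hypothetical file sample.bin so the example is self-contained; the "rb" mode is what keeps Ruby from interpreting the contents as encoded text.

```ruby
require "tmpdir"

# Write six raw bytes (the ASCII for "%PDF" plus two non-text bytes)
# to a hypothetical file so there is something binary to read.
path = File.join(Dir.tmpdir, "sample.bin")
File.binwrite(path, [0x25, 0x50, 0x44, 0x46, 0x00, 0xFF].pack("C*"))

# "rb" opens the file for reading in binary mode.
data = File.open(path, "rb") { |file| file.read }

puts data.bytesize       # number of bytes read
puts data.bytes.inspect  # the raw byte values
```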
And CSV files, which are also text files but have their own library to help read them:
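A sketch of reading a CSV file with the csv library, where each row arrives as an array of strings. The file sample.csv and its columns are invented for illustration.

```ruby
require "csv"
require "tmpdir"

# Hypothetical CSV file created here so the example runs on its own.
path = File.join(Dir.tmpdir, "sample.csv")
File.write(path, "name,color\nRex,brown\nSpot,white\n")

# CSV.foreach yields each line already split into an array of fields.
rows = []
CSV.foreach(path) do |row|
  rows << row
end

rows.each { |row| puts row.inspect }
```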
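The user can also enter information at the command line interactively, although I found this can get confused in a program that also reads a text file named on the command line. A likely cause is that a bare gets reads from the files listed in ARGV whenever any are given, whereas $stdin.gets always reads from standard input. I therefore made a separate program, task1b.rb, that shows user interaction: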
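The contents of task1b.rb are not reproduced here, so the following is only a sketch of interactive input. The function ask_name is a hypothetical name; it takes its input and output streams as parameters so that it can be demonstrated with a simulated keyboard, and in a real program it would be called with no arguments to use $stdin and $stdout.

```ruby
require "stringio"

# Prompt for a name and greet the user. Reading from an explicit stream
# (rather than a bare gets) avoids accidentally reading from ARGV files.
def ask_name(input = $stdin, output = $stdout)
  output.print "What is your name? "
  name = input.gets.to_s.chomp # to_s guards against nil at end of input
  output.puts "Hello, #{name}!"
  name
end

# Simulated run: a StringIO stands in for what the user would type.
ask_name(StringIO.new("Ada\n"))
```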
The main difference between files and strings in Ruby is how their contents are obtained. A file needs to be opened, and each line is read from it using the gets method. Once a line has been read in, however, it is an ordinary string: it can be concatenated with another string, and chomp can be called on it (which removes the end-of-line character from the string).
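To illustrate, here is a small sketch (with a made-up file two_lines.txt) that reads lines with gets and then treats them as ordinary strings:

```ruby
require "tmpdir"

# Hypothetical two-line file created so the example is self-contained.
path = File.join(Dir.tmpdir, "two_lines.txt")
File.write(path, "hello\nworld\n")

file = File.open(path, "r")
first = file.gets   # "hello\n" (the newline is still attached)
second = file.gets  # "world\n"
at_end = file.gets  # nil once the end of the file is reached
file.close

# The lines are plain strings, so chomp and + work as usual.
joined = first.chomp + " " + second.chomp
puts joined
```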
The output of task1a is:
And the output of writing to a file:
Task 2: Functions
The second task in this project was to investigate functions in Ruby. Unlike functions in C and Java, Ruby functions do not declare a return type, so the same function can return nil or values of more than one type:
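The original function is not shown here, so this is a sketch with hypothetical names: double returns whatever type its argument's + produces, and first_even returns either an integer or nil.

```ruby
# No return type is declared; the result type follows from the argument.
def double(value)
  value + value
end

# Returns the first even number, or nil when there is none.
def first_even(numbers)
  numbers.each do |n|
    return n if n.even?
  end
  nil
end

puts double(21)                 # an Integer
puts double("ab")               # a String
puts first_even([1, 3, 4]).inspect
puts first_even([1, 3]).inspect # nil
```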
As the previous function shows, a single Ruby function can work on arguments of different types, since the types of the parameters aren't specified. The previous function can be used on both strings and integers; Ruby does not keep two separate versions of it, as overloading in C++ or Java would, but instead the behavior depends on how the argument's own class implements the operations used inside the function. Ruby operators (like +, -, etc.) can also be defined in a class to give them a meaning for objects of that class, and which definition runs is determined by the class of the object the operator is called on. For example, the + operator is defined here in a class:
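A sketch of defining + for a class. Vector2 is a hypothetical class chosen for illustration; it is not necessarily the class used in the project.

```ruby
class Vector2
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Defining a method named + makes "a + b" work for Vector2 objects.
  def +(other)
    Vector2.new(x + other.x, y + other.y)
  end

  def to_s
    "(#{x}, #{y})"
  end
end

sum = Vector2.new(1, 2) + Vector2.new(3, 4)
puts sum
```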
Ruby also allows functions to accept any number of arguments using a *args parameter. args is an array that can be iterated through to gather all of the arguments given at a function call. For example:
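A minimal sketch, using a hypothetical function total that adds up however many numbers it is given:

```ruby
# *args collects every argument from the call into one array.
def total(*args)
  sum = 0
  args.each { |n| sum += n }
  sum
end

puts total           # no arguments: args is an empty array
puts total(1, 2, 3)
```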
Ruby can also create functions dynamically. In my example, the member functions of a class are created dynamically, so there is no need to write a separate get method for each field in the class:
This creates methods brown?, black?, and white? that return whether the dog object is that color.
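The original class is not reproduced here; the sketch below assumes a Dog class with a single color field and uses define_method, one of Ruby's standard ways to generate methods at run time.

```ruby
class Dog
  COLORS = ["brown", "black", "white"]

  def initialize(color)
    @color = color
  end

  # Generate one predicate method per color instead of writing each by hand.
  COLORS.each do |color|
    define_method("#{color}?") do
      @color == color
    end
  end
end

rex = Dog.new("brown")
puts rex.brown?
puts rex.white?
```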
Functions in Ruby can also be redefined to give them a new definition. For example:
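A sketch of redefinition, with made-up greeting strings; the second def simply replaces the first from that point on.

```ruby
def hello
  "Hello!"
end

first = hello # uses the original definition

# Defining hello again replaces the earlier definition.
def hello
  "Hello again!"
end

second = hello # uses the updated definition

puts first
puts second
```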
The first hello calls the first version of the hello function and the second hello calls the updated version of the hello function.
Finally, functions can be called from other functions as long as they are defined before the call is executed. For example:
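The sketch below assumes functions named hello and add, as in the surrounding text, with bodies invented for illustration. It shows that what matters is whether add exists at the moment hello runs.

```ruby
def hello
  add(3, 4)
end

# add does not exist yet, so running hello now fails.
begin
  hello
  early_result = :ok
rescue NameError => e # NoMethodError is a subclass of NameError
  early_result = :undefined
  puts "add is not defined yet (#{e.class})"
end

def add(a, b)
  a + b
end

# Now that add is defined, the same hello works.
late_result = hello
puts late_result
```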
add(3,4) cannot be called in hello because add is not yet defined when hello runs. Functions can be used in other files by loading the file where the function is defined into the file where the function is used. For example:
Note that any top-level function calls in task2a (those not enclosed in another function) are executed when task2a is loaded.
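A self-contained sketch of loading a function from another file. It writes a hypothetical helper_functions.rb to a temporary directory and then loads it; in the project the loaded file would be task2a instead.

```ruby
require "tmpdir"

# Create the file to be loaded (a stand-in for a real source file).
helper_path = File.join(Dir.tmpdir, "helper_functions.rb")
File.write(helper_path, <<~RUBY)
  def add(a, b)
    a + b
  end

  # Top-level statement: runs as soon as the file is loaded.
  $helper_loaded = true
RUBY

# load executes the file, making its functions available here.
load helper_path

puts add(3, 4)
puts $helper_loaded
```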
The output for task2a is:
And the output for task2b is:
Task 3: Word Counter
The third task for this project was to create a word counter that counts the number of occurrences of each word in a text file (ignoring case and punctuation). I used an array to keep pairs of each word and the number of times it appears.
The program starts by opening the file given on the command line and reading each line, splitting the line on whitespace, converting each word to lowercase, and removing all punctuation. Once all of the words are in a list, I build a final list by looping through the words: if a word is already in the final list, I increment its associated count, and if not, I add it to the list with a count of 1. I then use the qsort algorithm from project 4 with a new compare function that compares each word's count, so the final list is sorted from most common word to least common word. Finally, I print either the first twenty words or the whole list if it has fewer than 20 words.
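The steps above can be sketched as follows. This is not the project's code: count_words is a hypothetical name, it takes the text as a string rather than opening a file, it treats anything other than letters and digits as punctuation, and it uses Ruby's built-in sort_by in place of the qsort from project 4.

```ruby
def count_words(text)
  # Split each line on whitespace, lowercase, and strip punctuation.
  words = []
  text.each_line do |line|
    line.split.each do |raw|
      word = raw.downcase.gsub(/[^a-z0-9]/, "")
      words << word unless word.empty?
    end
  end

  # Build an array of [word, count] pairs, as described in the text.
  counts = []
  words.each do |word|
    pair = counts.find { |entry| entry[0] == word }
    if pair
      pair[1] += 1
    else
      counts << [word, 1]
    end
  end

  # Most common word first (built-in sort, standing in for qsort).
  counts.sort_by { |_, count| -count }
end

sample = "The cat saw the dog. The dog ran!"
# first(20) returns the whole list when it has fewer than 20 entries.
count_words(sample).first(20).each { |word, count| puts "#{word}: #{count}" }
```

Looking up each word with find is linear in the size of the list, which matches the array-of-pairs approach described above; a Hash would make the lookup faster.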
This is the output when the program is called on wctest.txt:
- I extended the word counter so that it raises an error unless exactly one file is given to the program, or if the given file cannot be opened:
- The second extension that I completed was to write a haiku about reading from a text file:
- The third extension that I completed was to determine how to read in a text file from the internet using its URL. This required the open-uri library, but after that the opening and reading process was not much different from that of a usual text file:
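The first and third extensions can be sketched as below. Both function names are hypothetical, the error messages are invented, and no URL is fetched here; the caller would supply one.

```ruby
require "open-uri"

# First extension (sketch): insist on exactly one argument and report
# a clear error if the file cannot be opened.
def read_input_file(args)
  raise ArgumentError, "expected exactly one file name" unless args.length == 1

  begin
    File.read(args[0])
  rescue SystemCallError => e # e.g. Errno::ENOENT for a missing file
    raise IOError, "could not open #{args[0]}: #{e.message}"
  end
end

# Third extension (sketch): with open-uri loaded, URI.open yields an
# object that can be read much like an ordinary file.
def read_remote_text(url)
  URI.open(url) { |remote| remote.read }
end
```

A caller would pass ARGV to read_input_file, or a URL string to read_remote_text, and hand the returned text to the word counter.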