

The goal of this project was to find the most common words in a text document of Reddit comments and to find trends in how often a set of words appeared over 8 years of Reddit comments.  To find the most common words, the program starts from the word count files generated in Project 7 using a binary search tree; each file lists every word and how many times it appeared.  The word-count pairs are read into a binary search tree again and then put into a priority queue.  The priority queue removes items in order of how many times each word appears, and the resulting list is printed to the terminal.  In an extension, the program reads the word-count pairs directly into a priority queue.  To find trends, the program reads the word-count pairs into a binary search tree and then looks up each word from a given list.  It returns each word's frequency (the number of times it appears divided by the total number of words in the document) and prints a list of how many times each word appears to the terminal.  A graph is then made from the results.  In extensions, the program writes the results to files rather than printing them to the terminal.    ***add results
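The frequency calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's code: a plain Python dict stands in for the binary search tree, and the function name `word_frequencies` and the sample counts are hypothetical.

```python
def word_frequencies(counts, words):
    """Return {word: count / total} for each requested word.

    `counts` maps every word in one document to how many times it appeared.
    A dict is used here in place of the binary search tree (an assumption
    made to keep the sketch short).
    """
    total = sum(counts.values())  # total number of words in the document
    if total == 0:
        return {w: 0.0 for w in words}
    # Words that never appear in the document get a frequency of 0.
    return {w: counts.get(w, 0) / total for w in words}


# Hypothetical counts for one document (made-up numbers, not project results).
sample_counts = {"the": 50, "reddit": 30, "cat": 20}
freqs = word_frequencies(sample_counts, ["reddit", "dog"])
print(freqs)
```

Running the same function on the counts for each year would give one frequency per word per year, which is the kind of series that can then be graphed.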


1) The first task was to create FindCommonWords, which prints the words in a word count file (from Project 7).
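As a rough illustration of how a priority queue can produce the most common words, here is a minimal sketch. It assumes the word-count pairs have already been read from the count file (the file format is not shown here), and it uses Python's `heapq` module rather than the project's own priority queue; the function name `most_common` and the sample pairs are hypothetical.

```python
import heapq


def most_common(pairs, n):
    """Return the n (word, count) pairs with the highest counts."""
    # heapq implements a min-heap, so counts are negated to make the
    # largest count come out first.
    heap = [(-count, word) for word, count in pairs]
    heapq.heapify(heap)
    result = []
    while heap and len(result) < n:
        neg_count, word = heapq.heappop(heap)
        result.append((word, -neg_count))
    return result


# Hypothetical word-count pairs (made-up numbers, not project results).
pairs = [("the", 50), ("cat", 20), ("reddit", 30)]
top = most_common(pairs, 2)
for word, count in top:
    print(word, count)
```

Removing items one at a time means only the top n entries have to be pulled out of the queue, rather than sorting the full word list.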

