no more than 300 words that describes the data set and your work with it
This is a UCI data set from 2014 about clients' annual spending at a wholesale distributor. Each row records how many monetary units a given client spent on each category of item in a year, and 440 clients were recorded. There are 8 attributes: channel (i.e. the client type: "1" for hotel/restaurant and "2" for retail such as a supermarket), region (Lisbon, Oporto, or other, numbered 1-3), fresh products, milk products, grocery products, frozen products, detergents/paper products, and delicatessen (deli products like cold cuts). The numbers under the product types represent monetary units (m.u.), a generic stand-in for a regular currency. I will be working with this data by observing relationships between certain features, performing clustering, and doing PCA.
clearly describing the data set, its source, and the main problems for which you are developing data analyses and visualizations.
As mentioned above, this data set comes from the UCI Machine Learning Repository, where it was donated in 2014. Based on the region names, the data appears to have been collected in Portugal. The main questions I seek to answer are, in general terms, "Is there a relationship between the kind of client and the type of goods it spends the most on?", "Are any product types correlated with one another, whether positively or negatively?", and "Can we cluster the clients into groups with similar spending in certain categories?"
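To make the first two questions concrete, here is a minimal sketch of the kind of check I have in mind: a Pearson correlation between two product categories, and the average spending in a category for each channel. The rows are made-up values for illustration only, not taken from the real data set, and the helper names `pearson` and `mean_by_channel` are my own. On the full data, something like pandas' `DataFrame.corr` and `groupby` would probably be the more natural route.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical rows for illustration only; NOT values from the real data set.
# Columns: (channel, grocery, detergents_paper, fresh), spending in m.u.
rows = [
    (1, 2000, 300, 12000),
    (1, 2500, 400, 15000),
    (1, 1800, 250, 9000),
    (2, 9000, 4000, 5000),
    (2, 12000, 5500, 3000),
    (2, 10000, 4500, 4000),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def mean_by_channel(rows, col):
    """Average of column index `col`, grouped by channel (column 0)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row[col])
    return {channel: sum(vals) / len(vals) for channel, vals in groups.items()}


grocery = [r[1] for r in rows]
detergents = [r[2] for r in rows]
fresh = [r[3] for r in rows]

r_grocery_detergents = pearson(grocery, detergents)  # positive in this toy sample
r_grocery_fresh = pearson(grocery, fresh)            # negative in this toy sample
fresh_means = mean_by_channel(rows, 3)

print("grocery vs detergents/paper:", round(r_grocery_detergents, 3))
print("grocery vs fresh:", round(r_grocery_fresh, 3))
print("mean fresh spending by channel:", fresh_means)
```

The signs here come from how I built the toy rows, so they say nothing about the real data; the point is only the shape of the calculation.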
describe the visualization and analysis tools/methods you used
Maybe Naive Bayes and k-NN if I'm feeling masochistic
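If I do try k-NN, the idea would be to predict a client's channel from its spending. Below is a rough sketch of a k-nearest-neighbours vote in plain Python, assuming two spending features and made-up training rows (not values from the real data set). The function name `knn_predict` and the choice of `k=3` are my own assumptions. In practice scikit-learn's `KNeighborsClassifier` would do the same job with less code.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical training rows for illustration only; NOT from the real data set.
# Each item: ((grocery, detergents_paper) spending in m.u., channel label)
train = [
    ((2000, 300), 1),
    ((2500, 400), 1),
    ((1800, 250), 1),
    ((9000, 4000), 2),
    ((12000, 5500), 2),
    ((10000, 4500), 2),
]


def knn_predict(train, point, k=3):
    """Return the majority channel label among the k nearest training rows."""
    nearest = sorted(train, key=lambda item: dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


pred_low = knn_predict(train, (2200, 350))      # sits among the channel-1 rows
pred_high = knn_predict(train, (11000, 5000))   # sits among the channel-2 rows
print(pred_low, pred_high)
```

Since k-NN relies on raw distances, the features on the real data would likely need to be standardized first so that a high-spending category does not dominate the vote.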
show the visualizations and analysis results
highlights the important results and concludes the writeup
Thank you to my high school buddy John for explaining some economics things