
Under the colors tab, use whatever colors you would like, but be sure they are bold and distinct. For example, it would be a bad idea to use white, or to use both red and red-orange.
You will take a small subset of the data (I recommend n=400, so as not to upset XLSTAT too much: some clustering algorithms grind to a halt with large data sets).

Process:

Step 1: Select your sample of n=400. Note that “Weight,” just like last time in project 1, is probably not what you think it is! I’m not forbidding “Weight,” but realize you’ll have to do a bunch of research about what “Weight” is in order to write about it well and get credit! Lucky for us, XLSTAT is quite good at taking a random sample. Check out: Simple Random Sampling in XLSTAT. Alternatively, there are several tutorials online for taking a random sample with regular Excel.

Step 2: Choose two QUANTITATIVE variables that you would like to work with. Copy and paste your two variables and their corresponding sampled data (there should be 400 rows of data, two columns) into a new sheet. I prefer to do this so that I am not overwhelmed by variables that I am not using. Next, remove any rows with missing observations. This will save time later when you go to plot your clusters.

For the following steps, be sure that you have installed the XLSTAT add-in. Click on the XLSTAT tab at the top of your Excel sheet.

Step 3: Use different options in the software to create 5 different “data stories.” If you’re overwhelmed about what to pick, you can use these options:

*Scatterplots will have to be created separately using the Results by Object output.
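If you would like to see what Steps 1 and 2 amount to outside of Excel, here is a minimal Python sketch using only the standard library. It is not what XLSTAT does internally; it just draws a simple random sample without replacement, keeps two variables, and drops rows with missing values, in the same order as the steps above. The function name `sample_and_clean`, the column names `VAR_A` and `VAR_B`, and the made-up stand-in data are all assumptions for illustration only.

```python
import random


def sample_and_clean(rows, var_x, var_y, n=400, seed=0):
    """Draw a simple random sample of n rows, keep two variables,
    and drop any sampled row where either value is missing."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = rng.sample(rows, min(n, len(rows)))  # without replacement
    kept = []
    for row in sampled:
        x, y = row.get(var_x), row.get(var_y)
        if x in (None, "") or y in (None, ""):
            continue  # missing observation: skip the whole row
        kept.append((float(x), float(y)))
    return kept


# Made-up stand-in for a data sheet (NOT real survey data).
# Every tenth row has a blank VAR_B to mimic missing observations.
fake_rng = random.Random(1)
population = [
    {"VAR_A": fake_rng.randint(18, 90),
     "VAR_B": fake_rng.randint(0, 100000) if i % 10 else ""}
    for i in range(5000)
]

pairs = sample_and_clean(population, "VAR_A", "VAR_B")
print(len(pairs))  # at most 400; fewer if some sampled rows had blanks
```

Because missing rows are removed after sampling, you may end up with slightly fewer than 400 usable rows, just as you would in the spreadsheet.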

If you don’t have a data set of your own you’d like to explore, I recommend taking a random subset of the OregonPUMS_data.
For this project, we’re going to use cluster analysis to “tell a story” about our data. I’m asking you to divide the Oregonians in your sample into groups or clusters based on two quantitative variables. Your “story” will be an explanation of your data that highlights some interesting feature(s) or makes a point about the data. Please note this project will likely take some trial and error. Trial and error is the spice of life! Please relax into it and have some fun with the process: think of it as an exploration. You will begin by opening a small data set.
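To make “dividing a sample into clusters based on two quantitative variables” a bit more concrete, here is a minimal sketch of k-means, one common clustering method, in plain Python. This is an illustration only: it is not XLSTAT’s implementation, the starting centers are chosen with a simple farthest-point rule that I picked so the result is deterministic, and the six points are made up. If your two variables are on very different scales, you would usually standardize them first; the sketch skips that step.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two (x, y) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, iterations=50):
    """Minimal k-means on (x, y) pairs. Returns (labels, centers)."""
    # Assumed initialization: start at the first point, then repeatedly
    # add the point farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # nothing moved, so the clusters are stable
        centers = new_centers
    return labels, centers


# Made-up points forming two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centers = kmeans(points, k=2)
print(labels)  # -> [0, 0, 0, 1, 1, 1]
```

The labels play the same role as the cluster assignments you will read off your software output: each row of your sheet gets a group number, and your “story” is about what the groups have in common.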
