Can anyone give me a good guide to describing the distribution of data, e.g. skewed distributions, outliers, etc.? I really don't understand what they mean. Help!
I can certainly help. Would you like me to post some notes here, or do you want to post a specific question to get help with?
You need to begin by making sure you are clear on the basics: mean, median, mode, range and quartiles. You cannot build a house on shaky foundations!
You may be asked for a measure of location. This is just another way of saying average (mean, median or mode). Often you will have to decide which average is the best one to use.
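To see the three averages side by side, here is a minimal sketch using Python's built-in statistics module. The salary figures are made up purely for illustration.

```python
from statistics import mean, median, mode

# Hypothetical weekly salaries for a small factory (made-up numbers).
# The last value stands in for the manager/owner.
salaries = [400, 420, 420, 450, 480, 500, 2000]

avg = mean(salaries)          # add everything up and divide by the count
mid = median(salaries)        # middle value once the data is sorted
most_common = mode(salaries)  # value that appears most often

print("mean:", round(avg, 2))
print("median:", mid)
print("mode:", most_common)
```

Here the mean is pulled well above the median by the one large salary, which is why the median is often the better average when the data contains an extreme value.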
A measure of spread is another way of saying variability: the range, the interquartile range or the standard deviation.
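As a rough sketch of two measures of spread, the snippet below works out the range and the interquartile range for a made-up data set. Note that statistics.quantiles uses one particular method for finding quartiles, so its answers may differ slightly from the method taught in your textbook.

```python
from statistics import quantiles

# Made-up data set, for illustration only.
data = [2, 4, 4, 5, 6, 7, 8, 9, 12]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Quartiles split the sorted data into four parts.
# n=4 returns the three cut points: Q1, Q2 (the median) and Q3.
q1, q2, q3 = quantiles(data, n=4)

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

print("range:", data_range)
print("Q1:", q1, "median:", q2, "Q3:", q3)
print("IQR:", iqr)
```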
I have recommended some videos in the resources section. Once they are approved by Studyclix you will be able to view them. I will also paste some links at the end of this post.
Outliers are extreme values. They lie well below the lower quartile or well above the upper quartile (a common rule of thumb is more than 1.5 times the interquartile range beyond either quartile), so they sit outside most of the data. They can distort the mean. Think of salaries for employees in a small factory. If we include the manager/owner's salary, the average salary will look greater than what most employees actually earn. This data would have a positive skew, and the manager's salary would be an outlier.
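The factory example can be sketched in code. This assumes the common "1.5 times the interquartile range" rule of thumb for flagging outliers, which is one convention among several, and the salaries are made-up numbers.

```python
from statistics import mean, quantiles

# Hypothetical weekly salaries; the 2000 stands in for the manager/owner.
salaries = [400, 420, 420, 450, 480, 500, 2000]

q1, _, q3 = quantiles(salaries, n=4)
iqr = q3 - q1

# Assumed rule of thumb: anything beyond these "fences" counts as an outlier.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in salaries if x < lower_fence or x > upper_fence]
typical = [x for x in salaries if lower_fence <= x <= upper_fence]

mean_with = mean(salaries)    # includes the manager's salary
mean_without = mean(typical)  # outlier left out

print("outliers:", outliers)
print("mean with outlier:", round(mean_with, 2))
print("mean without outlier:", mean_without)
```

Comparing the two means shows how much a single extreme value can drag the average upwards.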
Think of skew as a tail. This will help you to understand skewed distributions.
A tail to the right means the data is positively skewed. Of course if the tail is to the right this means most of the data is to the left.
A tail to the left means the data is negatively skewed. Of course if the tail is to the left this means most of the data is to the right.
If the data is spread evenly on both sides of the centre, with two matching tails, the distribution is symmetrical. The Normal Distribution (the bell-shaped curve) is the best-known example of a symmetrical distribution.
It is important to understand where the mean, mode and median will lie on these types of distributions. The mode is the easiest to place: it is where the shape peaks (the highest point of the curve). The mean is dragged in the direction of the tail. The median will usually come in between the mode and the mean.
For a positive skew: Mode < Median < Mean
For a negative skew: Mode > Median > Mean
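You can check these orderings on small data sets. The sketch below uses made-up numbers: one list with a tail to the right, and its mirror image with a tail to the left. The helper name location_summary is just something invented for this example.

```python
from statistics import mean, median, mode

def location_summary(data):
    """Return (mode, median, mean) for a list of numbers."""
    return mode(data), median(data), mean(data)

# Made-up data with a long tail to the right (positive skew).
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

# Mirror image of the same data: long tail to the left (negative skew).
left_skewed = [15 - x for x in right_skewed]

r_mode, r_median, r_mean = location_summary(right_skewed)
l_mode, l_median, l_mean = location_summary(left_skewed)

print("positive skew:", r_mode, "<", r_median, "<", r_mean)
print("negative skew:", l_mode, ">", l_median, ">", l_mean)
```

These orderings are a rule of thumb that holds for typical exam-style data like this; unusual data sets can break it.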
From looking at distributions you should also have an idea of the spread. The more spread out the data (think of how far the values lie from the mean), the greater the standard deviation.
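A quick way to convince yourself of this is to compare two made-up data sets with the same mean but different spreads. This sketch uses statistics.pstdev, the population standard deviation; your course may use a slightly different formula for samples.

```python
from statistics import mean, pstdev

# Two made-up data sets, both centred on 50.
tight = [48, 49, 50, 51, 52]  # values huddled close to the mean
wide = [30, 40, 50, 60, 70]   # values far from the mean

sd_tight = pstdev(tight)
sd_wide = pstdev(wide)

print("means:", mean(tight), mean(wide))
print("standard deviation (tight):", round(sd_tight, 2))
print("standard deviation (wide):", round(sd_wide, 2))
```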
You may also want to touch on uniform, bimodal and multimodal distributions. Be able to identify them.
This guy's channel is worth subscribing to as he has some nice videos on stats. https://www.youtube.com/user/cylurian/videos