Scatter Plot

What is it?

  • A scatter plot is a two-dimensional chart that uses points, or “dots”, to represent individual data values.
  • Most scatter plots display two variables, one on the X axis and one on the Y axis.
  • This type of chart is best used to show the relationship between the X values and the Y values.
  • Rather than presenting X and Y as separate values, a scatter plot shows how the two variables vary together, which makes any correlation between them visible.
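
The correlation that a scatter plot shows visually can also be expressed as a number. The sketch below is one way to do that, using the Pearson correlation coefficient, which ranges from -1 (perfect negative linear relationship) to 1 (perfect positive linear relationship). The function names, the file name and the sample data (hours studied versus test score) are made up for illustration, and the plotting helper assumes that matplotlib is installed.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


def draw_scatter(xs, ys, path="scatter.png"):
    """Save a basic scatter plot to a file (assumes matplotlib is installed)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
    fig.savefig(path)
    plt.close(fig)


# Hypothetical sample data: hours studied (X) and test score (Y).
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]
r = pearson_r(hours, scores)  # close to 1: a strong positive relationship
```

Calling draw_scatter(hours, scores) would write the matching chart to scatter.png. Note that the coefficient only measures straight-line relationships, so a clearly curved pattern can still produce a value near 0.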

Usages:

  • Scatter plots are useful for seeing the correlation between two variables, such as how Y changes as X increases.
  • Although scatter plots have been referred to as “disconnected line graphs”, the points do not have to follow a straight line.
  • The example below is a scatter plot of average daily high temperatures by month, in which the points follow a curve rather than a straight line.
  • In some instances, the points on a scatter plot may appear randomly scattered, indicating little to no correlation.
figure 2
https://chartio.com/learn/dashboards-and-charts/what-is-a-scatter-plot/ 
  • In addition to X and Y, the color, shape and size of the points on a scatter plot can encode further variables.
  • The example below depicts a scatter plot of the height and weight of children by gender, with height and weight being the X and Y variables.
  • The example also uses color as a variable to distinguish between male and female children.
https://chartio.com/learn/dashboards-and-charts/what-is-a-scatter-plot/
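
Using color as a third variable usually comes down to splitting the points into groups and drawing each group in its own color. The following is a minimal sketch of that idea. The records (height in cm, weight in kg, gender) are invented for illustration and are not the data behind the chart above, and the helper names and color choices are assumptions. The drawing function assumes matplotlib is installed.

```python
from collections import defaultdict


def group_by_category(records):
    """Split (x, y, category) records into {category: (xs, ys)}."""
    groups = defaultdict(lambda: ([], []))
    for x, y, category in records:
        xs, ys = groups[category]
        xs.append(x)
        ys.append(y)
    return dict(groups)


def draw_by_category(records, colors, path="by_category.png"):
    """Draw one colored series per category (assumes matplotlib is installed)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for category, (xs, ys) in group_by_category(records).items():
        # Each call adds one group of points with its own color and legend entry.
        ax.scatter(xs, ys, color=colors[category], label=category)
    ax.set_xlabel("Height (cm)")
    ax.set_ylabel("Weight (kg)")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


# Hypothetical records: (height in cm, weight in kg, gender).
children = [
    (110, 19, "female"),
    (115, 21, "male"),
    (122, 24, "female"),
    (128, 27, "male"),
]
groups = group_by_category(children)
```

Calling draw_by_category(children, {"female": "tab:orange", "male": "tab:blue"}) would save the chart with a legend that maps each color to a group. Shape and size can be varied in the same way through the marker and s arguments of scatter.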

Advantages:

  • The visual properties of the points, such as color, shape and size, allow data to be categorized within the chart.
  • Scatter plots are effective at giving an overview of the correlation between the X and Y variables.
  • Scatter plots make "outliers" easy to find, because a point that lies far from the main clusters stands out immediately.
  • The example below demonstrates how scatter plots can be helpful for quickly finding deviations in data.
http://faculty.virginia.edu/ASTR3130/lablinks/GuidePlots.html
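
Outliers that stand out to the eye can also be flagged programmatically. The sketch below uses a simple rule chosen for illustration, not a method taken from the source above: measure each point's distance from the median point, and flag any point whose distance is more than k times the typical (median) distance. The function name, the threshold k=3 and the sample points are all assumptions, and the rule only makes sense when X and Y are on comparable scales.

```python
import math
import statistics


def flag_outliers(points, k=3.0):
    """Return the points lying unusually far from the median point.

    A point is flagged when its distance from the (median x, median y)
    center exceeds k times the median of all such distances.
    """
    center_x = statistics.median(x for x, _ in points)
    center_y = statistics.median(y for _, y in points)
    distances = [math.hypot(x - center_x, y - center_y) for x, y in points]
    typical = statistics.median(distances)
    if typical == 0:
        # At least half the points sit exactly on the center; the rule
        # cannot set a meaningful threshold, so flag nothing.
        return []
    return [p for p, d in zip(points, distances) if d > k * typical]


# Hypothetical data: a tight cluster plus one point far away from it.
points = [(1, 1), (2, 2), (3, 3), (2, 1), (1, 2), (3, 2), (2, 3), (20, 20)]
outliers = flag_outliers(points)
```

With this data only the point at (20, 20) is flagged. Medians are used instead of means so that the outlier itself does not drag the center and the threshold toward it.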

Disadvantages:

  • If a scatter plot contains too many points, it runs into "overplotting" problems.
  • Overplotting occurs when points are drawn on top of one another, making the chart difficult to read and interpret.
  • However, as evidenced by the example below, transparency can be used as a solution to overplotting.
https://www.infragistics.com/community/blogs/b/tim_brock/posts/jitter-another-solution-to-overplotting
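
Transparency helps because overlapping semi-transparent points add up: a spot covered by many points is drawn darker than a spot covered by one, so dense regions become visible again. Under standard alpha blending, n same-colored points stacked at opacity alpha have a combined opacity of 1 - (1 - alpha)^n. The sketch below computes that value and shows how transparency could be applied with matplotlib's alpha argument. The function names and the choice of alpha=0.1 are assumptions for illustration, and the drawing helper assumes matplotlib is installed.

```python
def stacked_opacity(alpha, n):
    """Combined opacity of n same-colored points stacked at opacity alpha."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    if n < 0:
        raise ValueError("n must not be negative")
    # Each extra layer lets through a fraction (1 - alpha) of what is beneath it.
    return 1 - (1 - alpha) ** n


def draw_transparent(xs, ys, alpha=0.1, path="transparent.png"):
    """Save a scatter plot with semi-transparent points (assumes matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys, alpha=alpha)
    fig.savefig(path)
    plt.close(fig)


single = stacked_opacity(0.1, 1)   # one point alone: faint
dense = stacked_opacity(0.1, 20)   # twenty overlapping points: roughly 0.88
```

A lower alpha suits very large data sets, since it takes more overlapping points before a region saturates to fully opaque.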