

In this article, I lay out some necessary foundations in statistics required to understand the upcoming topics performed in Minitab. The foundational topics in statistics required for data science can be divided into two parts: descriptive statistics and inferential statistics.

Generally, data is considered from two different perspectives. The first perspective is sample data, and the other is population data. A quantity computed from sample data is called a statistic, and the same quantity computed from population data is called a parameter. For instance, the average (sum of elements / number of elements) computed on sample data is the sample mean, often just called the “average,” and the same calculation performed on population data is the population mean.

In descriptive statistics, our focus is on understanding or describing the data collected. In inferential statistics, our focus is on drawing conclusions from the data by applying a few theories to it. Inferential statistics helps us infer population parameters from the data. Now, let’s start with descriptive statistics, then carry that understanding towards inferential statistics.

The first topic we should know under descriptive statistics is data visualisation charts. Plotting the given data points/observations in different chart types helps us find properties of the data such as central tendency, shape, and dispersion/spread. Broadly, there are 5 types of data visualizations in which the given data points are plotted:

We plot our observations/data points in a box plot to find outliers in our data collection. (Outlier: an observation that lies at an abnormal distance from other values.) A boxplot is constructed based on the percentile concept.

From the image above, each screen (labeled alphabetically) signifies:

A) We have pasted 1000 observations to perform boxplot analysis.
B) We can find the option for “Boxplot” under the “Graphs” section in the tab bar.
C) After choosing Boxplot, we have to specify which category of plot we need for our analysis. In our case, because we have only one variable, the choice should be “simple.”
D) This box requires you to input the column to be plotted; you can also add further analyses available under Boxplot.
E) This is the boxplot that the tool has plotted based on the given data. The data points are distributed according to quartiles, and observations beyond the whiskers are considered outliers.
F) For the sake of example, I have added data points to the column that are far from most of the other data points. In the resulting boxplot, we can see how the box has shrunk and how these observations lie far beyond the quartile values.

We use a line chart to understand data collected in time order. Time series analysis is a different beast altogether, where the line chart/run chart is used extensively. A line chart is used to understand the trend of the data.

Minitab - labeled screens of Scatter plot

A) I took 100 random observations under X and Y variables to plot a scatter plot, which would help me find the relationship between these variables. To perform this, I chose the Scatterplot option under the “Graphs” tab.
B) Since my data has two variables, I chose a simple chart.
C) I selected the respective columns where I stored my variable observations. There are also other options that I can opt for, like naming the legends, deciding the scales, and others.
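
To make the sample versus population distinction concrete, here is a minimal Python sketch. It is not part of the Minitab workflow; the population of 10,000 normally distributed values and the sample size of 100 are assumptions chosen purely for illustration. The same "sum of elements / number of elements" calculation is applied to both data sets.

```python
import random

# Hypothetical population (an assumption for this example):
# 10,000 values drawn from a normal distribution with mean 50 and SD 10.
random.seed(42)
population = [random.gauss(50, 10) for _ in range(10_000)]

# A sample is a subset of the population; here, 100 values picked at random.
sample = random.sample(population, 100)


def mean(values):
    """Sum of elements divided by the number of elements."""
    return sum(values) / len(values)


population_mean = mean(population)  # computed on the population: a parameter
sample_mean = mean(sample)          # computed on the sample: a statistic

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The two numbers are usually close but not identical, which is why the sample value is treated as an estimate of the population value.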
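
The idea that a box plot flags observations beyond the whiskers as outliers can also be sketched in code. The example below assumes the common convention that the whiskers extend at most 1.5 times the interquartile range (IQR) beyond the first and third quartiles; the function name `find_outliers`, the multiplier `k`, and the small data set are hypothetical, and the quartile method used by Python's `statistics.quantiles` may not match every tool's definition exactly.

```python
import statistics


def find_outliers(values, k=1.5):
    """Return the values lying beyond the k * IQR fences (assumed whisker rule)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)  # the three quartiles
    iqr = q3 - q1                                        # interquartile range
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical observations with no extreme values.
data = [12, 13, 14, 15, 15, 16, 17, 18, 19, 20]
print(find_outliers(data))

# Adding two far-away points, as in step F of the boxplot walkthrough.
print(find_outliers(data + [95, 110]))
```

The first call should report nothing unusual, while the second should pick out the two added points, mirroring what the boxplot shows visually.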
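
A scatter plot shows the relationship between two variables visually. As a numeric companion, the sketch below computes the Pearson correlation coefficient, which is not something the article itself does in Minitab; it is added here only as an illustration. The 100 X and Y observations are generated under the assumption that Y is roughly twice X plus random noise, and `pearson_r` is a hypothetical helper name.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: 100 observations where Y depends linearly on X plus noise.
random.seed(0)
x = [random.uniform(0, 100) for _ in range(100)]
y = [2 * xi + random.gauss(0, 10) for xi in x]

print(f"correlation between X and Y: {pearson_r(x, y):.3f}")
```

A value near +1 or -1 corresponds to points that fall close to a straight line on the scatter plot, while a value near 0 corresponds to a cloud with no clear linear pattern.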
