Hello friends, this is the second article on my blog. It continues our series of introductory articles and is Part - 2 of that series. Now, it's time to dive into the article 😊
What is Data
In statistics, any information available on an individual (or a group of individuals) is known as data.
For Example :
- If I measure my own weight (an individual), say 62 kg, then it is weight information available on an individual (me) and hence is data.
- If I measure the average weight of the students of a class (a group of individuals), say 52 kg, then it is weight information available on a group of individuals (the students of a class) and hence is also data.
Note –
- Note that an individual may be anything (living or non-living). For example, we can collect data on the performance (in terms of production per day) of a particular machine.
- In the second example, I have used the term "average weight". If you know the meaning of this term, that is fine; otherwise, just leave it for now. I'll explain it in upcoming articles.
What is Variable
In mathematics, you will already have read that variables are quantities whose value is not fixed. In a similar way, in statistics, a variable is any characteristic of an individual or group of individuals that may take at least two values. These values can be measured, counted, observed and represent a category (like Male and Female), or observed without representing a category (like the name of a person).
For Example –
- If we collect data on the weight of a person, say Ram (an individual), over a given time interval (say, a week), then weight is a variable since it takes seven numerical values (at least two values, from Sunday to Saturday). In this case, weight is measured.
- If we collect data on the number of cars owned by rich people (a group of individuals) at a specified time and in a specified geographical area, then number of cars is a variable since it takes at least two numerical values (0, 1, 2 and so on). In this case, number of cars is counted.
- If we collect data regarding the sex/gender of a person, then sex/gender is a variable since it takes at least two values (Male, Female or Transgender). In this case, the value of the variable "sex/gender" is observed and represents one of three different categories.
- In statistics, we always collect data on some variable.
- In other words, a variable contains the same type of data on an individual or group of individuals.
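To make the "at least two values" idea concrete, here is a minimal Python sketch. The daily weights and the helper name `varies` are my own assumptions for illustration, not real measurements. Also note that it only checks the values we happened to observe, whereas the definition above is about the values a characteristic may take.

```python
# Hypothetical daily weights (in kg) of one person over a week.
# These numbers are made up purely for illustration.
weekly_weight = {
    "Sunday": 62.0,
    "Monday": 61.8,
    "Tuesday": 62.1,
    "Wednesday": 62.0,
    "Thursday": 61.9,
    "Friday": 62.3,
    "Saturday": 62.2,
}


def varies(observations):
    """Return True if the observations show at least two different values."""
    return len(set(observations)) >= 2


print(varies(weekly_weight.values()))  # True: the weight changes during the week
print(varies([62.0, 62.0, 62.0]))      # False: only one distinct value observed
```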
Classification of Data and Variable
There are various classifications of data and variables. Some major classifications are as follows -
On the basis of measurement scales, there are four types of data and variables in statistics, which are as follows –
- Nominal Data and Nominal Variable : If the data available on an individual or group of individuals are on a "nominal scale", the data are called nominal data and the variable that contains nominal data is known as a nominal variable.
- Ordinal Data and Ordinal Variable : If the data available on an individual or group of individuals are on an "ordinal scale", the data are called ordinal data and the variable that contains ordinal data is known as an ordinal variable.
- Interval Data and Interval Variable : If the data available on an individual or group of individuals are on an "interval scale", the data are called interval data and the variable that contains interval data is known as an interval variable.
- Ratio Data and Ratio Variable : If the data available on an individual or group of individuals are on a "ratio scale", the data are called ratio data and the variable that contains ratio data is known as a ratio variable.
For Example –
- Nominal Data - Ram, Sita; hence, Name is a Nominal Variable
- Ordinal Data - 1, 2; hence, Rank (in a test) is an Ordinal Variable
- Interval Data - 35 degrees, 37 degrees; hence, Temperature (in Celsius) is an Interval Variable
- Ratio Data - 60, 55; hence, Weight (in kg) is a Ratio Variable
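The difference between the interval and ratio examples above can be checked with a little arithmetic. The following is a minimal Python sketch using the same numbers (35 and 37 degrees Celsius, 60 and 55 kg). The function names are my own, and the conversions are the standard ones (Kelvin = Celsius + 273.15, 1 kg = 1000 g). It suggests that differences survive a change of unit on both scales, while ratios survive only on the ratio scale.

```python
def celsius_to_kelvin(celsius):
    """Standard conversion from degrees Celsius to Kelvin."""
    return celsius + 273.15


def kg_to_grams(kg):
    """Standard conversion from kilograms to grams."""
    return kg * 1000.0


# Interval scale: temperature in Celsius (values from the example above).
t1_c, t2_c = 35.0, 37.0
t1_k, t2_k = celsius_to_kelvin(t1_c), celsius_to_kelvin(t2_c)

# The difference is the same in both units ...
same_difference = abs((t2_c - t1_c) - (t2_k - t1_k)) < 1e-9
# ... but the ratio is not, so "37 is about 1.06 times 35" has no real meaning.
same_temperature_ratio = abs((t2_c / t1_c) - (t2_k / t1_k)) < 1e-9

# Ratio scale: weight in kg (values from the example above).
w1_kg, w2_kg = 60.0, 55.0
w1_g, w2_g = kg_to_grams(w1_kg), kg_to_grams(w2_kg)

# The ratio stays the same whether we use kg or grams.
same_weight_ratio = abs((w1_kg / w2_kg) - (w1_g / w2_g)) < 1e-9

print(same_difference)         # True
print(same_temperature_ratio)  # False
print(same_weight_ratio)       # True
```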
Confused 😇 about how I have identified the measurement scales of the available data? If yes, please read Part - 1 of this series of introductory articles, where I have explained the concept of measurement scales in detail. Otherwise, continue reading this article 😊.
- Quantitative Data and Quantitative Variable : Data that are only "measured" or "counted" are known as Quantitative Data. Further, a variable that contains quantitative data is known as a Quantitative Variable.
In the above table, the data available in columns 2 and 3 are Quantitative Data. Why?
Because the weight of a person is measured with an instrument and the number of mobiles he/she uses is counted. Hence, these data (60.2 kg, 55.1 kg, 2, 3) are Quantitative Data and the variables (Weight and No. of Mobiles) are Quantitative Variables.
- Qualitative Data and Qualitative Variable : Data that are only "observed" (they may or may not represent a category), and not measured or counted, are known as Qualitative Data. Further, a variable that contains qualitative data is called a Qualitative Variable.
The names and genders in the table are Qualitative Data because, while collecting the data, you simply go to a person and ask his/her name and gender. We cannot measure or count the name and gender of a person, so these data can only be observed. That's why the data (Ram, Sita, Male, Female, Gita, Suraj) are Qualitative Data and the variables (Name, Gender) are Qualitative Variables.
- Numerical Data and Numerical Variable : Data that are only "measured" or "counted" are known as Numerical Data. Further, a variable that contains numerical data is known as a Numerical Variable. (Note that the definition is the same as for Quantitative Data and Quantitative Variable.)
The weights and numbers of mobiles in the table are Numerical Data because the weight of a person is measured with an instrument and the number of mobiles he/she uses is counted. Hence, these data (60.2 kg, 55.1 kg, 2, 3) are Numerical Data and the variables (Weight and No. of Mobiles) are Numerical Variables.
- Categorical Data and Categorical Variable : Data that are only "observed" and represent a "category" are known as Categorical Data. Further, a variable that contains categorical data is known as a Categorical Variable. (Note that the definition is almost the same as for Qualitative Data and Qualitative Variable. The only difference is that here the data must represent a "category".)
The genders in the table are Categorical Data because, while collecting the data, you simply go to a person and ask his/her gender. We cannot measure or count the gender of a person, so these data can only be observed. Moreover, the data (Male and Female) represent two different categories. That's why the data (Male and Female) are Categorical Data and the variable (Gender) is a Categorical Variable.
Note that the data available in column - 1 of the above table do not represent any category. That's why these data (Ram, Sita, Gita, Suraj) are not categorical data, and consequently the variable "Name" is not a categorical variable; it is qualitative, as explained in the definition of Qualitative Data and Qualitative Variable.
- Discrete Data and Discrete Variable : Data that are "observed" (they may or may not represent a category) or "counted" are called Discrete Data, and a variable that contains such data is known as a Discrete Variable.
- Continuous Data and Continuous Variable : Data that are only "measured" are known as Continuous Data, and a variable that contains such data is called a Continuous Variable.
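The definitions above can be summarised as a small set of rules. Below is a minimal Python sketch of those rules as stated in this article. The function name `classify` and its two parameters are my own assumptions for illustration, not a standard library API.

```python
def classify(how_obtained, is_category=False):
    """Return the labels that apply to a variable, following this article's definitions.

    how_obtained: "measured", "counted" or "observed"
    is_category:  True if observed values represent a category (e.g. Male/Female)
    """
    if how_obtained not in ("measured", "counted", "observed"):
        raise ValueError("how_obtained must be 'measured', 'counted' or 'observed'")

    labels = set()

    # Quantitative and Numerical share the same definition: measured or counted.
    if how_obtained in ("measured", "counted"):
        labels.update({"quantitative", "numerical"})
    else:
        # Observed data are qualitative; categorical only if they form categories.
        labels.add("qualitative")
        if is_category:
            labels.add("categorical")

    # Continuous: measured only. Discrete: counted or observed.
    if how_obtained == "measured":
        labels.add("continuous")
    else:
        labels.add("discrete")

    return labels


print(sorted(classify("measured")))                    # Weight
print(sorted(classify("counted")))                     # No. of Mobiles
print(sorted(classify("observed", is_category=True)))  # Gender
print(sorted(classify("observed")))                    # Name
```

With these rules, Weight comes out as quantitative, numerical and continuous, while Name comes out as qualitative and discrete but not categorical, which matches the explanations above.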
Diagram Connecting all types of Data and Variable
- Nominal Variable - Mobile Number, Name, Gender, Colour, Species, Direction
- Ordinal Variable - Rank in a test, Education level, Place in Race, Place in Election, Social Status, Ratings of Perception
- Interval Variable - Temperature (in Celsius or Fahrenheit), IQ score, Test score
- Ratio Variable - Age (in years), Education years, Family size, Weight, Height, Temperature (in Kelvin), Mass, Length, Income (in rupees)
Similarly for Discrete, Continuous, Numerical, Categorical, Qualitative and Quantitative variables.
If you find any mistake or have any suggestions, just let me know using Suggestion Form given below (for mobile users) or in sidebar (for laptop users). Thank you in Advance ! 😊