## Definition of Statistics – Data Collection & Sources of Data – Variable and Its Types

What is Statistics?

“The study of the collection, presentation and analysis of data, and of drawing conclusions about a parameter on the basis of statistical inference, is called Statistics”. The following block diagram defines statistics:

The four terms shown in bold in the block diagram above are defined below:

• Population: “Any well-defined group of individuals/items/objects whose characteristics are to be studied is called Population”. For example, students of a college or books in a library.
• Parameter: “Any quantity which defines a characteristic of the whole population is called a Parameter”. The population mean (average) is a common example, but a parameter can be any such quantity, for instance a population proportion.
• Sample: “A part of a population is called a Sample”. A sample is a subset of the population (the universal set). In statistics, a sample is drawn so that the whole of a large population does not have to be studied.
• Statistic: “A characteristic of a sample is called a Statistic”. The sample mean is a common example. A statistic (singular) is a single quantity computed from a sample; the plural, statistics, also names the subject itself.
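
The four terms above can be illustrated with a short Python sketch. The data is made up (marks of 1,000 hypothetical students), and the mean is used only as one convenient choice of characteristic; it is an illustration, not a prescribed procedure.

```python
import random
import statistics

# Hypothetical population: marks of 1,000 students (made-up data).
rng = random.Random(42)
population = [rng.randint(40, 100) for _ in range(1000)]

# Parameter: a quantity computed from the whole population.
population_mean = statistics.mean(population)

# Sample: a subset of the population, drawn at random without replacement.
sample = rng.sample(population, 50)

# Statistic: the same quantity computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two printed values are usually close but not equal, which is why a statistic is treated as an estimate of the parameter rather than as the parameter itself.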

Types of Statistics:

There are two types of statistics: descriptive statistics and inferential statistics. In the definition of statistics, the collection, presentation and analysis of data belong to descriptive statistics, whereas drawing conclusions about a parameter on the basis of statistical inference belongs to inferential statistics. Both are defined below:

Descriptive Statistics: It is the type of statistics that deals with organizing and summarizing data.

Inferential Statistics: It is the type of statistics that deals with using the data you have collected to form conclusions about the population.
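
The difference between the two types can be sketched in Python as follows. The sample values are made up, and the interval uses the normal value 1.96 as a rough approximation; both are assumptions chosen for illustration only.

```python
import math
import statistics

# Hypothetical sample: weights (kg) of 10 students, made up for illustration.
sample = [52.0, 61.5, 58.2, 70.1, 49.8, 66.3, 55.0, 63.7, 59.4, 68.0]

# Descriptive statistics: organize and summarize the data we have.
n = len(sample)
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

# Inferential statistics: use the sample to say something about the population.
# Rough 95% interval for the population mean, using the normal value 1.96
# (an approximation; a t-based interval would be wider for n = 10).
margin = 1.96 * stdev / math.sqrt(n)
interval = (mean - margin, mean + margin)

print(f"descriptive: n={n}, mean={mean:.2f}, median={median:.2f}, sd={stdev:.2f}")
print(f"inferential: population mean is roughly in "
      f"[{interval[0]:.2f}, {interval[1]:.2f}]")
```

The first print only describes the ten observed values; the second makes a claim about students who were never measured, which is the step that needs statistical inference.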

Data Collection & Sources of Data

Data is necessary for statistical analysis, and it is collected from two kinds of sources: internal sources and external sources. Both are described below:

1. Internal Sources:

If the desired information is already available within the organization, internal sources are used for collecting data. Internal sources of data are the internal reports of an organization. For instance, a factory publishes an annual report on total production, total profit and loss, total sales, loans, wages paid to employees, bonuses and other facilities for employees.

2. External Sources: If the desired information is not available internally, it is obtained from external sources, which are of two kinds: primary sources and secondary sources. Both are described below:

Primary Sources: Primary sources of data are sample surveys (information is collected from a respondent; the tool is a questionnaire) and experimentation, carried out either in the field (as in agriculture) or in laboratories (research sites/industries). Data collected from primary sources is primary data, which is defined as: “Data collected by the investigator himself/herself for a specific purpose or from a primary source”. It is also called First Hand Data.

Secondary Sources: Secondary sources of data are official sources (government sources), e.g. ministries and departments; private sources (non-government sources), e.g. magazines and newspapers; semi-private sources; and unofficial sources. “Data already collected or taken from a secondary source is called Secondary Data”. It is also called Second Hand Data. Secondary data may be used or unused, published or unpublished.

Variable & Types of Variable

Variable: “A characteristic of individuals of a population or of a sample which varies from individual to individual is called a Variable”. There are two types of variables which are mentioned below:

• Categorical or Qualitative Variable: “A variable that cannot be specified in numbers, and for which categories must therefore be made, is called a Categorical Variable”. For example: the title of a book and the blood group of a student.
• Quantitative or Numerical Variable: “A variable that can be specified in numbers is called a Quantitative Variable”. For instance: the number of pages in a book and the weight of a student. A quantitative variable has two subtypes, which are defined below:
1. Discrete Variable: “A characteristic which is countable and can take on discrete values is called a Discrete Variable”. Within the range 5 to 10, for example, a discrete variable can take only separate values such as 6, 8 and 10, and nothing in between. The number of siblings in a family, the number of tickets sold at a cinema for a particular movie in a day, and the birth year of a student are some examples of discrete variables.
2. Continuous Variable: “A characteristic which is measurable and can assume all possible values within a given range of values is called a Continuous Variable”. Within the range 5 to 10, for example, it may take values such as 5.6, 7.0 or 8.9. Some examples of continuous variables are weight, pressure and mass.
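
The classification above can be sketched in Python. The helper `classify` and the student record are hypothetical, and guessing the type from the Python value is a simplification: numeric codes such as roll numbers are stored as integers but are really categorical.

```python
def classify(value):
    """Guess the variable type from a single Python value (a simplification)."""
    if isinstance(value, (bool, str)):
        return "categorical"
    if isinstance(value, int):
        return "quantitative (discrete)"
    if isinstance(value, float):
        return "quantitative (continuous)"
    raise TypeError(f"unsupported value: {value!r}")


# Hypothetical record for one student; the values are made up.
student = {
    "blood_group": "B+",
    "number_of_siblings": 3,
    "birth_year": 2004,
    "weight_kg": 58.6,
}

for name, value in student.items():
    print(f"{name}: {classify(value)}")
```

In practice the type is decided by what the variable means, not by how it is stored, so a rule like this is only a starting point.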