Introduction to Data Science

DATA SCIENCE

EVOLUTION

About 25 years ago, Data Science was mostly about gathering and cleaning datasets and then applying conventional statistical methods such as regression, factor analysis, cluster analysis and time series forecasting.
By around 2018 the field had grown enormously, and it now encompasses data analysis, predictive analytics, data mining, business intelligence, machine learning, deep learning and more.

INTRODUCTION

Data Science is the study of data to extract meaningful insights that improve the performance of business firms and drive strategic decision making in organizations. In other words, past data is collected, preprocessed and analysed, and the patterns extracted from it are used to predict future outcomes.
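To make the collect, preprocess, analyse and predict steps concrete, here is a minimal sketch in Python. The sales figures are made up for illustration, and the function names (clean, fit_line, predict) are hypothetical. A straight-line least-squares fit is only one of many possible ways to extract a pattern.

```python
def clean(rows):
    """Preprocess: drop rows whose value is missing (None)."""
    return [(x, y) for x, y in rows if y is not None]


def fit_line(rows):
    """Analyse: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Predict: apply the fitted line to a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Collect: made-up (month, units sold) pairs; None marks a missing reading.
collected = [(1, 10), (2, 12), (3, None), (4, 16), (5, 18)]

model = fit_line(clean(collected))
print(predict(model, 6))  # forecast for month 6
```

Real projects would normally use a library for each step, but the flow stays the same: raw data in, cleaned data, a fitted pattern, and then a prediction.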
No single person is regarded as the father of Data Science; many people have contributed to the field.
Now let us discuss the foundation of Data Science, i.e. DATA.
 

DATA

Data can take any form: numbers, letters, words, images, audio, video, symbols, graphs and so on. It might also be a person's height, weight, age or gender. So, data is the raw form of knowledge. The two types of data discussed here are
1) Traditional Data
2) Big Data

Traditional Data (Static)

The picture that comes to mind when we think about data usually looks like this:

ID NO   NAME    AGE   GENDER
1       Alan    25    Male
2       Bency   24    Female
3       Crum    26    Male


The table above is a typical example of traditional data.
It is structured and stored in databases that can be managed from a single computer. It comes in table format, with numeric or text values as in the table above. A few examples are customer information, inventory records and student mark lists.
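Because every row has the same fixed fields, this kind of data is easy to query. The sketch below holds the three rows from the table as a list of Python dictionaries. The helper names (names_by_gender, average_age) are hypothetical, and a real system would more likely keep the rows in a database table.

```python
from statistics import mean

# Rows copied from the table above; every row has the same four fields.
records = [
    {"id": 1, "name": "Alan", "age": 25, "gender": "Male"},
    {"id": 2, "name": "Bency", "age": 24, "gender": "Female"},
    {"id": 3, "name": "Crum", "age": 26, "gender": "Male"},
]


def names_by_gender(rows, gender):
    """Return the names of all rows whose gender field matches."""
    return [row["name"] for row in rows if row["gender"] == gender]


def average_age(rows):
    """Return the mean of the age field across all rows."""
    return mean(row["age"] for row in rows)


print(names_by_gender(records, "Male"))
print(average_age(records))
```

The fixed structure is what makes these one-line queries possible: each question maps directly onto a column.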

Big Data (Dynamic)

Big Data is an extremely large amount of data that is impossible to manage from a single computer. It can be structured, unstructured or semi-structured.
The characteristics of Big Data are described as 3 V's, 5 V's, 7 V's or even 11 V's.
The main 5 V's of Big Data are
  1. Volume - the vast amount of data that is generated and collected
  2. Velocity - the speed at which data is generated and processed
  3. Variety - the many different types and formats of data
  4. Veracity - the accuracy and trustworthiness of the data
  5. Value - the useful outcomes organizations can derive from the data
Big Data can be of any type, such as photos, video or audio. It requires storage space on the scale of petabytes or exabytes.
Online platforms like Google, Facebook and Twitter generate huge volumes of data every second.
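To get a feel for what petabytes and exabytes mean, the short sketch below converts them into a count of ordinary disk drives. It assumes decimal (SI) units, where 1 PB is 10^15 bytes and 1 EB is 10^18 bytes, and a hypothetical 1 TB drive; the function name drives_needed is made up for this example.

```python
# Decimal (SI) storage units, expressed in bytes.
UNITS = {
    "KB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
}


def drives_needed(amount, unit, drive_size_tb=1):
    """Return how many drives of drive_size_tb terabytes hold the given amount."""
    total_bytes = amount * UNITS[unit]
    drive_bytes = drive_size_tb * UNITS["TB"]
    return -(-total_bytes // drive_bytes)  # ceiling division on integers


print(drives_needed(1, "PB"))  # drives for one petabyte
print(drives_needed(1, "EB"))  # drives for one exabyte
```

Under these assumptions one petabyte fills a thousand such drives and one exabyte fills a million, which is why this data cannot be managed from one computer.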


