Posts

Who deals with Data?

 Data Architect/Big Data Architect: They create databases from scratch and design the way data can be retrieved, processed and consumed. Data Architects and Data Engineers are crucial in the data science market. Data Engineer/Big Data Engineer: They preprocess the data obtained from the Architects and ensure that it is clean, organized and ready for the analysts to take over. Database Administrator: They handle the flow of data. Database administrators mainly deal with traditional data, since big data administration is largely automated. Business Intelligence (BI) Analyst: They perform analyses and create reports and dashboards from historical data. BI Consultant: Many firms outsource their data science department because they don't need one or don't want to maintain one. BI consultants are also known as external BI analysts; they hop on and off different projects. BI Developer: This is the most commonly encountered job position. BI developers handle more advanced progr...

What do you do to the raw data?

 Data pre-processing: Raw data is untouched data that needs to be converted into a form that is more understandable and useful for further processing. The group of processes that do this is called pre-processing. Class labelling the observations: Arranging data by category, or labelling data points with the correct data type. Example: for traditional data this can be numerical or categorical, whereas for big data it can be text, digital image or digital audio. Data cleansing/Data scrubbing: Dealing with inconsistent data (misspelled categories and missing values). Data balancing: Applying balancing methods when categories contain unequal numbers of observations. Data shuffling: Rearranging data points to eliminate unwanted patterns (such as those that emerge from sampling) and improve predictive performance. Data masking: Concealing the original data (personal information) with random and false data.
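To make these steps concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the field names (`name`, `category`, `age`), the spelling fixes, the default age of 39, and the choice of undersampling as the balancing method are not from any particular tool or dataset.

```python
import random
from collections import defaultdict


def cleanse(rows, fixes, default_age):
    """Data cleansing: normalise misspelled categories, fill missing ages."""
    cleaned = []
    for row in rows:
        row = dict(row)  # copy so the raw data stays untouched
        category = row["category"].strip().lower()
        row["category"] = fixes.get(category, category)
        if row["age"] is None:
            row["age"] = default_age  # hypothetical fill-in value
        cleaned.append(row)
    return cleaned


def balance(rows, key, rng):
    """Data balancing: undersample each category to the smallest one's size."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    smallest = min(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced


def shuffle(rows, rng):
    """Data shuffling: return the rows in a random order."""
    rows = list(rows)
    rng.shuffle(rows)
    return rows


def mask(rows, field, rng):
    """Data masking: replace a personal field with a random, false value."""
    masked = []
    for row in rows:
        row = dict(row)
        row[field] = f"person_{rng.randrange(10**6):06d}"
        masked.append(row)
    return masked


# Made-up raw data with a misspelling, mixed casing and a missing value.
raw = [
    {"name": "Asha", "category": "Retail ", "age": 34},
    {"name": "Ben", "category": "retial", "age": None},
    {"name": "Chen", "category": "wholesale", "age": 51},
    {"name": "Dina", "category": "retail", "age": 29},
    {"name": "Eli", "category": "Wholesale", "age": 45},
    {"name": "Fay", "category": "retail", "age": 38},
]

rng = random.Random(0)  # fixed seed so the run is repeatable
prepared = cleanse(raw, fixes={"retial": "retail"}, default_age=39)
prepared = balance(prepared, key="category", rng=rng)
prepared = shuffle(prepared, rng)
prepared = mask(prepared, field="name", rng=rng)
```

Undersampling discards rows from the larger category; oversampling the smaller one is an equally valid balancing method when data is scarce.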

Introduction to Data Science

DATA SCIENCE EVOLUTION 25 years ago, Data Science was all about gathering and cleaning datasets and then applying statistical methods (conventional methods like regression, factor analysis, cluster analysis, time series forecasting and so on). By around 2018 it had grown enormously, as the field came to encompass Data Analysis, Predictive Analytics, Data Mining, Business Intelligence, Machine Learning, Deep Learning and so on. INTRODUCTION Data Science is the study of data to extract meaningful insights that improve the performance of business firms and drive the strategic decision making of organizations. In other words, past data is collected, preprocessed and analysed, and patterns are extracted to predict future outcomes. There is no single father of Data Science, but many have contributed to the domain. Now let us discuss the base of Data Science, i.e. DATA. DATA Data can be of any form such as numbers, letters, words, images, audio, video, symbols, graphs an...