Data Science

Data science is the in-depth study of massive amounts of data. It involves extracting meaningful insights from raw, structured, and unstructured data using the scientific method, different technologies, and algorithms.

Data science is the process of using data to find solutions or to predict outcomes for a problem statement.

In short, we can say that data science is all about:

  • Asking the correct questions and analysing the raw data.
  • Modelling the data using various complex and efficient algorithms.
  • Visualizing the data to get a better perspective.
  • Understanding the data to make better decisions and reach the final result.
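
The four points above can be illustrated with a minimal Python sketch. The data set (hours studied against exam score) and the helper names `fit_line` and `text_chart` are hypothetical, chosen only for illustration, and a least-squares line stands in for the "complex and efficient algorithms" a real project might use.

```python
from statistics import mean

# Hypothetical raw data: (hours studied, exam score) pairs.
records = [(1, 52), (2, 55), (3, 61), (4, 70), (5, 74), (6, 83)]


def fit_line(points):
    """Least-squares fit of y = intercept + slope * x."""
    x_bar = mean(x for x, _ in points)
    y_bar = mean(y for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def text_chart(points, scale=5):
    """One text bar per record, with one '#' per `scale` score points."""
    return [f"{x}h | {'#' * (y // scale)} {y}" for x, y in points]


# 1. Ask the question: do scores rise with study time?
# 2. Model the data with a simple straight-line fit.
slope, intercept = fit_line(records)

# 3. Visualize the data to get a better perspective.
for line in text_chart(records):
    print(line)

# 4. Understand the result and draw a conclusion from it.
if slope > 0:
    conclusion = "more study time goes with higher scores"
else:
    conclusion = "no upward trend in the scores"
print(f"slope = {slope:.2f} points per hour: {conclusion}")
```

In practice each step is usually done with dedicated libraries and far larger data sets, but the order of the steps stays the same.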

Some years ago, there was less data, and it was mostly available in a structured form that could easily be stored in Excel sheets and processed using BI tools.

But in today's world, data is becoming so vast that approximately 2.5 quintillion bytes of data are generated every day, which has led to a data explosion. Research estimated that by 2020, 1.7 MB of data would be created every second by every person on earth. Every company requires data to work, grow, and improve its business.

  • Data science is used to automate transportation, for example by building self-driving cars, which are the future of transportation.
  • Data science can help with different predictions, such as surveys, elections, flight ticket confirmation, etc.


Let us suppose we want to travel from station A to station B by car. We need to make some decisions: which route will get us to the destination fastest, which route will have no traffic jams, and which will be the most cost-effective. All these decision factors act as input data, and we get an appropriate answer from them. This analysis of data is called data analysis, which is a part of data science.
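
The route decision described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Route` class, the `best_route` function, the sample routes, and the equal weighting of minutes and cost are all hypothetical and not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    travel_minutes: float
    has_traffic_jam: bool
    cost: float


def best_route(routes, minute_weight=1.0, cost_weight=1.0):
    """Pick the route with the lowest weighted sum of time and cost.

    Routes without a traffic jam are preferred; if every route has a jam,
    all of them are considered. The weights are assumed values.
    """
    candidates = [r for r in routes if not r.has_traffic_jam] or list(routes)
    return min(
        candidates,
        key=lambda r: minute_weight * r.travel_minutes + cost_weight * r.cost,
    )


# Hypothetical input data for the trip from station A to station B.
routes = [
    Route("highway", travel_minutes=40, has_traffic_jam=True, cost=12.0),
    Route("city road", travel_minutes=55, has_traffic_jam=False, cost=6.0),
    Route("ring road", travel_minutes=50, has_traffic_jam=False, cost=15.0),
]

choice = best_route(routes)
print(choice.name)
```

Here the highway is the fastest route but is excluded because of the traffic jam, so the decision falls to the jam-free route with the lowest combined time and cost.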

Difference between BI and Data Science:

Criterion   | BI                                    | Data science
Data source | Structured data, e.g., data warehouse | Unstructured data, e.g., web logs, customer feedback
Method      | Analytical (historical)               | Scientific (finds the reason behind the data report)
Skills      | Statistics, visualization             | Statistics, visualization, and machine learning
Focus       | Past and present data                 | Present data and future prediction
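
The "Focus" row can be made concrete with a small Python sketch. The sales figures and the function names `bi_summary` and `naive_forecast` are hypothetical, and the forecast is a deliberately naive trend extrapolation used as a stand-in for a real machine learning model.

```python
# Hypothetical monthly sales figures, oldest first.
monthly_sales = [100, 110, 120, 130]


def bi_summary(history):
    """BI-style view: describe what has already happened."""
    return {"total": sum(history), "average": sum(history) / len(history)}


def naive_forecast(history):
    """Data-science-style view: estimate the next value.

    Assumes the average month-over-month change continues unchanged.
    """
    changes = [later - earlier for earlier, later in zip(history, history[1:])]
    return history[-1] + sum(changes) / len(changes)


print(bi_summary(monthly_sales))      # summary of past and present data
print(naive_forecast(monthly_sales))  # estimate for the next month
```

The first function only reports on existing data, while the second uses the same data to say something about a month that has not happened yet.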

Tools for Data Science:

The following are some of the tools required for data science:

  • Data Analysis tools: R, Python, Statistics, SAS, Jupyter, RStudio, MATLAB, Excel, and RapidMiner.
  • Data Warehousing: ETL, SQL, Hadoop, Informatica/Talend, AWS Redshift.
  • Data Visualization tools: R, Jupyter, Tableau, and Cognos.
  • Machine learning tools: Spark, Mahout, Azure ML Studio.

How to solve a problem in Data Science using Machine Learning?

Problems are solved using algorithms

Data Science Life Cycle