Data Science using Python


Data Science using Python Course Details

Data science is an interdisciplinary field that uses scientific processes and algorithms to extract knowledge and insights from data, which may be structured or unstructured.


Python has recently attracted a lot of interest as a language of choice for data analysis and data science. It is a free, open-source, general-purpose programming language that is easy to learn. Because of its versatility, Python is well suited to implementing the steps involved in data science processes. It is used for web development, data analysis, artificial intelligence, and scientific computing.


The three most important Python libraries for data science are NumPy, Pandas, and Matplotlib. NumPy and Pandas are used for analyzing and exploring data. Matplotlib is a data visualization library used for drawing various types of graphs that depict the analysis.

With the growth of the IT industry, there is a booming demand for skilled data scientists, and Python has evolved into the most preferred programming language for this work. The course will focus on fundamental Python programming techniques, reading and manipulating “.csv” files, and various libraries for data science.

After completing the module, the student will be able to:

  1. Take tabular data and clean it
  2. Manipulate the data
  3. Run basic inferential statistical analyses
  4. Perform data analysis
  5. Visualize the results of the analysis
  6. Build a front-end GUI


120 Hours – (Theory: 48 hrs + Practical: 72 hrs)

Outline of Module:

| Module Unit | Duration (Theory) in Hours | Duration (Practical) in Hours | Written Marks (Max.) |
|---|---|---|---|
| 1. Python Language, Structures, Programming Constructs | 6 | 9 | 14 |
| 2. Data Science and Analytics Concepts | 2 | 3 | 6 |
| 3. Introduction to NumPy library | 8 | 12 | 20 |
| 4. Data Analysis Tool: Pandas | 14 | 21 | 24 |
| 5. Statistical Concepts and Functions | 6 | 9 | 10 |
| 6. Matplotlib | 6 | 9 | 10 |
| 7. GUI – Tkinter | 4 | 6 | 12 |
| 8. Machine Learning: The Next Step | 2 | 3 | 4 |
| Total | 48 | 72 | 100 |

Detailed Syllabus:

  1. Python Language, Structures, Programming Constructs

Review of Python Language, Data Types, Variables, Assignments, Immutable Variables, Strings, String Methods, Functions and Printing, Lists and their operations, Tuples and Dictionaries programs, slicing of strings, lists and tuples.
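As an illustration of the constructs listed in this unit, here is a minimal sketch. The variable names and sample values are made up for the example and are not part of the official course material.

```python
# Strings and tuples are immutable; lists and dictionaries are mutable.
course = "Data Science using Python"

# Slicing uses [start:stop:step]; the stop index is exclusive.
first_word = course[:4]            # "Data"
last_word = course[-6:]            # "Python"
reversed_word = first_word[::-1]   # "ataD"

# String methods return a new string and leave the original unchanged.
shouted = course.upper()

# Lists support in-place operations such as append and sort.
marks = [14, 6, 20, 24]
marks.append(10)
marks.sort()
top_two = marks[-2:]               # the two largest values

# Tuples are indexed and sliced like lists but cannot be modified.
hours = (48, 72)
theory_hours = hours[0]

# Dictionaries map keys to values.
unit_marks = {"NumPy": 20, "Pandas": 24}
unit_marks["Matplotlib"] = 10


def describe(unit, marks_by_unit):
    """Return a printable line for one unit."""
    return f"{unit}: {marks_by_unit[unit]} marks"


print(describe("Pandas", unit_marks))
```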


  2. Data Science and Analytics Concepts

What is data science and analytics? The data science process: framing the problem; collecting, processing, cleaning and munging data; exploratory data analysis; visualizing results.
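The collect, clean and explore steps of this process can be sketched roughly as follows. This is only an illustration using Pandas: the inline CSV text, the column names and the choice of filling missing ages with the median are all assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical raw data; in practice this would be read from a ".csv" file.
raw_csv = io.StringIO(
    "name,age,city\n"
    "Asha,29,Delhi\n"
    "Ravi,,Mumbai\n"
    "Asha,29,Delhi\n"
    " Meena ,41,delhi\n"
)

# Collect: load the tabular data.
df = pd.read_csv(raw_csv)

# Clean and munge: drop exact duplicate rows, tidy the text columns,
# and fill the missing age with the median of the known ages.
df = df.drop_duplicates()
df["name"] = df["name"].str.strip()
df["city"] = df["city"].str.strip().str.title()
df["age"] = df["age"].fillna(df["age"].median())

# Explore: a first summary of the cleaned data.
summary = df.groupby("city")["age"].mean()
print(summary)
```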


  3. Introduction to NumPy library

NumPy: Array processing package, Array type, Array Slicing, Computation on NumPy Arrays – Universal Functions, Aggregations: min, max, etc., N-Dimensional Arrays, Broadcasting, Fancy indexing, sorting arrays, loading data into NumPy from various formats.
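A short sketch of the NumPy topics named above may help to place them. The array contents are arbitrary, and the comma-separated text passed to `np.loadtxt` stands in for a real data file.

```python
import io

import numpy as np

# A 2-D array with 3 rows and 4 columns holding the values 0..11.
a = np.arange(12).reshape(3, 4)

# Slicing: every row, second column.
col = a[:, 1]

# Universal function (ufunc): applied element by element.
squares = np.square(a)

# Aggregations along an axis.
col_min = a.min(axis=0)   # minimum of each column
row_max = a.max(axis=1)   # maximum of each row

# Broadcasting: the column means have shape (4,) and are stretched
# across all 3 rows when subtracted from the (3, 4) array.
centered = a - a.mean(axis=0)

# Fancy indexing: pick rows 0 and 2 using a list of indices.
picked = a[[0, 2]]

# Sorting: sorted values and the indices that would sort the array.
unsorted = np.array([3, 1, 2])
sorted_vals = np.sort(unsorted)
order = np.argsort(unsorted)

# Loading data: parse comma-separated text into a float array.
loaded = np.loadtxt(io.StringIO("1,2\n3,4"), delimiter=",")
```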


  4. Data Analysis Tool: Pandas

Introduction to the data analysis library Pandas, Pandas objects – Series and DataFrame, data indexing and selection, NaN objects, manipulating data frames, grouping, filtering, slicing, sorting, ufuncs, combining datasets – merge and join.

Querying DataFrame structures for cleaning and processing, lambdas, aggregation functions, and applying user-defined functions for manipulations.
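The following sketch touches several of the Pandas topics in this unit: Series, DataFrame, selection, NaN handling, merge, grouping, sorting and a lambda applied to a column. The two small tables and the pass mark of 70 are invented for the example.

```python
import numpy as np
import pandas as pd

# A Series is a one-dimensional labelled array.
marks = pd.Series([20, 24, 10], index=["NumPy", "Pandas", "Matplotlib"])

# Two hypothetical tables that share a key column.
students = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "name": ["Asha", "Ravi", "Meena", "John"],
    "score": [82.0, np.nan, 91.0, 67.0],
})
batches = pd.DataFrame({
    "student_id": [1, 2, 3, 4],
    "batch": ["A", "A", "B", "B"],
})

# Selection with a boolean filter; comparisons with NaN are False.
high = students.loc[students["score"] > 80, "name"]

# NaN handling: treat a missing score as zero (an assumption for this sketch).
students["score"] = students["score"].fillna(0.0)

# Combining datasets: merge on the shared key.
merged = students.merge(batches, on="student_id", how="inner")

# Grouping and aggregation.
batch_mean = merged.groupby("batch")["score"].mean()

# Applying a user-defined function (here a lambda) to each value.
merged["grade"] = merged["score"].apply(lambda s: "pass" if s >= 70 else "fail")

# Sorting by a column.
ranked = merged.sort_values("score", ascending=False)
```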


  5. Statistical Concepts and Functions

Statistics module, manipulating statistical data, calculating results of statistical operations. Python probability distributions. Functions such as mean, median, mode and standard deviation. Concepts of correlation and regression.
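A small sketch using Python's built-in `statistics` module is shown below. The data values are arbitrary. Correlation and regression are written out by hand here so the formulas are visible; the helper name `pearson_and_line` is hypothetical and not taken from any library.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
pstdev = statistics.pstdev(scores)   # population standard deviation

# A probability distribution: the standard normal.
standard_normal = statistics.NormalDist(mu=0, sigma=1)
half = standard_normal.cdf(0)        # probability of a value below the mean


def pearson_and_line(xs, ys):
    """Return (r, slope, intercept) for a simple linear fit of ys on xs."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
    slope = sxy / sxx                # least-squares slope
    intercept = my - slope * mx
    return r, slope, intercept


r, slope, intercept = pearson_and_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(mean, median, mode, pstdev)
```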


  6. Matplotlib

Visualization with Matplotlib, Simple line plots, scatter plots, Density and Contour plots – visualizing functions, multiple subplots, Plotting histograms, bar charts, scatter graphs and line graphs.
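As a rough illustration of the plot types in this unit, the sketch below draws a line plot, a scatter plot, a histogram and a bar chart as four subplots. The data is synthetic, and the `Agg` backend and in-memory buffer are choices made so the example runs without a display; density and contour plots are left out to keep it short.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=500)        # synthetic data for the histogram

# Multiple subplots arranged in a 2 x 2 grid.
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].plot(x, np.sin(x), label="sin(x)")
axes[0, 0].set_title("Line plot")
axes[0, 0].legend()

axes[0, 1].scatter(x, np.cos(x), s=8)
axes[0, 1].set_title("Scatter plot")

axes[1, 0].hist(samples, bins=20)
axes[1, 0].set_title("Histogram")

axes[1, 1].bar(["NumPy", "Pandas", "Matplotlib"], [20, 24, 10])
axes[1, 1].set_title("Bar chart")

fig.tight_layout()

# Render the figure as PNG bytes in memory.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

In an interactive session, the backend line can be dropped and `plt.show()` called instead of saving to a buffer; passing a file name to `fig.savefig` writes the image to disk.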


  7. GUI – Tkinter

Tkinter as the built-in Python module for creating GUI applications. Creating various widgets such as button, canvas, label, entry, frame, check button, etc. Geometry management: pack, grid, place; organizing layouts and widgets; binding functions; mouse click events. Building the complete interface of a project.
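A minimal Tkinter sketch along these lines is given below. The temperature converter is a made-up example, and `celsius_to_fahrenheit` and `build_app` are hypothetical names. Tkinter is imported inside `build_app` only so that the conversion helper can still be used on a machine without a display.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def build_app():
    """Create the window and its widgets; the caller starts the event loop."""
    import tkinter as tk  # imported here so the helper above works without Tk

    root = tk.Tk()
    root.title("Temperature converter")

    # The frame is placed with pack; the widgets inside it use grid.
    frame = tk.Frame(root, padx=10, pady=10)
    frame.pack()

    tk.Label(frame, text="Celsius:").grid(row=0, column=0)
    entry = tk.Entry(frame)
    entry.grid(row=0, column=1)
    result = tk.Label(frame, text="")
    result.grid(row=2, column=0, columnspan=2)

    def on_convert(event=None):
        # Called by the button (no event) and by the Return key (with event).
        try:
            value = float(entry.get())
        except ValueError:
            result.config(text="Please enter a number")
            return
        result.config(text=f"{celsius_to_fahrenheit(value):.1f} F")

    button = tk.Button(frame, text="Convert", command=on_convert)
    button.grid(row=1, column=0, columnspan=2)

    # Binding functions to keyboard and mouse events.
    entry.bind("<Return>", on_convert)
    result.bind("<Button-1>", lambda event: result.config(text=""))

    return root


# To open the window on a machine with a display:
#     build_app().mainloop()
```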


  8. Machine Learning: The Next Step

What is machine learning? Types of machine learning algorithms, training on data, and an introduction to various learning algorithms. Applications of machine learning.
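To give a feel for what training on data means, here is a very small supervised learning sketch that uses only NumPy. The data is synthetic, and fitting a straight line with `np.polyfit` is just one simple choice of learning algorithm, picked for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Synthetic data: y depends linearly on x, plus random noise.
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=100)

# Split the examples into a training set and a held-out test set.
indices = rng.permutation(100)
train_idx, test_idx = indices[:80], indices[80:]

# Training: estimate slope and intercept from the training examples only.
slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)

# Evaluation: measure the error on examples the model has not seen.
predictions = slope * x[test_idx] + intercept
test_mse = np.mean((predictions - y[test_idx]) ** 2)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, test MSE={test_mse:.2f}")
```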


  9. Reference Books/Study Material
    1. Python for Data Analysis (O’Reilly)
    2. Getting Started with Python Data Analysis
    3. Python Data Science Handbook: Essential Tools for Working with Data (O’Reilly)
    4. Python for Data Science For Dummies