Course Page

Data Science & Artificial Intelligence






This page lists all the courses that the department offers. Course contents are reviewed and updated periodically, so students are advised to check back for important updates.





The following is the list of courses:


  • DS 100: Mathematical Foundations of Data Science (4 Credits)

  • Bayes' rule and its connection to inference, various sampling methods, modern PAC (probably approximately correct) analysis. Geometry of high-dimensional space, distance metrics used for numerical and text data. Locality-sensitive hashing (LSH). Matrix approximation techniques: Principal Component Analysis, SVD and dimensionality reduction. Application of transforms (Fourier, Laplace) to data analysis. Linear regression problem, gradient descent. Introduction to some representative datasets using images, documents and tables. Use of MATLAB/Python/R to demonstrate and explore basic concepts.
    Prerequisites: NA



  • DS 200: Architecture for Management of Large Datasets (6 Credits)

  • Design of distributed program models and abstractions, such as MapReduce, Dataflow and Vertex-centric models, for processing volume, velocity, and linked datasets, and for storing and querying over NoSQL datasets. Approaches and design patterns to translate existing data-intensive algorithms and analytics into these distributed programming abstractions. Distributed software architectures, runtime and storage strategies used by Big Data platforms such as Apache Hadoop, Spark, Storm, Giraph, and Hive to execute applications developed using these models on commodity clusters and Clouds in a scalable manner.
    Prerequisites: NA



  • DS 201: Statistical Programming (4 Credits)

  • Probability and statistics: Review, Statistical measures and tests, Statistical analyses using R, Python, and MATLAB, Linear Regression, Hypothesis Testing, Resampling Techniques and Bootstrapping, Introduction to contemporary statistical packages
    Prerequisites: NA



  • DS 250: Data Analytics and Visualization (6 Credits)

  • Data science workflow, Automated methods for data collection, Data and Visualization Models, Data wrangling and cleaning, Exploratory data analysis. Building Models for: Classification, Clustering, Regression, Time-series, Association Analysis, Recommendation Systems. Model evaluation, statistical tests for significance of predictors. Model regularization: ridge, lasso, elastic-net. Visualization Software and Tools, Visualization Design, Multidimensional Data, Graphical Perception, Interaction dynamics for Visual Analysis, Using Space Effectively, Stacked Graphs, Geometry & Aesthetics. Networks, Graph Visualization and Navigation in Information Visualization, Mapping & Cartography, Text Visualization.
    Prerequisites: NA



  • DS 251: Artificial Intelligence (6 Credits)

  • Problem solving, search techniques, control strategies, game playing (minimax), reasoning, knowledge representation through predicate logic, rule-based systems, semantic nets, frames, conceptual dependency formalism. Planning. Handling uncertainty: Bayesian Networks, Dempster-Shafer theory, certainty factors, Fuzzy logic. Learning through Neural nets - Backpropagation, radial basis functions. Neural computational models - Hopfield Nets, Boltzmann machines. MATLAB programming, Introduction to Machine Learning, Supervised and Unsupervised Learning, Introduction to Machine Learning libraries.
    Prerequisites: NA



  • DS 252: DSAI Lab (2 Credits)

  • Introduction, Data in Data Analytics, Descriptive Statistics, Programming with R, Probability Distributions, Sampling Distributions, Statistical Inference, Statistical Tables Relation Analysis, Analysis of Variance (ANOVA), Bayesian Classifier, Information Based Classification, Support Vector Machine, Sensitivity Analysis, Similarity Measures.
    Prerequisites: NA



  • DS 500: Big Data Algorithms (6 Credits)

  • Introduction to big data and its peculiarities. MapReduce as a datacenter-scale programming abstraction. Parallel algorithm design to process massive datasets. Algorithms to solve problems from a variety of domains: web search, e-commerce, social networking, machine learning. Streaming algorithms, sketching algorithms. Brief discussion of next-generation systems like Spark and Flink.
    Prerequisites: Introductory courses in probability, statistics, linear algebra and algorithms.



  • DS 501: Information Retrieval (6 Credits)

  • Introduction, Document Indexing, Storage and Compression, Retrieval Models, Performance Evaluation, Text Categorization and Filtering, Text Clustering, Web Information Retrieval, Learning to rank, Advanced Topics (Text Summarization, Question answering, Recommender Systems)
    Prerequisites: NA



  • DS 503: Advanced Data Analytics (6 Credits)

  • Analysis techniques for high dimensional datasets; Algorithms for massive data problems; Graph representation learning and Graph Neural Networks; Link Prediction, Graph and Node classification, Applications of Graph learning; Network algorithms including those for the World Wide Web; Clustering algorithms for high dimensional datasets; Advanced techniques for Time Series analysis: Motifs, Anomaly detection, Matrix Profile Technique
    Prerequisites: DS 250 or equivalent.



  • DS 601: Digital Image Processing (6 Credits)

  • Fundamentals - visual perception, image sampling and quantization; Intensity transformations - nonlinear transformations for enhancement, histogram equalization; Spatial filtering - convolution, linear and order statistic filters, unsharp masking; Image transforms - discrete Fourier transform, discrete cosine transform; Transform domain processing - image smoothing, specialized filters (Gaussian, Laplacian, etc.); Image restoration - using spatial filters, Wiener filter; Introduction to colour spaces and colour image processing; Morphological image processing - erosion and dilation, opening and closing, hit-or-miss transform, thinning and shape decomposition; Binarization and image segmentation - edge detection, thresholding, region-based segmentation; Image compression - fundamentals, lossless coding, predictive coding, transform coding.
    Prerequisites: NA








    Copyright © Vishesh Thakur