This course is part of the MS in Data Science program at the University of San Francisco.
The goal is to give students a deeper and more general view of data structures and algorithms. While students have examined a number of data structures, such as binary trees, already, this course provides a much more in-depth study. This depth will benefit them greatly in the advanced machine learning course. This course also tends to address many of the difficult algorithm questions students get during job interviews. The critical data structures covered in this class are: lists, linked lists, trees, graphs, hash tables, and tries. The course also covers a variety of common and useful recursive and non-recursive algorithms, such as searching and sorting.
By the end of this course, students will be able to:
- Describe object-oriented programming and use objects to construct linked lists, trees, and graph data structures
- Understand the fundamental data structures such as hashtables, disjoint union, stack and queue.
- Analyze and compare algorithmic complexity using “big-O” notation
- Apply classical algorithms and data structure to problems in data science.
INSTRUCTOR. Mustafa Hajij.
Lectures are Friday
INSTRUCTION FORMAT. Class runs for 1:50 hours 1 day/week. Instructor-student interaction during lecture is encouraged and we'll mix in mini-exercises / labs during class. All programming will be done in the Python 3 programming language, unless otherwise specified.
TARDINESS. Please be on time for class. It is a big distraction if you come in late.
Artifact | Grade Weight | Due date |
---|---|---|
HW1 | 8% | See Canvas |
HW2 | 18% | See Canvas |
HW3 | 18% | See Canvas |
Exam 1 | 30% | See Canvas |
Final Project | 20% | See Canvas |
Attendance and participation | 6% | attend all lectures and actively participate in class |
Note: In order to pass this course, students are expected to receive at least 50% of each category.
Here is a great free book on algorithms by Jeff Erickson.
A great book that is not free is here Introduction to Algorithms
We will give a total of 6 lectures, the outline of topics covered in these lectures are going to be :
Lecture 1 : An introduction to algorithm complexity. Application : k-means algorithm.
- complexity_lecture [PDF]
- k_means_lecture [PDF]
- count_common_elements [Notebook]
- k_means_notebook [Notebook]
- Want to get deeper understanding of complexity of algorithms ? Check this youtube channel.
Lecture 2 : core data structures : Arrays, linked list, hashtables, stack and queue.
- Essential data structures [PDF]
- Hashtables[Notebook]
- Hashtable problems[Notebook]
Lecture 3 : Recursion, trees, introduction to graphs.
- Trees[PDF]
- Analysis of recurive algorithms. Video.
- Binary search tree. Video
- Introduction to NetworkX notebook
- Introduction to graph algorithms
Lecture 4 : Minimal Spanning tree, union find and applications to clustering algorithms
- union find, minimal spanning tree and Kruskal algorithm)[PDF]
- Hierarchical Clustering and MST [PDF]
- Union find and Hierarchical Clustering using Euclidean MST code
Lecture 5 : Dijkstra algorithm, More on graphs in data science.
- Shortest distance on graph [PDF]
- PageRank [Notebook]
- Application : graph k-means Notebook, graph k-means PDF
Lecture 6 :
- More examples on applications algorithms in data science Topological Sort and Backprob
- Sorting algorithms
Exam1 review questions notebook.
More exam questions notebook.