Welcome to CSCA48

TA name: William Granados

Tutorial: Wednesday 4-5pm, at BV 264

Practical: Monday 4-5pm, at BV 471

Email Policy: Before emailing me, consider attending my practical or posting the question on Piazza. My email address is william.granadosmatamoros@mail.utoronto.ca. When emailing me, use your utoronto email and in the subject include "CSCA48".

Expectations: There may be a quiz held in tutorial, so please be on time.

    We went over several sorting algorithms, and you should be able to simulate the state of the array after N passes of each algorithm. This is the last tutorial, so good luck on final exams!
  • Bubble Sort
    • General Idea: You want to loop through the array and swap adjacent pairs of numbers if they're out of place. Then repeat this process N times.
    • Implementation: geeksforgeeks
    • Time Complexity: $O(n)$ to swap the elements in the inner loop, $O(n)$ for the outer loop, resulting in $O(n^2)$ (sketch below).
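    A minimal sketch of the idea (my own code, not the geeksforgeeks version):

```python
def bubble_sort(lst):
    """Sort lst in place by repeatedly swapping out-of-place neighbours."""
    n = len(lst)
    for _ in range(n):                 # repeat the process N times
        for i in range(n - 1):         # swap adjacent pairs that are out of place
            if lst[i] > lst[i + 1]:
                lst[i], lst[i + 1] = lst[i + 1], lst[i]
```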
  • Selection Sort
    • General Idea: In N successive passes over the unsorted array, select the minimum element, remove it, then append it to your sorted list (sketch below).
    • Implementation: geeksforgeeks
    • Time Complexity: $O(n)$ to find the minimum element, $O(n)$ for the passes, resulting in $O(n^2)$.
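    A minimal sketch of the idea (my own code, not the geeksforgeeks version):

```python
def selection_sort(lst):
    """Return a sorted list by repeatedly extracting the minimum."""
    result = []
    while lst:
        smallest = min(lst)        # O(n) scan for the minimum
        lst.remove(smallest)       # remove it from the unsorted list
        result.append(smallest)    # append it to the sorted list
    return result
```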
  • Insertion Sort
    • General Idea: Treat the first element as a sorted sequence of length one; then, starting from the second element, find each element's position in the sorted portion to its left and insert it there (sketch below).
    • Implementation: geeksforgeeks
    • Time Complexity: $O(n)$ to loop over all the elements, $O(n)$ to find its place, resulting in $O(n^2)$.
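    A minimal sketch of the idea (my own code):

```python
def insertion_sort(lst):
    """Sort lst in place; lst[:i] is always a sorted prefix."""
    for i in range(1, len(lst)):           # the first element counts as sorted
        value = lst[i]
        j = i - 1
        while j >= 0 and lst[j] > value:   # shift bigger elements right
            lst[j + 1] = lst[j]
            j -= 1
        lst[j + 1] = value                 # drop value into its place
```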
  • Heap Sort
    • General Idea: Put all of the elements into a heap, then successively pop the minimum element (sketch below).
    • Implementation: geeksforgeeks
    • Time Complexity: $O(\log{n})$ for popping the min, repeated over all $n$ elements, resulting in $O(n\log{n})$.
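    A minimal sketch using Python's built-in heapq module:

```python
import heapq

def heap_sort(lst):
    """Put everything in a min heap, then pop the min n times."""
    heap = list(lst)
    heapq.heapify(heap)      # build the heap
    return [heapq.heappop(heap) for _ in range(len(heap))]   # each pop is O(log n)
```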
  • Quick Sort
    • General Idea: Choose a pivot (ideally close to the median), move elements smaller than the pivot to the left part of the array and elements greater to the right part, then call quick sort on each part (sketch below).
    • Implementation: geeksforgeeks
    • Time Complexity: $O(n\log{n})$ on average; the analysis is more complicated, and a poorly chosen pivot can degrade it to $O(n^2)$.
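    A minimal (not in-place) sketch of the idea:

```python
def quick_sort(lst):
    """Return a sorted copy; partitions around a pivot, then recurses."""
    if len(lst) <= 1:
        return lst
    pivot = lst[len(lst) // 2]          # middle element as the pivot
    smaller = [x for x in lst if x < pivot]
    equal = [x for x in lst if x == pivot]
    larger = [x for x in lst if x > pivot]
    return quick_sort(smaller) + equal + quick_sort(larger)
```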
  • Merge Sort
    • General Idea: Repeatedly split the array into halves, call merge sort on both halves, then combine the sorted subproblems (sketch below).
    • Implementation: geeksforgeeks
    • Time Complexity: $O(n\log{n})$ using Master Theorem
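    A minimal sketch of both the splitting and the combining step (my own code):

```python
def merge(left, right):
    """Combine two sorted lists into one sorted list in O(n)."""
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    return result + left[i:] + right[j:]   # one side may have leftovers

def merge_sort(lst):
    """Split in half, sort each half, then merge the sorted halves."""
    if len(lst) <= 1:
        return lst
    mid = len(lst) // 2
    return merge(merge_sort(lst[:mid]), merge_sort(lst[mid:]))
```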
  • Big-Oh for recursion
    • It might be a bit easier to count the number of steps for a for loop, but a recursive function is a lot more complex.
    • Remember, at the end of the day we have to base the run time on the parameters, be it nodes in a tree, or some numerical value.
    • In general there are 3 types of recursive algorithms that you'll study, and it requires some foresight to figure out the runtime.
      • Linear Recursion: These are pretty straightforward. Since the recursion branches linearly (one call per invocation), you just count the total number of recursive calls times the work done inside each call.
      • General Recursion: These are a bit tougher, since they involve several branches per recursive call, and finding an upper bound on this requires some pattern matching.
      • Divide & Conquer Recursion: You won't have to analyze these types for this course, but you can cheat by using the Master Theorem!
        This approach involves learning about recurrences, which is not required for this course.
  • Linear Recursion
    • Usually should be $O(n)$, unless more than constant work is being done inside each call (sketch below).
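    A minimal example of linear recursion:

```python
def factorial(n):
    """Linear recursion: one recursive call per level and O(1) work
    per call, so n levels give O(n) total."""
    if n <= 1:          # base case
        return 1
    return n * factorial(n - 1)
```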
  • General Recursion
    • For these types of recursion, in general you will be in one of two cases.
      • Each recursive call is unique and is only made once. An example would be flood fill, so figuring out the number of calls from the input size is easy.
      • Each recursive call isn't necessarily unique. An example is the naive Fibonacci calculation; if you were to draw out the recursion tree,
        you'd notice it's exponential with respect to the recursive branches, i.e. $O(2^n)$ in this case (sketch below).
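    A minimal sketch of the naive Fibonacci calculation:

```python
def fib(n):
    """Naive Fibonacci: every call branches into two more calls,
    so the recursion tree has roughly 2^n nodes, i.e. O(2^n)."""
    if n < 2:           # base cases: fib(0) = 0, fib(1) = 1
        return n
    return fib(n - 1) + fib(n - 2)
```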
  • Divide & Conquer Recursion
    • These are a lot more difficult and show up in CSCC73 (Algorithm Design), so you won't have to worry about them too much.
    • Algorithms like quick sort and merge sort are generally analyzed using the Master Theorem which you don't have to know for this course.
    Here's a nice blog post on the topic.
    TT2 will be handed back today.
  • Big-Oh
    • When measuring the efficiency of algorithms, you measure the number of steps they take to terminate.
      For example, the line for i in range(0, n) takes n steps to terminate.
    • For most algorithms you write, you can model the number of steps taken by a mathematical function with respect to the input size.
    • The notation $O(g(n))$ basically says that whatever algorithm you write is bounded above by a constant multiple of $g(n)$ from some point on.
      I.e. in the worst case, your algorithm takes this amount of steps.
  • Time Complexity Hierarchy
    • It's important to have some kind of intuition for the hierarchy of time complexities when it comes to algorithms; here are some helpful examples.
      • $O(1)$ - some examples are: arithmetic operations, appending to the back of the list.
      • $O(\log{n})$ - some examples are: inserting into a heap, binary search.
      • $O(n)$ - an example is a simple for loop that loops n times.
      • $O(n^d), d > 1$ - an example is having d nested for loops each looping n times.
      • $O(b^n), b > 1$ - generally you get this from recursion that branches $b$ ways per call.
      • $O(n!)$ - generally you get this when you're generating permutations or something similar.
  • Formal Definition
    • You might be asked to prove something like $5n^4 + 3n^3 + 2n^2 + 4n + 1 \in O(n^4)$
    • You start off the proof by stating: $f(n) \in O(g(n)) \iff \exists c > 0, \exists n_0 > 0 \text{ s.t. } \forall n > n_0,\ f(n) \leq c \cdot g(n)$
    • The above is a bit confusing, so let's break it down a little.
      • Essentially what we're trying to prove is that, if we were to graph both functions, then past some point $n_0$, $c \cdot g(n)$ is at least as big as $f(n)$ for all $n$ greater than $n_0$.
      • You can make this task easier by also scaling $g(n)$ by some constant $c$
    • For the above formula, if you substitute $n_0 = 1, c = 15$, then you get $15 \leq 15$ at $n_0$; the worked bound below shows the inequality holds for all larger $n$ too, so your proof is done.
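    The value $c = 15$ comes from bounding every lower-order term by the dominant one: for all $n \geq 1$,
    $5n^4 + 3n^3 + 2n^2 + 4n + 1 \leq 5n^4 + 3n^4 + 2n^4 + 4n^4 + n^4 = 15n^4 = 15 \cdot g(n)$.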
  • Challenge problem
    • Describe an algorithm for finding the 10 biggest elements in an unsorted array of size N.
    • Send me an email with your solution (code) and time complexity analysis, and if you have the asymptotically fastest algorithm, we'll post your name on Piazza.
    Here's the code sample we went over: vowel_count.
    Last day to pick up midterm from tutorial. Pickup will now be done from Marzieh's office.
  • Recursion
    • Did you mean: recursion?
    • Problems that branch out lend themselves more easily to a recursive solution than one that involves loops and if statements.
    • When designing a recursive function you think about the parameters, the base case, the recursive case, and the combining step.
      • The parameters are what generally changes on each recursive call; they describe the problem you're trying to break down.
      • The recursive case is how you're going to break the problem down.
      • The base case is the smallest version of the problem, one you can solve directly.
      • The combining step is how you're going to combine the solutions found from the broken-down subproblems (see the sketch after this list).
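    A minimal sketch that labels the four pieces (count_leaves is a made-up example, not from class):

```python
def count_leaves(t):
    """Count the non-list values inside an arbitrarily nested list."""
    # Parameter: t, the (sub)problem, either a value or a list of subproblems.
    if not isinstance(t, list):
        return 1    # base case: a single value is one leaf
    # Recursive case: break t down into its sub-elements.
    # Combining step: add up the answers from the subproblems.
    return sum(count_leaves(sub) for sub in t)
```

    For example, count_leaves([1, [2, [3, 4]], 5]) returns 5.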
    Here are the code samples we went over: week6_heap, week3_pq. From last time, you can read the heap section in these extra notes.
    TT1 will be returned today.
  • Priority Queue ADT
    • Find the min or max amongst all elements in the container.
    • Insert and delete elements in the container.
  • Binary Heap Representation Invariant and properties
    • Internally represented with a list, heap, where heap[:] contains the elements.
    • heap[0] contains the overall min/max element
    • for all i in a min heap, heap[i] <= heap[2*i+1] and heap[i] <= heap[2*i+2]. The same applies for max heaps, except heap[i] is the bigger one (insertion sketch below).
    • abstractly the heap represents a complete binary tree with n nodes, so the maximum height of the tree is about $\log{n}$
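    A minimal sketch of insertion into a list-based min heap (my own code, not the class sample):

```python
def heap_insert(heap, value):
    """Append value, then bubble it up until the invariant holds again."""
    heap.append(value)                     # put it in the first empty spot
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2              # parent index in the 2*i+1 / 2*i+2 layout
        if heap[parent] <= heap[i]:
            break                          # invariant restored
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent                         # keep bubbling up, at most log(n) levels
```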
    Here's the code sample we went over in here, and here are some extra notes. Note that the way deletion is done in these notes is incorrect for this course.
  • Binary Search Trees Properties
    • Has a tree-like structure to it
    • Each node has at most 2 children
    • Invariant: for any node u, u is greater than all of its left descendents, u is less than all of its right descendents
  • Insertions
    • Traverse down the tree until you get to an empty node
    • As you traverse, compare the value you want to insert with the node you're currently at. If the value is bigger, go right; otherwise go left (sketch below).
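    A minimal sketch (the node class here is made up; the course code may differ):

```python
class BSTNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def bst_insert(node, value):
    """Return the subtree rooted at node with value inserted."""
    if node is None:                    # found an empty spot
        return BSTNode(value)
    if value > node.value:              # bigger values go right
        node.right = bst_insert(node.right, value)
    else:                               # smaller (or equal) values go left
        node.left = bst_insert(node.left, value)
    return node
```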
  • Deletion
    • Three cases to consider based on how many children the node has.
    • 0 children: just delete the node, don't have to worry about the invariant
    • 1 child: just replace the node with its child, the invariant holds
    • 2 children: replace the node to delete with the leftmost node in its right subtree (its in-order successor); see the sketch below.
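    A sketch covering all three cases (continuing the made-up BSTNode class above):

```python
def bst_delete(node, value):
    """Return the subtree rooted at node with value removed."""
    if node is None:
        return None
    if value > node.value:
        node.right = bst_delete(node.right, value)
    elif value < node.value:
        node.left = bst_delete(node.left, value)
    else:
        if node.left is None:           # 0 or 1 child: splice in the other side
            return node.right
        if node.right is None:
            return node.left
        # 2 children: find the leftmost node of the right subtree
        successor = node.right
        while successor.left is not None:
            successor = successor.left
        node.value = successor.value    # copy the successor's value up...
        node.right = bst_delete(node.right, successor.value)  # ...then remove it below
    return node
```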
  • Traversals
    • Basically define in what manner we're going to traverse and print the values of the tree. There are 3 you should know.
    • Pre-order: Node Left Right (NLR), In-order: Left Node Right (LNR), Post-order: Left Right Node (LRN).
    • General tip when you're doing these problems is to label the nodes with letters of the alphabet to help you reason faster (sketches below).
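    Minimal sketches of the three traversals (reusing the made-up BSTNode class from the insertion sketch):

```python
def preorder(node):    # Node Left Right (NLR)
    if node is None:
        return []
    return [node.value] + preorder(node.left) + preorder(node.right)

def inorder(node):     # Left Node Right (LNR): visits a BST in sorted order
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)

def postorder(node):   # Left Right Node (LRN)
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.value]
```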
    Here are the code samples gone over in class: indexed_dll, week4_DLL.
  • Efficiency
    • Compared to the singly linked list (SLL), removing elements from the back or front is constant time instead of requiring a traversal.
    • A lot easier to implement the logic for adding/deleting nodes.
  • Augmenting
    • Your challenge is to augment the DLL data structure, to incorporate the following functionality
      • insert(e,i) - e is an element and i is the index where it should be inserted.
      • remove(i) - i is the index of the node to be removed.
    • Your solution should make use of the provided week4_DLL file above. It must make use of OOP principles.
    • Make sure you understand how traversal works with a DLL, and make use of the get/set methods provided by the node class.
    Here's the code sample gone over in class: deque.
  • Properties & Applications
    • Supports inserting or popping elements from either side of the ADT in constant time.
    • Essentially combines the functionality of both a queue and stack.
    • Used in some OS Scheduling Algorithms like the A-Steal scheduling algorithm.
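    For example, Python's built-in collections.deque supports all four end operations in constant time:

```python
from collections import deque

d = deque()
d.append(1)         # push on the right, like a stack
d.appendleft(2)     # push on the left, like the front of a queue
print(d.pop())      # pop from the right  -> 1
print(d.popleft())  # pop from the left   -> 2
```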
    Here are the code samples gone over in class: decimal_to_binary, StackADT. And an extra problem we went over: Parenthesis.
  • Decimal to Binary
    • Here's more information than you need to know about representing numbers in different bases: Computer Number Systems
    • You only need to understand how to convert decimal numbers to binary for the first example (sketch below).
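    A minimal sketch of the conversion using a plain list as a stack (my own code, not the decimal_to_binary sample):

```python
def decimal_to_binary(n):
    """Convert a non-negative integer to its binary string."""
    if n == 0:
        return "0"
    stack = []
    while n > 0:
        stack.append(str(n % 2))     # push the lowest remaining bit
        n //= 2
    # popping reverses the order, so the most significant bit comes out first
    return "".join(stack[::-1])
```

    For example, decimal_to_binary(6) returns "110".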
  • Applications
    • Think about how the data is structured, and how you want to access it. This will help you work through problems.
    • Remember that stacks provide you data in a LIFO (Last In First Out) fashion, which differs from how a queue does it in a FIFO (First In First Out) fashion.
    • Here's a list of fun (optional) problems involving stacks: Geneva Confection, Frog Mutants
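    One common stack solution to the Parenthesis problem mentioned above (a sketch, not necessarily the version from class):

```python
def balanced(s):
    """Check that every bracket in s is matched and properly nested."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)     # remember the opener (LIFO order)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False     # closer with a wrong or missing opener
    return not stack             # leftover openers mean it's unbalanced
```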
    Here's the code we went over.
  • Abstract Data Types
    • Is a logical description of how we view data and the operations that are allowed without regard to how they will be implemented (source).
    • Understand how behaviour can change based on the type of data that we provide.
    • Note that if certain conditions aren't satisfied for our ADT, we should raise an exception (sketch below).
    • Generally docstrings and method signatures shouldn't change when implementing an ADT.
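    A minimal sketch of these points (EmptyStackError and the method names are my own, not necessarily those in the course's StackADT):

```python
class EmptyStackError(Exception):
    """Raised when an operation needs an element the stack doesn't have."""

class Stack:
    """A Stack ADT: the docstrings describe what each operation does,
    not how the list underneath implements it."""

    def __init__(self):
        self._items = []

    def push(self, item):
        """Add item to the top of the stack."""
        self._items.append(item)

    def pop(self):
        """Remove and return the top item; raise EmptyStackError if empty."""
        if not self._items:
            raise EmptyStackError("pop from an empty stack")
        return self._items.pop()
```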
There are no tutorials this week. Instead here's a link to a funny comic XKCD - Trees.