Data Structures & Algorithms

Hash Tables

Guide: How to implement a hash table (in C) (benhoyt.com)

  1. Create Array of certain size (generally any power of 2)
  2. Generate Hash from key
  3. hash(key) & (len(array)-1) -> storage position
    • Modulo would be the general case, but since size is 2n we can use binary-AND which is much faster
  4. Probe position. If already occupied (collision):
    • Open addressing, linear probing: Add 1 to index and try again (until empty slot found)
    • Separate chaining: Store values as linked list of results
    • Other methods: Hash table - Wikipedia
  5. Store value as well as hash in the resulting position
    • Hash is needed to resolve collisions
    • Some tables put the data separately in a flat array and only store hash+index, this saves space for large data structs
  6. Lookup is constant (O(1) amortized) as index = hash(key) & (len(array)-1) + collision resolution

Linked List vs Array

C++ benchmark - std::vector VS std::list | Blog blog("Baptiste Wicht"); (baptiste-wicht.com)

https://www.rfleury.com/p/in-defense-of-linked-lists

!memory-layout.mp4

Sorting

Vergleich mit Animation: Sorting Algorithms Animations | Toptal®

Insertion Sort

Einer der besten Algorithmen, wenn Liste bereits fast sortiert ist. Ebenfalls gut, wenn Datenmenge klein ist. Wird daher oft in anderen Algorithmen verwendet, wenn Liste in kleinere Stücke geteilt wurde.
Nicht gut geeignet wenn Liste invertiert ist.

i ← 1
while i < length(A)
    x ← A[i]
    j ← i
    while j > 0 and A[j-1] > x
        A[j] ← A[j-1]
        j ← j - 1
    end while
    A[j] ← x
    i ← i + 1
end while

Quicksort

Gut bei großen Datenmengen.
Sehr gut wenn Liste invertiert ist.
Nicht so gut wenn Liste bereits fast sortiert ist.
Grundsätzlich für die meisten Anwendungen gut geeignet.
"in memory" sort, braucht keinen zusätzlichen Speicher (außer eine handvoll temporäre Variablen für Wertetausch, Pivot, Iteratoren).

// Sorts a (portion of an) array, divides it into partitions, then sorts those
algorithm quicksort(A, lo, hi) is 
  if lo >= 0 && hi >= 0 && lo < hi then
    p := partition(A, lo, hi) 
    quicksort(A, lo, p) // Note: the pivot is now included
    quicksort(A, p + 1, hi) 

// Divides array into two partitions
algorithm partition(A, lo, hi) is 
  // Pivot value
  pivot := A[lo] // Choose the first element as the pivot

  // Left index
  i := lo - 1 

  // Right index
  j := hi + 1

  loop forever 
    // Move the left index to the right at least once and while the element at
    // the left index is less than the pivot
    do i := i + 1 while A[i] < pivot
    
    // Move the right index to the left at least once and while the element at
    // the right index is greater than the pivot
    do j := j - 1 while A[j] > pivot

    // If the indices crossed, return
    if i >= j then return j
    
    // Swap the elements at the left and right indices
    swap A[i] with A[j]

In C:

void qsort( void* ptr, size_t count, size_t size,
            int (*comp)(const void*, const void*) );

// ptr - start of array
// count elements of size bytes each
// comp - comparison function (returns -1 if first arg is < second arg, 1 if > and 0 if equal)

// Is not guaranteed to use quicksort, some implementations use insertion sort for small arrays or merge sort

stl::sort vs qsort: C qsort() vs C++ sort() - GeeksforGeeks, Beating Up on Qsort | Performance Matters

Radix Sort

Super fast, but only for integers: Beating Up on Qsort | Performance Matters
Directly sorts numbers into their final buckets by comparing them "digit by digit" basically (usually base-256 is used instead of base-10).
Needs additional memory for the buckets (which can slow it down, if not properly optimized).
Time complexity is O(n) instead of O(n*log(n)) for most other sorting algorithms.
Also needs special attention if negative numbers are involved (additional sort on the sign bit).

Quadtrees and Grids

Great set of posts:
data structures - Efficient (and well explained) implementation of a Quadtree for 2D collision detection - Stack Overflow
performance - What is a coarse and fine grid search? - Stack Overflow