C

strncpy, strncat

char * strncpy ( char * destination, const char * source, size_t num );

char * strncat ( char * destination, const char * source, size_t num );
Warning

With strncpy, destination is not null-terminated if strlen(source) >= num.

The following is better:

// Inefficient if src is much shorter, because the rest of dest is filled with 0
char dest[64] = "";
strncpy(dest, src, sizeof(dest) - 1);
dest[sizeof(dest)-1] = 0;
// Better (-1 leaves room for \0)
dest[0] = 0;
strncat(dest, src, sizeof(dest) - 1);
// Appending to existing content
strncat(dest, src, sizeof(dest) - strlen(dest) - 1);

strncpy pads the buffer with 0 up to num bytes if strlen(source) < num, so for strings of very different lengths consider strncat or memmove as an alternative!
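
The strncat pattern above can be wrapped in a small helper. This is only a sketch: the name copy_bounded and its truncation return value are assumptions for illustration, not part of the C library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest (capacity dest_size bytes)
 * and always null-terminates. Returns 1 if src was truncated, else 0. */
static int copy_bounded(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL && dest_size > 0);
    dest[0] = '\0';
    /* strncat appends at most dest_size - 1 characters plus the \0 */
    strncat(dest, src, dest_size - 1);
    return strlen(src) >= dest_size;
}
```

Usage: `char buf[4]; copy_bounded(buf, sizeof(buf), "abcdef");` leaves "abc" in buf and reports truncation.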

Operator Precedence

https://en.cppreference.com/w/c/language/operator_precedence

Warning

a & b == c is parsed as a & (b == c) (the same applies to the other bitwise operators ^ and | and to the comparisons != < > <= >=).
Precedence, from high to low: (almost all other operators) -> & AND -> ^ XOR -> | OR -> logical && -> logical || -> ternary -> assignment -> comma operator

Warning

<< and >> bind less tightly than + - * /, but more tightly than most other operators (comparisons, bitwise and logical operators)
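
Both warnings can be checked with a few one-liners. The sketch below spells out the parse the compiler chooses next to the parse that was probably intended; all function names are made up for illustration.

```c
#include <assert.h>

/* What was meant by `a & b == c` */
static int masked_equals(unsigned a, unsigned b, unsigned c)
{
    return (a & b) == c;
}

/* How the compiler actually parses `a & b == c` */
static int masked_equals_as_parsed(unsigned a, unsigned b, unsigned c)
{
    return (int)(a & (unsigned)(b == c));
}

/* What was meant by `x << n + k` */
static int shift_then_add(int x, int n, int k)
{
    return (x << n) + k;
}

/* How the compiler actually parses `x << n + k` */
static int shift_then_add_as_parsed(int x, int n, int k)
{
    return x << (n + k);
}

/* Expected values for a few inputs */
static void check_precedence(void)
{
    assert(masked_equals(6u, 3u, 2u) == 1);           /* (6 & 3) == 2 */
    assert(masked_equals_as_parsed(6u, 3u, 2u) == 0); /* 6 & (3 == 2) */
    assert(shift_then_add(1, 2, 1) == 5);             /* (1 << 2) + 1 */
    assert(shift_then_add_as_parsed(1, 2, 1) == 8);   /* 1 << (2 + 1) */
}
```

When in doubt, add parentheses around bitwise and shift expressions.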

Dynamic array without templates

https://github.com/tree-sitter/tree-sitter/blob/master/lib/src/array.h
Uses a macro to define a separate struct per type.
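
A minimal sketch of the idea, not tree-sitter's actual API: the macro names ARRAY, array_push and array_free, the growth factor and the abort-on-failure policy are all assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Declares an anonymous struct type that stores elements of type T */
#define ARRAY(T) struct { T *data; size_t size; size_t capacity; }

/* Appends value, doubling the capacity when the array is full */
#define array_push(a, value) do { \
    if ((a)->size == (a)->capacity) { \
        size_t new_cap_ = (a)->capacity ? (a)->capacity * 2 : 8; \
        void *p_ = realloc((a)->data, new_cap_ * sizeof(*(a)->data)); \
        if (p_ == NULL) abort(); /* sketch only: no real OOM handling */ \
        (a)->data = p_; \
        (a)->capacity = new_cap_; \
    } \
    (a)->data[(a)->size++] = (value); \
} while (0)

/* Releases the storage and resets the array to its empty state */
#define array_free(a) do { \
    free((a)->data); \
    (a)->data = NULL; \
    (a)->size = (a)->capacity = 0; \
} while (0)

/* One concrete struct type per element type */
typedef ARRAY(int) IntArray;

/* Example use: pushes 0..n-1 and returns their sum */
static long sum_first_n(int n)
{
    assert(n >= 0);
    IntArray numbers = {0};
    long sum = 0;
    for (int i = 0; i < n; i++) array_push(&numbers, i);
    for (size_t i = 0; i < numbers.size; i++) sum += numbers.data[i];
    array_free(&numbers);
    return sum;
}
```

Because sizeof(*(a)->data) is resolved per use, the same macros work for any element type without void pointers or casts at the call site.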

malloc + sizeof

struct someStruct *var = malloc(sizeof(struct someStruct));
// better (code does not need to change if type changes):
struct someStruct *var = malloc(sizeof(*var));
// usually even better (fills with 0)
struct someStruct *var = calloc(1, sizeof(*var));
// usually even better (shorter, on stack, 0 init when static or one member is set in struct literal):
// NOTE: inits to default values in C++ (0 if no defaults given)
struct someStruct var = {.member = 1, ...};
struct someStruct var = {0};

https://felipec.wordpress.com/2024/03/03/c-skill-issue-how-the-white-house-is-wrong/

offsetof, container_of macro

offsetof gets the byte offset for a member in a struct:

// Check that the order of members in two structs is equal,
// so one can be cast into the other
static_assert(offsetof(Vector2, x) == offsetof(Rectangle, x) && offsetof(Vector2, y) == offsetof(Rectangle, y), "Vector2 and Rectangle must share the x/y layout");

container_of() is part of the Linux kernel and basically does the reverse, pass in a pointer to a member and get a pointer for the surrounding type: https://radek.io/2012/11/10/magical-container_of-macro/
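
A simplified sketch of container_of built on offsetof. It omits the type check of the kernel version, and the struct names list_node and item are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from the member to the start of the struct */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct list_node { struct list_node *next; };

struct item {
    int value;
    struct list_node node; /* embedded member */
};

/* Recovers the enclosing struct item from a pointer to its node member */
static struct item *item_from_node(struct list_node *n)
{
    assert(n != NULL);
    return container_of(n, struct item, node);
}
```

This is what allows generic intrusive lists: the list code only knows list_node, and the caller converts back to its own type.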

Macro stringification

#define MAKESTRING(n) STRING(n)
#define STRING(n) #n

// prints __LINE__ (not expanded)
printf("%s\n", STRING(__LINE__));
// prints 42 (line number)
printf("%s\n", MAKESTRING(__LINE__));

This is the stringize operation (#): it produces a string literal from the macro parameter (n). Two macro levels are required because # is applied to the argument exactly as written; the extra level lets the argument expand first. (https://stackoverflow.com/a/48464280)
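
A small sketch of where the two-level form is useful in practice. BUFFER_SIZE and SOURCE_LOCATION are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <string.h>

#define STRING(n) #n
#define MAKESTRING(n) STRING(n)

#define BUFFER_SIZE 64 /* hypothetical constant */

/* Adjacent string literals are concatenated, giving e.g. "main.c:12" */
#define SOURCE_LOCATION __FILE__ ":" MAKESTRING(__LINE__)

static const char *size_unexpanded(void) { return STRING(BUFFER_SIZE); }
static const char *size_expanded(void) { return MAKESTRING(BUFFER_SIZE); }
static const char *location(void) { return SOURCE_LOCATION; }

/* Expected results of both forms */
static void check_stringification(void)
{
    assert(strcmp(size_unexpanded(), "BUFFER_SIZE") == 0);
    assert(strcmp(size_expanded(), "64") == 0);
    assert(strchr(location(), ':') != NULL);
}
```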

OpenMP - simple parallelization

OpenMP and pwrite() (nullprogram.com)

/* schedule(dynamic, 1): treat the loop like a work queue */
#pragma omp parallel for schedule(dynamic, 1)
for (int i = 0; i < num_frames; i++) {
    struct frame *frame = malloc(sizeof(*frame));
    float theta = compute_theta(i);
    compute_frame(frame, theta, beta);
    // only one thread at a time can be in the critical section
    #pragma omp critical
    {
        write(STDOUT_FILENO, frame, sizeof(*frame));
    }
    free(frame);
}

The #pragma is ignored if OpenMP is not supported -> the code runs normally, it is just not parallelized.
Also supported on Windows (up to OpenMP 2.0)

switch

A switch is typically compiled to a jump table (at least when the case values are dense). The input is converted into the memory address of the code for the matching case.
Complexity: O(1) + C
if-tree complexity: O(n) + K | K < C
-> faster than an if-tree when there are many conditions
https://www.youtube.com/watch?v=fjUG_y5ZaL4
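
To make the mechanism concrete, here is a hand-written table of function pointers next to the equivalent switch. This only illustrates the idea; whether a compiler really emits a jump table depends on the case values and optimization settings. The enum and function names are made up.

```c
#include <assert.h>

enum op { OP_ADD, OP_SUB, OP_MUL, OP_COUNT };

static int op_add(int a, int b) { return a + b; }
static int op_sub(int a, int b) { return a - b; }
static int op_mul(int a, int b) { return a * b; }

/* Dense case values: a compiler may lower this switch to a jump table */
static int apply_switch(enum op op, int a, int b)
{
    switch (op) {
    case OP_ADD: return a + b;
    case OP_SUB: return a - b;
    case OP_MUL: return a * b;
    default:     return 0;
    }
}

/* Hand-written equivalent: the input indexes a table of code addresses */
static int apply_table(enum op op, int a, int b)
{
    static int (*const table[OP_COUNT])(int, int) = { op_add, op_sub, op_mul };
    assert((unsigned)op < OP_COUNT);
    return table[op](a, b);
}
```

Both variants pick the target in one indexed lookup instead of testing the conditions one by one.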

Compiler, Linker

Preprocessor only:
> gcc -E main.c -o main.i
Compiler:
> gcc -S main.i -o main.s
Assembler:
> as main.s -o main.o
Preprocessor + Compiler + Assembler:
> gcc -c main.c -o main.o
Linker:
> ld obj1.o /usr/lib/obj2.o -lc main.o -dynamic-linker dynamiclib.so -o main.exe
Full pipeline:
> gcc -o out.exe ./main.c -llibraryname -LLibrary/directory

The compiler generates an object file (.o, .obj). It contains machine code + information about dependencies and layout for the linker.
The linker combines several object files into one executable.

Dump Assembler:
> gcc -S file.c
> gcc -S -o asm_output.s file.c

Dump from Object-file:
(-S intermixes source code)
(-rwC show symbol relocations, disable line-wrapping, demangle)
> objdump -S -d -rwC -Mintel file.o > file.dump

Dump Assembler optimized:
https://stackoverflow.com/a/38552509
> g++ -fno-asynchronous-unwind-tables -fno-exceptions -fno-rtti -fverbose-asm -Wall -Wextra   foo.cpp   -O3 -masm=intel -S -o- | less

Show dependencies:
> ldd main.exe

p == NULL vs NULL == p

void *p = malloc(8);
if (p = NULL) {} // no error, p is assigned NULL
if (NULL = p) {} // error, we wanted to write NULL == p

main()

main() is optimized differently by the compiler, since the function is only called once (claim read somewhere on Stack Overflow)

Bit Hacks

https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
Count set bits in an int32 and more...
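
For reference, a sketch of the parallel bit count for 32-bit values in the style described on that page. The function name count_set_bits is an assumption for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Counts the set bits of v by summing neighbouring bit groups in parallel */
static unsigned count_set_bits(uint32_t v)
{
    v = v - ((v >> 1) & 0x55555555u);                 /* sums of 2 bits */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); /* sums of 4 bits */
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 /* sums of 8 bits */
    return (uint32_t)(v * 0x01010101u) >> 24;         /* add the four bytes */
}

/* Expected results for a few inputs */
static void check_count_set_bits(void)
{
    assert(count_set_bits(0u) == 0);
    assert(count_set_bits(0xF0u) == 4);
    assert(count_set_bits(0xFFFFFFFFu) == 32);
}
```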

Bitmask with more than 64 entries

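// internal, b32 and u64 are project shorthands (presumably static, a 32-bit bool and uint64_t)
typedef enum EntityProperty {
	ENT_PROP_NONE=0,
	ENT_PROP_TOGGLE,
	//...
	EntityProperty_MAX // number of properties, must stay last
} EntityProperty;
// EntityProperty_MAX could also be a plain #define (e.g. 123) instead of the enum sentinel

typedef struct Entity {
	// one bit per property, rounded up to whole 64-bit words
	u64 properties[(EntityProperty_MAX+63)/64];
	// ...
} Entity;

internal b32
EntityHasProperty(Entity *entity, EntityProperty property)
{
    return !!(entity->properties[property / 64] & (1ull << (property % 64)));
}

internal Entity *
SetEntityProperty(Entity *entity, EntityProperty property)
{
    entity->properties[property / 64] |= 1ull << (property % 64);
    return entity;
}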

Simple solution of using an "automatically" scaling array (at compile time) and accessor functions, which target the right entry for a given value.

Stack space

Stack space is limited and you can run out, especially if you place a large array on it.
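
A sketch of the usual workaround: allocate large buffers on the heap (or make them static) instead of declaring them as local variables. SAMPLE_COUNT and make_samples are hypothetical names for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical size: 16 Mi floats = 64 MiB, too large for a typical stack */
#define SAMPLE_COUNT (16u * 1024u * 1024u)

/* Risky: `float samples[SAMPLE_COUNT];` as a local variable.
 * Safer: take the memory from the heap and check the result. */
static float *make_samples(size_t count)
{
    assert(count > 0);
    return calloc(count, sizeof(float)); /* NULL on failure, caller frees */
}
```

Usage: `float *samples = make_samples(SAMPLE_COUNT);` followed by a NULL check and a matching free(samples).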
Default limits

Opaque struct

// foo.h
typedef struct foo foo;
foo *init(void);
void doStuff(foo *f);
void freeFoo(foo *f);

// foo.c
struct foo {
    int x;
    int y;
};

Hides the internal details (data members and internal functions) of a struct from the user, i.e. only the struct name is known to the public API. Essentially this means: do not access the internals of this struct manually, only pass it to the accompanying API functions. It is the C equivalent of making things protected/private in C++.
Advantage: implementation details of the struct can change and user code is unaffected.
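
A single-file sketch of how the two halves could fit together. The function bodies and the accessor getSum are assumptions for illustration; the notes above only fix the names init, doStuff and freeFoo.

```c
#include <assert.h>
#include <stdlib.h>

/* --- what foo.h would expose: only the name and the API --- */
typedef struct foo foo;
foo *init(void);
void doStuff(foo *f);
int getSum(const foo *f); /* hypothetical accessor */
void freeFoo(foo *f);

/* --- what foo.c would contain: the actual layout --- */
struct foo {
    int x;
    int y;
};

foo *init(void)
{
    return calloc(1, sizeof(foo)); /* x = y = 0, NULL on failure */
}

void doStuff(foo *f)
{
    assert(f != NULL);
    f->x += 1;
    f->y += 2;
}

int getSum(const foo *f)
{
    assert(f != NULL);
    return f->x + f->y;
}

void freeFoo(foo *f)
{
    free(f);
}
```

Code that only includes the header cannot write f->x or declare a foo by value, because the type is incomplete there; it can only hold and pass pointers.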

Bitshift

// >> shifts all bits right and fills with 0 (non-negative int and unsigned) or 1 (negative int, preserves the sign bit; technically implementation-defined)
// -1 is all bits=1 in two's complement
// << always fills with 0; left-shifting a negative int is undefined behavior per the C standard, although common compilers produce the two's-complement result shown below

int test = 1;
printf("%d | %d\n", test << 2, test >> 2);
// prints 4 | 0
test = -1;
printf("%d | %d\n", test << 2, test >> 2);
// prints -4 | -1 (typical result, see above)

More info: https://en.wikipedia.org/wiki/Bitwise_operations_in_C#Right_shift_>>

Rounding

 val | cast | floorf | roundf | ceilf
-1.9 |   -1 |   -2.0 |   -2.0 | -1.0
-1.5 |   -1 |   -2.0 |   -2.0 | -1.0
-1.2 |   -1 |   -2.0 |   -1.0 | -1.0
-0.9 |    0 |   -1.0 |   -1.0 | -0.0
-0.5 |    0 |   -1.0 |   -1.0 | -0.0
-0.1 |    0 |   -1.0 |   -0.0 | -0.0
 0.1 |    0 |    0.0 |    0.0 |  1.0
 0.5 |    0 |    0.0 |    1.0 |  1.0
 0.9 |    0 |    0.0 |    1.0 |  1.0
 1.2 |    1 |    1.0 |    1.0 |  2.0
 1.5 |    1 |    1.0 |    2.0 |  2.0
 1.9 |    1 |    1.0 |    2.0 |  2.0

Generated with:

#include <math.h>
#include <stdio.h>

int main(void)
{
    float values[] = {-1.9, -1.5, -1.2, -0.9, -0.5, -0.1, 0.1, 0.5, 0.9, 1.2, 1.5, 1.9};
    printf(" val | cast | floorf | roundf | ceilf\n");
    for (size_t idx = 0; idx < sizeof(values) / sizeof(values[0]); ++idx)
    {
        float v = values[idx];
        // the cast truncates toward zero; roundf rounds halfway cases away from zero
        printf("% 4.1f | %4d | % 6.1f | % 6.1f | % 4.1f\n", v, (int)v, floorf(v), roundf(v), ceilf(v));
    }
}

int sizes

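#include <stdint.h>
#include <stdio.h>
int64_t test = -(1LL << 32) - 2;
int32_t test2 = test; // implicit narrowing keeps only the low 32 bits (implementation-defined)
printf("%lld %d\n", (long long)test, (int)test2);
// prints: -4294967298 -2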

Implicit conversions

Implicit conversions - cppreference.com
Breaking it down to practical rules (ignoring niche cases like complex and imaginary types).
When the two operands of an arithmetic operation have different types, they are converted as follows:

  1. an integer is converted to match a floating type
  2. the smaller type is converted to match the larger type
  3. signed integers are converted to unsigned, unless the signed type can represent all possible values of the unsigned type (i.e. the signed type is bigger)
Examples:
// rule 1
1.f + 20000001 // 20000001 is converted to float 20000000.f
               // result after addition is float 20000000.f
               // (since 20000001.f is not representable)
// rule 2
1.f + 2.0      // 1.f is converted to double 1.0
               // result after addition is double 3.0
(char)'a' + 1L // (char)'a' is converted to long 97
               // result after addition is signed long 98
5UL - 2ULL     // 5UL is converted to unsigned long long 5
               // result after subtraction is unsigned long long 3
// rule 3 (+ rule 2)
2u - 10        // 10 is converted to unsigned int 10
               // result after subtraction is unsigned int
               // 4294967288 (-8 modulo 2^32 -> wraps around)
0UL - 1LL      // 0UL is converted to signed long long 0
               // result after subtraction is signed long long -1
               // NOTE: this example is different on
               // cppreference and depends on the size of
               // (unsigned) long, which is 32 bits in my case
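
Rule 3 is the one that bites most often in practice. A small sketch with made-up function names; the values hold for any width of unsigned int because they are expressed relative to UINT_MAX.

```c
#include <assert.h>
#include <limits.h>

/* The int is converted to unsigned int before the comparison */
static int less_than_mixed(int i, unsigned int u)
{
    return i < u;
}

/* 10 is converted to unsigned int, so the result wraps around */
static unsigned int two_minus_ten(void)
{
    return 2u - 10;
}

/* Workaround: convert to a wider signed type before subtracting */
static long long two_minus_ten_signed(void)
{
    return (long long)2u - 10;
}

/* Expected results */
static void check_conversions(void)
{
    assert(less_than_mixed(-1, 1u) == 0); /* -1 becomes UINT_MAX */
    assert(two_minus_ten() == UINT_MAX - 7u);
    assert(two_minus_ten_signed() == -8);
}
```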

Files - text vs. binary mode

FILE *f = fopen("somefile.txt", "r+"); // text mode
FILE *b = fopen("somefile.txt", "r+b"); // binary mode

UTF8

Unity build

Macro best practices

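// Problem: a multi-statement macro breaks under an unbraced if
#define foo(x) bar(x); baz(x)
if (x > 3)
	foo(x);
// will lead to
if (x > 3)
	bar(x);
baz(x);

// Fix: wrap the statements in do { ... } while(0)
#define foo_fix(x) do { bar(x); baz(x); } while(0)

// Make macros expressions, so they also work where a statement is not allowed
#ifndef NDEBUG
	#define debug_print(s) ((void)printf("%s", s))
#else
	#define debug_print(s) ((void)0)
#endif

for (int idx = 0; idx < len; ++idx, debug_print("foo")) { ... }
x > 0 ? debug_print("greater") : debug_print("smaller");
// f must be a format string that takes no further arguments
#define textf(f) ((void)snprintf(tempBuf, sizeof(tempBuf), f))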

MSVC Build Insights

Finding build bottlenecks with C++ Build Insights - C++ Team Blog

Use the following to get insights into the build process of your C/C++ app. Find out which files are included way too often, which files take long to parse or compile.

  1. Download and install the latest Visual Studio 2019. (vcperf seems to also be included in the Visual Studio Build Tools 2022)
  2. Obtain WPA by downloading and installing the latest Windows ADK.
  3. NOTE: I do not remember doing the following two steps, so maybe they can be skipped as of 05/2025
  4. Copy the perf_msvcbuildinsights.dll file from your Visual Studio 2019’s MSVC installation directory to your newly installed WPA directory. This file is the C++ Build Insights WPA add-in, which must be available to WPA for correctly displaying the C++ Build Insights events.
    1. MSVC’s installation directory is typically: C:\Program Files (x86)\Microsoft Visual Studio\2019\{Edition}\VC\Tools\MSVC\{Version}\bin\Hostx64\x64.
    2. WPA’s installation directory is typically: C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit.
  5. Open the perfcore.ini file in your WPA installation directory and add an entry for the perf_msvcbuildinsights.dll file. This tells WPA to load the C++ Build Insights add-in on startup.
  6. Open an elevated x64 Native Tools Command Prompt for VS 20XX.
  7. Obtain a trace of your build:
    1. Run the following command: vcperf /start MySessionName.
    2. Build your C++ project from anywhere, even from within Visual Studio (vcperf collects events system-wide).
    3. Run the following command: vcperf /stop MySessionName outputFile.etl. This command will stop the trace, analyze all events, and save everything in the outputFile.etl trace file.
  8. Open the trace you just collected in WPA.