Math4Programmers

Mathematical Notation in Code From Symbols to Implementation: A Programmer’s Guide to Integrating Math Notation
Author	Alberto Rodriguez

4. ∫ - Integral

Basic Introduction

The ∫ symbol represents integration in calculus. It can be thought of as the opposite of differentiation and is often used to calculate areas under curves.

Simple Example


      1
   ∫  x dx = [x²/2] from 0 to 1 = 1/2
      0

This calculates the area under the curve y = x from x = 0 to x = 1.

Advanced Explanation

Integration is a fundamental concept in calculus with applications in physics, engineering, and many other fields. It's used to solve differential equations, calculate volumes, and much more.

C Implementation (Numerical Integration - Trapezoidal Rule):


double integrate_trapezoidal(double (*f)(double), double a, double b, int n) {
    double h = (b - a) / n;
    double sum = 0.5 * (f(a) + f(b));
    for (int i = 1; i < n; i++) {
        sum += f(a + i * h);
    }
    return sum * h;
}

5. d/dx - Derivative

Basic Introduction

The d/dx symbol represents differentiation with respect to x. It's used to find the rate of change of a function.

Simple Example


d
-- (x²) = 2x
dx

This means "the derivative of x² with respect to x is 2x".

Advanced Explanation

Differentiation is a key concept in calculus, used to analyze rates of change, find maxima and minima, and solve optimization problems. It's crucial in physics for describing motion and in economics for marginal analysis.

C Implementation (Numerical Differentiation):


double derivative(double (*f)(double), double x, double h) {
    return (f(x + h) - f(x - h)) / (2 * h);
}

6. lim - Limit

Basic Introduction

The lim symbol represents the limit of a function as the input approaches a specific value.

Simple Example


    lim (1/x) = 0
x -> ∞

This means "as x approaches infinity, 1/x approaches 0".

Advanced Explanation

Limits are fundamental in calculus, used to define continuity, derivatives, and integrals. They're crucial for understanding function behavior near critical points or asymptotes.

C Implementation (Limit Approximation):


double limit_approx(double (*f)(double), double a, double epsilon) {
    return f(a + epsilon);
}

7. ∀ - For All

Basic Introduction

The ∀ symbol means "for all" or "for every" in mathematical logic and set theory.

Simple Example

∀x (x² ≥ 0) means "for all x, x squared is greater than or equal to zero".

Advanced Explanation

This universal quantifier is used in formal logic, set theory, and mathematical proofs. It's often paired with the existential quantifier (∃) in complex logical statements.

C Implementation (conceptual):


int for_all(int* set, int size, int (*predicate)(int)) {
    for (int i = 0; i < size; i++) {
        if (!predicate(set[i])) return 0;
    }
    return 1;
}

// Usage example
int is_positive(int x) { return x > 0; }
int result = for_all(my_array, array_size, is_positive);

8. {…} - Set Notation

Basic Introduction

Curly braces {} are used to denote sets, which are collections of distinct objects.

Simple Example

{1, 2, 3, 4, 5} represents a set containing the first five positive integers.

Advanced Explanation

Set notation is fundamental in mathematics for describing collections of objects. It's used extensively in set theory, algebra, and computer science, especially in describing data structures.

C Implementation (using a simple array):


#include <stdio.h>

void print_set(int* set, int size) {
    printf("{");
    for (int i = 0; i < size; i++) {
        printf("%d", set[i]);
        if (i < size - 1) printf(", ");
    }
    printf("}\n");
}

// Usage
int my_set[] = {1, 2, 3, 4, 5};
print_set(my_set, 5);

9. ∈ - Element of

Basic Introduction

The ∈ symbol means "is an element of" or "belongs to" a set.

Simple Example

If S = {1, 2, 3}, then 2 ∈ S means "2 is an element of set S".

Advanced Explanation

This symbol is crucial in set theory for describing relationships between elements and sets. It's used in defining sets, proving set properties, and in many areas of discrete mathematics.

C Implementation:


int is_element_of(int element, int* set, int size) {
    for (int i = 0; i < size; i++) {
        if (set[i] == element) return 1;
    }
    return 0;
}

// Usage
int my_set[] = {1, 2, 3, 4, 5};
int result = is_element_of(3, my_set, 5);  // Returns 1 (true)

10. { | property} - Set Builder Notation

Basic Introduction

Set builder notation describes a set by stating a property that its members must satisfy.

Simple Example

{x | x is an even number less than 10} = {2, 4, 6, 8}

Advanced Explanation

This notation is powerful for defining sets based on properties rather than listing elements. It's widely used in mathematics to concisely describe infinite sets or sets with complex membership criteria.

C Implementation (conceptual):


#include <stdlib.h>

int* build_set(int max, int (*property)(int), int* size) {
    int* set = malloc(max * sizeof(int));
    *size = 0;
    for (int i = 0; i < max; i++) {
        if (property(i)) {
            set[(*size)++] = i;
        }
    }
    return set;
}

// Usage example
int is_even_less_than_10(int x) { return x % 2 == 0 && x < 10; }
int size;
int* even_set = build_set(10, is_even_less_than_10, &size);

11. ⊆ - Subset

Basic Introduction

The ⊆ symbol means "is a subset of". A set A is a subset of set B if every element of A is also an element of B.

Simple Example

If A = {1, 2} and B = {1, 2, 3, 4}, then A ⊆ B.

Advanced Explanation

The subset relationship is fundamental in set theory and is used extensively in proofs and in defining relationships between sets. It's closely related to the concepts of power sets and set inclusion.

C Implementation:


int is_subset(int* setA, int sizeA, int* setB, int sizeB) {
    for (int i = 0; i < sizeA; i++) {
        if (!is_element_of(setA[i], setB, sizeB)) return 0;
    }
    return 1;
}

// Usage
int setA[] = {1, 2};
int setB[] = {1, 2, 3, 4};
int result = is_subset(setA, 2, setB, 4);  // Returns 1 (true)

12. ∪ - Union

Basic Introduction

The ∪ symbol represents the union of sets. The union of sets A and B is the set of elements that are in A, in B, or in both A and B.

Simple Example

If A = {1, 2, 3} and B = {3, 4, 5}, then A ∪ B = {1, 2, 3, 4, 5}

Advanced Explanation

Union is a fundamental set operation used in set theory, logic, and computer science. It's essential in database operations, algorithm design, and in solving problems involving multiple sets.

C Implementation:


#include <stdlib.h>

int* union_sets(int* setA, int sizeA, int* setB, int sizeB, int* sizeResult) {
    int* result = malloc((sizeA + sizeB) * sizeof(int));
    *sizeResult = 0;
    
    // Add all elements from set A
    for (int i = 0; i < sizeA; i++) {
        result[(*sizeResult)++] = setA[i];
    }
    
    // Add elements from set B that are not in A
    for (int i = 0; i < sizeB; i++) {
        if (!is_element_of(setB[i], setA, sizeA)) {
            result[(*sizeResult)++] = setB[i];
        }
    }
    
    return result;
}

// Usage
int setA[] = {1, 2, 3};
int setB[] = {3, 4, 5};
int sizeResult;
int* unionSet = union_sets(setA, 3, setB, 3, &sizeResult);

13. ∩ - Intersection

Basic Introduction

The ∩ symbol represents the intersection of sets. The intersection of sets A and B is the set of elements that are in both A and B.

Simple Example

If A = {1, 2, 3, 4} and B = {3, 4, 5, 6}, then A ∩ B = {3, 4}

Advanced Explanation

Intersection is a fundamental set operation used in set theory, logic, and database operations. It's crucial in finding common elements between sets and in defining relationships between different sets.

C Implementation:


int* intersection(int* setA, int sizeA, int* setB, int sizeB, int* sizeResult) {
    int* result = malloc(((sizeA < sizeB) ? sizeA : sizeB) * sizeof(int));
    *sizeResult = 0;
    
    for (int i = 0; i < sizeA; i++) {
        if (is_element_of(setA[i], setB, sizeB)) {
            result[(*sizeResult)++] = setA[i];
        }
    }
    
    return result;
}

14. ∂ - Partial Derivative

Basic Introduction

The ∂ symbol represents a partial derivative, which is the derivative of a function with respect to one variable, treating other variables as constants.

Simple Example

If f(x, y) = x² + xy, then ∂f/∂x = 2x + y

Advanced Explanation

Partial derivatives are crucial in multivariable calculus, used to analyze functions of several variables. They're fundamental in physics, engineering, and economics for studying rates of change in complex systems.

C Implementation (numerical approximation):


double partial_derivative(double (*f)(double, double), double x, double y, double h, int respect_to_x) {
    if (respect_to_x) {
        return (f(x + h, y) - f(x - h, y)) / (2 * h);
    } else {
        return (f(x, y + h) - f(x, y - h)) / (2 * h);
    }
}

15. ∇ - Gradient

Basic Introduction

The ∇ (nabla) symbol represents the gradient of a scalar function, which is a vector of all its partial derivatives.

Simple Example

If f(x, y) = x² + xy + y², then ∇f = (2x + y, x + 2y)

Advanced Explanation

The gradient is a key concept in vector calculus, crucial for optimization problems, potential theory in physics, and machine learning algorithms like gradient descent.

C Implementation (2D gradient):


typedef struct {
    double x;
    double y;
} Vector2D;

Vector2D gradient(double (*f)(double, double), double x, double y, double h) {
    Vector2D grad;
    grad.x = partial_derivative(f, x, y, h, 1);
    grad.y = partial_derivative(f, x, y, h, 0);
    return grad;
}

16. ≈ - Approximately Equal

Basic Introduction

The ≈ symbol means "approximately equal to" and is used when two values are close but not exactly the same.

Simple Example

π ≈ 3.14159

Advanced Explanation

This concept is crucial in numerical analysis, physics, and engineering where exact values are often impossible or impractical to compute. It's also important in defining limits and in computational approximations.

C Implementation:


#include <math.h>

int approximately_equal(double a, double b, double epsilon) {
    return fabs(a - b) < epsilon;
}

17. ∞ - Infinity

Basic Introduction

The ∞ symbol represents infinity, a concept of something without any limit.

Simple Example

The set of all positive integers: {1, 2, 3, ...} → ∞

Advanced Explanation

Infinity is a profound concept in mathematics, used in calculus for limits, in set theory for describing infinite sets, and in topology. It's crucial for understanding asymptotic behavior and in defining certain mathematical structures.

C Implementation (representation using limits):


#include <float.h>

#define INFINITY DBL_MAX

double approach_infinity(int n) {
    return 1.0 / (1.0 / n);
}

18. ∃ - There Exists

Basic Introduction

The ∃ symbol means "there exists" or "there is at least one" in mathematical logic.

Simple Example

∃x (x² = 4) means "there exists an x such that x squared equals 4"

Advanced Explanation

This existential quantifier is crucial in logic and set theory, often used in conjunction with the universal quantifier (∀) to form complex logical statements and in mathematical proofs.

C Implementation (conceptual):


int there_exists(int* set, int size, int (*predicate)(int)) {
    for (int i = 0; i < size; i++) {
        if (predicate(set[i])) return 1;
    }
    return 0;
}

// Usage example
int is_even(int x) { return x % 2 == 0; }
int result = there_exists(my_array, array_size, is_even);

19. ⇒ and ⇔ - Implication and Equivalence

Basic Introduction

⇒ means "implies" or "if...then"

⇔ means "if and only if" or "is equivalent to"

Simple Example

A ⇒ B: "If it's raining (A), then the ground is wet (B)"

A ⇔ B: "A triangle is equilateral if and only if all its angles are 60°"

Advanced Explanation

These logical connectives are fundamental in mathematical logic, used extensively in proofs, definitions, and in formalizing mathematical statements. They're crucial in understanding the relationships between different mathematical conditions or statements.

C Implementation (conceptual boolean logic)


int implies(int a, int b) {
    return !a || b;  // equivalent to "not A or B"
}

int iff(int a, int b) {
    return (a && b) || (!a && !b);  // both true or both false
}

Note: Adding new sections specifically relevant to deep learning below.

20. Matrix Operations

Basic Introduction

Matrices are rectangular arrays of numbers, symbols, or expressions arranged in rows and columns. They are fundamental in deep learning for representing and manipulating data and model parameters.

Simple Example

A 2x3 matrix:
A = [ 1 2 3 ] [ 4 5 6 ]

Advanced Explanation

Matrix operations like addition, multiplication, and transposition are crucial in deep learning for tasks such as feature transformation, weight updates, and backpropagation.

Key Notations

A^T: Transpose of matrix A

AB: Matrix multiplication of A and B

A ⊙ B: Hadamard (element-wise) product of A and B

C Implementation (basic matrix operations)


#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int rows, cols;
    double** data;
} Matrix;

Matrix create_matrix(int rows, int cols) {
    Matrix m = {rows, cols, malloc(rows * sizeof(double*))};
    for (int i = 0; i < rows; i++) {
        m.data[i] = calloc(cols, sizeof(double));
    }
    return m;
}

Matrix matrix_multiply(Matrix A, Matrix B) {
    if (A.cols != B.rows) {
        printf("Error: incompatible dimensions\n");
        exit(1);
    }
    Matrix C = create_matrix(A.rows, B.cols);
    for (int i = 0; i < A.rows; i++) {
        for (int j = 0; j < B.cols; j++) {
            for (int k = 0; k < A.cols; k++) {
                C.data[i][j] += A.data[i][k] * B.data[k][j];
            }
        }
    }
    return C;
}

// Other operations like addition, transposition can be implemented similarly
void free_matrix(Matrix m) {
    for (int i = 0; i < m.rows; i++) {
        free(m.data[i]);
    }
    free(m.data);
}

// Usage example
Matrix A = create_matrix(2, 3);
// Use matrix A for operations...
free_matrix(A); // Remember to free the memory

21. Partial Derivatives and Gradients in Neural Networks

Basic Introduction

In deep learning, partial derivatives and gradients are used to compute how the loss function changes with respect to each model parameter.

Simple Example

For a loss function L(w, b) where w is a weight and b is a bias:

∂L/∂w represents how L changes with respect to w

∂L/∂b represents how L changes with respect to b

Advanced Explanation

The gradient of the loss function with respect to all parameters forms the basis of gradient descent optimization in neural networks. Backpropagation efficiently computes these gradients.

Key Notation

∇L = [∂L/∂w₁, ∂L/∂w₂, ..., ∂L/∂wn, ∂L/∂b]

C Implementation (simple gradient descent)


void gradient_descent(double *w, double *b, double learning_rate, int iterations) {
    for (int i = 0; i < iterations; i++) {
        double dL_dw = compute_gradient_w(*w, *b);  // Compute ∂L/∂w
        double dL_db = compute_gradient_b(*w, *b);  // Compute ∂L/∂b
        *w -= learning_rate * dL_dw;
        *b -= learning_rate * dL_db;
    }
}

22. Activation Functions

Basic Introduction

Activation functions introduce non-linearity into neural networks, allowing them to learn complex patterns.

Common Activation Functions

1. Sigmoid: σ(x) = 1 / (1 + e^(-x))

2. ReLU: f(x) = max(0, x)

3. Tanh: tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))

Advanced Explanation

Choice of activation functions affects the network's ability to learn and can help mitigate issues like vanishing gradients.

C Implementation (activation functions)


#include <math.h>

double sigmoid(double x) {
    return 1.0 / (1.0 + exp(-x));
}

double relu(double x) {
    return (x > 0) ? x : 0;
}

double tanh_activation(double x) {
    return tanh(x);
}

23. Probability and Statistics in Machine Learning

Basic Introduction

Probability theory underpins many machine learning concepts, from loss functions to generative models.

Key Concepts

- P(A|B): Conditional probability of A given B

- E[X]: Expected value of random variable X

- Var(X): Variance of X

Advanced Explanation

Concepts like maximum likelihood estimation, Bayesian inference, and information theory are crucial in understanding and developing machine learning algorithms.

C Implementation (basic probability calculations)


double expected_value(double *values, double *probabilities, int n) {
    double E = 0;
    for (int i = 0; i < n; i++) {
        E += values[i] * probabilities[i];
    }
    return E;
}

double variance(double *values, double *probabilities, int n) {
    double E = expected_value(values, probabilities, n);
    double Var = 0;
    for (int i = 0; i < n; i++) {
        Var += probabilities[i] * pow(values[i] - E, 2);
    }
    return Var;
}

24. Optimization Techniques

Basic Introduction

Optimization algorithms are used to minimize the loss function in machine learning models.

Key Concepts

- Gradient Descent: w = w - η∇L

- Stochastic Gradient Descent (SGD)

- Adam Optimizer

Advanced Explanation

Advanced optimization techniques like Adam combine ideas from momentum and adaptive learning rates to efficiently train deep neural networks.

C Implementation (simple SGD)


void sgd(double *w, double *x, double y, double learning_rate, int features) {
    double prediction = 0;
    for (int i = 0; i < features; i++) {
        prediction += w[i] * x[i];
    }
    double error = prediction - y;
    for (int i = 0; i < features; i++) {
        w[i] -= learning_rate * error * x[i];
    }
}

25. Trigonometric Functions

Basic Introduction

Trigonometric functions are fundamental in mathematics and have numerous applications in computer science, particularly in graphics, physics simulations, and signal processing.

Key Concepts

θ (theta): Represents an angle in radians or degrees
sin(θ): Sine function
cos(θ): Cosine function
tan(θ): Tangent function

Advanced Explanation

These functions describe the relationships between angles and the ratios of sides in right-angled triangles. They are periodic functions with various important properties used in wave analysis, rotation calculations, and more.

The symbol θ (theta) is commonly used to represent an angle. In trigonometry, angles are typically measured in radians, where 2π radians equal 360 degrees. However, many programming languages use degrees for input, so conversion between degrees and radians is often necessary.

Example: Angle Conversion and Trigonometric Calculations


#include <stdio.h>
#include <math.h>

#define PI 3.14159265358979323846

// Convert degrees to radians
double deg_to_rad(double degrees) {
    return degrees * PI / 180.0;
}

// Convert radians to degrees
double rad_to_deg(double radians) {
    return radians * 180.0 / PI;
}

int main() {
    double angle_deg = 45.0;
    double angle_rad = deg_to_rad(angle_deg);

    printf("Angle: %.2f degrees = %.4f radians\n", angle_deg, angle_rad);
    printf("sin(%.2f°) = %.4f\n", angle_deg, sin(angle_rad));
    printf("cos(%.2f°) = %.4f\n", angle_deg, cos(angle_rad));
    printf("tan(%.2f°) = %.4f\n", angle_deg, tan(angle_rad));

    return 0;
}

This example demonstrates angle conversion and basic trigonometric calculations. Note that C math functions expect angles in radians, so we convert from degrees to radians before calculation.

Example: Calculating points on a circle


#include <stdio.h>
#include <math.h>

#define PI 3.14159265358979323846

typedef struct {
    double x;
    double y;
} Point;

Point point_on_circle(double radius, double angle_degrees) {
    double angle_radians = angle_degrees * PI / 180.0;
    Point p;
    p.x = radius * cos(angle_radians);
    p.y = radius * sin(angle_radians);
    return p;
}

void print_points_on_circle(double radius, int num_points) {
    printf("Points on a circle with radius %.2f:\n", radius);
    for (int i = 0; i < num_points; i++) {
        double angle = (360.0 / num_points) * i;
        Point p = point_on_circle(radius, angle);
        printf("  Angle: %.2f°, Point: (%.2f, %.2f)\n", angle, p.x, p.y);
    }
}

int main() {
    print_points_on_circle(5.0, 8);
    return 0;
}

This example calculates points on a circle using trigonometric functions. It demonstrates how sin(θ) and cos(θ) are used to convert polar coordinates (radius and angle) to Cartesian coordinates (x and y).

Application: 2D Rotation

Trigonometric functions are crucial in computer graphics for rotating objects. Here's an improved example that rotates a set of points around the origin:


#include <stdio.h>
#include <math.h>

#define PI 3.14159265358979323846

typedef struct {
    double x;
    double y;
} Point;

void rotate_point(Point* p, double angle_degrees) {
    double angle_radians = angle_degrees * PI / 180.0;
    double x = p->x;
    double y = p->y;
    p->x = x * cos(angle_radians) - y * sin(angle_radians);
    p->y = x * sin(angle_radians) + y * cos(angle_radians);
}

void print_point(Point p) {
    printf("(%.2f, %.2f)", p.x, p.y);
}

int main() {
    Point points[] = {{1, 0}, {0, 1}, {-1, 0}, {0, -1}};
    int num_points = sizeof(points) / sizeof(points[0]);
    double rotation_angle = 45.0;

    printf("Original points:\n");
    for (int i = 0; i < num_points; i++) {
        print_point(points[i]);
        printf("\n");
    }

    printf("\nRotating by %.2f degrees...\n\n", rotation_angle);

    for (int i = 0; i < num_points; i++) {
        rotate_point(&points[i], rotation_angle);
    }

    printf("Rotated points:\n");
    for (int i = 0; i < num_points; i++) {
        print_point(points[i]);
        printf("\n");
    }

    return 0;
}

This example demonstrates how to use trigonometric functions to rotate points around the origin. The rotation matrix used is derived from the following trigonometric identities:
x' = x cos(θ) - y sin(θ)
y' = x sin(θ) + y cos(θ)
Where (x, y) is the original point, (x', y') is the rotated point, and θ is the rotation angle.

26. Euler's Formula

Basic Introduction

Euler's formula, e^ix = cos(x) + i sin(x), establishes a profound connection between exponential and trigonometric functions.

Key Concept

e^iπ + 1 = 0

Advanced Explanation

This formula is fundamental in complex analysis and has applications in signal processing, particularly in Fourier transforms. It provides a way to represent rotations and periodic phenomena using complex exponentials.

Example: Complex number rotation


#include <complex.h>

double complex rotate_complex(double complex z, double angle_degrees) {
    double angle_radians = angle_degrees * M_PI / 180.0;
    return z * cexp(I * angle_radians);
}

Application: Signal Processing

Euler's formula is crucial in the discrete Fourier transform:


void dft(double complex* x, int N) {
    for (int k = 0; k < N; k++) {
        x[k] = 0;
        for (int n = 0; n < N; n++) {
            x[k] += x[n] * cexp(-2 * M_PI * I * k * n / N);
        }
    }
}

Complex Numbers in Polar Form

A complex number can be represented in polar form as:


z = r * (cos(θ) + i * sin(θ))

Where:

r is the magnitude (distance from the origin)
θ (theta) is the angle formed with the positive real axis, known as the argument

This form is useful for multiplying and dividing complex numbers, as it simplifies the calculation of products and quotients.

Example:


double magnitude = cabs(z);
double angle = carg(z);
double complex polar_form = magnitude * (cos(angle) + I * sin(angle));

27. Superscript Notation (i) and [l]

Basic Introduction

In machine learning notation:

Superscript (i) denotes the i-th training example
Superscript [l] denotes the l-th layer of a neural network

Advanced Explanation

This notation helps distinguish between different examples in a dataset and different layers in a neural network, which is crucial for understanding the flow of data and computations.

C Implementation


typedef struct {
    double* features;
    double label;
} Example;

typedef struct {
    double** weights;
    double* biases;
    int units;
} Layer;

// Usage
Example training_example[m];  // m examples
Layer network[L];  // L layers

// Accessing i-th example, l-th layer
double feature_j_of_example_i = training_example[i].features[j];
double weight_jk_of_layer_l = network[l].weights[j][k];

28. Size Notations (m, nx, ny, n[l]h, L)

Basic Introduction

m: number of examples in the dataset
nx: input size
ny: output size (or number of classes)
n[l]h: number of hidden units of the l-th layer
L: number of layers in the network

Advanced Explanation

These notations help define the structure of the neural network and the dataset. They're crucial for setting up the correct dimensions of weight matrices and bias vectors.

C Implementation


#define M 1000  // Number of examples
#define NX 784  // Input size (e.g., for MNIST)
#define NY 10   // Output size (e.g., 10 classes for MNIST)
#define L 3     // Number of layers

int n_h[L] = {128, 64, NY};  // Hidden units per layer

29. Matrix and Vector Notations (X, x(i), Y, y(i), W[l], b[l], ŷ)

Basic Introduction

X ∈ R^(nx×m): input matrix
x(i) ∈ R^nx: i-th example as a column vector
Y ∈ R^(ny×m): label matrix
y(i) ∈ R^ny: output label for the i-th example
W[l]: weight matrix of layer l
b[l]: bias vector of layer l
ŷ: predicted output vector

Advanced Explanation

These notations represent the core data structures in a neural network. The input matrix X contains all training examples, while Y contains their corresponding labels. W[l] and b[l] are the learnable parameters of the network.

C Implementation


#include <stdlib.h>

double** allocate_2d_array(int rows, int cols) {
    double** array = malloc(rows * sizeof(double*));
    for (int i = 0; i < rows; i++) {
        array[i] = malloc(cols * sizeof(double));
    }
    return array;
}

// Usage
double** X = allocate_2d_array(NX, M);
double** Y = allocate_2d_array(NY, M);
double** W[L];
double* b[L];

for (int l = 0; l < L; l++) {
    int next_layer_size = (l == L-1) ? NY : n_h[l+1];
    int this_layer_size = (l == 0) ? NX : n_h[l];
    W[l] = allocate_2d_array(next_layer_size, this_layer_size);
    b[l] = malloc(next_layer_size * sizeof(double));
}

double* y_hat = malloc(NY * sizeof(double));

30. Activation Functions and Forward Propagation

Basic Introduction

Activation functions introduce non-linearity into the network. Common ones include ReLU, sigmoid, and tanh.

Forward propagation equation: a = g[l](Wxx(i) + b1) = g[l](z1)

Advanced Explanation

The activation function g[l] is applied element-wise to the linear transformation of the input. This process is repeated layer by layer in forward propagation.

C Implementation


#include <math.h>

double relu(double x) {
    return x > 0 ? x : 0;
}

double sigmoid(double x) {
    return 1 / (1 + exp(-x));
}

void forward_propagation(double* input, double** W, double* b, int input_size, int output_size, double* output) {
    for (int i = 0; i < output_size; i++) {
        double z = 0;
        for (int j = 0; j < input_size; j++) {
            z += W[i][j] * input[j];
        }
        z += b[i];
        output[i] = relu(z);  // Using ReLU activation
    }
}

31. Cost Functions J(x, W, b, y) or J(ŷ, y)

Basic Introduction

Cost functions measure the difference between predicted and actual outputs. Common ones include:

Cross-entropy: JCE(ŷ, y) = -Σ y(i) log ŷ(i)
Mean squared error: J1(ŷ, y) = Σ |y(i) - ŷ(i)|^2

Advanced Explanation

Cost functions guide the optimization process. The goal is to minimize the cost function by adjusting the network's parameters (weights and biases).

C Implementation


#include <math.h>

double cross_entropy_loss(double* y_true, double* y_pred, int n) {
    double loss = 0;
    for (int i = 0; i < n; i++) {
        loss -= y_true[i] * log(y_pred[i]);
    }
    return loss;
}

double mean_squared_error(double* y_true, double* y_pred, int n) {
    double loss = 0;
    for (int i = 0; i < n; i++) {
        double diff = y_true[i] - y_pred[i];
        loss += diff * diff;
    }
    return loss / n;
}

32. Jacobian Matrix (J)

Basic Introduction

The Jacobian matrix is a matrix of all first-order partial derivatives of a vector-valued function. It represents how a small change in each input affects each output.

Mathematical Notation


        J = 
        [ ∂f₁/∂x₁ ∂f₁/∂x₂ ... ∂f₁/∂xn ]
        [ ∂f₂/∂x₁ ∂f₂/∂x₂ ... ∂f₂/∂xn ]
        [  .       .           .      ]
        [ ∂fm/∂x₁ ∂fm/∂x₂ ... ∂fm/∂xn ]

Advanced Explanation

The Jacobian matrix is crucial in transforming coordinates in multivariable calculus, optimizing multivariable functions, and is widely used in machine learning, particularly in backpropagation for neural networks.

C Implementation


// Example function to calculate partial derivatives
double df1_dx1(double x1, double x2) {
    return 2 * x1;  // Example: derivative of f1 = x1^2 + x2
}

double df1_dx2(double x1, double x2) {
    return 1;  // Example: derivative of f1 = x1^2 + x2
}

double** compute_jacobian(double (*f[])(double, double), double x1, double x2, int m, int n) {
    double** J = malloc(m * sizeof(double*));
    for (int i = 0; i < m; i++) {
        J[i] = malloc(n * sizeof(double));
    }
    J[0][0] = df1_dx1(x1, x2);
    J[0][1] = df1_dx2(x1, x2);
    // Add more partial derivatives as necessary
    return J;
}

33. Hessian Matrix (H)

Basic Introduction

The Hessian matrix is a square matrix of second-order partial derivatives of a scalar-valued function. It is used to analyze the curvature of the function, which is critical in optimization problems.

Mathematical Notation


        H = 
        [ ∂²f/∂x₁² ∂²f/∂x₁∂x₂ ... ∂²f/∂xn² ]
        [ ∂²f/∂x₂² ∂²f/∂x₂∂x₁ ... ∂²f/∂xn² ]
        [  .           .           .      ]
        [ ∂²f/∂xn² ∂²f/∂xn∂x₁ ... ∂²f/∂xn² ]

Advanced Explanation

The Hessian is vital in determining the nature of critical points (minima, maxima, or saddle points) and is heavily used in second-order optimization algorithms like Newton's method.

C Implementation


// Example second-order partial derivatives
double d2f_dx1dx1(double x1, double x2) {
    return 2;  // Example: second derivative of f = x1^2 + x2 with respect to x1
}

double d2f_dx1dx2(double x1, double x2) {
    return 0;  // Example: mixed derivative of f = x1^2 + x2 with respect to x1 and x2
}

double** compute_hessian(double x1, double x2) {
    double** H = malloc(2 * sizeof(double*));
    for (int i = 0; i < 2; i++) {
        H[i] = malloc(2 * sizeof(double));
    }
    H[0][0] = d2f_dx1dx1(x1, x2);
    H[0][1] = d2f_dx1dx2(x1, x2);
    H[1][0] = H[0][1];  // Hessian is symmetric
    H[1][1] = 0;  // Example: second derivative of f = x1^2 + x2 with respect to x2
    return H;
}

34. Eigenvalues and Eigenvectors

Basic Introduction

Eigenvalues and eigenvectors are key concepts in linear algebra. An eigenvector of a matrix is a vector that remains parallel to itself after being transformed by the matrix, and the corresponding eigenvalue is a scalar that represents how much the eigenvector is scaled during the transformation.

Mathematical Notation


        A * v = λ * v

Where A is a square matrix, v is the eigenvector, and λ (lambda) is the eigenvalue.

Advanced Explanation

Eigenvalues and eigenvectors are used in many areas including principal component analysis (PCA), stability analysis, and quantum mechanics. They help in simplifying matrix operations and understanding the structure of a matrix.

C Implementation


#include <stdio.h>
#include <math.h>

void power_iteration(double** A, double* v, int n, int iterations) {
    for (int k = 0; k < iterations; k++) {
        double* w = malloc(n * sizeof(double));
        for (int i = 0; i < n; i++) {
            w[i] = 0;
            for (int j = 0; j < n; j++) {
                w[i] += A[i][j] * v[j];
            }
        }
        // Normalize vector w to become the next eigenvector approximation
        double norm = 0;
        for (int i = 0; i < n; i++) {
            norm += w[i] * w[i];
        }
        norm = sqrt(norm);
        for (int i = 0; i < n; i++) {
            v[i] = w[i] / norm;
        }
        free(w);
    }
}

double compute_eigenvalue(double** A, double* v, int n) {
    double* Av = malloc(n * sizeof(double));
    double eigenvalue = 0;
    for (int i = 0; i < n; i++) {
        Av[i] = 0;
        for (int j = 0; j < n; j++) {
            Av[i] += A[i][j] * v[j];
        }
        eigenvalue += v[i] * Av[i];
    }
    free(Av);
    return eigenvalue;
}

35. Matrix Inverses and Determinants

Basic Introduction

The inverse of a matrix A is another matrix, denoted A⁻¹, such that A * A⁻¹ = I, where I is the identity matrix. The determinant of a matrix is a scalar value that is a function of the entries of a square matrix and provides important properties of the matrix.

Mathematical Notation


        A * A⁻¹ = I
        
        det(A) = ∑ (sign(σ) * a₁σ(1) * a₂σ(2) * ... * anσ(n))

Advanced Explanation

Determinants are used to solve systems of linear equations, compute eigenvalues, and understand matrix properties like invertibility. The inverse matrix is crucial in solving matrix equations and in many algorithms in computer science and data analysis.

C Implementation


#include <stdio.h>

double determinant(double** A, int n) {
    if (n == 1) return A[0][0];
    if (n == 2) return A[0][0] * A[1][1] - A[0][1] * A[1][0];

    double det = 0;
    for (int p = 0; p < n; p++) {
        double** submatrix = malloc((n-1) * sizeof(double*));
        for (int i = 1; i < n; i++) {
            submatrix[i-1] = malloc((n-1) * sizeof(double));
            for (int j = 0, col = 0; j < n; j++) {
                if (j == p) continue;
                submatrix[i-1][col++] = A[i][j];
            }
        }
        det += A[0][p] * determinant(submatrix, n-1) * (p % 2 == 0 ? 1 : -1);
        for (int i = 0; i < n-1; i++) free(submatrix[i]);
        free(submatrix);
    }
    return det;
}

void inverse(double** A, double** inverse, int n) {
    double det = determinant(A, n);
    if (det == 0) {
        printf("Singular matrix, no inverse.");
        return;
    }

    // Compute the inverse here (this implementation is non-trivial and can be complex)
    // ...
}

36. L'Hôpital's Rule

Basic Introduction

L'Hôpital's Rule provides a method to evaluate limits that result in indeterminate forms, such as 0/0 or ∞/∞.

Mathematical Notation


        lim (f(x)/g(x)) = lim (f'(x)/g'(x))
        x -> c               x -> c

Provided that the limit on the right-hand side exists.

Advanced Explanation

L'Hôpital's Rule is useful for calculating difficult limits and appears frequently in calculus, particularly in problems involving asymptotic analysis and optimization.

C Implementation


#include <math.h>

double f(double x) {
    return sin(x);
}

double g(double x) {
    return x;
}

double df_dx(double x) {
    return cos(x);
}

double dg_dx(double x) {
    return 1;
}

double lhopital_rule(double x) {
    double numerator = df_dx(x);
    double denominator = dg_dx(x);
    if (denominator == 0) {
        printf("Indeterminate form at x = %.2f\n", x);
        return NAN;
    }
    return numerator / denominator;
}

37. Implicit Differentiation

Basic Introduction

Implicit differentiation is used when a function is not explicitly solved for one variable in terms of the others, but you still need to find its derivative.

Mathematical Notation


        Given: F(x, y) = 0
        Then: dy/dx = - (∂F/∂x) / (∂F/∂y)

Advanced Explanation

Implicit differentiation is particularly useful in situations where solving for one variable explicitly is complex or impossible, such as in many equations in physics and economics.

C Implementation


// Example function F(x, y) = x^2 + y^2 - 1 = 0 (circle equation)
double dF_dx(double x, double y) {
    return 2 * x;
}

double dF_dy(double x, double y) {
    return 2 * y;
}

double implicit_derivative(double x, double y) {
    return -dF_dx(x, y) / dF_dy(x, y);
}

38. Chain Rule

Basic Introduction

The chain rule is a fundamental theorem in calculus used to differentiate composite functions.

Mathematical Notation


        If y = f(g(x)), then dy/dx = f'(g(x)) * g'(x)

Advanced Explanation

The chain rule is essential for understanding how to differentiate functions composed of other functions. It is widely used in the backpropagation algorithm in neural networks and in various other optimization techniques.

C Implementation


// Example: f(g(x)) where f(u) = u^2 and g(x) = sin(x)
double g(double x) {
    return sin(x);
}

double f(double u) {
    return u * u;
}

double chain_rule(double x) {
    double u = g(x);
    double df_du = 2 * u;
    double dg_dx = cos(x);
    return df_du * dg_dx;
}

39. Multivariable Calculus

Basic Introduction

Multivariable calculus extends calculus to functions of several variables. It includes concepts like partial derivatives, multiple integrals, and gradient vectors.

Key Concepts

Partial Derivatives: Derivatives of functions with respect to one variable, holding others constant.
Gradient (∇): A vector that represents the direction and rate of fastest increase of a function.
Multiple Integrals: Integrals over more than one variable, used to compute volumes and areas.

Advanced Explanation

Multivariable calculus is essential in fields such as optimization, physics, and machine learning. It allows for the analysis of systems with multiple changing variables.

C Implementation


// Example: Gradient of f(x, y) = x^2 + y^2
double df_dx(double x, double y) {
    return 2 * x;
}

double df_dy(double x, double y) {
    return 2 * y;
}

void compute_gradient(double x, double y, double* gradient) {
    gradient[0] = df_dx(x, y);
    gradient[1] = df_dy(x, y);
}

40. Big-O Notation (O)

Basic Introduction

Big-O notation describes the upper bound of an algorithm's runtime or space complexity in the worst case, helping to classify algorithms according to how their running time or space requirements grow as the input size grows.

Mathematical Notation


        O(f(n)) = { g(n) | there exists constants c > 0 and n₀ ≥ 0 such that 0 ≤ g(n) ≤ c * f(n) for all n ≥ n₀ }

Advanced Explanation

Big-O notation is crucial for analyzing the efficiency of algorithms, particularly in computer science. It helps in making decisions about which algorithms to use based on their performance as the size of the input data grows.

C Implementation


// Example: Linear Search O(n)
int linear_search(int* arr, int n, int target) {
    for (int i = 0; i < n; i++) {
        if (arr[i] == target) {
            return i;
        }
    }
    return -1;
}

// Example: Binary Search O(log n)
int binary_search(int* arr, int n, int target) {
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = (low + high) / 2;
        if (arr[mid] == target) {
            return mid;
        } else if (arr[mid] < target) {
            low = mid + 1;
        } else {
            high = mid - 1;
        }
    }
    return -1;
}

**This project is not a professional educational tool.** It was created for personal learning and experimentation with mathematical notation and calculus. While the content aims to be accurate and useful, it is not guaranteed to be error-free or comprehensive. Use it at your own risk, and please verify any mathematical or coding implementations independently.

For any questions or additional information, you can reach out via alberrod.dev@gmail.com

Mathematical Notation in Code

1. Σ (Sigma) - Summation

Basic Introduction

Simple Example

Advanced Explanation

Complex Example: Sum of Natural Numbers

Additional Example: Summation with a Step Size

2. ∏ (Pi) - Product

Basic Introduction

Simple Example

Advanced Explanation

Complex Example: Factorial

3. √ - Square Root

Basic Introduction

Simple Example

Advanced Explanation

Newton's Method for Square Root

4. ∫ - Integral

Basic Introduction

Simple Example

Advanced Explanation

5. d/dx - Derivative

Basic Introduction

Simple Example

Advanced Explanation

6. lim - Limit

Basic Introduction

Simple Example

Advanced Explanation

7. ∀ - For All

Basic Introduction

Simple Example

Advanced Explanation

8. {…} - Set Notation

Basic Introduction

Simple Example

Advanced Explanation

9. ∈ - Element of

Basic Introduction

Simple Example

Advanced Explanation

10. { | property} - Set Builder Notation

Basic Introduction

Simple Example

Advanced Explanation

11. ⊆ - Subset

Basic Introduction

Simple Example

Advanced Explanation

12. ∪ - Union

Basic Introduction

Simple Example

Advanced Explanation

13. ∩ - Intersection

Basic Introduction

Simple Example

Advanced Explanation

14. ∂ - Partial Derivative

Basic Introduction

Simple Example

Advanced Explanation

15. ∇ - Gradient

Basic Introduction

Simple Example

Advanced Explanation

16. ≈ - Approximately Equal

Basic Introduction

Simple Example

Advanced Explanation

17. ∞ - Infinity

Basic Introduction

Simple Example

Advanced Explanation

18. ∃ - There Exists

Basic Introduction

Simple Example

Advanced Explanation

19. ⇒ and ⇔ - Implication and Equivalence

Basic Introduction

Simple Example