Final Exam: Mathematical Formulas Reference

Moravec SSD Function

Measures the intensity change when a window $W$ centered at pixel $(m, n)$ is shifted by a discrete offset $(x, y)$.

\[E_{m,n}(x,y) = \sum_{(u,v) \in W} \big[I(m+u, n+v) - I(m+x+u, n+y+v)\big]^2\]

Moravec Corner Response

Takes the minimum Sum of Squared Differences (SSD) over the set $D$ of evaluated shift directions.

\[F_{m,n} = \min_{(x,y) \in D} E_{m,n}(x,y)\]
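
The two Moravec formulas above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the image is assumed to be a list of lists indexed as `img[m][n]`, the window is assumed to be 3x3 (`half=1`), and the shift set `SHIFTS` is assumed to be the 8 unit shifts. The function names are hypothetical.

```python
def moravec_ssd(img, m, n, shift, half=1):
    """E_{m,n}(x, y): SSD between the window at (m, n) and the window shifted by (x, y)."""
    x, y = shift
    total = 0
    for u in range(-half, half + 1):
        for v in range(-half, half + 1):
            diff = img[m + u][n + v] - img[m + x + u][n + y + v]
            total += diff * diff
    return total


# Assumed shift set D: the 8 unit shifts (axis-aligned and diagonal).
SHIFTS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def moravec_response(img, m, n, half=1):
    """F_{m,n}: minimum SSD over all shifts in D (pixel must be at least half+1 from the border)."""
    return min(moravec_ssd(img, m, n, s, half) for s in SHIFTS)
```

On a flat region or along a straight edge at least one shift leaves the window unchanged, so the minimum is zero; only at a corner is every shift penalized.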

Image Gradients

Measures horizontal and vertical intensity changes per pixel.

\[I_x = \dfrac{\partial I}{\partial x}, \qquad I_y = \dfrac{\partial I}{\partial y}\]

Gradient Vector & Direction

Identifies the magnitude and direction angle of the strongest local intensity change.

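\[\nabla f = \left(\dfrac{\partial f}{\partial x}, \dfrac{\partial f}{\partial y}\right), \qquad \lVert \nabla f \rVert = \sqrt{\left(\dfrac{\partial f}{\partial x}\right)^2 + \left(\dfrac{\partial f}{\partial y}\right)^2}, \qquad \theta_g = \text{atan2}\!\left(\dfrac{\partial f}{\partial y}, \dfrac{\partial f}{\partial x}\right)\]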
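
As a rough sketch, the partial derivatives can be estimated with central differences and fed into `math.hypot` and `math.atan2`. The central-difference estimator and the axis convention (x along columns, y along rows) are assumptions made for this example; the function names are hypothetical.

```python
import math


def gradient_at(img, row, col):
    """Central-difference estimates of (I_x, I_y) at an interior pixel.

    Assumed convention: x runs along columns, y runs along rows.
    """
    ix = (img[row][col + 1] - img[row][col - 1]) / 2.0
    iy = (img[row + 1][col] - img[row - 1][col]) / 2.0
    return ix, iy


def magnitude_and_direction(ix, iy):
    """Gradient magnitude and direction angle theta_g in radians."""
    return math.hypot(ix, iy), math.atan2(iy, ix)
```

For an intensity ramp that increases equally along both axes, this yields a direction of 45 degrees ($\pi/4$ radians).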

Structure Tensor Matrix ($M$)

Summarizes local intensity variation using gradients in a neighborhood weighted by $w(x, y)$.

\[M = \sum_{(x,y) \in W} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}\]

Harris SSD Taylor Approximation

Approximates the SSD for a small shift $(u, v)$ by a quadratic form in the structure tensor $M$, obtained from a first-order Taylor expansion of the image intensity.

\[E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}\]

Eigenvector & Characteristic Equations

Used to explicitly compute the eigenvalues $\lambda_1, \lambda_2$ and eigenvectors $e$ of the structure tensor matrix. Here $I$ is the $2 \times 2$ identity matrix, not the image.

\[(M - \lambda I)e = 0, \qquad \det(M - \lambda I) = 0\]

Determinant & Trace

Both are computed directly from the entries of $M$, so corner measures built on them do not require solving the characteristic (quadratic) equation for the eigenvalues.

\[\det(M) = \lambda_1 \lambda_2, \qquad \text{trace}(M) = \lambda_1 + \lambda_2\]
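
To make the link between the characteristic equation and the determinant/trace identities concrete, here is a small sketch for a symmetric $2 \times 2$ matrix. The entry names `a`, `b`, `c` (for $M = [[a, b], [b, c]]$) and the function name are assumptions introduced for this example.

```python
import math


def eigenvalues_2x2_symmetric(a, b, c):
    """Eigenvalues of [[a, b], [b, c]] from det(M - lambda*I) = 0.

    The characteristic polynomial is lambda^2 - trace*lambda + det = 0.
    """
    trace = a + c
    det = a * c - b * b
    # The discriminant is non-negative for a symmetric matrix; clamp rounding noise.
    root = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return (trace + root) / 2.0, (trace - root) / 2.0


lam1, lam2 = eigenvalues_2x2_symmetric(2.0, 1.0, 2.0)
print(lam1, lam2)                # eigenvalues, larger first
print(lam1 * lam2, lam1 + lam2)  # should match det and trace
```

For this example matrix the determinant is $2 \cdot 2 - 1 \cdot 1 = 3$ and the trace is $4$, which agree with the product and sum of the eigenvalues $3$ and $1$.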

Harris Response Function

Evaluates corner strength using a sensitivity coefficient $\alpha$ (typically $0.04$ to $0.06$).

\[R = \det(M) - \alpha \cdot \text{trace}(M)^2\]

Shi-Tomasi Response

Directly selects the smaller eigenvalue as the corner strength metric.

\[R = \min(\lambda_1, \lambda_2)\]

Noble Response

An alternative response function that needs no empirical coefficient $\alpha$; the small constant $\epsilon$ prevents division by zero when the trace vanishes.

\[R = \dfrac{\det(M)}{\text{trace}(M) + \epsilon}\]
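
The Harris, Shi-Tomasi, and Noble responses can be compared side by side on the same structure tensor. The sketch below assumes a uniform window ($w = 1$) unless weights are passed, $\alpha = 0.05$, and $\epsilon = 10^{-12}$; the function names and the `(a, b, c)` entry layout for $M = [[a, b], [b, c]]$ are hypothetical choices for illustration.

```python
import math


def structure_tensor(gradients, weights=None):
    """Entries (a, b, c) of M = [[a, b], [b, c]] from (I_x, I_y) samples in a window."""
    if weights is None:
        weights = [1.0] * len(gradients)  # assumed uniform window
    a = sum(w * ix * ix for w, (ix, iy) in zip(weights, gradients))
    b = sum(w * ix * iy for w, (ix, iy) in zip(weights, gradients))
    c = sum(w * iy * iy for w, (ix, iy) in zip(weights, gradients))
    return a, b, c


def corner_responses(a, b, c, alpha=0.05, eps=1e-12):
    """Harris, Shi-Tomasi, and Noble responses for one structure tensor."""
    det = a * c - b * b
    trace = a + c
    lam_min = (trace - math.sqrt(max(trace * trace - 4.0 * det, 0.0))) / 2.0
    return {
        "harris": det - alpha * trace ** 2,
        "shi_tomasi": lam_min,
        "noble": det / (trace + eps),
    }


# Edge-like window: all gradients point the same way.
edge = corner_responses(*structure_tensor([(1.0, 0.0)] * 4))
# Corner-like window: gradients in two perpendicular directions.
corner = corner_responses(*structure_tensor([(1.0, 0.0), (0.0, 1.0)] * 2))
```

In the edge-like case one eigenvalue is zero, so the Harris response is negative and the other two are zero; in the corner-like case all three responses are positive.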

SIFT Descriptor Vector Dimensions

Formed by combining $4 \times 4 = 16$ spatial grid cells, each storing an 8-bin gradient orientation histogram.

\[16 \text{ cells} \times 8 \text{ bins} = 128 \text{ dimensions}\]

Slope-Intercept Representation

Each edge point $(x_i, y_i)$ maps to a line in $(m, b)$ Hough space: for every candidate slope $m$, this gives the matching intercept $b$.

\[b = -x_i \cdot m + y_i\]

Polar Line Representation

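Normal (polar) form of a line, where $\rho$ is the perpendicular distance from the origin and $\theta$ is the angle of the normal. Unlike the slope-intercept form, its parameters stay bounded for vertical lines.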

\[\rho = x\cos\theta + y\sin\theta\]

Accumulator Bin Indexing

Maps continuous line parameters into discrete parameter bins.

\[\rho_i = \text{round}\!\left(\dfrac{\rho}{\Delta\rho}\right), \qquad \theta_j = \text{round}\!\left(\dfrac{\theta}{\Delta\theta}\right)\]

Accumulator Increment

Increments voting array cell $(i, j)$ in parameter space.

\[H(\rho_i, \theta_j) = H(\rho_i, \theta_j) + 1\]
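
Putting the polar form, bin indexing, and accumulator increment together gives the basic voting loop. This is a minimal sketch under stated assumptions: $\theta$ is sampled over $[0^\circ, 180^\circ)$, the accumulator is a `collections.Counter` (so negative $\rho$ indices need no offset), and the function and parameter names are hypothetical.

```python
import math
from collections import Counter


def hough_lines(points, d_rho=1.0, d_theta_deg=1.0):
    """Vote in (rho, theta) space; returns a Counter keyed by (rho_i, theta_j)."""
    acc = Counter()
    n_theta = int(round(180.0 / d_theta_deg))
    for x, y in points:
        for j in range(n_theta):
            theta = math.radians(j * d_theta_deg)   # theta_j = j * delta_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            i = int(round(rho / d_rho))             # rho_i = round(rho / delta_rho)
            acc[(i, j)] += 1                        # H(rho_i, theta_j) += 1
    return acc


# Ten collinear points on the vertical line x = 5.
votes = hough_lines([(5, y) for y in range(10)])
best_bin, best_count = votes.most_common(1)[0]
```

Every point on the vertical line $x = 5$ votes for the bin with $\theta = 0$ and $\rho = 5$, so that bin collects all ten votes.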

Circle Representation

Implicit equation of a circle with center $(a, b)$ and radius $r$; every boundary point $(x, y)$ satisfies it.

\[(x-a)^2 + (y-b)^2 = r^2\]

Gradient-Guided Circle Centers

Restricts voting to candidate centers along the gradient direction $\theta_g$ at each edge point, instead of voting over every possible center, to speed up computation.

\[a = x \pm r\cos\theta_g, \qquad b = y \pm r\sin\theta_g\]
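
A short sketch of the candidate-center computation for one edge point and one radius. The function name is hypothetical, and $\theta_g$ is assumed to be given in radians.

```python
import math


def circle_center_candidates(x, y, theta_g, r):
    """Two candidate centers at distance r from (x, y) along the gradient direction."""
    dx = r * math.cos(theta_g)
    dy = r * math.sin(theta_g)
    return (x + dx, y + dy), (x - dx, y - dy)
```

Both signs are kept because the gradient may point toward or away from the center, depending on whether the circle is brighter or darker than its background. Each candidate satisfies $(x-a)^2 + (y-b)^2 = r^2$ by construction.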

K-Means Objective (SSD)

Objective function: the total squared distance between each point $p_j$ and the centroid $c_i$ of its assigned cluster. K-means seeks to minimize it.

\[\text{SSD} = \sum_{\text{clusters } i} \ \sum_{p_j \in c_i} \lVert p_j - c_i \rVert^2\]

K-Means Point Assignment

Assigns a point $p$ to the nearest cluster centroid $c_i$.

\[\lVert p - c_i \rVert < \lVert p - c_j \rVert \quad \text{for all } j \neq i\]

K-Means Cluster Center Update

Recomputes each centroid as the mean of the $N_i$ points currently assigned to cluster $i$.

\[c_i = \dfrac{1}{N_i} \sum_{p_j \in c_i} p_j\]
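
The three K-means formulas correspond to one iteration of the usual assign-then-update loop. The sketch below uses tuples as points and makes two assumptions not stated in the formulas: ties in the assignment go to the lowest cluster index, and an empty cluster yields `None`. All function names are hypothetical.

```python
def squared_distance(p, c):
    return sum((pi - ci) ** 2 for pi, ci in zip(p, c))


def assign(points, centroids):
    """Index of the nearest centroid per point (ties go to the lowest index)."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def update(points, labels, k):
    """Mean of the assigned points per cluster; None for an empty cluster."""
    centroids = []
    for i in range(k):
        members = [p for p, lab in zip(points, labels) if lab == i]
        if not members:
            centroids.append(None)
            continue
        centroids.append(tuple(sum(coord) / len(members) for coord in zip(*members)))
    return centroids


def ssd(points, labels, centroids):
    """K-means objective for the current assignment."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels = assign(points, [(1, 1), (9, 1)])
centroids = update(points, labels, k=2)
objective = ssd(points, labels, centroids)
```

Repeating `assign` and `update` until the labels stop changing gives the standard algorithm; the objective never increases from one iteration to the next.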

Term Frequency (TF)

Measures the proportional occurrence frequency of visual word $v_i$ inside an image.

\[\text{TF}(v_i) = \dfrac{\text{count}(v_i \text{ in image})}{\text{total visual words}}\]

Inverse Document Frequency (IDF)

Measures the rarity of visual word $v_i$ across a dataset of $N$ images, where $df_i$ is the number of images that contain $v_i$.

\[\text{IDF}(v_i) = \log\!\left(\dfrac{N}{df_i}\right)\]

TF-IDF Weighting

Combines word frequency and document rarity to weight features.

\[\text{TF-IDF}(v_i) = \text{TF}(v_i) \times \text{IDF}(v_i)\]
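
A compact sketch of the TF, IDF, and TF-IDF formulas for a bag-of-visual-words dataset. Each image is assumed to be a list of visual-word ids, and the natural logarithm is assumed for IDF since the base is not specified above; the function name is hypothetical.

```python
import math
from collections import Counter


def tf_idf(images):
    """images: list of lists of visual-word ids. Returns one {word: weight} dict per image."""
    n = len(images)
    # df[w] = number of images that contain word w at least once.
    df = Counter(word for img in images for word in set(img))
    weights = []
    for img in images:
        counts = Counter(img)
        total = len(img)
        weights.append(
            {w: (c / total) * math.log(n / df[w]) for w, c in counts.items()}
        )
    return weights


weights = tf_idf([[0, 0, 1], [0, 2]])
```

Word `0` appears in every image, so its IDF is $\log(1) = 0$ and it receives zero weight regardless of how often it occurs.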

SVM Decision Function

Predicts the class label from the sign of a linear score on the feature map $\varphi(x)$, with weight vector $w$ and bias $b$.

\[f(x) = \text{sign}\big(w^T \varphi(x) + b\big)\]
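
The decision function is a dot product followed by a sign. In this sketch $\varphi$ defaults to the identity (a linear SVM), which is an assumption for illustration, and a score of exactly zero is mapped to $+1$ by convention. The function name is hypothetical.

```python
def svm_predict(w, b, x, phi=lambda v: v):
    """Return +1 or -1 from sign(w^T phi(x) + b)."""
    score = sum(wi * fi for wi, fi in zip(w, phi(x))) + b
    return 1 if score >= 0 else -1  # score == 0 mapped to +1 in this sketch


print(svm_predict([1.0, -1.0], -0.5, [2.0, 0.0]))
print(svm_predict([1.0, -1.0], -0.5, [0.0, 2.0]))
```

Passing a different `phi`, for example one that squares each coordinate, shows how a nonlinear feature map changes the decision boundary without changing the formula.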

CNN Output Spatial Size (without padding)

Computes output dimension for input size $N$, filter size $F$, and stride $S$.

\[\text{Output size} = \left\lfloor \dfrac{N-F}{S} \right\rfloor + 1\]
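
The output-size formula maps directly onto integer floor division. The function name is hypothetical, and the check that the filter fits inside the input is an added assumption.

```python
def conv_output_size(n, f, s):
    """Spatial output size for input n, filter f, stride s, with no padding."""
    if f > n or s < 1:
        raise ValueError("filter must fit in the input and stride must be >= 1")
    return (n - f) // s + 1


print(conv_output_size(32, 5, 1))
print(conv_output_size(7, 3, 2))
```

For example, a $32 \times 32$ input with a $5 \times 5$ filter and stride 1 gives $(32 - 5)/1 + 1 = 28$, and a $7 \times 7$ input with a $3 \times 3$ filter and stride 2 gives $\lfloor 4/2 \rfloor + 1 = 3$.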

Sigmoid Activation

Output lies in $(0, 1)$, so it can be interpreted as a probability.

\[\sigma(x) = \dfrac{1}{1 + e^{-x}}\]

tanh Activation

Output lies in $(-1, 1)$ and is centered at zero.

\[\tanh(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}}\]

ReLU Activation

Keeps positive values and zeroes out negative ones.

\[\max(0, x)\]

Leaky ReLU Activation

Keeps a small slope of $0.1$ for negative inputs, so the gradient stays nonzero when the unit is inactive.

\[\max(0.1x, x)\]

Maxout Activation

Evaluates maximum over two parallel linear transformations.

\[\max(w_1^T x + b_1,\ w_2^T x + b_2)\]

ELU Activation

Smooth curve for negative inputs based on hyperparameter $\alpha$.

\[\begin{cases} x & \text{if } x \ge 0 \\ \alpha(e^x - 1) & \text{otherwise} \end{cases}\]
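
For quick numerical checks, the scalar activations in this part of the sheet can be written in a few lines of Python. This is only a sketch: the ELU default $\alpha = 1.0$ is an assumption, the leaky slope of $0.1$ follows the formula above, and this naive `sigmoid` can overflow for very large negative inputs.

```python
import math


def sigmoid(x):
    # Naive form; math.exp(-x) overflows for very large negative x.
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    return max(0.0, x)


def leaky_relu(x, slope=0.1):
    return max(slope * x, x)


def elu(x, alpha=1.0):  # alpha = 1.0 is an assumed default
    return x if x >= 0 else alpha * (math.exp(x) - 1.0)


def maxout(x, w1, b1, w2, b2):
    """Maximum of two linear functions of the input vector x."""
    z1 = sum(w * v for w, v in zip(w1, x)) + b1
    z2 = sum(w * v for w, v in zip(w2, x)) + b2
    return max(z1, z2)


for name, fn in [("sigmoid", sigmoid), ("tanh", math.tanh), ("relu", relu),
                 ("leaky_relu", leaky_relu), ("elu", elu)]:
    print(name, fn(-2.0), fn(0.0), fn(2.0))
```

Evaluating each function at a negative, zero, and positive input makes the differences in how they treat negative values easy to see.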

Categorical Cross-Entropy Loss

Computes the negative log of the predicted probability $P_{true}$ assigned to the true target class.

\[L = -\log(P_{true})\]
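
As a small worked example, the loss only looks at the probability assigned to the true class. The sketch assumes `probs` is already a valid probability vector and uses the natural logarithm; the function name is hypothetical.

```python
import math


def cross_entropy(probs, true_index):
    """L = -log(P_true) for one sample, given predicted class probabilities."""
    return -math.log(probs[true_index])


print(cross_entropy([0.25, 0.5, 0.25], 1))  # -log(0.5) = log(2)
print(cross_entropy([0.9, 0.05, 0.05], 1))  # a confident wrong prediction costs more
```

The loss is zero when the true class gets probability 1 and grows without bound as that probability approaches 0.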

Accuracy

Overall fraction of correct predictions.

\[\text{Accuracy} = \dfrac{TP+TN}{TP+TN+FP+FN}\]

Precision

Measures prediction purity (what fraction of predicted positives are actually positive).

\[\text{Precision} = \dfrac{TP}{TP+FP}\]

Recall

Measures detection completeness (what fraction of actual positives are found).

\[\text{Recall} = \dfrac{TP}{TP+FN}\]

F1-Score

Harmonic mean balancing precision and recall.

\[F_1 = \dfrac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision}+\text{Recall}} = \dfrac{2TP}{2TP+FP+FN}\]
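
Accuracy, precision, recall, and F1 all follow from the four confusion-matrix counts. In this sketch a zero denominator returns `0.0`, which is an assumed convention; the function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    def ratio(num, den):
        return num / den if den else 0.0  # assumed convention for empty denominators

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


metrics = classification_metrics(tp=8, tn=5, fp=2, fn=5)
```

With these counts, accuracy is $13/20$, precision is $8/10$, recall is $8/13$, and F1 is $16/23$, which matches the harmonic-mean form $2PR/(P+R)$.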

Intersection over Union (IoU)

Measures bounding box or pixel mask overlap.

\[\text{IoU} = \dfrac{\text{Intersection}}{\text{Union}}\]
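
For axis-aligned bounding boxes, the intersection and union reduce to a few min/max operations. The box format `(x1, y1, x2, y2)` with `x1 <= x2` and `y1 <= y2` is an assumption of this sketch, as are the function names.

```python
def box_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```

Here the boxes overlap in a $1 \times 1$ square, so the intersection is $1$, the union is $4 + 4 - 1 = 7$, and the IoU is $1/7$.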

mean Average Precision (mAP)

Mean over all $n$ classes of the Average Precision $AP_k$, the area under the precision-recall curve for class $k$.

\[\text{mAP} = \dfrac{AP_1+AP_2+\dots+AP_n}{n}\]