This is a reference of terms I thought required explanation when I used them in a post or ones I wanted to list for myself.


ABI

Application Binary Interface. A standard for binary portability across computers, consisting of a subset of the instruction set and the interfaces the operating system provides for things like I/O and memory allocation.


Assembler

A program that translates a symbolic version of instructions (assembly language) into the binary version (machine language).

Assembly Language

First layer of abstraction on top of machine language where one instruction in assembly (add x5, x6, x6) is directly related to a bit sequence in machine language.

Common Time

\(4/4\) time-signature, also denoted \(\mathcal{C}\).

Cost Function
Loss Function

A mapping from an outcome to a real number that signifies the loss or cost of that outcome. The outcome variable may be an event like dropping a hot cup of tea, in which case the cost is some numerical value representing how terrible that is. More commonly the outcome variable is a vector representing, for example, the probability mass a function (like a neural network) assigned to an input.
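
As a rough illustration, here are two common cost functions sketched in Python. The function names and the example probability vectors are my own assumptions, chosen only to show how an outcome is mapped to a single real number.

```python
import math

def squared_error(prediction, target):
    """Cost of a real-valued outcome: zero for an exact hit, growing with the miss."""
    return (prediction - target) ** 2

def cross_entropy(probabilities, true_index):
    """Cost of a probability vector: small when the true class received most of the mass."""
    return -math.log(probabilities[true_index])

# Hypothetical outputs of a classifier for an input whose true class is index 1.
confident = cross_entropy([0.05, 0.90, 0.05], true_index=1)
unsure = cross_entropy([0.40, 0.30, 0.30], true_index=1)
```

The confident prediction gets the lower cost because more probability mass landed on the correct class.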


DRAM

Dynamic Random Access Memory. Main memory that holds programs and data while they are in use. DRAM is volatile and access times are around 50 nanoseconds.

Enharmonic Equivalence

Two notes, intervals, scales or chords are enharmonic equivalents if they have different names but refer to the exact same pitches. Imagine a building. The ceiling of the first floor and the floor of the second floor are the same thing, but the naming is different depending on your point of view.

Flash Memory

Nonvolatile memory. Faster but more expensive than magnetic disks, slower but cheaper than DRAM. Access times are 5 to 50 microseconds.

Hessian Matrix

A square matrix of the second-order partial derivatives of a function. On the diagonal are the second partial derivatives in a single direction, and the other spots are taken up by the mixed partial derivatives. Example in 2D:

\[\mathbf{H}f(x,y) = \begin{bmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y}\\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{bmatrix}\]
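
A small worked case, using a function I picked purely for illustration, \(f(x,y) = x^2 y + y^3\). Its first derivatives are \(\frac{\partial f}{\partial x} = 2xy\) and \(\frac{\partial f}{\partial y} = x^2 + 3y^2\), so differentiating once more gives:

\[\mathbf{H}f(x,y) = \begin{bmatrix} 2y & 2x\\ 2x & 6y \end{bmatrix}\]

The two mixed partial derivatives agree here, which makes the matrix symmetric.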

Instruction Set Architecture

Abstract interface between the hardware and the lowest level software. It essentially defines the Machine Language. Examples are ARM and RISC-V.

Iverson Bracket

Notation that converts logical propositions inside the brackets to a \(1\) if true and \(0\) if false. One application is to mathematically include or exclude elements of vectors or sets in a summation or product:

\[\mathbf{x} = \begin{bmatrix}1&3&7&9\end{bmatrix}\\ \sum_{i=1}^{4} \left[x_i > 5 \right] = 2\]
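
The same idea can be sketched in Python. The helper name `iverson` is my own, not a standard library function; the vector is the one from the example above.

```python
def iverson(proposition):
    """Return 1 if the proposition is true and 0 otherwise."""
    return 1 if proposition else 0

x = [1, 3, 7, 9]

# Count the elements greater than 5, as in the summation above.
count_above_five = sum(iverson(x_i > 5) for x_i in x)

# Multiplying by the bracket keeps or drops each term of a sum.
sum_above_five = sum(x_i * iverson(x_i > 5) for x_i in x)
```

Here `count_above_five` is 2 and `sum_above_five` is 16, since only 7 and 9 pass the test.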

Jacobian Matrix

Matrix of first-order partial derivatives of a vector-valued function. If \(\mathbf{f}\) is a function that maps some vector \(\mathbf{x}\) in \(\mathbb{R}^n\) to \(\mathbf{f}(\mathbf{x})\) in \(\mathbb{R}^m\), its Jacobian is the \(m \times n\) matrix:

\[\mathbf{J} \mathbf{f} = \begin{bmatrix} \frac{\partial \mathbf{f}}{\partial x_1} & \dots & \frac{\partial \mathbf{f}}{\partial x_n} \end{bmatrix} = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \dots & \frac{\partial f_1}{\partial x_n}\\ \vdots & \ddots & \vdots\\ \frac{\partial f_m}{\partial x_1} & \dots & \frac{\partial f_m}{\partial x_n} \end{bmatrix}\]

LAN

Local Area Network. A network that carries data between devices located in a geographically confined area, like a building or complex.

Machine Language

Language of bit sequences that are the lowest level computer instructions.

Magnetic Disk

Nonvolatile memory on magnetized rotating disks. Cheaper than Flash Memory but also much slower, with access times of around 10 milliseconds.

Observed Variable
Unobserved Variable

A factor that is a part of a statistical relationship like a correlation or causation, and is (not) recorded in the data at hand.

Operator Overloading

Having an operator do different things depending on the type of the arguments. For example, we are familiar with + adding numbers, but the operator is often extended to support adding images, dates and other datatypes.
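
A minimal Python sketch of the idea. The `Vec2` class is hypothetical and exists only for this example; `__add__` is the hook Python calls when it evaluates `+`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vec2:
    """A hypothetical 2D vector type used only for illustration."""
    x: float
    y: float

    def __add__(self, other):
        # Python calls this method for `self + other`.
        if not isinstance(other, Vec2):
            return NotImplemented  # let Python try other options or raise TypeError
        return Vec2(self.x + other.x, self.y + other.y)

# The same operator now does three different things depending on the types.
number_sum = 1 + 2
text_sum = "ab" + "cd"
vector_sum = Vec2(1, 2) + Vec2(3, 4)
```

Returning `NotImplemented` for unsupported types keeps `Vec2(1, 2) + 5` an error instead of silently doing something surprising.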


RANSAC

Random Sample Consensus. Iterative method of fitting a model to data that contains outliers.

  1. Draw \(s\) samples from the data.

  2. Fit the model to these samples.

  3. Check how many points from the full dataset fall within an acceptable range \(d\) around the model - these are inliers.

  4. Repeat steps 1 to 3 for \(N\) iterations.

  5. Choose the model with the most inliers and refit it to all inliers.
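
The steps above can be sketched in plain Python for the simple case of fitting a line \(y = ax + b\). This is a sketch under my own assumptions: the model is a least-squares line, the distance to the model is the vertical residual, and the function names, default parameters and toy data are all made up for the example.

```python
import random

def fit_line(points):
    """Least-squares fit of y = a*x + b; returns None for a degenerate sample."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0:
        return None  # all x equal, no line of the form y = a*x + b
    a = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    return a, mean_y - a * mean_x

def ransac_line(data, s=2, d=0.5, N=100, seed=0):
    rng = random.Random(seed)  # seeded so the run is repeatable
    best_inliers = []
    for _ in range(N):                      # step 4: repeat N times
        sample = rng.sample(data, s)        # step 1: draw s samples
        model = fit_line(sample)            # step 2: fit to the samples
        if model is None:
            continue
        a, b = model
        # step 3: points within distance d of the model are inliers
        inliers = [(x, y) for x, y in data if abs(y - (a * x + b)) <= d]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if not best_inliers:
        return None, []
    return fit_line(best_inliers), best_inliers  # step 5: refit to all inliers

# Toy data: ten points on y = 2x + 1 plus three gross outliers.
data = [(x, 2 * x + 1) for x in range(10)] + [(2, 30), (5, -20), (8, 40)]
model, inliers = ransac_line(data)
```

A plain least-squares fit on all thirteen points would be dragged toward the outliers, while the refit on the consensus set recovers a slope close to 2 and an intercept close to 1.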

Signed Distance

The distance of a point to some surface, with the sign signifying on which side of the surface the point is located.
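
Two small Python sketches of this, for a plane and for a sphere. The function names and the sign conventions (positive on the side the normal points to, positive outside the sphere) are assumptions of the example; other conventions flip the sign.

```python
import math

def signed_distance_to_plane(point, normal, offset):
    """Signed distance to the plane {p : normal . p + offset = 0}.

    Positive on the side the normal points to, negative on the other side.
    """
    norm = math.sqrt(sum(n * n for n in normal))
    return (sum(n * p for n, p in zip(normal, point)) + offset) / norm

def signed_distance_to_sphere(point, center, radius):
    """Negative inside the sphere, zero on its surface, positive outside."""
    return math.dist(point, center) - radius

above = signed_distance_to_plane((1, 1, 3), normal=(0, 0, 2), offset=0)
below = signed_distance_to_plane((0, 0, -2), normal=(0, 0, 2), offset=0)
inside = signed_distance_to_sphere((0, 0, 0), center=(0, 0, 0), radius=1)
```

Dividing by the length of the normal means the result is a true distance even when the normal is not a unit vector.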


SRAM

Static Random Access Memory. Faster but less dense than DRAM. Used to cache instructions closer to the CPU. SRAM is volatile and access times are as low as 10 nanoseconds.


TOML

Tom’s Obvious, Minimal Language. A readable configuration file format. Example:

title = "TOML Example"

name = "Stefan Wijnja"
website = ""

server = ""
ports = [ 8001, 8001, 8002 ]

Trace

The sum of the components on the main diagonal of a square matrix. The trace is invariant under cyclic permutations of a product, so for three matrices \(A, B, C\) whose products are defined: \(\mathrm{Tr}(ABC) = \mathrm{Tr}(BCA) = \mathrm{Tr}(CAB)\)
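
A quick numerical check of this property in plain Python. The helper functions and the three example matrices are my own choices; any square matrices of matching size would do.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(M):
    """Sum of the entries on the main diagonal."""
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

tr_abc = trace(matmul(matmul(A, B), C))
tr_bca = trace(matmul(matmul(B, C), A))
tr_cab = trace(matmul(matmul(C, A), B))
tr_acb = trace(matmul(matmul(A, C), B))  # not a cyclic shift of ABC
```

The three cyclic orderings all give 14, while the swapped ordering `ACB` gives 16, so the property holds for cyclic shifts but not for arbitrary reorderings.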

Unit Vector

A vector with a norm of \(1\), i.e.: \(\sqrt{x \cdot x} = 1\).
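
Any nonzero vector can be turned into a unit vector by dividing it by its norm. A minimal Python sketch, with function names of my own choosing:

```python
import math

def norm(x):
    """Euclidean norm, the square root of x . x."""
    return math.sqrt(sum(x_i * x_i for x_i in x))

def normalize(x):
    """Scale x to unit length while keeping its direction."""
    length = norm(x)
    if length == 0:
        raise ValueError("the zero vector has no direction to normalize")
    return [x_i / length for x_i in x]

unit = normalize([3.0, 4.0])
```

The input has norm 5, so the result is approximately `[0.6, 0.8]`, up to floating-point rounding.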


Volatile Memory

The contents of volatile memory are lost when it loses power.


WAN

Wide Area Network. A network that carries data between devices spread out over hundreds of kilometers, potentially spanning a continent.