Skip to content
  • Overflow
  • Integer arithmetic representation is limited to the number of bits and might cause overflow. But it is always consistent.
  • Floating-point arithmetic is not associative, thus finite precision.

2.1 Information Storage

  • Most computers use blocks of eight bits, or bytes, as the smallest addressable unit of memory.
  • A machinelevel program views memory as a very large array of bytes, referred to as virtual memory. Every byte of memory is identified by a unique number, known as its address, and the set of all possible addresses is known as the virtual address space.

Hexadecimal Notation

  • begins with 0x
  • When $x = 2^n$ for some nonnegative integer n, we can readily write x in hexadecimal form by remembering that the binary representation of x is simply 1 followed by n zeros.

Words

  • Every computer has a word size, indicating the nominal size of integer and pointer data.
  • For a machine with a w-bit word size, the virtual addresses can range from 0 to $2^{w − 1}$, giving the program access to at most $2^w$ bytes.
    • Most personal computers today have a 32-bit word size. This limits the virtual address space to 4 GB.

Data Sizes

![[Screen Shot 2024-05-18 at 17.13.51.png]]

Addressing and Byte Ordering

  • For program objects that span multiple bytes: In virtually all machines, a multi-byte object is stored as a contiguous sequence of bytes, with the address of the object given by the smallest address of the bytes used.
    • E.g. suppose a variable x of type int has address 0x100, that is, the value of the address expression &x is 0x100. Then the 4 bytes of x would be stored in memory locations 0x100, 0x101, 0x102, and 0x103.
  • For ordering the bytes representing an object:
    • Little endian: the least significant byte comes first (first = smallest address)
    • Big endian: the most significant byte comes first
    • E.g. suppose the variable x of type int and at address 0x100 has a hexadecimal value of 0x01234567![[Screen Shot 2024-05-18 at 17.31.59.png]]

Representing Strings

  • A string in C is encoded by an array of characters terminated by the null character \0.
    • the terminating byte has the hex representation 0x00

Representing Code

  • Different machine types use different and incompatible instructions and encodings, thus not binary compatible.
  • A fundamental concept of computer systems is that a program, from the perspective of the machine, is simply a sequence of bytes.

Bit-level Operations in C

  • AND: &
  • OR: |
  • XOR: ^
  • NOT: ~
  • One common use of bit-level operations is to implement masking operations

Logical Operations in C

  • AND: &&
  • OR: ||
  • NOT: !

Shift Operations in C

  • x << y Left Shift: Fill 0 on the right.
  • x >> yRight Shift (Logical): Fill 0 on the left.
  • x >> yRight Shift (Arithmetic): Fill the most significant bit on the left.

signed data: almost all use arithmetic right shifts unsigned data: must use logical rights shifts

2.2 Integer Representations

Integral Data Types

![[Screen Shot 2024-05-18 at 18.37.01.png]] - Both C and C++ support signed (the default) and unsigned numbers. Java supports only signed numbers.

Unsigned Encodings

Binary to unsigned, length w: $$B2U_w(\vec{x}) \mathrel{\dot{=}} \sum_{i=0}^{w-1} x_i 2^i$$![[Screen Shot 2024-05-18 at 18.43.43.png]]

Two’s-Complement Encodings

Binary to two's-complement, length w: $$B2T_w(\vec{x}) \mathrel{\dot{=}} -x_{w-1} 2^{w-1} + \sum_{i=0}^{w-2} x_i 2^i$$![[Screen Shot 2024-05-18 at 18.47.35.png]] ![[Screen Shot 2024-05-18 at 18.48.44.png]] - <limits.h> defines a set of constants delimiting the ranges of the different integer data types for the particular machine on which the compiler is running. (INT_MAX, INT_ MIN, and UINT_MAX) - <stdint.h> defines a set of data types with declarations of the form intN_t and uintN_t, specifying N-bit signed and unsigned integers

Conversions Between Signed and Unsigned

Two's complement to unsigned: $$T2U_w(x) = \begin{cases} x + 2^w, & x < 0 \ x, & x \geq 0 \end{cases}$$ Unsigned to two's complement: $$U2T_w(u) = \begin{cases} u, & u < 2^{w-1} \ u - 2^w, & u \geq 2^{w-1} \end{cases}$$

Expanding the Bit Representation of a Number

  • To convert an unsigned number to a larger data type, dadd leading zeros to the representation - zero extension
  • To convert a two's complement number to a larger data type, add copies of the most significant bit to the representation - sign extension

Truncating Numbers

The effect of truncation for unsigned numbers is: $$B2U_k([x_{k-1}, x_{k-2}, \ldots, x_0]) = B2U_w([x_{w-1}, x_{w-2}, \ldots, x_0]) \mod 2^k$$ The effect for two’s-complement numbers is: $$B2T_k([x_{k-1}, x_{k-2}, \ldots, x_0]) = U2T_k(B2U_w([x_{w-1}, x_{w-2}, \ldots, x_0]) \mod 2^k)$$

2.3 Integer Arithmetic

Unsigned Addition

Unsigned Addition with length $w$, such that $0 \leq x, y < 2^w$: $$x \mathbin{{}^{u}_{w}\mathop{+}} y = \begin{cases} x + y, & x + y < 2^w \ x + y - 2^w, & 2^w \le x + y < 2^{w+1} \end{cases}$$ Test overflow: $\text{sum} < x$

Two's-Complement Addition

Two's-Complement Addition applied to x and y in range $-2^{w-1} \le x, y \le 2^{w-1} - 1$: $$x \mathbin{{}^{t}_{w}\mathop{+}} y = \begin{cases} x + y - 2^w, & 2^{w-1} \le x + y & \text{Positive overflow} \ x + y, & -2^{w-1} \le x + y < 2^{w-1} & \text{Normal} \ x + y + 2^w, & x + y < -2^{w-1} & \text{Negative overflow} \end{cases} $$

Two's-Complement Negation

Two's-Complement negation for x in the range $-2^{w-1} \leq x \le 2^{w-1} - 1$: $$- \mathbin{{}^{t}_{w}\mathop{}} x = \begin{cases} -2^{w-1}, & x = -2^{w-1} \ -x, & x > -2^{w-1} \end{cases}$$

Unsigned Multiplication

w-bit unsigned multiplication: $$x \mathbin{{}^{u}_{w}\mathop{*}} y = (x \cdot y) \bmod 2^w$$

Two’s-Complement Multiplication

w-bit two's-complement multiplication: $$x \mathbin{{}^{t}_{w}\mathop{*}} y = U2T_w((x \cdot y) \bmod 2^w)$$

Multiplying by Constants

  • integer multiply instruction is fairly slow, requiring 10 or more clock cycles, whereas other integer operations—such as addition, subtraction, bit-level operations, and shifting—require only 1 clock cycle.
  • optimization used by compilers: replace multiplications by constant factors with combinations of shift and addition operations.
  • 左移 k 位等于乘以 $2^k$, 即 k zeros have been added to the right
    • 如 x * 14 = (x<<3)+(x<<2)+(x<<1) = (x<<4)-(x<<2)
    • 判断如何移动:14 的位级表示为 1110,所以分别左移 3,2,1

Dividing by Powers of Two

  • integer division is slower than integer multiplication
  • optimization: replace divisions using right shift
  • integer division always rounds to zero
  • x >> k gives $\left\lfloor \frac{x}{2^k} \right\rfloor$ (round down)
  • round to zero:
    (x<0 ? x+(1<<k)-1 : x) >> k;