Integer, ASCII and Unicode

This lecture covers programming on computing machines. Even mechanical devices allow for calculations:

Figure 46. Manual calculation: Abacus

Figure 47. Mechanical calculation: Cash register

Figure 48. Electromechanical calculation: Zuse Z3

Figure 49. Vacuum tube: ENIAC

All machines described so far are based on non-semiconductor technologies. The invention of the transistor in the late 1940s eventually gave rise to the rapid development of microprocessor chips:

Figure 50. Transistor: Microprocessor ICs

These sample devices differ heavily with respect to addressable memory, data size, supported arithmetic operations, speed and other features. We take a closer look at Zilog's Z80 processor:

Figure 51. Z80 8-bit data bus

Following technological advances, processors have been categorized by the width of their so-called address and data buses:

Figure 52. Progress in hardware 1

Processor        Year   Address / data bus (bits)   Transistors   Clock rate
Intel 4004       1971   12 / 4                            2,300   740 kHz
Zilog Z80        1976   16 / 8                            8,500   2.5 MHz
Motorola 68020   1984   32 / 32                         190,000   12.5 MHz

Figure 53. Progress in hardware 2

Processor              Year   Address / data bus (bits)       Transistors   Clock rate
Six-core Opteron       2009   64 / 64                         904,000,000   1.8 GHz
Core i7 Broadwell      2016   64 / 64                       3,200,000,000   3.6 GHz
Apple's ARM M1 Ultra   2022   64 / 64                     114,000,000,000   3.2 GHz

Figure 54. Simple facts

There are only 10 types of people in the world:

Those who understand binary and those who don't.


We remind the reader of the binary representation of integer values. Details will be discussed in your math lectures. Our first example features three-bit unsigned integer values:

Figure 55. Unsigned 3 bit integer representation

Figure 56. Binary system addition

Within limits: OK            Caution: Overflow!

   010       2                      100       4
  +011      +3                     +101      +5
  ----     ---                   ------     ---
   101       5    discarded ━━━▶ (1)001       1
                  by 3 bit
                  representation
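
The two additions in Figure 56 can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: the names `BITS`, `MASK` and `add_unsigned` are chosen here for illustration. Masking with `0b111` plays the role of the 3 bit representation discarding the carry.

```python
BITS = 3                  # width of the representation, as in Figure 56
MASK = (1 << BITS) - 1    # 0b111: keeps only the lowest three bits


def add_unsigned(a: int, b: int) -> tuple[int, bool]:
    """Add two 3 bit unsigned values; return (stored result, overflow flag)."""
    full = a + b                      # exact sum, may need a fourth bit
    return full & MASK, full > MASK   # the fourth bit is discarded


for a, b in [(0b010, 0b011), (0b100, 0b101)]:
    result, overflow = add_unsigned(a, b)
    print(f"{a:03b} + {b:03b} = {result:03b}  "
          f"({a} + {b} -> {result}, overflow: {overflow})")
```

The first pair stays within limits, while the second one yields 1 instead of 9 because only three bits are kept.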

Figure 57. 3 bit two's complement representation

Figure 58. 3 bit two's complement rationale: Usual addition

Within limits: OK            Caution: Overflow!

   101      -3                      100      -4
  +010      +2                     +101      -3
  ----     ---                   ------     ---
   111      -1                   (1)001       1
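
The rationale of Figure 58 is that two's complement values can be added like unsigned ones. The sketch below assumes a 3 bit width and uses the illustrative helper names `to_signed` and `add_signed`, which do not appear in the original material. It adds the raw bit patterns, discards the carry and only then interprets the result as a signed value.

```python
BITS = 3  # width of the representation, as in Figure 58


def to_signed(pattern: int, bits: int = BITS) -> int:
    """Interpret a bit pattern as a two's complement value."""
    sign_bit = 1 << (bits - 1)
    # A set sign bit means the pattern stands for pattern - 2**bits.
    return pattern - (1 << bits) if pattern & sign_bit else pattern


def add_signed(a: int, b: int, bits: int = BITS) -> int:
    """Add two bit patterns the usual way, drop the carry, decode the result."""
    mask = (1 << bits) - 1
    return to_signed((a + b) & mask, bits)


print(add_signed(0b101, 0b010))  # -3 + 2, within limits
print(add_signed(0b100, 0b101))  # -4 + (-3), overflow
```

The second call returns 1 rather than -7, since -7 is outside the range -4 to 3 that three bits can represent.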

Signed byte values are represented accordingly:

Figure 59. Signed 8 bit integer binary representation
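
To check individual bit patterns of a signed byte, Python's built-in `int.from_bytes` with `signed=True` can be used. The helper name `byte_to_signed` below is an assumption made for this sketch.

```python
def byte_to_signed(pattern: int) -> int:
    """Interpret an 8 bit pattern (0..255) as a two's complement value."""
    return int.from_bytes(bytes([pattern]), byteorder="big", signed=True)


for pattern in (0b00000000, 0b01111111, 0b10000000, 0b11111111):
    print(f"{pattern:08b} -> {byte_to_signed(pattern)}")
```

The four patterns cover zero, the largest value 127, the smallest value -128 and -1.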

exercise No. 5

Hotel key cards

Q:

A hotel supplies the following type of cards for opening room doors:

A customer is worried about the impact of losing his card. For security reasons the corresponding pattern can never be issued again. Thus the hotel may eventually run short of available combinations.

Discuss this argument by estimating the number of distinct patterns.

Hint: Consider a key card's (likely?) grid of possible punch positions:

A:

No need to be worried: The 32 possible punch positions may be arranged in a linear fashion:

Since each position may either contain a hole or be solid we have 2^32 = 4,294,967,296 distinct possibilities. Thus a lot of key cards may get lost before a hotel manager needs to start worrying.
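
The estimate can be checked quickly. The loss rate of 1,000 cards per day is a purely hypothetical figure chosen for this sketch, not a value from the exercise.

```python
PUNCH_POSITIONS = 32                 # grid size taken from the solution above
patterns = 2 ** PUNCH_POSITIONS      # each position: hole or solid

LOST_CARDS_PER_DAY = 1_000           # hypothetical, deliberately pessimistic
years = patterns // (LOST_CARDS_PER_DAY * 365)

print(f"{patterns:,} distinct patterns")
print(f"exhausted after roughly {years:,} years")
```

Even at this exaggerated rate the supply of patterns lasts for more than eleven thousand years.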

Regarding language characters we start with one of the oldest and most widespread character encoding schemes:

Figure 60. 7-bit ASCII

ASCII is by design limited to US characters; it does not include accented or other language-specific characters. ASCII requires only seven bits. Since a byte consists of eight bits, the spare bit could be used as a parity bit for data integrity checks:

Figure 61. 7-bit ASCII with even parity bit
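
Even parity means the parity bit is chosen so that the total number of 1 bits in the byte is even. The sketch below assumes the parity bit occupies the most significant bit of the byte, which is a common convention but is not stated in the text. The function name `with_even_parity` is illustrative.

```python
def with_even_parity(ch: str) -> int:
    """Return the 8 bit value of a 7-bit ASCII character plus even parity bit."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
    parity = bin(code).count("1") % 2   # 1 if the count of 1 bits is odd
    return (parity << 7) | code         # assumption: parity bit is bit 7


for ch in "AC":
    print(f"{ch}: {ord(ch):07b} -> {with_even_parity(ch):08b}")
```

`A` (1000001) already has an even number of 1 bits and keeps a parity bit of 0, whereas `C` (1000011) gets a parity bit of 1.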

A byte's parity bit may instead be used for 8-bit encodings providing supplementary non-ASCII characters such as the ñ in Señor. One such example is the ISO 8859-1 (ISO Latin 1) standard covering Western European character sets:

Figure 62. Western European characters: ISO Latin 1 encoding
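
As a small illustration, Python's standard `latin-1` codec shows that every character of Señor fits into a single byte, while the ñ cannot be represented in 7-bit ASCII. The variable names are chosen for this sketch only.

```python
word = "Señor"

encoded = word.encode("latin-1")   # ISO 8859-1: one byte per character
print(len(encoded), "bytes:", encoded.hex(" "))

try:
    word.encode("ascii")           # 7-bit ASCII has no code for ñ
except UnicodeEncodeError:
    print("not representable in 7-bit ASCII")
```

The ñ is stored as the byte `f1`, a value above 127 that uses the eighth bit.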

Supporting additional languages comes at a price: We have to increase the number of bytes representing a single character:

Figure 63. Unicode UTF-8 samples

Notice the differing byte counts: UTF-8 Unicode encoding allows for one-, two-, three- and four-byte encodings. See Unicode and You for further details.
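
The varying byte counts can be observed directly. The four sample characters below are chosen for this sketch and are not necessarily the ones shown in Figure 63.

```python
# One sample per UTF-8 length: Latin letter, ñ, euro sign, emoji.
samples = ["A", "\u00f1", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {len(encoded)} byte(s): {encoded.hex(' ')}")
```

Characters from the ASCII range keep their single byte representation, so any pure ASCII text is also valid UTF-8.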