CIE IGCSE COMPUTER SCIENCE
Theory of computer science
Chapter 6 – Memory and data storage
6.1 Introduction
There are many file formats used to store data, be this text, images or sound, in computer systems.
All computer systems have primary memory and secondary memory storage. The main technologies used are
magnetic, optical and solid state.
6.2 File formats
A number of different file formats are used in computer systems, such as:
Musical Instrument Digital Interface (MIDI)
MP3
MP4
Jpeg
text and number format
6.2.1 Musical Instrument Digital Interface (MIDI)
MIDI is always associated with the storage of music files. However, MIDI files are not music and do not contain
any sounds; they are very different to, for example, MP3 files. MIDI is essentially a communications protocol
that allows electronic musical instruments to interact with each other. The MIDI protocol uses 8-bit serial
transmission with one start bit and one stop bit, and is therefore asynchronous.
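The framing described above can be sketched in a few lines of Python. This is only an illustration: the function name frame_byte is made up for these notes, and it assumes the usual serial convention of a start bit of 0, a stop bit of 1 and the data bits sent least significant bit first (the text above does not state the bit order).

```python
def frame_byte(value):
    """Return the 10 bits sent on the wire for one 8-bit value."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    # Assumption: data bits are sent least significant bit first.
    data_bits = [(value >> i) & 1 for i in range(8)]
    # Assumption: start bit is 0 and stop bit is 1.
    return [0] + data_bits + [1]


bits = frame_byte(0x90)
print(bits)       # [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]
print(len(bits))  # 10 bits are transmitted for every 8 bits of data
```

Because each byte carries its own start and stop bits, the receiver does not need a shared clock signal, which is why this kind of transmission is called asynchronous.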
A MIDI file consists of a list of commands that instruct a device (for example, an electronic organ, sound card in
a computer or in a mobile phone) how to produce a particular sound or musical note. Each MIDI command has
a specific sequence of bytes. The first byte is the status byte – this informs the MIDI device what function to
perform. Encoded in the status byte is the MIDI channel. MIDI operates on 16 different channels, which are
numbered 0 to 15.
Examples of MIDI commands include:
note on/off: this indicates that a key (on an electronic keyboard) has been pressed/ released to
produce/stop producing a musical note
key pressure: this indicates how hard the key has been pressed (this could indicate loudness of the
music note or whether any vibrato has been used, and so on).
Two additional bytes are required, a PITCH BYTE, which tells the MIDI device which note to play, and a
VELOCITY BYTE, which tells the device how loud to play the note. When music or sound is recorded on a
computer system, these MIDI messages are saved in a file which is recognized by the file extension .mid.
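A short Python sketch may help to show how the three bytes of a note command fit together. It is not taken from the syllabus: the function names are invented for illustration, and the status values 0x90 (note on) and 0x80 (note off) and the 0 to 127 range for pitch and velocity come from the MIDI 1.0 specification rather than from the text above.

```python
NOTE_ON = 0x90   # assumed from the MIDI 1.0 specification
NOTE_OFF = 0x80  # assumed from the MIDI 1.0 specification


def midi_message(command, channel, pitch, velocity):
    """Build a 3-byte message: status byte, pitch byte, velocity byte."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0 to 15")
    if not (0 <= pitch <= 127 and 0 <= velocity <= 127):
        raise ValueError("pitch and velocity must be 0 to 127")
    status = command | channel  # the channel sits in the low 4 bits
    return bytes([status, pitch, velocity])


def channel_of(message):
    """Recover the channel number encoded in the status byte."""
    return message[0] & 0x0F


message = midi_message(NOTE_ON, channel=2, pitch=60, velocity=100)
print(message.hex())        # 923c64
print(channel_of(message))  # 2
```

Four bits are enough to hold the channel number, which is why there are exactly 16 channels, numbered 0 to 15.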
If this .mid file is played back through a musical instrument, such as an electronic keyboard, the music will be
played back in an identical way to the original. The whole piece of music will have been stored as a series of
commands but no actual musical notes. This makes it a very versatile file structure, since the same file could
be fed back through a different electronic instrument, such as an electric guitar, with different effects to the
original. However, playing the file back through an instrument such as a guitar would need SEQUENCER
SOFTWARE, since the MIDI files would not be recognized in their ‘raw’ form.
Both the electronic instruments and the computer need a MIDI interface to allow them to ‘talk’ to each other. It
was mentioned earlier that the MIDI operates on 16 channels. In fact, the computer can send data out on all 16
MIDI channels at the same time. For example, 16 MIDI devices, each set up for a different MIDI channel, could
be connected to the computer. Each device could be playing a separate line in a song from the sequencer
software, effectively creating an electronic orchestra. This implementation is being used more and more today
in the recording studio, by major orchestras and in musical scores used in films.
Because MIDI files do not contain any audio tracks, their size, compared with an MP3 file, is considerably
smaller. For example, music that occupies a 10-megabyte MP3 file requires only about 10 kilobytes in the MIDI
format. This makes them ideal for devices where memory is an issue; for example, storing ring tones on a
mobile phone.
6.2.2 MPEG-3 (MP3) and MPEG-4 (MP4)
MPEG-3 (MP3) uses technology known as AUDIO COMPRESSION to convert music and other sounds into an
MP3 file format. Essentially, this compression technology will reduce the size of a normal music file by about 90%.
For example, 80 megabytes of music on a CD can be reduced to about 8 megabytes using MP3 technology.
MP3 files are used in MP3 players, computers or mobile phones. Files can be downloaded from the internet, or
CDs can be converted to MP3 format. The CD files are converted using FILE COMPRESSION software. Whilst
the music quality can never match the ‘full’ version found on a CD, the quality is satisfactory for most general
purposes.
But how can the original music file be reduced by 90% whilst still retaining most of the music quality? This is
done by using file compression algorithms which use PERCEPTUAL MUSIC SHAPING; this essentially
removes sounds that the human ear cannot hear properly. For example, if two sounds are played at the same
time, only the louder one can be heard by the ear, so the softer sound is eliminated. This means that certain
parts of the music can be removed without affecting the quality too much.
MP3 files use what is known as a LOSSY FORMAT since part of the original file is lost following the
compression algorithm. This means that the original file cannot be put back together again. However, even the
quality of MP3 files can be different since it depends on the BIT RATE – this is the number of bits per second
used when creating the file. Bit rates are roughly between 80 and 320 kilobits per second; usually 200 or
higher gives a sound quality close to a normal CD.
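The link between bit rate and file size can be estimated with a simple calculation. The sketch below is only a rough guide: the function name is invented for these notes, it assumes 1 kilobit = 1000 bits and 1 megabyte = 1024 × 1024 bytes, and it ignores any extra data stored in a real MP3 file (such as track information).

```python
def mp3_size_megabytes(bit_rate_kbps, duration_seconds):
    """Estimate the size of an MP3 file from its bit rate and length."""
    total_bits = bit_rate_kbps * 1000 * duration_seconds  # assumes 1 kilobit = 1000 bits
    total_bytes = total_bits / 8
    return total_bytes / (1024 * 1024)  # assumes 1 megabyte = 1024 * 1024 bytes


# Estimated size of one minute of music at three bit rates.
for rate in (80, 200, 320):
    print(rate, "kbps:", round(mp3_size_megabytes(rate, 60), 2), "megabytes")
```

Doubling the bit rate doubles the file size, so choosing a bit rate is a trade-off between sound quality and storage space.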
MPEG-4 (MP4) files are slightly different to MP3 files. This format allows the storage of multimedia files rather
than just sound. Music, videos, photos and animation can be all stored in the MP4 format. Videos, for example,
could be streamed over the internet using the MP4 format without losing any real discernible quality.
Activity 6.1
A CD is being used to store music. Each minute’s worth of recording takes up 12 megabytes.
a. The CD contains nine tracks which are the following length (in minutes): 3, 5, 6, 4, 5, 2, 7, 8, 8. How
much memory would these nine tracks occupy on the CD?
1 minute = 12 megabytes
3 + 5 + 6 + 4 + 5 + 2 + 7 + 8 + 8 = 48 minutes
48 * 12 = 576 megabytes
b. If the CD was downloaded to a computer and then all tracks were put through an MP3 compression
algorithm, how much memory would the nine tracks now occupy (you may assume a 90 per cent reduction
in file size)?
576 * ((100 – 90) ÷ 100) = 57.6 megabytes
c. Find the average size of each of the MP3 tracks, and then estimate how many MP3 files could be
stored on an 800-megabyte CD.
57.6 megabytes shared in the ratio 3 : 5 : 6 : 4 : 5 : 2 : 7 : 8 : 8 (1.2 megabytes per minute)
3 minutes = 3.6 megabytes
5 minutes = 6 megabytes
6 minutes = 7.2 megabytes
4 minutes = 4.8 megabytes
2 minutes = 2.4 megabytes
7 minutes = 8.4 megabytes
8 minutes = 9.6 megabytes
57.6 ÷ 9 = 6.4 megabytes average
800 ÷ 6.4 ≈ 125 MP3 tracks
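The working for Activity 6.1 can be checked with a short Python sketch. The variable names are chosen for illustration only. Fractions are used instead of ordinary decimal numbers so that the final division gives exactly 125 rather than a rounding error.

```python
from fractions import Fraction

MB_PER_MINUTE = 12
REDUCTION_PERCENT = 90
CD_CAPACITY_MB = 800
track_minutes = [3, 5, 6, 4, 5, 2, 7, 8, 8]

# Part a: total size of the nine tracks on the CD.
total_minutes = sum(track_minutes)
uncompressed_mb = total_minutes * MB_PER_MINUTE

# Part b: size after MP3 compression (90 per cent reduction).
mp3_mb = Fraction(uncompressed_mb * (100 - REDUCTION_PERCENT), 100)

# Part c: average MP3 track size and number of tracks on an 800-megabyte CD.
average_mb = mp3_mb / len(track_minutes)
tracks_on_cd = CD_CAPACITY_MB // average_mb

print(total_minutes, "minutes")            # 48 minutes
print(uncompressed_mb, "megabytes")        # 576 megabytes
print(float(mp3_mb), "megabytes")          # 57.6 megabytes
print(float(average_mb), "megabytes")      # 6.4 megabytes
print(tracks_on_cd, "MP3 tracks")          # 125 MP3 tracks
```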
6.2.3 Joint Photographic Experts Group (jpeg) files
Look at the following five photographs of the same car wheel:
The resolution of the photographs is reduced from A to E. Photographs A and B are very sharp whilst photograph
D is very fuzzy and E is almost unrecognizable. This is the result of changing the number of PIXELS per
centimetre used to store the image (that is, reducing the PICTURE RESOLUTION).
When a photographic file undergoes file compression, the size of the file is reduced. The trade-off for this
reduced file size is reduced quality of the image. One of the file formats used to reduce photographic file sizes
is known as JPEG. This is another example of lossy file compression. As with MP3 format, once the image is
subjected to the jpeg compression algorithm, a new file is formed and the original file can no longer be
reconstructed. Jpeg will reduce the RAW BITMAP image by a factor of between 5 and 15 depending on the
quality of the original.
An image that is 2048 pixels wide and 1536 pixels high is equal to 2048 × 1536 pixels; in other words,
3145728 pixels. This is often referred to as a 3-megapixel image (although it is obviously slightly larger). A raw
bitmap can often be referred to as a TIFF or BMP image (file extension .TIF or .BMP). The file size of this
image is determined by the number of pixels. In the previous example, a 3-megapixel image would be 3
megapixels × 3 colours. In other words, 9 megabytes (each pixel occupies 3 bytes because it is made up of the
three main colours: red, green and blue). TIFF and BMP are the highest image quality because, unlike jpeg,
they are not in a compressed format.
The same image stored in jpeg format would probably occupy between 0.6 megabytes and 1.8 megabytes.
Jpeg relies on certain properties of the human eye and, up to a point, a certain amount of file compression can
take place without any real loss of quality. The human eye is limited in its ability to detect very slight differences in
brightness and in colour hues. For example, some computer imaging software boasts that it can produce over
40 million different colours – the human eye is only able to differentiate about 10 million colours.
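The file sizes quoted for the 2048 × 1536 image can be reproduced with a short Python sketch. The function names are invented for these notes, and the sketch assumes 3 bytes per pixel, 1 megabyte = 1024 × 1024 bytes and the compression factors of 5 to 15 given earlier; a real jpeg file will vary with the content of the picture.

```python
BYTES_PER_PIXEL = 3                # one byte each for red, green and blue
BYTES_PER_MEGABYTE = 1024 * 1024   # assumption: 1 megabyte = 1024 * 1024 bytes


def raw_bitmap_bytes(width, height):
    """Size in bytes of an uncompressed image at 3 bytes per pixel."""
    return width * height * BYTES_PER_PIXEL


def jpeg_size_range_mb(width, height, min_factor=5, max_factor=15):
    """Estimated smallest and largest jpeg size in megabytes."""
    raw_mb = raw_bitmap_bytes(width, height) / BYTES_PER_MEGABYTE
    return raw_mb / max_factor, raw_mb / min_factor


print(2048 * 1536)                                        # 3145728 pixels
print(raw_bitmap_bytes(2048, 1536) / BYTES_PER_MEGABYTE)  # 9.0 megabytes
smallest, largest = jpeg_size_range_mb(2048, 1536)
print(round(smallest, 1), round(largest, 1))              # 0.6 1.8
```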
Activity 6.2
a. An image is 1200 pixels by 1600 pixels. Calculate:
i. the total number of pixels in the original image
1200 * 1600 = 1920000 pixels
ii. the number of bytes occupied by this file
1920000 pixels * 3 colours (bytes) = 5760000 bytes ÷ 1024² = 5.493 megabytes
iii. the file size of the jpeg image (in kilobytes) if the original image was reduced by a factor of 8.
5.493 megabytes ÷ 8 = 0.686625 megabytes = 703.104 kilobytes
b. A second image is 3072 pixels by 2304 pixels. Calculate:
i. the total number of pixels in the original image
3072 * 2304 = 7077888 pixels
ii. the number of bytes occupied by this file