Computers & Music:

How does the EMU8000 make it?

by Kai Pihl

Tuning and pitch shifting

When an instrument produces a sound, it generates variations in the air pressure. They range from a slight underpressure to a slight overpressure. These fluctuations spread from the sound source into the surrounding air like waves on water when we fling a stone into it. The extent of the fluctuation is called the amplitude. The bigger the stone, the bigger the waves on the water. In the same way, the louder the sound, the bigger the fluctuations in the air pressure. The amplitude is usually normalized to a scale from -1 to +1. When a sound is recorded digitally, it is the amplitude that is actually recorded at given intervals. The number of these intervals per unit of time is the sampling rate in samples per second. When we sample a sound at a rate of 22050 samples per second, we actually use the microphone to measure the air pressure 22050 times per second.
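As a small illustration of what sampling means, the sketch below "measures" an ideal sine wave at fixed intervals, much as a recorder measures the air pressure. The 440 Hz test tone and the function name are assumptions made for this example; only the rate of 22050 samples per second comes from the text.

```python
import math

SAMPLE_RATE = 22050  # samples per second, as in the text


def sample_sine(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Return the amplitudes (-1..+1) of a sine wave measured sample_rate times per second."""
    count = int(round(duration_s * sample_rate))
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]


samples = sample_sine(440.0, 1.0)  # hypothetical 440 Hz tone, one second long
print(len(samples))                # 22050 measurements for one second of sound
```

Doubling the sampling rate doubles the number of stored values for the same second of sound, which is why the storage space matters later on.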

In the picture below we can see the same type of graph as in the Loop Marking dialog box of Vienna when the 'arcoviolinx2' sound is selected. Here it is drawn with scales on both axes. The vertical axis represents the amplitude, the amount of air pressure. When the graph is in the lower (negative) part there is underpressure, and when it is in the upper (positive) part there is overpressure in the air. The horizontal axis shows the time, here in seconds. As we can see, the recording is very short, a little less than 0.2 seconds. That's why the title says that it is the beginning of the sound. There are also two green vertical lines at the end of the graph. These lines mark the beginning and the end of the sustain loop. But more about that a little later. We are first interested in the pitch and the tuning of the sound and how the EMU8000 handles them.

Why should we care about the musical note of the played sound? Remember that the synthesizer lowers and raises the pitch of the recorded sound according to the key we hit. The first zone (arcoviolingx2) of the violin instrument has its root key at Eb4. This can be seen from the red triangle above the keyboard when the sample is activated. At the root key the synthesizer does not make any changes to the original sound if no tuning is applied. The root key is usually at the same pitch at which the original recording was made. But why does this sample have its root key outside its playing zone? The highest note in the zone is 27 semitones lower than the root key and, besides that, the whole recording is tuned 30 cents upwards. The result is that the highest note in the zone is played 26 semitones and 70 cents lower than the original recording.
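The arithmetic above can be sketched as follows. The function name is made up for this example, and the MIDI key number 63 for Eb4 is an assumption that follows from counting middle C (C4) as key number 60.

```python
ROOT_KEY = 63       # Eb4, assuming middle C (C4) is MIDI key number 60
TUNING_CENTS = 30   # the whole recording is tuned 30 cents upwards


def pitch_offset_cents(key, root_key=ROOT_KEY, tuning_cents=TUNING_CENTS):
    """Offset of the played pitch from the original recording, in cents.

    A negative value means the sound is played lower than it was recorded.
    """
    return (key - root_key) * 100 + tuning_cents


highest_key = ROOT_KEY - 27               # highest note in the zone
print(pitch_offset_cents(highest_key))    # -2670: 26 semitones and 70 cents down
print(pitch_offset_cents(ROOT_KEY, tuning_cents=0))  # 0: root key, no tuning, no change
```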

Tuning downwards would require the synthesizer to play the sound back at a lower speed, but the EMU8000 reads and outputs the data at a constant rate of 44100 samples per second. So it must stretch the original sound to make it last longer, i.e. to make it sound as a lower note. The length of the original recording up to the end of its loop is 1727 samples, which can be seen in the Loop Marking dialog box. You get to this dialog by double-clicking the name of the sample in the Tree view.

Every octave (12 semitones) downward halves the frequency of the sound oscillation. 26 semitones make 2 octaves plus 2 semitones. Together with the 70 cents (70/100 of a semitone) they give us a divider of 4.675, which is 2 raised to the power 26.70/12. So the original recording must be lengthened by this factor to make it sound 26 and 70/100 semitones lower. This way we end up with 8073 samples, which corresponds to 0.183 seconds when played back at 44100 samples per second. In this stretched record the original samples are located at longer distances from each other. The gaps between them are filled by a software algorithm to construct a curve of best audio fit, as Dave Rossum, Chief Wizard of the Tech Center and EMU8000 co-designer, mentioned in his article on "memory bandwidth issues in EMU 8000". Sorry to say, this article has disappeared from Creative's web pages.
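The numbers above can be checked with a few lines of code. This is only a sketch of the arithmetic, not of what the chip does internally; the function and constant names are made up for the example.

```python
SAMPLE_RATE = 44100      # EMU8000 output rate in samples per second
ORIGINAL_LENGTH = 1727   # samples up to the end of the loop


def stretch_factor(semitones_down):
    """Factor by which a recording must be lengthened to sound this much lower."""
    return 2 ** (semitones_down / 12)


factor = stretch_factor(26.70)             # 26 semitones and 70 cents
stretched = ORIGINAL_LENGTH * factor

print(round(factor, 3))                    # 4.675
print(int(stretched))                      # 8073 samples
print(round(stretched / SAMPLE_RATE, 3))   # 0.183 seconds
```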

What is wrong here? Why do double work by first lengthening the original recording by 4.675 and then filling the gaps with calculated samples? Why couldn't the recording be the right length in the first place? The answer is in the 1 MB ROM, or more exactly in the lack of storage space. The truth is that the original recording was inside the zone, but about three of every four samples were taken away, which left a record only about 1/4 of the length of the original. When this record is played back at the full 44.1 kHz rate, it has to be lowered to make it sound like the original. This process leaves gaps between the individual samples, which have to be filled by that algorithm. This way the recordings fit into the 1 MB ROM and the sound quality can still be kept at a reasonable level. Clever, isn't it?
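To make the idea concrete, here is a minimal sketch of dropping three of every four samples and then filling the gaps again on playback. Plain linear interpolation is used here as a stand-in: the actual algorithm of the EMU8000 is not described in this article, so this is an assumption, and the function names and the test data are made up.

```python
def decimate(samples, keep_every=4):
    """Keep one sample out of every keep_every and drop the rest."""
    return samples[::keep_every]


def stretch_linear(samples, factor):
    """Lengthen samples by factor, filling the gaps by linear interpolation."""
    out_len = int((len(samples) - 1) * factor) + 1
    out = []
    for i in range(out_len):
        pos = i / factor                          # position in the short record
        left = int(pos)                           # stored sample before pos
        right = min(left + 1, len(samples) - 1)   # stored sample after pos
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


original = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]  # made-up ramp
stored = decimate(original)            # what would go into the ROM
restored = stretch_linear(stored, 4)   # what would be played back

print(stored)          # [0.0, 0.4, 0.8]
print(len(restored))   # 9, the same length as the original
```

A straight ramp like this is restored almost perfectly. A real sound with fast changes between the stored samples loses detail, which is the price paid for the saved space.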

This sounds a little bit like cheating, but it's not. With enough DRAM there's no need to do things like this. In the bigger banks E-mu offers, everything is OK. And remember, without tricks like this the price of the AWE would be much higher.

Did I promise to give easy explanations? Yes, I did. This was only an attempt to explain why the lengths in the Loop Marking dialog box and the actual played lengths differ so much. It also helps us understand the volume envelope more deeply.

Measuring the sound

Now that we know the real duration of the violin sound played, we can examine the sound more precisely. The Volume Envelope dialog box of Vienna2 doesn't show the envelope in true scale. The lengths of the different phases are merely figurative. The following table summarizes all the volume envelope parameters. In addition to those we've already discussed there are a few new parameters: Initial attenuation, Key to Hold and Key to Decay.

Initial attenuation makes it possible to adjust the volumes of individual samples inside an instrument which has more than one sample zone.
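For a feeling of what an attenuation in decibels does to the amplitude, here is a small sketch using the standard decibel definition for amplitudes. How exactly the EMU8000 steps its attenuation internally is not covered here, and the function name is made up for the example.

```python
def attenuation_to_gain(attenuation_db):
    """Amplitude multiplier for an attenuation given in decibels."""
    return 10 ** (-attenuation_db / 20)


for db in (0, 6, 20, 96):
    print(db, attenuation_to_gain(db))
```

An attenuation of 0 dB leaves the amplitude unchanged, 20 dB reduces it to one tenth, and about 6 dB halves it.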

The Key to Hold and Key to Decay parameters lengthen or shorten the Hold and Decay phases according to the played MIDI key. If the key is number 60 (C4, middle C), they have no impact. These parameters are usually left at 1 (the default), so that they have no impact at any key.

The parameters of the volume envelope

Phase                Min. value   Max. value   Comments
Delay                0.001 secs   10 secs      The time the envelope waits before it begins the attack. Can be used for esoteric things, usually left at 0.
Attack               0.001 secs   11.87 secs   The time needed to reach the full level of the sound. A value of 0.001 is considered as having no attack phase.
Hold                 0.001 secs   11.68 secs   The time the sound stays at the full level.
Decay                0.001 secs   48.28 secs   The time used to get to the sustain level.
Sustain              0 dB         96 dB        This isn't a time value, but a steady state level for as long as the note lasts.
Release              0.001 secs   48.28 secs   The time for the sound to go to zero from whatever level it was at before entering this phase.
Initial attenuation  0 dB         -96 dB       Sets the amount by which the volume of a note will be reduced below full level. A value of zero indicates no attenuation.
Key to Hold          0.5          2            Sets the degree to which the hold time will be decreased by increasing the MIDI key number. The hold time at key number 60 is always unchanged.
Key to Decay         0.5          2            Sets the degree to which the decay time will be decreased by increasing the MIDI key number. The decay time at key number 60 is always unchanged.

As the table shows, the maximum time values are quite long compared to the actual durations of many instrument sounds. The piano sound is one of the longest in the ROM bank. When tuned down and pitch-shifted to the lowest usable octave, its actual duration is about 1.5 seconds.

As said before, the visible lengths of the different phases of the volume envelope are merely figurative in the Vienna user interface. This is convenient for a user who only wants to experiment with the sounds, check out what happens if some slider is thrown all the way up, etc. In order to gain precise control over the sound, we must first find out the exact playing time of a sound at some exact pitch, as we did above. After that we can adjust the attack, hold and decay times so that they fit the non-sustaining part of the sound. If it's going to be a sustained sound, we can adjust the sustaining part after that. We talk about sustaining sounds in another section.

How then can we find the actual playing time of a sound? Here's how I do it. As we can see from the table above, the maximum hold time is 11.68 seconds. Surely every natural instrument sound without a sustain effect fits into this time range. We can disable the looping by unchecking the 'Enable looping for this sample' checkbox in the Loop Marking dialog of Vienna2. Then, if we put all volume phases to zero except the hold phase, which should be at its maximum, we can hear the sound in its recorded form: without attack or decay adjustments and without the sustaining part. Only the tuning and the pitch shift are in effect. So if we play a note at, say, middle C, and measure the duration of the produced sound, we can work out the duration in other octaves as well. Every octave down multiplies the duration by two and every octave up divides it by two. How can this measurement be done with adequate accuracy? Very easily. We just play it with Vienna and record it at the same time with Wave Studio.
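Once the duration at one key has been measured, the durations at other keys follow from the octave rule. A minimal sketch, in which the function name and the measured value of 0.5 seconds are made up for the example:

```python
def duration_at_key(measured_s, measured_key, target_key):
    """Playing time at target_key, given the time measured at measured_key.

    Every octave (12 keys) down doubles the duration, every octave up halves it.
    """
    return measured_s * 2 ** ((measured_key - target_key) / 12)


measured = 0.5  # hypothetical duration in seconds, measured at middle C (key 60)
print(duration_at_key(measured, 60, 48))  # 1.0, one octave down
print(duration_at_key(measured, 60, 72))  # 0.25, one octave up
```

The same function also works for single semitones, since the exponent does not have to be a whole number of octaves.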
