vosim — Simple vocal simulation based on glottal pulses with formant characteristics.
This opcode produces a simple vocal simulation based on glottal pulses with formant characteristics. Output is a series of sound events, where each event is composed of a burst of squared sine pulses followed by silence. The VOSIM (VOcal SIMulation) synthesis method was developed by Kaegi and Tempelaars in the 1970s.
ifn - a sound table, normally containing half a period of a sinewave, squared (see notes below).
iskip - (optional) Skip initialization, for tied notes.
ar - output signal. Note that the output is usually unipolar - positive only.
kamp - output amplitude, the peak amplitude of the first pulse in each burst.
kFund - fundamental pitch, in Hertz. Each event is 1/kFund seconds long.
kForm - formant center frequency. Length of each pulse in the burst is 1/kForm seconds.
kDecay - a dampening factor from pulse to pulse. This is subtracted from amplitude on each new pulse.
kPulseCount - number of pulses in the burst part of each event.
kPulseFactor - the pulse width is multiplied by this value at each new pulse. This results in formant sweeping. If the factor is < 1.0, the formant sweeps up; if it is > 1.0, each new pulse is longer, so the formant sweeps down. The final pitch of the formant is kForm * pow(kPulseFactor, kPulseCount).
The output of vosim is a series of sound events, where each event is composed of a burst of squared sine pulses followed by silence. The total duration of each event determines the fundamental frequency. The length of each single pulse in the squared-sine bursts produces a formant frequency band. The width of the formant is determined by the ratio of silence to pulses (see below). The final result is also shaped by the dampening factor from pulse to pulse.
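The timing described above can be made concrete with a small worked example. This is only a sketch, assuming kPulseFactor = 1 (constant pulse width) and the illustrative values kFund = 200 Hz, kForm = 1000 Hz and kPulseCount = 3. The symbols T_event, T_pulse, T_burst and T_silence are names introduced here for the calculation; they are not part of the opcode.

```latex
\begin{aligned}
T_{\text{event}}   &= \frac{1}{\texttt{kFund}} = \frac{1}{200\ \text{Hz}} = 5\ \text{ms} \\
T_{\text{pulse}}   &= \frac{1}{\texttt{kForm}} = \frac{1}{1000\ \text{Hz}} = 1\ \text{ms} \\
T_{\text{burst}}   &= \texttt{kPulseCount} \cdot T_{\text{pulse}} = 3\ \text{ms} \\
T_{\text{silence}} &= T_{\text{event}} - T_{\text{burst}} = 2\ \text{ms}
\end{aligned}
```

With these values each 5 ms event holds a 3 ms burst followed by 2 ms of silence.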
A small practical problem in using this opcode is that no GEN function will create a squared sine wave out of the box. Something like the following can be used to create the appropriate table from the score.
; use GEN09 to create half a sine in table 17
f 17 time size 9 0.5 1 0
; run instr 101 on table 17 for a single init-pass
i 101 0 0 17
It can also be done with an instrument writing to an f-table in the orchestra:
; square each point in table #p4. This should be run as init-only, just once in the performance.
instr 101
  index tableng p4
  index = index - 1  ; start from last point
loop:
  ival table index, p4
  ival = ival * ival
  tableiw ival, index, p4
  index = index - 1
  if index < 0 igoto endloop
  igoto loop
endloop:
endin
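Once table 17 holds the squared half sine, a minimal instrument can read it with vosim. The following is a sketch rather than a reference patch: the instrument number and all numeric values are assumptions chosen for illustration, and the output is scaled by 20000 on the assumption that the default 16-bit amplitude range is in use.

```csound
; Minimal sketch: one steady vosim voice reading table 17.
; All numeric values are illustrative assumptions.
instr 1
  kAmp         init 0.5   ; peak amplitude of the first pulse in each burst
  kFund        init 220   ; one event every 1/220 s
  kForm        init 880   ; each pulse lasts 1/880 s
  kDecay       init 0.1   ; subtracted from the amplitude at each new pulse
  kPulseCount  init 3     ; 3 pulses of 1/880 s fit inside 1/220 s
  kPulseFactor init 1     ; constant pulse width, so no formant sweep
  aOut vosim kAmp, kFund, kForm, kDecay, kPulseCount, kPulseFactor, 17
  out aOut * 20000        ; scale for the default 16-bit amplitude range
endin
```

With these settings the three pulses in each burst have peak amplitudes of 0.5, 0.4 and 0.3 before scaling.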
Parameter Limits
The count of pulses multiplied by the pulse width should fit in the event length (1/kFund). If it does not, the algorithm does not break; pulses that would outlast the event are simply not started. This might introduce silence at the end of an event even if none was intended. Consequently, kForm should be higher than kFund, otherwise only silence is output.
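The limit can also be applied in the orchestra before vosim is called. The fragment below is a sketch that assumes kPulseFactor is 1, so every pulse lasts 1/kForm seconds; kWanted and kMaxPulses are hypothetical names introduced here, and kFund and kForm are assumed to be set earlier in the instrument.

```csound
; Hypothetical fragment for use inside an instrument.
; Assumes kPulseFactor = 1; kFund, kForm and kWanted are set earlier.
kPulseCount = kWanted
if kFund != 0 then
  ; largest whole number of 1/kForm pulses that fits in 1/kFund seconds
  kMaxPulses = int(abs(kForm / kFund))
  if kPulseCount > kMaxPulses then
    kPulseCount = kMaxPulses
  endif
endif
```

Clamping the count this way only makes the limit explicit; as noted above, the opcode itself already declines to start pulses that would outlast the event.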
Vosim was created to emulate voice sounds using a model of glottal pulse. Rich sounds can be created by combining several instances of vosim with different parameters. One drawback is that the signal is not band-limited. But as the authors point out, attenuation of high-pitch components is -60 dB at 6 times the fundamental frequency. The signal can also be changed by changing the source signal in the lookup table. The technique has historical interest, and can produce rich sound very cheaply (each sample requires only a table lookup and a single multiplication for attenuation).
As stated, formant bandwidth depends on the ratio between pulse burst and silence in an event. But this is not an independent parameter: the fundamental decides the event length, and the formant center defines the pulse length. It is therefore impossible to guarantee a specific burst/silence ratio, since the burst length has to be an integer multiple of the pulse length. The decay of pulses can be used to smooth the transition from N to N±1 pulses, but there will still be steps in the spectral profile of the output. The example code below shows one approach to this.
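The constraint can be illustrated with a short calculation. It is a sketch under the assumptions kPulseFactor = 1 and the illustrative values kFund = 200 Hz and kForm = 1000 Hz. Here r is a symbol introduced for the fraction of each event occupied by the burst, and N stands for kPulseCount.

```latex
\begin{aligned}
r &= \frac{N / \texttt{kForm}}{1 / \texttt{kFund}}
   = N \cdot \frac{\texttt{kFund}}{\texttt{kForm}} = 0.2\,N \\
r = 0.5 &\;\Rightarrow\; N = 2.5 \quad \text{(not an integer)} \\
N = 2 &\;\Rightarrow\; r = 0.4, \qquad N = 3 \;\Rightarrow\; r = 0.6
\end{aligned}
```

A burst fraction of exactly 0.5 is therefore unreachable at these frequencies; the nearest available values are 0.4 and 0.6.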
All input parameters are k-rate. The input parameters are only used to set up each new event (or grain). Event amplitude is fixed for each event at initialization. In normal parameter ranges, when ksmps <500, the k-rate parameters are updated more often than events are created. In any case, no wide-band noise will be injected in the system due to k-rate inputs being updated less often than they are read, but some other artefacts could be created.
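The relation between control rate and event rate can be checked with simple arithmetic. The control rate is sr / ksmps, and parameters are updated more often than events are created whenever that rate exceeds kFund. The figures below assume sr = 44100, which the text above does not state explicitly; ksmps = 100 is the setting used in the example file.

```latex
\begin{aligned}
k_r &= \frac{sr}{ksmps} \\
sr = 44100,\ ksmps = 500 &\;\Rightarrow\; k_r = 88.2\ \text{Hz} \\
sr = 44100,\ ksmps = 100 &\;\Rightarrow\; k_r = 441\ \text{Hz}
\end{aligned}
```

At ksmps = 100 the control rate stays above the fundamentals of 180 to 300 Hz used in the example score, so each event sees freshly updated parameters.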
The opcode should behave reasonably in the face of all user inputs. Some details:
kFund < 0: This is forced to positive - no point in "reversed" events.
kFund == 0: This leads to an "infinite" length event, i.e. a pulse burst followed by an indefinitely long silence.
kForm == 0: This leads to infinite length pulse, so no pulses are generated (i.e. silence).
kForm < 0: Table is read backward. If the table is symmetric, kForm and -kForm should give bit-identical outputs.
kPulseFactor == 0: Second pulse onwards is zero. See the kForm == 0 case above.
kPulseFactor < 0: Pulses alternately read table forward and reversed.
With asymmetric pulse table there may be some use for negative kForm or negative kPulseFactor.
Here is an example of the vosim opcode. It uses the file vosim.csd.
Example 1015. Example of the vosim opcode.
See the sections Real-time Audio and Command Line Flags for more information on using command line flags.
<CsoundSynthesizer>
<CsOptions>
; Select audio/midi flags here according to platform
; Audio out   Audio in
-odac           -iadc    ;;;RT audio I/O
; For Non-realtime output leave only the line below:
; -o vosim.wav -W ;;; for file output any platform
</CsOptions>
<CsInstruments>

sr = 44100
ksmps = 100
nchnls = 1

;#################################################
; By Rasmus Ekman 2008

; Square each point in table #p4. This should only be run once in the performance.
instr 10
  index tableng p4
  index = index - 1  ; start from last point
loop:
  ival table index, p4
  ival = ival * ival
  tableiw ival, index, p4
  index = index - 1
  if index < 0 igoto endloop
  igoto loop
endloop:
endin

;#################################################
; Main vosim instrument. Sweeps from a fund1/form1 to fund2/form2,
; trying for narrowest formant bandwidth (still quite wide by the looks of it)
; p4:     amp
; p5, p6: fund beg-end
; p7, p8: form beg-end
; p9:     amp decay (ignored)
; p10:    pulse count (ignored - calc internally)
; p11:    pulse length mod
; p12:    skip (for tied events)
; p13:    don't fade out (if followed by tied note)
instr 1
  kamp init p4
  ; freq start, end
  kfund line p5, p3, p6
  ; formant start, end
  kform line p7, p3, p8

  ; Try for constant ratio burst/silence, and narrowest formant bandwidth
  kPulseCount = (kform / kfund) ;init p10

  ; Attempt to smooth steps between formant bandwidths,
  ; increasing decay before we are forced to a lower pulse count
  kDecay = kPulseCount/(kform % kfund) ; init p9
  if (kDecay * kPulseCount) > kamp then
    kDecay = kamp / kPulseCount
  endif
  kDecay = 0.3 * kDecay

  kPulseFactor init p11

  ; ar vosim kamp, kFund, kForm, kDecay, kPulseCount, kPulseFactor, ifn [, iskip]
  ar1 vosim kamp, kfund, kform, kDecay, kPulseCount, kPulseFactor, 17, p12

  ; scale amplitude for 16-bit files, with quick fade out
  amp init 20000
  if (p13 != 0) goto nofade
  amp linseg 20000, p3-.02, 20000, .02, 0
nofade:
  out ar1 * amp
endin

</CsInstruments>
<CsScore>

f1  0 32768 9 1 1 0    ; sine wave
f17 0 32768 9 0.5 1 0  ; half sine wave
i10 0 0 17 ; init run only, square table 17

; Vosim score

; Picking some formants from the table in Csound manual

; p4=amp fund form decay pulses pulsemod [skip] nofade
; tenor a -> e
i1 0   .5 .5  280 240 650  400  .03 5 1
i1 .   .  .3  .   .   1080 1700 .03 5 .
i1 .   .  .2  .   .   2650 2600 .03 5 .
i1 .   .  .15 .   .   2900 3200 .03 5 .
; tenor a -> o
i1 0.6 .2 .5  300 210 650  400  .03 5 1 0 1
i1 .   .  .3  .   .   1080 800  .03 5 . . .
i1 .   .  .2  .   .   2650 2600 .03 5 . . .
i1 .   .  .15 .   .   2900 2800 .03 5 . . .
; tenor o -> aah
i1 .8  .3 .5  210 180 400  650  .03 5 1 1 1
i1 .   .  .3  .   .   800  1080 .03 5 . . .
i1 .   .  .2  .   .   2600 2650 .03 5 . . .
i1 .   .  .15 .   .   2800 2900 .03 5 . . .
; tenor aa -> i
i1 1.1 .2 .5  180 250 650  290  .03 5 1 1 1
i1 .   .  .3  .   .   1080 1870 .03 5 . . .
i1 .   .  .2  .   .   2650 2800 .03 5 . . .
i1 .   .  .15 .   .   2900 3250 .03 5 . . .
; tenor i -> u
i1 1.3 .3 .5  250 270 290  350  .03 5 1 1 0
i1 .   .  .3  .   .   1870 600  .03 5 . . .
i1 .   .  .2  .   .   2800 2700 .03 5 . . .
i1 .   .  .15 .   .   3250 2900 .03 5 . . .
e
</CsScore>
</CsoundSynthesizer>