RSound: A Sound Engine for Racket

John Clements <clements@racket-lang.org>

 (require rsound) package: rsound
This collection provides a means to represent, read, write, play, and manipulate sounds. It depends on the portaudio package to provide bindings to the cross-platform PortAudio library, which appears to run on Linux, Mac, and Windows.
It represents all sounds internally as stereo 16-bit PCM, with all the attendant advantages (speed, mostly) and disadvantages (clipping).

Does it work on your machine? Try this example:

If it doesn’t work on your machine, please try running (diagnose-sound-playing), and tell me about it!

A note about volume: be careful not to damage your hearing, please. To take a simple example, the sine-wave function generates a sine wave with amplitude 1.0. That translates into the loudest possible sine wave that can be represented. So please set your volume low, and be careful with the headphones. Maybe there should be a parameter that controls the clipping volume. Hmm.

1 Sound Control

These procedures start and stop playing sounds.

procedure

(play rsound)  void?

  rsound : rsound?
Plays an rsound. Plays concurrently with an already-playing sound, if there is one. Returns immediately (before the sound is played).

Play can only be used at the top level.

procedure

(stop)  void?

Stops all of the currently playing sounds.

value

ding : rsound?

A one-second "ding" sound. Nice for testing whether sound playing is working.

2 Stream-based Playing

RSound provides a "pstream" abstraction which falls conceptually in between play and signal-play. In particular, a pstream encapsulates an ongoing signal, with primitives available to queue sounds for playback, to check the signal’s "current time" (in frames), and to queue a callback to occur at a particular time.

This mechanism has two advantages over play. First, it allows you to queue sounds for a particular frame, avoiding hiccups in playback. Second, it uses only a single portaudio stream, rather than the multiple portaudio streams that would result from multiple calls to play.

procedure

(make-pstream [#:buffer-time buffer-time])  pstream?

  buffer-time : (or/c number? #f) = #f
Create a new pstream and start playing it. Initially, of course, it will be silent. Returns the pstream. If a buffer-time argument is specified (in seconds), it overrides the default buffer time of about 50 ms. Use a long buffer-time when continuity is more important than responsiveness (background music, etc).

procedure

(pstream? val)  boolean?

  val : any
Returns #true if the given val is a pstream, constructed with make-pstream.

procedure

(andqueue pstream rsound frames val)  any

  pstream : pstream?
  rsound : rsound?
  frames : natural?
  val : any
Queue the given sound to be played at the time specified by frames. If that frame is in the past, it will still play the appropriate remainder of the sound. Returns the given value.

procedure

(pstream-queue pstream rsound frames)  string?

  pstream : pstream?
  rsound : rsound?
  frames : natural?
Queue the given sound to be played at the time specified by frames. If that frame is in the past, it will still play the appropriate remainder of the sound. Returns the string "sound is queued".

procedure

(pstream-current-frame pstream)  natural?

  pstream : pstream?
Returns the current value of the stream’s frame counter.

procedure

(pstream-play pstream rsound)  pstream?

  pstream : pstream?
  rsound : rsound?
Play the given sound on the given stream. Returns the pstream.

procedure

(pstream-queue-callback pstream    
  callback    
  frames)  pstream?
  pstream : pstream?
  callback : procedure?
  frames : natural?
Queue the callback (a procedure of no arguments) to be called when the pstream’s frame counter reaches frames. If the counter is already larger than frames, calls it immediately.

It’s perhaps worth noting that the callbacks are triggered by semaphore posts, to avoid the possibility of a callback stalling playback. This can mean that the callback is delayed by a few milliseconds.

procedure

(pstream-set-volume! pstream volume)  pstream?

  pstream : pstream?
  volume : real?
Given a nonnegative real number, sets the pstream’s volume to that number. A value of 0 indicates silence, a value of 1.0 indicates full volume. Returns the pstream.

procedure

(pstream-clear! pstream)  void?

  pstream : pstream?
Clear all rsounds from pstream’s queue. This will not stop any of pstream’s rsounds which have already begun playing.

3 Recording

RSound now includes basic support for recording sounds.

procedure

(record-sound frames)  rsound?

  frames : nat?
Using the default input device, record a (stereo) sound of length frames, using the default sample rate. Blocks until the sound is finished.

4 File I/O

These procedures read and write rsounds from/to disk.

The RSound library reads and writes WAV files only; this means fewer FFI dependencies (the reading & writing is done in Racket), and works on all platforms.

procedure

(rs-read path)  rsound?

  path : path-string?
Reads a WAV file from the given path, returns it as an rsound.

It currently has lots of restrictions (it insists on 16-bit PCM encoding, for instance), but deals with a number of common bizarre conventions that certain WAV files have (PAD chunks, extra blank bytes at the end of the fmt chunk, etc.), and tries to fail relatively gracefully on files it can’t handle.

Reading in a large sound can result in a very large value (~10 Megabytes per minute); for larger sounds, consider reading in only a part of the file, using rs-read/clip.
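The "~10 Megabytes per minute" figure follows from the stereo 16-bit PCM representation. Here is a small sketch of the arithmetic in plain Racket; pcm-bytes is a hypothetical helper written for this illustration, not part of RSound.

```racket
#lang racket/base

;; Hypothetical helper (not an RSound function): bytes needed to hold
;; `seconds` of stereo 16-bit PCM at `frame-rate` frames per second.
;; Each frame holds 2 samples (left and right); each sample is 2 bytes.
(define (pcm-bytes seconds frame-rate)
  (* seconds frame-rate 2 2))

(pcm-bytes 60 44100) ; => 10584000
(pcm-bytes 60 48000) ; => 11520000
```

So one minute of sound comes to roughly 10.6 million bytes at 44.1 kHz, and about 11.5 million bytes at 48 kHz.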

procedure

(rs-read/clip path start finish)  rsound?

  path : path-string?
  start : nonnegative-integer?
  finish : nonnegative-integer?
Reads a portion of a WAV file from a given path, starting at frame start and ending at frame finish.

It currently has lots of restrictions (it insists on 16-bit PCM encoding, for instance), but deals with a number of common bizarre conventions that certain WAV files have (PAD chunks, extra blank bytes at the end of the fmt chunk, etc.), and tries to fail relatively gracefully on files it can’t handle.

procedure

(rs-read-frames path)  nonnegative-integer?

  path : path-string?
Returns the number of frames in the sound indicated by the path. It parses the header only, and is therefore much faster than reading in the whole sound.

The file must be encoded as a WAV file readable with rs-read.

procedure

(rs-read-sample-rate path)  positive-number?

  path : path-string?
Returns the frame rate of the sound indicated by the path. It parses the header only, and is therefore much faster than reading in the whole sound.

The file must be encoded as a WAV file readable with rs-read.

procedure

(rs-write rsound path)  void?

  rsound : rsound?
  path : path-string?
Writes an rsound to a WAV file, using stereo 16-bit PCM encoding. It overwrites an existing file at the given path, if one exists.

5 Rsound Manipulation

These procedures allow the creation, analysis, and manipulation of rsounds.

struct

(struct rsound (data start end frame-rate)
    #:extra-constructor-name make-rsound)
  data : s16vector?
  start : nonnegative-number?
  end : nonnegative-number?
  frame-rate : nonnegative-number?
Represents a sound; specifically, frames start through end of the given 16-bit stereo s16vector.

FRAME-RATE is the basic default frame rate.

Note for people not using the beginning student language: this constant is provided because the default-sample-rate parameter isn’t usable in beginning student language.

parameter

(default-sample-rate)  positive-real?

(default-sample-rate frame-rate)  void?
  frame-rate : positive-real?
A parameter that defines the default frame rate for construction of new sounds.

Note that the terms sample rate and frame rate are used interchangeably. The term "frame rate" is arguably more correct, because one second of stereo sound at a frame rate of 48000 actually has 96000 samples—48000 for the left channel, and 48000 for the right channel. Despite this, the term "sample rate" is generally used to refer to the frame rate in audio applications.

procedure

(rs-frames sound)  nonnegative-integer?

  sound : rsound?
Returns the length of a sound, in frames.

procedure

(rs-frame-rate sound)  positive-real?

  sound : rsound?
Returns the frame rate of a sound, in frames per second.

procedure

(rs-equal? sound1 sound2)  boolean?

  sound1 : rsound?
  sound2 : rsound?
Returns #true when the two sounds are (extensionally) equal.

This procedure is necessary because s16vectors don’t natively support equal?.

procedure

(silence frames)  rsound?

  frames : nonnegative-integer?
Returns an rsound of length frames containing silence. This procedure is relatively fast.

procedure

(rs-ith/left rsound frame)  real?

  rsound : rsound?
  frame : nonnegative-integer?
Returns the sample at index frame from the left channel of the rsound, represented as a number in the range -1.0 to 1.0.

procedure

(rs-ith/right rsound frame)  real?

  rsound : rsound?
  frame : nonnegative-integer?
Returns the sample at index frame from the right channel of the rsound, represented as a number in the range -1.0 to 1.0.

procedure

(clip rsound start finish)  rsound?

  rsound : rsound?
  start : nonnegative-integer?
  finish : nonnegative-integer?
Returns a new rsound containing the frames of rsound from frame start up to, but not including, frame finish. This procedure copies the required portion of the sound.

procedure

(rs-append rsound-1 rsound-2)  rsound?

  rsound-1 : rsound?
  rsound-2 : rsound?
Returns a new rsound containing the given two rsounds appended sequentially. Both of the given rsounds must have the same frame rate.

procedure

(rs-append* rsounds)  rsound?

  rsounds : (listof rsound?)
Returns a new rsound containing the given rsounds, appended sequentially. This procedure is relatively fast. All of the given rsounds must have the same frame rate.

procedure

(rs-overlay rsound-1 rsound-2)  rsound?

  rsound-1 : rsound?
  rsound-2 : rsound?
Returns a new rsound containing the two sounds played simultaneously. Note that unless both sounds have amplitudes less than 0.5, clipping or wrapping is likely.

procedure

(rs-overlay* rsounds)  rsound?

  rsounds : (listof rsound?)
Returns a new rsound containing all of the sounds played simultaneously. Note that unless all of the sounds have low amplitudes, clipping or wrapping is likely.

procedure

(assemble assembly-list)  rsound?

  assembly-list : (listof (list/c rsound? nonnegative-integer?))
Returns a new rsound containing all of the given rsounds. Each sound begins at the frame number indicated by its associated offset. The rsound will be exactly the length required to contain all of the given sounds.

So, suppose we have two rsounds: one called a, of length 20000, and one called b, of length 10000. Evaluating

(assemble (list (list a 5000)
                (list b 0)
                (list b 11000)))

... would produce a sound of 25000 frames (the copy of a starts at frame 5000 and runs for 20000 frames), where each instance of b overlaps with the central instance of a.
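The length rule ("exactly the length required to contain all of the given sounds") can be checked with a little arithmetic. This is a sketch in plain Racket that works on lengths rather than rsounds; assembled-length is a hypothetical helper, and each entry is a (list length offset) pair mirroring the (list rsound offset) pairs that assemble accepts.

```racket
#lang racket/base

;; Hypothetical helper (not an RSound function): each entry is
;; (list length-in-frames offset-in-frames). The assembled sound has to
;; reach the latest ending frame of any entry.
(define (assembled-length entries)
  (for/fold ([longest 0]) ([e (in-list entries)])
    (max longest (+ (car e) (cadr e)))))

;; a is 20000 frames at offset 5000; b is 10000 frames at offsets 0 and 11000
(assembled-length '((20000 5000) (10000 0) (10000 11000))) ; => 25000
```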

procedure

(rs-scale scalar rsound)  rsound?

  scalar : real?
  rsound : rsound?
Scale the given sound by multiplying all of its samples by the given scalar.

procedure

(rs-mult a b)  rsound?

  a : rsound?
  b : rsound?
Produce a new sound by pointwise multiplication of sounds a and b.

procedure

(rearrange length mapping-fun rsound)  rsound?

  length : frames?
  mapping-fun : procedure?
  rsound : rsound?
Returns a new sound with samples drawn from the original according to the mapping-fun. Specifically, a sound of length length is constructed by calling mapping-fun once for each sample with the frame number, and using the resulting number to select a frame from the input sound rsound.

procedure

(resample factor sound)  rsound?

  factor : positive-real?
  sound : rsound?
Returns a new sound that is resampled by the given factor. So, for instance, calling (resample 2 ding) will produce a sound that is half as long and one octave higher. The sample rate of the new sound is the same as the old one.

Samples are chosen using rounding; there is no interpolation done.
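To make the "rounding, no interpolation" idea concrete, here is a model on a plain Racket vector. It is only a sketch: resample-vector is a hypothetical name, and the output length and the clamping of the last index are my assumptions, not a description of RSound's internals.

```racket
#lang racket/base

;; Model only (not RSound's implementation): output element i is taken
;; from input index round(factor * i). The output length floor(n / factor)
;; and the clamping to the last valid index are assumptions.
(define (resample-vector factor vec)
  (define n (vector-length vec))
  (define new-len (inexact->exact (floor (/ n factor))))
  (define last-index (- n 1))
  (for/vector #:length new-len ([i (in-range new-len)])
    (vector-ref vec
                (min last-index
                     (inexact->exact (round (* factor i)))))))

(resample-vector 2 (vector 0 1 2 3 4 5 6 7)) ; => '#(0 2 4 6)
```

With a factor of 2, every other element is kept, which is why the result is half as long and sounds an octave higher when played at the same frame rate.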

procedure

(resample/interp factor sound)  rsound?

  factor : positive-real?
  sound : rsound?
Similar to resample, except that it performs linear interpolation. The resulting sound should sound better, but the function takes slightly longer.

My tests of 2014-09-22 suggest that interpolating takes about twice as long. In command-line racket, this amounts to a jump from 1.7% CPU usage to 3.0% CPU usage.

procedure

(resample-to-rate frame-rate sound)  rsound?

  frame-rate : frame-rate?
  sound : rsound?
Similar to resample/interp, except that it accepts a new desired frame rate rather than a factor, and produces a sound whose frame rate is the given one.

Put differently, the sounds that result from (resample/interp 2.0 ding) and (resample-to-rate 24000 ding) should contain exactly the same set of samples, but the first will have a frame rate of 48000, and the second a frame rate of 24000.

procedure

(build-sound frames generator)  rsound?

  frames : frames?
  generator : procedure?
Given a number of frames and a procedure, produce a sound.

More specifically, the samples in the sound are generated by calling the procedure with each frame number in the range [0 .. frames-1]. The procedure must return real numbers in the range -1.0 to 1.0. The left and right channels will be identical.

Here’s an example that generates a simple sine-wave (you could also use make-tone for this).

(define VOLUME 0.1)
(define FREQUENCY 430)
 
(define (sine-tone f)
  (* VOLUME (sin (* 2 pi FREQUENCY (/ f FRAME-RATE)))))
 
(build-sound (* 2 FRAME-RATE) sine-tone)

procedure

(vec->rsound s16vec frame-rate)  rsound?

  s16vec : s16vector?
  frame-rate : frame-rate?
Construct an rsound from an s16vector containing interleaved 16-bit samples, using the given frame rate.

6 Signals and Networks

For signal processing, RSound adopts a dataflow-like paradigm, where elements may be joined together to form a directed acyclic graph, which is itself an element that can be joined together, and so forth. So, for instance, you might have a sine wave generator connected to the amplitude input of another sine wave generator, with the result passed through a distortion filter. Each node accepts a stream of inputs, and produces a stream of outputs. I will use the term node to refer to both the primitive elements and the compound elements.

The most basic form of node is simply a procedure. It takes inputs, and produces outputs. In addition, the network form provides support for nodes that are stateful and require initialization.

A node that requires no inputs is called a signal.

Signals can be played directly, with signal-play. They may also be converted to rsounds, using signal->rsound or signals->rsound.

A node that takes one input is called a filter.

syntax

(network (in ...)
         network-clause
         ...)
 
in = identifier
     
network-clause = [node-label = expression]
  | [node-label <= network expression ...]
  | [(node-label ...) = expression]
  | [(node-label ...) <= network expression ...]
     
node-label = identifier
Produces a network. The in names specify input arguments to the network. Each network clause describes a node. Each node must have a label, which may be used later to refer to the value that is the result of that node. Multiple labels are used for clauses that produce multiple values.

There are two kinds of clause. A clause that uses = simply gives the name to the result of evaluating the right-hand-side expression. A clause that uses <= evaluates the input expressions, and uses them as inputs to the given network.

The special (prev node-label init-val) form may be used to refer to the previous value of the corresponding node. It’s fine to have “forward” references to clauses that haven’t been evaluated yet. The init-val value is used as the previous value the first time the network is used.

The final clause’s node is used as the output of the network.

The network form is useful because it manages the initialization of stateful networks, and allows reference to previous outputs.

Here’s a trivial signal:

(lambda () 3)

Here’s the same signal, written using network:

(network ()
         [out = 3])

This is the signal that always produces 3.

Here’s another one, that counts upward:

(define counter/sig
  (network ()
           [counter = (+ 1 (prev counter 0))]))

The prev form is special, and is used to refer to the prior value of the signal component.

Note that since we’re adding one immediately, this counter starts at 1.
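Since the most basic node is just a procedure, and a signal is a node with no inputs, the same counter can be sketched by hand as a closure. This is only meant to show what prev is managing for you (the initial value and the remembered output); it is not how the network form is implemented, and make-counter is a hypothetical name.

```racket
#lang racket/base

;; A hand-written stateful signal: a zero-argument procedure that
;; remembers its previous output. The initial 0 plays the role of the
;; init-val in (prev counter 0).
(define (make-counter)
  (define previous 0)
  (lambda ()
    (set! previous (+ 1 previous))
    previous))

(define counter (make-counter))
(counter) ; => 1
(counter) ; => 2
```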

Here’s another example, that adds together two sine waves, at 34 Hz and 46 Hz, assuming a sample rate of 44.1 kHz:

(define sum-of-sines
     (network ()
              [a <= sine-wave 34]
              [b <= sine-wave 46]
              [out = (+ a b)]))

In order to use a signal with signal-play, it should produce a real number in the range -1.0 to 1.0.

Here’s an example that uses one sine-wave (often called an "LFO") to control the pitch of another one:

(define vibrato-tone
  (network ()
           [lfo <= sine-wave 2]
           [sin <= sine-wave (+ 400 (* 50 lfo))]
           [out = (* 0.1 sin)]))
(signal-play vibrato-tone)
(sleep 5)
(stop)

There are many built-in signals. Note that these are documented as though they were procedures, but they’re not; they can be used in a procedure-like way in network clauses. Otherwise, they will behave as opaque values; you can pass them to various signal functions, etc.

Also note that all of these assume a fixed sample rate of 44.1 kHz.

syntax

(prev node-label init-val)

 
node-label = identifier
     
init-val = expression
Recognized specially in the network form, as documented above. It is an error to use prev outside of a network clause.

signal

(frame-ctr)  signal?

A signal that counts up from zero, representing the number of frames since the beginning of the signal.

signal

(sine-wave frequency)  real?

  frequency : nonnegative-number?
A signal representing a sine wave of the given frequency, at the default sample rate, of amplitude 1.0.

signal

(sawtooth-wave frequency)  real?

  frequency : nonnegative-number?
A signal representing a naive sawtooth wave of the given frequency, of amplitude 1.0. Note that since this is a simple -1.0 up to 1.0 sawtooth wave, it’s got horrible aliasing all over the spectrum.

signal

(square-wave frequency)  real?

  frequency : nonnegative-number?
A signal representing a naive square wave of the given frequency, of amplitude 1.0, at the default sample rate. It alternates between 1.0 and 0.0, which makes it more useful in, e.g., gating applications.

Also note that since this is a simple 1/0 square wave, it’s got horrible aliasing all over the spectrum.

signal

(pulse-wave duty-cycle frequency)  real?

  duty-cycle : real?
  frequency : nonnegative-number?
A signal representing a "pulse wave", with part of the signal at 1.0 and the rest of the signal at 0.0. The duty-cycle determines the fraction of the cycle that is 1.0. So, for instance, when duty-cycle is 0.5, the result is a square wave.
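Here is a rough model of what the duty cycle means, computed frame by frame in plain Racket. It is a sketch under assumptions: pulse-value is a hypothetical helper, the frame rate is passed explicitly, and putting the high part first in each cycle is my guess at the phase convention.

```racket
#lang racket/base

;; Model only (not RSound's implementation): the value of a pulse wave at
;; a given frame. Assumes the high part comes first in each cycle.
(define (pulse-value duty-cycle frequency frame frame-rate)
  (define cycles (/ (* frequency frame) frame-rate))
  (define phase (- cycles (floor cycles))) ; fraction of the current cycle
  (if (< phase duty-cycle) 1.0 0.0))

;; one cycle of a 1 Hz pulse at 8 frames per second, duty cycle 0.25
(for/list ([f (in-range 8)])
  (pulse-value 0.25 1 f 8)) ; => '(1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0)
```

With a duty cycle of 0.5, the first four of those eight frames would be 1.0 and the last four 0.0, which is the square wave case.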

signal

(dc-signal amplitude)  real?

  amplitude : real?
A constant signal at amplitude. Inaudible unless used to multiply by another signal.

The following are functions that return signals.

procedure

(simple-ctr init skip)  signal?

  init : real?
  skip : real?
Produces a signal whose value starts at init and increases by skip at each frame.

procedure

(loop-ctr len skip)  signal?

  len : real?
  skip : real?
Produces a signal whose value starts at 0.0 and increases by skip at each frame, subtracting len when the value rises above len.

procedure

(loop-ctr/variable len)  signal?

  len : real?
Produces a signal whose value starts at 0.0 and increases by skip at each frame, subtracting len when the value rises above len. In this case, the skip value is supplied dynamically.

In order to listen to them, you can transform them into rsounds, or play them directly:

procedure

(signal->rsound frames signal)  rsound?

  frames : nonnegative-integer?
  signal : signal?
Builds a sound of length frames at the default sample-rate by using signal. Both channels are identical.

Here’s an example of using it:

(define sig1
  (network ()
           [a <= sine-wave 560]
           [out = (* 0.1 a)]))
 
(define r (signal->rsound 44100 sig1))
 
(play r)

procedure

(signals->rsound frames left-sig right-sig)  rsound?

  frames : nonnegative-integer?
  left-sig : signal?
  right-sig : signal?
Builds a stereo sound of length frames by using left-sig and right-sig to generate the samples for the left and right channels.

procedure

(rs-filter sound filter)  rsound?

  sound : rsound?
  filter : filter?
Applies the given filter to the given sound to produce a new sound. The sound’s channels are processed independently. The new sound is of the same length as the old sound.

procedure

(signal-play signal)  void?

  signal : signal?
Plays a (single-channel) signal. Halt playback using (stop).

There are several functions that produce signals.

procedure

(indexed-signal time->amplitude)  signal?

  time->amplitude : procedure?
Given a mapping from time (in frames) to amplitude, return a signal. In prior versions of RSound, such a mapping was called a signal. This function converts those functions into new-style signals.

procedure

(fader fade-samples)  signal?

  fade-samples : number?
Produces a signal that decays exponentially. After fade-samples, its value is 0.001. Inaudible unless used to multiply by another signal.
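A value of 0.001 corresponds to a drop of 60 dB in amplitude. One way to picture the decay is the following plain Racket model; fader-value is a hypothetical helper, and starting from 1.0 at frame 0 is my assumption about the signal's initial value.

```racket
#lang racket/base

;; Model only (not RSound's implementation): an exponential decay that
;; starts at 1.0 (an assumption) and reaches 0.001 after fade-frames frames.
(define (fader-value fade-frames frame)
  (expt 0.001 (exact->inexact (/ frame fade-frames))))

(fader-value 44100 0)     ; => 1.0
(fader-value 44100 22050) ; about 0.0316, halfway there on a log scale
(fader-value 44100 44100) ; about 0.001
```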

There are also a number of functions that combine existing signals, called "signal combinators":

procedure

(signal+ a b)  signal?

  a : signal?
  b : signal?
Produces the signal that is the sum of the two input signals.

procedure

(signal-+s signals)  signal?

  signals : (listof signal?)
Produces the signal that is the sum of the list of input signals.

procedure

(signal* a b)  signal?

  a : signal?
  b : signal?
Produces the signal that is the product of the two input signals.

procedure

(signal-*s signals)  signal?

  signals : (listof signal?)
Produces the signal that is the product of the list of input signals.

We can turn an rsound back into a signal, using rsound->signal/left or rsound->signal/right:

procedure

(rsound->signal/left rsound)  signal?

  rsound : rsound?
Produces the signal that corresponds to the rsound’s left channel, followed by endless silence. Ah, endless silence.

procedure

(rsound->signal/right rsound)  signal?

  rsound : rsound?
Produces the signal that corresponds to the rsound’s right channel, followed by endless silence. (The silence joke wouldn’t be funny if I made it again.)

procedure

(thresh/signal threshold signal)  signal?

  threshold : real-number?
  signal : signal?
Applies a threshold (see thresh, below) to a signal.

procedure

(clip&volume volume signal)  signal?

  volume : real-number?
  signal : signal?
Clips the signal to a threshold of 1, then multiplies by the given volume.

Where should these go?

procedure

(thresh threshold input)  real-number?

  threshold : real-number?
  input : real-number?
Produces the number in the range (- threshold) to threshold that is closest to input. Put differently, it “clips” the input at the threshold.
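The clipping described here, and the clip-then-scale behavior described for clip&volume, can be modeled on single numbers in plain Racket. The names clip-to and clip-and-scale are hypothetical, chosen to avoid suggesting that this is the library's own code.

```racket
#lang racket/base

;; Model of the clipping described for thresh: the closest number to
;; input within [-threshold, threshold].
(define (clip-to threshold input)
  (max (- threshold) (min threshold input)))

;; Model of the description of clip&volume, applied to one sample:
;; clip to a threshold of 1, then multiply by the volume.
(define (clip-and-scale volume input)
  (* volume (clip-to 1.0 input)))

(clip-to 0.5 0.8)        ; => 0.5
(clip-to 0.5 -0.8)       ; => -0.5
(clip-to 0.5 0.2)        ; => 0.2
(clip-and-scale 0.1 3.0) ; => 0.1
```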

Finally, here’s a predicate. This could be a full-on contract, but I’m afraid of the overhead.

procedure

(signal? maybe-signal)  boolean?

  maybe-signal : any/c
Is the given value a signal? More precisely, is the given value a procedure whose arity includes 0, or a network that takes zero inputs?

procedure

(filter? maybe-filter)  boolean?

  maybe-filter : any/c
Is the given value a filter? More precisely, is the given value a procedure whose arity includes 1, or a network that takes one input?

6.1 Signal/Blocks

The signal/block interface can speed up sound generation, by allowing a signal to generate a block of samples at once. This is particularly valuable when it is possible for signals to use C-level primitives to copy blocks of samples.

UNFINISHED:

procedure

(signal/block-play signal/block    
  sample-rate    
  #:buffer-time buffer-time)  any
  signal/block : signal/block/unsafe?
  sample-rate : positive-integer?
  buffer-time : (or/c nonnegative-number? #f)
Plays a signal/block/unsafe.

7 Visualizing Rsounds

 (require rsound/draw) package: rsound

procedure

(rs-draw rsound    
  #:title title    
  #:parent parent    
  [#:width width    
  #:height height])  void?
  rsound : rsound?
  title : string?
  parent : 
(or/c (is-a?/c frame%)
      (is-a?/c dialog%)
      (is-a?/c panel%)
      (is-a?/c pane%))
  width : nonnegative-integer? = 800
  height : nonnegative-integer? = 200
Displays a new window containing a visual representation of the sound as a waveform.

procedure

(rsound-fft-draw rsound    
  #:zoom-freq zoom-freq    
  #:title title    
  [#:width width    
  #:height height])  void?
  rsound : rsound?
  zoom-freq : nonnegative-real?
  title : string?
  width : nonnegative-integer? = 800
  height : nonnegative-integer? = 200
Draws an FFT of the sound by breaking it into windows of 2048 samples and performing an FFT on each. Each FFT is represented as a column of gray rectangles, where darker grays indicate more of the given frequency band.

procedure

(rsound/left-1-fft-draw rsound    
  #:title title    
  [#:width width    
  #:height height])  void?
  rsound : rsound?
  title : string?
  width : nonnegative-integer? = 800
  height : nonnegative-integer? = 200
Draws an FFT of the left channel of the sound, and displays it as a pair of graphs, one for magnitude and one for phase. The whole sound is processed as a single FFT frame, so it must be of a length that is a power of 2, and using a sound of more than 16384 frames could be slow.

procedure

(vector-pair-draw/magnitude left    
  right    
  #:title title    
  [#:width width    
  #:height height])  void?
  left : (fcarrayof complex?)
  right : (vectorof complex?)
  title : string?
  width : nonnegative-integer? = 800
  height : nonnegative-integer? = 200
Displays a new window containing a visual representation of the two vectors’ magnitudes as a waveform. The lines connecting the dots are really somewhat inappropriate in the frequency domain, but they aid visibility.

procedure

(vector-draw/real/imag vec    
  #:title title    
  [#:width width    
  #:height height])  void?
  vec : (fcarrayof complex?)
  title : string?
  width : nonnegative-integer? = 800
  height : nonnegative-integer? = 200
Displays a new window containing a visual representation of the vector’s real and imaginary parts as a waveform.

8 RSound Utilities

procedure

(make-harm3tone frequency    
  volume?    
  frames    
  frame-rate)  rsound?
  frequency : nonnegative-number?
  volume? : nonnegative-number?
  frames : nonnegative-integer?
  frame-rate : nonnegative-number?
Produces an rsound containing a semi-percussive tone of the given frequency, frames, and volume. The tone contains the first three harmonics of the specified frequency. This function is memoized, so that subsequent calls with the same parameters will return existing values, rather than recomputing them each time.

procedure

(make-tone pitch volume duration)  rsound?

  pitch : nonnegative-number?
  volume : nonnegative-number?
  duration : nonnegative-exact-integer?
Given a pitch in Hz, a volume between 0.0 and 1.0, and a duration in frames, returns the rsound consisting of a pure sine wave tone using the specified parameters.

procedure

(rs-fft/left rsound)  (fcarrayof complex?)

  rsound : rsound?
Produces the complex-valued vector that represents the fourier transform of the rsound’s left channel. The sound’s length must be a power of two.

The FFT takes time N*log(N) in the size of the input, so runtimes will be super-linear, but on a modern machine even FFTs of 32K points take on the order of 100 ms.

Changed in version 20151120.0 of package rsound: Was named rsound-fft/left.

procedure

(rs-fft/right rsound)  (fcarrayof complex?)

  rsound : rsound?
Produces the complex-valued vector that represents the fourier transform of the rsound’s right channel. The sound’s length must be a power of two.

The FFT takes time N*log(N) in the size of the input, so runtimes will be super-linear, but on a modern machine even FFTs of 32K points take on the order of 100 ms.

Changed in version 20151120.0 of package rsound: Was named rsound-fft/right.

procedure

(midi-note-num->pitch note-num)  number?

  note-num : nonnegative-integer?
Returns the frequency (in Hz) that corresponds to a given midi note number. Here’s the top-secret formula: 440*2^((n-69)/12).

procedure

(pitch->midi-note-num pitch)  nonnegative-real?

  pitch : nonnegative-real?
Returns the midi note number that corresponds to a given frequency (in Hz). Inverse of the previous function.
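For reference, here are the formula and its inverse written out in plain Racket. These are stand-alone sketches with hypothetical names (note-num->hz and hz->note-num), not the library's definitions.

```racket
#lang racket/base

;; The formula above: 440 * 2^((n - 69) / 12).
(define (note-num->hz n)
  (* 440 (expt 2 (/ (- n 69) 12))))

;; Its inverse: 69 + 12 * log2(hz / 440).
(define (hz->note-num hz)
  (+ 69 (* 12 (/ (log (/ hz 440)) (log 2)))))

(note-num->hz 69)  ; => 440 (the A above middle C)
(note-num->hz 57)  ; => 220 (one octave down)
(note-num->hz 60)  ; about 261.63 (middle C)
(hz->note-num 880) ; about 81 (one octave above note 69)
```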

procedure

(andplay snd val)  any/c

  snd : rsound?
  val : any/c
Plays the given sound and evaluates to the value. Unlike play, andplay can be used in an expression context.

9 Piano Tones

This module provides functions that generate resampled piano tones. The sources for these are recordings made by the University of Iowa’s Electronic Music Studios, which graciously makes its data available for re-use. In particular, rsound uses samples of c3, c4, c5, and c6, and resamples as needed.

procedure

(piano-tone midi-note-num)  rsound?

  midi-note-num : number?
Returns an rsound containing a recording of a piano note at the given midi note number, resampled from a nearby one. The notes are fairly long (about three seconds), though the exact length naturally depends on the length of the recorded note and the resampling factor.

This function is memoized, to speed loading.

10 Envelopes

procedure

(sine-window len fade-in)  rsound?

  len : frames?
  fade-in : frames?
Generates an rsound of length len + fade-in representing a window with sine-shaped fade-in and fade-out. The fade-in and fade-out periods are identical, and have half-overlap with the center section. Er... that could be worded better.

procedure

(hann-window len)  rsound?

  len : frames?
Generates an rsound of length len representing a window with sine-shaped fade-in and fade-out, with no flat part in the middle. This is often called the "Hann" window, and is useful when applying the FFT. Strictly speaking, this one differs (indistinguishably, I believe) from the one specified by Wikipedia in that it hits zero at the length, not at length-1.
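To make the "hits zero at the length" remark concrete, here is a small sketch of the window shape in plain Racket. It assumes the standard raised-cosine formula 0.5*(1 - cos(2*pi*i/len)); hann-value and hann-values are hypothetical helpers, and the exact values that hann-window stores in its rsound may differ.

```racket
#lang racket/base
(require racket/math) ; for pi

;; Hypothetical helper: the i-th value of a len-point raised-cosine
;; window whose next zero would fall at i = len (not at len - 1).
(define (hann-value i len)
  (* 0.5 (- 1.0 (cos (/ (* 2.0 pi i) len)))))

;; All len values of the window, as a list.
(define (hann-values len)
  (for/list ([i (in-range len)])
    (hann-value i len)))

(hann-values 8) ; starts at 0, peaks at 1 in the middle, then falls again
```

With this form, the last stored value (at index len - 1) is small but nonzero, which is the difference from the Wikipedia definition mentioned above.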

11 Frequency Response

This module provides functions for analyzing the frequency response of filters specified either as transfer functions or as lists of poles and zeros. It assumes a sample rate of 44.1 kHz.

procedure

(response-plot poly dbrel min-freq max-freq)  void?

  poly : procedure?
  dbrel : real?
  min-freq : real?
  max-freq : real?
Plot the frequency response of a filter, given its transfer function (a function mapping reals to reals). The dbrel number indicates how many decibels up the "zero" line should be shifted. The graph starts at min-freq Hz and goes up to max-freq Hz. Note that aliasing effects may affect the apparent height or depth of narrow spikes.

Here’s an example of calling this function on a 100-pole comb filter, showing the response from 10 kHz to 11 kHz:
(response-plot (lambda (z)
                 (/ 1 (- 1 (* 0.95 (expt z -100)))))
               30 10000 11000)

procedure

(poles&zeros->fun poles zeros)  procedure?

  poles : (listof real?)
  zeros : (listof real?)
Given a list of poles and zeros in the complex plane, generates the corresponding transfer function.

Here’s an example of calling this function as part of a call to response-plot, for a filter with three poles and two zeros, from 0 Hz up to the Nyquist frequency, 22.05 kHz:
(response-plot (poles&zeros->fun '(0.5 0.5+0.5i 0.5-0.5i) '(0+1i 0-1i))
               40
               0
               22050)
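For readers who want to see what such a transfer function looks like, here is a minimal sketch in plain Racket. It assumes the textbook form H(z) = product of (z - zero) over product of (z - pole), evaluated on the unit circle; pz->fun and freq->z are hypothetical names, and the function returned by poles&zeros->fun may differ in details such as normalization.

```racket
#lang racket/base
(require racket/math) ; for pi

;; Hypothetical stand-in for poles&zeros->fun:
;; H(z) = (z - zero1)(z - zero2)... / (z - pole1)(z - pole2)...
(define (pz->fun poles zeros)
  (lambda (z)
    (/ (for/product ([q (in-list zeros)]) (- z q))
       (for/product ([p (in-list poles)]) (- z p)))))

;; Hypothetical helper: the point on the unit circle for a frequency
;; in Hz at the given frame rate.
(define (freq->z freq frame-rate)
  (make-polar 1.0 (/ (* 2 pi freq) frame-rate)))

(define h (pz->fun '(0.5 0.5+0.5i 0.5-0.5i) '(0+1i 0-1i)))

(magnitude (h (freq->z 0 44100)))     ; gain at 0 Hz, about 8
(magnitude (h (freq->z 11025 44100))) ; about 0: a zero sits at this frequency
(magnitude (h (freq->z 22050 44100))) ; gain at the Nyquist frequency, about 0.53
```

The zeros at +i and -i lie on the unit circle at a quarter of the frame rate, so the response should dip sharply near 11.025 kHz.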

12 Filtering

RSound provides a dynamic low-pass filter, among other things.

procedure

(fir-filter delay-lines)  network?

  delay-lines : (listof (list/c nonnegative-exact-integer? real-number?))
Given a list of delay times (in frames) and amplitudes for each, produces a function that maps signals to new signals where each frame is the sum of the current signal frame and the multiplied versions of the delayed input signals (that’s what makes it FIR).

So, for instance,

(fir-filter (list (list 13 0.4) (list 4 0.1)))

...would produce a filter that added the current frame to 4/10 of the input frame 13 frames ago and 1/10 of the input frame 4 frames ago.
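The same arithmetic can be tried offline on a vector of samples. This is a sketch only: fir-apply is a hypothetical helper that takes the same list-of-(delay amplitude) shape as fir-filter, and it assumes that input frames before the start of the vector are silence.

```racket
#lang racket/base

;; Hypothetical offline FIR, not rsound's implementation.
;; taps : list of (list delay-in-frames amplitude)
;; in   : vector of input samples
(define (fir-apply taps in)
  (for/vector #:length (vector-length in) ([n (in-range (vector-length in))])
    ;; start from the current input frame, then add each delayed input
    (for/fold ([acc (vector-ref in n)]) ([tap (in-list taps)])
      (define d (car tap))
      (define amp (cadr tap))
      (if (>= n d)
          (+ acc (* amp (vector-ref in (- n d))))
          acc))))

;; Feeding in a single impulse shows the filter's impulse response:
;; the impulse itself, then 0.4 two frames later and 0.1 four frames later.
(fir-apply '((4 0.1) (2 0.4)) (vector 1 0 0 0 0 0))
```

The response dies out after the longest delay, which is the "finite" in FIR.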

procedure

(iir-filter delay-lines)  network?

  delay-lines : (listof (list/c nonnegative-exact-integer? real-number?))
Given a list of delay times (in frames) and amplitudes for each, produces a function that maps signals to new signals where each frame is the sum of the current signal frame and the multiplied versions of the delayed output signals (that’s what makes it IIR).

So, for instance,

(iir-filter (list (list 13 0.4) (list 4 0.1)))

...would produce a filter that added the current frame to 4/10 of the output frame 13 frames ago and 1/10 of the output frame 4 frames ago.
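For comparison with the FIR case, here is the feedback version as an offline sketch. iir-apply is a hypothetical helper, not rsound's implementation; it assumes every delay is at least 1 frame and that output frames before the start are silence.

```racket
#lang racket/base

;; Hypothetical offline IIR, not rsound's implementation.
;; taps : list of (list delay-in-frames amplitude), delays assumed >= 1
;; in   : vector of input samples
(define (iir-apply taps in)
  (define len (vector-length in))
  (define out (make-vector len 0))
  (for ([n (in-range len)])
    (vector-set!
     out n
     ;; start from the current input frame, then add each delayed OUTPUT
     (for/fold ([acc (vector-ref in n)]) ([tap (in-list taps)])
       (define d (car tap))
       (define amp (cadr tap))
       (if (>= n d)
           (+ acc (* amp (vector-ref out (- n d))))
           acc))))
  out)

;; A single impulse keeps echoing every 2 frames, halving each time.
(iir-apply '((2 0.5)) (vector 1 0 0 0 0 0 0))
```

Because each output feeds back into later outputs, the echoes never reach exactly zero in principle, which is the "infinite" in IIR.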

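Here’s an example of code that uses a simple comb filter to extract a 3-second buzzing sound from noise. The pitch of the buzz is the frame rate divided by the delay, so the 147-frame delay used here gives 300 Hz at a 44.1 kHz frame rate (and about 327 Hz at 48 kHz):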

(define comb-level 0.99)
 
(play
 (signal->rsound
  (* 48000 3)
  (network ()
           [r = (random)]    ; a random number from 0 to 1
           [r2 = (* r 0.1)]  ; scaled to make it less noisy
                             ; apply the comb filter:
           [o2 <= (iir-filter (list (list 147 comb-level))) r]
                             ; compensate for the filter's gain:
           [out = (* (- 1 comb-level) o2)])))
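The two constants in that example, the delay of 147 and the (- 1 comb-level) scaling, follow from simple arithmetic. The helpers below are hypothetical (they are not part of rsound) and are only meant to show where the numbers come from.

```racket
#lang racket/base

;; Hypothetical helpers, not part of rsound.

;; Feedback delay (in frames) whose comb resonates at freq Hz.
(define (comb-delay frame-rate freq)
  (inexact->exact (round (/ frame-rate freq))))

;; Fundamental frequency (in Hz) of a comb with the given delay.
(define (comb-frequency frame-rate delay)
  (/ frame-rate delay))

;; Peak gain of y[n] = x[n] + g*y[n-d], for 0 <= g < 1:
;; 1 + g + g^2 + ... = 1/(1 - g)
(define (comb-peak-gain g)
  (/ 1 (- 1 g)))

(comb-delay 44100 300)                      ; 147 frames
(comb-delay 48000 300)                      ; 160 frames
(exact->inexact (comb-frequency 48000 147)) ; roughly 326.5 Hz
(comb-peak-gain 0.99)                       ; roughly 100
```

Multiplying the filter output by (- 1 comb-level) therefore brings the resonant peaks back down to roughly the level of the input.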

signal

(lpf/dynamic control input)  signal?

  control : number?
  input : number?
The control signal must produce real numbers in the range 0.01 to 3.0. A small number produces a low cutoff frequency. The input signal is the audio signal to be processed. For instance, here’s a time-varying low-pass filtered sawtooth:

(signal->rsound
 88200
 (network ()
          [f <= (simple-ctr 0 1)]
          [sawtooth = (/ (modulo f 220) 220)]
          [control = (+ 0.5 (* 0.2 (sin (* f 7.123792865282977e-05))))]
          [out <= lpf/dynamic control sawtooth]))
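The long constant in the control clause appears to be 2*pi/88200, which makes the sine complete exactly one cycle over the 88200 frames of the sound. The sketch below recomputes it and evaluates the control value at a few frame indices; total-frames, step, and control-at are hypothetical names used only for this illustration.

```racket
#lang racket/base
(require racket/math) ; for pi

(define total-frames 88200)

;; Phase step per frame for one full sine cycle over total-frames.
(define step (/ (* 2 pi) total-frames)) ; about 7.1238e-05

;; The control expression from the example, as a plain function
;; of the frame index f.
(define (control-at f)
  (+ 0.5 (* 0.2 (sin (* f step)))))

(control-at 0)                        ; 0.5
(control-at (/ total-frames 4))       ; close to 0.7, the maximum
(control-at (* 3 (/ total-frames 4))) ; close to 0.3, the minimum
```

The control value stays between about 0.3 and 0.7, comfortably inside the 0.01 to 3.0 range that lpf/dynamic requires.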

signal

(reverb input)  number?

  input : number?
Apply a nice basic reverb to the input. Uses the algorithm and the constants from Moorer 1979.

13 Single-cycle sounds

This module provides support for generating tones from single-cycle waveforms.
In particular, it comes with a library of 247 such waveforms, courtesy of Adventure Kid’s website. Used with permission. Thanks!

procedure

(synth-note family    
  spec    
  midi-note-number    
  duration)  rsound?
  family : string?
  spec : number-or-path?
  midi-note-number : natural?
  duration : natural?
Given a family (currently either "main", "vgame", or "path"), a spec (a number in the first two cases), a midi note number and a duration in frames, produces an rsound. There’s a (non-configurable) envelope applied, too.

Example, playing sound #49 from the vgame package at middle C for 22010 frames (roughly half a second at a 44.1 kHz frame rate):

(synth-note "vgame" 49 60 22010)

procedure

(synth-note/raw family    
  spec    
  midi-note-number    
  duration)  rsound?
  family : string?
  spec : number-or-path?
  midi-note-number : natural?
  duration : natural?
Same as synth-note, but no envelope is applied.

procedure

(synth-waveform family spec)  rsound?

  family : string?
  spec : number-or-path?
Given a family and a spec, produce an rsound representing the waveform; that is, a one-second-long, 1 Hz tone.

14 Helper Functions

procedure

(nonnegative-integer? v)  boolean?

  v : any
Returns true for nonnegative integers.

procedure

(positive-integer? v)  boolean?

  v : any
Returns true for strictly positive integers.

15 Configuration

procedure

(diagnose-sound-playing)  void?

Tries playing a short tone using all of the available APIs and several plausible sample rates. It tries to offer a helpful message, along with the test.

procedure

(all-host-apis)  (listof symbol?)

Returns a list of symbols representing host APIs supported by the underlying system. This is a re-export from the portaudio package.

parameter

(host-api)  symbol?

(host-api api)  void?
  api : symbol?
A parameter that instructs portaudio to choose a particular API to use in playing sounds. If its value is false, portaudio chooses one.

procedure

(set-host-api! api)  void?

  api : (or/c false? string?)
A version of the host-api parameter that can be used in the teaching languages (because it’s a regular procedure). It instructs portaudio to choose a particular API to use in playing sounds. If api is false, portaudio chooses one.

procedure

(display-device-table)  void?

Display a table listing all of the available devices: what host API they’re associated with, what their names are, and the maximum number of input and output channels associated with each one.

procedure

(set-output-device! index)  void?

  index : (or/c false? natural?)
Choose a specific device index number for use with portaudio. Note that this choice supersedes the host-api choice.

16 Fsounds

As part of a different project, I want a way to manipulate sounds as vectors of doubles. To handle this, I’ve copied and updated a bunch of rsound code, to make it work with vectors of doubles. As time passes and memory gets cheaper, I expect at some point simply to switch over to using these sounds everywhere.

procedure

(rsound->fsound rs)  fsound?

  rs : rsound?
Turn an rsound into an fsound. The result may be much smaller than the original, because of lazy clipping.

procedure

(fsound->rsound fs)  rsound?

  fs : fsound?
Turn an fsound into an rsound. The result is guaranteed to be at most 1/4 the size, but may be smaller, because of lazy clipping. Naturally, this is an extremely lossy conversion, because doubles hold more information.

procedure

(vector->fsound fs sample-rate)  fsound?

  fs : (vectorof real?)
  sample-rate : exact-positive-integer?
Given a vector of real numbers and a sample rate, produce an fsound where the samples in the left and right channels are identical and given by the elements of the vector fs, using the given sample rate.

17 Sample Code

An example of a signal that plays two lines, each with randomly changing square-wave tones. This one runs in the Intermediate student language:

(require rsound)
 
; scrobble: number number number -> signal
; return a signal that generates square-wave tones, changing
; at the given interval into a new randomly-chosen frequency
; between lo-f and hi-f
(define (scrobble change-interval lo-f hi-f)
  (local
    [(define freq-range (floor (- hi-f lo-f)))
     (define (maybe-change f l)
       (cond [(= l 0) (+ lo-f (random freq-range))]
             [else f]))]
    (network ()
             [looper <= (loop-ctr change-interval 1)]
             [freq = (maybe-change (prev freq 400) looper)]
             [a <= square-wave freq])))
 
(define my-signal
  (network ()
           [a <= (scrobble 4000 200 600)]
           [b <= (scrobble 40000 100 200)]
           [lpf-wave <= sine-wave 0.1]
           [c <= lpf/dynamic (max 0.01 (abs (* 0.5 lpf-wave))) (+ a b)]
           [b = (* c 0.1)]))
 
; write 20 seconds to a file, if uncommented:
; (rs-write (signal->rsound (* 20 48000) my-signal) "/tmp/foo.wav")
 
; play the signal
(signal-play my-signal)

An example of a signal that plays from one of the single-cycle vgame tones:

#lang racket
 
(require rsound)
 
(define waveform (synth-waveform "vgame" 4))
 
; wrap i around when it goes off the end:
(define (maybe-wrap i)
  (cond [(< i 48000) i]
        [else (- i 48000)]))
 
; a signal that plays from a waveform:
(define loop-sig
  (network (pitch)
    [i = (maybe-wrap (+ (prev i 0) (round pitch)))]
    [out = (rs-ith/left waveform i)]))
 
(signal-play
 (network ()
   [alternator <= square-wave 2]
   [s <= loop-sig (+ (* 200 (inexact->exact alternator)) 400)]
   [out = (* s 0.1)]))

18 Drum Samples

RSound comes with a few simple drum samples.

value

kick : rsound?

value

bassdrum : rsound?

value

o-hi-hat : rsound?

value

clap-1 : rsound?

value

clap-2 : rsound?

value

snare : rsound?

value

click-1 : rsound?

value

click-2 : rsound?

19 Reporting Bugs

For Heaven’s sake, report lots of bugs!