Nokia Composer Ringtone Synthesizer in 512 Bytes

A little bit of nostalgia in our new translation: let's try to write our own Nokia Composer and compose a melody with it.


Did any of you use an old Nokia, such as the 3310 or 3210? Then you should remember one of its great features: the ability to compose your own ringtones right on the phone keypad. By arranging notes and pauses in the desired order, you could play a popular melody through the phone speaker and even share your creation with friends! If you missed that era, this is what it looked like:







Not impressed? Trust me, it felt really cool back then, especially for those who were into music.



The music notation format used in Nokia Composer is known as RTTTL (Ring Tone Text Transfer Language). RTTTL is still widely used by hobbyists to play monophonic melodies on Arduino boards and the like.



RTTTL allows you to write music for only one voice: notes can only be played sequentially, without chords or polyphony. However, this limitation turned out to be a killer feature, since such a format is easy to write, read, parse and play back.



In this article, we'll try to create an RTTTL player in JavaScript, adding a little bit of code golfing and math to keep the code as short as possible, just for fun.



Parsing RTTTL



RTTTL has a formal grammar. An RTTTL string consists of three parts: the name of the melody, its default settings (tempo in BPM, that is, beats per minute, plus the default octave and note duration), and the melody itself. However, we will simulate the behavior of Nokia Composer itself: we parse only the melody part and take the BPM tempo as a separate input parameter. The melody name and the default settings are outside the scope of this article.



A melody is simply a sequence of notes and rests separated by commas and optional spaces. Each note consists of a duration (1/2/4/8/16/32/64), a pitch (c/d/e/f/g/a/b), an optional sharp sign (#, written before the pitch) and an octave number (from 1 to 3, since only three octaves are supported).



The easiest way to parse it is with a regular expression. Modern browsers come with a very handy matchAll method that returns an iterator over all matches in a string:



const play = s => {
  for (m of s.matchAll(/(\d*)?(\.?)(#?)([a-g-])(\d*)/g)) {
    // m[1] is optional note duration
    // m[2] is optional dot in note duration
    // m[3] is optional sharp sign, yes, it goes before the note
    // m[4] is note itself
    // m[5] is optional octave number
  }
};
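To see what the loop body receives, here is a small sketch that collects the capture groups for a sample melody. The `tokenize` helper, its field names and the sample string are assumptions made for illustration; they are not part of the player. Optional parts that are absent come back as empty or undefined values, both of which are falsy.

```javascript
// Hypothetical helper: one object per note or rest, fields taken from m[1]..m[5].
const tokenize = s =>
  [...s.matchAll(/(\d*)?(\.?)(#?)([a-g-])(\d*)/g)].map(m => ({
    duration: m[1], // digits before the note, falsy when omitted
    dot: m[2],      // '.' for a dotted note
    sharp: m[3],    // '#' for a sharp
    note: m[4],     // 'a'..'g', or '-' for a rest
    octave: m[5],   // digits after the note, falsy when omitted
  }));

const tokens = tokenize('8c2, 4.#f1, 16-');
console.log(tokens.map(t => t.note).join(' ')); // c f -
```

Commas and spaces never match the pattern, so they are skipped without any extra code.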
      
      





The first thing to figure out about each note is how to convert it to a sound frequency. Of course, we could create a lookup table for all seven note letters. But since these letters are consecutive, it is easier to treat them as numbers. For each note letter we take its character code (ASCII code). For "A" this is 0x41, and for "a" it is 0x61. For "B/b" it is 0x42/0x62, for "C/c" it is 0x43/0x63, and so on:



// 'k' is an ASCII code of the note:
// A..G = 0x41..0x47
// a..g = 0x61..0x67
let k = m[4].charCodeAt();
      
      





We can drop the high bits and use only k & 7 as the note index (a = 1, b = 2, …, g = 7). What's next? The next stage is not very pleasant, as it involves music theory. We have only 7 note letters, but an octave contains 12 semitones. This is because the sharp/flat notes are spread unevenly between the natural notes:



         A#        C#    D#       F#    G#    A#         <- black keys
      A     B | C     D     E  F     G     A     B | C   <- white keys
      --------+------------------------------------+---
k&7:  1     2 | 3     4     5  6     7     1     2 | 3
      --------+------------------------------------+---
note: 9 10 11 | 0  1  2  3  4  5  6  7  8  9 10 11 | 0
      
      





As you can see, the note index in octave increases faster than the note code (k & 7). In addition, it increases non-linearly: the distance between E and F or between B and C is 1 semitone, not 2 as between the other notes.



Intuitively, we can try multiplying (k & 7) by 12/7 (12 semitones and 7 notes):



note:          a     b     c     d     e      f     g
(k&7)*12/7: 1.71  3.42  5.14  6.85  8.57  10.28  12.0
      
      





If we look at these numbers without the decimal places, we will immediately notice that they are non-linear, as we expected:



note:                 a     b     c     d     e      f     g
(k&7)*12/7:        1.71  3.42  5.14  6.85  8.57  10.28  12.0
floor((k&7)*12/7):    1     3     5     6     8     10    12
                                  -------
      
      





Close, but not quite. The semitone steps should be between B/C and E/F, not between C/D. Let's try other ratios (semitone steps are underlined):



note:              a     b     c     d     e      f     g
floor((k&7)*1.8):  1     3     5     7     9     10    12
                                           --------

floor((k&7)*1.7):  1     3     5     6     8     10    11
                               -------           --------

floor((k&7)*1.6):  1     3     4     6     8      9    11
                         -------           --------

floor((k&7)*1.5):  1     3     4     6     7      9    10
                         -------     -------      -------
      
      





It is clear that the values 1.8 and 1.5 are not suitable: the first has only one semitone step, and the second has too many. The other two, 1.6 and 1.7, both give two semitone steps, as a major scale should have. With 1.7 they fall between C/D and F/G, while 1.6 puts them between B/C and E/F (A-BC-D-EF-G). Just what we need!
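The table can be double-checked with a few lines of code. This is only a sketch: `semitoneSteps` is a made-up helper that lists which neighbouring letters end up exactly one semitone apart for a given ratio.

```javascript
const letters = 'abcdefg';

// For a candidate ratio, return the adjacent letter pairs whose floored
// indices differ by exactly 1 (a semitone step).
const semitoneSteps = ratio => {
  const idx = [...letters].map(ch => Math.floor((ch.charCodeAt() & 7) * ratio));
  const steps = [];
  for (let i = 1; i < idx.length; i++) {
    if (idx[i] - idx[i - 1] === 1) steps.push(letters[i - 1] + letters[i]);
  }
  return steps;
};

for (const ratio of [1.8, 1.7, 1.6, 1.5]) {
  console.log(ratio, semitoneSteps(ratio).join(' '));
}
// 1.8 ef
// 1.7 cd fg
// 1.6 bc ef
// 1.5 bc de fg
```

Only 1.6 yields exactly the two steps we are after, B/C and E/F.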



Now we need to shift the values so that C is 0, D is 2, E is 4, F is 5, and so on. That means an offset of 4 semitones down, but subtracting 4 would put A below C, so instead we add 8 and take the result modulo 12 to wrap it back into the octave:



let n = (0 | ((k&7) * 1.6 + 8)) % 12; // `0 |` drops the fractional part
// A  B C D E F G A  B C ...
// 9 11 0 2 4 5 7 9 11 0 ...
      
      





We also have to take into account the sharp sign, which is captured by the m[3] group of the regular expression. If it is present, we raise the note value by 1 semitone:



// !!m[3] is `true` if m[3] is '#', and the `+` sign converts it to `1`.
// If m[3] is empty (no sharp sign), it becomes `false` and therefore `0`.
// `0 |` drops the fractional part left over from multiplying by 1.6:
let n = (0 | ((k&7) * 1.6 + 8)) % 12 + !!m[3];

      
      





Finally, we must use the correct octave. The octave number is already captured as a string of digits in the regular expression group m[5]. According to music theory, each octave spans 12 semitones, so we can multiply the octave number by 12 and add it to the note value:



// n is a semitone index counted from the C one octave below octave 1.
// With octaves 1..3 it runs from 12 (C of octave 1) to 47 (B of octave 3).
let n =
  (0 | ((k&7) * 1.6 + 8) % 12) + // note index 0..11, fraction dropped
  !!m[3] +                       // sharp: 0 or 1
  m[5] * 12;                     // octave number
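Putting the three steps together, a minimal sketch of the whole calculation could look like this. The `noteIndex` name, its parameters and the default octave of 1 are assumptions for illustration. The fractional part is dropped with `0 |`, as the complete listing later in the article does.

```javascript
// Semitone index of a note: letter index, plus sharp, plus 12 per octave.
const noteIndex = (letter, sharp = '', octave = 1) => {
  const k = letter.charCodeAt();                      // ASCII code of the letter
  const inOctave = 0 | ((k & 7) * 1.6 + 8) % 12;      // 0..11, C = 0
  return inOctave + !!sharp + 12 * octave;
};

console.log(noteIndex('c'));        // 12
console.log(noteIndex('a', '#'));   // 22
console.log(noteIndex('b', '', 3)); // 47
```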
      
      





Clamping



What happens if someone specifies an octave number of 10 or 1000? This could push the pitch into ultrasound! We should only accept a valid range of values for such parameters. Restricting a number to lie between two bounds is commonly called "clamping". A Math.clamp(x, low, high) function has been proposed for JavaScript, but it is not something we can count on in browsers yet. The simplest alternative is:



clamp = (x, a, b) => Math.max(Math.min(x, b), a);
      
      





But because we're trying to keep our code as small as possible, we can reinvent the wheel and do without the Math functions. We default x to 0 so that clamping also works with undefined values:



clamp = (x=0, a, b) => (x < a && (x = a), x > b ? b : x);

clamp(0, 1, 3) // => 1
clamp(2, 1, 3) // => 2
clamp(8, 1, 3) // => 3
clamp(undefined, 1, 3) // => 1
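The two versions are not fully interchangeable, which matters because the regex groups are strings rather than numbers. The comparison below is a sketch; the names `clampMath` and `clampGolf` are invented here to tell the two variants apart.

```javascript
const clampMath = (x, a, b) => Math.max(Math.min(x, b), a);
const clampGolf = (x = 0, a, b) => (x < a && (x = a), x > b ? b : x);

// These are the kinds of values the regex groups actually produce.
for (const x of ['2', '9', '', undefined]) {
  console.log(JSON.stringify(x), clampMath(x, 1, 3), clampGolf(x, 1, 3));
}
// "2" 2 2        (the golfed version returns the string '2' unchanged)
// "9" 3 3
// "" 1 1
// undefined NaN 1
```

The golfed version hands an in-range string back as is, which is harmless here because the result is always used in arithmetic that converts it to a number.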
      
      





Note Tempo and Duration



We expect BPM to be passed as a parameter to our play() function. We just have to validate it:



bpm = clamp(bpm, 40, 400);
      
      





Now, to calculate how long a note should last in seconds, we take its musical duration (whole / half / quarter / …), which is stored in the regex group m[1]. We use the following formula:



note_duration = m[1]; // can be 1,2,4,8,16,32,64
// since BPM is "beats per minute", or usually "quarter note beats per minute",
// BPM/4 would be "whole notes per minute" and BPM/60/4 would be "whole
// notes per second":
whole_notes_per_second = bpm / 240;
duration = 1 / (whole_notes_per_second * note_duration);
      
      





If we combine these formulas into one and limit the duration of the note, we get:



// Assuming that default note duration is 4:
duration = 240 / bpm / clamp(m[1] || 4, 1, 64);
      
      





Also, don't forget that notes can be dotted, which increases the length of the note by 50%. We have the group m[2], whose value is either a dot (.) or an empty string. Applying the same trick that we used earlier for the sharp sign, we get:



// !!m[2] would be 1 if it's a dot, 0 otherwise
// 1+!!m[2]/2 would be 1 for normal notes and 1.5 for dotted notes
duration = 240 / bpm / clamp(m[1] || 4, 1, 64) * (1+!!m[2]/2);
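A few worked values make the formula easier to trust. In this sketch `noteSeconds` is a hypothetical wrapper around the expression above, taking the BPM and the raw m[1] and m[2] strings as arguments.

```javascript
const clamp = (x = 0, a, b) => (x < a && (x = a), x > b ? b : x);

// Duration in seconds for one note; `length` and `dot` are regex group strings.
const noteSeconds = (bpm, length, dot) =>
  240 / bpm / clamp(length || 4, 1, 64) * (1 + !!dot / 2);

console.log(noteSeconds(120, '4', ''));  // 0.5   quarter note at 120 BPM
console.log(noteSeconds(120, '8', '.')); // 0.375 dotted eighth
console.log(noteSeconds(120, '', ''));   // 0.5   no length given, quarter by default
console.log(noteSeconds(60, '1', ''));   // 4     whole note at 60 BPM
```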
      
      





Now we can calculate the index and duration of each note. It's time to use the WebAudio API to play a tune.



WebAudio



We only need 3 parts of the WebAudio API: an audio context, an oscillator to generate the sound wave and a gain node to switch the sound on and off. I will use a square-wave oscillator to make the melody sound like that awful old phone ringing:



// Osc -> Gain -> AudioContext
let audio = new (window.AudioContext || window.webkitAudioContext)();
let gain = audio.createGain();
let osc = audio.createOscillator();
osc.type = 'square';
osc.connect(gain);
gain.connect(audio.destination);
osc.start();
      
      





This code by itself will not create music yet, but since we parsed our RTTTL melody, we can tell WebAudio what note to play, when, with what frequency and for how long.



WebAudio parameters such as the oscillator's frequency and the gain node's gain are AudioParam objects, and each of them has a setValueAtTime method that schedules a value change at a given time.



If you remember, earlier in the article we already had the ASCII code for the note stored as k, the note index as n, and we had the duration of the note in seconds. Now, for each note, we can do the following:



t = 0; // current time counter, in seconds
for (m of ......) {
  // ....we parse notes here...

  // Note frequency is calculated as (F*2^(n/12)),
  // Where n is note index, and F is the frequency of n=0
  // We can use C2=65.41, or C3=130.81. C2 is a bit shorter.
  osc.frequency.setValueAtTime(65.4 * 2 ** (n / 12), t);
  // Turn on gain to 100%. Besides notes [a-g], `k` can also be a `-`,
  // which is a rest sign. `-` is 0x2d in ASCII, so, unlike the note letters,
  // it has bit 3 set: (k&8) is 0 for notes and 8 for a rest. If we invert `k`,
  // then (~k&8) is 8 for notes and 0 for a rest. Shifting it right by 3 gives
  // ((~k&8)>>3) = 1 for notes and 0 for rests.
  gain.gain.setValueAtTime((~k & 8) >> 3, t);
  // Increase the time marker by note duration
  t = t + duration;
  // Turn off the note
  gain.gain.setValueAtTime(0, t);
}
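Both formulas in the loop can be tried out without any audio hardware. This is a sketch with two invented helper names, `frequency` and `gainFor`, that simply wrap the expressions used above.

```javascript
// Equal-temperament frequency for semitone index n, with 65.4 Hz at n = 0.
const frequency = n => 65.4 * 2 ** (n / 12);

// 1 for the note letters, 0 for the rest sign '-'.
const gainFor = ch => (~ch.charCodeAt() & 8) >> 3;

console.log(frequency(0).toFixed(1));  // 65.4
console.log(frequency(12).toFixed(1)); // 130.8 (one octave up doubles the frequency)
console.log(frequency(21).toFixed(1)); // 220.0 (A, nine semitones above that C)
console.log([...'abcdefg-'].map(gainFor).join('')); // 11111110
```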
      
      





That's all. Our play() function can now play entire melodies written in RTTTL notation. Here is the complete code, with a few minor tweaks, such as using v as a shortcut for setValueAtTime and one-letter variable names (C = context, z = oscillator because it produces a buzzing sound, g = gain, c = clamp):



c = (x=0,a,b) => (x<a&&(x=a),x>b?b:x); // clamping function (a<=x<=b)
play = (s, bpm) => {
  C = new AudioContext;
  (z = C.createOscillator()).connect(g = C.createGain()).connect(C.destination);
  z.type = 'square';
  z.start();
  t = 0;
  v = (x,v) => x.setValueAtTime(v, t); // setValueAtTime shorter alias
  for (m of s.matchAll(/(\d*)?(\.?)(#?)([a-g-])(\d*)/g)) {
    k = m[4].charCodeAt(); // note ASCII [0x41..0x47] or [0x61..0x67]
    n = 0|(((k&7) * 1.6)+8)%12+!!m[3]+12*c(m[5],1,3); // semitone index, 12 = C of octave 1
    v(z.frequency, 65.4 * 2 ** (n / 12));
    v(g.gain, (~k & 8) / 8);
    t = t + 240 / bpm / (c(m[1] || 4, 1, 64))*(1+!!m[2]/2);
    v(g.gain, 0);
  }
};

// Usage:
play('8c 8d 8e 8f 8g 8a 8b 8c2', 120);
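If you want to check the parsing and timing outside a browser, the same arithmetic can be run without WebAudio. The `schedule` function below is a hypothetical dry-run variant, not part of the original player: it returns the list of events instead of scheduling them on an AudioContext, and it assumes the sharp sign comes before the note letter, as described earlier.

```javascript
const c = (x = 0, a, b) => (x < a && (x = a), x > b ? b : x);

// Same calculations as play(), but collected into an array for inspection.
const schedule = (s, bpm) => {
  let t = 0;
  const events = [];
  for (const m of s.matchAll(/(\d*)?(\.?)(#?)([a-g-])(\d*)/g)) {
    const k = m[4].charCodeAt();
    const n = 0 | ((k & 7) * 1.6 + 8) % 12 + !!m[3] + 12 * c(m[5], 1, 3);
    const duration = 240 / bpm / c(m[1] || 4, 1, 64) * (1 + !!m[2] / 2);
    // `on` is 1 for a note and 0 for a rest; `hz` is ignored for rests.
    events.push({ start: t, duration, on: (~k & 8) / 8, hz: 65.4 * 2 ** (n / 12) });
    t += duration;
  }
  return events;
};

const events = schedule('8c 8d 4- 8#f2', 120);
console.log(events.map(e => e.start)); // [ 0, 0.25, 0.5, 1 ]
```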
      
      





When minified with terser, this code is only 417 bytes, which is still below the 512-byte threshold. So why don't we add a stop() function to interrupt playback:



C=0; // initialize audio context C at the beginning with zero
stop = _ => C && C.close(C=0);
// using `_` instead of `()` for zero-arg function saves us one byte :)
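The `C.close(C=0)` trick relies on evaluation order: the `close` method is looked up on the old context before the argument `C=0` is evaluated. Here is a sketch with a stand-in object instead of a real AudioContext (the `closed` counter is only there to observe the call):

```javascript
// Stand-in for an AudioContext: it only counts how often close() is called.
let closed = 0;
let C = { close: () => { closed++; } };

const stop = _ => C && C.close(C = 0);

stop(); // close() runs on the old object, and C is reset to 0
stop(); // C is 0 now, so nothing happens
console.log(closed, C); // 1 0
```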
      
      





This is still only around 445 bytes. If you paste this code into the developer console, you can play an RTTTL melody and stop it by calling the JS functions play() and stop().



UI



I think adding a little UI to our synthesizer will make composing music even more enjoyable. At this point, I would suggest forgetting about code golf. We can build a tiny editor for RTTTL ringtones with ordinary HTML and CSS, without counting bytes, and include only the minified script for playback.



I decided not to post the code here as it's pretty boring. You can find it on GitHub. You can also try the demo version here: https://zserge.com/nokia-composer/







If the muse has left you and you don't want to write music at all, try a few existing songs and enjoy the familiar sound:





By the way, if you actually compose something, share the URL: the whole song and the BPM are stored in the hash portion of the URL, so saving or sharing your songs is as easy as copying or bookmarking the link.



Hope you enjoyed it. You can follow the news on GitHub, Twitter or subscribe via RSS.


