RASIKA - Article - Computer Music

Synthesizing Carnatic Music with a Computer
M.Subramanian

1. Introduction. The term Computer Music is generally applied to producing music from notation or data, using a sound card installed in a computer or a synthesizer. It thus implies that the music is synthesized, i.e. created artificially, approximating as closely as possible the tones of musical instruments. Although the advent of multimedia (simultaneous use of text, pictures and sound on a computer) has led to the publication of a large number of CD titles relating to music, these mostly contain music recorded from a performer (though occasionally there may be some synthetic music in such titles) and are not considered as generating 'Computer Music'. Again, artificial music produced using analog devices is not considered Computer Music, the essential requirement being that the music is generated from digital data. This article describes the present situation in synthesizing Carnatic Music with the computer, the problems and possible solutions.

1.1. Western musicians and composers have been using the computer extensively in the field of music for the past decade or so. The synthesizer in the computer, like the synthesizer of electronic keyboard instruments, can play more than one 'voice' simultaneously, i.e. more than one instrument (melodic or percussion), each playing its own notes. Computer Music greatly assists composers of Western Music with its emphasis on orchestration and harmony. A composer can immediately listen to his ideas without waiting for them to be played by an orchestra. In India, composers of film music, which has a high content of orchestration, have been using computers for synthesizing their music for a considerable time now. "It is simply amazing what a PC can do to music today. One can practically have a studio at home." (Sandeep Kanjilal quoted in 'Computers and Music', PCQuest, June 1992, p 16). A large number of software packages (some shareware), in a wide range of prices, are available for composing music using the staff notation.

1.2. "Whereas in the west where billions of dollars are spent in researching for computers in the field of music, India is way behind. The fact that traditional Indian classical music can never be replaced by computers is unquestionable." ('Computers and Music', PCQuest, June 1992, p 17-18). This article, written more than 7 years ago, also mentions the names of Louis Banks, Loy Mendonza, Varnaj Bhatia, Illayaraja and Viju Shah as composers using computers, and goes on to say that "in the field of Indian classical music someone still has to make the first move". This was not quite true, since the same issue of PCQuest also mentions the efforts of Mr.T.Narayanamurthy (then Deputy Director General in the Dept. of Telecommunications), who made the first attempts to generate Carnatic music using computers. In fact he had designed his own operating system and even hardware for the work, at a time when the Personal Computer had not become popular. However, he did not bring out a software package for distribution and his efforts remained in the realm of experimentation, though he gave demonstrations in the Music Academy, Madras.

2. Carnatic Music and the Computer - Problems. The reasons for the reluctance to accept the computer as a tool to generate Carnatic music are not far to seek. The factors that make it very difficult to synthesize Carnatic music on a computer are:

(a) Although Carnatic music has a large number of full fledged compositions published with notation, the notation system is only skeletal. If one keys in the notes alone on a harmonium or an electronic keyboard, in most cases the music would be totally unrecognizable. The notation system does not show the nuances or gamakams which are essential to get the 'raga bhavam' in most of the ragas.

(b) Nearly a century ago Sri Subbarama Dikshithar gave symbols for the various gamakams in his magnum opus, Sangitha Sampradaya Pradarsini, and more recent publications by C.S.Ayyar and S.Vidhya also use gamakam symbols. However, these are qualitative descriptions and are inadequate for generating music on the computer, which needs a precise quantitative description of the gamakam.

(c) Carnatic music is mainly melodic. A concert may have only a single accompaniment for the main artiste (vocalist or others like gottuvadhyam). Both artistes play the same melody unlike in western music where different instruments may be playing different scores. The computer thus does not greatly help a composer who can easily 'hear' his melody in his mind.

(d) The same composition with the same notation may be sung by artistes of different schools in widely different ways, all acceptable within the raga's framework.

(e) Extemporization is a basic feature of Carnatic music. Although it is the alapana and kalpana swaram which are entirely extempore, artistes do resort to extemporization even in compositions, which is often welcomed by listeners.

(f) The score played by the percussion accompaniment is varied and personalised. Computerised percussion is often found monotonous by a Carnatic music listener. Thus on the one hand there are many difficulties in synthesizing Carnatic music and on the other hand the motivation for synthetic music is also lacking.

3. Programs for synthetic Carnatic music. In spite of these difficulties, I tried to generate Carnatic music on the computer with gamakams closely approximating natural singing. The motivation was to produce a program which could explain the Carnatic music system (theory) with text, visuals and audio (while a book can only give a textual description). The program was released in 1994 for public use. It explained the Carnatic music system in 5 modules, one of which (the Raagam module) explained 80 ragas in detail. All the modules had audio, visuals or audio-visuals which could be accessed while reading the text. Complicated gamakams used in ragas like Begada, Arabhi, Devagandhari, Surati etc. were reproduced reasonably accurately. As sound cards were then not popular (and also expensive), the music was produced through the Personal Computer's speaker, with no control over the tonal quality or volume. Nevertheless the software was appreciated and well received, indicating that there was a need for such a program. This was followed by another program which contained beginner's lessons that could be played in a wide range of adhara sruthis, enabling the student to practise in his own normal sruthi. Subsequently, in 1998, I released 2 programs which ran under the Windows operating system and produced synthetic music through the sound card in the tones of Veena and Flute. The first program was basically the earlier DOS based program but producing music in the tones of Veena and Flute, while the second one was entirely new. It enabled generation of synthetic music by entering notation in the traditional 'sa ri ga ma' style.
With the advent of CDRom drives, a program to explain the Carnatic music system can be produced on a CDRom with recorded music including vocal music, but synthetic music programs have the advantage of being played from the hard disk with quicker access to the different modules, and they also enable generation of the same music in different instruments and sruthis. A program to generate music from notation of course cannot be developed without synthesizing the music. In the following pages I propose to explain the basic principles used in generating synthetic Carnatic music through a sound card and the problems faced.

4. Computer and sound – Midi and Wave devices. The common sound card found in most Personal Computers has basically 2 different sets of chips for producing music. One set, the 'midi synthesizer', has a chip similar to the synthesizer chips in electronic keyboards. It produces tones of different instruments, generally using a technique called 'frequency modulation synthesis'. It can take data from files which usually have the extension '.mid' and generate the note of required pitch on the specified instrument for the duration specified. Usually 4 melodic instruments and 2 percussion instruments can be played simultaneously. Some sound cards provide for more instruments. For better quality, some more expensive sound cards produce the synthetic sound using wave form samples of the instruments. The midi device cannot produce vocal music, although some sound effects like clapping are included.

4.1. The second set of chips on the sound card are called the 'digital to analog converter' and the 'analog to digital converter' (together forming the 'wave device'). These can be used to record any type of sound, including vocal music, and replay it. However, unlike the cassette tape, the sound is recorded digitally, i.e. the sound is sampled many thousand times in a second (44100 times for the best quality, but could be less: 22050 or 11025 times a second) and the value of the amplitude of the sound at that moment is recorded as a number which may vary between -32768 and +32767 (for a 16 bit recording). This is done by the analog to digital converter. The digital to analog converter reproduces the music by reading these numbers and generating electric current in proportion to the number, which is fed into the amplifier and speaker. Files generated by this process tend to be very large (at a sampling rate of 11025, one minute of 16 bit single channel recording will occupy 1,323,000 bytes, whereas one minute of a simple midi file may be around 5000 bytes).
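The file size quoted above follows directly from the sampling parameters. A minimal Python sketch of the arithmetic, assuming uncompressed 16 bit single channel data and ignoring the file header (the function name is only illustrative):

```python
def wave_bytes(sample_rate_hz, seconds, bits_per_sample=16, channels=1):
    """Size in bytes of uncompressed sampled audio (file header excluded)."""
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * seconds * bytes_per_sample * channels


if __name__ == "__main__":
    # One minute of sound at the three sampling rates mentioned in the text.
    for rate in (11025, 22050, 44100):
        print(rate, wave_bytes(rate, 60))
```

At 11025 samples per second this gives the 1,323,000 bytes mentioned above; each doubling of the sampling rate doubles the size.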

4.2. Computerised synthetic music generally uses the midi device which is fast and generates music from relatively small files. The music composing programs (which use the staff notation) convert the information from staff notation into commands recognizable by the midi device. Thus it is the midi device which is the most convenient one for generating synthetic music.

4.3. However, western music does not generally require minute changes in pitch (microtonal variations) or smooth movements between and around notes as required in Carnatic Music, though some forms of western music such as jazz require smooth movements to a limited extent. The midi device provides a command called 'pitch bend' which changes the pitch of a note from its normal position. Unfortunately the extent of pitch variation for a given number in the command has not been standardised. Most sound cards provide for a variation of one major tone (about 9/8) up or down, which is not adequate for all types of gamakams used in Carnatic music, which sometimes requires smooth movement over a range of 4/3 (a fourth). Further, the standard list of instruments available with most midi drivers does not include any South Indian instrument (Sitar is provided under the 'ethnic' group but the quality of the sound is not always satisfactory). These problems made it necessary to look for some other way of synthesizing Carnatic music.
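The limitation can be made concrete by expressing frequency ratios in cents (1200 cents to the octave) and mapping them to the 14-bit pitch bend value (0 to 16383, centre 8192). The sketch below is only an illustration: the bend range of 2 equal-tempered semitones up or down is an assumption standing in for "about one major tone", and the function names are hypothetical.

```python
import math


def ratio_to_cents(ratio):
    """Interval size in cents for a frequency ratio."""
    return 1200.0 * math.log2(ratio)


def bend_value(ratio, bend_range_semitones=2.0):
    """14-bit pitch bend value for a ratio, or None if out of range.

    bend_range_semitones is an ASSUMED device setting, not a fixed standard.
    """
    semitones = ratio_to_cents(ratio) / 100.0
    if abs(semitones) > bend_range_semitones:
        return None  # the device cannot bend this far
    value = 8192 + round(semitones / bend_range_semitones * 8192)
    return min(value, 16383)


if __name__ == "__main__":
    for name, ratio in (("16/15", 16 / 15), ("9/8", 9 / 8), ("4/3", 4 / 3)):
        print(name, round(ratio_to_cents(ratio), 1), bend_value(ratio))
```

A slide over a fourth (4/3, roughly 498 cents) falls well outside the assumed range, so `bend_value` returns `None` for it, which is the problem described above.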

5. Wave device for synthetic Carnatic music. The wave device, though primarily meant for recording, is more flexible in the sense that it can play any sound. To use it to produce synthetic music it is necessary to calculate and create the values that would have been generated if the music had been recorded, i.e. 11025 numbers per second have to be calculated for the lowest sampling rate. To calculate these numbers the information required is (a) the pitch of the sound, (b) the pattern of the wave form of the particular instrument at that particular pitch and (c) the amplitude of the wave at that point. The second parameter is easily ascertained by recording the required instrument's tone at different pitches and analysing the recordings using Fourier analysis, or by actually using the sampled wave for one period. The calculations are a bit complex and would consume a lot of time on slow computers, since 11025 values are to be calculated for one second of music (11025 is the lowest possible sampling rate; for better quality, higher rates such as 22050 or 44100 are to be used).
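The calculation of sample values from pitch, wave form and amplitude can be sketched as follows. This is an illustrative Python version, not the algorithm used in the programs described here: it assumes the wave form is given as relative amplitudes of the fundamental and its overtones (as Fourier analysis of a recorded tone would give), and the harmonic amplitudes in the example are made up.

```python
import math


def synthesize(freq_hz, seconds, harmonics, amplitude=0.8, sample_rate=11025):
    """Return 16 bit sample values for a steady tone.

    harmonics: relative amplitudes of the fundamental, 2nd harmonic, ...
               (ASSUMED representation of the instrument's wave form)
    amplitude: overall loudness, 0.0 to 1.0
    """
    norm = sum(abs(h) for h in harmonics) or 1.0  # keeps the sum within -1..1
    n_samples = int(sample_rate * seconds)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        value = sum(
            h * math.sin(2 * math.pi * freq_hz * (k + 1) * t)
            for k, h in enumerate(harmonics)
        ) / norm
        samples.append(int(round(32767 * amplitude * value)))
    return samples


if __name__ == "__main__":
    # One second at 166 Hz with three made-up harmonic amplitudes.
    tone = synthesize(166, 1.0, [1.0, 0.5, 0.25])
    print(len(tone), min(tone), max(tone))
```

One second of sound at the lowest sampling rate needs 11025 such evaluations, each summing over all overtones, which is why the cost mattered on slow machines.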

5.1. The synthesis presents no difficulty when notes are sounded separately, since for 2 octaves only about 24 wave form samples will be required. But in the case of most instruments, particularly the South Indian Veena, the composition of overtones varies widely over the pitch range. In general the higher pitches have fewer overtones, but other factors such as the formant of the instrument affect the exact composition of overtones. A musical phrase having a transition from (say) Gaandharam to Dhaivatham would traverse the entire pitch range in between, with varying overtone composition. Thus the synthesized wave, while sounding good for separate notes, will sound metallic or hollow when there is a large scale smooth transition.

5.2. The second problem is the time taken for calculating the samples. To be useful, the music has to start immediately when a command is issued (by clicking the mouse). One solution is to create the samples for (say) 4 seconds, start playing the music and, while the music is played, create the next block of 4 seconds. This is a well known technique (double buffering) used in processing sound in computers (recording or playing). The main requirement for this technique to work is that the time taken to synthesize the music should be much less than the time taken to play it. With the algorithm developed by me for calculation, this criterion was satisfied in PCs with the Intel 486 processor working at 66 MHz and in faster computers. In fact with the 486 processor the time taken to synthesize the music was about 40% of the playing time and with Pentiums it was far less.
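The double buffering idea can be sketched with a producer thread and a one-slot queue. This is a generic Python illustration, not the original implementation: `make_block` and `play_block` are hypothetical callables standing for "synthesize block k" and "send a block to the wave device".

```python
import queue
import threading


def run_double_buffered(make_block, play_block, n_blocks):
    """Synthesize upcoming blocks while the current block is being played."""
    ready = queue.Queue(maxsize=1)  # at most one finished block waits its turn

    def producer():
        for k in range(n_blocks):
            ready.put(make_block(k))  # waits here if the player is behind
        ready.put(None)               # sentinel: no more blocks

    threading.Thread(target=producer, daemon=True).start()
    while True:
        block = ready.get()           # waits here if synthesis is behind
        if block is None:
            break
        play_block(block)


if __name__ == "__main__":
    played = []
    run_double_buffered(lambda k: [k] * 4, played.append, 3)
    print(played)
```

Playback stays continuous only if `make_block` finishes faster than `play_block` consumes a block; with synthesis taking about 40% of the playing time, as reported above for the 486, the player never has to wait after the first block.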

5.3. However, the problem of wave form definition for transitions remained. An approximation was used and a limited number of wave forms were defined for different ranges, specially for transitions. The approximation was not only in respect of ranges but also in the manner in which the pitch varied. This was the greatest problem, since the music may stay at a pitch for some time and then start moving up or down. It may also start moving immediately after the start. To provide for all possibilities a very large number of wave form definitions would be required, which could put a heavy strain on the memory resources of the computer. The final product still has some unsatisfactory tones for Veena where the pitch varies over a large range. In the case of the Flute, the overtones are quite limited and tonal quality during pitch variations was not found to be a problem.

5.4. The principles used could be extended to other instruments. But for instruments like the violin, which has a large number of overtones and whose fundamental pitch for a given sruthi (kattai) is double that of the Veena, a sampling rate of 11025 would not be adequate. The sampling rate decides the highest possible frequency that can be recorded, which is half the sampling rate (the Nyquist limit). Instruments with higher pitches and a larger number of overtones need a higher sampling rate, which implies that more calculations are required per second, requiring faster computers.
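The effect of the Nyquist limit on overtones can be estimated with a few lines of Python. The sketch counts how many harmonics of a note fit strictly below half the sampling rate; the 166 Hz Shadjam used later in this article and its doubled value of 332 Hz are taken only as illustrative fundamentals.

```python
import math


def harmonics_below_nyquist(fundamental_hz, sample_rate_hz):
    """Number of harmonics (fundamental included) strictly below sample_rate/2."""
    nyquist = sample_rate_hz / 2.0
    return max(0, math.ceil(nyquist / fundamental_hz) - 1)


if __name__ == "__main__":
    for rate in (11025, 22050, 44100):
        print(rate,
              harmonics_below_nyquist(166, rate),   # lower fundamental
              harmonics_below_nyquist(332, rate))   # fundamental an octave up
```

Doubling the fundamental halves the number of overtones that can be represented at a given sampling rate, and notes higher in the instrument's range lose proportionately more.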

6. Gamakams on the Computer. The actual production of Carnatic music with all its nuances required evolving a suitable notation system. For the program which only explained the Carnatic Music system, where the user is not required to key in any data (except for some quizzes), I evolved a system which directly used the frequencies and durations. Initially an attempt was made to develop 'C' language functions for the traditionally defined gamakams like Kampitham (shaking a note), Thirupam (reaching a higher note and then coming to the lower note in arohanam), Jaru (smooth slides) and Janta (Sphuritham and Prathyaghatham: stressing the second note of a pair). Except for Jaru these were not found convenient. Instead the entire music was conceived as phrases and each individual phrase was defined by (a) the total number of notes in the phrase (including anuswarams) and (b) the frequency and duration of the first note, the duration of transition to the next note, the frequency and duration of the next note and so on. Suitable codes were developed to define these parameters as also to indicate periods of silence.

6.1. Thus the Rishabham of Carnatic Saveri was coded as

155 0 166 20 5 172 5 10 166 15 5 170 5 10 166 10 150 5

where the first number indicated that 5 notes are in the phrase and the second number 0 is for internal use with reference to the wave form definition. The number 166 represented the frequency of Shadjam (approximately 3 kattai, i.e. the 3rd white note on the keyboard), the next 20 indicated the duration of Shadjam, the next 5 the duration of transit to the higher note of frequency 172 and so on. The last note is Shadjam for a duration of 10 units. The durations are in units of 10 milliseconds (1/100 of a second). The number 150 indicated silence for the duration given by the next number. This type of coding was found to be the fastest for calculation (the coding was stored in binary form for reading and calculating, which speeded up the process). It took considerable experimentation and practice to get the correct coding for different gamakams, and my four decades of Veena playing helped me in trying to define the gamakams quantitatively. It may be noted that the Rishabham frequency is less than 166 x 16/15 = 177 and even less than 166 x 256/243, which is about 175.
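The layout described above can be read mechanically. The Python sketch below is only an interpretation of the prose description, not the original program: it assumes two leading header numbers, then a frequency and duration for the first note, then repeated groups of (transit duration, frequency, duration), and a 150 marker followed by a silence duration. How the leading 155 encodes the count of 5 notes is not stated in the text, so the sketch simply skips the header.

```python
SILENCE = 150  # marker value described in the text


def decode_phrase(codes):
    """Split a coded phrase into (kind, start_hz, end_hz, units) segments.

    One unit is 1/100 second. The layout is ASSUMED from the description:
    header, flag, freq, dur, [transit, freq, dur]..., 150, silence_dur.
    A transit duration of exactly 150 would be misread as silence.
    """
    body = codes[2:]                 # skip the two header numbers
    freq, dur = body[0], body[1]
    segments = [("note", freq, freq, dur)]
    i = 2
    while i < len(body):
        if body[i] == SILENCE:
            segments.append(("silence", 0, 0, body[i + 1]))
            i += 2
            continue
        transit, nxt, dur = body[i], body[i + 1], body[i + 2]
        segments.append(("transit", freq, nxt, transit))
        segments.append(("note", nxt, nxt, dur))
        freq = nxt
        i += 3
    return segments


SAVERI_RI = [155, 0, 166, 20, 5, 172, 5, 10, 166, 15, 5, 170, 5, 10, 166, 10, 150, 5]

if __name__ == "__main__":
    for seg in decode_phrase(SAVERI_RI):
        print(seg)
    total = sum(seg[3] for seg in decode_phrase(SAVERI_RI))
    print("total:", total * 10, "ms")
```

Read this way, the phrase contains five held notes (166, 172, 166, 170, 166) joined by four transits, and lasts 90 units including the closing silence, i.e. 0.9 second.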

6.2. As I did not have the tools to analyse live music gamakams at that time and as the style of singing gamakams varies among artistes, I took the route of experimenting and generating the gamakam as close as possible to the form in which it was taught to me.

7. Some observations: The experience of generating gamakams synthetically revealed a few interesting points.

(a) Kampitham, which is a single classification, can have many ramifications, the two basic types being those anchored on the lower note (more common, as in Saveri Rishabham) and those anchored on the higher note (as in Surati Nishadam).

(b) When Kampitham is anchored on the lower note the duration of the lower note is much more than that of the higher end of the gamakam. The upward transition time is generally less than the downward transit time.

(c) The frequency of the upper end of the note in the case of black notes like the Carnatic Suddha Rishabham is not critical. Often the same musical experience is obtained with different frequencies by adjusting the transit durations and the duration of stay in the upper note. It would appear the brain perceives an average value for the note which could be a weighted average of the frequencies in the entire phrase, giving due weightage for the durations. This is a subject requiring further study in the realm of musical cognizance.

(d) In the case of Carnatic Suddha Rishabham, Sadharana Gandharam, Suddha Dhaivatham and Kaisiki Nishadam, generally the lower end of the gamakam is the note below (Chathusruthi Rishabham etc.), but by using a frequency slightly higher than the lower note, the main note is itself felt to be higher in pitch (for Suddha Rishabham of ragas like Panthuvarali, Sadharana Gandharam of Ranjani etc.)

(e) The durations of notes and their transition times were very important. The same phrase with the same frequencies but with different durations does not generate the required gamakam.

(f) Gamakam within the note (as in Gandharam of Kalyani) required an oscillation over a range of a relative frequency of about 41/40. Here again the frequency of the upper end of the gamakam could be changed and with change in durations acceptable gamakam could be produced.

(g) In many cases of gamakams on Suddha Madhyamam (in Sankarabharanam) and Kaisiki Nishadam (Bhairavi), the lower end of the gamakam was below the main note and the upper end was at Panchamam and Shadjam respectively (or slightly lower). If the lower end was placed at the note itself the raga bhavam changed drastically.

(h) Only Janta gamakam could be standardised. It is generally produced by touching the lower note momentarily before the second note of the pair is sung. A duration of 3/100 to 5/100 of a second for the lower note and an equal transit time was found adequate. With this, 'akaara' Janta could be produced reasonably well.

(i) Jaru was also easy to code. It was found that the duration of the transit was critical for raga bhavam in certain cases. For instance the phrase ' sa ni pa ' of Kedaram and Neelambari differed considerably in the duration of descent.

(j) Kakali Nishadam sounded too low even when the relative frequency of 243/128 was used. The note has generally to be coded as 'sa ni sa'. Similarly Prathi Madhyamam sounded too low even when the r.f. 64/45 was used and the note had to be generally coded as 'pa ma pa'.
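The suggestion in (c) that the ear may perceive a duration-weighted average can be tried out numerically. The Python sketch below applies it to the Saveri Rishabham phrase of section 6.1; it assumes the pitch moves linearly during each transit (so a transit contributes the mean of its end frequencies), which is an assumption of this illustration. The result illustrates the hypothesis and is not evidence for it.

```python
# (start_hz, end_hz, units) for each held note and transit of the
# Saveri Rishabham phrase in section 6.1; the closing silence is left out.
SAVERI_SEGMENTS = [
    (166, 166, 20), (166, 172, 5), (172, 172, 5), (172, 166, 10),
    (166, 166, 15), (166, 170, 5), (170, 170, 5), (170, 166, 10),
    (166, 166, 10),
]


def weighted_mean_frequency(segments):
    """Duration-weighted mean frequency, ASSUMING linear glides in transits."""
    total_units = sum(units for _, _, units in segments)
    weighted = sum((start + end) / 2.0 * units for start, end, units in segments)
    return weighted / total_units


if __name__ == "__main__":
    mean_hz = weighted_mean_frequency(SAVERI_SEGMENTS)
    print(round(mean_hz, 2), "Hz, ratio to Shadjam:", round(mean_hz / 166, 4))
```

Because the phrase spends most of its time on the lower anchor, the weighted mean stays close to 166 Hz, far below either 166 x 16/15 or 166 x 256/243. A simple average of frequency would therefore need refinement (for instance a different weighting of the upper end) before it could account for the note being heard as Rishabham.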

8. Importance of note and transit durations. The most important point which came out was that in the production of gamakams not only the extent of oscillation but also the durations of the notes and the transitions were very important. Obviously a computer analysis of live music would be of great value in this regard. The presently available tools for analysing music put greater stress on the spectrum of the sound than on the actual frequencies, since in the West the frequencies are fixed. Fourier Transform techniques introduce large errors when the frequency is to be ascertained over a very small time period, which is necessary for analysing gamakams as the frequency keeps changing continuously. I am trying to develop an analytical tool for this purpose.
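The difficulty with Fourier analysis over short periods can be quantified roughly: the spacing between frequency bins of a transform is about the reciprocal of the length of the analysed window. The Python sketch below shows this trade-off; the window lengths are chosen to match the note and transit durations quoted in section 6.1, and the function names are illustrative.

```python
def bin_spacing_hz(window_seconds):
    """Approximate frequency resolution of a Fourier transform over one window."""
    return 1.0 / window_seconds


def min_window_seconds(resolution_hz):
    """Window length needed to separate frequencies resolution_hz apart."""
    return 1.0 / resolution_hz


if __name__ == "__main__":
    for window in (0.05, 0.10, 0.20):        # 5, 10 and 20 units of 1/100 s
        print(window, "s ->", round(bin_spacing_hz(window), 1), "Hz")
    # The Saveri oscillation of section 6.1 spans only 166 to 172 Hz.
    print("window for 6 Hz:", round(min_window_seconds(6.0), 3), "s")
```

A window as short as one transit (0.05 second) resolves frequency only to about 20 Hz, while separating 166 Hz from 172 Hz needs a window of about 0.17 second, during which the pitch has already moved several times.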

9. Music generated from Notation. The second program, which enables the user to enter notation in 'sa ri ga ma' style, required an interface to convert the notation into frequencies and durations, which was not difficult and did not put too much strain on the computer resources. But extending the notation system to represent gamakam quantitatively was found to be very difficult. The choice was between providing a qualitative symbol for the gamakam (such as Kampitham), letting the computer generate the most common form of the gamakam, and giving the user total control over the flow of frequencies and durations. The latter was chosen to make the program more useful for different types of users, but it also introduced considerable difficulties for the user.

9.1 Some minor changes were made in the notation system to avoid creating a new set of Fonts and key correlations. Upper and lower case letters are used to indicate the octave instead of putting dots over and below the notes. To denote higher 'kalams' (double, quadruple tempos) brackets are used instead of lines drawn over the notes. In fact for accurate production of gamakams sometimes even triple brackets equivalent of 3 lines drawn over the notation were required.

9.2. Three major additions were made to make the notation more accurate, which is needed for the computer.

(a) The symbol '-' was introduced to separate groups of notes into phrases. At present books use the symbols ',' and ';' both to indicate silences of 1 or 2 notes duration and to prolong the previous note. The hyphen symbol '-' makes it easy to show whether a silence or a prolongation of the previous note is intended.

(b) The symbols '\' and '/' were introduced to precisely define jaru i.e. to indicate the transit durations.

9.3. The other requirements, namely melam definition, choice of tempo, change of melam in the middle of a file (for ragamalika) and change of tempo in the middle of a file (for gathi change), have been incorporated. Provision for a quick check of the total number of notes (after adjusting for half durations, quarter durations etc.) is available. This check can be made for any selected part of the notation. Either all the notation or a selected part can be played. Continuous repeated play of the notation by looping back is also possible, which may be useful for practice. A wide range of adhara sruthis from 0.5 kattai to 5.5 kattai, i.e. from a semitone below C up to a semitone above G, is available and a choice of instrument (Flute or Veena) is provided.
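The check of the total number of notes can be sketched as below. The syntax handled here is a hypothetical simplification and not the program's actual input format: tokens are assumed to be separated by spaces, each bracket level halves the duration of what it encloses, ',' counts as one unit, ';' as two units, and '-' only separates phrases.

```python
def count_units(notation):
    """Total duration in note units for a simplified, ASSUMED notation syntax."""
    total = 0.0
    depth = 0  # current bracket nesting: each level halves the duration
    spaced = notation.replace("(", " ( ").replace(")", " ) ")
    for token in spaced.split():
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced brackets")
        elif token == "-":
            continue                      # phrase separator, no duration
        elif token == ";":
            total += 2 / 2 ** depth       # two units, scaled by tempo level
        else:
            total += 1 / 2 ** depth       # a note syllable or ','
    if depth != 0:
        raise ValueError("unbalanced brackets")
    return total


if __name__ == "__main__":
    line = "sa ri (ga ma) pa , - ((da ni sa ri))"
    print(count_units(line))
```

In this example the two bracketed notes count as half a unit each and the four double-bracketed notes as a quarter each, so the line adds up to 6 units, which can be compared with the expected count for the avartham.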

9.4. The program is considered useful for copying notation from books and playing it. But for this to be really useful, gamakams have to be added by splitting notes, showing anuswarams explicitly and adjusting the durations of the notes and transitions. To make this easier, the program also provides some ready made notation for common raagams, which can be copied and pasted. It is also noticed that in the books silences are not explicitly noted except where the composition starts after the beginning of the avartham, when a comma ',' or a semicolon ';' or their combinations are used. Elsewhere these symbols are used both to prolong a note and to indicate gaps. But for the computer to play the notation accurately the user has to work out the actual duration of the note and of the silence that follows, and put a hyphen symbol '-' in between. Again, for a realistic reproduction of a krithi's words, silences are to be provided where joint consonants appear. For instance when the word 'Bhaktha' is pronounced there is a minute gap in the continuity of voice before the consonant 'tha'. Normally in writing notation these finer aspects are not incorporated, as they are played intuitively on instruments.

9.5. The gap between what is written down in the existing system of notation and what is actually sung is a major problem in trying to generate Carnatic music from notation on the computer. The major deficiencies of the notation system are (a) inadequate representation of gamakams, (b) lack of precision in describing silences and (c) in the case of instruments, lack of indication of points of strumming (for Veena) or bow change (for violin). Traditionally these are left to be learnt intuitively.

9.6 There is another field in which a program of this type could be useful which is the research in musical cognizance, since the required notes or phrases could be produced accurately and no element of subjectivity on the part of the singer would come in. One could do research into pitch recognition such as thresholds of note durations for accurate pitch appreciation. Even raga recognition could be objectively researched.

10 Conclusion. In conclusion it may be stated that though there are many problems in generating Carnatic music synthetically with all its nuances, these are not insurmountable. The applications of such programs lie in the field of teaching (especially for practising lessons in between tuitions with a teacher) and voice training. Such programs can also help in enhancing the student's appreciation of the finer aspects of Carnatic music and allow him to experiment. The programs also have applications in the field of research, especially the field of musical cognizance (in which very little work has been done).

(Published in the Journal of Sangeet Natak Akademi, New Delhi, 133-134 (1999) pages 16-24. )

-------