Louis Pols, Lauri Karttunen, and Jean-Paul Haton talk to Saras Institute
Matt Speed, Catherine Lai, and Marcel Waeltermann
SLTC Newsletter, July 2009
We continue the series of excerpts of interviews from the History of Speech and Language Technology Project. In these segments Louis Pols, Lauri Karttunen, and Jean-Paul Haton discuss how they became involved with the field of speech and language technology.
These interviews were conducted by Dr. Janet Baker in 2005 and are being transcribed by members of ISCA-SAC, as described previously in 1, 2, 3, 4, 5. Sunayana Sitaram (National Institute of Technology, Surat) and Antonio Roque (University of Southern California) coordinated the transcription efforts and edited the transcripts.
Louis Pols
Transcribed by Matt Speed (University of York)
Q: One question that I think we’ve asked everybody, is how did you get into this field?
A: I’m a physicist by training, but I didn’t know too well what field I would choose. I never knew what I wanted to do, so I started to study physics because I thought it might be interesting, and I had good grades in high school. While studying theoretical physics I didn’t know which direction to take, so I took solid state physics but still wasn’t sure about it. So I got my degree and I didn’t know what to do, and then I had to fulfill my military service, which in Holland at that time you could do before or after your university studies. So, rather than fulfilling my draft I took the alternative of starting some research at the defense research organization, and that happened to be the TNO Institute in Soesterberg. There I came into the Department of Speech and Hearing of Reinier Plomp and I started doing military-type work, measuring noise around rifles and hearing loss in people that were shooting and such practical things. Quickly I ran into the regular research that was going on there, and that was psychoacoustics. At that time psychoacoustics was only done with tones and combinations of tones, and Plomp wanted to make this slightly more realistic. He wanted to use real sounds, and one of those real sounds of course is speech sounds and vowels. He wanted to do timbre perception and pitch perception on real sounds and he asked me if I would be interested in working on that a little bit. So that’s how I gradually grew into it. We started to do psychoacoustic research with vowel-like sounds and we liked it so much that it grew and we went on to more dynamic sounds, then synthesis, recognition and coding and all of that. After my period of one and a half years in the military we talked to each other and he said that he liked me and we both liked that type of work, so I got a job at the institute.
Q: And what year would that have been?
A: I graduated in 1964, so in mid-1965. I became active in the field and from a physicist I gradually grew into a speech scientist and also a phonetician. I never got any training in phonetics, but I took it up myself. That’s how it started and the work we did came closer and closer to phonetics work done in the universities. So then you become a member of the Phonetic Association and you become a chairman of the board and things like that and then they need a new professor in Amsterdam and my boss said “why don’t you apply, you never know”. I was totally relaxed because I didn’t care, I could do whatever I liked in the institute, they gave me complete freedom but I thought ‘why not’. That was exactly the right attitude to get that job because my competitors were all very nervous and were chasing the job because they had a long career in this field already. For some reason they chose me, and that was 1982.
Q: That was 1982?
A: Yes. I also got my PhD of course. At a certain point I was still at TNO. The ‘T’ in TNO stands for Applied – Applied Physics Research, so it is an applied research institute and one of the sections is Defence Research but the institute was very open towards science. Actually it was more a real research institute, like the Fraunhofer-Gesellschaft in Germany. They had good opportunities to get your doctoral degree. So after a while I graduated. I got my PhD in 1977 I think, so that was more than 10 years after I started, so I did it the slow way. I first did all kinds of things and then at some point we both realized that if we put a few things together then we would have a nice topic for a thesis, so that’s what I did.
Lauri Karttunen
Transcribed by Catherine Lai (University of Pennsylvania)
Q: Could you maybe say something about how you got into this field?
A: Oh, it's kind of difficult to pinpoint it. In linguistics I've actually had two careers. I started out as a semanticist. I wrote my dissertation on problems of reference in syntax and I invented a theory that became very widely used for discourse reference, then it became the theory of Irene Heim and Hans Kamp who developed it further. Then I worked on presuppositions and implications and brought about the theory of questions, the model theoretic semantics of questions. But I had been doing a little bit of computation already when I was a grad student and I spent a year when I was writing my semantics dissertation in Santa Monica at the RAND Corporation, where I worked with Martin Kay, who is one of the great pioneers of the whole field. But then I got my job in Texas and I was just doing semantics for ten years with nothing to do with computational things. But I was getting tired of this stuff and I just accidentally got myself one of these early ARPANET accounts and decided that I was going to do Finnish.
Q: What year was that?
A: This was around 1978. Again kind of...I signed up to teach a course in computational linguistics, of course knowing nothing about it at the time.
Q: (laughs) Well, that's the best way to learn it right?
A: That's what academic people do!
A: And so I re-taught myself how to program and I taught myself LISP. I was organizing, actually even, a conference on parsing and a lot of people came to it, including my old friends from RAND, Martin Kay and Ron Kaplan, who, I found, were already at PARC, in California, and along came also a young graduate student from Finland by the name of Kim Koskenniemi, and we discovered that we all had an interest in morphology and we all had been doing it. So, I had built an analyzer for Finnish in LISP...
Q: And what did it run on? What hardware?
A: A DEC machine... a DEC 20 or DEC 10, one or the other.
Jean-Paul Haton
Transcribed by Marcel Waeltermann (Deutsche Telekom Laboratories)
Q: We’re greatly pleased today to have Jean-Paul Haton come here and honored that he is allowing us to interview him today. Thank you so much.
A: It’s a pleasure.
Q: One question I think we’ve asked everybody, and maybe the only question we’ve asked everybody is: How did you get into the field?
A: Good question for a good start. Well, I remember the date, it was in ’68, because May ’68 for a French guy is an important date. I had just graduated from Ecole Normale Supérieure. My first introduction to the field of speech processing was related to speech training for the deaf; for deaf children, especially. From that particular topic, I rapidly shifted to automatic speech recognition. Until now, I’m still involved in this particular area of speech communication.
Q: Why did you switch to speech recognition? Was that an opportunity?
A: In fact, speech training is one application area of a fundamental aspect of speech processing: pattern recognition and speech recognition.
Q: And who were you working with at that time?
A: I graduated from Ecole Normale, as I said, in Paris, and I went to Nancy, which is in the north-eastern part of France. And there was a person who was working on speech training for deaf children, due to the fact that, unfortunately, his daughter was deaf. That’s the reason why he was working on this. I was interested in what he was doing. But rapidly, as I said, I shifted to the problem of speech recognition, which was more attractive for me.
Q: Were there other people working in speech recognition there?
A: No, I was the first person. I started the work at that place.
Q: What did you start trying to do? How did you start?
A: I started trying to recognize vowels, using a perceptron. So, that was very particular. At that time, the back-propagation learning algorithm was not yet invented, I should say. So, we were using a single-layer perceptron, which is a linear classifier. So, of course the results concerning the recognition of vowels were not really good.
Q: What computers were you using, or what resource?
A: We were using what we called at that time a minicomputer, “mini-ordinateur” in French, from the French company Télémécanique, which was a very limited machine. Nevertheless, for my doctoral thesis in 1974, I had a real-time system for recognition of, let’s say, if I remember well, 30 words, which was not bad for the time; and in real-time, once again.