IBM Research


IBM Text-to-Speech Research

Text-to-speech (TTS) is the generation of synthesized speech from text. Our goal is to make synthesized speech as intelligible, natural and pleasant to listen to as human speech and have it communicate just as meaningfully.

We have developed a novel TTS system, Naxpres, built on IBM's successful work in data-driven methodologies for speech recognition. The system obtains its parameters through fully automated training on a few hours of speech data, acquired by recording a specially prepared script. During synthesis, very small segments of recorded human speech are concatenated to produce the synthesized speech.
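The concatenation step described above can be illustrated with a minimal sketch. It is not the Naxpres implementation: the function names, the sine-tone stand-ins for recorded segments, the 22,050 Hz sample rate, and the linear crossfade at each seam are all assumptions made for illustration. Segment selection, which a real system would perform with its trained parameters, is not shown.

```python
import math

SAMPLE_RATE = 22050  # assumed rate; the page only says "22 kHz"


def sine_segment(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Hypothetical stand-in for a small recorded speech segment."""
    n = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def concatenate(segments, overlap=64):
    """Join segments in order, linearly crossfading `overlap` samples
    at each seam to soften the discontinuity between neighbours."""
    if not segments:
        return []
    out = list(segments[0])
    for seg in segments[1:]:
        # Never overlap more samples than either side actually has.
        k = min(overlap, len(out), len(seg))
        start = len(out) - k
        for i in range(k):
            w = (i + 1) / (k + 1)  # weight of the incoming segment
            out[start + i] = (1 - w) * out[start + i] + w * seg[i]
        out.extend(seg[k:])
    return out


# Three short "units" joined into one waveform.
units = [sine_segment(f, 0.01) for f in (220, 330, 440)]
speech = concatenate(units)
print(len(speech), "samples")
```

With `overlap=0` the function reduces to plain end-to-end concatenation. A small crossfade is one common way to reduce audible clicks at the joins.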


Interactive U.S. English Demo

This demonstration of our work in unconstrained text-to-speech research allows users to submit text to be synthesized into speech.


Expressive Samples

Most speech synthesis has a neutral, one-size-fits-all expression, regardless of what it's saying. The new IBM expressive speech synthesizer has a range of expressions, so you can tune the speech to fit its content. Here are some examples.

Good news statement    Unexpressive  Expressive
Bad news statement     Unexpressive  Expressive
Yes-no question        Unexpressive  Expressive
Contrastive emphasis (response to question about changing in Atlanta)    Unexpressive  Expressive


Language Samples

Here are some samples from other languages that we are working on. IBM Research labs all over the world have developed these examples of synthesized speech in their languages.

Arabic
Discovered in 1922 by Howard Carter, the tomb of Tutankhamun contained the most extensive royal treasure of ancient Egypt. The collection consisted of over 3850 artifacts including everything from toys and games for the young king to furniture, weapons, chariots, a golden mask and a golden sarcophagus. Many statues and symbols of deities to protect and help the king in the afterlife were also found in the tomb.
   
  • Male 22 kHz (wav)
  • Female 22 kHz (wav)

    Chinese
    欢迎使用国际商业机器公司中文语音合成系统
    Welcome to the Mandarin text-to-speech system developed by the International Business Machines Corporation.
       
  • Simplified-Male 22 kHz (wav)
  • Simplified-Female 22 kHz (wav)

    歡迎使用國際商業機器公司開發的廣東話文語轉換示範。
    Welcome to the Cantonese text-to-speech system developed by the International Business Machines Corporation.
       
  • Cantonese-Male 22 kHz (wav)
  • Cantonese-Female 22 kHz (wav)

    瑪麗有一只小羊羔。它有雪白的毛。無論瑪麗走到那裡,他就跟到那裡。
    Mary has a little lamb. It has snow-white fleece. Wherever Mary goes, it follows her there.
       
  • Taiwanese 22 kHz (wav)

    French
    Male:
    Bienvenue chez IBM. Taper 1 pour commander. Taper 2 pour annuler. Taper 3 pour confirmer. Puis, veuillez composer les 10 chiffres de votre numéro et terminer en appuyant sur la touche dièse.
    Welcome to IBM. Press 1 to order. Press 2 to cancel. Press 3 to confirm. Then dial the 10 digits of your number and finish by pressing the pound key.

    • Male 8 kHz (wav)

    Female:
    Date : Le 21 juin 2002, 12 heures 30. Adresse : 2 avenue Gambetta La Défense. Téléphone : 01 49 05 43 67. Votre compte est créditeur de 3481 euros.
    Date: June 21, 2002, 12:30. Address: 2 avenue Gambetta, La Défense. Telephone: 01 49 05 43 67. Your account has a credit balance of 3481 euros.
    • Female 8 kHz (wav)

    German
    Male:
    Ich habe für Sie ein Zimmer mit Blick auf das Meer für den Zeitraum vom 28. Juli bis 5. August reserviert.
    I have reserved a room with a sea view for you for the period from July 28 to August 5.

    • Male 8 kHz (wav)

    Female:
    Ihr aktuelles Guthaben beträgt 19 Euro und 25 Cent, Ihr nächster Aufladetermin ist der 25.01.2003.
    Your current balance is 19 euros and 25 cents; your next top-up date is January 25, 2003.
    • Female 8 kHz (wav)


      
     

    Products
    IBM's commercial TTS site provides links to available IBM products and product information.

    Publications

    News about our research
      · The Quest for the Digital Chatterbox, Forbes.com
      
