asteroza + speech   52

Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche | Science Advances
In terms of encoded data, humans generally operate at about 39.15 bits/sec regardless of language. That means speakers of languages with low information per syllable must talk faster to compensate. Japanese seems to encode about 5 bits/syllable, so it has to be spoken fast. At the other end, it looks like US English is capped at about 9 syllables/sec, so audio processing into information (mental objects) is a possible limiter on the high end.
linguistics  cognitive  research  language  speech  speed  density  information  science  neural  processing 
6 weeks ago by asteroza
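A quick sanity check of the arithmetic in the note above, as a minimal sketch. It assumes the 39.15 bits/sec and 5 bits/syllable figures exactly as quoted in the note; the function name is made up for illustration.

```python
# Figures are taken from the note above, not re-derived from the paper.
INFO_RATE_BITS_PER_SEC = 39.15


def required_syllable_rate(bits_per_syllable: float,
                           info_rate: float = INFO_RATE_BITS_PER_SEC) -> float:
    """Syllables per second needed to reach info_rate at a given density."""
    if bits_per_syllable <= 0:
        raise ValueError("bits_per_syllable must be positive")
    return info_rate / bits_per_syllable


# Japanese at roughly 5 bits/syllable, per the note.
japanese_rate = required_syllable_rate(5.0)
print(f"Japanese: {japanese_rate:.2f} syllables/sec")
```

At 5 bits/syllable this works out to roughly 7.8 syllables/sec.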
DeepZen
Trying to put audiobook narrators out of business, by automating narration of e-books for conversion into audiobooks. Probably not better than real pro voice actors organized specifically for an audiobook, but for random small e-books, this is probably good enough (though that will put struggling unknown voice actors out of work...)
deep  machine  learning  ebook  e-book  voice  narration  automation  audiobook  conversion  audio  generator  service  speech  synthesis  TTS  text-to-speech 
june 2019 by asteroza
NVIDIA/waveglow: A Flow-based Generative Network for Speech Synthesis
25x realtime, so you can respond in realtime if your text generating NN can keep up with the speech-to-text...
Nvidia  GPU  CUDA  deep  machine  learning  human  speech  synthesis  synthetic  voice  artificial  generative  network  software  audio  TTS 
november 2018 by asteroza
Voicery Speech Synthesis
Using deep learning to make artificial voices for voice assistants
artificial  voice  synthesizer  service  TTS  text-to-speech  speech  audio 
march 2018 by asteroza
Serverless speech recognition with WebAssembly
Not the Lambda variety of serverless; rather, using WebAssembly to optimize local speech recognition from within the browser.
javascript  webassembly  speech  recognition  library  software 
october 2017 by asteroza
Zypr | Tight
Basically like Siri, but the backend is a stable buffer API web platform overlaid on third-party web service APIs, allowing you to do Siri-like stuff without being aware of the true underlying APIs being used. So if some app developer wants to use Zypr with Facebook integration services, and Facebook changes its underlying API, you won't notice with Zypr, as it only exposes a normalized web API.
Pioneer  Zypr  natural  language  voice  speech  recognition  API  web  service  platform  Siri  cloud  mashup  webdev  programming  development  Delicious 
november 2011 by asteroza
Why Some Languages Sound So Much Faster than Others - TIME
Japanese has a high syllable speech rate (7.84 syllables per second on average) but a low syllable information density (0.49). For comparison, English is 6.19/0.91, Mandarin 5.18/0.94, and Spanish 7.82/0.63. However, in terms of total data output, most languages deliver about the same amount of information per unit of time, indicating a neural limit on language.
japanese  language  reference  information  speech  speed  syllable  data  density  Delicious 
september 2011 by asteroza
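To see how rate and density trade off in the figures quoted above, here is a rough sketch that multiplies the two for each language. It assumes the numbers exactly as given in the note, and that information output per second is simply syllable rate times density (my reading of the note, in whatever relative unit the density uses); the names are made up for illustration.

```python
# (syllables per second, information density per syllable), as quoted in the note.
LANGUAGES = {
    "Japanese": (7.84, 0.49),
    "English": (6.19, 0.91),
    "Mandarin": (5.18, 0.94),
    "Spanish": (7.82, 0.63),
}


def information_rate(syllables_per_sec: float, density: float) -> float:
    """Assumed model: information per second = syllable rate * density."""
    return syllables_per_sec * density


rates = {name: information_rate(rate, density)
         for name, (rate, density) in LANGUAGES.items()}

for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

With these four data points the products land in a fairly narrow band compared with the spread in raw syllable rates, though they are not identical; Japanese comes out lowest.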
Asiajin » “Koe-tan”, A Free iPhone Voice Search App For Public Transit In Tokyo
Voice search enabled iPhone app to do transit searches for metro Tokyo. I thought the InFerret guys were all over this but I guess not...
koe-tan  iPhone  tokyo  public  transit  railway  railroad  JR  subway  voice  speech  recognition  search  lookup  software  app  japan  Novauris  Delicious 
october 2009 by asteroza
Speech enable your personal website or blog - ReadSpeaker
Opportunity for someone to make an iPhone app that reads RSS feeds. Like audiobooks, but realtime and not stale.
blog  audio  flash  widget  text  plugin  text-to-speech  TTS  reader  speech  reading  ReadSpeaker  WebReader  voicer  Delicious 
february 2009 by asteroza