Animal consciousness

Posted on April 30, 2013 by Marco Barnig

Last update : August 6, 2013

Animal consciousness, or animal awareness, has been actively researched for over 100 years, but scientists have never agreed on whether animals are conscious, mainly because of the problem of other minds. As the field of consciousness research evolved and new techniques and strategies for human and non-human animal research were developed, the question was answered last year.

Francis Crick, an English molecular biologist, biophysicist and neuroscientist, co-discovered the structure of the DNA molecule in 1953, together with James Watson. Francis Crick, James Watson and Maurice Wilkins were jointly awarded the 1962 Nobel Prize for Physiology or Medicine for their discoveries.

In 1994, Francis Crick published the book The Astonishing Hypothesis (The Scientific Search for the Soul) about consciousness. In February 2003, Francis Crick and Christof Koch published A framework for consciousness in the journal Nature Neuroscience. The same year, one year before his death, Francis Crick was one of 21 Nobel Laureates who signed the Humanist Manifesto, published by the American Humanist Association (AHA).

The first annual Francis Crick Memorial Conference took place in July 2012 at the University of Cambridge, UK. The upshot of the meeting was the Cambridge Declaration on Consciousness, signed by Christof Koch, David Edelman, Philip Low, Diana Reiss, Bruno van Swinderen and Jaak Panksepp.

Cartoon by Andrzej Krauze

The Cambridge Declaration concludes that “non-human animals have the neuroanatomical, neurochemical, and neurophysiological substrates of conscious states along with the capacity to exhibit intentional behaviors. Consequently, the weight of evidence indicates that humans are not unique in possessing the neurological substrates that generate consciousness. Non-human animals, including all mammals and birds, and many other creatures, including octopuses, also possess these neurological substrates.”

The related cartoon about animal consciousness, drawn by the Polish-born British cartoonist Andrzej Krauze, was published in the New Scientist article Animals are conscious and should be treated as such, edited by Marc Bekoff.

Alan Turing and Robert Moog Google Doodles

Posted on April 29, 2013 by Marco Barnig

To celebrate Robert Moog’s 78th birthday, Google published an interactive doodle of the electronic analog Moog Synthesizer on May 23, 2012.

Google Doodle : Moog Synthesizer

The doodle was synthesized from a number of smaller components to form a unique instrument. When experienced with browsers supporting the Web Audio API, the sound is generated natively. For other browsers the Flash plugin is used. The doodle takes advantage of JavaScript, Closure libraries, CSS3 and tools like Google Web Fonts, the Google+ API, the Google URL Shortener and App Engine.

The Moog doodle was created by Google engineers Reinaldo Aguiar and Rui Lopes and the doodle team lead Ryan Germick.

For Alan Turing’s centennial, Google published an interactive doodle showing a Turing machine one month later, on June 23, 2012. The doodle was designed by software engineers Jered Wierzbicki and Corrie Scalisi and by doodler Sophia Foster-Dimino. The code for this doodle was open sourced and is available at Google Code.

Google Doodle : Turing Machine

A video about the Art & Technology behind Google Doodles is available on YouTube.

Protected: Interesting domain names

Posted on April 28, 2013 by Marco Barnig

Processing software and projects

Posted on April 19, 2013 by Marco Barnig

Last update : January 21, 2014

Processing Software Logo

Processing is an open source programming language and environment for people who want to create images, animations, and interactions. Since 2001, Processing has promoted software literacy within the visual arts and visual literacy within technology. Initially created to serve as a software sketchbook and to teach computer programming fundamentals within a visual context, Processing evolved into a development tool for professionals.

Processing is an open project initiated by Ben Fry and Casey Reas. It evolved from ideas explored in the Aesthetics and Computation Group at the MIT Media Lab. The current version is 2.1, released on October 27, 2013.

The following websites help you learn Processing :

processing.org : the official website with exhibitions, references, downloads, forums, feeds, wikis, code snippets, examples, blogs and more.
Processing on Wikipedia
Processing.js : official website of the JavaScript port of Processing
Learning Processing : a beginner’s guide by Daniel Shiffman (pdf file)
OpenProcessing : to share sketches with others
Sketchpatch : programming playground
HasCanvas : tool for creating and sharing processing.js sketches
SketchPad : featured processing.js sketches in the gallery
Stack Overflow : about 1,000 questions tagged with processing
Arduino : Playground processing
Processing tutorials, by Joseph Alexander Boston

Google text to speech (TTS) with processing

Posted on April 17, 2013 by Marco Barnig

Following up on the post about Google STT, this post covers Google speech synthesis with Processing. In November 2011, Amnon Owed presented Processing code snippets that make use of Google’s text-to-speech web service. The idea was born in the Processing forum.
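As a rough illustration of what a request to such a web service looks like, the Python sketch below only builds the request URL; it does not contact any server. The endpoint `http://translate.google.com/translate_tts` and the parameter names `tl` (language) and `q` (text) are assumptions about how the unofficial service was commonly called at the time. They are not given in this post, the service is not a documented public API, and it may change or stop working. The helper name `build_tts_url` is made up for this example.

```python
from urllib.parse import urlencode

# Assumed endpoint of the unofficial Google TTS service (not a documented API).
TTS_ENDPOINT = "http://translate.google.com/translate_tts"


def build_tts_url(text, language="en"):
    """Return the URL that would request spoken audio for `text`."""
    if not text.strip():
        raise ValueError("text must not be empty")
    # `tl` = language code, `q` = text to speak (assumed parameter names).
    query = urlencode({"tl": language, "q": text})
    return TTS_ENDPOINT + "?" + query


if __name__ == "__main__":
    # One phrase for each of the three languages tried in this post.
    for lang, phrase in [("en", "Hello world"),
                         ("fr", "Bonjour le monde"),
                         ("de", "Guten Tag")]:
        print(build_tts_url(phrase, lang))
```

A Processing sketch would pass a URL of this kind to an audio library such as Minim to play the returned sound.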

The sketch makes use of the Minim library that comes with Processing. Minim is an audio library that uses the JavaSound API, a bit of Tritonus, and Javazoom’s MP3SPI to provide an easy to use audio library for people developing in the Processing environment. The author of Minim is Damien Di Fede (ddf), a creative coder and composer interested in interactive audio art and music games. In November 2009, Damien was joined by Anderson Mills who proposed and co-developed the UGen Framework for the library.

I use the Minim 2.1.0 beta version with this new UGen Framework. I installed the Minim library in the libraries folder in my sketchbook and deleted the integrated 2.0.2 version from the Processing (2.0b8) folder modes/java/libraries.

Today I ran successful trials with the English, French and German Google TTS engines. I am impressed by the results.

Google speech to text (STT) with processing

Posted on April 17, 2013 by Marco Barnig

Processing is an open source programming language and environment for people who want to create images, animations, and interactions.

Florian Schulz, Interaction Design student at FH Potsdam, presented a speech to text (STT) library based on the Google API in the Processing forum a year ago. The source code is available at GitHub, and a project page provides additional information. The library is based on an article by Mike Pultz, titled Accessing Google Speech API / Chrome 11, published in March 2011.
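To give an idea of what a client does with the answer from such a recognition service, here is a minimal Python sketch that picks the most confident hypothesis from a JSON response. The JSON layout (a `hypotheses` list whose entries carry `utterance` and `confidence` fields) is an assumption about the unofficial service and is not stated in this post; the function name `best_hypothesis` and the sample response are invented for the example.

```python
import json


def best_hypothesis(response_text):
    """Return (utterance, confidence) of the most confident hypothesis.

    Returns None when the response contains no hypotheses.
    The field names used here are assumptions, see the text above.
    """
    data = json.loads(response_text)
    hypotheses = data.get("hypotheses", [])
    if not hypotheses:
        return None
    # A missing confidence value is treated as 0.0.
    best = max(hypotheses, key=lambda h: h.get("confidence", 0.0))
    return best["utterance"], best.get("confidence", 0.0)


if __name__ == "__main__":
    # Hand-written sample, not a recorded server response.
    sample = ('{"status": 0, "hypotheses": ['
              '{"utterance": "hello world", "confidence": 0.92}]}')
    print(best_hypothesis(sample))
```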

I installed the library in my Processing environment (version 2.0b8) and ran the test examples successfully. I did some trials with the French and German Google speech recognition engines. I am impressed by the results.

Additional information about this topic is provided in the following link list :

The Telecoms’ Future presented by Ernst & Young

Posted on April 12, 2013 by Marco Barnig

Over the past six months, Ernst & Young developed a scenario study entitled How will consumers communicate in 2020? The results were presented today by the Telecommunications, Media & Entertainment and Technology (TMT) experts of Ernst & Young at its premises in Luxembourg.

Their thorough analysis resulted in two core uncertainties :

security and privacy (two extremes : in control or full chaos)
degree of Internet integration into our daily lives (two extremes : fragmented or fully integrated)

Placing these two core uncertainties on the axes of a coordinate system results in the following four divergent and challenging scenarios :

Full Speed Ahead scenario (self-regulation and uniform standards)
Roller Coaster scenario (high speed innovation, no rules)
Speed Limit Control scenario (stringent rules and regulations, more expensive, less user-friendly)
Gear Down scenario (loss of trust in the Internet)

Ernst & Young has sketched these four illustrative scenarios in an interactive video.

The Internet Society engaged in a similar scenario study about the future of the Internet a few years ago.

BlackBerry Protect

Posted on April 11, 2013 by Marco Barnig

Last update : September 28, 2013

BlackBerry Protect is designed to help find your lost BlackBerry smartphone and keep the information on it secure. In BlackBerry 10, BlackBerry Protect is a feature built into the OS. For smartphones running on BlackBerry 7 OS and earlier versions of BlackBerry Device Software, BlackBerry Protect is a free application you can download on your smartphone.

The current location of your device can be shown on a map. You can make the device ring loudly to help you find it, even if sound is currently turned off, and send a customized message to its home screen, even if the device is locked. You can also lock the device, optionally set a new password, and permanently delete all data from it. All these actions are performed remotely from the BlackBerry Protect website.

View Location of a BlackBerry device in BlackBerry Protect

My BlackBerry Z10 was located correctly in Cloche d’Or, Luxembourg.

Voice driven web applications

Posted on April 10, 2013 by Marco Barnig

Last update : July 17, 2013

The new JavaScript Web Speech API specified by W3C makes it easy to add speech recognition to a web page and to create voice driven web applications. It enables developers to use scripting to generate text-to-speech output and to use speech recognition as an input for forms, continuous dictation and control. The JavaScript API allows web pages to control activation and timing and to handle results and alternatives.
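To make the idea of results and alternatives more concrete, here is a small Python model of the data a recognizer hands back during continuous dictation. It is only an illustration: the real API is JavaScript and runs in the browser. The class names `Alternative` and `Result` and the function `split_transcript` are invented for this sketch and loosely mirror the `transcript`, `confidence` and `isFinal` fields of the specification. The sketch also assumes that the first alternative of each result is the preferred one.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Alternative:
    transcript: str      # recognized text
    confidence: float    # recognizer's confidence estimate


@dataclass
class Result:
    alternatives: List[Alternative]  # assumed to be ordered best first
    is_final: bool                   # False while the phrase may still change


def split_transcript(results: List[Result]) -> Tuple[str, str]:
    """Return (final_text, interim_text) built from the top alternatives."""
    final_parts: List[str] = []
    interim_parts: List[str] = []
    for result in results:
        if not result.alternatives:
            continue
        top = result.alternatives[0].transcript.strip()
        if result.is_final:
            final_parts.append(top)
        else:
            interim_parts.append(top)
    return " ".join(final_parts), " ".join(interim_parts)


if __name__ == "__main__":
    results = [
        Result([Alternative("hello world", 0.9),
                Alternative("hollow world", 0.4)], True),
        Result([Alternative("how are", 0.0)], False),
    ]
    print(split_transcript(results))
```

A dictation page would typically show the final text as committed and the interim text in a lighter style until the recognizer confirms it.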

The Web Speech specification was published by the Speech API Community Group, chaired by Glen Shires, software engineer at Google. The specification is not a W3C Standard nor is it on the W3C Standards Track.

A demo that works in Chrome 25 and later is available on the HTML5 Rocks website.

There are two processes : Text-to-Speech (speech synthesis : TTS) and Speech-to-Text (speech recognition : ASR). There are at least three different approaches to synthesize text :

integrated : a TTS module is built into the OS, or a separately installed TTS engine can plug into the OS’s TTS module.
packaged : instead of requiring a separate install, a synthesizer and voices can be packaged and shipped with the application.
in the cloud : a web service is used to synthesize text. The advantage of this is a more predictable and consistent voice quality, independent of the hardware and operating system used on the mobile client.

Concerning ASR, Wolf Paulus, an internationally experienced technologist and innovator, compared the performance (speed and accuracy) of the speech recognition systems developed by Google, Nuance, iSpeech and AT&T.

An HTML Speech XG Speech API Proposal, introduced by Microsoft to the HTML Speech Incubator Group, is available as an unofficial draft at the W3C website.

A list of speech recognition software is available at Wikipedia. The main hosted speech applications are presented below :

iSpeech

iSpeech provides speech solutions for individuals and businesses in fields such as mobile, connected homes, automotive, publishing (audio books), e-learning and more. The solutions include text-to-speech (TTS) and speech recognition (ASR).

iSpeech offers APIs and SDKs for developers for different devices and programming languages (iPhone, Android, BlackBerry, PHP, Java, Python, .NET, Flash, Ruby, Perl), together with comprehensive documentation, integration guides, web samples and FAQs. iSpeech provides development keys to use the three servers :

Mobile Development
Mobile Production
Web/General/Desktop/Other Production

The applications must be configured to use the correct servers. To make the web/general key work, you need to buy credits. The low usage price is $0.02 per word (TTS) or per transaction (ASR).
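A quick way to get a feeling for this pricing is to estimate the cost of a text before sending it. The Python sketch below uses only the $0.02 figure quoted above. How iSpeech actually counts words is not stated here, so treating every whitespace-separated token as one word is an assumption, and the function names are made up for the example.

```python
from decimal import Decimal

# Price quoted above: USD per word (TTS) or per transaction (ASR).
PRICE_PER_UNIT = Decimal("0.02")


def tts_cost(text):
    """Estimated TTS cost, counting whitespace-separated tokens as words."""
    words = len(text.split())
    return words * PRICE_PER_UNIT


def asr_cost(transactions):
    """Estimated ASR cost for a number of recognition transactions."""
    if transactions < 0:
        raise ValueError("transactions must not be negative")
    return transactions * PRICE_PER_UNIT


if __name__ == "__main__":
    sentence = "The quick brown fox jumps over the lazy dog"
    print("TTS, 9 words:", tts_cost(sentence), "USD")
    print("ASR, 100 transactions:", asr_cost(100), "USD")
```

`Decimal` is used instead of floating point so that the amounts stay exact to the cent.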

A free iSpeech app for iOS devices (version 1.3.5, updated May 13, 2013) that converts text to speech with the best sounding voices is available at the iTunes Store. This app is powered by the iSpeech.org Text to Speech (TTS) software as a service (SaaS) API. Other apps for iOS and Android devices are listed at the iSpeech website. A Text-to-Speech demo is also available.

Nuance

Nuance Communications is a multinational computer software technology corporation, headquartered in Burlington, Massachusetts, that provides speech and imaging applications.

In August 2012, Nuance announced Nina, a collection of personal assistant technologies that will bring Siri-like functionality to customer service mobile apps.

Nuance provides the Dragon Mobile SDK to developers who have joined the NDEV Dragon Mobile developer program. This creates a unique opportunity in the mobile developer ecosystem to power any application with Nuance’s proven, best-in-class Dragon NaturallySpeaking voice recognition technology.

By joining NDEV Mobile, developers get free access to wrappers and widgets for simple application customization, all through a self-service website. Developers also have access to an online community forum for support, a variety of code samples and full documentation. Once an NDEV Mobile developer has integrated the SDK into their application, Nuance provides 90 days of free access to the cloud-based speech services to validate the power of speech recognition in their application. To put an application into production, a licence fee of $3,000 has to be prepaid. The low usage price is $0.009 per transaction.

The following platforms are supported :

Apple iOS
Android
Windows Phone
HTTP web services interface

A mobile assistant & voice app for iOS and Android is available in the iTunes and Google Play stores.

AT&T Watson Speech engine

AT&T offers a free speech development program to access the tools needed to build, test, onboard and certify applications across a range of devices, OSes and platforms.

There are three classes of functionality in the AT&T speech API family :

Speech to Text : 9 contexts are optimized to return the text of what the end users say. The text can be returned in multiple formats, including JSON and XML.
Text to Speech : Male and female ‘characters’ are available for both English and Spanish.
Speech to Text Custom : the speech service is customized by sending a list of words or phrases commonly spoken by the end users to improve recognition of those unique words. The Grammar List supports 19 languages, the Generic with Hints supports English and Spanish.

The Call Management (Beta) API that is powered by Tropo™ exposes SMS and Voice Calling RESTful APIs, which enable app developers to create voice-enabled apps that send or receive calls, provide Interactive Voice Response (IVR) logic, Automatic Speech Recognition (ASR), Voice to Text (VTT), Text (SMS) integration, and more. SDKs are available for HTML5 (Sencha Touch), Android, iOS and Microsoft. Tools are provided for key platforms, including Android, Brew MP, HTML5, RIM BlackBerry and Windows Phone.

The Speech API provides two methods for transcribing audio into text and one method for rendering text into audio. An AT&T Natural Voices Text-to-Speech Demo is available at the AT&T research website.

API access to the AT&T sandbox and production environments costs $99 a year. The sandbox and production environments allow you to develop, test, and deploy applications using AT&T APIs, and include 1 million points (one transaction = one point) each month to spend on any APIs you like. A US-based credit card is required; $20 is charged for each additional group of 2,000 points beyond the one million. See the AT&T pricelist.
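The pricing above can be turned into a small yearly estimate. The Python sketch below uses only the figures quoted in this post (annual fee, one million included points per month, price per additional group of 2,000 points). Whether a partly used group is billed in full is not stated, so the sketch assumes it is; the function names are invented for the example.

```python
ANNUAL_FEE = 99                  # USD per year, as quoted above
INCLUDED_POINTS = 1_000_000      # points included each month
GROUP_SIZE = 2_000               # points per additional group
GROUP_PRICE = 20                 # USD per additional group


def monthly_overage(points_used):
    """Extra charge in USD for one month of usage."""
    extra = max(0, points_used - INCLUDED_POINTS)
    # Ceiling division: a partly used group counts as a full group (assumption).
    groups = -(-extra // GROUP_SIZE)
    return groups * GROUP_PRICE


def yearly_cost(monthly_points):
    """Annual fee plus the overage of every month in `monthly_points`."""
    return ANNUAL_FEE + sum(monthly_overage(p) for p in monthly_points)


if __name__ == "__main__":
    light_year = [500_000] * 12               # never exceeds the included points
    busy_year = [1_005_000] + [800_000] * 11  # one month goes 5,000 points over
    print("light usage:", yearly_cost(light_year), "USD")
    print("one busy month:", yearly_cost(busy_year), "USD")
```

With these assumptions, staying within the included points costs only the annual fee, while a single month that is 5,000 points over adds three groups to the bill.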

AT&T Application Resource Optimizer (ARO) is a free diagnostic tool for analyzing the performance of your mobile applications. It can help your app run faster and smarter by providing recommendations to help optimize your mobile application’s performance, speed, network impact and battery utilization.

Speech API FAQs as well as code samples, documents, tutorials, guides, SDKs, tools, blogs, forums and more are available at the AT&T speech development website.

Google Speech API

The Google Speech API can be accessed safely through a Chrome browser using x-webkit-speech. Some people have reverse engineered the Google speech API for other uses on the web. The interface is free, but it is not an official public API.

On February 23, 2013, Google announced on the Chrome Blog that the new stable Chrome release includes support for the Web Speech API, which developers can use to integrate speech recognition capabilities into their web apps in more than 30 languages. A Web Speech API demo is available at the Google website. In the Peanut Gallery, you can add intertitles to old black-and-white movies simply by talking to Chrome.

The following list provides links to more information about the Google speech APIs :

Google Text-to-Speech TTS support
Chrome TTS, Google
speech2text, by Todd Fisher
Voicecolor, by filosophy
Introducing the Google Translate app for iPhone, Google Mobile Blog

More speech applications from other suppliers are listed below :

Apple Siri (rumored to be powered by Nuance)
eSpeak
Free TTS
Infovox
Ivona (Amazon)
Julius
Mozilla HTML5 Speech API
OpenEars
speechapi
Speechutil (free TTS conversion)
Sphinx-4 and CMU Sphinx
Voiceware
Voxeo

The Eclipse Voice Tools Project (VTP) allows you to build and run speech recognition applications using industry standards such as VoiceXML and Speech Recognition Grammar Specification (SRGS).

Internet with a Brain

Your browser becomes your personal assistant and Internet gets a synthetic consciousness

Author Archives: Marco Barnig