Speech Recognition with Kinect for Windows SDK Beta + Speech

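I installed Microsoft Speech and also tried out speech recognition with the Kinect's array microphone. I lightly modified the sample "Microsoft Research KinectSDK Samples\Audio\Speech\CS\Speech.sln", registering the programming words needed to reproduce the statement below (C# code) plus a few extra words, and then read the code aloud to see how well it would be recognized.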
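For reference, registering a word list like this with Microsoft.Speech could look roughly like the sketch below. This is not the modified sample itself: the word list is simply read off the recognition log in this post, the class name `VocabularySketch` is made up, and picking the first en-US recognizer and using the default audio device are assumptions made to keep the sketch short (the SDK sample feeds in the audio stream from the Kinect array microphone instead).

```csharp
using System;
using System.Linq;
using Microsoft.Speech.Recognition;

class VocabularySketch
{
    static void Main()
    {
        // Words to recognize. Taken from the phrases that appear in the log;
        // the actual list used in the modified sample is not shown in the post.
        var words = new Choices(new string[]
        {
            "namespace", "class", "public", "static", "void", "protected",
            "Speech", "Main", "System", "Console", "WriteLine", "Hello World",
            "open brace", "close brace", "dot", "semicolon", "doublequoted",
            "function", "call", "args", "case", "fighter", "once more"
        });

        // Assumption: take the first installed en-US recognizer.
        RecognizerInfo ri = SpeechRecognitionEngine.InstalledRecognizers()
            .FirstOrDefault(r => r.Culture.Name == "en-US");
        if (ri == null)
        {
            Console.WriteLine("No en-US recognizer is installed.");
            return;
        }

        using (var sre = new SpeechRecognitionEngine(ri.Id))
        {
            // Build a grammar that accepts exactly one of the words per utterance.
            var gb = new GrammarBuilder { Culture = ri.Culture };
            gb.Append(words);
            sre.LoadGrammar(new Grammar(gb));

            sre.SpeechHypothesized += (s, e) =>
                Console.WriteLine("Speech Hypothesized: " + e.Result.Text);
            sre.SpeechRecognized += (s, e) =>
                Console.WriteLine("Speech Recognized: " + e.Result.Text);
            sre.SpeechRecognitionRejected += (s, e) =>
                Console.WriteLine("Speech Rejected");

            // Assumption: default audio device instead of the Kinect audio stream.
            sre.SetInputToDefaultAudioDevice();
            sre.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Recognizing. Press ENTER to stop");
            Console.ReadLine();
            sre.RecognizeAsyncStop();
        }
    }
}
```

Because the grammar is a flat list of choices, every added word is a candidate at every moment, which is why the extra "wrong" words can show up in the results.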

What was read aloud

namespace Speech {
	class Main {
		public static void Main() {
			System.Console.WriteLine("Hello World");
		}
	}
}

Recognition results

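The recognition results are shown below. I tested with the English recognizer as-is, so some of the pronunciation was hard for me, but it looks quite usable. Just as current Kinect-enabled games already do, if you choose the words carefully for each situation and avoid similar-sounding ones, it should not be hard to let users issue simple commands to games and applications by voice.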

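If one could add context-based judgment, grow and shrink the vocabulary dynamically depending on the position in the code, and build an IntelliSense-like way of picking completion candidates, we might get a little closer to programming by voice (lol).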
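To make the idea of a context-dependent vocabulary a little more concrete, here is a minimal sketch in plain C#. It does not touch the speech engine at all: the rule table, the `ContextVocabulary` class and the `Pick` method are all hypothetical, and the only thing it does is choose, from a list of recognition alternatives, the first one that is allowed after the previously recognized word.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ContextVocabulary
{
    // Hypothetical rules: previous word -> words that may follow it.
    static readonly Dictionary<string, string[]> Rules =
        new Dictionary<string, string[]>
        {
            { "namespace", new[] { "Speech" } },
            { "class",     new[] { "Main" } },
            { "static",    new[] { "void" } },
            { "dot",       new[] { "Console", "WriteLine" } },
        };

    // Hypothetical fallback vocabulary when no rule applies.
    static readonly string[] Default =
    {
        "namespace", "class", "public", "static", "void", "System",
        "dot", "open brace", "close brace", "semicolon"
    };

    // Returns the words allowed after the given previous word.
    public static string[] Allowed(string previous)
    {
        string[] words;
        if (previous != null && Rules.TryGetValue(previous, out words))
        {
            return words;
        }
        return Default;
    }

    // Returns the first alternative allowed in this context, or null if none is.
    public static string Pick(string previous, IEnumerable<string> alternatives)
    {
        var allowed = new HashSet<string>(Allowed(previous));
        return alternatives.FirstOrDefault(w => allowed.Contains(w));
    }
}

class Program
{
    static void Main()
    {
        // After "dot", "protected" is not a valid member name here, so "Console" wins.
        Console.WriteLine(ContextVocabulary.Pick("dot", new[] { "protected", "Console" }));
        // With no context, the fallback vocabulary is used.
        Console.WriteLine(ContextVocabulary.Pick(null, new[] { "class", "case", "args" }));
    }
}
```

In a real setup the same table could instead be used to load and unload grammars as the context changes, so that the recognizer never even considers words that cannot appear at that point.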

Note: these are the results after several attempts. A few of the incorrect words I added on a whim were also hit.

Using: Microsoft Server Speech Recognition Language - Kinect (en-US)
Recognizing. Say: ... Press ENTER to stop
Speech Hypothesized: namespace
Speech Recognized: namespace
alternatives: namespace
Speech Hypothesized: once
Speech Recognized: args
alternatives: args
Speech Hypothesized: Speech
Speech Recognized: Speech
alternatives: Speech
Speech Hypothesized: open brace
Speech Recognized: open brace
alternatives: open brace
Speech Hypothesized: class
Speech Recognized: class
alternatives: class case args
Speech Hypothesized: Main
Speech Recognized: Main
alternatives: Main
Speech Hypothesized: open brace
Speech Recognized: open brace
alternatives: open brace
Speech Hypothesized: public
Speech Recognized: public
alternatives: public
Speech Hypothesized: static
Speech Recognized: static
alternatives: static
Speech Hypothesized: void
Speech Recognized: void
alternatives: void
Speech Hypothesized: Main
Speech Recognized: Main
alternatives: Main
Speech Hypothesized: function
Speech Recognized: function
alternatives: function
Speech Hypothesized: open brace
Speech Recognized: open brace
alternatives: open brace
Speech Hypothesized: System
Speech Recognized: System
alternatives: System
Speech Hypothesized: dot
Speech Recognized: dot
alternatives: dot
Speech Hypothesized: Console
Speech Recognized: Console
alternatives: Console
Speech Hypothesized: dot
Speech Recognized: dot
alternatives: dot
Speech Hypothesized: protected
Speech Recognized: protected
alternatives: protected
Speech Hypothesized: fighter
Speech Recognized: fighter
alternatives: fighter
Speech Hypothesized: fighter
Speech Rejected

Writing file: RetainedAudio_21.wav
Speech Hypothesized: WriteLine
Speech Recognized: WriteLine
alternatives: WriteLine
Speech Hypothesized: Console
Speech Recognized: Console
alternatives: Console
Speech Hypothesized: call
Speech Recognized: call
alternatives: call
Speech Hypothesized: doublequoted
Speech Recognized: doublequoted
alternatives: doublequoted
Speech Hypothesized: Hello World
Speech Recognized: Hello World
alternatives: Hello World
Speech Hypothesized: semicolon
Speech Recognized: semicolon
alternatives: semicolon
Speech Hypothesized: close brace
Speech Recognized: close brace
alternatives: close brace
Speech Hypothesized: once more
Speech Recognized: once more
alternatives: once more
Speech Hypothesized: once more
Speech Recognized: once more
alternatives: once more
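As a small follow-up, the recognized phrases could be turned back into source text with something like the sketch below. The mapping from spoken phrases to symbols is my own guess at the obvious cases ("open brace", "close brace", "dot", "semicolon"); how phrases such as "function", "call" or "doublequoted" should be expanded is left out, and the `SpokenCode` class is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class SpokenCode
{
    // Assumed mapping from spoken phrases to C# symbols.
    static readonly Dictionary<string, string> Symbols =
        new Dictionary<string, string>
        {
            { "open brace",  "{" },
            { "close brace", "}" },
            { "dot",         "." },
            { "semicolon",   ";" },
        };

    // Joins recognized phrases into one line of source text.
    public static string ToSource(IEnumerable<string> phrases)
    {
        var sb = new StringBuilder();
        foreach (string phrase in phrases)
        {
            string symbol;
            string text = Symbols.TryGetValue(phrase, out symbol) ? symbol : phrase;

            // No space at the start, before "." or ";", or right after ".".
            bool glue = sb.Length == 0
                        || text == "." || text == ";"
                        || sb[sb.Length - 1] == '.';
            if (!glue)
            {
                sb.Append(' ');
            }
            sb.Append(text);
        }
        return sb.ToString();
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(SpokenCode.ToSource(
            new[] { "namespace", "Speech", "open brace" }));
        Console.WriteLine(SpokenCode.ToSource(
            new[] { "System", "dot", "Console", "dot", "WriteLine" }));
    }
}
```

Unknown phrases are passed through unchanged, so misrecognized words such as "fighter" would still end up in the output and would have to be filtered out separately.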