[Cuis] Audio and Video Object Analysis

Sun Dec 15 13:24:49 CST 2013

I will be working toward making Cuis programmable by voice.  Many existing
voice transcribers do not transcribe my voice accurately.  The first step
is to capture audio input and then use the Fast Fourier Transform (the FFT
package in Cuis 4.2) to convert it from the time domain to the frequency
domain, so that patterns of spoken syllables and characters can be
recognized.  One idea I read recently is to use overlapping frames to
capture speech events that may be longer or shorter depending on speaking
speed.  I don't yet know how to display FFT output to look for patterns, or
how to match a pattern that varies between samples, like a triangle that is
equilateral in one speech sample but isosceles in another.  I hope
eventually to detect more nuanced attributes, such as the feelings being
communicated.  Is anyone interested in collaborating on this project?
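
To make the overlapping-frame idea concrete, here is a minimal sketch in
Python rather than Cuis, using only the standard library.  It is not the
Cuis FFT package: the names hann, dft_magnitudes and spectrogram, the frame
size of 64, the hop of 32 and the test tone are all assumptions chosen for
illustration, and a naive DFT stands in for a real FFT.  Each frame overlaps
the previous one by half, and each row of the result is the magnitude
spectrum of one frame.

```python
import cmath
import math


def hann(n):
    """Hann window of length n; tapers frame edges to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..n/2 (naive O(n^2); an FFT computes the same values faster)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def spectrogram(samples, frame_size=64, hop=32):
    """Cut samples into overlapping frames (hop < frame_size); return one spectrum per frame."""
    window = hann(frame_size)
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_size)]
        rows.append(dft_magnitudes(frame))
    return rows


# Hypothetical test signal: a 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]

spec = spectrogram(tone)
# Strongest bin of the first frame, converted back to a frequency in Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

Drawing the rows side by side, with brightness standing for magnitude, gives
a spectrogram, which is one common way to look at FFT output for patterns.
A smaller hop gives more overlap and finer time resolution at the cost of
more frames to process.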

I'm also interested in video object analysis.  The goal is to let Cuis know
what it is seeing, so that it can be used in robotic applications like the
Google self-driving cars, or in street optical character recognition for
blind people or speakers of foreign languages.  The first step is to get a
video stream into Cuis (this has already been done in Squeak), then to
capture and process individual frames with arc, line, and corner detection
to match patterns.  A set of pattern-matching methods will then classify
visual objects for more detailed analysis, until the scene can be described
in text.  A further step would be to analyze camera motion across multiple
frames to produce more accurate descriptions of 3D objects.  Is anyone
interested in collaborating on this project?
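
As a rough illustration of the per-frame processing step, here is a minimal
Python sketch of line (edge) detection using the Sobel operator, which is
one standard technique and not necessarily the one this project would end
up using.  The function names, the threshold and the tiny 6x6 frame are
assumptions made up for the example; a real frame would come from the video
stream.

```python
def sobel_magnitude(image):
    """Approximate gradient magnitude at each interior pixel of a 2D grayscale image."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column, centre row weighted 2.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row, centre column weighted 2.
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def edge_pixels(image, threshold):
    """Return (x, y) coordinates whose gradient magnitude exceeds the threshold."""
    mag = sobel_magnitude(image)
    return [(x, y) for y, row in enumerate(mag) for x, m in enumerate(row) if m > threshold]


# Hypothetical 6x6 frame: dark left half (0), bright right half (255).
frame = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
edges = edge_pixels(frame, threshold=100)
```

Here the detected pixels form a vertical strip along the boundary between
the dark and bright halves.  Border pixels are skipped because the 3x3
neighbourhood does not fit there.  Arc and corner detection would build on
the same gradient information.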

--
View this message in context: http://forum.world.st/Audio-and-Video-Object-Analysis-tp4730295.html
Sent from the Cuis Smalltalk mailing list archive at Nabble.com.