Why Expectation-Based, Multi-focal, Saccadic Vision for Vehicles? (EMS-Vision)
Abstract
Computational animation of spatiotemporal models for 3-D shape and motion is considered the core of dynamic real-time vision. It is driven by sets of features evaluated from multi-focal image streams. Subjects are defined as special objects capable of sensing environmental parameters and of initiating actions of their own in combination with stored knowledge. Object/subject recognition and scene understanding are achieved on different levels. Multiple objects are tracked individually to perceive their actual state (‘here and now’). Fast saccadic jumps in gaze direction allow flexible concentration on the objects or parts currently of interest. The situation relevant for decision making is assessed by analyzing the motion of relevant objects/subjects over a longer time scale on the level of state variables and documenting it in a ‘scene tree representation’ as used in computer graphics. Various behavioral capabilities of subjects are represented on an abstract level to characterize their potential behaviors; the behaviors themselves are generated by stereotypical feed-forward and feedback control laws on a separate level close to the actuator hardware. This dual representation, one level for decision making and one for implementation, allows flexibility and easy adaptation or extension. Results are shown for road vehicle guidance based on three cameras mounted on a gaze-controlled platform.
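To illustrate the ‘scene tree representation’ mentioned above, the following Python sketch (not from the paper; all class names, frame names, and numerical values are hypothetical) shows one way objects and subjects can be arranged as tree nodes connected by homogeneous coordinate transforms, as in computer graphics. The relative pose between any two nodes, e.g. a camera on the gaze platform and another vehicle, is obtained by chaining transforms up to the root; such relative poses are the basis for predicting where features of a tracked object should appear in an image.

```python
import numpy as np

def hom_transform(rotation_rpy=(0.0, 0.0, 0.0), translation=(0.0, 0.0, 0.0)):
    """Build a 4x4 homogeneous transform from roll/pitch/yaw angles (rad) and a translation."""
    roll, pitch, yaw = rotation_rpy
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    R = np.array([
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp,     cp * sr,                cp * cr],
    ])
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = translation
    return T

class SceneNode:
    """A node of the scene tree: an object/subject posed relative to its parent node."""
    def __init__(self, name, transform_to_parent=None):
        self.name = name
        self.transform_to_parent = np.eye(4) if transform_to_parent is None else transform_to_parent
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def transform_to_root(self):
        """Chain homogeneous transforms from this node up to the root frame."""
        T = self.transform_to_parent
        node = self.parent
        while node is not None:
            T = node.transform_to_parent @ T
            node = node.parent
        return T

def relative_pose(observer, target):
    """Pose of 'target' expressed in the frame of 'observer' (e.g. a camera node)."""
    return np.linalg.inv(observer.transform_to_root()) @ target.transform_to_root()

# Hypothetical scene: own vehicle carrying a gaze platform with a tele camera,
# plus another vehicle 40 m ahead in the neighboring lane.
road = SceneNode("road")
ego = road.add_child(SceneNode("ego_vehicle"))
platform = ego.add_child(SceneNode("gaze_platform",
                                   hom_transform(rotation_rpy=(0.0, 0.0, 0.15),
                                                 translation=(1.2, 0.0, 1.4))))
tele_cam = platform.add_child(SceneNode("tele_camera"))
other = road.add_child(SceneNode("other_vehicle", hom_transform(translation=(40.0, -3.5, 0.0))))

# Relative pose used to predict where the other vehicle's features should appear in the tele image.
print(relative_pose(tele_cam, other))
```

In such a sketch, updating a single state estimate (one node's transform to its parent) automatically propagates to every relative pose derived from it, which is why a tree of transforms is a convenient bookkeeping structure for tracking multiple objects over time.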