
Coevolution of Active Vision & Feature Selection

Dario Floreano, Toshifumi Kato, Davide Marocco and Eric Sauser

Abstract

We show that complex visual tasks, such as position- and size-invariant shape recognition and navigation in the environment, can be tackled with simple architectures generated by a coevolutionary process of active vision and feature selection. Behavioral machines equipped with primitive vision systems and direct pathways between visual and motor neurons are evolved while they freely interact with their environments. We describe the application of this methodology in three sets of experiments: shape discrimination, car driving, and robot navigation. We show that these systems develop sensitivity to a number of oriented, retinotopic visual features (oriented edges, corners, height) and a behavioral repertoire to locate, bring, and keep these features in sensitive regions of the vision system, resembling strategies observed in simple insects.

Introduction



1. Shape Discrimination


Toshifumi Kato and Dario Floreano

We've implemented an autonomous agent called "Artificial Retina" which can move freely in a visual field and change its size and filtering strategy. The pixel values obtained by the Artificial Retina are used as inputs to the artificial neural network that controls the retina. The Artificial Retina moves around the visual field to find specific features that may be useful for pattern discrimination, for navigation tasks, or as any other behaviorally relevant information. We've successfully evolved an Artificial Retina composed of nine visual neurons connected to a single-layer perceptron to discriminate between a triangle and a square, and between a concave shape and a square, where the shapes can appear at any location and at any size in an image. Although such images are not linearly separable, the evolved active perceptron can successfully discriminate among the different figure types by exploiting image exploration.

System Overview

The figure below shows the Artificial Retina (green window frame) in a visual field (image). It consists of a fixed number of 9 square cells. Each cell supplies the system with one input value, computed from the pixel values inside the cell by one of the filtering methods. The 9-cell retina window therefore provides 9 inputs, which are all the visual information the system receives from the visual field. The idea is to let the artificial retina explore the image and find features that discriminate shapes.

Filtering Methods

Two simple filtering methods are available in this system. The sampling method uses the pixel at the top left corner of a cell. The averaging method uses the average value of all pixels in a cell.
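The two filtering methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the 9 cells form a 3x3 grid, that the image is a list of rows of grayscale values, that the sampling method reads the top-left pixel of each cell, and that the retina lies fully inside the image. The names `cell_value` and `retina_inputs` are hypothetical.

```python
def cell_value(image, x0, y0, cell_size, method):
    """Return one input value for the cell whose top-left pixel is (x0, y0)."""
    if method == "sampling":
        # Assumption: sampling reads the single top-left pixel of the cell.
        return image[y0][x0]
    if method == "averaging":
        # Mean of every pixel inside the square cell.
        total = sum(
            image[y][x]
            for y in range(y0, y0 + cell_size)
            for x in range(x0, x0 + cell_size)
        )
        return total / (cell_size * cell_size)
    raise ValueError(f"unknown filtering method: {method}")


def retina_inputs(image, left, top, cell_size, method):
    """Return the 9 visual inputs of a retina whose top-left corner is (left, top).

    Assumption: the 9 cells are laid out as a 3x3 grid, row by row.
    """
    return [
        cell_value(image, left + col * cell_size, top + row * cell_size,
                   cell_size, method)
        for row in range(3)
        for col in range(3)
    ]
```

Because both methods share one interface, a controller could switch between them at every step simply by changing the `method` argument.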

Variable Resolution

The Artificial Retina can change its resolution at run-time by increasing or decreasing the number of pixels that are fed into each visual cell. In the experiments in our paper, we allowed three receptive field sizes: 15, 30, and 60 pixels per side.

Active Motion

The Artificial Retina moves in an image by determining how far and in which direction to go. Its top left corner represents its position in x-y coordinates. The distance per move ranges from 0 to 50 pixels, and the direction from 0 to 359 degrees. The procedure from perceiving inputs to producing outputs is called a step. On each step, the artificial retina can move and select its size and filtering strategy. We gave the system 50 steps per image, so the artificial retina can move 50 times per image.
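One displacement of the retina can be sketched as follows. This is an illustration under stated assumptions, not the original code: the angle is taken to be measured from the positive x axis with y increasing downward (image coordinates), the new position is rounded to whole pixels, and the retina is clamped so that it stays inside the image. The function `move_retina` and its parameters are hypothetical.

```python
import math

MAX_DISTANCE = 50      # maximum displacement per step, in pixels (from the text)
STEPS_PER_IMAGE = 50   # number of steps allowed per image (from the text)


def move_retina(x, y, distance, angle_deg, width, height, retina_size):
    """Move the retina's top-left corner (x, y) and return the new position."""
    # Keep the requested distance inside the allowed 0..50 pixel range.
    distance = max(0.0, min(MAX_DISTANCE, distance))
    angle = math.radians(angle_deg % 360)
    new_x = x + distance * math.cos(angle)
    new_y = y + distance * math.sin(angle)
    # Assumption: the retina window is clamped to remain inside the image.
    new_x = max(0, min(width - retina_size, round(new_x)))
    new_y = max(0, min(height - retina_size, round(new_y)))
    return new_x, new_y
```

A full trial would call `move_retina` once per step, `STEPS_PER_IMAGE` times for each image, with the distance and angle taken from the network outputs.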

Neural Architecture

The architecture of the neural network controlling the artificial retina is as follows:


Figure: Architecture of Neural Network
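As a rough illustration of a single-layer perceptron of the size given under Genetic Encoding (17 input nodes fully connected to 6 output nodes), the forward pass might look like the sketch below. The logistic activation, the scaling of inputs to the range 0 to 1, and the name `perceptron_outputs` are assumptions; the text does not state what each input or output node represents.

```python
import math

N_INPUTS = 17   # input nodes, as stated under Genetic Encoding
N_OUTPUTS = 6   # output nodes, as stated under Genetic Encoding


def perceptron_outputs(weights, inputs):
    """Compute the 6 output activations of a single-layer perceptron.

    weights: 6 rows of 17 connection strengths, one row per output node.
    inputs:  17 input values, assumed to be scaled to the range 0..1.
    """
    if len(weights) != N_OUTPUTS or any(len(row) != N_INPUTS for row in weights):
        raise ValueError("weights must be 6 rows of 17 values")
    if len(inputs) != N_INPUTS:
        raise ValueError("inputs must contain 17 values")
    outputs = []
    for row in weights:
        net = sum(w * x for w, x in zip(row, inputs))
        # Assumption: logistic activation; the source does not name one.
        outputs.append(1.0 / (1.0 + math.exp(-net)))
    return outputs
```

With no hidden layer, each output depends on the inputs only through one weighted sum.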

Genetic Encoding

The 6 output nodes are connected to all 17 input nodes. The resulting 102 connection strengths are encoded in the genetic string using 5 bits per connection and are decoded in the range -4 to 3. The genetic algorithm uses rank-based selection in which the best 20% of individuals are preserved, together with elitism. The crossover probability per pair is 0.1 and the mutation probability per bit is 0.01.
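The encoding and the genetic operators can be sketched as below. The counts and probabilities come from the text (102 connections, 5 bits each, 20% selection, crossover 0.1, mutation 0.01). Everything else is an assumption made for illustration: the linear mapping of the 32 bit patterns onto the range -4 to 3, one-point crossover, copying the single best genome unchanged as elitism, and a population of at least 10 genomes. The function names are hypothetical.

```python
import random

N_INPUTS, N_OUTPUTS, BITS = 17, 6, 5
GENOME_LENGTH = N_INPUTS * N_OUTPUTS * BITS  # 102 connections * 5 bits = 510


def decode(genome):
    """Decode a list of 510 bits into 6 rows of 17 connection strengths."""
    if len(genome) != GENOME_LENGTH:
        raise ValueError("genome must contain 510 bits")
    weights = []
    for i in range(0, GENOME_LENGTH, BITS):
        level = int("".join(str(bit) for bit in genome[i:i + BITS]), 2)  # 0..31
        # Assumption: the 32 levels are spread linearly over [-4, 3].
        weights.append(-4.0 + 7.0 * level / 31)
    return [weights[o * N_INPUTS:(o + 1) * N_INPUTS] for o in range(N_OUTPUTS)]


def next_generation(population, fitnesses, rng):
    """Build a new population from the best 20% of the current one."""
    ranked = [g for _, g in sorted(zip(fitnesses, population),
                                   key=lambda pair: pair[0], reverse=True)]
    parents = ranked[:max(2, len(ranked) // 5)]   # best 20% reproduce
    children = [list(ranked[0])]                  # elitism: best kept unchanged
    while len(children) < len(population):
        a, b = (list(g) for g in rng.sample(parents, 2))
        if rng.random() < 0.1:                    # crossover probability per pair
            cut = rng.randrange(1, GENOME_LENGTH)
            a[cut:], b[cut:] = b[cut:], a[cut:]   # assumed one-point crossover
        for child in (a, b):
            for i in range(GENOME_LENGTH):
                if rng.random() < 0.01:           # mutation probability per bit
                    child[i] ^= 1
            if len(children) < len(population):
                children.append(child)
    return children
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which helps when comparing evolved strategies.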

Experimental Details and Videos

The following links describe experimental details of a number of experiments. You can also find videos of the evolved retinas while they scan the image.

An Evolved Retina

The figures below show a sequence of displacements of the retina over the image. For each position the retina must provide a guess about the type of figure; its frame is red when the answer is wrong and green when it is correct. The retina starts with a very low resolution covering almost the whole image and immediately shrinks to a small size while moving around the image. The top left corner of the retina is highlighted by a blue spot.

The most common evolved strategies work as follows. By default, the system responds that there is a triangle in the image. From the global view at the beginning of an image, the artificial retina roughly knows the location of a figure, although this does not work every time. The artificial retina then moves in search of a figure. When it finds one, it checks the right side to determine whether the figure is a square. If the artificial retina finds a vertical side, it stays still there and the system switches its response from "triangle" to "square". If the side is not vertical, as you can see in the figure on the right, the artificial retina tries several times to find a vertical side. After some steps, it simply gives up, moves off the triangle, and goes to look for another figure. We'd like to note that the sampling method was preferred in this experiment, because it let the system obtain a stronger stimulus from the images.

Conclusion

In summary, the artificially evolved active vision system we have developed is able to perform shape recognition with few computational resources. Artificial evolution is interesting here because the active vision system is free to select the features and strategies it uses to perform the task. It would also be interesting to compare the system's performance with that of the human eye. Current work is aimed at expanding the architecture and configurations of the artificial retina for more complex tasks, and at installing the system in mobile robots with active vision systems that could autonomously decide where to look and how to move in the environment.

Publications

A conference paper in which the active vision system is presented and compared to a conventional neural image processing system.