Perceptual Organisation

Introduction

The visual system must be able to reconstruct a meaningful percept from all of the lines, edges and other primitives extracted by low-level processing. It needs to know which parts of the image to group together into objects and how to separate those objects from their backgrounds. So how do we turn a jumble of constituent shapes into a picture?

A selection of multicoloured shapes on the left with the shapes making up an image of a girl on the right

How does the visual system know how to organise primitives into a visual perception of an object?

The Gestalt Approach

Before the 1920s, psychologists advocated structuralism, the idea that perception is built up from building blocks known as "sensations". This is similar to pointillist paintings (a technique that grew out of Impressionism), where little coloured dots are mentally grouped together to produce an object or scene.

 

Yellow and brown dots close together form the shape of a banana on the left and green and brown dots form the shape of an apple and stalk on the right

Structuralism - coloured dots form fruits

In the 1920s, Gestalt psychology put forward the idea that the whole is more than just the sum of its parts. There is plenty of evidence for the Gestalt approach; the main evidence is that our perception of one part of a stimulus relies on other parts of the stimulus. This cannot be explained by structuralism.

An example of this can be seen in the picture to the right, whereby the horse in part a) can be seen as either rearing up on its hind legs or jumping. As we have more information in part b), with a show jump appearing in the picture, our perception is that the horse is more likely to be jumping over the jump than just rearing on to its hind legs.

 

Image a), left, shows a horse rearing without context. Image b), right, shows the same silhouette of a horse but next to some horse jumps, indicating the horse is jumping.

Perception is more than just a sum of its parts - see how the added visual information in b) provides a different visual perception from that of a) alone

Overview of Gestalt Theory

The first overview point of Gestalt theory is that the whole is different to the sum of its parts. We use higher cortical processing and personal experience to interpret and collate the "building blocks" delivered by lower-level visual processing (i.e. knowing about the world influences our perception of it). Laws govern how we organise these building blocks into chunks of the whole visual percept. The top-down model of visual processing concentrates on how high-level analysis of information influences the way lower-level input is processed.

 

Each of the constituent theories is easiest to understand when explained individually with visual examples, so we shall look at them one at a time below.

Gestalt Theory 1: Whole is Different to Sum of Parts

This first theory states that perception is more than a simple analysis of the components observed and the relationships between them.

 

An example of this can be seen to the left. Why do the four individual squares on the left hand side of the diagram appear to form the corners of a large square, but the individual quadrilaterals to the right of the image appear as random, individual shapes?

 

Gestalt Theory 2: Ambiguous and Unambiguous Pictures

The retinal image is 2D, and it is down to the brain to determine what 2D or 3D interpretation to make of it. In some pictures (such as those in the diagram below) there are two perfectly plausible percepts, but it is impossible to perceive both at the same time, so our perception switches between the two (a change in perception without a change in the scene).

 

Image on the left shows 5 black triangles forming a shape; on the right, a white vase silhouetted on a pink background. This background could also be perceived as two purple silhouetted people about to kiss

Which way are the arrows pointing? Is that a vase or a kiss?

As the diagram above shows, the triangles on the left can appear to be pointing upwards or down and to the right. No matter how hard one tries, it is not possible to perceive both directions simultaneously. Furthermore, the picture to the right shows either a white vase on a purple background, or two purple silhouetted faces about to kiss. Again, both of these percepts are plausible but only one can be seen at any one time.

side a has a hexagon with all opposite vertices linked forming a shape that could be perceived in b) as a different view of a wireframe cube

Unambiguous pictures: whereby there is only one accepted perception, but many plausible perceptions exist

 

Unambiguous pictures are those for which more than one plausible interpretation exists, but we tend to perceive only one. The diagram to the left demonstrates this. In column a), the first percept is that of a hexagon that has been sliced into six sections. The top image of column b) is also a reasonable interpretation of this shape, but it is not the one that is easily perceived when looking at the shape in a).

 

The bottom left of the above diagram shows three overlapping circles. This is the most common percept; however, it is perfectly plausible that the picture is made up of a circle and two crescents. Due to our knowledge of the world, it is perceived as three overlapping circles, which supports the idea that the whole percept is more than the sum of its parts.

 

Gestalt Theory 3: Laws of Perceptual Organisation

The reason that our perception of the world is usually stable is that the organisation of image components follows a series of eleven laws: proximity, similarity, common fate, good continuation, closure, relative size, surroundedness, orientation, symmetry, familiarity/meaningfulness and the law of Pragnanz. These will now be covered in more detail.

Law of Proximity

The law of proximity states that objects that are close together are generally grouped together and thus assumed to be part of the same object. Due to the proximity of the dots or lines in the diagram to the right, they are either perceived as a block, columns or rows due to the grouping associated with their close proximity.

 

Left: 16 green circles arranged in 4 rows of 4, but look like they are making a square. Middle: 18 yellow dots, arranged in 3 columns of 6 (three distinct entities) and Right: 3 blue lines separated by a gap and followed by 5 blue lines, indicating 2 separate entities

The law of proximity groups the shapes as a block, columns or lines
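As a rough illustration only, proximity grouping can be sketched as linking any two elements that lie closer together than some threshold. The function name `group_by_proximity`, the `max_gap` threshold and the dot coordinates are all assumptions made for this sketch; they are not a model of how the visual system actually measures distance.

```python
from math import dist

def group_by_proximity(points, max_gap):
    """Link points that lie within max_gap of each other into one group."""
    groups = []  # each group is a list of points
    for p in points:
        near, far = [], []
        for g in groups:
            # A group is "near" if any of its members is close enough to p.
            if any(dist(p, q) <= max_gap for q in g):
                near.append(g)
            else:
                far.append(g)
        # p joins (and merges) every group it is close to.
        merged = [p] + [q for g in near for q in g]
        groups = far + [merged]
    return groups

# Hypothetical display: 3 columns of 6 dots, tightly spaced vertically (1 unit)
# and widely spaced horizontally (3 units).
dots = [(x * 3.0, y * 1.0) for x in range(3) for y in range(6)]
columns = group_by_proximity(dots, max_gap=1.5)
print(len(columns))  # 3
```

With the threshold set between the vertical and horizontal spacing, the dots fall into three columns; raising `max_gap` above the horizontal spacing merges everything into a single block.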

Law of Similarity

This law states that similar objects appear to be grouped together. Similarity extends to lightness, colour, size and other similar parameters. An example of this is again seen in the diagram above, where the block is separate from the columns and rows due to the constituent parts being made up of different sized pieces of different colours. This aids in keeping each part as a separate group.

Another example can be seen in the diagram to the right, where similar orientation can cause a grouping effect under the law of similarity. It is clear that the top left hand part of the image is different to the remaining three quarters of the figure. This is because the top left is composed of lines that are in a different orientation to those in the rest of the picture, so our brain groups them separately.

Top right of the image has multiple small vertical dashes, the rest of the image is comprised of small horizontal dashes, which clearly is seen as a difference in the image

Law of Common Fate
The law of common fate states that things that appear to be moving together are grouped together through a common speed and direction. This can be seen in the drawing below, where the two schools of fish can be separated as they form two distinct groups, each moving in a common direction and at a common speed.

two shoals of fish swimming in opposite direction. Two separate entities are distinguishable as all members of the same group are swimming in the same direction

The law of common fate can be a problem for those animals and plants that use the law of similarity as a form of camouflage: movement may prevent the organism from being grouped with the still background, and thus reveal it to the observer.

The law of common fate is also seen in ballet, where the ballet dancers all appear as one by their similar dress and body shapes, whilst moving together in a uniform motion.  This causes the observer to see them grouped together as one.
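A minimal sketch of grouping by common fate, assuming each element is described only by a 2D velocity vector, is to put elements in the same group when their velocities nearly match. The function `group_by_common_fate`, the `tolerance` value and the fish velocities are hypothetical and chosen purely for illustration.

```python
from math import hypot

def group_by_common_fate(velocities, tolerance):
    """Group element indices whose velocity is within tolerance of a group's first member."""
    groups = []  # list of (representative_velocity, [member indices])
    for i, (vx, vy) in enumerate(velocities):
        for rep, members in groups:
            if hypot(vx - rep[0], vy - rep[1]) <= tolerance:
                members.append(i)
                break
        else:
            # No existing group moves like this element, so start a new one.
            groups.append(((vx, vy), [i]))
    return [members for _, members in groups]

# Hypothetical velocities: three fish swimming right, two swimming left.
fish = [(1.0, 0.0), (1.1, 0.0), (-1.0, 0.1), (0.9, -0.1), (-1.1, 0.0)]
shoals = group_by_common_fate(fish, tolerance=0.3)
print(shoals)  # [[0, 1, 3], [2, 4]]
```

Note that position is ignored entirely here: the two shoals separate on speed and direction alone, which is the point of the law.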

 

Law of Good Continuation

Perceptual organisation tends to favour smooth continuation over abrupt changes; the law of good continuation is the spatial analogue of the law of common fate. This is demonstrated in the drawing to the right. The line configuration gives the appearance of two crossing lines rather than two arrowheads touching, as good continuation makes this the easiest percept.
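One way to picture the choice between "two crossing lines" and "two arrowheads touching" is to treat the figure as four arms leaving a junction and to ask which pairing of arms needs the least bending. The sketch below is an illustrative assumption, not an established model: arm directions are given in degrees, and `turning_angle` and `best_pairing` are hypothetical helper names.

```python
def turning_angle(a, b):
    """Deviation in degrees from a straight path when passing from arm a to arm b."""
    diff = abs(a - b) % 360
    diff = min(diff, 360 - diff)  # angle between the two arms, 0..180
    return 180 - diff             # 0 means a perfectly straight continuation

def best_pairing(arms):
    """Pair four arm directions so that the total turning angle is smallest."""
    a, b, c, d = arms
    pairings = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    return min(pairings, key=lambda p: sum(turning_angle(x, y) for x, y in p))

# An X-shaped junction: arms at 30, 150, 210 and 330 degrees.
arms = [30, 150, 210, 330]
print(best_pairing(arms))  # ((30, 210), (150, 330))
```

The winning pairing joins opposite arms into two straight lines with no turning at all, whereas either "arrowhead" pairing requires a sharp bend at the junction.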

 

Law of Closure

The visual perception system prefers closed objects over open objects and will force this upon us in situations where there is more than one geometrical organisation possible. This can be seen in the diagram to the left, where the top four squares can be perceived as forming a larger square, rather than two adjoining horizontal blocks. This is because a square is a closed shape instead of an open cross.

 

 

Law of Relative Size 

The law of relative size states that a smaller area is seen to be an object against a larger background.  This can be seen in the diagram to the right.

Left: A lime green circle with a blue diamond in the centre. Right: A blue circle with a green diamond in centre.

Essentially, the image on the left is seen to be a blue diamond on a green background and the image on the right seen as a green diamond on a blue background. There is a possibility that the circle could be an object with a hole cut in it against the different colour background, but the law of relative size causes us to choose the first percept as the one that makes the most sense.

 

The Law of Surroundedness

The law of surroundedness states that the surrounding area tends to be seen as the background and the part of the image being surrounded is the object.  This too can be seen in the diagram above. It appears that the diamond in the centre is the object as it is being surrounded by the circle.

 

The Law of Orientation

The law of orientation states that horizontally and vertically oriented regions tend to be perceived as the object rather than the background. In the diagram to the right, the white cross is considered the object and the black cross is considered the background, because the white cross is made up of the horizontal and vertical components of the image. It is also feasible that the image is in fact a black cross on a white background.

4 white sections on a black background, but feasible (although less likely) a black cross on a white background.

The Law of Symmetry

The law of symmetry states that areas of a visual scene that are symmetrical will be seen as foreground objects over an asymmetrical background.

 

The Law of Pragnanz (Simplicity)

The law of Pragnanz is essentially the law of simplicity.  It is defined as "of several geometrically possible organisations, that one will occur that possesses the best, simplest and most stable shape". An example of this can be seen below, where a given shape on the left is analysed for its most simple and stable shape. Which option is the most likely to be perceived?

 

Option a) has three components and many angles and thus appears an overly complex (although plausible) solution to the actual configuration of the shape.

 

Option b) is also plausible but the shapes are not regular.

 

Option c) states that the object is made of two regular shapes, and thus it is the most likely perceived configuration of those provided.

 

 

The Law of Familiarity / Meaningfulness

The law of familiarity states that things are likely to form groups if they appear meaningful or familiar (this law often falls under the umbrella of the law of Pragnanz). Essentially this law says that components are combined to give us the simplest, most sensible interpretation.  Many other laws are just manifestations of this law and the law of Pragnanz as all will provide the most meaningful or sensible interpretation.

 

Recent Approaches to Perceptual Organisation

There has been a range of recent studies that have set out to define the Gestalt laws more precisely. The overall outcomes of these studies are described over the following few paragraphs.

 

Grouping by Similarity

Grouping by similarity means that components of objects will be similar and components of the background will be similar, but the components of both the object and background need to be different to one another in order to separate them. Looking at the image below, the differing segment in a) is easily identified due to a clear difference in line slope. In b), the slope is much less different and as such the segment is harder to see as the whole image is almost grouped completely as one. This can be used as a basic principle for camouflage.

 

Similar object components to background components make the object harder to differentiate from the background

 

If the components of the background and the object are identical, but the configuration is different, the differentiation between object and background is harder still. This can be seen in the image below. The background is composed here of upright "v" shapes, each made up of two diagonal lines. The object segment is made up of "^" shapes, which have the same component lines but in a different configuration. This similar composition in a different configuration makes the differentiation between the two more difficult.

Multiple V shapes in lines and columns take the formation of square. The bottom right hand corner has the V shapes inverted but this is difficult to distinguish

If the object and background have identical components in a different configuration, then differentiation is very difficult

 

It can therefore be seen that it is easier to differentiate object from background when the components are different between the two.

This means that an upright T differs more from a sloping T than it does from an L. The upright T and the L share the same orientations (both are composed purely of horizontal and vertical components) and differ only in configuration. The upright T and the sloping T contain lines of different orientations, and so have different components, which is the difference that matters for grouping. This can be further seen in the image to the right.

Left: 16 T shapes slanting 45 degrees to the right. Centre: 16 T shapes at normal orientation. Right: 16 L shapes at normal orientation. A bigger difference is noted between the sloping Ts and the rest of the image, with Ls and Ts appearing almost identical

The sloping Ts differ more than the rest due to the different orientation components as opposed to the L and T

There is therefore a difference between the factors that influence grouping by similarity and those that influence conceptual similarity. This is because similarity grouping happens spontaneously at an early (pre-attentive) stage of processing, which precedes pattern and object recognition. It therefore involves simple factors like orientation, colour and shape.

23 angular curved S shapes are on their side and an angular 10 is inserted within the picture. The 10 is very difficult to identify among the angular S shapes

This image requires scrutinising (high-level processing) to identify the "10" from the similar "S" shapes

Scrutinising an image to detect an odd area involves a higher level of cortical processing and thus requires more time. An example of this is trying to find the number 10 amongst the letter Ss in the image to the left. There is one number 10 in the large block of angled Ss. As the components all share the same orientations and only the configuration differs, the target is hard to identify. The visual system therefore has to scrutinise the image, which takes longer than if the components differed and could be picked up in the quicker pre-attentive stage.

 

Textons

Bela Julesz tried to determine factors involved in the pre-attentive grouping and presented displays with texture separation. When a subject can rapidly detect the border between areas then they must be using pre-attentive processing. The local features used for pre-attentive grouping were called TEXTONS.

When a subject takes longer to identify a boundary, then they must be using higher level processing. As the image to the right shows, the orientation or slope of a component is a low level processing unit (which can be referred to as a texton) as the perceptual segregation between the two areas is immediate.

 

Background is multiple small dashes with a downwards orientation and the central pattern is lines of an upwards configuration. This change in direction is easily identified.

Central orientation opposite to that of the background, allowing immediate recognition. A low-level texton.

When the configuration changes, as in the image with the T and Ts, the segregation is not so obvious and as such scrutiny of the image is required, drawing upon high-level processing.  As this is the case, configuration is not a texton.
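To make the idea of orientation as a texton concrete, the sketch below looks for a texture border by comparing the orientation of each line element with its immediate neighbours. This is only a toy illustration under stated assumptions: the display is reduced to a grid of orientations in degrees, and the function names and the 30 degree `threshold` are hypothetical rather than taken from Julesz's work.

```python
def orientation_difference(a, b):
    """Smallest angle in degrees between two line orientations (orientation wraps at 180)."""
    d = abs(a - b) % 180
    return min(d, 180 - d)

def texture_boundaries(grid, threshold):
    """Return pairs of adjacent cells whose orientations differ by more than threshold."""
    edges = []
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            # Compare each cell with its right-hand and lower neighbour.
            if c + 1 < cols and orientation_difference(grid[r][c], grid[r][c + 1]) > threshold:
                edges.append(((r, c), (r, c + 1)))
            if r + 1 < rows and orientation_difference(grid[r][c], grid[r + 1][c]) > threshold:
                edges.append(((r, c), (r + 1, c)))
    return edges

def make_display(background, target):
    """4x4 grid of background orientations with a 2x2 central patch of target orientation."""
    grid = [[background] * 4 for _ in range(4)]
    for r in (1, 2):
        for c in (1, 2):
            grid[r][c] = target
    return grid

clear = texture_boundaries(make_display(45, 135), threshold=30)
subtle = texture_boundaries(make_display(45, 55), threshold=30)
print(len(clear), len(subtle))  # 8 0
```

When the central patch is oriented very differently from the background, every edge of the patch is flagged by a purely local comparison; when the slopes are similar, nothing is flagged and the patch would have to be found by slower scrutiny.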

 

Pre-attentive textons have been identified as:

· Slope and orientation changes

· Average brightness changes (i.e. dark to light)

· Average wavelength changes (changes in colour)

· Granularity changes (dots to blobs for instance would be an immediate pre-attentive cue)

 

Essentially this is how camouflage works. The local features of the camouflaged object share granularity, colour and luminance with the background. This prevents pre-attentive segregation and makes the camouflaged object or animal harder to see. By scrutinising the image and so engaging higher-level processing, you can generally spot the object that is trying to stay camouflaged.

 

Properties extracted by the pre-attentive processing are those we know to be separated early on in the visual pathway. These properties/textons correspond to the primitives in the primal sketch and in Treisman's feature-integration theory.

 

 

Grouping By Simplicity

It is very interesting to see how the human visual system decides what the simplest interpretation of an image is, and how it determines whether the image shows a simple 3D shape or a complex 2D shape. The following three points are thought to have some influence upon this.

 

1. The number of angles present.  As 3D objects tend to have more angles due to viewpoint, the more angles that can be seen, the more likely the interpretation is that of a 3D object.

 

2. Number of different sized angles. As 3D objects, depending on the viewing angle, show multiple angles of varying size, a greater number of different sized angles increases the likelihood that a 3D object is perceived.

 

3. The number of lines. More angles come with more lines, so a greater number of lines also suggests that an image is 3D rather than 2D.
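The three counts above can be sketched for a simple line drawing described as a list of straight segments between vertices. This is a hedged illustration only: the function `drawing_complexity`, the rule of counting only angles under 180 degrees between neighbouring lines at a vertex, and the two example drawings are assumptions made for the sketch, not a published measure.

```python
from collections import defaultdict
from math import atan2, degrees

def drawing_complexity(segments):
    """Count lines, angles and different sized angles in a 2D line drawing."""
    neighbours = defaultdict(list)
    for a, b in segments:
        neighbours[a].append(b)
        neighbours[b].append(a)
    angles = []
    for v, others in neighbours.items():
        if len(others) < 2:
            continue  # a free line end encloses no angle
        # Directions of the lines leaving this vertex, sorted around the circle.
        dirs = sorted(degrees(atan2(y - v[1], x - v[0])) % 360 for x, y in others)
        for i in range(len(dirs)):
            gap = round((dirs[(i + 1) % len(dirs)] - dirs[i]) % 360)
            if gap < 180:  # assumption: ignore straight or reflex gaps
                angles.append(gap)
    return {"lines": len(segments), "angles": len(angles),
            "different_angles": len(set(angles))}

# A flat square outline.
square = [((0, 0), (2, 0)), ((2, 0), (2, 2)), ((2, 2), (0, 2)), ((0, 2), (0, 0))]

# The visible edges of a cube drawn obliquely (front, top and right faces).
cube = square + [((0, 2), (1, 3)), ((2, 2), (3, 3)), ((1, 3), (3, 3)),
                 ((2, 0), (3, 1)), ((3, 1), (3, 3))]

print(drawing_complexity(square))  # {'lines': 4, 'angles': 4, 'different_angles': 1}
print(drawing_complexity(cube))    # {'lines': 9, 'angles': 12, 'different_angles': 3}
```

On all three counts the oblique cube drawing scores higher than the square, which is in line with the suggestion that such drawings are more readily interpreted as 3D.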

 

 

Why Gestalt Laws Must Work

It is thought that high-level processes constrain the lower-level processes (i.e. top-down processing) based upon our experience of the real world: images are interpreted using assumptions based on our knowledge of real objects, and this ensures that the law of Pragnanz holds.

Examples of the assumptions include:

 

· Similar surfaces reflect or absorb light in a similar way, so we interpret changes in brightness/colour as boundaries between different regions (law of similarity).

· Matter is cohesive (made of solid substance) and therefore adjacent regions are likely to belong to the same object and to move together (law of proximity and law of common fate).

· Natural contours tend to be smooth and not angled (law of good continuation).

 

 

THIS CONCLUDES THE UNIT ON PERCEPTUAL ORGANISATION
