Regarding Perception, Photography, and Painting…PART I


Not long ago I was asked to join an online discussion group that was formed around the book Traditional Oil Painting: Advanced Techniques and Concepts from the Renaissance to the Present by artist and author Virgil Elliott. Upon joining, I purchased the book that inspired the group so that my contributions to the forum’s discussions could be framed in the context of the group’s focus.

I did not get too far into the book before I began to encounter a number of claims regarding the nature of visual perception that simply did not accord with a modern understanding of the subject. For example, the author makes a series of rather bold claims on page 19 of his text regarding the role of perception in the context of visual art. He writes, “The artist must develop the ability to read all things with total objectivity in order to see the truth. This ability is what distinguishes the true artist from everyone else. The perceptions of others are influenced by irrelevancies. Those of the true artists are not. As artists, we must see things as they really are.” The author doubles down on these claims later on the same page, writing, “Total accuracy and objectivity of perception are achieved only after considerable study,” and “Consequences notwithstanding, the ability to see with total objectivity is essential if one is to create great art.”

Tucking away these comments for the time being, I continued reading–hoping the ideas would not resurface in any significant way. But lo, it was only a few dozen pages later, in a section of the book titled “The Photographic Image Versus Visual Reality,” that the claims were revisited in the context of the good ol’ “photography-as-a-valid-tool-for-contemporary-painting” argument. This section of the book compared the human eye with a camera in a myriad of ways so as to lead the reader to conclude that photography is a problematic, distorted source of visual information (as compared with the cultivable objectivity to be found with human perception). It was here that I realized that the author’s argument against photography as a viable tool for observational representationalism hinged on the reality of objective perception. (Keep this in mind.)

With these claims regarding the nature of perception and the usefulness of photography in the context of observational representationalism put forward so unapologetically in the book, it should come as no surprise that these ideas could be found peppered throughout the related online discussion group. Recently, I decided to respond to these ideas in a post of my own. My intention was to address these assertions about perception, photography, and observational representationalism while also addressing related claims involving the idea of “slavish copying.” I wrote:

I have been reading through a thread here about the contemporary representational painter’s use of photography on this forum and I must say that I’m shocked to see so many fallacies and misrepresentations allowed to propagate unchecked. I paint, colloquially “from” life, from photographs, as well as from my imagination. Do you know what the actual case is regarding each scenario? It is that I am painting “from” memory. Each scenario involves the informing and sculpting of behaviors with a vast array of memory resources that are cultivated by visual experience. Yes, the type or mode of reference source can indeed impact behaviors very differently over a wide array of aspects—and those differences can indeed be illuminated thoughtfully—but the differences often put forward today are often fallacy-ridden arguments fueled by irrational dogmatic beliefs.

The argument that the products of photography are somehow a “lesser” form of visual reference source DUE TO the fact that a photograph does not represent “reality” is indeed fallacious as your actual vision system does not provide an “accurate” representation of what we would describe as reality. We’ve known this since the early 1700s. Can an argument be made that even the best possible photograph (surrogate) may provide far less information than the corresponding live percept? Yes—without a doubt. Does either represent actual “reality?” No.

Furthermore, while a photograph and a live percept may produce very different perceptual experiences—“slavish” copying is technically impossible in both scenarios as visual perception is not, in any way, a strictly linear bottom-up translation of physical measurement. The idea could not be any more erroneous.

For those that have followed my writings in the past, the content of the above post is nothing new. In fact, I had visited this topic in 2015 with a paper titled “The ‘Pitfalls’ of reading about Photography ‘Pitfalls’” after reading a similarly titled article, “How to Avoid the Pitfalls of Painting from Photographs” by Courtney Jordan. I approached the subject of photography again in the Summer of 2017 with an article titled “Color, the Pitfalls of Intuition, and the Magic of a Potato.” The 2015 article directly addressed many of the core arguments in this ongoing “photography-as-a-valid-tool-for-contemporary-painting” debate. The 2017 article focused more on the nature of perception and the basics of color photography. However, what both articles lack is a more robust examination of the nature of visual perception. That is what I hope to provide here so that some of the arguments swirling around these topics may become easier to navigate.

First things first though–I am not going to sugar-coat it–visual perception is not an intuitive subject by any stretch of the term. In fact, arguably, one of the reasons that the system works so well is that we operate intuitively believing that it is doing something that it is not. Specifically, we intuitively think of visual perception as providing an objective window to the world. Unfortunately, as George Berkeley pointed out several centuries ago, the sources underlying visual stimuli are unknowable in any direct sense. What this means is that the percepts generated by a biological vision system are not accurate, objective recordings of the environment. The reality is that the chasm between the physical world and our perceptions of it is significant, and as such, we need to acknowledge that what we “see” is a construct of evolved biology, not an accurate, objective measurement of an external reality. The visual system should not be confused with devices that can garner reasonably accurate measurements of the physical world (e.g., calipers, a light meter, or a spectrophotometer). Rather, the visual system responds to stimuli based on experience-cultivated neural networks in an effort to yield successful behavior. It is not an external reality that composes the image we “see”–rather, it is the biology of the observer.

So let’s explore some of our modern picture of this complex, and still very mysterious, biological system and see if we might better evaluate the aforementioned author’s claims of objective perception. Visual perception can be defined as the ability to interpret the surrounding environment by processing information that is contained in visible light. It is a fascinating process that takes up about 30% of our brain’s cortex. While there is much that we have come to understand about this remarkable ability, there is a great deal of mystery still to be solved. Unfortunately, as stated earlier, very little about our vision system is intuitive. In fact, it is not uncommon to find people that intuitively think of the eyes as tiny cameras that record accurate, clear pictures that are eventually sent on to some inner region of the brain where a small “inner self” or homunculus (a Latin term meaning “little man”) waits to review the image. What’s more, even though some can push past the “homunculus fallacy,” they still manage to get mired, as we have seen here, in the idea that vision is (at least in some part) veridical (objectively truthful, corresponding to objective measurement or fact). While evidence refuting this idea continues to mount, that does not stop many from arguing for the veridical nature of vision. One argument that I encountered recently regarding this misconception came from a colleague who insisted that vision very likely becomes more accurate/veridical as percept properties reduce in complexity. Unfortunately, this argument is akin to one which claims that the accuracy of one’s “clairvoyance” in pick-a-card prediction tasks increases when the number of cards in play is reduced. By this reasoning, if we limit the pile to one card, one would hit 100% accuracy.

Today, a growing body of interdisciplinary research adds weight to the idea that our visual percepts are generated according to the empirical significance of light stimuli (empirical information derived from past experience), rather than the characteristics of the stimuli as such. In other words, the vision system did not evolve for veridicality–but rather for evolutionary fitness. To this point, neuroscientist Dale Purves states, “…vision works by having patterns of light on the retina trigger reflex patterns of neural activity that have been shaped entirely by the past consequences of visually guided behavior over evolutionary and individual lifetime. Using the only information available on the retina (i.e., frequencies of occurrence of visual stimuli, light intensities), this strategy gives rise to percepts which incorporate experience from trial and error behaviors in the past. Percepts generated on this basis thus correspond only coincidentally with the measured properties of the stimulus for the underlying objects.”

Support for this idea comes not only from fields like perceptual neuroscience and modern vision science but from several other realms of inquiry. Donald D. Hoffman, a professor of cognitive science at the University of California, Irvine, has spent the past three decades studying perception, artificial intelligence, evolutionary game theory, and the brain. His findings lend significant support to the idea that visual perception is indeed non-veridical. In a 2016 article with Quanta Magazine, Dr. Hoffman states, “The classic argument is that those of our ancestors who saw more accurately had a competitive advantage over those who saw less accurately and thus were more likely to pass on their genes that coded for those more accurate perceptions, so after thousands of generations we can be quite confident that we’re the offspring of those who saw accurately, and so we see accurately. That sounds very plausible. But I think it is utterly false. It misunderstands the fundamental fact about evolution, which is that it’s about fitness functions – mathematical functions that describe how well a given strategy achieves the goals of survival and reproduction. The mathematical physicist Chetan Prakash proved a theorem that I devised that says: According to evolution by natural selection, an organism that sees reality as it is will never be more fit than an organism of equal complexity that sees none of reality but is just tuned to fitness. Never.” Furthermore, with regard to the hundreds of thousands of computer simulations run by Dr. Hoffman and his research team, he states, “Some of the [virtual] organisms see all of the reality, others see just part of the reality, and some see none of the reality, only fitness. Who wins? Well, I hate to break it to you, but perception of reality goes extinct. In almost every simulation, organisms that see none of reality but are just tuned to fitness drive to extinction all the organisms that perceive reality as it is. So the bottom line is, evolution does not favor veridical, or accurate perceptions. Those perceptions of reality go extinct.”
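For readers who find the fitness-versus-truth idea hard to picture, here is a toy sketch of the general intuition. To be clear, this is not Dr. Hoffman’s simulation or the theorem mentioned above; the bell-shaped payoff curve, its parameters, and the two strategy functions are all assumptions I made up for illustration. The only point is that when payoff is not a simple “more is better” function of a physical quantity, a chooser that perceives the quantity accurately can do worse than one that perceives only the payoff.

```python
import math
from itertools import product

def payoff(quantity, optimum=50.0, width=15.0):
    """Hypothetical fitness payoff: too little or too much of a resource is bad."""
    return math.exp(-((quantity - optimum) ** 2) / (2 * width ** 2))

def choose_truth(a, b):
    """'Truth' strategy: perceives the real quantities and takes the larger one."""
    return a if a >= b else b

def choose_fitness(a, b):
    """'Fitness' strategy: perceives only payoffs and takes the better one."""
    return a if payoff(a) >= payoff(b) else b

def average_payoff(strategy, quantities):
    """Mean payoff earned over every possible pair of resource quantities."""
    pairs = list(product(quantities, repeat=2))
    return sum(payoff(strategy(a, b)) for a, b in pairs) / len(pairs)

quantities = range(0, 101)
truth_score = average_payoff(choose_truth, quantities)
fitness_score = average_payoff(choose_fitness, quantities)
print(f"truth-tuned chooser:   {truth_score:.3f}")
print(f"fitness-tuned chooser: {fitness_score:.3f}")
```

In this toy setup the fitness-tuned chooser can never score lower, because it picks the better payoff in every pair by construction. That is a much weaker statement than the quoted results, but it shows why “seeing accurately” and “choosing well” are not the same thing.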

But the counterintuitiveness of the vision system is not limited to the big picture (pardon the pun). Even at the earliest steps of the vision process, we find one counterintuitive factor or scenario after another. For example, did you know that when light energy interacts with our specialized light-sensitive receptors, they respond with less activity instead of more? (In other words, our light-sensitive cells, the photoreceptors, are far more active in the absence of light!) Or how about the fact that all of the biological machinery that carries out processing downstream from those receptors is found upstream (getting in the way of the light)? And while many of you might be aware that the incoming light patterns are inverted and reversed on the retina, did you know that those patterns must first pass through a dense web of blood vessels that you have perceptually adapted to? It’s true. So let’s now move to look at some of the observable machinery and processes that facilitate our perceptions so that we may better appreciate, dissect, or navigate this topic.

Let’s begin with light. As I put forth earlier, visual perception can be defined as the ability to interpret the surrounding environment by processing information that is contained in visible light. This term, visible light, describes a portion of the electromagnetic spectrum that is visible to the human eye. The electromagnetic radiation in this range of wavelengths is called visible light or more simply, light. A typical human eye will respond to wavelengths ranging from about 390 to 700 nanometers.

A diagram of the electromagnetic spectrum, showing various properties across the range of frequencies and wavelengths
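Since diagrams of the spectrum usually show both wavelength and frequency, it may help to see how the two relate. The sketch below simply applies the relation frequency = speed of light / wavelength to the two ends of the visible range quoted above. The function name is my own, and the wavelengths are treated as vacuum wavelengths.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact by SI definition

def wavelength_nm_to_thz(wavelength_nm):
    """Convert a vacuum wavelength in nanometers to a frequency in terahertz."""
    wavelength_m = wavelength_nm * 1e-9
    frequency_hz = SPEED_OF_LIGHT_M_PER_S / wavelength_m
    return frequency_hz / 1e12

for nm in (390, 700):
    print(f"{nm} nm corresponds to about {wavelength_nm_to_thz(nm):.0f} THz")
```

Shorter wavelengths mean higher frequencies, which is why the violet end of the range sits at the high-frequency side of such a diagram.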

Visible light rays enter the eye through a small aperture called the pupil. A dome-shaped transparent structure covering the pupil (called the cornea) assists in this entrance and, with the help of a biconvex lens just behind the pupil, guides the light rays into a focused light pattern on a region at the back of the inner eyeball called the retina. This tissue is the neural component of the eye, and it contains specialized light-sensitive cells called photoreceptors. It is these photoreceptors that convert the environmental energy (light) into electrical signals that our brains can use. This process is called phototransduction.

Now upon reading that last paragraph, some might be quick to state that thus far, the eye sort of sounds like a camera. I mean, the basic idea behind photography is to record a projected light pattern with an electronic sensor or light-sensitive plate that can be used to generate a percept surrogate. Generally speaking, in the case of the digital camera sensors (something often compared with the retina), each pixel in the sensor’s array absorbs photons and generates electrons. These electrons are stored as an electrical charge proportional to the light intensity at a location on the sensor called a potential well. The electric charge is then converted to an analog voltage that is then amplified and digitized (turned into numbers.) The composite pattern of data from this process (stored as binary) represents the pattern of light that the sensor was exposed to and may be used to create an image of the light pattern recorded during the exposure event. Additionally, a Bayer mask or Bayer filter (a color filter array) is placed over the sensor so as to collect wavelength information at each pixel in addition to information regarding light intensity.
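The sensor pipeline just described can be sketched in a few lines. This is a deliberately simplified model, not the behavior of any real camera: the well capacity, the conversion efficiency, the 8-bit output, the RGGB filter layout, and all function names are assumptions chosen for illustration, and noise and amplification are ignored.

```python
FULL_WELL_ELECTRONS = 10_000  # hypothetical capacity of one potential well
QUANTUM_EFFICIENCY = 0.5      # hypothetical fraction of photons that yield electrons
ADC_BITS = 8                  # hypothetical bit depth of the digitized output

def bayer_color(row, col):
    """Return which color filter covers a pixel in an assumed RGGB Bayer layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

def digitize(photon_count):
    """Photons -> stored charge -> normalized analog level -> integer value."""
    electrons = min(photon_count * QUANTUM_EFFICIENCY, FULL_WELL_ELECTRONS)
    analog_level = electrons / FULL_WELL_ELECTRONS  # 0.0 (empty) to 1.0 (full well)
    return round(analog_level * (2 ** ADC_BITS - 1))

def expose(scene):
    """scene[row][col] maps each band ("R", "G", "B") to a photon count.

    Each pixel records only the band that its own filter lets through.
    """
    return [
        [digitize(pixel[bayer_color(r, c)]) for c, pixel in enumerate(row)]
        for r, row in enumerate(scene)
    ]

# A uniform, reddish patch of light falling on a 4 x 4 sensor.
scene = [[{"R": 16_000, "G": 8_000, "B": 4_000} for _ in range(4)] for _ in range(4)]
for row in expose(scene):
    print(row)
```

Notice that even for a perfectly uniform patch of light, the raw output is a checkerboard of different numbers, because every pixel samples a single band. A full-color image only appears after software interpolates the missing bands at each pixel.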

So is this how the eye or vision works? Do we simply use objective recordings of wavelength and intensity responses at each photoreceptor to produce a clear percept?

Hardly. There are indeed some similarities between the eye and a camera in terms of optics and photosensitive materials–but that is where any and all significant similarities end. Let’s take a look at some of the differences:

First, it is important to understand that our photoreceptor landscape is not anything like the uniform array of pixels found with a digital image sensor. The 32mm retina (ora to ora) has a very uneven distribution of photoreceptors. Two main types, known as rods and cones, differ significantly in number, morphology, and function as well as their manner of synaptic connection. Rods are far more numerous than cones (about 120 million rods to 8 million cones), are far more light-sensitive, provide very low spatial resolution, and hold only one photopigment. Conversely, cones are far fewer in number, less sensitive to light, provide very high spatial resolution, and come in three types with each type carrying a photopigment that is differentially sensitive to specific wavelengths of light (thus facilitating what we understand as color vision.)

Cones are present at a low density throughout the retina with a sharp peak within a 1.5mm central region known as the fovea. Rods, however, have a high-density distribution throughout the retina but decline sharply in the fovea, being completely absent at its absolute center (a 0.35mm central region called the foveola). To better appreciate the size of the high-acuity window within the visual field that results from this photoreceptor distribution, you need only look at your thumbnail at arm’s length.
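To put a rough number on the thumbnail comparison, we can compute the visual angle that an object subtends at the eye. The formula is standard geometry; the thumbnail width and arm length below are only my assumed, ballpark values, so treat the result as an order-of-magnitude estimate.

```python
import math

def visual_angle_deg(size, distance):
    """Angle (in degrees) subtended by an object of a given size at a given distance.

    Both arguments must use the same unit.
    """
    return math.degrees(2 * math.atan(size / (2 * distance)))

THUMBNAIL_WIDTH_CM = 1.8  # assumption: approximate adult thumbnail width
ARM_LENGTH_CM = 60.0      # assumption: approximate eye-to-thumb distance

angle = visual_angle_deg(THUMBNAIL_WIDTH_CM, ARM_LENGTH_CM)
print(f"Thumbnail at arm's length spans about {angle:.1f} degrees")
```

With these assumed measurements the thumbnail covers under two degrees of a visual field that spans well over a hundred degrees horizontally, which gives a sense of just how small the high-acuity window is.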

Here we can see some of the uniformity and distribution differences between the photosensitive cells of the retina and the photosensitive pixels in a digital camera sensor. (A) Distribution of cone photoreceptors in the fovea (left) and the cone/rod distribution in the periphery (right). (B) High-resolution image of the foveal cone mosaic obtained with the Rochester AOSLO (adaptive optics scanning laser ophthalmoscopy). (C) Peripheral photoreceptor mosaic showing both rods and cones, imaged at 10° temporal and 1° inferior (obtained with the Rochester AOSLO). Scale bars are 20 microns. (D) Micrograph of a CMOS sensor at 2 microns. (E) Micrograph of a digital sensor with Bayer mask/filter. (F) Closer look at a Bayer mask/filter.

Another intuitive misconception worth mentioning here, in relation to acuity, is the idea that things get more and more blurry as we move outward from the fovea to the periphery. The truth is, this lower acuity does not yield image blur, but rather a spatial imprecision. Neuroscientist Margaret Livingstone offers a great demonstration on this point in her book Vision and Art: The Biology of Seeing. Here is a recreation of that demonstration:

As we stare at the top black dot between the letter strings, we find difficulty in identifying the individual letters. The same holds for the bottom example; however, in staring at the bottom dot we can perceive the blur even though the letters continue to evade identification. This is because our periphery is not “out of focus” or blurred, but is rather spatially imprecise.

Second, we need to look at what is happening with the output of our photoreceptors. It is very important to understand here that our rods and cones do not register some objective measurements of light intensity and wavelength as seen with the image sensor. Rather, responses from these specialized receptors trigger a cascade of highly dynamic, complex processing through multiple cell layers and a myriad of receptive fields. The resulting signals from this retinal activity are then ushered off to our next stop on the visual pathway–the thalamus. But before we head over to this well known “relay station” in the brain, I would like to present another issue at the level of the retina worth consideration when comparing a camera with the eye–the quality of that initial light pattern projection.

Do you remember the last time you took a picture with your camera when you had some debris or smudge on the lens? Did it ruin the shot you were trying to take? How about the last time you had to deal with a piece of tape keeping light from entering a part of the lens? Or the last time you pulled up a pile of plant roots to suspend in front of your camera before taking that nice portrait shot?

Do those last two questions seem a bit ridiculous? Well, those seemingly ridiculous factors represent some real issues that our visual system has to deal with early on in the perception process. As I mentioned when first describing the counterintuitiveness of the visual system, the human eye has evolved with all of the biological “machinery” used to process the output of the photoreceptor in FRONT of the photoreceptors themselves. That’s right–all of the incoming light has to pass through all of the cell layers that will process the output of the photoreceptors. Now while these cell layers themselves are not too much of a problem (as they are relatively transparent), a problem indeed arises when the signals from all of that machinery need to exit the eye.

After the signals that arise from the photoreceptors make their way through all of the other cell layers, they will eventually reach the ganglion cells, whose axons (long slender projections of nerve cells, or neurons, that usually conduct electrical impulses away from the neuron’s cell body or soma) need to leave the eye in the form of a nerve bundle that we call the optic nerve. The region where this nerve exits the eye is called the optic disk and it indeed creates a deficit in our receptor array. This exit, or “blind spot,” is about 1.86 × 1.75 mm. Oddly enough, this deficit in our visual field measures slightly LARGER than our rod-free region of highest visual acuity.

Here we see an image of the back of the eye (which is known as a fundus photograph). The darker region in the center is an area known as the macula (about 5.5 mm) which contains the fovea. To the left of this darker region we can see a lighter region, which is the optic disk. The disk has some pigmentation at the perimeter of the lateral side, which is considered non-pathological. Veins are darker and slightly wider than corresponding arteries. IMAGE CREDIT: Häggström, Mikael, “Medical gallery of Mikael Häggström 2014”. WikiJournal of Medicine 1.

If you like, you can even “experience” your blind spot with a simple exercise:

Cover your left eye and look at the dot on the left in this image. You will notice the cross in your periphery, but don’t look at it – just keep your eye on the dot. Move your face closer to the page or monitor (depending on how you are reading this), and farther away. At some point, you should notice the cross disappear. Stay at that point, cover your right eye, and look at the cross. You should see that the dot has disappeared.

Another consideration worth mentioning here is in regards to something that you might have already noticed with the fundus photograph above–a rich web of blood vessels that populates the eye. As you might suspect, these little buggers are also in the way of the incoming light that is heading toward the retina. Now due to what we call sensory or neural adaptation (when a sensory stimulus is unchanging, we tend to stop “processing” it—like the way you don’t feel the clothes on your body after a bit), we don’t normally perceive these vessels, but by influencing the incoming light we can force this network to reveal itself. To do this you’ll need an index card, a pencil or something to poke a small hole in the index card, and a bright surface.

Close one eye, and closely look through the hole at a plain (homogeneous) brightly illuminated surface. (The card should be right up to your eye.) Carefully jitter the card horizontally or vertically and, almost immediately, you will begin to see a grayish web of blood vessels appear. The hole in the index card changes the way that light is entering the eye and thus begins to change the way the blood vessels cast shadows onto the retina. This change allows us to perceive them for a short time.

So, if we consider that initial light pattern suggestion that we often think is a clear window on the world, we need to consider that this “image” on the retina is inverted, left-right reversed, passing through ill-placed machinery that ultimately results in a relatively large blind spot, facing occlusion by a significant web of blood vessels, and all of this falling onto an unevenly distributed array of varied photoreceptors. So what might that look like?

On the left we see a surrogate representing a live percept, while the right simulates how an aggregate of inversion, left-right reversal, acuity reduction due to photoreceptor distribution, blood vessel interference, and the presence of a blind spot may appear.
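For the curious, the purely geometric parts of such a simulation are easy to sketch. The snippet below is not how the image above was produced; it is a hypothetical illustration that treats an image as a grid of values, applies the inversion and left-right reversal, and then blanks out a rectangular “blind spot.” The function name and the rectangle format are my own, and the acuity falloff and blood vessel shadows are left out entirely.

```python
def simulate_retinal_projection(image, blind_spot):
    """Invert and mirror a grid of pixel values, then blank out a blind spot.

    image: list of rows, each a list of pixel values.
    blind_spot: (top, left, height, width) in the flipped image's coordinates.
    Returns a new grid; the input grid is left unchanged.
    """
    # Reversing the row order inverts the image; reversing each row mirrors it.
    projected = [row[::-1] for row in image[::-1]]
    top, left, height, width = blind_spot
    for r in range(top, min(top + height, len(projected))):
        for c in range(left, min(left + width, len(projected[r]))):
            projected[r][c] = None  # no photoreceptors here, so no signal at all
    return projected

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
for row in simulate_retinal_projection(image, blind_spot=(1, 0, 1, 2)):
    print(row)
```

I use None rather than a dark value for the blind spot on purpose: the region is not “black,” it is simply missing data.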

Now many might be quick to intuitively think that this just can’t be true. The image I “see” is so rich and detailed, how could all of this “stuff” be in the way? Again, it is because visual perception does not involve any sort of clear window on the world. Take a moment to consider macular degeneration. This unfortunate, incurable condition is the leading cause of vision loss, affecting more than 10 million Americans–more than cataracts and glaucoma combined. It is the deterioration of the central portion of the retina, known as the macula (which contains the fovea and foveola). It just so happens that because of the way our brain uses sensory data, the deterioration may simply go unnoticed, especially in cases with spotty macular cell damage or dysfunction, thus leading many to their ophthalmologist only when the disease is fairly advanced. Our brain takes what sensory data it encounters and reflexively responds in a manner that we have evolved to find useful. I can’t stress enough how much visual perception is NOT like a camera taking snapshots via light intensity/wavelength measurements… nothing like it.

Oh, and I almost forgot! There really isn’t any “color” in this initial projection as color is not a property of the environment. Contrary to what some may believe, we do not sense color. While that may also sound counterintuitive–it is indeed true. Color is the visual experience that arises from our biology interacting with the spectral composition of the light (electromagnetic radiation) that is emitted, transmitted, or reflected by the environment. Our various photoreceptors DO respond differently to different wavelengths of light, which gives us an ability to discriminate between wavelengths and results in the experience of color vision, but color is just not a component of the early projection. So, perhaps a more “realistic” representation of what is falling on the retina might be this: