Cats and Dogs
Written: March 2023
In an article published on March 21, 2023, Tom Friedman quoted Craig Mundie, the former chief research and strategy officer for Microsoft, as saying that artificial intelligence (AI) “is going to change everything about how we do everything. I think that it represents mankind’s greatest invention to date.” That is a bold statement, but I believe it 100%.
Why is AI growing so rapidly? I believe there are several reasons. Computer servers are getting extraordinarily more powerful. The Internet is getting faster and more reliable. Computer scientists are getting smarter, and the programming tools they use are becoming more sophisticated and robust. Venture capital money is pouring into AI startups, based on a conviction that AI will become more than a trillion-dollar industry with the potential for outsized profits.
AI has many dimensions and components, and trying to understand it can be overwhelming. In this article, I will explain some basics of one component of AI called machine learning (ML). I will explain it using cats and dogs.
Imagine you are sitting on a bench in the park reading a book on your Kindle. You notice something moving out of the corner of your eye. You look up and a cat runs by. A bit later you look up again, and a dog runs by. How did you know the cat was a cat and the dog was a dog? It seemed obvious: that is simply what they were. You did not have to think about it or do any conscious analysis. In fact, you did quite a bit of analysis, but it happened so fast you did not notice it.
The image of the cat or dog travels over a visual pathway, a series of cells and synapses that carry visual information from your environment to the brain. Your vision begins with light passing through the cornea and the lens, which combine to produce a clear image of the cat or dog on a sheet of photoreceptors called the retina. The axons, nerve fibers that conduct electrical impulses, exit the retina via your optic nerve on the way to your brain for processing. If you place your hand over the back of your head, you will be touching the location of your occipital lobe, one of the four lobes of your brain. The image of the cat or dog from the retina is relayed to your primary visual cortex, a thin sheet of tissue in your occipital lobe that is less than one-tenth of an inch thick and a bit larger than a half-dollar.
Now, the visual processing begins. Converting light energy into a meaningful image is a complex process handled by multiple layers of the brain. One layer may process information mainly about shape, a second about color, and a third about movement, location, and spatial orientation. The result of this multi-layer processing is your conclusion that the cat was a cat and the dog was a dog. The conclusion was based on many years of experience.
This was an abbreviated description of how we see cats and dogs. Much is still not well understood, but research is adding to what is known at a fast pace. If you want to learn more about how visual processing works, visit brainfacts.org.
Let’s return to the park bench. Suppose the local police have installed a surveillance camera on a utility pole behind the park bench. Assuming it was a standard surveillance camera, the camera saw the cat and the dog go by but did not know what they were. The camera was simply capturing video and displaying it on monitors at the police station or monitoring center. However, if the video were made available to a powerful computer, could the computer determine which was a cat and which was a dog? Until recently, the answer was no.
As humans, we learned to differentiate cats from dogs through experience. A computer can do the same thing with data, a lot of data. Researchers can show the computer thousands of digital pictures of cats and dogs and tell it which is which. Google has developed a neural network with algorithms that can sort through vast amounts of data and learn to spot patterns. Much like the human brain, the neural network has multiple layers that apply filters to determine what it sees. One layer might look at the shape of ears. Another might look at feet. Other characteristics to be examined and compared could include color, fur texture, neck length, and leg structure. In general, the more data available to the neural network, the more accurate it becomes. Google’s neural network with 22 layers accurately learned the concept of a cat from 10 million images. One estimate shows there are more than six billion images of cats on the Internet.
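For readers who like to see ideas in code, here is a minimal Python sketch of learning from labeled examples. It is not Google's system, and it is far simpler than a multi-layer neural network: it trains a single artificial neuron (logistic regression). The two features, ear pointiness and snout length, and all of the numbers used to generate the animals are made up for illustration. A real image classifier learns its own features from raw pixels.

```python
import math
import random


def make_example(rng):
    """Return (features, label) for one made-up animal; label 1 = cat, 0 = dog.

    The feature values are hypothetical, chosen only so cats and dogs differ.
    """
    is_cat = rng.random() < 0.5
    if is_cat:
        ear_pointiness = rng.gauss(0.8, 0.1)
        snout_length = rng.gauss(0.2, 0.1)
    else:
        ear_pointiness = rng.gauss(0.4, 0.15)
        snout_length = rng.gauss(0.7, 0.15)
    return (ear_pointiness, snout_length), int(is_cat)


def predict(weights, bias, features):
    """Return the model's estimated probability that the animal is a cat."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, learning_rate=0.5):
    """Nudge the weights after each labeled example to reduce the error."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def accuracy(weights, bias, examples):
    """Return the fraction of examples the model labels correctly."""
    correct = sum((predict(weights, bias, f) >= 0.5) == bool(y)
                  for f, y in examples)
    return correct / len(examples)


rng = random.Random(0)  # fixed seed so the run is repeatable
training_set = [make_example(rng) for _ in range(500)]
test_set = [make_example(rng) for _ in range(200)]  # animals never seen in training

weights, bias = train(training_set)
print(f"Accuracy on unseen animals: {accuracy(weights, bias, test_set):.2f}")
```

The program is never given a rule for telling cats from dogs. It starts with weights of zero and adjusts them a little after each labeled example, which is the same basic idea, on a tiny scale, as showing a neural network millions of pictures.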
Now, back to the park bench again. You finished reading your book and were ready to go for a walk in the park. When the camera saw you, did it know who you were? Differentiating between cats and dogs using AI is interesting, but facial recognition is even more interesting.
AI will be touching every aspect of business and our personal lives. The rate of progress is stunning. The technology has risks but also many potential benefits. Next week, I will discuss how AI can recognize our faces and what the implications are.