An MIT Technology Review article introduces the man responsible for the 30-year-old deep learning approach, explains what deep machine learning is, and questions whether deep learning may be the last significant innovation in the AI field. The article also touches on a potential way forward for developing AIs with qualities more analogous to the human brain’s functioning.
This article first describes progress in grounded cognition theories, then turns to how they should be applied to robotics and artificial intelligence. Some excerpts:
“Grounded theories assume that there is no central module for cognition. According to this view, all cognitive phenomena, including those considered the province of amodal cognition such as reasoning, numeric, and language processing, are ultimately grounded in (and emerge from) a variety of bodily, affective, perceptual, and motor processes. The development and expression of cognition is constrained by the embodiment of cognitive agents and various contextual factors (physical and social) in which they are immersed. The grounded framework has received numerous empirical confirmations. Still, there are very few explicit computational models that implement grounding in sensory, motor and affective processes as intrinsic to cognition, and demonstrate that grounded theories can mechanistically implement higher cognitive abilities. We propose a new alliance between grounded cognition and computational modeling toward a novel multidisciplinary enterprise: Computational Grounded Cognition. We clarify the defining features of this novel approach and emphasize the importance of using the methodology of Cognitive Robotics, which permits simultaneous consideration of multiple aspects of grounding, embodiment, and situatedness, showing how they constrain the development and expression of cognition.”
“According to grounded theories, cognition is supported by modal representations and associated mechanisms for their processing (e.g., situated simulations), rather than amodal representations, transductions, and abstract rule systems. Recent computational models of sensory processing can be used to study the grounding of internal representations in sensorimotor modalities; for example, generative models show that useful representations can self-organize through unsupervised learning (Hinton, 2007). However, modalities are usually not isolated but form integrated and multimodal assemblies, plausibly in association areas or ‘convergence zones'” (Damasio, 1989; Simmons and Barsalou, 2003).
“An important challenge is explaining how abstract concepts and symbolic capabilities can be constructed from grounded categorical representations, situated simulations and embodied processes. It has been suggested that abstract concepts could be based principally on interoceptive, meta-cognitive and affective states (Barsalou, 2008) and that selective attention and categorical memory integration are essential for creating a symbolic system” (Barsalou, 2003).
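The excerpt's reference to generative models whose "useful representations can self-organize through unsupervised learning" (Hinton, 2007) can be illustrated with a toy sketch. The code below is my own illustration, not from the article: a tiny linear autoencoder that, given only raw "sensory" data and no labels, learns a compressed internal representation that reconstructs the input.

```python
import numpy as np

# Toy sketch (my illustration, not from the excerpt): a tiny linear
# autoencoder whose internal representation self-organizes from raw
# "sensory" data alone -- no labels -- in the spirit of the unsupervised
# generative models Hinton describes.
rng = np.random.default_rng(0)

# Simulated sensory input: 200 samples that actually live on a 2-D
# subspace of a 10-D input space.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing

W_enc = rng.normal(scale=0.1, size=(10, 2))  # encoder weights
W_dec = rng.normal(scale=0.1, size=(2, 10))  # decoder weights

def recon_error(W_enc, W_dec):
    return float(np.mean((X @ W_enc @ W_dec - X) ** 2))

err_before = recon_error(W_enc, W_dec)
lr = 0.1
for _ in range(300):
    H = X @ W_enc                      # internal representation
    G = 2 * (H @ W_dec - X) / X.size   # gradient of mean squared error
    grad_dec = H.T @ G
    grad_enc = X.T @ (G @ W_dec.T)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc
err_after = recon_error(W_enc, W_dec)
print(err_before, err_after)  # reconstruction error drops without supervision
```

The point of the sketch is only that structure in the input is enough to organize an internal code; nothing tells the network what the two underlying dimensions are.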
Should it surprise us that human biases find their way into human-designed AI algorithms trained using data sets of human artifacts?
Machine-learning software trained on the datasets didn’t just mirror those biases; it amplified them. If a photo set generally associated women with cooking, software trained by studying those photos and their labels created an even stronger association.
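The amplification mechanism is easy to see in a toy sketch. The numbers below are hypothetical, not from the study: if a model learns that predicting the majority gender for an activity maximizes accuracy, a 66% association in the data becomes a 100% association in its output.

```python
# Toy sketch (hypothetical numbers, not from the study): how an
# accuracy-driven model can amplify a dataset bias rather than just
# mirror it.
from collections import Counter

# Training photos labeled (activity, gender): 66% of "cooking" photos
# in this made-up dataset show women.
train = [("cooking", "woman")] * 66 + [("cooking", "man")] * 34

def rate(pairs, activity, gender):
    genders = [g for a, g in pairs if a == activity]
    return genders.count(gender) / len(genders)

# A naive classifier that always predicts the majority gender seen in
# training for each activity -- the safest bet for raw accuracy.
majority = Counter(g for _, g in train).most_common(1)[0][0]
predictions = [("cooking", majority) for _ in range(100)]

print(rate(train, "cooking", "woman"))        # 0.66 in the data
print(rate(predictions, "cooking", "woman"))  # 1.0 in the model's output
```

Real models are subtler than this majority-vote caricature, but the incentive is the same: leaning on a correlated attribute improves measured accuracy while strengthening the association.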
It’s common for brain functions to be described in terms of digital computing, but this metaphor does not hold up in brain research. Unlike computers, in which hardware and software are separate, organic brains’ structures embody memories and brain functions. Form and function are entangled.
Rather than finding brains to work like computers, we are beginning to design computers–artificial intelligence systems–to work more like brains.
Caltech researchers have identified the brain mechanisms that enable primates to quickly identify specific faces. In a feat of efficiency, surprisingly few feature-recognition neurons are involved in a process that may be able to distinguish among billions of faces. Each neuron in the facial-recognition system specializes in noticing one feature, such as the width of the part in the observed person’s hair. If the person is bald or has no part, the part-width-recognizing neuron remains silent. A small number of such specialized-recognizer neurons feed their outputs to other layers (patches) that integrate a higher-level pattern (e.g., hair pattern), and these integrate at yet higher levels until there is a total face pattern. This process occurs nearly instantaneously and works regardless of the view angle (as long as some facial features are visible). Also, by cataloging which neurons perform which functions and then mapping these to a relatively small set of composite faces, researchers were able to tell which face a macaque (monkey) was looking at.
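The decoding result has a simple geometric reading: if each neuron fires in proportion to one feature axis of a face, then a modest population of neurons is a linear code that can be inverted to recover the face being viewed. The sketch below is a simplification I wrote to illustrate the idea, not the researchers' actual pipeline.

```python
import numpy as np

# Sketch (a simplification, not the researchers' actual method): each
# neuron's firing rate is the projection of a face's feature vector onto
# that neuron's preferred axis. With more neurons than features, the
# population response can be inverted to identify the viewed face.
rng = np.random.default_rng(1)

n_faces, n_features, n_neurons = 1000, 50, 200
faces = rng.normal(size=(n_faces, n_features))   # each face = feature vector
axes = rng.normal(size=(n_features, n_neurons))  # each neuron's feature axis

responses = faces @ axes                         # population firing rates

# Decode: recover feature vectors from firing rates via the pseudoinverse,
# then identify the viewed face by nearest neighbor.
decoded = responses @ np.linalg.pinv(axes)
viewed = 123
match = int(np.argmin(np.linalg.norm(faces - decoded[viewed], axis=1)))
print(match)  # 123: the population code picks out the right face
```

Note that 200 "neurons" suffice here for 1,000 faces because identification only requires recovering a 50-dimensional feature vector, which echoes the surprising efficiency the researchers report.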
These findings seem to correlate closely with the pattern-recognition theory of mind proposed by Ray Kurzweil (a director of engineering at Google).
An article at Wired.com considers the pros and cons of making the voice interactions of AI assistants more humanlike.
The assumption that more human-like speech from AIs is naturally better may prove as incorrect as the belief that the desktop metaphor was the best way to make humans more proficient in using computers. When designing the interfaces between humans and machines, should we minimize the demands placed on users to learn more about the system they’re interacting with? That seems to have been Alan Kay’s assumption when he designed the first desktop interface back in 1970.
Problems arise when the interaction metaphor diverges too far from the reality of how the underlying system is organized and works. In a personal example, someone dear to me grew up helping her mother–an office manager for several businesses. Dear one was thoroughly familiar with physical desktops, paper documents and forms, file folders, and filing cabinets. As I explained how to create, save, and retrieve information on a 1990 Mac, she quickly overcame her initial fear. “Oh, it’s just like in the real world!” (Chalk one for Alan Kay? Not so fast.) I knew better than to tell her the truth at that point. Dear one’s Mac honeymoon crashed a few days later when, to her horror and confusion, she discovered a file cabinet inside a folder. A few years later, there was another metaphor collapse when she clicked on a string of underlined text in a document and was forcibly and instantly transported to a strange destination.
Having come to terms with computers through the command-line interface, I found the desktop metaphor annoying and unnecessary. Hyperlinking, however–that’s another matter altogether–an innovation that multiplied the value I found in computing.
On the other end of the complexity spectrum would be machine-level code. There would be no general computing today if we all had to speak to computers in their own fundamental language of ones and zeros. That hasn’t stopped some hard-core computer geeks from advocating extreme positions on appropriate interaction modes, as reflected in this quote from a 1984 edition of InfoWorld:
“There isn’t any software! Only different internal states of hardware. It’s all hardware! It’s a shame programmers don’t grok that better.”
Interaction designers operate on the metaphor end of the spectrum by necessity: the human brain organizes concepts by semantic association, and familiar metaphors leverage those associations. But sometimes a different metaphor makes all the difference. And sometimes, to be truly proficient when interacting with automation systems, we have to invest the effort to understand less simplistic metaphors.
The article referenced at the beginning of this post mentions that humans are manually coding “speech synthesis markup tags” to make the synthesized voices of AI systems sound more natural. (Note that this creates the appearance that the AI understands the user’s intent and emotional state, though this more natural-seeming intelligence is illusory.) Intuitively, this sounds appropriate. The downside, as the article points out, is that colloquial AI speech limits human-machine interactions to the sort of vagueness inherent in informal speech. It also trains humans to be less articulate. The result may be interactions that fail to clearly communicate what either party actually means.
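For concreteness, here is the kind of markup involved. This fragment is my own illustration using standard SSML (Speech Synthesis Markup Language) elements, not an example from the article, and exact tag support varies by speech platform:

```xml
<speak>
  I found three flights.
  <break time="300ms"/>
  The cheapest leaves at
  <say-as interpret-as="time">8:45am</say-as>.
  <prosody rate="95%" pitch="+2st">Want me to book it?</prosody>
</speak>
```

Every pause, pitch shift, and pacing change that makes the voice sound attentive is hand-specified here; none of it reflects any actual understanding on the system’s part.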
I suspect a colloquial mode could be more effective in certain kinds of interactions: when attempting to deceive a human into thinking she’s speaking with another human; virtual talk therapy; when translating from one language to another in situations where idioms, inflections, pauses, tonality, and other linguistic nuances affect meaning and emotion; etc.
In conclusion, operating systems, applications, and AIs are not humans. To improve our effectiveness in using more complex automation systems, we will have to meet them farther along the complexity continuum–still far from machine code, but at points of complexity that require much more of us as users.
Here’s a useful artificial intelligence introductory lesson from an MIT course:
This NY Times article is worth your time, if you are interested in AI–especially if you are still under the impression AI has ossified or lost its way.
Google and others are developing neural networks that learn to recognize and imitate patterns present in works of art, including music. The path to autonomous creativity is unclear. Current systems can imitate existing artworks, but cannot generate truly original works. Human prompting and configuration are required.
Google’s Magenta project’s neural network learned from 4,500 pieces of music before creating the following simple tune (drum track overlaid by a human):
(Audio clip embedded in the original post.)
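Magenta’s models are neural networks, but the core idea, learning transition patterns from existing music and then sampling a new sequence that imitates them, can be sketched with something far simpler. The toy Markov chain below is my illustration, not Magenta’s method:

```python
import random
from collections import defaultdict

# Toy illustration (not Magenta's method): learn which note tends to
# follow which in a tiny "corpus," then sample a new tune that imitates
# those transition patterns.
random.seed(42)

corpus = ["C", "E", "G", "E", "C", "D", "E", "F", "E", "D", "C",
          "G", "F", "E", "D", "C", "E", "G", "C"]

# Count observed note-to-note transitions.
transitions = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    transitions[a].append(b)

# Sample a new 8-note tune: each next note is drawn from the notes that
# followed the current note in the corpus.
note = "C"
tune = [note]
for _ in range(7):
    note = random.choice(transitions[note])
    tune.append(note)
print(tune)
```

The output is never a copy of the corpus, but it can never escape the corpus’s patterns either, which is the limitation the article describes: imitation without genuine originality.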
Is it conceivable that AI may one day be able to synthesize new made-to-order creations by blending features from a catalog of existing works and styles? Imagine being able to specify, “Write me a new musical composition reminiscent of Rhapsody in Blue, but in the style of Lynyrd Skynyrd.”
There is already at least one human who could instantly play Rhapsody in Blue in Skynyrd style, but even he does not (to my knowledge) create entirely original pieces.
Original article: https://www.technologyreview.com/s/601642/ok-computer-write-me-a-song/