
Professor William O’Grady: Where AI and Linguistics Meet

September 27, 2021 · 4 min read

Professor William O’Grady joined us on the More Intelligent Tomorrow podcast to discuss language acquisition, the relationship between linguistics and AI, and how AI can help preserve endangered languages.

William O’Grady is a professor of linguistics at the University of Hawai’i at Mānoa. His research focuses on syntax, language acquisition, Korean, and the acquisition and maintenance of heritage languages. He’s the author of numerous books and papers on these subjects and is devoted to the preservation and revitalization of Jejueo.

Professor O’Grady explains that “Linguistics is the study of how language works, how it’s acquired, how it’s used, how it changes over time, and how it’s represented in the brain. Linguistics is extremely important to cognitive science—and AI is in the realm of cognitive science.”

For children, the key to learning a language is hearing utterances and matching them to a corresponding situation. A mother points and says "cat," and her child points and says "cat." But learning vocabulary is only one part of language. Even young children reach the point where they start to produce messages, even whole sentences, that are entirely novel and don't match anything they've heard.

Linguists believe there’s something built-in that children bring to language acquisition—more than environment, exposure, or general intelligence. There’s an internal cognitive system that works hard to adjust to input, make sense of it, and find ways to process it. Professor O’Grady explains the key is the processing instinct: the desire to make sense of input by mapping it to another level of representation—form onto meaning or meaning onto form.

When a human learns a word, its meaning lends itself to a network of metaphors. We know what the word “hand” means, and Professor O’Grady suspects no one has to explain to a child what “give me a hand” means. AI will be good at associating words like “finger” and “glove” with “hand.” But it will struggle with idioms, metaphors, puns, humor, creating new jokes, and word play.
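That kind of association is roughly what word embeddings capture. As a purely illustrative sketch (the vectors below are toy values invented for this example, not output from any real model), cosine similarity over word vectors scores "finger" and "glove" as close to "hand," while nothing in the geometry explains why "give me a hand" means "help me":

```python
import numpy as np

# Toy word vectors (hypothetical values chosen for illustration only).
# A real system would use embeddings trained on a large corpus.
vectors = {
    "hand":   np.array([0.90, 0.80, 0.10]),
    "finger": np.array([0.85, 0.75, 0.15]),
    "glove":  np.array([0.80, 0.70, 0.20]),
    "idea":   np.array([0.10, 0.20, 0.90]),
}

def cosine(a, b):
    """Cosine similarity: close to 1.0 means the vectors point the same way."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

for word in ("finger", "glove", "idea"):
    print(f"hand vs {word}: {cosine(vectors['hand'], vectors[word]):.3f}")
```

Similarity of this kind captures relatedness, but the limitation Professor O'Grady points to, idioms and metaphor, lives outside it.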

Professor O’Grady follows Google Translate. He tried to fool it recently and was able to get it to mistranslate English sentences. “It’s getting better and better. But, no matter how good it gets, it’s not going to be a model of how language works in the human brain any more than the best airplane in the world is going to be a model for how a bird flies. There are similarities, but they’re different things.”

Asked about GPT-3, he says, "They were trying to find a workaround to build sentences and paragraphs and essays without having to speak the language or be a good writer. To some degree, the sentences and their arrangement are coherent, and that's impressive." Comparing what GPT-3 and humans do well, he says GPT-3 would win on search, whether it's asked to find a piece of content or to catch an irregular verb. But even though it can search 20,000 research articles, it will fail at comprehending them.

He believes the big question in both cognitive science and AI relates to the nature of the fit between a child’s brain and human language. One crucial factor involves processing. Two pressures help shape language: trying to reduce the burden on working memory and trying to increase opportunities for predicting what’s coming next.

“Working memory may not be a big deal for a computer because its capacity is so extraordinary. But computers care a lot about prediction and they’re very good at it. Many cognitive scientists say the human brain is basically a massive prediction machine. So there’s a place where the two likely come together.”
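To make the prediction-machine idea concrete, here is a minimal next-word predictor (my own hypothetical sketch, not something discussed in the episode) built from bigram counts: it simply guesses the word it has most often seen following the current one.

```python
from collections import Counter, defaultdict

# A tiny corpus; a real predictor would be trained on vastly more text.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each preceding word (bigram counts).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Guess the continuation seen most often after `word`, if any."""
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # 'cat' -- the most frequent word after 'the'
print(predict_next("cat"))  # 'sat' -- ties resolve to the first-seen word
```

Modern language models replace the counts with learned parameters, but the underlying task, predicting what comes next, is the same.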

Professor O’Grady’s work focuses on how the two types of processing pressures shape the morphology and syntax of language.

Asked what people working in AI can do to bring linguistics and AI closer together, he recommends learning about linguistics, including syntax, language acquisition, and the role of processing in shaping language. The literature reflects many different approaches. One resource is his essay Natural Syntax: An Emergentist Primer.

For those interested in collaborating, there's a real need in the area of language revitalization. Roughly half of the world's languages, about 3,500, are endangered, and estimates suggest that by the end of the 21st century, 90 percent of them will have been lost. But it's very difficult to save a language once it's critically endangered and the only fluent speakers are elderly.

The goal is to use deep learning techniques to develop chatbots that provide children learning endangered languages opportunities for one-on-one conversations. The challenge is obtaining enough data from actual conversations. In Professor O’Grady’s work on Jeju Island, there’s too little data available to do deep learning that would provide a fluent chatbot.
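As a hypothetical sketch only, and not Professor O'Grady's system, the snippet below shows the kind of retrieval-based fallback that tiny corpora allow: match a learner's utterance to the closest recorded prompt and replay the paired response. With only a handful of exchanges it covers almost nothing, which is exactly the gap that deep learning could close if enough conversational data existed.

```python
from difflib import SequenceMatcher

# Hypothetical placeholder exchanges; a real system would need transcribed
# conversations in the endangered language, which is exactly what is scarce.
recorded_exchanges = {
    "hello, how are you?": "i am well, thank you.",
    "what did you eat today?": "i ate fish soup.",
}

def reply(utterance: str) -> str:
    """Return the response paired with the most similar recorded prompt."""
    best_prompt = max(
        recorded_exchanges,
        key=lambda p: SequenceMatcher(None, utterance.lower(), p).ratio(),
    )
    return recorded_exchanges[best_prompt]

print(reply("Hello! How are you?"))  # echoes the closest recorded exchange
```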

Finding a solution would be a turning point in both the fields of AI and language revitalization. It would bring hope to communities that otherwise have no reason to hope.

To hear more about Professor O’Grady’s work and the relationship between linguistics and AI, check out the More Intelligent Tomorrow episode. You can also listen everywhere you enjoy podcasts, including Apple Podcasts, Spotify, Stitcher, and Google Podcasts.

Podcast
William O’Grady: Programming Linguistics
About the author
DataRobot

Value-Driven AI

DataRobot is the leader in Value-Driven AI – a unique and collaborative approach to AI that combines our open AI platform, deep AI expertise and broad use-case implementation to improve how customers run, grow and optimize their business. The DataRobot AI Platform is the only complete AI lifecycle platform that interoperates with your existing investments in data, applications and business processes, and can be deployed on-prem or in any cloud environment. DataRobot and our partners have a decade of world-class AI expertise collaborating with AI teams (data scientists, business and IT), removing common blockers and developing best practices to successfully navigate projects that result in faster time to value, increased revenue and reduced costs. DataRobot customers include 40% of the Fortune 50, 8 of the top 10 US banks, 7 of the top 10 pharmaceutical companies, 7 of the top 10 telcos, and 5 of the top 10 global manufacturers.
