Understanding NLU: Overcoming Language Barriers Through Meaning
In this article, we conclude our examination of the challenges posed by existing scientific models before progressing to potential solutions in the next installment. Additionally, a related video can be found on YouTube here: https://youtu.be/ZwdAr8kvkj8.
To summarize, we have scrutinized the assertion that the primary purpose of linguistic analysis lies in grasping grammar and structure. While this perspective might have resonated with behaviorists in the mid-20th century, neglecting meaning has proven to be a significant obstacle for NLU.
Parsing Challenges in Computational Linguistics
Parsing and NLU are often deemed AI-Complete, meaning the tasks cannot be accomplished until every other AI challenge is solved, because of the inherent ambiguity and complexity of natural language. We referenced Stanford Professor Daniel Jurafsky, who identified parsing as an NP-Complete problem when applied to the full breadth of a human language, suggesting that parsing alone is an unworkable path to NLU.
The Issues with Parts of Speech
Today, we will explore another critical challenge in parsing: the foundational elements that contribute to the dreaded and perplexing combinatorial explosion. Our proposed solution involves incorporating a dictionary that eliminates the ambiguities linked to parts of speech, thereby preventing the loss of data associated with rigid rules.
Is the mainstream view, particularly among formal and computational linguists, still that languages are built from grammatical components such as parts of speech? Indeed, this belief persists. As noted previously, the concept was introduced to linguistics by Bloomfield in the 1930s through his work on immediate constituent analysis.
Here’s an image from a presentation by Professor Chris Manning at Stanford in 2017. Manning, an Australian linguist and machine learning expert, illustrates a fixed sequence of processing: determining words, then syntax (parsing), followed by meaning and context.
The challenge is that the meaning of a word or phrase frequently depends on its context, and some languages are far less orderly than English in this regard. Below are three representations of phrases in different languages that support the idea that the three steps (word, phrase, meaning) are really a single step. In my experience, the steps must be integrated for NLU, allowing a unified system for multiple languages (we have tested nine in our lab) and aligning with the Role and Reference Grammar (RRG) linking algorithm, which connects syntax to semantics.
In English, the core predicate "speared" has referent phrases (NP in the diagram) linking actor and undergoer roles, with a location predicate (where) surrounding it.
In the Australian language Dyirbal, the referent phrases (NPs) are dispersed throughout the sentence, yet the core predicate still connects its actor and undergoer with the location predicate encompassing it.
In Georgian, a head-marking language, the structure of the sentence "the man gave the book to the woman" is marked on a single word, the predicate, and so resides at the morphological level. Here, morphology takes over the role of phrasal syntax.
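To make the point concrete, here is a minimal sketch of what "three syntaxes, one meaning" could look like. The Predication class and the role names are my illustrative assumptions, not the notation of any actual RRG implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Predication:
    """Language-independent meaning: one predicate plus its role fillers."""
    predicate: str
    roles: dict = field(default_factory=dict)

# "The man gave the book to the woman": whether the arguments arrive as
# ordered English phrases, scattered case-marked NPs (Dyirbal-style), or
# markers inside a single Georgian verb, linking lands on one structure.
from_english  = Predication("give", {"actor": "man", "undergoer": "book", "recipient": "woman"})
from_georgian = Predication("give", {"actor": "man", "undergoer": "book", "recipient": "woman"})

assert from_english == from_georgian  # different syntax, identical semantics
```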
As I will elaborate on another occasion, Patom theory anticipates the integration of language analysis to facilitate automated learning.
The key takeaway is that as we analyze increasingly diverse languages, the path to NLU emerges as a synthesis of syntax, semantics, and discourse-pragmatics. The solution lies in RRG.
I describe the analysis of human language as pattern matching. The pattern cannot be broken into fragments, because any missing piece of information blocks the resolution of meaning.
I am aware that many computational linguists may question how their models fall short in the face of authentic NLU. However, as Galileo famously stated when faced with skeptics regarding his observations of Jupiter's moons, "In questions of science, the authority of a thousand is not worth the humble reasoning of a single individual."
Current Dictionary Definitions
Lexical categories (parts of speech or POS) derive from an age-old model of language composition. POS serves as the foundation for our dictionaries. Unfortunately, relying solely on POS introduces unnecessary redundancy in definitions as meanings overlap across parts of speech, leading to a multitude of combinatorial issues—especially when considering phrases.
The existing model is inadequate for NLU, but we can rectify this by incorporating meaning.
While dictionaries typically utilize headwords to introduce inflected forms (such as cat/cats for nouns, happy/happier/happiest for adjectives, and run/runs/running/ran for verbs), they often redundantly replicate definitions among these parts of speech.
In the table below, all variations of the word forms share a single definition. The differences in form signal additional semantic elements (person, tense, and so on). For common words like "to be" and "to go," the variation can be substantial, with an unrelated form replacing the regular pattern (e.g., be-is-was / go-went). This phenomenon is termed suppletion, but the principle is unchanged: each form keeps a single definition while adding semantic content.
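As a rough sketch of this headword model (the data layout below is my assumption, not a real dictionary format), each definition is stored once and every form, regular or suppletive, merely adds semantic features:

```python
# One definition per headword; forms add features. Suppletive forms
# ("went", "was") are irregular in shape only; the rule is unchanged.
DEFINITIONS = {
    "go":  "move from one place to another",
    "be":  "exist, or have a property",
    "run": "move quickly on foot",
}

FORMS = {
    "goes": ("go",  {"person": 3, "number": "sg", "tense": "present"}),
    "went": ("go",  {"tense": "past"}),            # suppletion
    "was":  ("be",  {"tense": "past", "number": "sg"}),
    "ran":  ("run", {"tense": "past"}),
}

def lookup(form):
    headword, features = FORMS[form]
    return DEFINITIONS[headword], features

print(lookup("went"))  # ('move from one place to another', {'tense': 'past'})
```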
This method seems logical for dictionary construction. By consolidating a single definition for various forms of the same word, the definition is documented only once—right?
However, unlike the singular headword for one definition depicted above, the meaning of "running" is duplicated across adjective, noun, and verb forms in the dictionary.
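The duplication looks something like the following sketch (the POS labels and glosses are illustrative):

```python
# A POS-keyed dictionary is forced to store the same meaning three times:
POS_DICTIONARY = {
    ("running", "verb"):      "moving quickly on foot",
    ("running", "noun"):      "the activity of moving quickly on foot",
    ("running", "adjective"): "that moves quickly on foot",
}

# A meaning-keyed dictionary stores it once; whether "running" acts as a
# predicate, a referent, or a modifier is resolved in context, not duplicated.
MEANING_DICTIONARY = {
    "running": "p:run",
}
```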
For early rule-based systems, this definitional ambiguity resulted in a daunting proliferation of rules and parsed tree structures, along with challenges in dictionary maintenance.
There exists a third scenario in which words with differing 'manners' can operate under a unified core definition. This approach allows for headwords like "give," "take," "pick up," "carry," and "grab" to share a singular definition. We will explore this further later.
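A sketch of what that sharing might look like, with a hypothetical 'manner' attribute distinguishing the headwords:

```python
# One core definition stored once; each headword contributes only its manner.
CORE = "cause an object to move or change possession"

MANNER_VERBS = {
    "give":    {"core": CORE, "manner": "hand over to a recipient"},
    "take":    {"core": CORE, "manner": "move toward oneself"},
    "pick up": {"core": CORE, "manner": "lift from a resting place"},
    "carry":   {"core": CORE, "manner": "hold while moving"},
    "grab":    {"core": CORE, "manner": "seize quickly"},
}
```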
Data Loss Due to Parts of Speech
The parsing rules also contribute to data loss, as symbols like NP (a noun phrase) obscure the meaning of the underlying word when the rule is applied.
In a syntactic model, a sentence composed of terminal symbols like "the travelling" and "the cat" is converted into a non-terminal symbol, NP. Conversely, a meaning-based model translates the former into "p:travel" (indicating the activity of travel) and the latter into "r:cat" (denoting the referent meaning cat) along with some attributes. When extending the sentences with "to Princeton," a syntactic model adds the symbol PP (resulting in NP PP), while a meaning-based model appends the 'goal' to "p:travel" without affecting "r:cat," as it lacks relevance.
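The contrast can be sketched as follows. The "p:"/"r:" notation follows the article; the dictionaries themselves are my illustration:

```python
# Syntactic model: both phrases collapse to the same opaque symbol, NP,
# so the parser can no longer see which word it swallowed.
syntactic = {
    "the travelling": "NP",
    "the cat":        "NP",
}

# Meaning-based model: the words keep their meanings.
meaning = {
    "the travelling": {"p:travel": {}},   # a predicate (an activity)
    "the cat":        {"r:cat": {}},      # a referent (a thing)
}

# Extending with "to Princeton": the goal attaches to p:travel only;
# r:cat takes no goal, so the attachment is simply irrelevant there.
meaning["the travelling"]["p:travel"]["goal"] = "Princeton"
```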
The meaning-based system differentiates between 'travel' and 'cat': the former is a predicate and the latter a referent. These are meaning-based (semantic) terms, with travel representing an action and cat a tangible object. I acknowledge that ambiguous terms exist that may make this introduction seem insufficient. The next phase in NLU validates the predicate with its arguments to rule out invalid combinations, first against the dictionary and then against context.
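A toy version of that validation step might look like this (the valence table and role names are assumptions for illustration, not the author's system):

```python
# Each predicate licenses certain roles; a reading whose arguments the
# dictionary cannot license is ruled out before context is consulted.
VALENCE = {
    "p:travel": {"actor", "goal"},
    "p:spear":  {"actor", "undergoer", "location"},
}

def valid_reading(predicate, args):
    """Keep a reading only if every argument role is licensed."""
    return set(args) <= VALENCE.get(predicate, set())

print(valid_reading("p:travel", {"actor": "man", "goal": "Princeton"}))  # True
print(valid_reading("p:travel", {"undergoer": "cat"}))                   # False
```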
Enhancing the Dictionary
To simplify our model, we can hypothesize that languages are constructed based on the meanings of words rather than merely grammatical categories. Beyond comprehending a word's meaning, how it is represented is equally crucial to the system employing it. Neuroscience informs us that various types of sensory representations are localized, including qualities like color, visual motion, facial recognition, and elements of speech. This is essential for automatic language learning. The scientific method should refine the model to mirror how languages worldwide handle definitions.
Context is the only way to resolve certain kinds of ambiguity, and when the speaker has been vague, asking a clarifying question accomplishes the same thing.
Ultimately, a word's meaning relates to the real world in some capacity, rather than merely to the arbitrary sign that links to it. Furthermore, meaning transcends language. The act of catching a ball is still an action where a ball is caught, regardless of the terminology used to describe it.
In our next discussion, we will examine how the duplication of definitions across parts of speech is addressed through meaning-based terms and the dictionary method described previously. This will result in a set of meaningless word forms that connect to the language-independent definition at some level.
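In other words, something like this sketch, where the languages and spellings are just examples:

```python
# Word forms are language-specific labels; the definition lives once,
# independent of any language.
FORM_TO_MEANING = {
    ("English", "cat"):   "r:cat",
    ("German",  "Katze"): "r:cat",
    ("Spanish", "gato"):  "r:cat",
}
```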
Summary
We began this series with a review of our current position and with demonstrations of NLU in action, breaking through the constraints of existing scientific models. The journey ahead is long, as we shift our focus from what has failed to what works.
In essence, NLU has been hindered by the inclination to parse sentences without regard for context and meaning, relying on parts of speech.
We recognized that this issue was identified as NP-complete in 1996. The remedy involves reintegrating language components into a more straightforward model that can be comprehended by the brain. As we progress, we will explore how Patom theory facilitates language acquisition and how NLU is achievable through the extensive insights of the mature linguistic model, Role and Reference Grammar, across a diverse range of languages.
(next — the dictionary modification to mitigate a primary driver of the parsing combinatorial explosion)