ACML 2024 Distinguished Speakers Series: Professor Heng Ji

ABOUT THE SPEAKER

Heng Ji is a professor at the Siebel School of Computing and Data Science and an affiliated faculty member at the Electrical and Computer Engineering Department, Coordinated Science Laboratory, and Carl R. Woese Institute for Genomic Biology of the University of Illinois Urbana-Champaign. She is an Amazon Scholar and the Founding Director of the Amazon-Illinois Center on AI for Interactive Conversational Experiences (AICE).

Heng Ji received her B.A. and M.A. in Computational Linguistics from Tsinghua University and her M.S. and Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially Multimedia Multilingual Information Extraction, Knowledge-enhanced Large Language Models and Vision-Language Models, and AI for Science. She has received numerous awards, including the Outstanding Paper Award at ACL 2024, two Outstanding Paper Awards at NAACL 2024, and recognition as a "Young Scientist" by the World Laureates Association in 2023 and 2024. She was also a member of the World Economic Forum's Global Future Council on the Future of Computing in 2016 and 2017.

Prof. Heng Ji

University of Illinois Urbana-Champaign

ABOUT THE TALK
Title: Making Large Language Model’s Knowledge More Accurate, Organized, Up-to-date and Fair

Large language models (LLMs) have demonstrated remarkable performance on knowledge reasoning tasks, owing to the implicit knowledge they derive from extensive pretraining data. However, their internal knowledge bases are often disorganized, prone to hallucination, biased towards familiar entities, and quick to become outdated. Consequently, LLMs frequently fabricate untruthful information, resist updates to outdated knowledge, and struggle to generalize across multiple languages. In this talk, Heng Ji will aim to answer the following questions:

  • Where and How Is Knowledge Stored in LLMs?
  • Why Do LLMs Lie?
  • How Do We Update an LLM’s Dynamic Knowledge?
  • How Can We Achieve the Ripple Effect of LLM Knowledge Updating?
  • What Can Knowledge + LLMs Do for Us?

She will also present a case study on “SmartBook”, a system for situation report generation.

Her investigations reveal several underlying causes:

  1. LLMs acquire implicit knowledge primarily through attention-weighted associations between words rather than through an explicit understanding of concepts, entities, attributes, relations, events, semantic roles, and logic. The talk will investigate where various types of knowledge are stored inside LLMs.
  2. Frequent word associations overshadow uncommon ones, owing to training-data imbalance and broad context windows, particularly where dynamic events are involved.
  3. Counter-intuitive updating behaviors are elucidated through a novel gradient similarity metric (one possible form is sketched after this list).
  4. LLMs are often unaware of real-world events occurring after their pretraining phase, complicating the anchoring of related knowledge updates.
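
As a hedged illustration of point 3: one plausible form of a gradient similarity metric is to compare the parameter gradients induced by an outdated fact and by its updated counterpart. The sketch below (in PyTorch, with a small stand-in model) is an assumption for illustration only; the model name, the fact pair, and the cosine formulation are not taken from the talk.

```python
# Illustrative sketch only: one plausible gradient similarity metric for
# studying knowledge updating, NOT the specific metric from the talk.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # small stand-in model; any causal LM would work
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def loss_grad(text: str) -> torch.Tensor:
    """Flattened gradient of the LM loss on `text` w.r.t. all parameters."""
    model.zero_grad(set_to_none=True)
    batch = tok(text, return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()
                      if p.grad is not None])

# If the outdated fact and its update push the parameters in nearly the
# same direction, an edit may reinforce rather than overwrite the old
# association; one way counter-intuitive updating behavior could arise.
g_old = loss_grad("The UK prime minister is Boris Johnson.")
g_new = loss_grad("The UK prime minister is Rishi Sunak.")
print(torch.nn.functional.cosine_similarity(g_old, g_new, dim=0).item())
```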

While existing methods focus primarily on updating entity attributes, her research underscores the necessity of updating factual knowledge grounded in real-world events, including their participants, semantic roles, times, and locations. Her team proposes a novel framework for knowledge updating in LLMs that leverages event-driven signals to identify factual errors preemptively, and introduces a training-free self-contrastive decoding approach to mitigate inference errors; a rough sketch of one possible form of such decoding follows.
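
As a minimal, hypothetical sketch of what a training-free self-contrastive decoding step could look like: the model is contrasted with itself by comparing next-token distributions with and without a grounding context, favoring tokens whose likelihood the context raises. The contrast rule, the `alpha` hyperparameter, and the example strings are assumptions for illustration, not the framework presented in the talk.

```python
# Hypothetical sketch of training-free self-contrastive decoding:
# contrast the model's next-token distribution given grounding context
# against its distribution from the prompt alone (its parametric prior).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def self_contrastive_step(context: str, prompt: str, alpha: float = 1.0) -> str:
    """Greedily pick the next token by contrasting
    P(token | context + prompt) against P(token | prompt alone)."""
    with_ctx = tok(context + " " + prompt, return_tensors="pt")
    no_ctx = tok(prompt, return_tensors="pt")
    logp_ctx = model(**with_ctx).logits[0, -1].log_softmax(-1)
    logp_prior = model(**no_ctx).logits[0, -1].log_softmax(-1)
    # Boost tokens the grounding context makes more likely; penalize
    # tokens favored only by the model's (possibly stale) prior.
    scores = logp_ctx + alpha * (logp_ctx - logp_prior)
    return tok.decode(int(scores.argmax()))

print(self_contrastive_step(
    context="Rishi Sunak became UK prime minister in October 2022.",
    prompt="The current UK prime minister is",
))
```

Because the contrast needs only two forward passes per step, such an approach stays training-free; repeated in a loop, it would yield a full continuation.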