Hello, I'm Chien Nguyen

I am a Ph.D. student in Computer Science at the University of Oregon, where I am fortunate to be advised by Prof. Thien Huu Nguyen. Prior to this, I completed my Bachelor’s degree in Computer Science at Hanoi University of Science and Technology, under the supervision of Dr. Linh Ngo Van. Before joining the University of Oregon, I spent two wonderful years working as an AI Research Resident in the Natural Language Processing Group at VinAI Research, Vietnam.

My current research interests lie in exploring generative models at scale, focusing on their applications in Multimodal Learning, Large Language Models, and Information Extraction.

Outside of research, I am also a big fan of Doraemon and used to dream of becoming the next Dragon Warrior like Po.

CV Scholar GitHub Twitter

News

Jun 2024: I joined Adobe Research as a Research Scientist Intern with Franck Dernoncourt.
Mar 2024: Two papers were accepted at COLING 2024.
Jan 2024: We introduced Vistral, a state-of-the-art conversational LLM for Vietnamese.
Oct 2023: Three papers were accepted at EMNLP 2023.
Sep 2023: I started my Ph.D. life at the University of Oregon!
May 2023: Two papers were accepted at ACL 2023.
Mar 2022: I joined VinAI Research as a Research Resident.

Selected Publications

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Thuat Nguyen, Chien Van Nguyen, Viet Lai, Hieu Man, Nghia Trung Ngo, Franck Dernoncourt, Ryan A Rossi, Thien Huu Nguyen

LREC-COLING 2024

We introduced a large-scale multilingual dataset with 6.3 trillion tokens in 167 languages, readily usable for developing Large Language Models (LLMs).

Transitioning Representations between Languages for Cross-lingual Event Detection via Langevin Dynamics

Chien Van Nguyen, Huy Huu Nguyen, Franck Dernoncourt, Thien Huu Nguyen

EMNLP 2023 (Findings)

We explored a novel alignment method for cross-lingual transfer learning in Event Detection.

Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning

Viet Lai*, Chien Van Nguyen*, Nghia Trung Ngo, Thuat Nguyen, Franck Dernoncourt, Ryan A Rossi, Thien Huu Nguyen

EMNLP 2023 (Demonstration)

A framework that introduces resources and models for instruction tuning of LLMs with RLHF in 26 languages.

A Spectral Viewpoint on Continual Relation Extraction

Huy Huu Nguyen, Chien Van Nguyen, Linh Ngo Van, Anh Tuan Luu, Thien Huu Nguyen

EMNLP 2023 (Findings)

A novel method for Continual Relation Extraction (CRE) with Feature Decorrelation.

Retrieving Relevant Context to Align Representations for Cross-lingual Event Detection

Chien Van Nguyen, Linh Ngo Van, Thien Huu Nguyen

ACL 2023 (Findings)

A new approach to the cross-lingual transfer learning problem in Event Detection using a retrieval-augmented method.

Contextualized Soft Prompts for Extraction of Event Arguments

Chien Van Nguyen, Hieu Man, Thien Huu Nguyen

ACL 2023 (Findings)

A novel approach for document-level Event Argument Extraction (EAE) using graph-based soft prompts with better customizability and stability.