Education
University of Oregon
Ph.D. in Computer Science, advised by Prof. Thien Huu Nguyen.
Fall 2023 - Present
Hanoi University of Science and Technology
B.S. in Computer Science, advised by Dr. Linh Ngo Van.
Fall 2018 - Spring 2023
Employment
VinAI Research
Research Resident in the Natural Language Processing Group. Worked on Information Extraction tasks and Large Language Models (LLMs).
Mar 2022 - Sep 2023
Hanoi, Vietnam
VietAI
Lecturer for the courses Advances in Natural Language Processing and LLMs & Industry Practices.
Mar 2023 - Present
Hanoi, Vietnam
ICOMM Media and Tech., Jsc
Research Intern on the R&D team. Developed and deployed an open-domain Question Answering system for the Vietnamese language.
Jun 2019 - Sep 2020
Hanoi, Vietnam
Selected Publications
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
We introduced a large-scale multilingual dataset with 6.3 trillion tokens in 167 languages, readily usable for Large Language Model (LLM) development.
LREC-COLING 2024
Transitioning Representations between Languages for Cross-lingual Event Detection via Langevin Dynamics
We explored a novel alignment method for cross-lingual transfer learning in Event Detection.
EMNLP 2023 (Findings)
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning
A framework that introduces resources and models for instruction tuning of LLMs with RLHF in 26 languages.
EMNLP 2023 (Demonstration)
A Spectral Viewpoint on Continual Relation Extraction
A novel method for Continual Relation Extraction (CRE) with Feature Decorrelation.
EMNLP 2023 (Findings)
Retrieving Relevant Context to Align Representations for Cross-lingual Event Detection
A new approach to the cross-lingual transfer learning problem in Event Detection using a retrieval-augmented method.
ACL 2023 (Findings)
Contextualized Soft Prompts for Extraction of Event Arguments
A novel approach for document-level Event Argument Extraction (EAE) using graph-based soft prompts with better customizability and stability.
ACL 2023 (Findings)