Hi, I am Tommy

Tommy Chien

Postdoctoral Researcher at Beijing Academy of Artificial Intelligence

I am Tommy Chien (Hongjin Qian), a postdoctoral researcher specializing in Natural Language Processing (NLP), jointly affiliated with Peking University and the Beijing Academy of Artificial Intelligence (BAAI). I earned my PhD in 2024 from the Gaoling School of Artificial Intelligence (GSAI) at Renmin University of China, under the supervision of Prof. Zhicheng Dou and Prof. Ji-Rong Wen. I hold a Master’s degree from the University of Sydney (2019) and a Bachelor’s degree from Nankai University (2017). My experience includes research internships at Huawei and WeChat Group, as well as contributions to AI startups.

Cooking
Photography
Snowboarding
Guitar
Academic
Coding

Experiences

Postdoctoral Researcher
Peking University.

Oct 2024 - Present, Beijing, China

Responsibilities:
  • Long-context LLM
  • Efficient KV cache techniques

Research Intern -> Postdoctoral Researcher
Beijing Academy of Artificial Intelligence.

Nov 2023 - Present, Beijing, China

Beijing Academy of Artificial Intelligence (BAAI) is a non-profit research institute dedicated to promoting collaboration between academia and industry, fostering top talent, and pursuing long-term research on the fundamentals of AI technology.

Responsibilities:
  • Long-context LLM
  • Retrieval-Augmented Generation

Research Intern
WeChat Group, Tencent.

Jun 2023 - Oct 2023, Beijing, China

Responsibilities:
  • LLM for IR
  • LLM for QA

Research Intern
Poisson Lab, Huawei.

Apr 2022 - May 2023, Beijing, China

Responsibilities:
  • Pretraining for IR
  • Model-oriented IR

PhD Researcher
Gaoling School of Artificial Intelligence, Renmin University of China.

Sep 2020 - Jun 2024, Beijing, China

The Gaoling School of Artificial Intelligence (GSAI) at Renmin University of China (RUC) is a prestigious institution dedicated to shaping the future of AI. GSAI has consistently ranked first in Information Retrieval worldwide, according to CSRankings, from 2022 to 2024.

Responsibilities:
  • Personalized AI
  • Conversational AI
  • Information Retrieval

Intern NLP Engineer
Beijing Academy of Artificial Intelligence.

Jun 2020 - Mar 2021, Beijing, China

Responsibilities:
  • QA System for the Governance Domain
  • Dense Vector Search
  • Fine-Grained Named Entity Recognition

NLP Engineer
Elensdata.

Jan 2019 - Sep 2020, Beijing, China

Elensdata is a start-up that delivers high-calibre data science and AI solutions for businesses in media, finance, and other sectors.

Responsibilities:
  • Core NLP Toolkit for Chinese, English and Uyghur
  • NLP Applications in Multiple Domains (Finance, Security, Media, etc.)
  • Large-Scale Pretrained Language Model and Text Generation
Publication Statistics
2020-2024

23 Papers in Total
15 Conference Papers
10 First-Author Papers

Patents Statistics
2019-2024

20 Patents in Total
18 Granted Patents
6 First-Inventor Patents

Academic Service
2021-2024

Reviewer:
NeurIPS, ICLR, ACL, EMNLP,
EACL, ACL ARR, SIGKDD, TheWebConf, TOIS

Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization
Co-Author 2024

This paper explores two fine-tuning phenomena in Large Language Models (LLMs): the superior performance of optimizing only the Q and K matrices over full parameter optimization, and the benefit of distinct learning rates for faster convergence. Through theoretical and empirical analysis, the authors propose a new, efficient fine-tuning strategy that enhances generalization, memory efficiency, and optimization speed, validated on benchmark datasets.

A Survey of Conversational Search
Co-Author 2024

This survey examines recent advancements in conversational search, a next-generation paradigm that uses natural language dialogue and LLMs to enable intuitive, multi-turn information retrieval, highlighting critical modules, challenges, and future directions for enhancing user experience and system intelligence in search engines.

Trustworthiness in Retrieval-Augmented Generation Systems: A Survey
Co-Author 2024

This survey introduces a unified framework to evaluate the trustworthiness of Retrieval-Augmented Generation (RAG) systems across six key dimensions—factuality, robustness, fairness, transparency, accountability, and privacy—offering a structured benchmark and comprehensive evaluations to guide future research and enhance RAG reliability in real-world applications.

RAG-Studio: Towards In-Domain Adaptation Of Retrieval Augmented Generation Through Self Alignment
Co-Author 2024

RAG-Studio is an efficient self-aligned training framework that adapts general RAG models to specialized domains solely through synthetic data, producing a domain-specific RAG system that outperforms the use of human-annotated data for fine-tuning.

MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery
First Author 2024

MemoRAG is a novel retrieval-augmented generation framework that incorporates long-term memory to handle tasks with ambiguous information needs, which standard RAG systems struggle with. By using a dual-system architecture to form global memory and generate draft answers for guiding retrieval, MemoRAG outperforms conventional RAG in both complex and straightforward tasks.

Are Long-LLMs A Necessity For Long-Context Tasks?
First Author 2024

This work challenges the necessity of long-LLMs for long-context tasks by introducing LC-Boost, a framework that enables short-LLMs to effectively handle long-context tasks by adaptively accessing and utilizing relevant context, achieving improved performance with fewer resources.

Extending Llama-3's Context Ten-Fold Overnight
Co-Author 2024

We extended the context length of Llama-3-8B-Instruct from 8K to 80K tokens using QLoRA fine-tuning, achieving superior performance across various long-context tasks while preserving short-context capabilities. The entire process completed in 8 hours on an 8xA800 GPU machine, driven by only 3.5K synthetic samples from GPT-4, highlighting the underexplored potential of LLMs for context extension.

Grounding Language Model with Chunking-Free In-Context Retrieval
First Author 2023

This paper introduces a Chunking-Free In-Context (CFIC) retrieval method for Retrieval-Augmented Generation (RAG) systems, improving evidence retrieval accuracy and efficiency by eliminating the need for document chunking and utilizing advanced decoding strategies.

Optimizing Factual Accuracy in Text Generation through Dynamic Knowledge Selection
First Author 2023

DKGen is a novel framework that improves factual accuracy in text generation by iteratively generating short text segments, dynamically selecting relevant references to avoid knowledge mix-up, and leveraging cross-attention distribution for better use of external knowledge, outperforming baseline models in experiments.

Learning on Structured Documents for Conditional Question Answering
Co-Author 2023

This research explores techniques for improving conditional question answering by learning from structured documents.

Search-oriented Conversational Query Editing
Co-Author 2023

EdiRCS is a highly efficient conversational query rewriting model that enhances search performance by selecting most rewrite tokens directly from the dialogue and generating only a minimal number of new tokens, supplemented by search-oriented objectives. It outperforms state-of-the-art models on benchmarks with low latency and robustness to varied dialogues.

Large Language Models Know Your Contextual Search Intent: A Prompting Framework for Conversational Search
Co-Author 2023

LLM4CS is a prompting framework that leverages large language models to enhance conversational search by generating multiple query rewrites and hypothetical responses, integrating them into a robust representation of users’ contextual search intent, and significantly outperforming existing methods on key benchmarks.

WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus
First Author 2022

This paper introduces WebBrain, a new NLP task focused on generating short, factually-correct articles with references by mining supporting evidence from the web, and presents ReGen, a framework that enhances factual accuracy through improved evidence retrieval and task-specific pre-training, outperforming state-of-the-art methods on newly constructed large-scale datasets.

Learning Denoised and Interpretable Session Representation for Conversational Search
Co-Author 2022

LeCoRE is a sparse lexical-based conversational retriever that enhances conversational search by generating denoised and interpretable session representations through knowledge distillation and external query rewrites, significantly improving performance on public datasets compared to existing methods.

Topic-Enhanced Personalized Retrieval-Based Chatbot
First Author 2022

TopReC is a topic-enhanced personalized retrieval-based chatbot that deconstructs long and noisy dialogue histories into topic-dependent segments, filtering out irrelevant data to learn a more accurate and consistent user personality, significantly outperforming previous state-of-the-art methods on large datasets.

Explicit Query Rewriting for Conversational Dense Retrieval
First Author 2022

CRDR is a unified framework for conversational search that combines query rewriting and context modeling, enhancing accuracy and efficiency by making minimal modifications to the original query and improving contextualized query embeddings through explicit term highlighting, outperforming baseline models in experiments on TREC CAsT-19 and CAsT-20 datasets.

ConvTrans: Transforming Web Search Sessions for Conversational Dense Retrieval
Co-Author 2022

ConvTrans is a data augmentation method that automatically transforms web search sessions into conversational search sessions, addressing the data scarcity problem in conversational dense retrieval by eliminating gaps in session quality and query form, enabling models trained on ConvTrans-generated data to achieve comparable performance to those trained on expensive, manually-created datasets.

Less is More: Learning to Refine Dialogue History for Personalized Dialogue Generation
Co-Author 2021

The MSP model refines and utilizes extensive dialogue history to enhance personalized response generation by extracting key information and leveraging data from similar users, outperforming existing methods in generating more informative and personalized responses.

Webformer: Pre-training with Web Pages for Information Retrieval
Co-Author 2021

This paper introduces a pre-training approach that leverages the hierarchical structure of HTML web pages and their DOM trees to enhance language models for information retrieval, demonstrating significant improvements in ranking performance over traditional pre-trained models by incorporating structured web data.

Curriculum Contrastive Context Denoising for Few-shot Conversational Dense Retrieval
Co-Author 2021

COTED is a novel framework for few-shot conversational dense retrieval that enhances context denoising through curriculum contrastive learning, progressively training the model to filter out noisy conversational turns, and demonstrating superior performance on CAsT-19 and CAsT-20 datasets compared to state-of-the-art baselines.

Learning Implicit User Profile for Personalized Retrieval-Based Chatbot
First Author 2021

IMPChat is a retrieval-based personalized chatbot model that learns an implicit user profile by separately modeling the user’s personalized language style and preferences, dynamically weighting context-relevant history, and fusing these signals for response ranking, outperforming baseline models in experiments on large datasets.

Pchatbot: A Large-Scale Dataset for Personalized Chatbot
First Author 2020

Pchatbot is a large-scale Chinese dialogue dataset, significantly larger than existing datasets, that has been meticulously normalized and includes anonymized user IDs and timestamps, enabling the development of personalized dialogue models that learn implicit user personality from dialogue history, with preliminary benchmarks provided for future comparisons.

Speaker or Listener? The Role of a Dialog Agent
Co-First Author 2020

The Initiative-Imitate model addresses the challenge of overly proactive dialogue agents by balancing the chatbot’s role between speaker and listener, enhancing conversational fluency and engagement through adaptive initiative, and showing competitive results in both automatic and manual evaluations.

A Semantic Parsing Method Based on Rules and Learning
First Inventor CN112347793A

This patent proposes a semantic parsing method that combines rules and learning-based approaches.

MemoRAG
Owner Aug 2024 - Present

MemoRAG is a next-generation retrieval-augmented generation system with long-term memory, enabling superior context-aware information retrieval and enhanced performance on complex tasks where traditional RAG systems struggle.