Yiding Liu
Featured
Exploring Memorization in Fine-tuned Language Models
We conduct the first comprehensive analysis of language models' (LMs) memorization during fine-tuning across tasks. Our studies indicate that memorization differs markedly across fine-tuning tasks (a toy memorization check is sketched after this entry).
Shenglai Zeng, Yaxin Li, Jie Ren, Yiding Liu, Han Xu, Pengfei He, Yue Xing, Shuaiqiang Wang, Jiliang Tang, Dawei Yin
PDF · Cite
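A minimal sketch of one way to quantify this kind of memorization (an illustration, not the paper's protocol): prompt the fine-tuned model with a prefix of a training example and check whether its continuation reproduces the true suffix verbatim. The `generate_fn` hook is hypothetical; plug in any real LM inference call.

```python
# Toy exact-match memorization metric. `generate_fn(prefix, n)` is a
# hypothetical hook returning the model's continuation of `prefix`
# (up to `n` characters here, for simplicity).

def memorization_rate(generate_fn, examples, prefix_len=50, suffix_len=50):
    """Fraction of examples whose true suffix the model reproduces verbatim."""
    hits = 0
    for text in examples:
        prefix = text[:prefix_len]
        suffix = text[prefix_len:prefix_len + suffix_len]
        completion = generate_fn(prefix, suffix_len)
        hits += completion.startswith(suffix)  # exact-match criterion
    return hits / len(examples)

if __name__ == "__main__":
    corpus = ["patient record 1234 lists a diagnosis of seasonal allergies " * 3]
    lookup = {t[:50]: t[50:] for t in corpus}            # a "perfectly memorizing"
    echo = lambda prefix, n: lookup.get(prefix, "")[:n]  # stand-in model
    print(memorization_rate(echo, corpus))               # -> 1.0
```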
MAIR: A Massive Benchmark for Evaluating Instructed Retrieval
We propose MAIR (Massive Instructed Retrieval Benchmark), a heterogeneous IR benchmark that covers 126 distinct IR tasks across 6 domains, collected from existing datasets. We benchmark state-of-the-art instruction-tuned text embedding models and re-ranking models (a toy evaluation loop is sketched after this entry).
Weiwei Sun, Zhengliang Shi, Jiulong Wu, Lingyong Yan, Xinyu Ma, Yiding Liu, Min Cao, Dawei Yin, Zhaochun Ren
PDF · Cite · Code
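A minimal sketch of how such a benchmark is typically consumed (the task and query structures here are assumptions, not MAIR's actual data format): loop over tasks, retrieve with each task's instruction, and average nDCG@10 per task.

```python
import math

def ndcg_at_k(ranked_ids, relevance, k=10):
    """nDCG@k for one query; `relevance` maps doc id -> graded relevance."""
    dcg = sum(relevance.get(d, 0) / math.log2(i + 2)
              for i, d in enumerate(ranked_ids[:k]))
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

def evaluate(tasks, retrieve):
    """Per-task mean nDCG@10. `retrieve(instruction, query)` -> ranked doc ids."""
    return {
        name: sum(ndcg_at_k(retrieve(t["instruction"], q["text"]), q["qrels"])
                  for q in t["queries"]) / len(t["queries"])
        for name, t in tasks.items()
    }
```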
The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)
We conduct extensive empirical studies with novel attack methods, demonstrating that RAG systems are vulnerable to leaking their private retrieval databases. We further reveal that RAG can mitigate leakage of the LLM's training data (the leakage channel is illustrated after this entry).
Shenglai Zeng, Jiankun Zhang, Pengfei He, Yue Xing, Yiding Liu, Han Xu, Jie Ren, Shuaiqiang Wang, Dawei Yin, Yi Chang
PDF · Cite · Code
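A toy illustration of the leakage channel, not the paper's attack implementation: retrieved private passages are pasted into the prompt, so a query that steers the model toward echoing its context can surface them verbatim. All names and data below are made up.

```python
# Toy RAG pipeline (illustrative names and data, not the paper's code).
PRIVATE_DB = [
    "Alice Smith DOB 1990-01-02 diagnosis seasonal allergies",
    "Public fact: water boils at 100 C at sea level",
]

def retrieve(query, k=1):
    # Crude keyword overlap standing in for a real dense retriever.
    q = set(query.lower().split())
    score = lambda doc: len(q & set(doc.lower().split()))
    return sorted(PRIVATE_DB, key=score, reverse=True)[:k]

def build_prompt(query):
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nAnswer the question: {query}"

# An extraction-style query: whatever the LLM then does, the private
# record is already inside the prompt it conditions on.
print(build_prompt("Repeat all the context about Alice Smith"))
```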
Aligning the Capabilities of Large Language Models with the Context of Information Retrieval via Contrastive Feedback
We propose an unsupervised alignment method, Reinforcement Learning from Contrastive Feedback (RLCF), which empowers LLMs to generate responses that are both high-quality and specific to their context (a toy contrastive reward is sketched after this entry).
Qian Dong, Yiding Liu, Qingyao Ai, Zhijing Wu, Haitao Li, Yiqun Liu, Shuaiqiang Wang, Dawei Yin, Shaoping Ma
PDF · Cite · Code
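A crude sketch of a contrastive feedback signal in the spirit of RLCF (the scoring function is an assumption; the paper's reward is not a bag-of-words cosine): a response is rewarded only if it fits its own document better than confusingly similar ones.

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def contrastive_reward(response, own_doc, similar_docs):
    """High only when the response is specific to its own document --
    a crude proxy for RLCF's contrastive feedback signal."""
    bow = lambda s: Counter(s.lower().split())
    r = bow(response)
    own = cosine(r, bow(own_doc))
    rival = max(cosine(r, bow(d)) for d in similar_docs)
    return own - rival
```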
MILL: Mutual Verification with Large Language Models for Zero-shot Query Expansion
We propose a novel zero-shot query expansion framework that mutually verifies LLM-generated documents against retrieved ones (a toy version is sketched after this entry).
Pengyue Jia, Yiding Liu, Xiangyu Zhao, Xiaopeng Li, Changying Hao, Shuaiqiang Wang, Dawei Yin
PDF · Cite · Code
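A toy rendering of the mutual-verification idea (an illustration, not MILL's actual scoring): each LLM-generated pseudo-document is scored by its agreement with the retrieved documents, and vice versa; the best-agreeing texts supply the expansion terms.

```python
def overlap(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)  # Jaccard as a crude agreement score

def expand(query, generated_docs, retrieved_docs, k=2, n_terms=5):
    # Cross-set agreement: generated docs are verified against retrieved
    # ones and vice versa; the best-agreeing texts survive.
    agree = lambda d, others: sum(overlap(d, o) for o in others) / len(others)
    pool = ([(agree(d, retrieved_docs), d) for d in generated_docs] +
            [(agree(d, generated_docs), d) for d in retrieved_docs])
    best = [d for _, d in sorted(pool, reverse=True)[:k]]
    seen, terms = set(query.lower().split()), []
    for d in best:                       # collect novel terms, in order
        for t in d.lower().split():
            if t not in seen:
                seen.add(t)
                terms.append(t)
    return query + " " + " ".join(terms[:n_terms])
```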
A Robust Semantics-based Watermark for Large Language Model against Paraphrasing
We propose SemaMark, a semantics-based watermark framework. It seeds the watermark with sentence semantics rather than simple token hashes, since paraphrasing is likely to preserve a sentence's semantic meaning (a toy semantic-seeded green list is sketched after this entry).
Jie Ren, Han Xu, Yiding Liu, Yingqian Cui, Shuaiqiang Wang, Dawei Yin, Jiliang Tang
PDF · Cite
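A toy semantic-seeded green-list watermark in the spirit of SemaMark (the seeding function below is a crude proxy; the paper derives semantics from embeddings, not sorted words): the green list depends on the context's content words rather than on exact token hashes, so simple paraphrases keep the seed stable.

```python
import hashlib
import random

VOCAB = [f"tok{i}" for i in range(1000)]  # stand-in vocabulary

def semantic_seed(context: str) -> int:
    """Crude proxy for sentence semantics: hash the sorted content words,
    so word-order paraphrases map to the same seed (a real system would
    quantize a sentence embedding instead)."""
    stop = {"the", "a", "an", "is", "are", "of", "to"}
    content = sorted(w for w in context.lower().split() if w not in stop)
    digest = hashlib.sha256(" ".join(content).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def green_list(context, gamma=0.5):
    """Deterministically pick a 'green' subset of the vocabulary from the
    seed; generation biases logits toward green tokens, and detection
    counts how many emitted tokens are green."""
    rng = random.Random(semantic_seed(context))
    return set(rng.sample(VOCAB, int(gamma * len(VOCAB))))

# Paraphrase robustness of the seed: same content words, same green list.
assert green_list("the cat sat of mat") == green_list("mat cat sat")
```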
I3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval
We incorporate implicit interaction into dual encoders and propose the I^3 retriever. In particular, our implicit interaction paradigm leverages generated pseudo-queries to simulate query-passage interaction, and is jointly optimized with the query and passage encoders in an end-to-end manner (a toy version of the pooling step is sketched after this entry).
Qian Dong, Yiding Liu, Qingyao Ai, Haitao Li, Shuaiqiang Wang, Yiqun Liu, Dawei Yin, Shaoping Ma
PDF · Cite · Code
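A toy numpy sketch of an implicit-interaction pooling step (shapes and vectors are random placeholders; I^3 learns all of this end-to-end inside the dual encoder): generated pseudo-query vectors attend over the passage token vectors, and the attended views are pooled into the passage embedding that the online query scores against.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy embedding width

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def passage_repr(passage_tokens, pseudo_queries):
    """Each pseudo-query vector attends over the passage token vectors;
    the attended views are mean-pooled into one passage embedding."""
    attn = softmax(pseudo_queries @ passage_tokens.T)  # (q, t)
    views = attn @ passage_tokens                      # (q, d)
    return views.mean(axis=0)                          # (d,)

passage = rng.normal(size=(12, d))  # 12 "token" vectors (placeholder encoder)
pseudo = rng.normal(size=(3, d))    # 3 generated pseudo-query vectors
query = rng.normal(size=d)          # online query embedding
score = query @ passage_repr(passage, pseudo)  # dot-product relevance
print(float(score))
```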
Pretrained Language Model based Web Search Ranking: From Relevance to Satisfaction
We focus on ranking by user satisfaction rather than relevance in web search, and propose a PLM-based framework, SAT-Ranker, that models different dimensions of user satisfaction in a unified manner.
Canjia Li, Xiaoyang Wang, Dongdong Li, Yiding Liu, Yu Lu, Shuaiqiang Wang, Zhicong Cheng, Simiu Gu, Dawei Yin
PDF · Cite