Li Dong 0004
Person information
- affiliation: Microsoft Research Asia, Natural Language Computing Group, Beijing, China
- affiliation (PhD 2019): University of Edinburgh, School of Informatics, Edinburgh, UK
- affiliation (former): Beihang University, State Key Laboratory of Software Development Environment, Beijing, China
Other persons with the same name
- Li Dong — disambiguation page
- Li Dong 0001 — Tsinghua University, Department of Computer Science and Technology / System Division of Library, Beijing, China
- Li Dong 0002 — Bohai University, College of Mathematics and Physics, Jinzhou, China (and 1 more)
- Li Dong 0003 — University of Electronic Science and Technology of China, School of Life Science and Technology, Chengdu, China
- Li Dong 0005 — Dalian Nationalities University, College of Science, China
- Li Dong 0006 — Ningbo University, Faculty of Electrical Engineering and Computer Science, China (and 1 more)
- Li Dong 0007 — Xi'an Jiaotong University, School of Microelectronics, China
- Li Dong 0008 — Northeastern University, Engineering Optimization and Smart Antenna Institute, Qinhuangdao, China
- Li Dong 0009 — Hunan University of Technology and Business, Key Laboratory of Hunan Province for New Retail Virtual Reality Technology, Changsha, China (and 1 more)
- Li Dong 0010 — Microsoft Research
- Li Dong 0011 — Shenzhen Technology University, College of Big Data and Internet, China (and 1 more)
2020 – today
- 2025
- [i92]Chengzu Li, Wenshan Wu, Huanyu Zhang, Yan Xia, Shaoguang Mao, Li Dong, Ivan Vulic, Furu Wei:
Imagine while Reasoning in Space: Multimodal Visualization-of-Thought. CoRR abs/2501.07542 (2025) - 2024
- [j11]Hangbo Bao, Li Dong, Wenhui Wang, Nan Yang, Songhao Piao, Furu Wei:
Fine-tuning pretrained transformer encoders for sequence-to-sequence learning. Int. J. Mach. Learn. Cybern. 15(5): 1711-1728 (2024) - [j10]Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Furu Wei:
DeepNet: Scaling Transformers to 1,000 Layers. IEEE Trans. Pattern Anal. Mach. Intell. 46(10): 6761-6774 (2024) - [j9]Wei Huang, Zhiliang Peng, Li Dong, Furu Wei, Qixiang Ye, Jianbin Jiao:
Generic-to-Specific Distillation of Masked Autoencoders. IEEE Trans. Circuits Syst. Video Technol. 34(9): 8779-8793 (2024) - [c76]Zonglin Yang, Li Dong, Xinya Du, Hao Cheng, Erik Cambria, Xiaodong Liu, Jianfeng Gao, Furu Wei:
Language Models as Inductive Reasoners. EACL (1) 2024: 209-225 - [c75]Yuxian Gu, Li Dong, Furu Wei, Minlie Huang:
MiniLLM: Knowledge Distillation of Large Language Models. ICLR 2024 - [c74]Xichen Pan, Li Dong, Shaohan Huang, Zhiliang Peng, Wenhu Chen, Furu Wei:
Kosmos-G: Generating Images in Context with Multimodal Large Language Models. ICLR 2024 - [c73]Zhiliang Peng, Wenhui Wang, Li Dong, Yaru Hao, Shaohan Huang, Shuming Ma, Qixiang Ye, Furu Wei:
Grounding Multimodal Large Language Models to the World. ICLR 2024 - [c72]Xun Wu, Shaohan Huang, Wenhui Wang, Shuming Ma, Li Dong, Furu Wei:
Multi-Head Mixture-of-Experts. NeurIPS 2024 - [c71]Wenshan Wu, Shaoguang Mao, Yadong Zhang, Yan Xia, Li Dong, Lei Cui, Furu Wei:
Mind's Eye of LLMs: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models. NeurIPS 2024 - [i91]Yuxian Gu, Li Dong, Yaru Hao, Qingxiu Dong, Minlie Huang, Furu Wei:
Towards Optimal Learning of Language Models. CoRR abs/2402.17759 (2024) - [i90]Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, Furu Wei:
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits. CoRR abs/2402.17764 (2024) - [i89]Wenshan Wu, Shaoguang Mao, Yadong Zhang, Yan Xia, Li Dong, Lei Cui, Furu Wei:
Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models. CoRR abs/2404.03622 (2024) - [i88]Yutao Sun, Li Dong, Yi Zhu, Shaohan Huang, Wenhui Wang, Shuming Ma, Quanlu Zhang, Jianyong Wang, Furu Wei:
You Only Cache Once: Decoder-Decoder Architectures for Language Models. CoRR abs/2405.05254 (2024) - [i87]Tianzhu Ye, Li Dong, Yuqing Xia, Yutao Sun, Yi Zhu, Gao Huang, Furu Wei:
Differential Transformer. CoRR abs/2410.05258 (2024) - [i86]Qingxiu Dong, Li Dong, Xingxing Zhang, Zhifang Sui, Furu Wei:
Self-Boosting Large Language Models with Synthetic Preference Data. CoRR abs/2410.06961 (2024) - [i85]Yuxian Gu, Li Dong, Hongning Wang, Yaru Hao, Qingxiu Dong, Furu Wei, Minlie Huang:
Data Selection via Optimal Control for Language Models. CoRR abs/2410.07064 (2024) - [i84]Yaoyao Chang, Lei Cui, Li Dong, Shaohan Huang, Yangyu Huang, Yupan Huang, Scarlett Li, Tengchao Lv, Shuming Ma, Qinzheng Sun, Wenhui Wang, Furu Wei, Ying Xin, Mao Yang, Qiufeng Yin, Xingxing Zhang:
RedStone: Curating General, Code, Math, and QA Data for Large Language Models. CoRR abs/2412.03398 (2024) - 2023
- [j8]Zhiliang Peng, Li Dong, Hangbo Bao, Furu Wei, Qixiang Ye:
A Unified View of Masked Image Modeling. Trans. Mach. Learn. Res. 2023 (2023) - [c70]Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Shuming Ma, Zhifang Sui, Furu Wei:
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers. ACL (Findings) 2023: 4005-4019 - [c69]Yuxian Gu, Li Dong, Furu Wei, Minlie Huang:
Pre-Training to Learn in Context. ACL (1) 2023: 4849-4870 - [c68]Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Furu Wei, Zhoujun Li:
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator. ACL (1) 2023: 9394-9412 - [c67]Yutao Sun, Li Dong, Barun Patra, Shuming Ma, Shaohan Huang, Alon Benhaim, Vishrav Chaudhary, Xia Song, Furu Wei:
A Length-Extrapolatable Transformer. ACL (1) 2023: 14590-14604 - [c66]Barun Patra, Saksham Singhal, Shaohan Huang, Zewen Chi, Li Dong, Furu Wei, Vishrav Chaudhary, Xia Song:
Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning. ACL (1) 2023: 15354-15373 - [c65]Jinghao Zhou, Li Dong, Zhe Gan, Lijuan Wang, Furu Wei:
Non-Contrastive Learning Meets Language-Image Pre-Training. CVPR 2023: 11028-11038 - [c64]Wei Huang, Zhiliang Peng, Li Dong, Furu Wei, Jianbin Jiao, Qixiang Ye:
Generic-to-Specific Distillation of Masked Autoencoders. CVPR 2023: 15996-16005 - [c63]Wenhui Wang, Hangbo Bao, Li Dong, Johan Bjorck, Zhiliang Peng, Qiang Liu, Kriti Aggarwal, Owais Khan Mohammed, Saksham Singhal, Subhojit Som, Furu Wei:
Image as a Foreign Language: BEIT Pretraining for Vision and Vision-Language Tasks. CVPR 2023: 19175-19186 - [c62]Yuxin Fang, Li Dong, Hangbo Bao, Xinggang Wang, Furu Wei:
Corrupted Image Modeling for Self-Supervised Visual Pre-Training. ICLR 2023 - [c61]Zhixiong Han, Yaru Hao, Li Dong, Yutao Sun, Furu Wei:
Prototypical Calibration for Few-shot Learning of Language Models. ICLR 2023 - [c60]Weizhi Wang, Li Dong, Hao Cheng, Haoyu Song, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei:
Visually-Augmented Language Modeling. ICLR 2023 - [c59]Hongyu Wang, Shuming Ma, Shaohan Huang, Li Dong, Wenhui Wang, Zhiliang Peng, Yu Wu, Payal Bajaj, Saksham Singhal, Alon Benhaim, Barun Patra, Zhun Liu, Vishrav Chaudhary, Xia Song, Furu Wei:
Magneto: A Foundation Transformer. ICML 2023: 36077-36092 - [c58]Tao Ge, Jing Hu, Li Dong, Shaoguang Mao, Yan Xia, Xun Wang, Si-Qing Chen, Furu Wei:
Extensible Prompts for Language Models on Zero-shot Language Style Customization. NeurIPS 2023 - [c57]Yaru Hao, Zewen Chi, Li Dong, Furu Wei:
Optimizing Prompts for Text-to-Image Generation. NeurIPS 2023 - [c56]Shaohan Huang, Li Dong, Wenhui Wang, Yaru Hao, Saksham Singhal, Shuming Ma, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi, Nils Johan Bertil Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei:
Language Is Not All You Need: Aligning Perception with Language Models. NeurIPS 2023 - [c55]Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei:
Augmenting Language Models with Long-Term Memory. NeurIPS 2023 - [i83]Shaohan Huang, Li Dong, Wenhui Wang, Yaru Hao, Saksham Singhal, Shuming Ma, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi, Johan Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei:
Language Is Not All You Need: Aligning Perception with Language Models. CoRR abs/2302.14045 (2023) - [i82]Wei Huang, Zhiliang Peng, Li Dong, Furu Wei, Jianbin Jiao, Qixiang Ye:
Generic-to-Specific Distillation of Masked Autoencoders. CoRR abs/2302.14771 (2023) - [i81]Yuxian Gu, Li Dong, Furu Wei, Minlie Huang:
Pre-Training to Learn in Context. CoRR abs/2305.09137 (2023) - [i80]Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei:
Augmenting Language Models with Long-Term Memory. CoRR abs/2306.07174 (2023) - [i79]Yuxian Gu, Li Dong, Furu Wei, Minlie Huang:
Knowledge Distillation of Large Language Models. CoRR abs/2306.08543 (2023) - [i78]Zhiliang Peng, Wenhui Wang, Li Dong, Yaru Hao, Shaohan Huang, Shuming Ma, Furu Wei:
Kosmos-2: Grounding Multimodal Large Language Models to the World. CoRR abs/2306.14824 (2023) - [i77]Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei:
LongNet: Scaling Transformers to 1,000,000,000 Tokens. CoRR abs/2307.02486 (2023) - [i76]Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei:
Retentive Network: A Successor to Transformer for Large Language Models. CoRR abs/2307.08621 (2023) - [i75]Qingxiu Dong, Li Dong, Ke Xu, Guangyan Zhou, Yaru Hao, Zhifang Sui, Furu Wei:
Large Language Model for Science: A Study on P vs. NP. CoRR abs/2309.05689 (2023) - [i74]Tengchao Lv, Yupan Huang, Jingye Chen, Lei Cui, Shuming Ma, Yaoyao Chang, Shaohan Huang, Wenhui Wang, Li Dong, Weiyao Luo, Shaoxiang Wu, Guoxin Wang, Cha Zhang, Furu Wei:
Kosmos-2.5: A Multimodal Literate Model. CoRR abs/2309.11419 (2023) - [i73]Xichen Pan, Li Dong, Shaohan Huang, Zhiliang Peng, Wenhu Chen, Furu Wei:
Kosmos-G: Generating Images in Context with Multimodal Large Language Models. CoRR abs/2310.02992 (2023) - [i72]Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Huaijie Wang, Lingxiao Ma, Fan Yang, Ruiping Wang, Yi Wu, Furu Wei:
BitNet: Scaling 1-bit Transformers for Large Language Models. CoRR abs/2310.11453 (2023) - 2022
- [j7]Haichao Zhu, Li Dong, Furu Wei, Bing Qin, Ting Liu:
Transforming Wikipedia Into Augmented Data for Query-Focused Summarization. IEEE ACM Trans. Audio Speech Lang. Process. 30: 2357-2367 (2022) - [c54]Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen:
Controllable Natural Language Generation with Contrastive Prefixes. ACL (Findings) 2022: 2912-2924 - [c53]Tianyu Chen, Hangbo Bao, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, Jianxin Li, Furu Wei:
THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption. ACL (Findings) 2022: 3510-3520 - [c52]Haoyu Song, Li Dong, Weinan Zhang, Ting Liu, Furu Wei:
CLIP Models are Few-Shot Learners: Empirical Studies on VQA and Visual Entailment. ACL (1) 2022: 6088-6100 - [c51]Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei:
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA. ACL (1) 2022: 6170-6182 - [c50]Damai Dai, Li Dong, Shuming Ma, Bo Zheng, Zhifang Sui, Baobao Chang, Furu Wei:
StableMoE: Stable Routing Strategy for Mixture of Experts. ACL (1) 2022: 7085-7095 - [c49]Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, Furu Wei:
Knowledge Neurons in Pretrained Transformers. ACL (1) 2022: 8493-8502 - [c48]Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo:
Swin Transformer V2: Scaling Up Capacity and Resolution. CVPR 2022: 11999-12009 - [c47]Jian Yang, Shaohan Huang, Shuming Ma, Yuwei Yin
, Li Dong, Dongdong Zhang, Hongcheng Guo, Zhoujun Li, Furu Wei:
CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation. EMNLP (Findings) 2022: 486-496 - [c46]Hangbo Bao, Li Dong, Songhao Piao, Furu Wei:
BEiT: BERT Pre-Training of Image Transformers. ICLR 2022 - [c45]Hangbo Bao, Wenhui Wang, Li Dong, Qiang Liu, Owais Khan Mohammed, Kriti Aggarwal, Subhojit Som, Songhao Piao, Furu Wei:
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts. NeurIPS 2022 - [c44]Zewen Chi, Li Dong, Shaohan Huang, Damai Dai, Shuming Ma, Barun Patra, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei:
On the Representation Collapse of Sparse Mixture of Experts. NeurIPS 2022 - [c43]Yunzhi Yao, Shaohan Huang, Li Dong, Furu Wei, Huajun Chen, Ningyu Zhang:
Kformer: Knowledge Injection in Transformer Feed-Forward Layers. NLPCC (1) 2022: 131-143 - [i71]Yunzhi Yao, Shaohan Huang, Ningyu Zhang, Li Dong, Furu Wei, Huajun Chen:
Kformer: Knowledge Injection in Transformer Feed-Forward Layers. CoRR abs/2201.05742 (2022) - [i70]Yuxin Fang, Li Dong, Hangbo Bao, Xinggang Wang, Furu Wei:
Corrupted Image Modeling for Self-Supervised Visual Pre-Training. CoRR abs/2202.03382 (2022) - [i69]Da Yin, Li Dong, Hao Cheng, Xiaodong Liu, Kai-Wei Chang, Furu Wei, Jianfeng Gao:
A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models. CoRR abs/2202.08772 (2022) - [i68]Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen:
Controllable Natural Language Generation with Contrastive Prefixes. CoRR abs/2202.13257 (2022) - [i67]Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Furu Wei:
DeepNet: Scaling Transformers to 1,000 Layers. CoRR abs/2203.00555 (2022) - [i66]Haoyu Song, Li Dong, Wei-Nan Zhang, Ting Liu, Furu Wei:
CLIP Models are Few-shot Learners: Empirical Studies on VQA and Visual Entailment. CoRR abs/2203.07190 (2022) - [i65]Damai Dai, Li Dong, Shuming Ma, Bo Zheng, Zhifang Sui, Baobao Chang, Furu Wei:
StableMoE: Stable Routing Strategy for Mixture of Experts. CoRR abs/2204.08396 (2022) - [i64]Zewen Chi, Li Dong, Shaohan Huang, Damai Dai, Shuming Ma, Barun Patra, Saksham Singhal, Payal Bajaj, Xia Song, Furu Wei:
On the Representation Collapse of Sparse Mixture of Experts. CoRR abs/2204.09179 (2022) - [i63]Weizhi Wang, Li Dong, Hao Cheng, Haoyu Song, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei:
Visually-Augmented Language Modeling. CoRR abs/2205.10178 (2022) - [i62]Zhixiong Han, Yaru Hao, Li Dong, Furu Wei:
Prototypical Calibration for Few-shot Learning of Language Models. CoRR abs/2205.10183 (2022) - [i61]Tianyu Chen, Hangbo Bao, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, Jianxin Li, Furu Wei:
THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption. CoRR abs/2206.00216 (2022) - [i60]Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei:
VL-BEiT: Generative Vision-Language Pretraining. CoRR abs/2206.01127 (2022) - [i59]Yaru Hao, Haoyu Song, Li Dong, Shaohan Huang, Zewen Chi, Wenhui Wang, Shuming Ma, Furu Wei:
Language Models are General-Purpose Interfaces. CoRR abs/2206.06336 (2022) - [i58]Zhiliang Peng, Li Dong, Hangbo Bao, Qixiang Ye, Furu Wei:
BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers. CoRR abs/2208.06366 (2022) - [i57]Wenhui Wang, Hangbo Bao, Li Dong, Johan Bjorck, Zhiliang Peng, Qiang Liu, Kriti Aggarwal, Owais Khan Mohammed, Saksham Singhal, Subhojit Som, Furu Wei:
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks. CoRR abs/2208.10442 (2022) - [i56]Hongyu Wang, Shuming Ma, Shaohan Huang, Li Dong, Wenhui Wang, Zhiliang Peng, Yu Wu, Payal Bajaj, Saksham Singhal, Alon Benhaim, Barun Patra, Zhun Liu, Vishrav Chaudhary, Xia Song, Furu Wei:
Foundation Transformers. CoRR abs/2210.06423 (2022) - [i55]Jian Yang, Shaohan Huang, Shuming Ma, Yuwei Yin
, Li Dong, Dongdong Zhang, Hongcheng Guo, Zhoujun Li, Furu Wei:
CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation. CoRR abs/2210.07022 (2022) - [i54]Jinghao Zhou, Li Dong, Zhe Gan, Lijuan Wang, Furu Wei:
Non-Contrastive Learning Meets Language-Image Pre-Training. CoRR abs/2210.09304 (2022) - [i53]Zhiliang Peng, Li Dong, Hangbo Bao, Qixiang Ye, Furu Wei:
A Unified View of Masked Image Modeling. CoRR abs/2210.10615 (2022) - [i52]Barun Patra, Saksham Singhal, Shaohan Huang, Zewen Chi, Li Dong, Furu Wei, Vishrav Chaudhary, Xia Song:
Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning. CoRR abs/2210.14867 (2022) - [i51]Shuming Ma, Hongyu Wang, Shaohan Huang, Wenhui Wang, Zewen Chi, Li Dong, Alon Benhaim, Barun Patra, Vishrav Chaudhary, Xia Song, Furu Wei:
TorchScale: Transformers at Scale. CoRR abs/2211.13184 (2022) - [i50]Tao Ge, Jing Hu, Li Dong, Shaoguang Mao, Yan Xia, Xun Wang, Si-Qing Chen, Furu Wei:
Extensible Prompts for Language Models. CoRR abs/2212.00616 (2022) - [i49]Yaru Hao, Yutao Sun, Li Dong, Zhixiong Han, Yuxian Gu, Furu Wei:
Structured Prompting: Scaling In-Context Learning to 1,000 Examples. CoRR abs/2212.06713 (2022) - [i48]Yaru Hao, Zewen Chi, Li Dong, Furu Wei:
Optimizing Prompts for Text-to-Image Generation. CoRR abs/2212.09611 (2022) - [i47]Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Zhoujun Li, Furu Wei:
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator. CoRR abs/2212.10218 (2022) - [i46]Yutao Sun, Li Dong, Barun Patra, Shuming Ma, Shaohan Huang, Alon Benhaim, Vishrav Chaudhary, Xia Song, Furu Wei:
A Length-Extrapolatable Transformer. CoRR abs/2212.10554 (2022) - [i45]Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Zhifang Sui, Furu Wei:
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers. CoRR abs/2212.10559 (2022) - [i44]Zonglin Yang, Li Dong, Xinya Du, Hao Cheng, Erik Cambria, Xiaodong Liu, Jianfeng Gao, Furu Wei:
Language Models as Inductive Reasoners. CoRR abs/2212.10923 (2022) - 2021
- [j6]Li Dong:
Learning natural language interfaces with neural models. AI Matters 7(2): 14-17 (2021) - [c42]Yaru Hao, Li Dong, Furu Wei, Ke Xu:
Self-Attention Attribution: Interpreting Information Interactions Inside Transformer. AAAI 2021: 12963-12971 - [c41]Yunzhi Yao, Shaohan Huang, Wenhui Wang, Li Dong, Furu Wei:
Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains. ACL/IJCNLP (Findings) 2021: 460-470 - [c40]Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei:
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers. ACL/IJCNLP (Findings) 2021: 2140-2151 - [c39]Bo Zheng, Li Dong, Shaohan Huang, Wenhui Wang, Zewen Chi, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei:
Consistency Regularization for Cross-Lingual Fine-Tuning. ACL/IJCNLP (1) 2021: 3403-3417 - [c38]Zewen Chi, Li Dong, Bo Zheng, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei:
Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment. ACL/IJCNLP (1) 2021: 3418-3430 - [c37]Yuekai Zhao, Li Dong, Yelong Shen, Zhihua Zhang, Furu Wei, Weizhu Chen:
Memory-Efficient Differentiable Transformer Architecture Search. ACL/IJCNLP (Findings) 2021: 4254-4264 - [c36]Yaru Hao, Li Dong, Hangbo Bao, Ke Xu, Furu Wei:
Learning to Sample Replacements for ELECTRA Pre-Training. ACL/IJCNLP (Findings) 2021: 4495-4506 - [c35]Guanhua Chen, Shuming Ma, Yun Chen, Li Dong, Dongdong Zhang, Jia Pan, Wenping Wang, Furu Wei:
Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders. EMNLP (1) 2021: 15-26 - [c34]Zewen Chi, Li Dong, Shuming Ma, Shaohan Huang, Saksham Singhal, Xian-Ling Mao, Heyan Huang, Xia Song, Furu Wei:
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs. EMNLP (1) 2021: 1671-1683 - [c33]Bo Zheng, Li Dong, Shaohan Huang, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei:
Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training. EMNLP (1) 2021: 3203-3215 - [c32]Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, Heyan Huang, Ming Zhou:
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training. NAACL-HLT 2021: 3576-3588 - [c31]Jian Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Li Dong, Shaohan Huang, Alexandre Muzio, Saksham Singhal, Hany Hassan, Xia Song, Furu Wei:
Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task. WMT@EMNLP 2021: 446-455 - [i43]Zewen Chi, Li Dong, Shuming Ma, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei:
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs. CoRR abs/2104.08692 (2021) - [i42]Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Furu Wei:
Knowledge Neurons in Pretrained Transformers. CoRR abs/2104.08696 (2021) - [i41]Guanhua Chen, Shuming Ma, Yun Chen, Li Dong, Dongdong Zhang, Jia Pan, Wenping Wang, Furu Wei:
Zero-shot Cross-lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders. CoRR abs/2104.08757 (2021) - [i40]Yuekai Zhao, Li Dong, Yelong Shen, Zhihua Zhang, Furu Wei, Weizhu Chen:
Memory-Efficient Differentiable Transformer Architecture Search. CoRR abs/2105.14669 (2021) - [i39]Zewen Chi, Li Dong, Bo Zheng, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei:
Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment. CoRR abs/2106.06381 (2021) - [i38]Bo Zheng, Li Dong, Shaohan Huang, Wenhui Wang, Zewen Chi, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei:
Consistency Regularization for Cross-Lingual Fine-Tuning. CoRR abs/2106.08226 (2021) - [i37]Hangbo Bao, Li Dong, Furu Wei:
BEiT: BERT Pre-Training of Image Transformers. CoRR abs/2106.08254 (2021) - [i36]Yunzhi Yao, Shaohan Huang, Wenhui Wang, Li Dong, Furu Wei:
Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains. CoRR abs/2106.13474 (2021) - [i35]Yaru Hao, Li Dong, Hangbo Bao, Ke Xu, Furu Wei:
Learning to Sample Replacements for ELECTRA Pre-Training. CoRR abs/2106.13715 (2021) - [i34]Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei:
DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders. CoRR abs/2106.13736 (2021) - [i33]Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Saksham Singhal, Payal Bajaj, Xia Song, Furu Wei:
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA. CoRR abs/2106.16138 (2021) - [i32]Bo Zheng, Li Dong, Shaohan Huang, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei:
Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training. CoRR abs/2109.07306 (2021) - [i31]Hangbo Bao, Li Dong, Wenhui Wang, Nan Yang, Furu Wei:
s2s-ft: Fine-Tuning Pretrained Transformer Encoders for Sequence-to-Sequence Learning. CoRR abs/2110.13640 (2021) - [i30]Jian Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Li Dong, Shaohan Huang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei:
Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task. CoRR abs/2111.02086 (2021) - [i29]Wenhui Wang, Hangbo Bao, Li Dong, Furu Wei:
VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts. CoRR abs/2111.02358 (2021) - [i28]Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo:
Swin Transformer V2: Scaling Up Capacity and Resolution. CoRR abs/2111.09883 (2021) - 2020
- [c30]Zewen Chi, Li Dong, Furu Wei, Wenhui Wang, Xian-Ling Mao, Heyan Huang:
Cross-Lingual Natural Language Generation via Pre-Training. AAAI 2020: 7570-7577 - [c29]Zhongli Li, Wenhui Wang, Li Dong, Furu Wei, Ke Xu:
Harvesting and Refining Question-Answer Pairs for Unsupervised QA. ACL 2020: 6719-6728 - [c28]Xiujun Li, Xi Yin, Chunyuan Li, Pengchuan Zhang, Xiaowei Hu, Lei Zhang, Lijuan Wang, Houdong Hu, Li Dong, Furu Wei, Yejin Choi, Jianfeng Gao:
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks. ECCV (30) 2020: 121-137 - [c27]Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Jianfeng Gao, Songhao Piao, Ming Zhou, Hsiao-Wuen Hon:
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training. ICML 2020: 642-652 - [c26]Zewen Chi, Li Dong, Furu Wei, Xianling Mao, Heyan Huang:
Can Monolingual Pretrained Models Help Cross-Lingual Classification? AACL/IJCNLP 2020: 12-17 - [c25]Yaru Hao, Li Dong, Furu Wei, Ke Xu:
Investigating Learning Dynamics of BERT Fine-Tuning. AACL/IJCNLP 2020: 87-92 - [c24]Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou:
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers. NeurIPS 2020 - [i27]Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou:
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers. CoRR abs/2002.10957 (2020) - [i26]Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon:
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training. CoRR abs/2002.12804 (2020) - [i25]Xiujun Li, Xi Yin, Chunyuan Li, Pengchuan Zhang, Xiaowei Hu, Lei Zhang, Lijuan Wang, Houdong Hu, Li Dong, Furu Wei, Yejin Choi, Jianfeng Gao:
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks. CoRR abs/2004.06165 (2020) - [i24]Yaru Hao, Li Dong, Furu Wei, Ke Xu:
Self-Attention Attribution: Interpreting Information Interactions Inside Transformer. CoRR abs/2004.11207 (2020) - [i23]Zhongli Li, Wenhui Wang, Li Dong, Furu Wei, Ke Xu:
Harvesting and Refining Question-Answer Pairs for Unsupervised QA. CoRR abs/2005.02925 (2020) - [i22]Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, Heyan Huang, Ming Zhou:
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training. CoRR abs/2007.07834 (2020) - [i21]Shuming Ma, Jian Yang, Haoyang Huang, Zewen Chi, Li Dong, Dongdong Zhang, Hany Hassan Awadalla, Alexandre Muzio, Akiko Eriguchi, Saksham Singhal, Xia Song, Arul Menezes, Furu Wei:
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders. CoRR abs/2012.15547 (2020) - [i20]Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei:
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers. CoRR abs/2012.15828 (2020)
2010 – 2019
- 2019
- [j5]Xi Wang, Jiagao Lyu, Li Dong, Ke Xu:
Multitask learning for biomedical named entity recognition with cross-sharing structure. BMC Bioinform. 20(1): 427:1-427:13 (2019) - [c23]Ratish Puduppully, Li Dong, Mirella Lapata:
Data-to-Text Generation with Content Selection and Planning. AAAI 2019: 6908-6915 - [c22]Ratish Puduppully, Li Dong, Mirella Lapata:
Data-to-text Generation with Entity Modeling. ACL (1) 2019: 2023-2035 - [c21]Haichao Zhu, Li Dong, Furu Wei, Wenhui Wang, Bing Qin, Ting Liu:
Learning to Ask Unanswerable Questions for Machine Reading Comprehension. ACL (1) 2019: 4238-4248 - [c20]Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Lei Cui, Songhao Piao, Ming Zhou:
Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension. MRQA@EMNLP 2019: 14-18 - [c19]Yaru Hao, Li Dong, Furu Wei, Ke Xu:
Visualizing and Understanding the Effectiveness of BERT. EMNLP/IJCNLP (1) 2019: 4141-4150 - [c18]Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon:
Unified Language Model Pre-training for Natural Language Understanding and Generation. NeurIPS 2019: 13042-13054 - [i19]Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon:
Unified Language Model Pre-training for Natural Language Understanding and Generation. CoRR abs/1905.03197 (2019) - [i18]Ratish Puduppully, Li Dong, Mirella Lapata:
Data-to-text Generation with Entity Modeling. CoRR abs/1906.03221 (2019) - [i17]Haichao Zhu, Li Dong, Furu Wei, Wenhui Wang, Bing Qin, Ting Liu:
Learning to Ask Unanswerable Questions for Machine Reading Comprehension. CoRR abs/1906.06045 (2019) - [i16]Yaru Hao, Li Dong, Furu Wei, Ke Xu:
Visualizing and Understanding the Effectiveness of BERT. CoRR abs/1908.05620 (2019) - [i15]Zewen Chi, Li Dong, Furu Wei, Wenhui Wang, Xianling Mao, Heyan Huang:
Cross-Lingual Natural Language Generation via Pre-Training. CoRR abs/1909.10481 (2019) - [i14]Haichao Zhu, Li Dong, Furu Wei, Bing Qin, Ting Liu:
Transforming Wikipedia into Augmented Data for Query-Focused Summarization. CoRR abs/1911.03324 (2019) - [i13]Zewen Chi, Li Dong, Furu Wei, Xianling Mao, Heyan Huang:
Can Monolingual Pretrained Models Help Cross-Lingual Classification? CoRR abs/1911.03913 (2019) - 2018
- [j4]Ursula Challita, Li Dong, Walid Saad:
Proactive Resource Management for LTE in Unlicensed Spectrum: A Deep Learning Perspective. IEEE Trans. Wirel. Commun. 17(7): 4674-4689 (2018) - [c17]Li Dong, Mirella Lapata:
Coarse-to-Fine Decoding for Neural Semantic Parsing. ACL (1) 2018: 731-742 - [c16]Li Dong, Chris Quirk, Mirella Lapata:
Confidence Modeling for Neural Semantic Parsing. ACL (1) 2018: 743-753 - [i12]Li Dong, Chris Quirk, Mirella Lapata:
Confidence Modeling for Neural Semantic Parsing. CoRR abs/1805.04604 (2018) - [i11]Li Dong, Mirella Lapata:
Coarse-to-Fine Decoding for Neural Semantic Parsing. CoRR abs/1805.04793 (2018) - [i10]Ratish Puduppully, Li Dong, Mirella Lapata:
Data-to-Text Generation with Content Selection and Planning. CoRR abs/1809.00582 (2018) - 2017
- [c15]Li Dong, Shaohan Huang, Furu Wei, Mirella Lapata, Ming Zhou, Ke Xu:
Learning to Generate Product Reviews from Attributes. EACL (1) 2017: 623-632 - [c14]Li Dong, Jonathan Mallinson, Siva Reddy, Mirella Lapata:
Learning to Paraphrase for Question Answering. EMNLP 2017: 875-886 - [i9]Ursula Challita, Li Dong, Walid Saad:
Proactive Resource Management in LTE-U Systems: A Deep Learning Perspective. CoRR abs/1702.07031 (2017) - [i8]Li Dong, Jonathan Mallinson, Siva Reddy, Mirella Lapata:
Learning to Paraphrase for Question Answering. CoRR abs/1708.06022 (2017) - 2016
- [j3]Li Dong, Furu Wei, Ke Xu, Shixia Liu, Ming Zhou:
Adaptive Multi-Compositionality for Recursive Neural Network Models. IEEE ACM Trans. Audio Speech Lang. Process. 24(3): 422-431 (2016) - [c13]Li Dong, Mirella Lapata:
Language to Logical Form with Neural Attention. ACL (1) 2016 - [c12]Jianpeng Cheng, Li Dong, Mirella Lapata:
Long Short-Term Memory-Networks for Machine Reading. EMNLP 2016: 551-561 - [c11]Chuanqi Tan, Furu Wei, Li Dong, Weifeng Lv, Ming Zhou:
Solving and Generating Chinese Character Riddles. EMNLP 2016: 846-855 - [c10]Yichun Yin, Furu Wei, Li Dong, Kaimeng Xu, Ming Zhang, Ming Zhou:
Unsupervised Word and Dependency Path Embeddings for Aspect Term Extraction. IJCAI 2016: 2979-2985 - [i7]Li Dong, Mirella Lapata:
Language to Logical Form with Neural Attention. CoRR abs/1601.01280 (2016) - [i6]Jianpeng Cheng, Li Dong, Mirella Lapata:
Long Short-Term Memory-Networks for Machine Reading. CoRR abs/1601.06733 (2016) - [i5]Yichun Yin, Furu Wei, Li Dong, Kaimeng Xu, Ming Zhang, Ming Zhou:
Unsupervised Word and Dependency Path Embeddings for Aspect Term Extraction. CoRR abs/1605.07843 (2016) - 2015
- [j2]Li Dong, Furu Wei, Shujie Liu, Ming Zhou, Ke Xu:
A Statistical Parsing Framework for Sentiment Classification. Comput. Linguistics 41(2): 293-336 (2015) - [j1]Duyu Tang, Bing Qin, Furu Wei, Li Dong, Ting Liu, Ming Zhou:
A Joint Segmentation and Classification Framework for Sentence Level Sentiment Classification. IEEE ACM Trans. Audio Speech Lang. Process. 23(11): 1750-1761 (2015) - [c9]Ziqiang Cao, Furu Wei, Li Dong, Sujian Li, Ming Zhou:
Ranking with Recursive Neural Networks and Its Application to Multi-Document Summarization. AAAI 2015: 2153-2159 - [c8]Li Dong, Furu Wei, Ming Zhou, Ke Xu:
Question Answering over Freebase with Multi-Column Convolutional Neural Networks. ACL (1) 2015: 260-269 - [c7]Li Dong, Furu Wei, Hong Sun, Ming Zhou, Ke Xu:
A Hybrid Neural Model for Type Classification of Entity Mentions. IJCAI 2015: 1243-1249 - [c6]Li Dong, Furu Wei, Yichun Yin, Ming Zhou, Ke Xu:
Splusplus: A Feature-Rich Two-stage Classifier for Sentiment Analysis of Tweets. SemEval@NAACL-HLT 2015: 515-519 - 2014
- [c5]Li Dong, Furu Wei, Ming Zhou, Ke Xu:
Adaptive Multi-Compositionality for Recursive Neural Models with Applications to Sentiment Analysis. AAAI 2014: 1537-1543 - [c4]Li Dong, Furu Wei, Chuanqi Tan, Duyu Tang, Ming Zhou, Ke Xu:
Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification. ACL (2) 2014: 49-54 - [c3]Duyu Tang, Furu Wei, Bing Qin, Li Dong, Ting Liu, Ming Zhou:
A Joint Segmentation and Classification Framework for Sentiment Analysis. EMNLP 2014: 477-487 - [i4]Li Dong, Furu Wei, Shujie Liu, Ming Zhou, Ke Xu:
A Statistical Parsing Framework for Sentiment Classification. CoRR abs/1401.6330 (2014) - 2013
- [c2]Li Dong, Furu Wei, Yajuan Duan, Xiaohua Liu, Ming Zhou, Ke Xu:
The Automated Acquisition of Suggestions from Tweets. AAAI 2013: 239-245 - [i3]Xiao Liang, Jichang Zhao, Li Dong, Ke Xu:
Unraveling the origin of exponential law in intra-urban human mobility. CoRR abs/1305.6364 (2013) - 2012
- [c1]Jichang Zhao, Li Dong, Junjie Wu, Ke Xu:
MoodLens: an emoticon-based sentiment analysis system for chinese tweets. KDD 2012: 1528-1531 - [i2]Xiao Liang, Jichang Zhao, Li Dong, Ke Xu:
Modeling collective human mobility: Understanding exponential law of intra-urban movement. CoRR abs/1212.6331 (2012) - 2011
- [i1]Jichang Zhao, Xu Feng, Li Dong, Xiao Liang, Ke Xu:
Performance of Local Information Based Link Prediction: A Sampling Perspective. CoRR abs/1107.1586 (2011)
last updated on 2025-03-28 00:33 CET by the dblp team
all metadata released as open data under CC0 1.0 license