Research Interests
- Natural Language Processing
- Knowledge acquisition and reasoning
- Knowledge-based reasoning
- Information extraction
- Cross-lingual NLP
- Low-resource NLP
- Resource and Evaluation
- Crowdsourcing
- Human-in-the-loop
- Applications in the Human Resources domain
In general, I like to start by observing how people use language and finding interesting problems that can lead to a better society. So far, many of my research projects have ended up proposing a new task and/or creating a new language resource rather than developing systems that compete well on existing benchmark datasets. I like exploring how linguistic theories and computational models can explain various linguistic phenomena and how external data, particularly structured knowledge, can improve language technologies in an interpretable way.
Education
- 2019/08 - 2023/05:
- Ph.D., LTI, CMU, Advisor: Eduard Hovy.
- 2017/08 - 2019/08:
- M.S., LTI, CMU, Advisor: Eduard Hovy.
- 2015 - 2017:
- M.S., Kyoto University, Advisor: Sadao Kurohashi.
- 2011 - 2015:
- Undergraduate, Kyoto University, Advisor: Hisashi Kashima.
Work Experience
- 2023/07 - Present:
- Research Scientist, Megagon Labs.
- 2019/08 - 2023/05:
- Research Assistant, Language Technologies Institute, Carnegie Mellon University.
- 2021/06 - 2021/08:
- Internship, Microsoft Research Redmond, Online, Topic: intent-based text representation for to-do management assistance.
- 2020/06 - 2020/08:
- Internship, Robert Bosch LLC, Online, Topic: dialogue response generation using common-sense.
- 2016/08 - 2016/10:
- Internship, Microsoft Research Asia, Beijing, China, Topic: paraphrasing and text normalization, Received Award of Excellence.
- 2016/02 - 2017/03:
- Internship, Yahoo! JAPAN, Tokyo, Topic: GWAP on spoken dialogue systems for knowledge acquisition.
- 2015/09 - 2015/10:
- Internship, Mentor: Akiko Murakami, IBM Research - Tokyo, Tokyo, Topic: abbreviation disambiguation.
Publications
Journal
- Naoki Otani, Yukino Baba, and Hisashi Kashima. 2016. Quality Control of Crowdsourced Classification Using Hierarchical Class Structures. Expert Systems with Applications (ESWA), 58:155–163. http://doi.org/10.1016/j.eswa.2016.04.009
Conference
- Naoki Otani, Jun Araki, HyeongSik Kim, and Eduard Hovy. 2023. A Textual Dataset for Situated Proactive Response Selection. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), pages 3856–3874, Toronto, Canada, July. Association for Computational Linguistics.
- Naoki Otani, Michael Gamon, Sujay Kumar Jauhar, Mei Yang, Sri Raghu Malireddi, and Oriana Riva. 2022. LITE: Intent-based Task Representation Learning Using Weak Supervision. In Proceedings of the 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), pages 2410-2424, Seattle, USA, July. Association for Computational Linguistics.
- Naoki Otani, Satoru Ozaki, Xingyuan Zhao, Yucen Li, Micaelah St Johns and Lori Levin. 2020. Pre-tokenization of Multi-word Expressions in Cross-lingual Word Embeddings. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4451-4464, Online, November. Association for Computational Linguistics.
- Naoki Otani and Eduard Hovy. 2019. Toward Comprehensive Understanding of a Sentiment Based on Human Motives. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), pages 4672-4677, Florence, Italy, July. Association for Computational Linguistics. [poster] [slides]
- Ruochen Xu, Yiming Yang, Naoki Otani, and Yuexin Wu. 2018. Unsupervised Cross-lingual Transfer of Word Embedding Spaces. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2465-2474, Brussels, Belgium, November. Association for Computational Linguistics.
- Naoki Otani, Hirokazu Kiyomaru, Daisuke Kawahara, and Sadao Kurohashi. 2018. Cross-lingual Knowledge Projection Using Machine Translation and Target-side Knowledge Base Completion. In Proceedings of the 27th International Conference on Computational Linguistics (COLING), pages 1508–1520, Santa Fe, New Mexico, USA, August. Association for Computational Linguistics. [poster]
- Aldrian Obaja Muis, Naoki Otani, Nidhi Vyas, Ruochen Xu, Yiming Yang, Teruko Mitamura, and Eduard Hovy. 2018. Low-resource Cross-lingual Event Type Detection via Distant Supervision with Minimal Effort. In Proceedings of the 27th International Conference on Computational Linguistics (COLING), pages 70–82, Santa Fe, New Mexico, USA, August. Association for Computational Linguistics.
- Naoki Otani, Toshiaki Nakazawa, Daisuke Kawahara, and Sadao Kurohashi. 2016. IRT-based Aggregation Model of Crowdsourced Pairwise Comparison for Evaluating Machine Translations. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 511–520, Austin, Texas, USA, November. Association for Computational Linguistics.
- Naoki Otani, Yukino Baba, and Hisashi Kashima. 2015. Quality control for crowdsourced hierarchical classification. In Proceedings of the 2015 IEEE International Conference on Data Mining (ICDM), pages 937–942, Atlantic City, New Jersey, USA, November. IEEE.
Workshop
- Naoki Otani, Jun Araki, HyeongSik Kim, and Eduard Hovy. 2023. On the Underspecification of Situations in Open-domain Conversational Datasets. In Proceedings of the 5th Workshop on NLP for Conversational AI, Toronto, Canada, July. Association for Computational Linguistics. (Outstanding Paper)
- Leonie Weissweiler, Taiqi He, Naoki Otani, David R. Mortensen, Lori Levin, and Hinrich Schütze. 2023. Construction Grammar Provides Unique Insight into Neural Language Models. In Proceedings of the First International Workshop on Construction Grammars and NLP (CxGs+NLP, GURT/SyntaxFest 2023), pages 85-95, Washington D.C., USA, March.
- Shirley Anugrah Hayati, Aditi Chaudhary, Naoki Otani, and Alan W Black. 2019. What A Sunny Day ☔: Toward Emoji-Sensitive Irony Detection. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT), pages 212-216, Hong Kong, China, November. Association for Computational Linguistics. (Workshop at EMNLP)
- Naoki Otani, Daisuke Kawahara, Sadao Kurohashi, Nobuhiro Kaji, and Manabu Sassano. 2016. Large-Scale Acquisition of Commonsense Knowledge via a Quiz Game on a Dialogue System. In Proceedings of the Open Knowledge Base and Question Answering (OKBQA) Workshop, pages 11-20, Osaka, Japan, December. The COLING 2016 Organizing Committee. (Workshop at COLING)
Service
- 2024: LREC-COLING, WNUT, CoLM, ARR (Apr, Aug), SRW (ACL), NLP4ConvAI, EMNLP (Industry Track)
- 2023: ACL, ARR (Feb, Oct, Dec), SRW (IJCNLP-AACL), EMNLP (Industry Track)
- 2022: ARR (Dec), COLING, SRW (ACL)
- 2021: ARR (Sep, Oct), ACL-IJCNLP, EACL, EMNLP, SRW (NAACL), WNUT
- 2020: ACL, AACL-IJCNLP, SRW (ACL, AACL)
- 2019: EMNLP-IJCNLP, WNUT
Awards
- Funai Overseas Scholarship (2017-2019)
- The Award of Excellence, in recognition of my participation in the Microsoft Research Asia Internship Program as a member of Natural Language Computing Group, Microsoft Research Asia, 2016.
- Student Scholarship, EMNLP, 2016.
- Kyoto University Design School Award, "Lyric Generation with Deep learning" (joint work), Hack U Kyoto University 2015, Yahoo! Japan and Kyoto University, 2015.
- Semi-finalist, "Data Analysis Competition Platform on Bluemix", IBM Bluemix Challenge, IBM Japan, 2014.