Research
Trustworthy Information Retrieval¶
Information retrieval (IR) systems play a central role in connecting the ever-growing World Wide Web with users' diverse information needs, which range from simply finding information to acquiring knowledge and making decisions. IR technologies therefore have a significant impact across many domains, such as digital libraries, e-commerce, news recommendation, healthcare search, patent search, and job search, to name just a few. We aim to achieve trustworthy IR technologies by exploring the following aspects:
AI-powered Search¶
Benefiting from the revolution in large language models (LLMs) and generative AI, AI-powered search systems can interact smoothly with users and automate many tasks, freeing people from the tedious process of searching for, extracting, and digesting information from multiple sources.
Privacy-preserving Information Retrieval¶
Modern IR systems can achieve enhanced performance by analyzing huge amounts of log data gathered from users. Unfortunately, the data from which such insights are derived is personal and sensitive, and its exposure can have catastrophic consequences, even if the system collecting it has resolved to ‘do no evil’. To date, the problem of accurately and efficiently providing satisfactory results to users while preserving their privacy remains far from resolved. This research topic aims to initiate research into privacy-preserving IR and to develop scalable privacy-preserving IR methods.
Multimodal Information Retrieval¶
By going beyond the constraints of a single data modality, Multimodal Information Retrieval integrates and aligns heterogeneous data sources (such as text, images, audio, and video) to enable more comprehensive and context-aware search experiences.
Conversational Information Retrieval¶
Conversational Information Retrieval supports user interaction and provides adaptive search results. Essentially, it can be viewed as context-driven information retrieval, where the context includes previously submitted queries, interactive behaviors, etc. Fine-grained topics include user modeling, intent identification, behavior understanding, adaptive item ranking, etc.
Retrieval-augmented Generation (RAG)¶
Benefiting from the boom of deep learning and the recent success of LLMs, generative AI has taken the world by storm, with a significant impact on the entire AI community. Unfortunately, LLMs are prone to hallucinations, that is, producing inaccurate or fictitious content. To cope with this challenge, RAG has emerged as a promising paradigm that integrates information retrieval methods to improve the output of LLMs.
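As a rough illustration of the retrieve-then-generate idea, the following minimal sketch ranks passages against a query and assembles a grounded prompt. It is not a description of any particular system: the bag-of-words cosine scorer, the toy corpus, and the function names `retrieve` and `build_prompt` are all assumptions made for illustration, and the call to an actual LLM is deliberately left out.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a deliberately simple assumption)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return up to k passages with a non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Assemble a prompt that asks the generator to rely on retrieved text."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical toy corpus, for illustration only.
corpus = [
    "RAG combines a retriever with a generator",
    "Knowledge graphs store entities and relations",
    "A retriever ranks passages for a query",
]
query = "what does a retriever do"
passages = retrieve(query, corpus, k=2)
prompt = build_prompt(query, passages)  # this string would be sent to an LLM
print(prompt)
```

In a realistic setting the scorer would typically be replaced by a stronger sparse or dense retriever, and the prompt would be passed to a generator model; the sketch only shows where retrieval enters the pipeline.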
Design of Evaluation Metric/Method¶
Benefiting from the boom of deep learning and the recent success of LLMs, the generative AI revolution is reshaping the landscape of IR, moving it from traditional 10-blue-links retrieval toward AI-powered search. Given these advanced IR technologies, how to evaluate their performance effectively has become a pressing challenge.
Commonsense Reasoning¶
According to the Oxford English Dictionary, common sense is described as: “Intelligence or sagacity in relation to practical matters arising in everyday life; the ability to make sound judgments and sensible decisions regarding such matters”. Commonsense knowledge about general concepts and activities is trivial for humans, yet it is surprisingly difficult for AI systems to acquire. To develop AI systems that can communicate and reason naturally in a human-like way, so-called commonsense intelligence is a fundamental challenge. We aim to endow AI systems with commonsense intelligence by exploring the following aspects:
Multimodal Generative Commonsense Reasoning¶
The ability of generative commonsense reasoning (GCR) reflects how well an AI system can produce trustworthy outputs that align with real-world commonsense knowledge. By multimodal, we mean going beyond the constraints of a single data modality so as to achieve more comprehensive and context-aware GCR.
User Understanding¶
User Modeling¶
Nowadays, an enormous volume of search requests is submitted every day. For example, Google processes over 3.5 billion searches per day (according to Internet Live Stats). Query logs capture and store the interactions between search engines and their users, and they are a rich source of information about how users express their information needs and how they seek and select the desired information units. User modeling aims to extract the knowledge embedded in query logs in order to understand users and to facilitate research in relevant fields, such as Information Retrieval and Recommender Systems.
Knowledge Graph¶
A Knowledge Graph is a structured representation of knowledge in a graph format. For example, a triple-based Knowledge Graph consists of entities, concepts, and their relationships, where nodes denote entities (such as individuals, locations, or objects) and edges define the semantic relationships between them. Example topics that we are focusing on are as follows:
Persona Knowledge Graph¶
Affordance Knowledge Graph¶
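The triple-based representation described for knowledge graphs above can be sketched as a small in-memory store. This is only an illustrative sketch: the `Triple` and `KnowledgeGraph` names, the relation labels, and the example facts (including the persona-style and affordance-style triples) are hypothetical and are not taken from any actual dataset.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One (head entity, relation, tail entity) fact."""
    head: str
    relation: str
    tail: str


class KnowledgeGraph:
    """Minimal triple store indexed by head entity."""

    def __init__(self):
        self._out = defaultdict(list)  # head -> list of outgoing triples

    def add(self, head, relation, tail):
        self._out[head].append(Triple(head, relation, tail))

    def tails(self, head, relation):
        """Entities reachable from `head` through an edge labeled `relation`."""
        return [t.tail for t in self._out.get(head, []) if t.relation == relation]


kg = KnowledgeGraph()
# Hypothetical persona-style facts about an individual.
kg.add("Alice", "lives_in", "Tokyo")
kg.add("Alice", "likes", "hiking")
# Hypothetical affordance-style fact about an object.
kg.add("cup", "affords", "drinking")

print(kg.tails("Alice", "likes"))
print(kg.tails("cup", "affords"))
```

Indexing by head entity keeps neighbor lookups cheap; a fuller store would usually also index by relation and tail to support queries in the other direction.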
Computational Human-Value Understanding¶
Recent advancements in LLMs have revolutionized AI-oriented applications across various domains. However, advanced AI capabilities also bring potentially harmful impacts on individuals and societies that we are not yet fully prepared to handle. We aim to cope with the potentially harmful impacts of LLM-driven AI systems (LLMAISs) through the lens of computational human-value understanding. The core principle is that real-world human-centric activities are driven by human values. If LLMAISs can fully understand the nuances of real-world human values and align their behaviors with the values we desire, humanity can expect to live in harmony with AI systems. Example topics that we are focusing on are as follows: