Chemoinformatics Laboratory | Nara Institute of Science and Technology, Division of Materials Science

Chemoinformatics Laboratory

Staff & Contact

Educational Staff: Prof. Tomoyuki Miyao, Assistant Prof. Akinori Sato
URL: https://sites.google.com/view/naist-chemoinformatics

Education and Research Activities in the Laboratory

Chemoinformatics is a research area in which chemical problems are addressed using informatics tools. Our mission is to develop chemoinformatics tools that are practical and useful for applications in chemistry and biology. For example, molecular representations have been extensively investigated for virtual screening of large compound datasets to identify bioactive compounds. Likewise, the investigation of appropriate chemical reaction representations for predicting reaction parameters (e.g., yield, selectivity) is an active area of research.

To develop such tools and methods, one must understand both domain knowledge (chemistry or biology) and analytical techniques (statistics and machine learning). Prior experience in at least one of the two is preferable for conducting meaningful research. So far, most of the students in our group have come from chemistry or biology backgrounds and have learned informatics techniques through training programs in our group. Starting with the basics of data analysis (machine learning), you will learn how to handle chemistry-related data and analyze it to extract useful information. Students with an information science background can study chemistry and biology, with a focus on drug discovery, in order to conduct meaningful studies.

Research Themes

1. Methodology development for affinity prediction

Virtual screening is a process that selects potential candidate compounds for a specific target from a compound pool. Ligand-based approaches rely on the similarity principle: similar compounds show similar biological activity. This principle, however, does not necessarily hold when ligand-protein binding mechanisms are considered. Developing methods that extract key information based on these mechanisms within ligand-based approaches is expected to improve virtual screening.
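
As a rough illustration of the similarity principle in ligand-based virtual screening, the sketch below ranks a small compound pool by Tanimoto similarity to a known active. It is a minimal example under stated assumptions, not the laboratory's method: the fingerprints are hypothetical sets of "on" bit indices, and the compound names (cpd_A, cpd_B, cpd_C) are invented for illustration. In practice the bit sets would come from a chemoinformatics toolkit.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: FrozenSet[int], pool: Dict[str, FrozenSet[int]]
) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(query, fp)) for name, fp in pool.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: a known active (query) and three pool compounds.
query_fp = frozenset({1, 4, 7, 9})
pool = {
    "cpd_A": frozenset({1, 4, 7, 9, 12}),  # shares 4 of 5 distinct bits
    "cpd_B": frozenset({1, 4, 20, 21}),    # shares 2 of 6 distinct bits
    "cpd_C": frozenset({30, 31}),          # shares no bits with the query
}

ranking = rank_by_similarity(query_fp, pool)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A ranking like this encodes only structural similarity; it carries no information about the ligand-protein binding mechanism, which is the limitation this research theme addresses.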

2. Molecular representation for understanding chemical reactions

Conventional chemical reaction analysis employs a set of molecular descriptors representing key components of a single putative reaction mechanism to correlate the catalyst’s chemical structure with the reaction parameter. This approach ignores other reaction paths, including side reactions. We are developing methods to incorporate multiple reaction pathways into a reaction-parameter prediction model to enhance its predictive capability.

3. Modeling approaches in the low data regime

Laboratory-scale chemistry data sets are small: fewer than 50 samples (sometimes around 10), all experimentally obtained under homogeneous experimental conditions. Mechanism-oriented molecular representations in combination with traditional machine learning models would be a reasonable approach for this type of problem; however, recently developed deep neural network (DNN) techniques, such as meta-learning and pre-training, would also be options.
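
With only around 10 samples, a separate test set is rarely affordable, so a simple model is often assessed by leave-one-out cross-validation. The sketch below shows one way this might look: it fits a one-descriptor linear model and predicts each held-out sample in turn. The descriptor values and yields are hypothetical numbers made up for illustration, and the single-descriptor model merely stands in for the "traditional machine learning" setting described above.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept.

    Assumes the x values are not all identical (otherwise sxx is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_predictions(x: List[float], y: List[float]) -> List[float]:
    """Predict each sample with a model trained on all the other samples."""
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        predictions.append(slope * x[i] + intercept)
    return predictions


def mean_absolute_error(y_true: List[float], y_pred: List[float]) -> float:
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: one mechanism-oriented descriptor per catalyst and the
# measured yield (%) for ten reactions run under the same conditions.
descriptor = [0.10, 0.25, 0.30, 0.42, 0.55, 0.61, 0.70, 0.78, 0.85, 0.93]
yields = [12.0, 20.0, 26.0, 31.0, 45.0, 44.0, 58.0, 60.0, 71.0, 74.0]

loo_preds = leave_one_out_predictions(descriptor, yields)
print(f"LOO mean absolute error: {mean_absolute_error(yields, loo_preds):.2f}")
```

Every sample serves once as the held-out case, so the error estimate uses all of the data without ever evaluating a model on a sample it was trained on.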

Recent Research Papers and Achievements

  1. Hamakawa Y., Miyao T. J. Chem. Inf. Model. 2025, 65, 3388-3404.
  2. Iwasaki Y., Miyao T. Mol. Inform. 2025, 44, e70000.
  3. Ue T., Sato A., Miyao T. J. Chem. Inf. Model. 2024, 64, 9350-9360.