
ADME Evaluation in Drug Discovery. 8. The Prediction of Human Intestinal Absorption by a Support Vector Machine

Hou, Tingjun and Wang, Junmei and Li, Youyong (2007) ADME Evaluation in Drug Discovery. 8. The Prediction of Human Intestinal Absorption by a Support Vector Machine. Journal of Chemical Information and Modeling, 47 (6). pp. 2408-2415. ISSN 1549-9596. doi:10.1021/ci7002076.





Human intestinal absorption (HIA) is an important roadblock in the formulation of new drug substances. In silico models that predict the percentage of HIA from calculated molecular descriptors are much needed for the rapid estimation of this property. Here, we have studied the performance of a support vector machine (SVM) in classifying compounds with high or low fractional absorption (%FA > 30% or %FA ≤ 30%). The data set analyzed consists of 578 structurally diverse druglike molecules, divided into a 480-molecule training set and a 98-molecule test set. Ten SVM classification models were generated to investigate the impact of individual molecular properties on %FA. Among the molecular descriptors studied, topological polar surface area (TPSA) and the predicted apparent octanol−water distribution coefficient at pH 6.5 (logD6.5) show better classification performance than the others. To obtain the best SVM classifier, the influences of different kernel functions and different combinations of molecular descriptors were investigated using a rigorous training-validation procedure. The best SVM classifier gives satisfactory predictions for the training set, correctly classifying 97.8% of the poor-absorption class and 94.5% of the good-absorption class. Moreover, 100% of the poor-absorption class and 97.8% of the good-absorption class in the external test set were correctly classified. Finally, the influences of the size of the training set and of the unbalanced nature of the data set were studied. The analysis demonstrates that a large data set is necessary for the stability of the classification models. Furthermore, the weights for the poor-absorption class and the good-absorption class should be properly balanced to generate unbiased classification models. Our work illustrates that SVMs used in combination with simple molecular descriptors can provide an extremely reliable assessment of intestinal absorption in an early in silico filtering process.
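The abstract reports results per class and notes that class weights should be balanced. The following minimal Python sketch illustrates those two ideas only; it is not the authors' implementation. The 30% cutoff comes from the abstract, whereas the %FA values, the predicted labels, and the inverse-frequency weighting formula are assumptions chosen for illustration.

```python
from collections import Counter

THRESHOLD = 30.0  # %FA cutoff from the abstract: > 30% is "good", <= 30% is "poor"


def label_absorption(fa_percent):
    """Return 'good' if %FA > 30, otherwise 'poor'."""
    return "good" if fa_percent > THRESHOLD else "poor"


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of correctly classified compounds, reported per true class."""
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {cls: correct[cls] / totals[cls] for cls in totals}


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency.

    Assumption: this is a common heuristic, n / (k * count), and is not
    necessarily the weighting scheme used in the paper.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


# Hypothetical %FA values, invented for illustration only.
fa_values = [95.0, 80.0, 60.0, 45.0, 31.0, 30.0, 12.0, 5.0]
true_labels = [label_absorption(fa) for fa in fa_values]

# Hypothetical classifier output with one misclassified good-absorption compound.
predicted = ["good", "good", "good", "good", "poor", "poor", "poor", "poor"]

accuracy = per_class_accuracy(true_labels, predicted)
weights = balanced_class_weights(true_labels)
print(accuracy)
print(weights)
```

In this toy data the minority (poor-absorption) class receives the larger weight, so errors on it would cost more during training. Reporting accuracy separately for each class, as the abstract does, keeps a majority-class bias from hiding behind a single overall figure.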

Item Type: Article
ORCID: Li, Youyong 0000-0002-5248-2756
Additional Information: © 2007 American Chemical Society. Received 12 June 2007. Published online 12 October 2007. Published in print 1 November 2007. T.H. is supported by a CTBP postdoctoral scholarship.
Funding Agency: University of California San Diego (Grant Number: UNSPECIFIED)
Issue or Number: 6
Record Number: CaltechAUTHORS:20170201-105717435
Official Citation: ADME Evaluation in Drug Discovery. 8. The Prediction of Human Intestinal Absorption by a Support Vector Machine. Tingjun Hou, Junmei Wang, and Youyong Li. Journal of Chemical Information and Modeling 2007 47 (6), 2408-2415. DOI: 10.1021/ci7002076
Usage Policy: No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code: 73931
Deposited By: Ruth Sustaita
Deposited On: 01 Feb 2017 19:17
Last Modified: 11 Nov 2021 05:23
