pHSol - 1.1
pH-dependent aqueous solubility of druglike molecules
The pHSol 1.1 server predicts pH-dependent aqueous solubility of druglike molecules.
Submission
CITATIONS
For publication of results, please cite:
Prediction of pH-dependent aqueous solubility of druglike molecules.
Niclas Tue Hansen, Irene Kouskoumvekaki, Flemming Steen Jørgensen, Søren Brunak and Svava Ósk Jónsdóttir.
J Chem Inf Model: 46(6): 2601-9, 2006.
Instructions
1. Specify the input molecules
This web server calculates intrinsic solubility and pH-dependent solubility profiles of drugs and drug-like molecules from molecular structure. There are three ways of entering molecular structure information.
1) Insert the SMILES strings of the molecules in the window to the left.
2) Read the SMILES strings from a file (*.smi)
3) Read the structures from an SDF file (*.sdf); a sample copy is provided.
Both the SMILES entry window and the smi file expect one structure per line in the format "SMILES-string identifier": the SMILES string first, then whitespace, then a name for the compound. Multiple structures are entered on separate lines.
CC(=O)OC1=CC=CC=C1C(=O)O Aspirin
CCOC(=O)C1=CC=C(C=C1)N Benzocaine
C1C2C(C(C1Cl)Cl)C3(C(=C(C2(C3(Cl)Cl)Cl)Cl)Cl)Cl Chlordane
CCOC(=O)CC(C(=O)OCC)SP(=S)(OC)OC Malathion
C1=CC=C(C=C1)C2(C(=O)NC(=O)N2)C3=CC=CC=C3 Phenytoin
CC12CCC3C(C1CCC2O)CCC4=CC(=O)CCC34C Testosterone
Identifiers are optional: a line may contain the SMILES string only.
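As an illustration of the input format above, the following minimal Python sketch splits each line into a SMILES string and an optional identifier. The function name `parse_smiles_lines` is hypothetical and is not part of the server; the sketch only assumes that the first whitespace-separated token on a line is the SMILES string and that everything after it is the identifier.

```python
from typing import List, Optional, Tuple


def parse_smiles_lines(text: str) -> List[Tuple[str, Optional[str]]]:
    """Return (SMILES, identifier) pairs; identifier is None when left out."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # SMILES strings contain no whitespace, so the first token is the
        # structure and the remainder of the line is taken as the identifier.
        parts = line.split(None, 1)
        smiles = parts[0]
        identifier = parts[1].strip() if len(parts) > 1 else None
        records.append((smiles, identifier))
    return records


example = """CC(=O)OC1=CC=CC=C1C(=O)O Aspirin
CCOC(=O)C1=CC=C(C=C1)N Benzocaine
CC12CCC3C(C1CCC2O)CCC4=CC(=O)CCC34C
"""

for smiles, name in parse_smiles_lines(example):
    print(name or "(no identifier)", smiles)
```

The sketch does not check that a SMILES string is chemically valid; that is left to the server.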
2. Customize your run
Click on the button labelled "Intrinsic solubilities only" if you only want to calculate the intrinsic solubilities. Click on the button labelled "Generate graphics" if you want to calculate the pH-dependent solubility profiles as well.
3. Submit the job
Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and continuously updated until it terminates and the server output appears in the browser window. At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue, and you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.
Prediction of intrinsic aqueous solubility (logS0): The prediction server calculates intrinsic solubility values (the solubility of the non-ionized compound) from chemical structure information. The prediction uses a neural network model trained on a drug-like data set of 4548 compounds from the PHYSPROP database and tested on two different external validation sets. The model is built on nine 2D MOE (Molecular Operating Environment) descriptors, and it was trained with three-fold cross-validation using a fully connected feed-forward neural network with one hidden layer of nine hidden neurons. (The chemical structures can be supplied to the server in SMILES or SDF format, or sketched in the JME applet. The structures are converted to 2D SDF format with the Molconvert program from ChemAxon, and the descriptors are generated with MOE and fed to the NN predictor.)
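To make the network shape concrete, here is a minimal pure-Python sketch of a forward pass through a fully connected feed-forward network with nine inputs, one hidden layer of nine neurons and a single output. Only the layer sizes come from the description above. The tanh activation, the linear output unit, the random weights and the names `make_random_weights` and `forward` are assumptions made for illustration; the trained weights and the actual descriptors of the server are not reproduced here.

```python
import math
import random
from typing import List, Sequence, Tuple

N_INPUTS = 9  # nine 2D descriptors (from the text)
N_HIDDEN = 9  # nine hidden neurons (from the text)

Weights = Tuple[List[List[float]], List[float], List[float], float]


def make_random_weights(seed: int = 0) -> Weights:
    """Placeholder weights; a real model would load trained values."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)]
                for _ in range(N_HIDDEN)]
    b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    b_out = rng.uniform(-0.5, 0.5)
    return w_hidden, b_hidden, w_out, b_out


def forward(descriptors: Sequence[float], weights: Weights) -> float:
    """One forward pass: 9 inputs -> 9 tanh hidden units -> 1 linear output."""
    if len(descriptors) != N_INPUTS:
        raise ValueError(f"expected {N_INPUTS} descriptors")
    w_hidden, b_hidden, w_out, b_out = weights
    # Assumed activation: tanh in the hidden layer.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, descriptors)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Assumed linear output unit, interpreted here as a logS0 value.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


weights = make_random_weights(seed=1)
print(forward([0.1] * N_INPUTS, weights))  # arbitrary value: weights are random
```

Because the weights are random, the printed number has no chemical meaning; the sketch only shows how nine descriptor values flow through the architecture.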
Confidence estimate for the intrinsic solubility prediction: A simple confidence index is assigned to each predicted solubility value, indicating how well the compound matches the chemical space of the training set. Compounds for which the predicted target value or the most important descriptor falls within two standard deviations of the corresponding values of the training set are considered to have high prediction accuracy. For compounds that fall within three standard deviations of either property the prediction accuracy is evaluated as moderate, and compounds that fall outside this range are evaluated as having low prediction accuracy.
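The banding by standard deviations can be sketched in Python as follows. The two-SD and three-SD thresholds come from the paragraph above. Everything else is an assumption: the function names are hypothetical, the training-set means and standard deviations in the usage lines are made-up numbers, and the way the two properties are combined (here, the less favourable of the two bands is reported) is one possible reading of the text, not necessarily the rule used by the server.

```python
from typing import Tuple

_RANK = {"high": 0, "moderate": 1, "low": 2}


def deviation_band(value: float, mean: float, sd: float) -> str:
    """Band one property by its distance from the training-set mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = abs(value - mean) / sd
    if z <= 2.0:
        return "high"      # within two standard deviations
    if z <= 3.0:
        return "moderate"  # within three standard deviations
    return "low"           # outside three standard deviations


def confidence_index(pred_logs0: float, descriptor: float,
                     logs0_stats: Tuple[float, float],
                     descriptor_stats: Tuple[float, float]) -> str:
    """Combine both properties; ASSUMPTION: report the worse band."""
    bands = [
        deviation_band(pred_logs0, *logs0_stats),
        deviation_band(descriptor, *descriptor_stats),
    ]
    return max(bands, key=_RANK.__getitem__)


# Hypothetical training-set statistics given as (mean, standard deviation).
print(confidence_index(-1.4, 2.0, (-2.0, 1.0), (1.5, 1.0)))
```

A more permissive reading, in which a compound is rated by the better of the two bands, would only require replacing `max` with `min` in `confidence_index`.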
Prediction of pH-dependent aqueous solubility: The predicted pH-solubility profiles (logS) are calculated with the Henderson-Hasselbalch (HH) equation from the logS0 values predicted as described above and the acid-base dissociation constants (pKa values) computed with the Marvin program from ChemAxon.
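For a monoprotic acid, the HH equation has the form

$$\log S = \log S_0 + \log\left(1 + 10^{\mathrm{pH} - \mathrm{p}K_a}\right)$$

for a monoprotic base it becomes

$$\log S = \log S_0 + \log\left(1 + 10^{\mathrm{p}K_a - \mathrm{pH}}\right)$$

and for an ampholyte, the above two equations are combined to give

$$\log S = \log S_0 + \log\left(1 + 10^{\mathrm{pH} - \mathrm{p}K_a(\mathrm{acid})} + 10^{\mathrm{p}K_a(\mathrm{base}) - \mathrm{pH}}\right)$$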
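The three forms of the HH equation above can be evaluated with one small function, since the acid and base forms are special cases of the ampholyte form with one term left out. The sketch below is an illustration only: the function name `log_s` and its parameter names are hypothetical, and the Aspirin values (logS0 = -1.389, pKa = 3.41) are taken from the example output of this page.

```python
import math
from typing import Optional


def log_s(ph: float, log_s0: float,
          pka_acid: Optional[float] = None,
          pka_base: Optional[float] = None) -> float:
    """logS at a given pH from logS0 and the acidic and/or basic pKa."""
    total = 1.0
    if pka_acid is not None:
        total += 10.0 ** (ph - pka_acid)   # ionization of the acidic group
    if pka_base is not None:
        total += 10.0 ** (pka_base - ph)   # ionization of the basic group
    return log_s0 + math.log10(total)


# Solubility profile of a monoprotic acid (Aspirin values from this page).
for ph in (1.0, 3.41, 5.0, 7.4):
    print(f"pH {ph:5.2f}  logS = {log_s(ph, -1.389, pka_acid=3.41):7.3f}")
```

At pH = pKa the ionization term equals 1, so logS exceeds logS0 by log 2 (about 0.30). Well above the pKa of an acid, logS rises by roughly one unit per pH unit without bound, because no salt solubility limit is applied in these equations.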
The salt solubility limit of the compounds is not implemented in the present model, but it will be included in the next version. For this purpose a new data set has been measured by colleagues at Warsaw University of Technology in Poland.
EXAMPLE OUTPUT
# Compound      p_logS0   p_pKa   p_pKb   Rel.
# ========================================================================
Aspirin         -1.389    3.41            high
# ========================================================================
Aspirin

Download the numerical data
References
Prediction of pH-dependent aqueous solubility of druglike molecules.
Niclas Tue Hansen, Irene Kouskoumvekaki, Flemming Steen Jørgensen (1), Søren Brunak and Svava Ósk Jónsdóttir.
J Chem Inf Model: 46(6): 2601-9, 2006.
Center for Biological Sequence Analysis, Department of Systems Biology,
Technical University of Denmark, DK-2800 Lyngby, Denmark
(1) Danish University of Pharmaceutical Sciences,
Universitetsparken 2, DK-2100 Copenhagen, Denmark
PMID: 17125200