3. Data Sets
3.1. XXX Database
3.1.1. Abstract
The XXX data set is a multivariate data set for efficient and accurate density-based prediction of macromolecular properties. It consists of 11 different types of information-theoretic approach (ITA) quantities. The rows hold the ITA values, and the columns are: Shannon, Fisher, Fisher’, GBP, E2, E3, R2, R3, G1, G2, G3, iso, aniso.
3.1.2. Download
Molecule .xyz files for the currently covered nucleophiles and electrophiles.
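The downloaded .xyz files can be read with a few lines of plain Python. The sketch below assumes the common XYZ layout (atom count on the first line, a free-text comment on the second, then one `symbol x y z` line per atom). The function name `parse_xyz` and the water geometry are illustrative only and are not part of `mlita` or of this data set.

```python
from typing import List, Tuple

# One atom: element symbol plus Cartesian coordinates.
Atom = Tuple[str, float, float, float]


def parse_xyz(text: str) -> Tuple[str, List[Atom]]:
    """Parse XYZ-format text into (comment, atoms).

    Hypothetical helper: assumes the standard XYZ layout and is not an mlita function.
    """
    lines = text.strip().splitlines()
    n_atoms = int(lines[0].split()[0])  # line 1: number of atoms
    comment = lines[1].strip() if len(lines) > 1 else ""  # line 2: comment
    atoms: List[Atom] = []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    if len(atoms) != n_atoms:
        raise ValueError(f"expected {n_atoms} atoms, found {len(atoms)}")
    return comment, atoms


# Illustrative input only; the geometry is approximate and not taken from the data set.
WATER_XYZ = """3
water
O 0.000 0.000 0.117
H 0.000 0.757 -0.467
H 0.000 -0.757 -0.467
"""

comment, atoms = parse_xyz(WATER_XYZ)
print(comment, len(atoms))
```

To read a file from disk, pass the file contents to the same function, for example `parse_xyz(open(path).read())` with `path` pointing at one of the downloaded files.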
3.1.3. How to cite
When using this dataset, please make sure to cite the following two papers:
[1]:
from mlita import datasets
# Import some data to play with.
# zhaos = datasets.load_zhaos()
# X = zhaos.data[:, :2]  # we only take the first two features
# y = zhaos.target
3.2. Mayr’s Database
3.2.1. Abstract
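Mayr’s Database of Reactivity Parameters currently contains the reactivity parameters of 1256 nucleophiles and 344 electrophiles.
Rate constants at 20 °C follow the Mayr-Patz equation \(\log k_{20\,^{\circ}\mathrm{C}} = s_N (N + E)\), where \(N\) is the nucleophilicity parameter, \(E\) is the electrophilicity parameter, and \(s_N\) is the nucleophile-specific sensitivity parameter.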
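The relation log k(20 °C) = sN(N + E) can be evaluated directly once the three parameters are known. The sketch below is a minimal illustration; the function names are hypothetical (not part of `mlita`), and the parameter values are made up for the arithmetic rather than taken from the database.

```python
def log_k_20c(s_n: float, n: float, e: float) -> float:
    """Return log10 of the rate constant at 20 °C from log k = s_N * (N + E)."""
    return s_n * (n + e)


def rate_constant_20c(s_n: float, n: float, e: float) -> float:
    """Return the rate constant k at 20 °C by undoing the base-10 logarithm."""
    return 10.0 ** log_k_20c(s_n, n, e)


# Illustrative values only (not database entries): s_N = 1.0, N = 10.0, E = -5.0
log_k = log_k_20c(1.0, 10.0, -5.0)
k = rate_constant_20c(1.0, 10.0, -5.0)
print(log_k, k)
```

With these made-up inputs, N + E = 5.0 and sN = 1.0, so log k = 5.0 and k = 10^5.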
3.2.2. Download
Molecule .xyz files for the currently covered nucleophiles.
Molecule .xyz files for the currently covered electrophiles.
Covered nucleophiles range: -8.80 ≤ N ≤ 30.82
Covered electrophiles range: -29.60 ≤ E ≤ 8.02
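Before using a parameter with this data, it can help to check that it falls inside the covered ranges listed above. The sketch below hard-codes those two ranges; the constant and function names are hypothetical and are not part of `mlita`.

```python
from typing import Tuple

# Covered ranges as stated above (inclusive bounds).
N_RANGE: Tuple[float, float] = (-8.80, 30.82)   # nucleophilicity N
E_RANGE: Tuple[float, float] = (-29.60, 8.02)   # electrophilicity E


def in_covered_range(value: float, bounds: Tuple[float, float]) -> bool:
    """Return True if value lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


# Illustrative checks with made-up parameter values.
print(in_covered_range(15.0, N_RANGE))
print(in_covered_range(10.0, E_RANGE))
```

The first check passes because 15.0 lies between -8.80 and 30.82; the second fails because 10.0 exceeds the upper electrophile bound of 8.02.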
[ ]:
from mlita import datasets
# Import some data to play with.
# mayrs = datasets.load_mayrs()
# X = mayrs.data[:, :2]  # we only take the first two features
# y = mayrs.target