Tutorials
1. What is a molecular descriptor
The molecular descriptor is the final result of a logical and mathematical procedure which transforms chemical information encoded within a symbolic representation of a molecule into a useful number or the result of some standardized experiment...
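As a small illustration of turning a symbolic representation into a number, the sketch below computes the Wiener index (the sum of shortest-path distances between all pairs of atoms in the hydrogen-depleted molecular graph). The choice of the Wiener index, the adjacency-dictionary encoding and the function name `wiener_index` are assumptions made for this example only; they are not taken from the tutorial.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path distances over all atom pairs of a connected graph.

    `adjacency` maps each atom label to the list of its bonded neighbours
    (hydrogen atoms omitted).
    """
    total = 0
    for source in adjacency:
        # Breadth-first search gives topological distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    # Every pair was counted twice (once from each end).
    return total // 2


# Carbon skeletons of two C4H10 isomers (illustrative atom numbering).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}

print(wiener_index(n_butane))   # 10
print(wiener_index(isobutane))  # 9
```

The two isomers share the same formula but receive different values, which shows how a descriptor can encode structural information that the formula alone does not.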

download tutorial (PDF)
 
2. Molecular descriptors and chemometrics
The concept of molecular structure is one of the most important concepts in the development of scientific knowledge in the twentieth century. Indeed, reasoning based on molecular structure has been the main engine for the great development of physical chemistry, molecular physics, organic chemistry, quantum chemistry, chemical synthesis, polymer chemistry and medicinal chemistry. By definition, a system is complex when...

download tutorial (PDF)
 
3. Basic requirements for valid molecular descriptors
Several scientists are engaged in the search for new molecular descriptors able to capture new aspects of molecular structure. This kind of research requires creativity and imagination, together with a solid theoretical basis, to obtain numbers with some structural chemical meaning.
A molecular descriptor can be more or less useful, simple, interpretable, etc., but, in any case, it has to fulfil some mathematical requirements. In particular, the basic properties a molecular descriptor MUST HAVE are...

download tutorial (PDF)
 
4. Some relevant general papers on molecular descriptors
A list of relevant papers dealing with the history of molecular descriptors and general considerations about them.

download tutorial (PDF)
 
5. Useful and unuseful summaries of regression models
How to write useful regression model reports and avoid uninformative ones: simple definitions of some common regression quantities (coefficient of determination, error standard deviation, F-ratio test in regression, Predictive Error Sum of Squares), examples of questionable regression summaries, and proposals for good summaries of regression models.
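The quantities listed above can be sketched for a one-descriptor least-squares model as follows. This is a minimal illustration under stated assumptions, not the tutorial's own code: the function names `fit_line` and `regression_summary` are hypothetical, PRESS is computed by leave-one-out refitting, and the cross-validated Q2 value (1 - PRESS/TSS) is added here for convenience even though it is not named in the description.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def regression_summary(x, y):
    """Common summary quantities for a single-predictor regression."""
    n, p = len(x), 1  # p = number of predictors
    b0, b1 = fit_line(x, y)
    mean_y = sum(y) / n
    tss = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # PRESS: predict each object from a model fitted without it.
    press = 0.0
    for i in range(n):
        a0, a1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a0 + a1 * x[i])) ** 2
    return {
        "R2": 1 - rss / tss,                  # coefficient of determination
        "s": (rss / (n - p - 1)) ** 0.5,      # error standard deviation
        "F": ((tss - rss) / p) / (rss / (n - p - 1)),  # F-ratio
        "RSS": rss,
        "PRESS": press,
        "Q2": 1 - press / tss,                # leave-one-out analogue of R2
    }


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
for name, value in regression_summary(x, y).items():
    print(f"{name}: {value:.4f}")
```

Reporting fitting quantities (R2, s, F) side by side with a predictive one (PRESS or Q2) makes it easier to see when a model fits well but predicts poorly. Note that the F-ratio is undefined for a perfect fit (RSS = 0).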

download tutorial (PDF)
 
6. Variable selection methods: an introduction
To develop regression and classification models, QSAR analysis typically uses molecular descriptors as independent variables. The number of molecular descriptors has increased enormously over time, and nowadays thousands of descriptors, each describing different aspects of a molecule, can be calculated by means of dedicated software. However, when modelling a particular property or biological activity, it is reasonable to assume that only a small number of descriptors are correlated with the experimental response and are, therefore, relevant for building the mathematical model of interest. As a consequence, a key step is the selection of the optimal subset of variables (i.e. molecular descriptors) for the development of the model.
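One of the simplest ways to reduce a descriptor pool is a univariate filter that ranks descriptors by the absolute value of their correlation with the response and keeps the top few. The sketch below illustrates only this idea; it is not necessarily one of the methods covered in the tutorial, it ignores correlations among the descriptors themselves, and the names `pearson`, `select_top_k`, `d1`, `d2`, `d3` and all the numbers are invented for the example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 for a constant variable."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def select_top_k(descriptors, response, k):
    """Names of the k descriptors most correlated (in absolute value)
    with the response."""
    ranked = sorted(
        descriptors,
        key=lambda name: abs(pearson(descriptors[name], response)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical descriptor table: one list of values per descriptor.
descriptors = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [5.0, 3.0, 4.0, 1.0, 2.0],
    "d3": [1.0, 1.0, 1.0, 1.0, 1.0],  # constant, carries no information
}
response = [1.1, 2.0, 2.9, 4.2, 5.0]

print(select_top_k(descriptors, response, 2))  # ['d1', 'd2']
```

A filter like this is fast but evaluates each descriptor in isolation, so subset-based searches are generally needed when descriptors are only useful in combination.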

download tutorial (PDF)
 
7. Defining the Applicability Domain of QSAR models: an overview
QSARs establish a quantitative relationship between chemical structures and their properties. In theory, QSAR models can be used to predict the properties of chemical structures, provided their structural information is available. The rising popularity of QSAR models has been accompanied by questions about the reliability of their predictions. In practice, the applicability of QSAR models to query chemicals is limited: reliable predictions are usually confined to chemicals that are structurally similar to the training compounds used to build the model. The principle of the Applicability Domain obliges users to define the limitations of a model with respect to its structural domain and response space.
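A very simple way to make the idea concrete is a range-based (bounding box) check in descriptor space: a query chemical is flagged as inside the domain only if each of its descriptor values lies within the minimum and maximum seen in the training set. The sketch below assumes this approach for illustration; it covers only the descriptor space, not the response space, and the function names and data are hypothetical.

```python
def fit_bounding_box(training):
    """Per-descriptor (min, max) ranges from the training set.

    `training` is a list of rows, one row of descriptor values per compound.
    """
    columns = zip(*training)
    return [(min(col), max(col)) for col in columns]


def in_domain(query, box):
    """True if every descriptor of `query` falls inside the training range."""
    return all(low <= value <= high for value, (low, high) in zip(query, box))


# Made-up training set: three compounds, two descriptors each.
training = [
    [1.0, 10.0],
    [2.0, 12.0],
    [3.0, 11.0],
]
box = fit_bounding_box(training)

print(in_domain([2.5, 11.5], box))  # True: inside both ranges
print(in_domain([4.0, 11.0], box))  # False: first descriptor out of range
```

A bounding box is easy to interpret but cannot detect empty regions inside the ranges, so a prediction flagged as inside the domain should still be treated with caution.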

download tutorial (PDF)
 
8. Topological descriptors
Definitions of the most important topological descriptors

in preparation
 
9. Geometrical descriptors
Definitions of the most important geometrical descriptors

in preparation