MPI Series in Biological Cybernetics, Vol. 10
Learning is the process of inferring general rules from given examples. The examples are instances of some input space (pattern space), and the rules can consist of some general observation about the structure of the input space, or have the form of a functional dependency between the input and some output space. Two types of learning problems are considered: classification and clustering. In both problems, the goal is to divide the input space into several regions such that objects within the same region "belong together" and "are different" from the objects in the other regions. The difference between the two problems is that classification is a supervised learning technique while clustering is unsupervised.
Machine learning algorithms are usually designed to deal with either similarities or dissimilarities. In general it is recommended to choose an algorithm which can deal with the type of data given, but sometimes it may become necessary to convert similarities into dissimilarities or vice versa. In some situations this can be done without losing information, especially if the similarities and distances are defined by a scalar product in a Euclidean space. If this is not the case, several heuristics can be invoked. The general idea is to transform a similarity into a dissimilarity function or vice versa by applying a monotonically decreasing function. This matches the general intuition that a distance is small if the similarity is large, and vice versa. The connection between information theory and learning can be exploited in everyday machine learning applications.
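The conversion between similarities and dissimilarities described above can be illustrated with a minimal Python sketch. It is not taken from the text: all function names are hypothetical, and the two heuristics shown (d = s_max - s and s = exp(-d / scale)) are merely common examples of monotonically decreasing transformations, assumed here for illustration. The first two functions cover the Euclidean case, where the scalar product s and the distance d determine each other through d(x, y)^2 = s(x, x) + s(y, y) - 2 s(x, y), once an origin has been fixed.

```python
import math


def dot(x, y):
    """Standard scalar product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def distance_from_scalar_product(x, y, s=dot):
    """Distance induced by a scalar product s:
    d(x, y)^2 = s(x, x) + s(y, y) - 2 * s(x, y)."""
    squared = s(x, x) + s(y, y) - 2.0 * s(x, y)
    # Clamp at zero to guard against tiny negative rounding errors.
    return math.sqrt(max(squared, 0.0))


def scalar_product_from_distance(x, y, d, origin):
    """Recover the scalar product relative to a chosen origin o:
    s(x, y) = (d(x, o)^2 + d(y, o)^2 - d(x, y)^2) / 2."""
    return 0.5 * (d(x, origin) ** 2 + d(y, origin) ** 2 - d(x, y) ** 2)


def similarity_to_dissimilarity(s_value, s_max=1.0):
    """Heuristic (assumed): d = s_max - s, decreasing in s.
    Only sensible if similarities are bounded above by s_max."""
    return s_max - s_value


def dissimilarity_to_similarity(d_value, scale=1.0):
    """Heuristic (assumed): s = exp(-d / scale), decreasing in d."""
    return math.exp(-d_value / scale)


if __name__ == "__main__":
    x, y, origin = (1.0, 2.0), (3.0, 4.0), (0.0, 0.0)
    d_xy = distance_from_scalar_product(x, y)
    s_xy = scalar_product_from_distance(x, y, distance_from_scalar_product, origin)
    print("distance:", d_xy)
    print("recovered scalar product:", s_xy, "direct:", dot(x, y))
    print("heuristic similarity:", dissimilarity_to_similarity(d_xy))
```

In the Euclidean case the round trip recovers the original scalar product up to rounding error, which is what converting without loss of information amounts to here. The heuristic transformations preserve only the ordering of the values, so the choice of function and of parameters such as `s_max` and `scale` is a modelling decision.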