Linear discriminant analysis (LDA) is an important approach to selecting features in classification tasks such as facial recognition. However, it suffers from the small sample size (SSS) problem, in which LDA cannot be solved numerically. The SSS problem occurs when the number of training samples is smaller than the number of dimensions, which is often the case in practice. Researchers have proposed several modified versions of LDA to deal with this problem, but a solid theoretical analysis is still missing. In this paper, we analyze LDA and the SSS problem from the perspective of learning theory. LDA is derived from Fisher's criterion; however, when formulated as a least-squares approximation problem, it has a direct connection to regularization network (RN) algorithms. Many learning algorithms, such as support vector machines (SVMs), can be viewed as regularization networks. LDA turns out to be an RN without the regularization term, which is in general an ill-posed problem. This explains why LDA suffers from the SSS problem. To transform the ill-posed problem into a well-posed one, the regularization term is necessary. Thus, based on statistical learning theory, we derive a new approach to discriminant analysis, which we call discriminant learning analysis (DLA). DLA is well-posed and behaves well in the SSS situation. Experimental results are presented to validate our proposal.
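
To make the SSS problem concrete, the following minimal sketch (not the DLA method itself) builds the within-class scatter matrix that classical LDA must invert, using fewer samples than dimensions, and checks its rank before and after adding a generic ridge-style term. The synthetic Gaussian data, the helper name `within_class_scatter`, and the value of `lam` are assumptions chosen purely for illustration.

```python
import numpy as np


def within_class_scatter(X, y):
    """Sum of per-class scatter matrices for samples X (n x d) with labels y."""
    d = X.shape[1]
    S_w = np.zeros((d, d))
    for label in np.unique(y):
        X_c = X[y == label]
        centered = X_c - X_c.mean(axis=0)  # subtract the class mean
        S_w += centered.T @ centered
    return S_w


rng = np.random.default_rng(0)
n_samples, n_dims = 20, 50  # fewer samples than dimensions: the SSS setting
X = rng.normal(size=(n_samples, n_dims))  # synthetic data (assumption)
y = np.repeat([0, 1], n_samples // 2)     # two balanced classes

S_w = within_class_scatter(X, y)
rank_plain = np.linalg.matrix_rank(S_w)

# Generic ridge-style regularization, used here only as an illustration;
# lam is a hypothetical value, not one taken from the paper.
lam = 1e-2
S_w_reg = S_w + lam * np.eye(n_dims)
rank_reg = np.linalg.matrix_rank(S_w_reg)

print("rank without regularization:", rank_plain, "of", n_dims)
print("rank with regularization:   ", rank_reg, "of", n_dims)
```

The rank of the unregularized matrix is bounded by the number of samples minus the number of classes, so it cannot be inverted when that bound is below the dimensionality. Adding a positive multiple of the identity restores full rank, which is the intuition behind requiring a regularization term.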