This article reviews the research literature on the differences between word reading and picture naming and then introduces a theory of the visual and cognitive processing of pictures and words. The theory accounts for the finding that pictures are named more slowly than words are read. Reading aloud involves a fast grapheme-to-phoneme transformation process, whereas picture naming involves two additional processes: (a) determining the meaning of the pictorial stimulus and (b) finding a name for it. In a reading-naming experiment, the time required for (a) and (b) together was determined to be approximately 160 ms. Data from a second experiment showed no significant difference in the time to visually compare two pictures or two words when the size of the stimuli is equated. Nor was there a difference in the time to make the two types of cross-modality conceptual comparisons (picture first, then word, or word first, then picture). The symmetry of the visual and conceptual comparison results supports the hypothesis that the coding of the mind is neither intrinsically linguistic nor imagistic, but rather abstract. There is a potent stimulus size effect, equal for pictorial and lexical stimuli: small stimuli take longer to be visually processed than larger stimuli. For optimal processing, stimuli should not only be equated for size but should also subtend a visual angle of at least 3°. The article ends with the presentation of a mathematical theory that jointly accounts for the data from word-reading, picture-naming, visual-comparison, and conceptual-comparison experiments.
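
To make the 3° recommendation concrete, the following minimal sketch converts between the physical extent of a stimulus and the visual angle it subtends, using the standard geometric relation size = 2 · distance · tan(angle / 2). The function names and the 60 cm viewing distance are illustrative assumptions, not values taken from the article.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle in degrees subtended by a stimulus of the given extent,
    centered on the line of sight and viewed from the given distance."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def min_size_cm(angle_deg: float, distance_cm: float) -> float:
    """Smallest stimulus extent that subtends the given visual angle
    at the given viewing distance."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


# Hypothetical viewing distance of 60 cm (an assumption for illustration).
required = min_size_cm(3.0, 60.0)
print(f"Stimulus extent for 3 degrees at 60 cm: {required:.2f} cm")
```

Under these assumptions, a stimulus viewed from 60 cm would need to be a little over 3 cm across to subtend 3°. The required extent grows with viewing distance.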