The perception of objects is a cognitive function of prime importance. In everyday life, object perception benefits from the coordinated interplay of vision, audition, and touch. The different sensory modalities provide both complementary and redundant information about objects, which may improve recognition speed and accuracy in many circumstances. We review crossmodal studies of object recognition in humans that mainly employed functional magnetic resonance imaging (fMRI). These studies show that visual, tactile, and auditory information about objects can activate cortical association areas that were once believed to be modality-specific. Processing converges either in multisensory zones or via direct crossmodal interaction of modality-specific cortices without relay through multisensory regions. We integrate these findings with existing theories about semantic processing and propose a general mechanism for crossmodal object recognition: The recruitment and location of multisensory convergence zones varies depending on the information content and the dominant modality.