This study proposes an approach to emotion recognition in children with hearing impairments that combines physiological and facial cues and fuses them using machine learning techniques. The study is part of a child-robot interaction project that aims to support children with hearing impairments through affective applications in clinical settings and hospital environments, and to improve their social well-being. Physiological signals and facial expressions of the children were collected, and the collaborating psychologists annotated them as pleasant, unpleasant, or neutral using video recordings of the sessions. Both unimodal and multimodal approaches were used to classify emotions from this data. The model trained on facial expression features alone reached an accuracy of 43.67%. With physiological data alone, the accuracy rose to 58.68%. Finally, when the features of the two modalities were fused at the feature level, the accuracy increased further to 74.96%, demonstrating that, for this data set, the multimodal approach significantly improves the recognition of pleasant, unpleasant, and neutral emotions in children with hearing impairments.
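As a rough illustration of what fusing two modalities at the feature level can look like, the sketch below concatenates per-sample facial and physiological feature vectors into a single vector that a classifier could then consume. It is a minimal sketch under stated assumptions, not the pipeline used in the study: the function names `zscore` and `fuse_features`, the per-column standardization step, the feature dimensions, and the random placeholder data are all hypothetical.

```python
import numpy as np


def zscore(x: np.ndarray) -> np.ndarray:
    """Standardize each feature column to zero mean and unit variance."""
    std = x.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (x - x.mean(axis=0)) / std


def fuse_features(facial: np.ndarray, physio: np.ndarray) -> np.ndarray:
    """Feature-level fusion: concatenate the per-sample feature vectors.

    Standardizing each modality first is an assumption made here so that
    neither modality dominates purely because of its numeric scale.
    """
    if facial.shape[0] != physio.shape[0]:
        raise ValueError("both modalities need one row per sample")
    return np.hstack([zscore(facial), zscore(physio)])


# Placeholder data (not from the study): 6 samples,
# 4 facial features and 3 physiological features per sample.
rng = np.random.default_rng(0)
facial = rng.normal(size=(6, 4))
physio = rng.normal(size=(6, 3))

fused = fuse_features(facial, physio)
print(fused.shape)  # (6, 7)
```

The fused matrix has one row per sample and one column per feature from either modality, so any standard classifier can be trained on it in the same way as on a single-modality feature matrix.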