This paper considers a regression-based clustering problem for multiple responses. We propose a nested optimization formulation, where two minimizations are performed one after the other. The first minimization is to determine the number of clusters, and the second is to minimize the L-infinity-norm of residuals. After fixing the number of clusters, our formulation reduces to a novel Mixed-Integer Linear Programming (MILP) problem, which can handle multiple responses simultaneously. We further propose an empirical approach based on cross-validation to determine a good number of clusters; this approach takes into account prediction accuracies of models when choosing the number of clusters. Using the JURA dataset, we illustrate that the classic approach in the literature, which considers one response at-a-time, usually assigns the same entity to different clusters with respect to different responses; hence, eventually, it is not evident to which cluster that entity belongs to. Also, even though the classic approach assigns that entity to the same cluster with respect to all responses, this assignment can be different than the one obtained when all responses are considered simultaneously; hence, the classic approach can result in a false clustering when multiple responses have to be considered at the same time.