Document Type

Conference Proceeding

Publication Date

Fall 10-20-2014


The human voice is a powerful instrument for producing sound sketches. The sonic space that can be spanned with the voice is vast and complex and, therefore, difficult to organize and explore. In this contribution, we report on our attempts at extracting the principal components from a database of 152 short excerpts of vocal imitations. We describe each excerpt by a set of statistical audio features and by a measure of the similarity of its envelope to a small number of prototype envelopes. We apply k-means clustering in a space whose dimensionality has been reduced by singular value decomposition, and discuss how meaningful the resulting clusters are. Ultimately, a representative of each cluster, chosen to be close to its centroid, may serve as a landmark for exploring the sound space.
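
The pipeline outlined in the abstract (feature matrix, dimensionality reduction by singular value decomposition, k-means clustering, one landmark per cluster) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the feature matrix is random placeholder data, and the 12 feature columns, the 3 retained dimensions, the 4 clusters, the column standardization, and all function names are hypothetical choices made only for the example. Only the count of 152 excerpts comes from the abstract.

```python
import numpy as np

def reduce_svd(X, n_components):
    """Project standardized features onto the top right-singular vectors."""
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # guard against constant columns
    Z = (X - X.mean(axis=0)) / std           # standardization is an assumption
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def assign(Y, centroids):
    """Label each row of Y with the index of its nearest centroid."""
    dist = np.linalg.norm(Y[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1)

def kmeans(Y, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm, initialized from k randomly chosen rows."""
    rng = np.random.default_rng(seed)
    centroids = Y[rng.choice(len(Y), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(Y, centroids)
        updated = np.array([
            Y[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(Y, centroids), centroids

def landmarks(Y, labels, centroids):
    """Map each non-empty cluster to the index of its member nearest the centroid."""
    reps = {}
    for j, c in enumerate(centroids):
        members = np.flatnonzero(labels == j)
        if members.size:
            nearest = np.linalg.norm(Y[members] - c, axis=1).argmin()
            reps[j] = int(members[nearest])
    return reps

# Placeholder data: 152 excerpts (from the abstract) x 12 hypothetical features.
X = np.random.default_rng(42).normal(size=(152, 12))
Y = reduce_svd(X, n_components=3)            # 3 dimensions is an assumption
labels, centroids = kmeans(Y, k=4)           # 4 clusters is an assumption
reps = landmarks(Y, labels, centroids)
```

Here `reps` maps each cluster index to the row index of the excerpt nearest that cluster's centroid, which is the sense in which a representative may serve as a landmark. With real data, each row of `X` would hold the statistical audio features and envelope-similarity measures of one excerpt.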


Copyright © 2014 Davide Rocchesso. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.