In this paper, we propose a model for the direct incorporation of image content into a (short-term) user profile, based on correlations between visual words and on adaptation of the similarity measure. We explore the relationships between visual words at different contextual levels, introducing and comparing several notions of correlation that we broadly categorise as image-level and proximity-based. Information about the most and the least correlated visual words is then exploited to adapt the similarity measure. The evaluation, which precedes a planned experiment with real users (future work), is performed within the pseudo-relevance feedback framework. We test the new method on three large data collections, namely MIRFlickr, ImageCLEF, and a collection from the British Geological Survey (BGS). The proposed model is computationally cheap and scales to large image collections.
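To make the idea concrete, the following is a minimal sketch of one possible image-level correlation and a simple similarity adaptation. The bag-of-visual-words matrix, the Pearson-based correlation, the expansion weight `alpha`, and the soft query expansion are all illustrative assumptions for exposition, not the exact formulation proposed in the paper.

```python
import numpy as np

# Toy bag-of-visual-words matrix: rows = images, columns = visual words.
# Values and dimensions are purely illustrative.
X = np.array([
    [2, 1, 0, 0],
    [1, 3, 0, 1],
    [0, 0, 4, 2],
    [0, 1, 3, 3],
], dtype=float)

def image_level_correlation(X):
    """Pearson correlation between visual words, computed over their
    occurrence counts across images (one example of an image-level notion)."""
    return np.corrcoef(X.T)

def adapted_similarity(query, doc, corr, alpha=0.5):
    """Cosine similarity after softly expanding the query toward words
    that correlate with its visual words (an assumed adaptation scheme)."""
    expanded = query + alpha * corr @ query  # boost correlated words
    den = np.linalg.norm(expanded) * np.linalg.norm(doc)
    return (expanded @ doc) / den if den else 0.0

corr = image_level_correlation(X)
q = np.array([1.0, 1.0, 0.0, 0.0])  # short-term profile over visual words
scores = [adapted_similarity(q, d, corr) for d in X]
```

In a pseudo-relevance feedback setting, the correlation statistics would be estimated from the top-ranked images of an initial retrieval round rather than from the whole collection, and the adapted similarity would then re-rank the results.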