People often seem to experience music as expressive of emotion in some way, and a number of different theories have been proposed to explain this. I think there is good reason to think all of them inadequate, but ironically I also think they are all underestimated. In particular, I think there is good reason to think none of them can be complete, but that criticisms of each are often really based on oversimplifications of causation, or communication, or language, or emotions, or some such other thing, with the result that many of the objections made to each fail to recognize the resources available to each theory.
One of the most obvious families of theories of musical expressiveness is what is usually called arousal theories. The idea is that the expressiveness of a piece of music is its disposition or tendency to arouse emotions in the listener. The theory is often dismissed today, but I think critics often make it too easy for themselves. For instance, it's sometimes said in objection that people can hear music and not experience the emotions it is supposed to express (for instance, they can hear 'sad music' without feeling sad -- indeed, can be made happy by it). But the arousal theory does not require that people actually in every case have the effect. It does not claim that musical expressiveness is music's having a particular effect; it is the claim that it is music's disposition or tendency to a particular effect. Obviously whether listeners actually feel anything depends in part on the disposition of the listener, and not just on the disposition of the music. But the fact that anyone does feel emotions in response to music establishes that the music does play a causal role, and the fact that certain types of music often lead to feeling certain kinds of emotion establishes that something about the music itself is contributing to the result.
Moreover, the notion of arousing emotions is more complicated than the objectors often realize. Suppose you're sad and I am trying to cheer you up. Obviously, I am most successful at this if you actually feel good cheer, but this is not the whole of it, because I can have a wide variety of partial successes. Maybe I won't cheer you up, but you move in a 'cheerward' direction. It's the same causal story, just a less successful form of it. Perhaps I don't even have this success, but I do get you into a state of mind in which cheer is going on, so to speak -- you might not feel it yourself, but you interpret yourself as being in a cheerful situation. This is a weaker form of success, but it would be by the same causal disposition. It's all related to causing good cheer -- actually having that effect is the central form of success, the complete and ideal one, and is what gives the disposition its name; but the fact that my words are suited to causing good cheer is shown not just in causing good cheer but in a wide variety of other related effects. One would expect this to be true in the musical case as well.
Arousal theory focuses on the listeners; obviously you could focus on the composer or the musician instead. It's this, after all, that seems to give 'musical expressiveness' its name: the music expresses the emotions of the composer or the musician. This family is usually called expression theories. The usual objection is that musicians need not actually be feeling the emotions that they express through music. This objection, however, runs into problems similar to those of the objection to arousal theories given above: the objection involves a simplistic view of the causal story involved in expressing emotion, just as the objection to arousal theories involved a simplistic view of the causal story in feeling emotions in response to things. Consider actors. Actors uncontroversially express emotions. The actor doesn't actually need to be feeling the emotions he expresses. But of course it's not as if actors simply make up expressions of emotions; that would just be weird. So where is the actor getting the materials for expressing (say) anger? The actor often draws from his own anger, but it doesn't have to be anger that he is feeling at that particular moment. We can express remembered emotions. What's more, we can draw on remembered emotions, and adapt our expression to whatever our particular ends may be -- exaggerate it, or restrain it, or shift it a bit to give it a different quality, or mix it with other things. In short, we can do with remembered emotions anything we can do in our imagination, and thus we can be expressing imagined emotions, based on emotions that we have actually felt. And, what is more, in imagination we can approximate sympathetically what it would be like to have certain emotions we perhaps have not had, based on imitation of those who have had them and 'putting ourselves in their shoes'.
As with arousal theories, the relevant family of expressions has a central case which lends its name to the whole family -- in this case, actually expressing the feeling that one has in the expression -- but there are related ways of expressing emotions, like expressing emotions as remembered or as imagined. And again, we see this very clearly in acting, so it seems entirely arbitrary to ignore it in the musical case. (Part of the reason the objection gets traction is perhaps that there is a tendency to distinguish 'expressing an emotion' from 'being expressive of an emotion', where the latter is a purely outward manifestation. But this, I think, is quite clearly an irrelevant distinction if we are just talking about the expressiveness of music. Since we don't have telepathy, the experience of the music qua expressive will be exactly the same in each case; and, far from being an important distinction to make in an account of expression -- whether of speech or of acting or of music -- any account of expression will have to straddle the divide.)
In a third family we find association theories. On these theories, the expressiveness of music is purely a matter of convention. Note that the claim is not that it is purely arbitrary; there can be a real, definite meaning of music, just as there can be a real, definite meaning of the sounds we call words and sentences; taking speech to mean something is not itself arbitrary. But the meaning of language, of course, is heavily conventional, and the idea behind association theories is that the expressiveness of music is very much like the meaningfulness of speech. (The fact that they can so easily show the analogy between the expression of music and other domains is a point in their favor.) Now, the usual objection given to association theories is that there are a great many differences between language and music -- most importantly that language is about emotions but often not expressive of them. However, again, I think this is a case of objectors making it too easy on themselves: in fact language is massively expressive, and a standard way we express emotions. Its representational function is so important that its expressive function is often quite secondary, but this is far from saying that it is not a very expressive medium. Philosophers of language may not have focused much on it (because it falls neither under sense nor under reference but under the 'tone' or 'coloring' or 'illumination' that is left over when focusing entirely on sense and reference), but we use language expressively all the time. The same sentence screamed, whispered, said slyly, said in an exaggerated way, said in a goofy voice, said very slowly, said very quickly, is capable of expressing many different things.
Resemblance theories form a fourth family. Music has a dynamic structure and resemblance theories take some analysis of this moving structure and note the analogies we tend to make between such structure and the dynamic structure of our natural and ordinary expression of emotion. We tend to interact with even obviously inanimate things as if they were quasi-animate, and so we're quite used to attributing emotional expression to other things even when we know that they are not expressing emotions at all. So we might joke about the anger of the wind if it slams a door, because door-slamming is something that can express anger, or we might talk about the ducks laughing, given that quacking and laughing can be similar sounds. So too, the suggestion is, with music. Even the more neutral language in which we talk about music is really something like this; the music moving 'up' and 'down' is obviously not physically moving in space. But we can make perfect sense in terms of resemblance of music going 'up' and 'down' -- and, in fact, when you do, you find that the mathematical similarities between moving up and down a musical scale and moving up and down on a spring are quite reasonably close. Now maybe the exact correspondence of direction has a conventional element -- I think I remember that some Asian languages have the directions reversed, so that what we would call a lower note they would call a higher note, and vice versa. But allowing for this, the relations are quite stable across different cultures. And this is true of a lot of things in music. Leaps in music have some formal similarities with physical leaps, pace in music with pace in locomotion, and so forth. In addition, music and dancing are very closely linked in all cultures, and we can see that some dancing fits various kinds of music, so we would expect any account of the expressiveness of dance to tell us something about the expressiveness of music by the 'fit' of the one to the other. 
On the other hand, one could well ask what any of the resemblances have to do with emotions in themselves; the resemblance theory seems to try to solve the problem by saying that music is similar to expressive thing X; but it leaves mysterious what makes the X expressive, and thus how it is that music can resemble X in the way in which X is expressive. But perhaps the resemblance theorist can reply that this is a research problem, not a fatal problem; that is, that we first have to identify what in music is expressive, which requires relating it to other expressive things, before we can determine exactly how that could be expressive.
It's noticeable that arousal and expression theories are both 'final' accounts (the expressiveness is linked to the objects of dispositions, either of the source of the music or of the music itself), while resemblance theories are 'formal' accounts (linking it to the structure either of the sound or, at times, the performance), and these theories, which all take musical expressiveness to be intrinsic in some way, can be contrasted with association theories, which give an extrinsic account. It's tempting, then, to suggest that we should really see expression as a unified thing that can be analyzed in terms of something like the Aristotelian four causes plus conventional context. It does make independent sense to say that the expressiveness of music has to connect up with the communicative nature of music; and music as a communication, like language as a communication, has a unified structure:
source of communication -> shareable signs for communicating -> target of communication
The communication is not the bare signs, or anything in the source or target, but something to which they are all contributing in some way, and the way in which they each participate can be analyzed causally. This matters because it seems natural to think of musical expression as (at least) some kind of emotional communication, even when it is not (as we noted above) a central example of emotional communication; and a four-causes-with-convention account makes a lot of sense of language. Think, too, of Prokofiev's "Peter and the Wolf" Op. 67, which is particularly interesting here because it is a deliberate attempt to pull together a number of different ways in which music can communicate. There's obviously a large conventional element to it, since it is put in a framework of a story; but the associations of the story are building on a prior expressiveness found in tempo, low and high notes, and the different sonorities of different instruments, which suggests something like a resemblance account, and the link with the story makes clear that these things are brought together with a definite purpose of arousing emotions, and possibly expression of them as well. There's a reason all of these have their proponents; they each seem to capture something you can genuinely find in the experience of music.
It's worth noting, too, that inevitably there are analogies between musical expressiveness and other forms of expressiveness, with the four major examples being everyday physical expression (natural gestures, spontaneous facial expressions, etc.), language, dance, and acting. No doubt there are some important differences among these different kinds of expressiveness, but any account of expressiveness really should be shedding light on all of these; one would expect all of them to have some fundamental base account even if there are differences arising from their distinctive features. But it seems plausible that each account alone will struggle with some features of expressiveness in these other contexts. For instance, resemblance theory can make some sense of music being like everyday physical expression, but it seems like it has to take everyday physical expression as just primitive -- at least, it's more challenging to say what spontaneous physical expressions would be resembling. Association theory can make sense of ways in which music is like language, but it does not do so well with some of the ways in which you can identify real resemblances between music and everyday physical expression. Plato's discussion of language in the Cratylus can be adapted to arguing that it's impossible to take alone either a conventionalist account like association theory or a resemblance account, if you want to capture what we do with language; and likewise, the perlocutionary aspect of language often seems to require something like an arousal account -- but obviously language is itself used expressively. And so it goes with other analogies to dance and to acting. It seems that a single-factor explanation is not going to be a generally adequate explanation.