We work in this field and so, in one of our approaches, have built such a lexicon. Ours is a small list and hence not comprehensive, but it is sufficient for our purposes. Now, I noticed that I had a lot more words tagged as negative than as positive. In numbers: 434 words were marked positive and 1348 negative. I had initially built a much smaller list by hand, and then expanded the lexicon automatically by (partially) using an approach (pdf) described by Italian researchers Andrea Esuli and Fabrizio Sebastiani.
They also created SentiWordNet. This extends WordNet, a popular language resource used in natural language processing that is, in essence, a dictionary-thesaurus on steroids (the good kind :-)). WordNet contains over 150,000 words and arranges them 'conceptually', by grouping together synonyms that make up unique 'senses' (these groups are called 'synsets'; it may be obvious why I didn't use the word 'sensually' to describe the arrangement). SentiWordNet augments this by attaching a positive and a negative score to each synset. (Here, I won't discuss why a synset can have both a positive and a negative score.) Words like 'horrible' or 'bad' have a high negative score, while 'awesome' and 'pleasant' are very positive.
Coming back to our question: seeing the difference in my list, I wondered whether this was a valid observation, or whether my lexicon was just poorly constructed, or whether it was a consequence of applying the expansion technique only in part. So I counted the number of positive and negative synsets in SentiWordNet (again, not going into details here). I found 14134 negative synsets and 12720 positive ones. Perhaps not a significant difference, but the negative side is still a little larger, by roughly 11% (and I haven't actually counted words, only sense groups). So it could just be that I chose or generated more negative words.
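For anyone who wants to try a similar count, here is a minimal sketch in Python. It is not the exact procedure I used. It assumes the tab-separated layout of the SentiWordNet 3.0 file (POS, ID, PosScore, NegScore, SynsetTerms, Gloss), and it assumes one possible rule: a synset counts as positive when its positive score is higher than its negative score, and as negative in the opposite case. The sample entries and their scores are made up for illustration and are not real SentiWordNet values; `count_polarity` is a hypothetical helper name.

```python
# Made-up entries in the assumed SentiWordNet 3.0 column layout:
# POS, ID, PosScore, NegScore, SynsetTerms, Gloss (tab-separated).
SAMPLE = (
    "# POS\tID\tPosScore\tNegScore\tSynsetTerms\tGloss\n"
    "a\t00000001\t0.75\t0\tpleasant#1\tillustrative entry\n"
    "a\t00000002\t0.875\t0\tawesome#1\tillustrative entry\n"
    "a\t00000003\t0\t0.875\thorrible#1\tillustrative entry\n"
    "a\t00000004\t0\t0.625\tbad#1\tillustrative entry\n"
    "a\t00000005\t0.125\t0.5\tdreadful#1\tillustrative entry\n"
    "n\t00000006\t0\t0\ttable#1\tillustrative entry\n"
)


def count_polarity(lines):
    """Count synsets by comparing PosScore and NegScore (assumed rule)."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for line in lines:
        # Skip comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        pos_score, neg_score = float(fields[2]), float(fields[3])
        if pos_score > neg_score:
            counts["positive"] += 1
        elif neg_score > pos_score:
            counts["negative"] += 1
        else:
            counts["neutral"] += 1
    return counts


sample_counts = count_polarity(SAMPLE.splitlines())
print(sample_counts)

# Negative-to-positive ratios for the figures reported in this post.
lexicon_ratio = 1348 / 434         # my hand-built and expanded lexicon
sentiwordnet_ratio = 14134 / 12720  # my SentiWordNet synset count
print(f"lexicon: {lexicon_ratio:.2f}, SentiWordNet: {sentiwordnet_ratio:.2f}")
```

To run it on the real resource, you would pass the lines of the actual file instead of `SAMPLE`. A different rule (say, a minimum score threshold) would give different totals, so the numbers need not match mine.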
This is all anecdotal and perhaps some fun for language geeks to talk about when they're stuck in a long queue and haven't brought a book along :-)