Discussion

Throughout this project, multiple problems with predicting enjoyable actor groups occurred. These problems stem from the way the sentiment analysis and the community detection were carried out.

The communities are based on the release years of the movies the actors generally appeared in and on the countries those movies came from. The community detection therefore did not find communities of enjoyable actors, but instead grouped actors by country and time period.

Another problem is that the sentiment analysis is based on words that have been categorized as positive or negative in general, not necessarily in the context of movies. Therefore, reviews of movies with a negative theme were scored as negative, and reviews of movies with a positive theme were scored as positive, regardless of how much the reviewer liked the movie.

Both parts of the project would therefore have to be changed in order to potentially get a better result.

Communities

As the project stands, the communities are very large. To give a more precise view of which actor groups are enjoyable together, the communities would have to be much smaller and contain only actors who actually tend to work together in many movies. This would require detecting communities in a way that groups only closely related actors, instead of producing an entire country as a single community. It would also give a larger difference in sentiment scores between communities, since actor groups would be compared with each other rather than the movies of two different countries.
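One possible way to obtain such tight groups is sketched below. It is a minimal illustration, not the method used in this project: actors are linked only if they share at least a minimum number of movies, and the groups are simply the connected components of the resulting graph. The function names, the `min_shared` threshold and the toy cast data are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_appearance_counts(casts):
    """Count how many movies each pair of actors shares."""
    counts = Counter()
    for cast in casts.values():
        for a, b in combinations(sorted(set(cast)), 2):
            counts[(a, b)] += 1
    return counts


def tight_groups(casts, min_shared=2):
    """Group actors linked by at least `min_shared` shared movies."""
    neighbours = defaultdict(set)
    for (a, b), n in co_appearance_counts(casts).items():
        if n >= min_shared:
            neighbours[a].add(b)
            neighbours[b].add(a)

    groups, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        group, stack = set(), [start]
        while stack:
            actor = stack.pop()
            if actor in group:
                continue
            group.add(actor)
            stack.extend(neighbours[actor] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical toy data: movie title -> cast list.
casts = {
    "Movie A": ["Ann", "Bob", "Cid"],
    "Movie B": ["Ann", "Bob"],
    "Movie C": ["Cid", "Dee", "Eve"],
    "Movie D": ["Dee", "Eve"],
}
groups = tight_groups(casts, min_shared=2)
print(groups)
```

With a threshold of one shared movie, all five actors in the toy data end up in a single group, which mirrors the problem of overly large communities. Raising the threshold splits them into small groups of actors who repeatedly work together and drops actors without a recurring partner.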

Sentiment analysis

Instead of working with a premade list of positive and negative words, one way to make the sentiment analysis more accurate for this project would be to build a new word list in which words are classified as positive or negative specifically according to how they are used in movie reviews.

This would mean that words like "scary", "murder" and "killing" would no longer be seen as negative in the movie reviews, since these words are generally used to describe horror movies. The analysis would instead focus on words describing the likeability of the movies.

Doing this would decrease the bias that occurs throughout this project and make the sentiment analysis more accurate when predicting enjoyable actor groups.
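The effect of such a change can be illustrated with a small sketch. It assumes that a review is scored as the average score of its words, and it approximates a movie-specific word list by simply ignoring words that describe a movie's theme. The word scores, the 1 to 9 scale, the `THEME_WORDS` set and the example review are all made up for illustration and are not taken from the word list used in this project.

```python
import re

# Hypothetical general-purpose word scores (1 = very negative, 9 = very positive).
# These values are made up for illustration and are not taken from any real list.
GENERAL_SCORES = {
    "scary": 2.5, "murder": 1.5, "killing": 1.6,
    "great": 7.8, "boring": 2.8, "brilliant": 8.0, "awful": 2.0,
}

# Words assumed to describe a movie's theme rather than the reviewer's opinion.
THEME_WORDS = {"scary", "murder", "killing"}


def review_score(text, scores, ignore=frozenset()):
    """Average the scores of all scored words in `text`, skipping `ignore`."""
    words = re.findall(r"[a-z']+", text.lower())
    values = [scores[w] for w in words if w in scores and w not in ignore]
    return sum(values) / len(values) if values else None


review = "A brilliant and scary movie about a murder, with great acting."
general = review_score(review, GENERAL_SCORES)
adjusted = review_score(review, GENERAL_SCORES, ignore=THEME_WORDS)
print(general, adjusted)
```

In this example the general score is pulled down by "scary" and "murder" even though the reviewer praises the movie, while the adjusted score reflects only the opinion words "brilliant" and "great".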

Final comments

In order to make this project work as intended, both of the above changes would have to be made, since changing only one of them would not fix the problem.

If only the communities were fixed, it might be possible to see a difference in sentiment scores between actor groups. However, the analysis would then suggest that actors in comedies, Disney movies, etc. are enjoyable and that actor groups appearing in horror movies make bad movies, which does not fix the problem.

If only the sentiment analysis were fixed, the sentiment scores would be more accurate. However, the analysis would still only compare countries and time periods, and thus would not find enjoyable actor groups.

Therefore, both changes would have to be implemented in order to try to find enjoyable actor groups.

References

[Dodds, 2011] - Dodds, P. S., Harris, K. D., Kloumann, I. M., Bliss, C. A., & Danforth, C. M. (2011). Temporal patterns of happiness and information in a global social network: Hedonometrics and Twitter. PLoS ONE, 6(12), e26752.

[Orman, 2011] - Orman, G. K., Labatut, V., & Cherifi, H. (2011). On Accuracy of Community Structure Discovery Algorithms. Journal of Convergence Information Technology, 6(11