
How to analyze the source of tweets on Twitter

INSTRUCTIONS TO CANDIDATES
ANSWER ALL QUESTIONS

8 Project Description

Due Week 13, Friday 11:59 pm

The company "Old School Business", also known as OSB, wants to start using social media to promote its business. They have approached your team with a request to find out what other businesses have done successfully using social media. OSB is particularly interested in Twitter and so has asked your group to perform the following analysis on Twitter. To begin, find a company that has a Twitter handle with over 10,000 followers and 1,500 tweets, then perform the following tasks using the chosen Twitter handle.

8.1 Analysing the source of the tweets

The company wants to know how other companies and the public post their tweets. They want to use this information to understand if there is a relationship between the source of a tweet and the retweeting behavior.

  1. Use the rtweet library to download 1000 tweets that the company posted. Save these tweets as “tweets.company”.
  2. Use the rtweet library to download 1000 tweets about the company you selected. Save these tweets as “tweets.public”.
  3. Examine the source column of both the company and the public tweets to see the source of tweets. Find out how many different levels of sources exist in the public and company tweets.
  4. Draw a bar plot of the top 10 most frequent tweet sources for both company tweets and the public tweets. Label each bar with the source name.
  5. Comment on your bar plots.
  6. Using an appropriate statistical test, test whether retweeting is independent of the source from which the public posted their tweets. Use the “source” and “is_retweet” columns to get the source and retweet information. Group the sources as: “Salesforce - Social Studio”, “Twitter for Android”, “Twitter for iPad”, “Twitter for iPhone”, “Twitter Web App”, “Twitter Web Client” and “Other”.
  7. What is the conclusion of the test? Interpret your results.
  8. Calculate a 95% confidence interval of the text width used in the tweets that the company posted. Use the “display_text_width” column to get this information.
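
Items 6 and 8 come down to two standard calculations: a chi-squared test of independence on a source-by-retweet contingency table, and a confidence interval for a mean. The sketch below uses Python with only the standard library, purely to show the arithmetic; the assignment itself asks for R, where `chisq.test()` and `t.test()` perform the corresponding calculations. The function names, the toy data, and the use of a normal quantile in place of a t quantile (a reasonable approximation for around 1000 tweets) are assumptions made for illustration.

```python
import math
from collections import Counter
from statistics import NormalDist, mean, stdev

# Named source groups from item 6; any other label is folded into "Other".
NAMED_SOURCES = {
    "Salesforce - Social Studio", "Twitter for Android", "Twitter for iPad",
    "Twitter for iPhone", "Twitter Web App", "Twitter Web Client",
}

def group_source(source):
    """Map a raw source label to one of the named groups or "Other"."""
    return source if source in NAMED_SOURCES else "Other"

def contingency_table(sources, is_retweet):
    """Build rows of [not retweet, retweet] counts, one row per grouped source."""
    counts = Counter((group_source(s), bool(r)) for s, r in zip(sources, is_retweet))
    groups = sorted({g for g, _ in counts})
    return groups, [[counts[(g, False)], counts[(g, True)]] for g in groups]

def chi_squared(table):
    """Return (statistic, degrees of freedom) for a table of observed counts.

    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def mean_ci(values, level=0.95):
    """Normal-approximation confidence interval for the mean of `values`."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = mean(values)
    half_width = z * stdev(values) / math.sqrt(len(values))
    return centre - half_width, centre + half_width

# Toy data standing in for the "source", "is_retweet" and
# "display_text_width" columns (made up, not real tweets).
sources = ["Twitter for iPhone"] * 30 + ["Twitter Web App"] * 30
retweets = [True] * 20 + [False] * 10 + [True] * 10 + [False] * 20
groups, table = contingency_table(sources, retweets)
stat, dof = chi_squared(table)
print(groups, table, round(stat, 3), dof)
print(mean_ci([120, 140, 95, 280, 133]))
```

The statistic is compared against a chi-squared distribution with `dof` degrees of freedom to obtain a p-value; a small p-value is evidence against independence of source and retweeting.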

8.2 Themes in public and company tweets

To be successful on Twitter, a company needs to provide useful information to its followers and encourage customers to talk about their posts. We will examine this information so that we can suggest what OSB can tweet about. We do not want to present all tweets to OSB, so we must identify if there is a set of common tweet themes between the public and company tweets. This process involves:

  1. Combine tweets.public and tweets.company and save as tweets.
  2. Clean and pre-process the data (use TFIDF weights in your analysis).
  3. Compute the most appropriate number of clusters for the combined tweets using the elbow method with cosine distance.
  4. Cluster the tweets using the most appropriate clustering method.
  5. Visualize your clustering in 2-dimensional vector space. Show each cluster in a different colour and the tweets in tweets.public and tweets.company with different symbols in your visualization.
  6. Comment on your visualization.
  7. Compute the proportion of tweets.public in each cluster. Print these proportions.
  8. Which clusters are dominated by the public and which are dominated by the company?
  9. Draw a word cloud and a dendrogram of these two clusters to understand the theme of the clusters.
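
Steps 2 and 3 rely on TF-IDF weights and cosine distance. As a minimal sketch of what those two quantities are, the Python below (standard library only) weights each term by `tf * log(N / df)` and computes the cosine distance between two weighted tweets. The weighting variant, the whitespace tokenisation, the function names, and the three toy tweets are all assumptions for illustration; the assignment itself is to be done in R, and the clustering and elbow steps are not shown here.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenised document, using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * math.log(n_docs / doc_freq[term])
         for term, count in Counter(doc).items()}
        for doc in docs
    ]

def cosine_distance(a, b):
    """1 - cosine similarity of two sparse vectors; 1.0 if either is all zeros."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

# Toy tweets (made up): the first two share a theme, the third does not.
tweets = [
    "new store opening this weekend",
    "store opening hours this weekend",
    "love the new coffee blend",
]
docs = [tweet.split() for tweet in tweets]
weights = tfidf(docs)
d01 = cosine_distance(weights[0], weights[1])
d02 = cosine_distance(weights[0], weights[2])
print(round(d01, 3), round(d02, 3))
```

Tweets that share many distinctive terms end up with a small cosine distance, which is why the first pair above is closer than the second. Cosine distance ignores tweet length, so a short and a long tweet on the same theme can still be grouped together.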

8.3 Following friends

We are unsure if friending leads to an increase in popularity. To examine this, we will do the following (you can use the twitteR package in this section):

  1. Find the most popular 10 friends of the chosen Twitter handle.
  2. Obtain a 1.5-degree egocentric graph centered at the chosen Twitter handle and plot the graph. The egocentric graph should contain the most popular 10 friends of the chosen Twitter handle.
  3. Compute the betweenness centrality score for each Twitter handle in your graph. List the top 3 most central people in your graph according to the betweenness centrality.
  4. Comment on your results.
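
The betweenness centrality in step 3 counts, for each handle, the share of shortest paths between other pairs of handles that pass through it. The Python sketch below (standard library only) computes it with Brandes' algorithm on a small directed "follows" graph. The graph, the handle names `ego`, `f1` and `f2`, and the choice to treat edges as directed and unweighted are assumptions for illustration; in R, the igraph package's `betweenness()` function computes the same measure.

```python
from collections import deque

def betweenness(adj):
    """Betweenness centrality for a directed, unweighted graph.

    `adj` maps each node to the list of nodes it points to.
    """
    nodes = sorted(set(adj) | {w for nbrs in adj.values() for w in nbrs})
    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Breadth-first search, counting shortest paths from `source`.
        order = []
        preds = {v: [] for v in nodes}
        n_paths = {v: 0 for v in nodes}
        n_paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj.get(v, ()):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in nodes}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score

# Toy 1.5-degree egocentric graph: the ego follows two friends,
# and the remaining edges are follows among those nodes.
graph = {"ego": ["f1", "f2"], "f1": ["f2"], "f2": ["ego"]}
scores = betweenness(graph)
top3 = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:3]
print(top3)
```

In an egocentric graph the ego usually scores highly by construction, so the friends that rank next to it tend to be the more informative part of the result.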

Important notes:

Note that in Section 8.3, depending on the friends of the chosen Twitter handle, you may reach the rate limit of the Twitter API. I strongly recommend that you save your objects as an RData file once you have downloaded the friends, so that you can continue downloading friends the following day or with a different authentication key. For more information on how to save your objects, see: https://stackoverflow.com/questions/19967478/how-to-save-data-file-into-rdata

See https://developer.twitter.com/en/docs/basics/rate-limiting.html for more information on the rate limit.

It is also recommended that you save your tweets.public and tweets.company once you have downloaded them. Otherwise you will get different tweets each time you run your script and you will need to change your clustering.

The company wants the above three-part analysis to be written up as a professional report. Each part should have its own section of the report and all questions should have thoughtful answers. Include all the R code along with its output in your assignment. Output without the code, or code without the output, will result in zero marks for the relevant section.

 
