What is Social Buzz Top Shot?
An Accenture Data Analytics Virtual Experience
So tell me, what have you ever been popular for? I bet it was something silly. Tell me about it in the comments, if you don't mind.
Background
This article documents my Accenture Data Analytics Virtual Experience. The project is hosted on Forage, a website that curates exciting internships across multiple fields so that students can apply their technical and soft skills to building projects.
Scenario
I am part of a consultancy team working for a client called Social Buzz, a social media and content creation company that is experiencing unexpected rapid growth and is on the verge of an Initial Public Offering (IPO). Accenture is tasked with making sure the company is ready for a successful IPO.
As the team's Data Analyst, I am tasked with analysing Social Buzz content categories to highlight the top 5 categories with the largest aggregate popularity, based on the provided metrics. This is to be submitted with the following deliverables:
- A CSV file of the cleaned and modelled dataset for analysis.
- Data Visualization report highlighting relevant insights.
- A presentation of findings to the stakeholders and a ppt file.
Question for Analysis - "The Ask"
- What are the top 5 content categories based on aggregated sentiment scores?
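As a rough illustration of what "top 5 by aggregated score" means in practice, here is a minimal Python sketch. It is not part of the original workflow, which was done in Excel and Power BI. The field names `category` and `score` and the sample rows are assumptions made up for the example, not values from the Social Buzz dataset.

```python
from collections import defaultdict


def top_categories(rows, n=5):
    """Return the n categories with the largest summed score, highest first."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["category"]] += row["score"]
    # Sort by total (descending), then by name so that ties are deterministic.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:n]


# Made-up rows for illustration only (hypothetical field names and values).
sample_rows = [
    {"category": "animals", "score": 70},
    {"category": "science", "score": 60},
    {"category": "animals", "score": 30},
    {"category": "food", "score": 50},
    {"category": "science", "score": 10},
    {"category": "travel", "score": 20},
]

top_two = top_categories(sample_rows, n=2)
print(top_two)
```

Each category's scores are summed first and only then ranked, so a category with many moderately scored reactions can outrank one with a few highly scored reactions.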
Requirement Gathering
To answer the business question, the final dataset was modelled from the 4 of the 7 datasets that were relevant to the question: the User, Content, Reaction, and ReactionTypes tables.
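To show how separate tables like these can be combined into one modelled dataset, here is a minimal Python sketch of an inner join. The key columns (`content_id`, `type`), the `score` column, and the sample values are all assumptions for illustration. The real column names in the client files may differ, and the User table would be joined the same way on a user key.

```python
def merge_tables(reactions, content, reaction_types):
    """Inner-join reactions to content (on content_id) and to reaction types (on type)."""
    content_by_id = {row["content_id"]: row for row in content}
    types_by_name = {row["type"]: row for row in reaction_types}
    merged = []
    for reaction in reactions:
        item = content_by_id.get(reaction["content_id"])
        kind = types_by_name.get(reaction["type"])
        if item is None or kind is None:
            continue  # skip reactions with no matching content or reaction type
        merged.append({
            "content_id": reaction["content_id"],
            "category": item["category"],
            "reaction_type": reaction["type"],
            "score": kind["score"],
        })
    return merged


# Hypothetical sample data, not taken from the Social Buzz files.
content = [
    {"content_id": "c1", "category": "animals"},
    {"content_id": "c2", "category": "science"},
]
reaction_types = [
    {"type": "like", "score": 50},
    {"type": "hate", "score": 5},
]
reactions = [
    {"content_id": "c1", "type": "like"},
    {"content_id": "c2", "type": "hate"},
    {"content_id": "c9", "type": "like"},  # no matching content, so it is dropped
]

merged = merge_tables(reactions, content, reaction_types)
print(merged)
```

An inner join is used here so that every row in the result has both a category and a score. Whether unmatched reactions should be dropped or kept is a modelling decision that depends on the data.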
Data Cleaning and Modelling
Data Sources: The dataset was provided internally by the client as 7 separate CSV flat files with information on Content, Location, Profile, Reaction, ReactionTypes, Session and User.
The data structure and model for Social Buzz is explained here. The cleaning and modelling process was carried out in Microsoft Excel, following the steps below.
- Loading Dataset - The datasets were loaded all at once using the New Query function from the Data tab.
- Elimination of irrelevant columns - Columns that were not relevant to the report were deleted with Power Query Editor before being loaded.
- Inspecting for missing data points - Blank records and columns were inspected using the filter function and removed with the Remove Empty function in the query editor, both before and after merging.
- Confirming data types - Data types were inspected, changed where necessary, and confirmed to match the expected type for each column header.
- Merging - After each dataset was cleaned separately, the datasets were merged in the query editor using Merge Queries as New, in order to create a new table distinct from the original datasets.
- Regularization of data points - The Category column of the Content table and the Type column of the ReactionTypes table contained inconsistent spellings of the same data points (for example, "studying" and studying). These were made consistent across the merged dataset using the Find and Replace function.
- Duplicates - No duplicate records were found in either the individual or the merged datasets.
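The steps above were done by hand in Excel and Power Query. For readers who prefer code, a roughly equivalent sketch of two of them (removing blank records and regularizing category spellings) might look like this in Python. The column names are hypothetical, and lowercasing the category is an extra assumption on my part; the original only mentions stray quotation marks.

```python
def clean_rows(rows, required=("content_id", "category")):
    """Drop rows with blank required fields and normalize category spelling."""
    cleaned = []
    for row in rows:
        # Treat missing, None, or whitespace-only values as blank.
        if any(not (row.get(col) or "").strip() for col in required):
            continue
        fixed = dict(row)
        # Remove surrounding whitespace and quotation marks, then lowercase
        # (lowercasing is an assumption, not a step described in the article).
        fixed["category"] = row["category"].strip().strip('"').lower()
        cleaned.append(fixed)
    return cleaned


# Hypothetical sample rows, for illustration only.
raw_rows = [
    {"content_id": "c1", "category": '"studying"'},
    {"content_id": "c2", "category": "studying"},
    {"content_id": "", "category": "animals"},  # blank id, so it is removed
    {"content_id": "c3", "category": " Animals "},
]

cleaned_rows = clean_rows(raw_rows)
print([row["category"] for row in cleaned_rows])
```

Normalizing spellings before aggregation matters because "studying" with and without quotation marks would otherwise be counted as two different categories.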
Visualization and Findings
The cleaned and modelled dataset was visualized interactively using Power BI and can be explored below for more insights on the story.
Social Buzz Report Summary
After careful analysis in the context of the business problem, the following findings and recommendations are offered.
Observations & Recommendations
- Multiple and duplicate reactions by users on one piece of content: One user can register several types of reaction, and duplicate entries of the same reaction type, on a single piece of content. This reduces the credibility of the overall sentiment score, because the score no longer reflects a single feeling per user per content.
- A change in reaction policy to allow one reaction type per user per content is recommended, to make the overall sentiment score more credible.
- The top 5 categories should be reviewed for monetization opportunities.
- 62 (12%) of the 500 unique users did not use the platform in 1 year. User retention strategies are recommended.
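To make the one-reaction-per-content recommendation concrete, here is a minimal Python sketch that keeps a single reaction per user per content. Keeping the most recent reaction is my assumption; the recommendation does not say which one should win. The field names (`user_id`, `content_id`, `type`, `timestamp`) and the sample rows are hypothetical.

```python
def one_reaction_per_user(reactions):
    """Keep only the most recent reaction for each (user, content) pair."""
    latest = {}
    for reaction in reactions:
        key = (reaction["user_id"], reaction["content_id"])
        # ISO-formatted date strings compare correctly as plain strings.
        if key not in latest or reaction["timestamp"] > latest[key]["timestamp"]:
            latest[key] = reaction
    return list(latest.values())


# Hypothetical sample rows, for illustration only.
sample_reactions = [
    {"user_id": "u1", "content_id": "c1", "type": "like", "timestamp": "2021-01-01"},
    {"user_id": "u1", "content_id": "c1", "type": "hate", "timestamp": "2021-01-02"},
    {"user_id": "u1", "content_id": "c1", "type": "like", "timestamp": "2021-01-03"},
    {"user_id": "u2", "content_id": "c1", "type": "love", "timestamp": "2021-01-01"},
]

deduplicated = one_reaction_per_user(sample_reactions)
print([(r["user_id"], r["type"]) for r in deduplicated])
```

With a rule like this in place, each user contributes exactly one score to each piece of content, so the aggregated sentiment reflects one opinion per user.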