
Text and Web Mining Analysis of Nike's 2018 Just Do It Campaign





Colin Kaepernick, Nike, and the Consequences of the 2018 Just Do It Campaign:
An Analysis of Social Media, Press, and Stock Reactions













Kwong Yau
Sarah Erbes
Scott Mow
& Jordan Cherry

December 14, 2018









Executive Summary
This project attempts to analyze how Nike’s use of Colin Kaepernick as the lead athlete for its 2018 marketing campaign affected the company’s stock price during that period.
We outlined a three-part approach:
·        Examine the social media reaction to determine overall sentiment about Colin Kaepernick’s partnership with Nike
·        Examine press reactions to determine the sentiment of press coverage of the partnership
·        Examine financial data for possible correlations between media reactions (both traditional and social) and Nike’s subsequent financial performance, measured for simplicity as the daily closing stock price
By using RapidMiner and R for web and text mining to build a corpus of news articles, blogs and tweets, we aim to draw conclusions about the sentiment surrounding Nike during that period and to gain insight into the possible value and peril of running politically controversial advertising campaigns.

Background

Colin Kaepernick, former quarterback for the NFL’s San Francisco 49ers, began his protest of police brutality by refusing to stand for the national anthem in the 2016 NFL preseason. When asked about his decision not to stand for the anthem, Kaepernick stated his reasons for protesting. The controversy regarding his actions quickly became one of the biggest news stories of the year, and a dividing line in American politics. Sports Illustrated honored him with its Muhammad Ali Legacy Award, while opponents widely and passionately criticized him. Although Kaepernick was released from his contract with the 49ers at the end of the NFL season, Nike maintained its sponsorship contract with the quarterback, even as he remained unsigned by any other NFL team.
On September 3, 2018, Colin Kaepernick and Nike sent out tweets signaling that Kaepernick would be the face of Nike’s 2018 “Just Do It” campaign. The reaction on Twitter was immediate; opponents sharply criticized Nike while supporters praised the company just as enthusiastically. As of the time of this report, Colin Kaepernick remains the face of Nike’s Just Do It campaign.
The caption of Kaepernick’s Nike ad reads, “Believe in something, even if it costs you everything.” The question we seek to answer is what, if anything, did the ad campaign cost Nike?

Methods

We used RapidMiner and R to conduct our analyses, and determined that a multi-pronged approach was necessary. While financial data and media articles were relatively easy to obtain, limits on RapidMiner’s and R’s interfaces with Twitter made it impossible to access tweets from September 2018, when the reactions specific to Nike’s Colin Kaepernick ad were trending. In addition, Twitter imposes hefty charges for “filtered access” to archived content; for example, entry-level pricing is $1,250 for a one-time request limited to 1 million tweets over a 40-day period.

However, both Nike and Colin Kaepernick are still widely discussed on Twitter, and many of the original hashtags relating to Kaepernick’s protest remain active. The tweets collected were used to gauge overall sentiment regarding Kaepernick and Nike and to identify the themes and concepts most often connected to both.

Article Analysis

One part of our research analyzed the sentiment of articles released around the time of the Nike Colin Kaepernick ad. Our goal was to see whether the sentiment of the articles aligned with the movement of Nike’s stock price. Our article list is a combination of news articles and opinion pieces written about the controversial advertisement.

Clustering Article Analysis


The RapidMiner process above was used for a cluster analysis of the articles and opinion blogs that had been collected. The sub-processes under the “Process Documents” operator were Transform Cases, Tokenize, Filter Tokens, Stem (Porter), Filter Stopwords, and Generate n-Grams. The articles were limited to 4 clusters. Below is the word list output for the articles, showing the top 10 words:


Below is the output of the cluster analysis:

The two largest clusters were 1 and 3; the output for these clusters follows. Below is the centroid table output for the top ten words in Cluster 1:




Top ten words for Cluster 3:


While Cluster 3 contains more of the words that would be expected in a marketing ad for a clothing company, there are still politically driven words at the top of the list. Cluster 1 also contains heavily political words in its top ten, especially ‘black_male’, which is at the center of this political issue.
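The clustering steps described in this section can be approximated outside RapidMiner. The following is a minimal base R sketch, not the process we ran: the four documents and the stopword list are invented for illustration, stemming is skipped to avoid extra packages, and only 2 clusters are requested because the toy corpus has just four documents (the report used 4 clusters on the full article set).

```r
# Toy documents standing in for the collected articles (invented for illustration)
docs <- c(
  "Nike releases new ad campaign with Kaepernick",
  "Protest during the anthem divides football fans",
  "Nike stock and sales after the ad campaign",
  "Fans debate the anthem protest and police brutality"
)

# Tiny hypothetical stopword list; a real run would use a full list
stop_words <- c("the", "and", "with", "after", "during", "new")

tokenize <- function(text) {
  tokens <- strsplit(tolower(text), "[^a-z]+")[[1]]  # transform cases + tokenize
  tokens <- tokens[nchar(tokens) > 2]                 # filter tokens by length
  tokens[!tokens %in% stop_words]                     # filter stopwords
}

# Append bigrams joined with "_" (the same style as 'black_male')
add_bigrams <- function(tokens) {
  if (length(tokens) < 2) return(tokens)
  c(tokens, paste(head(tokens, -1), tail(tokens, -1), sep = "_"))
}

token_lists <- lapply(docs, function(d) add_bigrams(tokenize(d)))
vocab <- sort(unique(unlist(token_lists)))

# Document-term matrix: one row per document, one column per term
dtm <- t(sapply(token_lists, function(tk) table(factor(tk, levels = vocab))))
storage.mode(dtm) <- "double"

set.seed(1)  # k-means starts from random centers
clusters <- kmeans(dtm, centers = 2)

# Highest-weighted terms in each cluster centroid
top_terms <- apply(clusters$centers, 1, function(ctr) {
  names(sort(ctr, decreasing = TRUE))[1:3]
})
print(clusters$cluster)
print(top_terms)
```

Reading the highest-weighted terms of each centroid is the same idea as the centroid tables above: it shows which words characterize each group of articles.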

Analyzing the Sentiment of the Articles
We used the Aylien extension to analyze the sentiment of the articles based on their text. The results from this analysis were 11 positive articles, 0 negative articles, and 34 neutral articles, with an average polarity confidence of 0.9. It was surprising that the Aylien algorithm found none of the articles to be negative. There were 30 objective articles and 15 subjective articles, with a confidence of 1. It is interesting to note that the articles marked as subjective were all opinion pieces.
We then ran the same articles through the Rosette sentiment analysis, which produced a different result: 30 neutral, 3 positive, and 2 negative articles. The results skewed even more heavily neutral than the Aylien analysis, but Rosette was also able to identify some negative articles. The sentiment score had a minimum of 0.545, a maximum of 1, and an average of 0.857. Below is the pie chart output from RapidMiner.


Both of the prediction models below attempt to predict the sentiment from the text mined from each article. The sentiment labels come from the Aylien sentiment analysis conducted above.
Predicting Polarity with Naïve Bayes:


This model used the text and publish date to try to predict the polarity. As the results show, the Naïve Bayes model predicted the polarity of the articles poorly: its accuracy was only 12.5%, and it correctly predicted only some of the positive articles. Next, we tried a decision tree to better predict the outcome.

Decision Tree Model:

The decision tree model predicted the classification much better than the Naïve Bayes model, with a much higher accuracy of 60.83%, but it still has issues. The model predicted every article as neutral, so while the neutral recall is 100%, the recall for the other classes is 0%.
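To see why a model that labels everything neutral can still post a respectable accuracy, it helps to compute accuracy and per-class recall by hand. The sketch below is illustrative only: the `actual` labels are a small hypothetical test set, not our articles, so the numbers will not match the 60.83% reported by RapidMiner.

```r
# Hypothetical held-out labels, for illustration only (not the report's data)
classes <- c("negative", "neutral", "positive")
actual <- factor(
  c("neutral", "neutral", "neutral", "positive",
    "neutral", "positive", "neutral", "neutral"),
  levels = classes
)

# A model that predicts the majority class for every article
predicted <- factor(rep("neutral", length(actual)), levels = classes)

# Rows are predictions, columns are true labels
confusion <- table(predicted = predicted, actual = actual)

# Accuracy: share of articles on the diagonal of the confusion matrix
accuracy <- sum(diag(confusion)) / sum(confusion)

# Recall per class: correct predictions divided by true members of the class
# (NaN when the test set contains no article of that class)
recall <- setNames(
  as.numeric(diag(confusion)) / as.numeric(colSums(confusion)),
  classes
)

print(confusion)
print(accuracy)
print(recall)
```

In this toy set the accuracy simply equals the share of neutral articles, which is why accuracy alone overstates how useful such a model is when one class dominates.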

Article Conclusions:
There were a few key points to take away from the analysis of the articles. The first is the insight the clustering provided. The word lists and the clusters both point to social activism words being present in these articles. This confirms our thinking that this advertisement was indeed more than just another clothing ad. Surprisingly, there was not as much negative sentiment toward the advertisement as our team would have expected. The positive reactions align with the stock market increase that Nike saw immediately after the advertisement was released. The final takeaway is that neither Naïve Bayes nor the decision tree was able to successfully predict the sentiment of an article. If further research were to be done, we would need to find a better prediction model.

 

Establishing Corpora of Tweets

We used RapidMiner to establish our corpora of tweets. Using RapidMiner’s Twitter interface, we were able to conduct searches for “Colin Kaepernick”, “Nike”, “#boycottNike”, and “#TakeAKnee.” The two hashtags were used by opponents and supporters, respectively. The searches specifically excluded retweets and tweets with URL links in order to emphasize more original tweets. These corpora were exported to Excel documents for further analysis.
A screenshot of our RapidMiner process for establishing our corpora can be found in Appendix A.
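For readers without RapidMiner, the exclusion of retweets and linked tweets can be approximated after collection. This is a rough base R sketch under stated assumptions: the sample tweets are invented, and the two patterns (a leading "RT @" for retweets, "http://" or "https://" for links) are simple heuristics rather than the filters our RapidMiner search applied.

```r
# Invented sample tweets, for illustration only
tweets <- c(
  "RT @someone: Just do it",
  "Nike ad is bold https://example.com/x",
  "Buying new Nike shoes today",
  "#TakeAKnee matters"
)

# Heuristic flags: classic retweets start with "RT @"; links contain a URL scheme
is_retweet <- grepl("^RT @", tweets)
has_link   <- grepl("https?://", tweets)

# Keep only tweets that are neither retweets nor link shares
original_tweets <- tweets[!is_retweet & !has_link]
print(original_tweets)
```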

Sentiment Analysis

Once the corpora of tweets were collected, we used RapidMiner’s Aylien extension to perform sentiment analyses on each of the corpora. Unfortunately, we were unable to access Aylien’s dictionary for positive and negative terms, and thus the sentiment analysis operator represents a “black box” process in our analysis. However, the Aylien extension is considered reliable for the purposes of this course.
A screenshot of our RapidMiner process for analyzing corpora can be found in Appendix A.

Word Cloud Creation

Using Excel, we created CSV copies of the corpora that could be imported into R to create word clouds. Our R script used the tm package for text mining and the wordcloud package for creating the word clouds.
The tm package was used to perform the following steps:
·        Create a corpus from the imported CSV files
·        Transform cases to all lower case
·        Remove numbers from the text
·        Remove stopwords
·        Strip extra white space
·        Stem the terms in the text
·        Remove the specific search terms from the text[1]
After we used the tm package to clean the text of the tweets, we used the wordcloud package to create word clouds showing the terms most associated with each search. All default settings were used for this process. The resulting word clouds can be found in the “Results” section.
The R script for creating word clouds can be found in Appendix B.


Stock Price
The daily stock prices were scraped from Yahoo.com.  The URL follows a consistent pattern, so we just had to substitute each company's stock symbol.  Using the “Get Pages” operator, we were able to read the raw HTML for each company's page.  There was a common class name for each row containing the data, so we were able to parse the close date and value using the “Cut Documents” operator with an XPath query type.  From there, dummy ids were used to join the data into a table.  The full RapidMiner process can be found in Appendix C.
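The scraping logic can be sketched in base R as well. Everything specific here is an assumption for illustration: the URL template, the ticker symbols, and the row class name `price-row` are placeholders that do not come from our RapidMiner process, and a short inline HTML string stands in for the downloaded page. The two sample rows reuse the Nike closing prices quoted later in this report.

```r
# Hypothetical URL template and ticker symbols (not taken from the report)
url_template <- "https://finance.yahoo.com/quote/%s/history"
symbols <- c(nike = "NKE", adidas = "ADDYY", under_armour = "UAA")
urls <- sprintf(url_template, symbols)  # one URL per company
print(urls)

# Stand-in for the raw HTML of one page; "price-row" is a made-up class name
sample_html <- paste0(
  "<table>",
  "<tr class=\"price-row\"><td>Sep 03, 2018</td><td>82.20</td></tr>",
  "<tr class=\"price-row\"><td>Sep 21, 2018</td><td>85.55</td></tr>",
  "</table>"
)

# Pull out each data row, then capture the date and the closing value
row_pattern <- "<tr class=\"price-row\"><td>([^<]+)</td><td>([^<]+)</td></tr>"
rows <- regmatches(sample_html, gregexpr(row_pattern, sample_html, perl = TRUE))[[1]]

prices <- data.frame(
  symbol = "NKE",
  date   = sub(row_pattern, "\\1", rows, perl = TRUE),
  close  = as.numeric(sub(row_pattern, "\\2", rows, perl = TRUE)),
  stringsAsFactors = FALSE
)
print(prices)
```

A regular expression is enough for this fixed toy snippet; for real pages an HTML parser with XPath support, as used in our RapidMiner process, is the more robust choice.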

Sentiment Analysis Results

The Aylien Sentiment analysis of the search “Nike” showed that of the 550 tweets collected, 291 of them were neutral, 138 were negative, and 121 were positive.
Graphs of the polarity and subjectivity of the search results for “Nike” can be seen below:






The Aylien sentiment analysis of the search for “Colin Kaepernick” found that out of 500 tweets, 243 were neutral, 64 were positive, and 193 were negative. 375 of the tweets were subjective. 
Graphs of the polarity and subjectivity of the search results for “Colin Kaepernick” can be seen below:





The Aylien sentiment analysis of the search for “#boycottNike” found that out of 77 tweets, 34 were negative, 13 were positive, and 30 were neutral.  Most of the tweets were considered subjective.
Graphs of the polarity and subjectivity of the search results for “#boycottNike” can be seen below:




The Aylien sentiment analysis of the search for “#TakeAKnee” found that out of 85 tweets, 46 were negative, 13 were positive, and 26 were neutral. A strong majority of the tweets were subjective.
Graphs of the polarity and subjectivity of the search results for “#TakeAKnee” can be seen below:




Twitter Analysis
 We focused on the sentiment analysis statistics in order to gain a high-level view of the tweets, and to gauge engagement with each of the search terms. The word clouds were analyzed with the goal of finding patterns in the content of the tweets and the terms and concepts that were used within the tweet texts.

Sentiment Analysis

The sentiment analysis of the tweets reveals several important facts for understanding the social media reaction to Nike’s use of Colin Kaepernick in the Just Do It campaign. First, there is an overwhelming trend toward subjectivity rather than objectivity. This indicates that the campaign not only generated news, but reactions to that news. Second, neutral was the most common classification for both “Nike” and “Colin Kaepernick,” but negative tweets outnumbered positive ones in every search: narrowly for “Nike” (138 to 121), by roughly three to one for “Colin Kaepernick” (193 to 64), and as the single largest category for both hashtags. It is important to understand that the negative sentiment is more indicative of negative word choice than of specific attitudes toward Nike, Colin Kaepernick, or the ad campaign; the content of the tweets is discussed further below. Third, the results for searches for Nike and Colin Kaepernick were robust (each search hit its respective limit), while the hashtag searches each returned fewer than 100 tweets. This could indicate that the hashtags themselves have gone out of usage, that the organization of social media regarding Nike and Colin Kaepernick has become less centralized, or that the discussion of Nike and Kaepernick has become less political.
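The counts reported in the Sentiment Analysis Results section can be placed side by side to compare the four searches directly. The base R sketch below uses only those reported counts; the column names and the "net" measure (positive minus negative) are our own illustrative choices, not outputs of the Aylien extension.

```r
# Polarity counts as reported in the Sentiment Analysis Results section
sentiment <- data.frame(
  search   = c("Nike", "Colin Kaepernick", "#boycottNike", "#TakeAKnee"),
  positive = c(121, 64, 13, 13),
  neutral  = c(291, 243, 30, 26),
  negative = c(138, 193, 34, 46),
  stringsAsFactors = FALSE
)

counts <- as.matrix(sentiment[, c("positive", "neutral", "negative")])

# Total tweets per search and the percentage share of each polarity
sentiment$total <- rowSums(counts)
shares <- round(100 * counts / sentiment$total, 1)

# Most frequent polarity per search, and positive minus negative
sentiment$most_common <- colnames(counts)[max.col(counts, ties.method = "first")]
sentiment$net <- sentiment$positive - sentiment$negative

print(sentiment)
print(shares)
```

Laid out this way, the totals match the reported corpus sizes (550, 500, 77, and 85 tweets), and the net column is negative for every search.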

Word Cloud Analysis


Word Cloud for Nike: 


The word clouds offer further insight into the concepts associated with each search. The word cloud for Nike revealed the types of terms that one might normally associate with a shoe company: a mix of generic words such as “like,” “new,” “wear,” and “buy.” Adidas, Nike’s main competitor, is also mentioned, along with “brand” and “black”, which together make the name of a smaller rival, Brand Black. Neither Colin Kaepernick nor any term associated with his protest is found within the Nike word cloud, which may indicate that he is not essential to the brand’s identity or perception, although the word “just” is one of the largest entries. This could indicate either Nike’s slogan, “just do it,” or the ad campaign which features Colin Kaepernick.

This is a word cloud of tweets involving Colin Kaepernick: 


The search for Colin Kaepernick is a mix of both football and protest-related terms. Football terms like “NFL,” “Redskins,” “Sanchez,” “play,” “team” and “job” are heavily featured, which would indicate interest in the possibility of Colin Kaepernick joining the Washington Redskins after their quarterback was injured. This is particularly interesting because Kaepernick being offered an NFL job again would possibly affect the main thesis of Nike’s ad campaign, “believe in something, even if it costs you everything.” In addition to football terms, there are several terms -- “kneel,” “protest,” “racist,” “Trump” and “police” -- that indicate Kaepernick’s protest is central to his public perception and identity. Furthermore, terms like “Ruben,” “Foster,” “Kareem,” and “Hunt” indicate a debate over Kaepernick not being offered jobs while NFL teams employ players accused of domestic violence, like Kareem Hunt and Reuben Foster. (It should be noted that both Kareem Hunt and Reuben Foster were released from their respective teams; Reuben Foster was claimed off waivers by the Washington Redskins.)

Word Cloud for "#boycottNike": 

The searches for the hashtags return far more openly political results. The search for “#boycottNike” returned terms like “America,” “liberal,” “conservative,” and “anti,” which could be combined with any number of the descriptive terms returned by the search. These terms indicate that users critical of Colin Kaepernick are using larger political concepts such as America, liberalism or conservatism in their arguments, and that Colin Kaepernick is part of a larger political argument. This conclusion is further substantiated by the return of the terms “Keurig,” “Starbucks,” and “Elle Magazine,” which, while completely unrelated to Colin Kaepernick, are all companies that have also been subject to political criticism.

Word Cloud for #TakeAKnee: 



The search for “#TakeAKnee” also yielded heavily political results. Terms like “Black Lives Matter,” “shot,” “white,” “black,” “racist,” and “flag” have the same nods toward wide political debates and broad ideologies. However, there are more specific terms found in this search, particularly individuals. “OliverLNorth,” “TalbertSwan,” “realdonaldtrump,” and “daddydaddymac,” are all mentioned. Oliver North is a Reagan-era conservative political figure. Talbert Swan is a bishop often affiliated with police reform. @realdonaldtrump is the Twitter handle of US President Donald Trump, and @daddydaddymac is an active liberal Twitter user.  This suggests that while critics of Colin Kaepernick speak in broader political terms, supporters of Colin Kaepernick are engaging more with individual users of Twitter, including the President. Additionally, it should be noted that the term “Kareem Hunt” also appears in this word cloud. This could indicate that the users tweeting about Kareem Hunt in the “Kaepernick” search would be sympathetic toward Colin Kaepernick.

In conclusion, the word cloud data indicates that the tweets about the broader subjects of Nike and Colin Kaepernick are less politically charged than the tweets using the hashtags #TakeAKnee and #BoycottNike. Additionally, the number of tweets from each search suggests that the number of users actively tweeting with the hashtags is far smaller than the number of users engaging in the wider discussions of Nike and Colin Kaepernick. The data also suggests that while Colin Kaepernick’s public perception is intimately tied to his protest movement, and to political discussion more broadly, Nike’s public image has not become tied to the same protest, even though Colin Kaepernick is the face of its “Just Do It” ad campaign.

Stock Price

It appears that Nike saw some level of success with the Kaepernick campaign. Nike saw a small surge in stock price for about a month following the September 3rd announcement (a $3.35 increase, from $82.20 on September 3rd to $85.55 on September 21st). The effects did not last, however, and the stock price has since fallen back to the pre-campaign levels seen in June ($74.32). When comparing Nike with competitors Adidas and Under Armour, it appears that Adidas has seen a similar change in stock price. Under Armour does not appear to have been affected and has actually grown in the last month. There does seem to be evidence of a significant event around September 3rd for both Nike and Adidas. The following graph shows the change in price over the last five months. Both companies see an increase prior to September, a small dip early in the month followed by an increase, and a slow decline through mid-December.

More interesting is the overall trend: both Nike and Adidas have seen an average loss in stock price of around twelve cents per day since September 3rd.
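The arithmetic behind these figures can be reproduced in a few lines of R. The first part uses the two Nike prices quoted above. The second part shows one plausible way to estimate an average daily change, as the slope of a linear fit; this is an assumption about method, and the function is checked on a synthetic series rather than on market data.

```r
# Nike closing prices quoted in the text above
price_sep_03 <- 82.20
price_sep_21 <- 85.55

# Size of the post-announcement surge, in dollars and as a percentage
surge_dollars <- price_sep_21 - price_sep_03
surge_percent <- 100 * surge_dollars / price_sep_03
cat(sprintf("Surge: $%.2f (%.2f%%)\n", surge_dollars, surge_percent))

# Average daily change, estimated as the slope of a linear fit of
# closing price on the day index (0, 1, 2, ...)
daily_trend <- function(close) {
  day <- seq_along(close) - 1
  unname(coef(lm(close ~ day))[2])
}

# Synthetic series (NOT market data) that falls exactly 12 cents per day,
# used only to confirm that the function recovers the slope
synthetic_close <- 82.20 - 0.12 * (0:9)
trend <- daily_trend(synthetic_close)
cat(sprintf("Estimated daily change: %.2f\n", trend))
```

Applied to a data frame of scraped closing prices, the same function would give a per-day figure comparable to the twelve-cent loss described above.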

It would be interesting to see if the sentiment surrounding Nike could be used as a predictor for both companies’ stock prices. Twitter is a better source of sentiment than articles, due to its ease of use and the size of the corpora. Unfortunately, due to the restrictions discussed earlier (limited free queryable history and expensive one-time fees), we were unable to procure tweets from early September.

Conclusions

Within the scope of this project, our group recognizes that we are assuming that public sentiment is the only force driving the stock prices of Nike and its competitors. In reality, however, a plethora of other factors can affect the stock price and financial performance of a company, such as leadership changes and earnings reports.
The sentiment of the news articles dating to around September 3rd was strongly neutral. Stock prices increased near that date, and our group believes it is likely that a sentiment analysis of tweets from around that date would show more positive tweets than neutral or negative ones. As we used the text mining tools more, we started to suspect that the sentiment analysis algorithms were not as robust as we initially thought, as the analyses frequently produced more “neutral” results than strongly positive or negative ones. We feel that our search queries were polarizing enough to expect fewer “neutral” results.
If we were able to recreate this research, we would do a few things differently. First, having the funding to access historical tweets would allow us to analyze the sentiment during the time of the ad release and compare it to the sentiment of the articles and to stock prices. Second, access to that data would allow us to build a predictive model and apply this research to future marketing campaigns. Finally, a more robust search for other drivers of the stock market changes would allow us to see whether the movement in stock prices can be narrowed down to the ad alone.
We learned that the Nike decision to use Colin Kaepernick as the star in their ads continues to drive social media discussion around larger social issues such as police brutality and the Black Lives Matter movement, but that the stock prices of Nike and its competitors seem relatively unaffected by the ongoing social media coverage.





Appendix A: Screenshot of the RapidMiner Process for Assembling Tweet Corpora



 

Appendix B:  R Code for Word Cloud Visualizations



### Creates Word Cloud Visualizations from imported CSV files
### Used for MSA 8225 Final Project


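### Loads packages (the script only needs these three) ###
library(tm)          # corpus creation and text cleaning
library(wordcloud)   # word cloud plotting
library(SnowballC)   # stemming backend used by stemDocument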

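### Reads in CSV List of Tweets
### stringsAsFactors = FALSE keeps the tweet text as character strings
boycottnike_csv <- read.csv(file = "boycottnike.csv", header = TRUE, stringsAsFactors = FALSE)
kap_csv <- read.csv(file = "kaepernick.csv", header = TRUE, stringsAsFactors = FALSE)
nike_csv <- read.csv(file = "nike.csv", header = TRUE, stringsAsFactors = FALSE)
nikekap_csv <- read.csv(file = "nikekap.csv", header = TRUE, stringsAsFactors = FALSE)
takeaknee_csv <- read.csv(file = "takeaknee.csv", header = TRUE, stringsAsFactors = FALSE)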

### Create Corpus from Tweet Vector ###
boycottnike_corpus <- Corpus(VectorSource(boycottnike_csv$Text))
kap_corpus <- Corpus(VectorSource(kap_csv$Text))
nike_corpus <- Corpus(VectorSource(nike_csv$Text))
nikekap_corpus <- Corpus(VectorSource(nikekap_csv$Text))
takeaknee_corpus <- Corpus(VectorSource(takeaknee_csv$Text))

### Prepares Text ###
### Lowercases, removes numbers and stopwords, strips whitespace, stems,
### then removes the search terms so they do not dominate the word cloud
clean_corpus <- function(corpus, search_terms) {
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, content_transformer(removeNumbers))
  corpus <- tm_map(corpus, removeWords, stopwords("english"))
  corpus <- tm_map(corpus, stripWhitespace)
  corpus <- tm_map(corpus, stemDocument)
  tm_map(corpus, removeWords, search_terms)
}

boycottnike_clean <- clean_corpus(boycottnike_corpus, c("boycott", "nike"))
kap_clean <- clean_corpus(kap_corpus, c("kap", "kaepernick", "colin"))
nike_clean <- clean_corpus(nike_corpus, "nike")
nikekap_clean <- clean_corpus(nikekap_corpus, c("kap", "kaepernick", "nike", "colin"))
### "takeakne" targets the stemmed form of the hashtag text
takeaknee_clean <- clean_corpus(takeaknee_corpus, c("take", "knee", "takeakne"))
### Create Wordclouds
wordcloud(boycottnike_clean, max.words = 100)
wordcloud(kap_clean, max.words = 100)
wordcloud(nike_clean, max.words = 100)
wordcloud(nikekap_clean, max.words = 100)
wordcloud(takeaknee_clean, max.words = 100)



Appendix C:  Financial Data Scrape



Appendix D: Article Links




Appendix E: Article RapidMiner Processes

















    






[1] Removing the specific search terms from the text allows for a better word cloud that is not dominated by the search terms themselves.

