Oncology/Endocrine
Association for Academic Surgery
A pattern-matched Twitter analysis of US cancer-patient sentiments
Introduction
Twitter (www.twitter.com) is a well-known online microblogging social media platform that currently has 320 million monthly active users. The service allows users to send short messages called “tweets” that are limited to 140 characters; approximately 500 million tweets are sent per day.1 The Pew Research Center, which tracks social media usage among adult internet users in the United States, reported that Twitter usage increased significantly from 18% to 23% over the past year, with a significant increase of 5%-10% in users older than 65 y.2 Indeed, Twitter is widely recognized as a powerful gauge of public sentiment across a spectrum of current social and medical issues: the impact of socioeconomic factors on happiness,3 climate policies,4 election results,5, 6 opioid abuse,7 public perception of immunizations,8 prediction of enrollment in Affordable Care Act marketplaces,9 perceptions of e-cigarettes,10 trending infectious disease,11, 12, 13 obesity, and allergies.14 Diez et al.15 sought to qualitatively characterize the content of breast cancer, colorectal cancer, and diabetes social groups on Facebook and Twitter.
Twitter has also been used in the cancer arena to better understand breast cancer awareness month16 and to qualitatively categorize patient dialog about cervical and breast cancer screening.17 Lastly, researchers have begun to examine the interconnectedness of cancer patients on Twitter and have sought to characterize those relationships.18 As patients increasingly turn to social media to express their health care concerns, we sought to test the twittersphere as a potential means to collect and describe the content of patient tweets and to analyze patients' health sentiments with respect to the leading cancer diagnoses as documented by the National Cancer Institute.19 We hypothesized that the most prevalent cancers would be the most frequently tweeted and that patient happiness values would vary for each cancer diagnosis.
Methods
A large sample of English tweets from March 2014 to December 2014 with embedded location coordinates (“geotagged”) was obtained from Twitter's streaming application programming interface. Pattern matching using “cancer” as a keyword returned 186,406 tweets. Relevant cancer-related tweets were then filtered from the data stream using regular expression software (Perl), which applied case-insensitive pattern matching along with tokenization algorithms to strip punctuation. Tweets from countries other than the
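The filtering step described above can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' Perl code: the diagnosis list `CANCER_TERMS` and the function names `tokenize` and `match_diagnoses` are assumptions introduced here, and the study's actual term list and matching rules are not given in this excerpt.

```python
import re
import string

# Hypothetical diagnosis terms for illustration only; the study's real
# term list is not reproduced in this excerpt.
CANCER_TERMS = ["breast", "lung", "prostate", "colon", "rectal", "endometrial"]

# Case-insensitive keyword match on "cancer" (substring match, an assumption).
KEYWORD = re.compile(r"cancer", re.IGNORECASE)

# Translation table that deletes ASCII punctuation characters.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def tokenize(text):
    """Lowercase the text, strip ASCII punctuation, and split on whitespace."""
    return text.lower().translate(PUNCT_TABLE).split()


def match_diagnoses(text):
    """Return the diagnosis terms found in a tweet that mentions 'cancer'."""
    if not KEYWORD.search(text):
        return set()
    tokens = set(tokenize(text))
    return {term for term in CANCER_TERMS if term in tokens}


if __name__ == "__main__":
    # Made-up example text, not a tweet from the study's data set.
    print(match_diagnoses("Finished my last chemo for Breast CANCER!"))
```

A substring match on the keyword is deliberately permissive, so a manual review step like the one the authors describe for identifying patients would still be needed after automated filtering.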
Results
The most frequently tweeted cancers were breast (n = 15,421), lung (n = 2928), prostate (n = 1036), and colon and/or rectal (n = 773; Table 1). Patients were manually extracted for each unique cancer diagnosis, yielding a total of 161 patients for breast cancer, although this represented only a small fraction of the total breast cancer tweets (1.0%). This is in contrast to endometrial cancer, where out of 43 total tweets, 10 patients were identified (23.3%). Following manually extracting patients for each cancer
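The two percentages quoted above follow directly from the reported counts. A minimal Python check, where the helper name `patient_share` is introduced here purely for illustration:

```python
def patient_share(patients, total_tweets):
    """Percentage of a diagnosis's tweets attributed to identified patients."""
    return round(100.0 * patients / total_tweets, 1)


if __name__ == "__main__":
    # Counts taken from the Results text above.
    print(patient_share(161, 15421))  # breast cancer: prints 1.0
    print(patient_share(10, 43))      # endometrial cancer: prints 23.3
```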
Discussion
This study investigated the most commonly tweeted cancers and identified patients with active disease, as well as those in remission. That breast cancer was the top-tweeted cancer was not surprising, considering that breast cancer is one of the most prevalent cancer types,19 that there is broad public awareness of the disease, and that October's breast cancer awareness month is highly publicized and endorsed. The national incidence of lung, prostate, and colorectal cancer is most likely
Conclusions
The most frequently tweeted cancers are breast, lung, and prostate cancer, and the most common theme of the tweets is sharing about the course of treatment. A hedonometric analysis of cancer-patient tweets demonstrated interdiagnosis variability, confirming that the inherent natural history of the disease affects patient sentiments in unique ways. This preliminary study shows that patients do broadcast their illness through social media and that Twitter can and should be used as a source to gauge patient
Acknowledgment
This work was supported in part by the National Institutes of Health (NIH) Research Awards R01DA014028 & R01HD075669 and by Center of Biomedical Research Award P20GM103644 from the National Institute of General Medical Sciences to C.J.
Authors' contributions: W.C.C. authored the manuscript and performed the pattern matching, patient extraction, and tweet categorization. E.C. collected the Twitter data, performed the hedonometric analysis, created the word-shift graphs, and provided technical
References (26)
- et al. Twitter as a source of vaccination information: content drivers and what they are saying. Am J Infect Control (2013)
- et al. Early palliative care for patients with advanced cancer: a cluster-randomised controlled trial. Lancet (2014)
- et al. The intersection between cannabis and cancer in the United States. Crit Rev Oncol Hematol (2012)
- About. Twitter. [Online] 2015. [Cited: March 2, 2015.] Available at:...
- et al. Social media Update 2014 (2015)
- et al. The geography of happiness: connecting twitter sentiment and expression, demographics, and objective characteristics of place. PLoS One (2013)
- et al. Climate change sentiment on twitter: an unsolicited public opinion poll. PLoS One (2015)
- et al. A multi-level geographical study of Italian political elections from Twitter data. PLoS One (2014)
- et al. Characterizing and modeling an electoral campaign in the context of Twitter: 2011 Spanish Presidential election as a case study. Chaos (2012)
- et al. The canary in the coal mine tweets: social media reveals public perceptions of non-medical use of opioids. PLoS One (2015)
- Twitter sentiment predicts Affordable Care Act marketplace enrollment. J Med Internet Res
- Social listening: a content analysis of e-cigarette discussions on twitter. J Med Internet Res
1 Present address: Oregon Health and Science University; 3181 SW Sam Jackson Park Rd; Portland, OR 97239.