Research Data Leeds Repository

Headlines data for social media popularity prediction

Citation

Piotrkowicz, Alicja (2017) Headlines data for social media popularity prediction. University of Leeds. [Dataset] https://doi.org/10.5518/174

This item is part of the Social Media Popularity Prediction Using Headlines collection.

Dataset description

This dataset is part of a larger project on using headlines to predict the social media popularity of news articles. The dataset consists of two headlines corpora -- The Guardian and New York Times -- collected in 2014 using news outlet APIs. Each corpus includes a unique headline identifier (to enable recreating the corpus by querying the relevant API), the extracted features (news values, style, metadata), and the corresponding popularity on Twitter and Facebook.
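As an illustration of how a headline identifier might be used to recreate a corpus entry, the following is a minimal sketch that builds a single-item request URL for The Guardian's Content API. It assumes that the identifier stored in the corpus is the Guardian content ID (a path-like string) and that the user has their own API key; neither is confirmed by this record. The function name, the example identifier and the key value are hypothetical.

```python
from urllib.parse import quote, urlencode

# Base URL of The Guardian's Content API (single items are addressed by ID path).
GUARDIAN_API_BASE = "https://content.guardianapis.com"


def guardian_item_url(headline_id: str, api_key: str) -> str:
    """Build the request URL for one Guardian item.

    Assumption: `headline_id` is the Guardian content ID as stored in the
    corpus (a path such as "section/2014/jan/01/slug"). This is a guess
    about the dataset's identifier format, not a documented fact.
    """
    # Keep "/" unescaped so the ID remains a path; escape anything else.
    path = quote(headline_id.strip("/"), safe="/")
    # Ask the API to include the headline text in the response fields.
    query = urlencode({"api-key": api_key, "show-fields": "headline"})
    return f"{GUARDIAN_API_BASE}/{path}?{query}"


if __name__ == "__main__":
    # Hypothetical identifier and placeholder key, for illustration only.
    print(guardian_item_url("world/2014/jan/01/example-headline", "YOUR_API_KEY"))
```

The sketch only constructs the URL and makes no network request. Fetching the URL and reading the headline from the JSON response, and the equivalent lookup for the New York Times corpus, are left to the user.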

Keywords: headlines, news values, style, social media popularity, prediction from text
Subjects: I000 - Computer sciences > I400 - Artificial intelligence > I410 - Speech & natural language processing
Divisions: Faculty of Engineering and Physical Sciences > School of Computing
Related resources (Location, Type):
https://eprints.whiterose.ac.uk/115200/ (Publication)
https://eprints.whiterose.ac.uk/115024/ (Publication)
https://etheses.whiterose.ac.uk/20430/ (Ethesis)
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Date deposited: 11 May 2017 09:50
URI: https://archive.researchdata.leeds.ac.uk/id/eprint/147

Files

Documentation

Data

Copyright © University of Leeds