Project Pitch(es)
Idea #1: Memes of Resistance: Stories from the Russia-Ukraine Conflict
Problem Statement / Research Question:
- Problem: Members of polarized social groups often exist in isolated online communities where content that reinforces their views is posted and content that challenges their views may be removed by community moderators due to variations in forum rules. Furthermore, these communities are subject to confirmation and conformity bias and may constitute echo chambers where stories and experiences from perceived “others” are not shared.
- Tool: This project demonstrates a digital archive for aggregating stories from “different sides” of a social conflict and making those stories visible and accessible in a shared environment.
- Research Question: Through data analysis, can we identify how the stories told in these polarized communities differ in their content, themes, affective and linguistic approaches, and views of the “other”?
- Data Sources: Data will be extracted from social media platforms including Reddit, Twitter, and Facebook, obtained through available APIs. These data include (1) meme images, videos, and associated text (including emoji and hashtags); (2) post titles and text; (3) associated metadata (depending on the platform, any combination of access link, community name, likes, upvotes/downvotes, viewer count, etc.); and (4) comments and their associated metadata. Memes will also be transcribed so that their written content (e.g., image text and video subtitles) is available for computational analysis.
- Annotation: Memes will be annotated to extract qualitative themes, while computational analysis may be used to extract additional textual themes at scale. Automated annotation may be performed using LIWC, Empath, or similar tools.
- Technical Challenges: Data extraction is technically challenging on Facebook in particular; extracting data from Reddit and Twitter is simpler using their available APIs.
- Audience: The intended audience for this project includes individuals who are polarized on the basis of socio-political issues relevant to the subject matter (e.g., Russian and Ukrainian citizens) as well as policymakers and government agencies, for whom an understanding of the individual experiences inherent in the subject matter may inform policy action.
- Inspirations: This project was not inspired by any particular project. As I was browsing Reddit in the weeks following the start of Ukrainian evacuations a few months back, I noticed that I was seeing a lot of empathetic content showing the experiences of Ukrainian refugees, but I was not being exposed to content showing the experiences and perspectives of Russians. When I looked briefly into content from “the other side,” I was shocked at the immediate and easily observable differences in affective language use, imagery, and themes.
- Skills needed: Web programming; data visualization; possibly social media data “scraping” (e.g., APIs). I can contribute design, NLP, research, and writing skills to this project.
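The data and annotation steps above could be prototyped along the following lines. This is only a minimal sketch: the `Post` fields are an assumed subset of the metadata listed under Data Sources, and `TOY_LEXICON` is a hypothetical two-category word list invented for illustration. It is not drawn from LIWC or Empath, and it does not call either tool; it only shows the general idea of lexicon-based theme counting, grouped by community.

```python
import re
from collections import Counter, defaultdict
from dataclasses import dataclass

# Hypothetical toy lexicon (category -> words). Invented for illustration;
# a real analysis would use LIWC, Empath, or a similar validated resource.
TOY_LEXICON = {
    "fear": {"afraid", "danger", "threat"},
    "hope": {"hope", "peace", "rebuild"},
}


@dataclass
class Post:
    """One collected post. Field names are assumptions, not a platform schema."""
    community: str         # e.g., subreddit or page name
    title: str
    text: str
    transcript: str = ""   # text transcribed from the meme image or video
    upvotes: int = 0


def tokenize(s):
    """Lowercase the string and split it into simple word tokens."""
    return re.findall(r"[a-z']+", s.lower())


def annotate(post, lexicon=TOY_LEXICON):
    """Count how many tokens of a post fall into each lexicon category."""
    tokens = tokenize(" ".join([post.title, post.text, post.transcript]))
    counts = Counter()
    for token in tokens:
        for category, words in lexicon.items():
            if token in words:
                counts[category] += 1
    return counts


def counts_by_community(posts, lexicon=TOY_LEXICON):
    """Sum category counts over all posts in each community."""
    totals = defaultdict(Counter)
    for post in posts:
        totals[post.community].update(annotate(post, lexicon))
    return {community: dict(c) for community, c in totals.items()}
```

Comparing the per-community totals side by side is one simple way to begin looking at differences in affective language between communities, before moving on to richer qualitative annotation.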
Idea #2: “In A Word” - An Interactive Digital “Monograph”
- Problem Statement/Research Question: What can we learn about a single word by analyzing its past and present usage across a wide range of data sources? To begin to answer this question, we will create an interactive digital monograph that examines uses of a single word of interest (e.g., “climate”) across the entire historical record, as available.
- Data Sources: Data for this project will be drawn from any combination of publicly available sources including social media platforms, news reporting archives, congressional and presidential speech archives, literature and philosophy archives (e.g., Project Gutenberg), song lyric archives, video archives, and so on. Most of these data can be easily downloaded through open-source tools online, but some will require APIs (e.g., social media data). I am also not sure how to extract subtitle or “spoken” data from videos on available video streaming platforms.
- Audience: Linguists and other speech-language professionals may explore the use of a word over time and throughout widely varying social and cultural contexts.
- Inspirations: I considered this idea initially when watching a talk from my advisor on the topic of “the birth of a word.” In that case, he was studying how his child began to learn the word for “water” following the child's birth. In this case, I began to think about whether there have been any exhaustive studies or projects examining a single word throughout many varied contexts.
- Skills needed: Web programming; data visualization; data extraction (APIs) and manipulation.
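As a starting point for the analysis described above, usage of a word over time could be summarized as a per-year rate. The sketch below assumes the sources have already been reduced to `(year, text)` pairs; that input format and the function name `yearly_rate` are assumptions made for illustration, not part of any existing tool.

```python
import re
from collections import Counter


def yearly_rate(documents, word):
    """Return {year: occurrences of `word` per 1,000 tokens}.

    `documents` is an iterable of (year, text) pairs (an assumed format).
    Matching is case-insensitive and exact on whole tokens.
    """
    hits = Counter()
    totals = Counter()
    target = word.lower()
    for year, text in documents:
        tokens = re.findall(r"[a-z']+", text.lower())
        totals[year] += len(tokens)
        hits[year] += tokens.count(target)
    # Skip years with no tokens to avoid dividing by zero.
    return {
        year: 1000 * hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year]
    }
```

Normalizing by total tokens per year, rather than reporting raw counts, keeps years with more available text from dominating the picture. The resulting dictionary could feed directly into a timeline visualization.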