TWITTER DOESN'T SUPPORT SEARCH FOR EMOJIS. SO WHEN USERS DEVELOP A LANGUAGE CONVENTION AROUND ONE, HOW CAN WE TRACK IT?
@lol_yelling is a bot that tracks clap rants on Twitter.
It has become something of a convention on Twitter to insert the clap emoji between words to 👏 emphasize👏 what👏 they👏 are👏 saying👏 . The convention originated on Black Twitter, but after a friend showed me the tweet below, I saw how widely its use had spread and wanted to browse these random strong convictions in one place. Twitter doesn't support search for emojis, which makes clap tweets very hard to find.
I found I couldn't keep the bot from retweeting hate speech, so I turned it off after only about 6 days. Still, over 400 perfectly clean and weird retweets remain from that time. View them here: @lol_yelling.
To make the bot, I ran the Twitter Stream API feed, accessed through the Twit node package, through a series of filtering steps.
Connecting to the Twitter "stream" API
I used the fantastic Twit node.js package to connect to the Twitter Stream API. I chose the Stream API because Twitter's REST API doesn't allow any form of emoji search. Unfortunately, even the Stream API can search for emojis only in a limited capacity on its own.
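A minimal sketch of what the connection step might look like, assuming Twit's documented `stream('statuses/filter', { track })` call and its `tweet` event. The helper name `watchStream` and its parameters are hypothetical, and the client is passed in as an argument so the function can be tried without real credentials.

```javascript
// Hypothetical helper. `client` is expected to be a Twit instance, e.g.
//   const Twit = require('twit');
//   const client = new Twit({ consumer_key, consumer_secret,
//                             access_token, access_token_secret });
function watchStream(client, keyword, onTweet) {
  // 'statuses/filter' is the streaming endpoint that accepts a `track` term.
  const stream = client.stream('statuses/filter', { track: keyword });

  // Twit emits a 'tweet' event for each status that arrives on the stream.
  stream.on('tweet', onTweet);

  return stream;
}
```

Calling `watchStream(client, keyword, (tweet) => console.log(tweet.text))` would then log each incoming tweet, which is a convenient place to hang the later filtering steps.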
Searching for emojis
I plugged in an emoji standardization tool called Emoji Data (a node.js package built by Matthew Rothenberg, who made emojitracker) to search Twitter in real time for tweets containing 👏 .
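To illustrate why a standardization step helps, here is a small sketch in plain JavaScript. It does not use Emoji Data's API, and the function names are hypothetical. It builds the clap emoji from its Unicode code point and strips skin-tone modifiers so that every clap variant compares equal to the plain one.

```javascript
// U+1F44F is the Unicode code point for the clap emoji.
const CLAP = String.fromCodePoint(0x1f44f);

// Skin-tone modifiers (U+1F3FB to U+1F3FF) can follow the clap emoji.
// Note: this removes the modifiers from every emoji in the text, not just claps.
const SKIN_TONES = /[\u{1F3FB}-\u{1F3FF}]/gu;

function normalizeClaps(text) {
  return text.replace(SKIN_TONES, '');
}

// Count how many clap emojis a tweet contains, whatever their skin tone.
function countClaps(text) {
  return normalizeClaps(text).split(CLAP).length - 1;
}
```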
Filtering with a Regular Expression
I then ran the resulting feed of 👏 tweets through a regular expression (a kind of word filter) to keep only tweets that contain "👏 (text)👏 (text)👏 ".
THE REGULAR EXPRESSION:
It means: "Anywhere within the tweet, look for a clap emoji, followed by a word or digit and maybe a space, followed by another clap emoji, followed by a word or digit and maybe a space, followed by a clap emoji."
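A pattern consistent with that description can be sketched as follows. This is an assumption built from the prose above, not necessarily the expression the bot used, and the names `CLAP_RANT` and `isClapRant` are hypothetical.

```javascript
// One possible pattern: clap, a run of word characters, clap,
// another run of word characters, clap.
// Optional whitespace is allowed on either side of each word.
// \u{1F44F} is the clap emoji; the `u` flag enables code point escapes.
const CLAP_RANT = /\u{1F44F}\s*\w+\s*\u{1F44F}\s*\w+\s*\u{1F44F}/u;

function isClapRant(text) {
  return CLAP_RANT.test(text);
}
```

With this pattern, `isClapRant('pay👏 your👏 interns👏')` is true, while a tweet that only ends in applause, such as `'great show 👏👏👏'`, does not match because no word sits between the claps. Since `\w` covers only ASCII letters, digits and underscore, words containing apostrophes or accented characters would need a wider character class.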
Deploying to Heroku & saving data
I deployed the bot through Heroku. Deploying on a server allows the bot to autonomously retweet random clap-rants from Twitter in real time. Heroku shuts down for an hour each night, so I ran the tweets through a queue file system that:
- Skips over any tweet that the bot has already found
- Adds final matching results to an array queue
- Tweets results from the array queue with Twit.post()
- Deletes posted tweets from the array queue
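The steps above can be sketched as a small queue class. This is an illustration under stated assumptions rather than the bot's actual code: the class name `RetweetQueue` is hypothetical, `post` stands in for whatever function sends the retweet, and the sketch keeps everything in memory instead of attempting the file persistence mentioned in the text.

```javascript
// Hypothetical queue. `post` is an async function supplied by the caller,
// for example a wrapper around a Twit POST request.
class RetweetQueue {
  constructor(post) {
    this.post = post;
    this.seen = new Set(); // ids of every tweet found so far
    this.pending = [];     // matching tweets waiting to be posted
  }

  // Skip tweets that were already found; otherwise remember and enqueue them.
  add(tweet) {
    if (this.seen.has(tweet.id_str)) return false;
    this.seen.add(tweet.id_str);
    this.pending.push(tweet);
    return true;
  }

  // Post the oldest queued tweet, then delete it from the queue.
  async postNext() {
    if (this.pending.length === 0) return null;
    const tweet = this.pending[0];
    await this.post(tweet);
    this.pending.shift(); // removed only after the post succeeded
    return tweet;
  }
}
```

Removing a tweet only after `post` resolves means a failed post leaves it at the front of the queue for a later retry.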
Language filtering and its shortcomings / why I turned off the bot
To filter out tweets containing racist, sexist or homophobic language, I turned to Darius Kazemi's wordfilter package. This is a fantastic tool, but because it operates based on keywords, I found it's not a sufficient safety net for a bot that retweets strong convictions on Twitter. My bot was susceptible to retweeting clap rant tweets that were ideologically racist in sentiment but contained no keywords that could be filtered. Each time this happened, I deleted the retweet and reported the offending account and others in its network to Twitter. After the third retweet, from an account I had already blocked, I turned the bot off. While bots commonly require supervision (see: How to Make a Bot that Isn't Racist), it seems that this bot, which by nature retweets strong convictions, would require much more supervision than I'm able to provide.
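The limitation can be illustrated with a simplified stand-in for a keyword filter. This is not wordfilter's actual code; the function name and the placeholder blocklist entries are hypothetical.

```javascript
// Simplified keyword filter: flags text containing any blocklisted term.
// The entries below are placeholders, not a real blocklist.
const BLOCKLIST = ['slur1', 'slur2'];

function containsBlockedWord(text) {
  const lower = text.toLowerCase();
  return BLOCKLIST.some((word) => lower.includes(word));
}

// A tweet that uses a listed keyword is caught...
containsBlockedWord('some SLUR1 here'); // true

// ...but a hostile message phrased without any listed word passes.
containsBlockedWord('those people should not be allowed here'); // false
```

Because the check looks only at which words appear, no blocklist of this kind can judge the sentiment of a sentence built from ordinary words.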
Over 400 retweets remain: @lol_yelling.
Thanks to Neil Cline for help with code architecture and the regular expression, A2Z instructor Dan Shiffman, Matthew Rothenberg for emojitracker, and Darius Kazemi for wordfilter.