I can imagine this being a handy tool for subreddit mods: finding blacklisted users parading under a new account. Unlike on Twitter or Facebook, measuring the similarity between users on Reddit is a bit harder. From the official Reddit API, the data we have to work with (the endpoints that look promising, anyway) is:

  • GET /api/v1/user/username/trophies
  • GET /user/username/about
  • GET /user/username/where
      → /user/username/submitted
      → /user/username/comments
      → /user/username/upvoted
      → /user/username/downvoted
      → /user/username/hidden (not accessible?)
      → /user/username/saved (not accessible?)
      → /user/username/gilded
  • GET /api/multi/user/username
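
To make the list above concrete, here is a minimal sketch that expands it into full URLs for one account. The base URL `https://oauth.reddit.com` and the helper name `user_endpoints` are my assumptions for illustration; the sketch only builds the URLs and leaves out the authenticated request itself.

```python
from urllib.parse import quote

# Assumption: OAuth-authenticated requests go to this host.
API_BASE = "https://oauth.reddit.com"

# The values of "where" listed above.
LISTINGS = ("submitted", "comments", "upvoted", "downvoted",
            "hidden", "saved", "gilded")


def user_endpoints(username):
    """Return a dict mapping a short name to the endpoint URL for `username`."""
    user = quote(username, safe="")  # escape anything unsafe in a path segment
    urls = {
        "trophies": f"{API_BASE}/api/v1/user/{user}/trophies",
        "about": f"{API_BASE}/user/{user}/about",
        "multis": f"{API_BASE}/api/multi/user/{user}",
    }
    for where in LISTINGS:
        urls[where] = f"{API_BASE}/user/{user}/{where}"
    return urls


if __name__ == "__main__":
    for name, url in user_endpoints("some_user").items():
        print(name, url)
```

Some of these listings (hidden and saved, as noted above) may not be readable for accounts other than your own, so a real client should expect failures on them.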

A few assumptions and possible features about puppets:

  • Age - They're new: the account is barely a few days old. Not a very good indicator in my opinion, though, since users could have been using the alt account for a long time. This is a better feature for detecting trolls.
  • Karma - A lot of controversial posts/links; again, a better indicator for trolls. I would rarely expect the karma of two accounts to match unless they've been equally active.
  • Activity - The times at which posts and links are submitted would be a good way to find out a user's time zone. The days on which they post could reveal posting habits.
  • Links they've submitted - find similar interests.
  • Upvoted and downvoted content - again similar interests
  • Subs they post in - similar interests
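
Two of the features above are easy to sketch: an hour-of-day activity profile and the overlap between the subs two accounts post in. This is only a rough illustration, not a tuned method. I'm assuming each post comes with a Unix timestamp (Reddit objects carry a `created_utc` field) and a subreddit name; the function names are made up for this example.

```python
import math
from collections import Counter
from datetime import datetime, timezone


def hour_histogram(timestamps):
    """Fraction of posts falling in each UTC hour (24 bins)."""
    counts = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 24
    return [counts.get(hour, 0) / total for hour in range(24)]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def jaccard(subs_a, subs_b):
    """Share of subreddits two accounts have in common."""
    a, b = set(subs_a), set(subs_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    user_a = hour_histogram([0, 3600, 3600])   # posts at 00:00, 01:00, 01:00 UTC
    user_b = hour_histogram([3600, 7200])      # posts at 01:00, 02:00 UTC
    print(cosine_similarity(user_a, user_b))
    print(jaccard(["python", "linux"], ["python", "golang"]))
```

A high cosine score only says two accounts are active at similar hours, which is also true of any two people in the same time zone, so it is best treated as one weak signal among several.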

And finally, there is their literary style or fingerprint, which could include emojis and emoticons.
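
As a very rough taste of what a fingerprint might contain, the sketch below counts a handful of western-style emoticons and the average word length in a piece of text. The emoticon pattern is deliberately tiny and the feature names are my own choice for illustration; a real fingerprint would need many more features.

```python
import re
from collections import Counter

# Assumption: only a few simple emoticons such as :) ;-) :D =P are matched.
EMOTICON_RE = re.compile(r"[:;=][-']?[)(DPp]")
WORD_RE = re.compile(r"[A-Za-z']+")


def style_features(text):
    """Return a small dict of style features for one piece of text."""
    emoticons = Counter(EMOTICON_RE.findall(text))
    # Remove emoticons first so the "D" in ":D" is not counted as a word.
    words = WORD_RE.findall(EMOTICON_RE.sub(" ", text))
    avg_len = sum(len(w) for w in words) / len(words) if words else 0.0
    return {
        "emoticons": emoticons,
        "word_count": len(words),
        "avg_word_length": avg_len,
    }


if __name__ == "__main__":
    print(style_features("Nice :) really ;-) ok :D"))
```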

Breaking up the problem:

The individual problems I see are:

  • Similar user interests
  • Find the time zone of a user
  • Troll detector
  • Literary style of users

I'll start with literary fingerprints for my next blog post in this series.