Votebot Anatomy 101 - Part 2

Blog of Reason · Jul 2, 2009

Discussion thread for the blog entry "Votebot Anatomy 101 - Part 2" by CosmicSpork.

Permalink: http://blog.leagueofreason.org.uk/youtube/votebot-anatomy-101-part-2/

Pulsar · Jul 3, 2009

Re: Votebot Anatomy 101 - Part 2

Thanks again, Cosmic!

stefzula · Jul 4, 2009

Re: Votebot Anatomy 101 - Part 2

Very informative, CS... but it's kinda scary that people are able to do all this and pretty much get away with it. There's got to be something that can be done about this, but I'm pretty pessimistic at this point.

CosmicSpork · Jul 4, 2009

Re: Votebot Anatomy 101 - Part 2

It is disturbing. Unfortunately, it's down to YouTube... the only thing we can do is try to draw their attention to the problem.

Th1sWasATriumph · Jul 6, 2009

Re: Votebot Anatomy 101 - Part 2

Once again, nice to see you posting - especially so informatively on a topic so close to all our hearts.

THE BASTARDS

AndromedasWake · Jul 11, 2009

Re: Votebot Anatomy 101 - Part 2

CS, this is a wonderful analysis you have underway. Once you've completed this series of posts, I will make a video promoting them all directly, with links to each in order, and encourage the video to be spread/viralised, not only to attempt to get youtube's attention, but also to improve the community's understanding of votebots in general. Unfortunately, there are a lot of rumours/misinformation flying around.

Keep up this stellar work!

stratos · Jul 19, 2009

Re: Votebot Anatomy 101 - Part 2

Interesting post. I've researched the YouTube voting system myself, and I found that a script must be at least reasonably sophisticated. For this reason I think it will be very hard for Google/YouTube to battle this.

On one side, they are already battling spammers on a grander scale, which basically introduced the need for captchas. I would reckon that the votebots are created by the same communities that create spam scripts, because the techniques are largely the same and it is largely the same business. The reason it will be hard is that if they try to fight it with technology, they will basically start a race which the spammers will win. For instance, simple captchas are already broken, and even the hard ones are slowly starting to fail.

The other route Google/YouTube can take is to restrict the service: X votes per day per account, or a 7 day "waiting" time on new accounts before they can vote. However, both of these will hinder the "freedom", if you will, of legitimate YouTube users, so this has quite a negative effect on their service.

The best course of action YouTube can take in this respect, I think, is to adopt the same technique IMDB uses. That is, to use statistical trends to dilute sudden changes, and to weigh ratings based on the rating record of the user.

What I mean by that is this.
If a video has gotten 3 to 5 star ratings for its first 100 votes and after that gets 100 1-2 star votes, it will not count those fully because they fall outside the range of the established pattern. They do affect the scoring of course, but perhaps only move it as if there were 50 1-2 star votes. However, this type of system only works when large numbers of votes are made.
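A minimal sketch of this first system, using the numbers from the example above. The fixed 3 to 5 star range, the 100-vote "established" window, and the 0.5 weight are illustrative assumptions; a real system (IMDB's included) would derive them statistically, and nothing here reflects how YouTube or IMDB actually compute ratings.

```python
def damped_average(ratings, established=100, low=3, high=5, weight=0.5):
    """Average star ratings, counting late out-of-pattern votes at reduced weight.

    All parameters are hypothetical: the first `established` votes define the
    pattern, and later votes outside [low, high] count as `weight` of a vote.
    """
    total = 0.0
    count = 0.0
    for i, stars in enumerate(ratings):
        in_pattern = low <= stars <= high
        # Votes inside the pattern, or cast while it is still forming, count fully.
        w = 1.0 if (i < established or in_pattern) else weight
        total += w * stars
        count += w
    return total / count if count else 0.0


votes = [5] * 100 + [1] * 100
print(sum(votes) / len(votes))          # plain average
print(round(damped_average(votes), 2))  # damped average
```

With 100 five-star votes followed by 100 one-star votes, the plain average is 3.0, while the damped version treats the second hundred as if they were 50 votes and lands around 3.67.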

The second system is quite resource intensive but perhaps more accurate. We take the same expected rating range as in the first system. Now if the user votes and his vote is within the expected range, he gains accuracy points, or whatever you would like to call them. If he votes outside of the range he loses points. Now when he votes, this "accuracy rating" will be taken into account. Most reasonable/normal users will often vote within the range, whereas a sockpuppet almost by design will never vote within range. As such, the sockpuppet's votes will be worthless.
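A rough sketch of the accuracy-weighted idea follows. The class name, the starting trust of 1.0, the gain and penalty sizes, and the single fixed expected range are all assumptions made for illustration; the post does not specify any of them, and a real system would compute an expected range per video.

```python
class AccuracyWeightedRating:
    """Weights each vote by the voter's track record (hypothetical scheme)."""

    def __init__(self, low=3, high=5, gain=0.1, penalty=0.5):
        self.low, self.high = low, high
        self.gain, self.penalty = gain, penalty
        self.trust = {}          # user -> current trust, starts at 1.0
        self.total = 0.0         # sum of trust-weighted stars
        self.weight_sum = 0.0    # sum of trust weights

    def vote(self, user, stars):
        # Use the trust the user had *before* this vote as its weight.
        trust = self.trust.get(user, 1.0)
        self.total += trust * stars
        self.weight_sum += trust
        # In-range votes earn trust (capped); out-of-range votes lose it (floored at 0).
        if self.low <= stars <= self.high:
            trust = min(trust + self.gain, 2.0)
        else:
            trust = max(trust - self.penalty, 0.0)
        self.trust[user] = trust

    def average(self):
        return self.total / self.weight_sum if self.weight_sum else 0.0


rating = AccuracyWeightedRating()
rating.vote("regular", 5)
for _ in range(3):
    rating.vote("sock", 1)
print(rating.trust["sock"], rating.average())
```

In this sketch an account that keeps voting out of range reaches zero trust after two such votes, so its later votes no longer move the average at all.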

The problem with both of these solutions, however, is that they add complexity to an otherwise straightforward and simple system, which can be a problem at the scale of YouTube.

hmm, well this "comment" got a bit long, but anyway, great post.

joshTheGoods · Jul 25, 2009

Re: Votebot Anatomy 101 - Part 2

Great analysis you have going here CS, keep it up.

I've been kicking around the idea of writing some software to combat vote botting. I've asked a few tubers their opinions on the idea, and I've gotten lukewarm responses; so, tell me what you think, and if the response is decent I'll whip it up.

The problem is this: massive numbers of votes in a short period of time. We know this because of the spikes in people's voting stats as illustrated in several videos. We also know that the votes are coming from different accounts. How this is done is moot, but CS gave a good outline on that subject (proxies, cookie management, etc). So, assuming YouTube isn't going to touch this for a while, the question becomes: how can I prevent random individuals from voting for select periods of time? As far as I can see, the only solution is to A. detect, as accurately as possible and in real time, the intervals in which votebot attacks are occurring, and B. disable ratings for that time period. Yes, this means possibly losing positive ratings while your video is being protected, but assuming we achieve (A), the damage of missed votes will be minimal in comparison to the damage done by the votebot attack.

Here are some feature ideas, please feel free to add to the list:

1. Check video status on all of the user's videos once every X minutes, where X = {5 -> 30}
2. Customizable protection triggers, for example:
   trigger: if (vote rate) > (3 x average vote rate) over the last interval
   response: put video on a probationary (no rating) period of X minutes, where X = {0 -> 1440}
   trigger: if (vote rate) > (3 x average vote rate) over the interval immediately following probation
   response: turn off ratings until manually re-enabled by user
   paranoid response: turn off ratings on all videos
3. Automatically add an annotation to the video while ratings are disabled, with a customizable message, for example:
   "This video was attacked by a vote bot at {attack-time}. Ratings will be re-enabled at {enable-time}."
4. Automatic notification when triggers are tripped.
5. Common tasks, for example:
   - enable/disable ratings/comments/sharing on all videos
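The two-stage trigger in item 2 could be sketched roughly as below. The function name, the state labels, and the returned action strings are made up for illustration; only the "3 x average vote rate" comparison and the escalation after probation come from the list above.

```python
def next_action(vote_rate, average_rate, state, factor=3.0, paranoid=False):
    """Decide the response for one check interval (hypothetical helper).

    state is "normal", or "post_probation" for the interval immediately
    following a probationary (no rating) period.
    """
    spike = vote_rate > factor * average_rate
    if not spike:
        return "none"
    if state == "post_probation":
        # Second spike in a row: escalate instead of starting another probation.
        return "disable_all_videos" if paranoid else "disable_until_manual"
    return "probation"


print(next_action(vote_rate=40, average_rate=10, state="normal"))
print(next_action(vote_rate=40, average_rate=10, state="post_probation"))
print(next_action(vote_rate=20, average_rate=10, state="normal"))
```

Note that with an average rate of zero any vote counts as a spike in this sketch, so a real implementation would probably want a minimum vote count before triggering.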

Again, this is just brainstorming; so, if you have any ideas shout them out.

There are two ways that this solution could be delivered:

1. A program that, while running on a computer connected to the internet, would serve as protection. I have a computer running 24/7 at my place, so this wouldn't be a problem for me.
2. A program that runs on a central server and provides a service that users would sign up for and manage online.

Doing it the first way would mean I could do it for free and pass it out. Doing it the second way means I have to pay for the server, so you would have to pay for membership.

Thoughts?

stratos · Jul 26, 2009

Re: Votebot Anatomy 101 - Part 2

joshTheGoods said:
.. anti-votebot description ..

The biggest problem with this would be latency. AFAIK youtube statistics are not real time, so there will be a delay between the votebot attack and youtube giving you the statistics to detect the votebot attack. Then after that once the anti-votebot disables the votes, there will probably be a delay before that kicks in as well.

I am unsure if these delays are in seconds or minutes, or perhaps, for the statistics, even tens of minutes. But overall you would basically have to build an anti-votebot to find out if it could actually be effective.

joshTheGoods said:
2. A program that runs on a central server and provides a service that users would sign up for and manage online.

From the perspective of security I would object to this. In essence the service would have to store usernames and passwords to work. That generally is not a good idea from a security perspective. Also, I have not confirmed it, but YouTube might have some rules about not giving out your login credentials, but that's a minor point.

CosmicSpork · Jul 26, 2009

Re: Votebot Anatomy 101 - Part 2

I had had a similar idea to this actually, but it didn't involve automatically monitoring an account as I figured that'd be quite intensive, at least to run off a central server. Users would instead sign in to a website, and get it to disable all the ratings on their videos for a period of time.

I much prefer this method of combatting votebots to simply votebotting your own videos to counteract the 1 star votes; that would be a bit hypocritical really.

Releasing the program for people to use at home is probably a good idea, assuming it isn't difficult to use and set up; I know that some YouTubers aren't necessarily that technically minded. Having a server would make the process easier for certain people, but I doubt that enough of them would be willing to pay for the service considering the potential costs of running it. I imagine that, given enough people using it, monitoring the accounts could be quite a resource hog. I like the annotation idea; that would certainly help to inform people that a votebot attack is going on even if the YouTuber whose account it is doesn't know it's happening.

So initially it seems like a good idea. The pitfalls might arise when trying to get the necessary information from YouTube since, as I understand it, the ratings information isn't updated in real time.

joshTheGoods · Jul 26, 2009

Re: Votebot Anatomy 101 - Part 2

stratos said:
The biggest problem with this would be latency.

Yes, I've considered this, but I haven't thoroughly tested it. YouTube/Google provides means for getting these statistics via their Data API, and it's just a matter of a few tests to nail the delays down. I've already got a developer's key (you can get one instantaneously!), and tonight I think I'll put together a little test suite and proof of concept. FWIW, while writing my anti-spam bot I tested how many "mark as spam" votes it takes to make a comment get folded up as spam. It turns out it takes 5 votes, but what's relevant is that the votes are recorded and measured in real time. As soon as that fifth spam vote comes in, the next refresh of the page hides the comment. This may not prove much, but it's certainly a good sign.

stratos said:
I am unsure if these delays are in seconds or minutes, or perhaps, for the statistics, even tens of minutes. But overall you would basically have to build an anti-votebot to find out if it could actually be effective.

My plan is to write methods for voting, getting the vote counts, and disabling votes. The next step would be to test each method, and to get average response times from YouTube. Once I nail down the response time for each method it's just a matter of tweaking the triggering algorithm. I would, as you've pointed out, be writing most of the underlying functionality in testing, but I enjoy such things.

stratos said:
From the perspective of security I would object to this. In essence the service would have to store usernames and passwords to work. That generally is not a good idea from a security perspective. Also, I have not confirmed it, but YouTube might have some rules about not giving out your login credentials, but that's a minor point.

This is a common issue with account management everywhere. The YouTube/Google API provides secure means for authentication, but in the end you would have to trust me (the software provider) not to allow your credentials to get into the wild. Sadly, this is an issue with any sort of YouTube add-on, including my anti-spam bot. I purposely embedded a browser instead of asking for a UN and PW and entering the information automagically, to give the user the sense that I'm not eavesdropping on their credentials. Regarding the TOS on giving out credentials ... again, YouTube provides an API mechanism for account authentication, and they provide functionality for allowing 3rd party applications to access your account (like Facebook).

joshTheGoods · Jul 26, 2009

Re: Votebot Anatomy 101 - Part 2

CosmicSpork said:
I had had a similar idea to this actually, but it didn't involve automatically monitoring an account as I figured that'd be quite intensive, at least to run off a central server. Users would instead sign in to a website, and get it to disable all the ratings on their videos for a period of time.

I originally thought the resource hogging nature of a real-time monitor would be an issue, but I have a few novel ideas on how to minimize the traffic and to hopefully avoid irking YouTube into paying attention to my software. Here's a short version: any time a user signs up, I would add their videos to a playlist on a utility YouTube account. This would put all of the videos I'm monitoring into a single data feed. In other words, I could request statistics on all of the videos in question using one call. I could further break the videos down into playlists of some arbitrary maximum size (say 50 videos), and rotate which list I check in such a way as to provide each with the proper coverage resolution. Let's say, for instance, I'm monitoring 5000 videos and I want a resolution of 10 minutes. I could break the 5000 videos into 50 lists of 100 videos, and then I would have 10 minutes to download the data from 50 lists. That would make the average time between data requests about 12 seconds, which is attainable. 10 minutes might be too fine a resolution (we'll see when I test for latency issues). If it turns out the best we can do is, say, 1 hour resolution, then I'd have to be monitoring 30k videos before getting to a state of 1 request per 12 seconds. We'll see, aye?
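The polling arithmetic above can be checked with a short sketch. The function name is made up for illustration, and it assumes one API request per playlist per pass, which is how the post describes the scheme.

```python
import math


def seconds_between_requests(num_videos, list_size, resolution_minutes):
    """Average gap between playlist requests for a given coverage resolution.

    Assumes one request fetches statistics for one whole playlist
    (hypothetical helper, following the scheme described in the post).
    """
    lists = math.ceil(num_videos / list_size)
    return resolution_minutes * 60 / lists


print(seconds_between_requests(5000, 100, 10))   # 50 lists every 10 minutes
print(seconds_between_requests(30000, 100, 60))  # 300 lists every hour
```

Both cases from the post work out to one request every 12 seconds.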

CosmicSpork said:
Having a server would help make the process easier for certain people but I doubt that enough of them would be willing to pay for the service considering the potential costs of running it, I imagine given enough people using it, monitoring the accounts could be quite a resource hog.

Yes, I've been thinking on this. What would be a reasonable pricing model? My thoughts are currently running along the lines of protecting some arbitrary small number of videos for free, but if you want to protect more than your favorite few at a time I would charge something like $0.50 per video per month (whatever price would make my costs zero ... aka non-profit). I could then alternatively provide the application for free, so that people with the means to do so can protect themselves.

CosmicSpork said:
So initially it seems like a good idea.

Thanks for the feedback! Since this application would be based off of the Google API, I can use basically the same code for the desktop and online versions of the application. Who knows, maybe I'll need a webmaster at some point.

joshTheGoods · Jul 27, 2009

Re: Votebot Anatomy 101 - Part 2

Well, I've been experimenting for the last few nights, and I thought you guys might like an update.

First, I have a few new ideas on votebot detection. Idea #1 goes like this... we know the following:

N(0) = number of votes at t = 0
N(1) = number of votes at some time t where the number of votes has changed (data update)
A(0) = average rating at t = 0
A(1) = average rating at some time t where the number of votes has changed (data update)

We can thus calculate the average rating of all of the ratings that have occurred between data updates using the following expression:

( [A(1) * N(1)] - [A(0) * N(0)] ) / [ N(1) - N(0) ] ..... (pardon my crazy braces/brackets ;p)

We can then effectively check the new ratings against some metric. It looks like the vote bots hit with 1 star ratings most of the time, so the closer a group of votes' average rating is to 1 (after a certain amount of new votes) the more likely it is that there is a vote bot attack on.
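Idea #1 can be sketched as follows: the sum of all ratings at each snapshot is A * N, so the new votes contribute A(1) * N(1) - A(0) * N(0) stars spread over N(1) - N(0) votes. The function names, the minimum of 10 new votes, and the 1.5 star threshold are assumptions picked for illustration, not values from the post.

```python
def average_of_new_ratings(a0, n0, a1, n1):
    """Average rating of the votes cast between two data updates.

    a0, n0: average rating and vote count at the earlier update
    a1, n1: average rating and vote count at the later update
    Returns None when no new votes arrived.
    """
    if n1 <= n0:
        return None
    return (a1 * n1 - a0 * n0) / (n1 - n0)


def looks_like_attack(a0, n0, a1, n1, min_new_votes=10, threshold=1.5):
    """Flag a batch of new votes whose average sits close to 1 star.

    min_new_votes and threshold are hypothetical tuning values.
    """
    new_avg = average_of_new_ratings(a0, n0, a1, n1)
    return new_avg is not None and (n1 - n0) >= min_new_votes and new_avg <= threshold


# 100 votes averaging 4.0, then 20 one-star votes pull the average to 3.5.
print(average_of_new_ratings(4.0, 100, 3.5, 120))
print(looks_like_attack(4.0, 100, 3.5, 120))
```

In the usage example the overall average only drops from 4.0 to 3.5, but the recovered average of the 20 new votes is exactly 1.0, which is what makes the batch stand out.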

Idea #2 is simply to compare the number of new ratings to the number of new views. This is based on the probability that vote bots aren't using browsers to simulate clicks (too slow, too easy to make faster methods) meaning that the bot is rating videos without viewing them. Since the program will be watching your videos over a decent amount of time, it will have reasonable estimates for what kinds of numbers to expect normally.
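A small sketch of Idea #2, comparing the ratio of new ratings to new views against what the video normally sees. The function name and the factor of 3 are assumptions for illustration; the post only says the program would learn what numbers to expect over time.

```python
def ratio_is_suspicious(new_ratings, new_views, baseline_ratio, factor=3.0):
    """Flag an interval whose ratings-per-view is far above the usual ratio.

    baseline_ratio is the video's typical ratings/views figure, learned from
    earlier intervals; factor is a hypothetical tuning value.
    """
    if new_ratings == 0:
        return False
    if new_views == 0:
        # Ratings with no views at all: the bot is rating without watching.
        return True
    return new_ratings / new_views > factor * baseline_ratio


print(ratio_is_suspicious(new_ratings=50, new_views=60, baseline_ratio=0.02))
print(ratio_is_suspicious(new_ratings=3, new_views=200, baseline_ratio=0.02))
```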

Now, on to the more pressing issue of data resolution. I tested the YouTube Data API, and it's really not bad. It does have a pretty big range for how often it updates (as low as 15 minutes, and up to several hours from an action). However, the YouTube API runs separately from the data provided to the regular user through YouTube's AJAX calls. In short, I think I can achieve a data latency of ~30 minutes. I think that a good vote bot could easily achieve, say, 1 vote/10s, so 6/min, or 180/30 mins, meaning at worst we would give up ~180-360 votes to the bot before the video is shut down. I haven't tested the latency on disabling ratings yet, but I'm going to go ahead and write the client version of this program (it's mostly done already) so everyone can at least have the best protection available while they run it. If people like it, and it works well, then maybe I'll do an online version?
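The exposure estimate above is simple arithmetic, sketched here with a made-up helper name. Reading the upper figure of 360 as a 60 minute detection delay is an assumption; the post does not say how that end of the range was derived.

```python
def votes_before_shutdown(seconds_per_vote, latency_minutes):
    """Votes a bot can land before an attack is detected and ratings are disabled."""
    return int(latency_minutes * 60 / seconds_per_vote)


print(votes_before_shutdown(seconds_per_vote=10, latency_minutes=30))
print(votes_before_shutdown(seconds_per_vote=10, latency_minutes=60))
```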

Check out some screenshots of what I have just from testing:

Note that in this next screenshot, you can see how long it took to register a change after voting. I tested voting with 1 person, and voting with 5 all at once. You can also see that in 20 minutes someone could write a program that votes as quickly as this quick hack job I did, and I forced a 5 second timeout between votes!

stratos · Jul 27, 2009

Re: Votebot Anatomy 101 - Part 2

joshTheGoods said:
[.. snip ..]

Well sounds good, I'm sure that if you polish it a bit quite a few youtube users would be interested.

But just as an FYI, in the screenshot with the code, is that your API key? If it is, you might want to change the image.

It's probably just a partial section of it, but better to be safe than sorry.

joshTheGoods · Jul 28, 2009

Re: Votebot Anatomy 101 - Part 2

stratos said:
But just as an FYI, in the screenshot with the code, is that your API key? If it is, you might want to change the image. It's probably just a partial section of it, but better to be safe than sorry.

LOL. Good catch. Yea, the API key is pretty big ... and I can't see why someone wouldn't just grab their own (or three), but as you said: "better safe than sorry." Image edited ;p.

I hope to have a working version tonight. If anyone is interested in helping me test the app, let me know. I'm not sure I'll be releasing the source for this one, but if there are any C# devs out there reading this who want examples of how to do stuff, PM me and I'll more than likely give you the code you're interested in. Obviously I won't be sending out the source to the test application I used to do my original data monitoring and vote testing :x, so don't ask on that front hehe.

joshTheGoods · Jul 28, 2009

Re: Votebot Anatomy 101 - Part 2

One more update, time to get excited (all 2 of you that paid some attention hehe)

Here is a screenshot of the program working!

Here are some close ups of the options I have implemented so far

If anyone wants to be part of this little project c'mon with the ideas! I'm terrible with interface design, so if anyone wants to help out in that regard let me know. I'm using VS 2008, and the express version is free. Anyone can download it, and make an interface ... and if I like it I'd port the code over to your forms. Any feature requests etc would be greatly appreciated.

edit:
I forgot to mention, disabling ratings on videos is instant!

CosmicSpork · Jul 28, 2009

Re: Votebot Anatomy 101 - Part 2

This looks very promising mate.

Although I do know a good bit of C# now, my experience has only really been for front end web development with ASP.Net and some of the more hardcore things are still beyond my skill level.

I'll have a think about the interface, as that IS one of my strong points.

It might be worth simplifying the process for those that aren't very tech savvy. Maybe a wizard / advanced modes approach?

Assuming it goes OK, and the program cannot in any way be used to abuse YouTube's system, I'll host it on LoR for download. There would be no harm in you putting a donate button on there if you want some sort of compensation for your time. I see no issue with people asking to get something in return for their efforts.

Something I was wondering, and you might have touched on it and I missed it when reading: does the GData API limit you to a certain number of API calls in a given period for a particular API key?

joshTheGoods · Jul 28, 2009

Re: Votebot Anatomy 101 - Part 2

CosmicSpork said:
Something I was wondering, and you might have touched on it and I missed it when reading: does the GData API limit you to a certain number of API calls in a given period for a particular API key?

Yes, there is a quota on call frequency. This shouldn't be a problem for individual users since I will make the minimum data resolution 20 minutes. I am a little worried that it's based solely on the API key (the FAQ is vague) but I suspect that it goes by IP as well. You can see what the FAQ says about it here:

http://code.google.com/apis/youtube/faq.html#quota

CosmicSpork said:
Assuming it goes OK, and the program cannot in any way be used to abuse YouTube's system, I'll host it on LoR for download.

Yea, I was thinking on the data scraping aspect of the program, and I decided to scrap it. I don't think that the extra data resolution is worth all that much, and I don't want to even provide the option to do something that I know is against YouTube ToS. It's funny you brought it up, because I was just about to start a complete rewrite (the first run was proof of concept + testing, the next will be interface independent and will have error handling :x), and the first thing I thought about was how much I could narrow the framework of my source if I exclude the data scraping technique. You've just put the issue completely to bed.

I should be able to do the whole rewrite in a couple of hours tomorrow, so expect a beta to be available tomorrow night!

djarm67 · Jul 29, 2009

Re: Votebot Anatomy 101 - Part 2

Looks cool.

DJ

joshTheGoods · Jul 29, 2009

Re: Votebot Anatomy 101 - Part 2

12 hours later ... the rewrite is almost complete! Everything is done except for the code to actually disable ratings. I've run rudimentary tests, but very shortly I'll be looking for others to play with the program and see how it runs for them.

Here are some screenshots of the new app (I'm still looking for suggestions/help with the interface if anyone is interested).

The interface is now based on a tray icon (Yeti!) and a context menu like so:

Here is what the My Videos form looks like in action. When completed, there will be buttons for disabling and enabling ratings on selected or all videos.

This is what the options form looks like, the other tabs have the same options as the old interface

Now, the core functionality... the guardian bot runs in the background, but sends updates to the status form... here it is in action! Note that it shows the two statistics I'm currently using to detect attacks (vote/view ratio, and average new rating).

I will probably have the program ready for beta tomorrow night with a video tutorial to introduce it. Thanks for your limited participation (you know who you are), and as always I'm looking for suggestions on features, improvements, my next project!
