A collection of thoughts on the theoretical aspects of Outfoxed, and the whole idea of using social networks for metadata distribution.
Many things found on the internet are low-quality, false, or dangerous. Web surfers are often asked to make decisions of trust without any background information. Meta-information about what is good or trustworthy is traditionally spread by word of mouth or proactive inquiry, but the vast quantity of resources encountered online demands a faster means of distribution. This paper describes a method for using a social network of defined trust relations for individualized collection of metadata. Additionally, the applications of this method are demonstrated in a software prototype which reorders search results based on metadata, and presents trusted evaluations of web pages, downloaded files, and running processes as they are encountered. Preliminary results from test users are presented. Finally, it is argued that the chains of trust inherent in social networks should be applied at lower levels of computation, even at the level of the operating system.
Outfoxed is the implementation side of my master's thesis at the University of Osnabrück, Germany. The thesis title is Trusted Metadata Distribution Using Social Networks. In a nutshell, I'm exploring ways for you to use your network of trusted friends to determine what's good, bad, and dangerous on the internet. Outfoxed does this by adding functionality to the Firefox web browser. Coding began on Dec 27th, 2004.
This document first discusses the current features of Outfoxed: incorporating metadata information into web pages, internet searching, file downloading, and running processes. Next I give a little insight into how it works. And finally, I answer a few possible objections and criticisms and speculate about future possibilities. Note that this is an evolving document, and will certainly grow and change as my thesis proceeds.
The essential idea of Outfoxed is that people make decisions based primarily on a few people whom they trust. The average person has a set of experts whom they consult in designated areas: the computer expert, the car expert, the fashion expert, the financial expert. If the opinions of these experts can be collected, they are incredibly useful: it is this metadata (data about other data) that gives the most intelligent filtering and sorting of the information on the internet.
For example, Outfoxed lets my Mom know that I think it's okay for her to install the Flash plugin, but that she should not install anything from Claria. That's pretty good, but it gets better. The real power comes from chaining trust: Outfoxed also lets my Mom know that PC Pitstop, a company that I trust, has reported that Orbitz advertises via spyware. So if she ends up on an Orbitz page, or if an Orbitz page is returned in a search result, she will know to think twice about doing business there. (More about socially responsible shopping)
A word on Outfoxed lingo: The opinions that people give are referred to as reports. These can be about just about anything. The people whom you trust, and whose reports you want to know about, are your informers. All this information is stored in a statement. (For example, my statement can be seen here.)
Outfoxed augments the normal search rankings with reports from your informer network, which enhances searching by solving two important problems.
The first problem is the phenomenon of Google bombing. The days when the number of incoming links to a page could be used to estimate a page's importance are long gone. (See Google's Patent) Search engines are getting clogged with useless and irrelevant pages that somehow manage to get a high position in search results. With Outfoxed, these pages can be reported as low quality by others in your informer network, and these reports will cause these pages to be moved lower in search results.
The second problem for searching is that a page which is relevant for you may not be so relevant for someone else. Current search engines work on the assumption that there is one most relevant page for each query, regardless of who is making the search. Outfoxed gives priority to reported pages in proportion to how close the informer is to you. So if you do a search for "Thailand", and a close friend of yours has a blog with photos of her Thailand vacation, your friend's page moves very high in your search results. This page is very relevant to Thailand from your point of view. (Note: At present, Outfoxed can only re-order the results of a search page. Better integration with search engines is needed to achieve this ideal goal.)
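As a rough sketch of the re-ordering idea (not the actual Outfoxed code), the following Python assumes that each report carries a rating and the number of hops to the informer who made it, and that a report counts for one over hops plus one. The rating values (+1 for good, -1 for bad), the function name, the example URLs, and the data layout are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed numeric values for ratings; the real scale may differ.
RATING_VALUE = {"good": 1.0, "bad": -1.0}


def reorder_results(results: List[str],
                    reports: Dict[str, List[Tuple[str, int]]]) -> List[str]:
    """Re-order search results using reports from the informer network.

    results: URLs in the order returned by the search engine.
    reports: URL -> list of (rating, hops to the informer who gave it).
    """
    def score(url: str) -> float:
        # Closer informers weigh more: each report counts 1 / (hops + 1).
        return sum(RATING_VALUE[rating] / (hops + 1)
                   for rating, hops in reports.get(url, []))

    # sorted() is stable, so pages with equal scores (for example all
    # unreported pages) keep the search engine's original order.
    return sorted(results, key=lambda url: -score(url))


results = [
    "http://travel-ads.example/thailand",
    "http://encyclopedia.example/Thailand",
    "http://friend.example/blog/thailand-photos",
]
reports = {
    "http://friend.example/blog/thailand-photos": [("good", 1)],
    "http://travel-ads.example/thailand": [("bad", 2)],
}
for url in reorder_results(results, reports):
    print(url)
```

In this toy run the friend's page moves to the top, the unreported page stays in the middle, and the page reported as bad drops to the bottom.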

While you are browsing, Outfoxed indicates the status of the current page on a button next to the location bar. Clicking on this button opens the report sidebar, which displays all reports on the currently viewed page. Additionally, whenever you navigate to a page rated as dangerous, a confirmation window gives you the full report (including who gave the report) on the page and asks if you really want to go there.
Outfoxed can also indicate the status of all links on the page. By default it only highlights the dangerous ones, by putting a thick red border around them. For example, if you had Outfoxed installed, and had me as an informer, any link to Claria (like on the previous page) would look like this: Claria
Outfoxed can give reports on anything that can be uniquely identified. (To be precise, anything that can be specified by a URI.) Of course it's easy to change the name of files, but there exist algorithms that can generate a unique "fingerprint" for any file. Should the file change --even by a single bit-- then the fingerprint will be different. (Examples of these algorithms are MD5 and SHA-1.) These fingerprints can be used in a URI, which can then be reported.
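To make the fingerprint idea concrete, here is a minimal Python sketch that hashes a file with SHA-1 and wraps the result in a URI. The "urn:sha1:" form with a Base32 digest is a convention used by some file-sharing tools; whether Outfoxed uses exactly this form is an assumption, and the function name is hypothetical.

```python
import base64
import hashlib


def file_fingerprint_uri(path: str) -> str:
    """Return a URI that identifies a file by its contents, not its name."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so that large downloads do not fill up memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            sha1.update(chunk)
    # Assumed URI form: "urn:sha1:" followed by the Base32-encoded digest.
    return "urn:sha1:" + base64.b32encode(sha1.digest()).decode("ascii")
```

Renaming the file leaves the URI unchanged, while changing even a single bit of the contents produces a different URI, so a report attached to the URI stays attached to exactly one version of the file.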

For example, a software publisher can make a list containing the fingerprints of their products even though their software is actually distributed through a system of mirrors or via Bittorrent. If you download the file, Outfoxed checks the file's fingerprint and looks for reports in your informer network. If the publisher is one of your informers, then you'll know right away that you have the right version and that it hasn't been tampered with.
Outfoxed also adds functionality to local file browsing, allowing you to check the validity or add reports about any file on your system.

Have you ever looked at the Windows Task Manager and wondered what LTSMMSG.exe or ctmmon.exe were? Maybe spyware, or maybe some critical component of the operating system? This information can be found with a few online searches, but Outfoxed automates the process and makes it easy--even for my mom.
This feature is no replacement for good anti-virus and anti-spyware software, but it gives a good overview. It is also a good argument for why trust should be integrated at the OS level. (Described here)

There was an interesting flap when Microsoft released their first antivirus software. It seems that their software counted a certain Weatherbug program as adware, and so it is. The problem is that Weatherbug was bundled with AOL Messenger, and AOL threatened to sue Microsoft to force them to stop classifying Weatherbug as a "potential privacy risk." In the end, Microsoft backed down and users of their antivirus software no longer get warnings about Weatherbug.
My point here is not to snark Microsoft. There is a deeper issue: Who gets to decide what is malware? Do we leave it up to whoever has the most lawyers? And if even Microsoft can be pushed around, what chance do smaller malware protection programs have? A better approach, and that of Outfoxed, is to let people choose their own experts, and gather trust data from a large number of sources.
As the lines between malware and legitimate software become blurry, it will become more important that these judgement calls are made by (or mediated by) people you trust, and not left to companies with unknown motives.
Note: Microsoft no longer recognizes Claria products as spyware. Claria is one of the most established spyware companies around, so this change of policy is like an exterminator who tells you they will no longer be killing some species of cockroaches.
If anyone in your informer network discovers a Phishing website and marks it as dangerous, you will know about it almost immediately. You are able to benefit from the expertise and experience of all your informers, and the actions of one person can save thousands of others from a dangerous site.
Outfoxed reads and publishes trust assessments as RSS, with a few extra tags. Such a file might say (translated into English) that xyz.com is good, while zyx.com is bad, and xxx.com is dangerous. But the key to everything is that these RSS files can talk about other RSS files. So one file may say that the file at xyz.com/mary.xml contains trustworthy information, while the file at abc.com/bob.xml should be ignored.
Just like a normal RSS reader, Outfoxed periodically downloads these files from the web, and keeps a local database of all reports.
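The following Python sketch shows roughly what reading such a file could look like. The extra tags are not specified here, so the "of:rating" and "of:informer" elements and their namespace are invented purely for illustration; only the overall idea (RSS items that rate pages and other RSS files) comes from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and tag names; the real Outfoxed tags may differ.
NS = {"of": "http://example.org/outfoxed-sketch#"}

SAMPLE = """<rss version="2.0" xmlns:of="http://example.org/outfoxed-sketch#">
  <channel>
    <title>Example statement</title>
    <item><link>http://xyz.com/</link><of:rating>good</of:rating></item>
    <item><link>http://zyx.com/</link><of:rating>bad</of:rating></item>
    <item><link>http://xxx.com/</link><of:rating>dangerous</of:rating></item>
    <item>
      <link>http://xyz.com/mary.xml</link>
      <of:rating>good</of:rating>
      <of:informer>true</of:informer>
    </item>
    <item>
      <link>http://abc.com/bob.xml</link>
      <of:rating>bad</of:rating>
      <of:informer>true</of:informer>
    </item>
  </channel>
</rss>"""


def parse_statement(xml_text):
    """Return (reports, informers_to_follow) for one statement file."""
    root = ET.fromstring(xml_text)
    reports = {}      # URI -> rating given in this file
    informers = []    # other statement files that this file trusts
    for item in root.iter("item"):
        link = item.findtext("link")
        rating = item.findtext("of:rating", namespaces=NS)
        if link is None or rating is None:
            continue  # skip ordinary RSS items that carry no rating
        reports[link] = rating
        flag = item.findtext("of:informer", default="false", namespaces=NS)
        if flag == "true" and rating == "good":
            informers.append(link)
    return reports, informers


reports, informers = parse_statement(SAMPLE)
print(reports)
print(informers)
```

A periodic download loop would then fetch each address in the returned informer list and repeat the process, which is how one file ends up pulling in a whole network of reports.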
(Dave from PC Pitstop brought up more objections)
Bad companies will just give themselves "good" ratings, and then we're back to square one.
Of course anyone is welcome to make a page giving any sort of reports they want. But it won't do them any good unless someone decides to use (i.e. trust) those pages. Which brings us to the next objection:
Some idiot friend-of-a-friend of mine might get conned into trusting some terrible company, and then I'll have bad trust data.
It's true, there are a lot of sell-outs and idiots out there. But remember, their bad trust decision affects not only you, but everyone else connected to them. (Possibly thousands of others.) All that is needed is for one of these other people to have a little sense and give a bad report to (distrust) the idiot, and then the problem is cleared. (See Keeping your network clean.)
People won't want to give out trust data.
Consider the fact that LiveJournal has 1.6 million people actively keeping blogs. That's just one company. And what are these people writing about? What music and movies they like. Products that suck. Software they love. Political views. The bottom line is that people love to tell others who and what they trust, and the internet has proven time and again that people will express themselves in any medium they find.
But isn't this private information people are giving out?
Even in the current implementation of Outfoxed, there is no requirement that any identifying information be given out. It is possible to create a page on a random server with a random filename and fill it with trust information. And this information isn't useless, because you can then give this address to your friends and thus give them your trust information. And their friends can benefit from your information too, even if they have no idea who you are.
The internet is a huge place. You can't expect people to have a local database containing reports on everything!
To an extent this is true: You can't expect every internet user to have reports on everything out there. But in reality, you don't need reports on everything. People tend to have similar interests as their friends -- that's one reason friends are trusted! So if you love collecting license plates, and some of your friends do too, then you can expect to have a lot of useful reports about websites and programs relating to your hobby. But someone might object further...
But that's still a lot of information! A user's network of friends will grow exponentially with each hop.
Here you need to run some numbers: If each person in a trust network introduces 10 new informers (which is probably too high), then after 4 hops the network is at about 10,000 people. If each person gives 100 reports, that's a million reports total. If every report contains 1000 characters (also high), then our grand total is just one gigabyte. Now consider that all those desktop search applications, which are so trendy these days, can search your 100 gigabyte hard drive in a few seconds. So you can see that the data and processing requirements are a lot, but not excessive.
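The arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the same generous assumptions (10 new informers per person, 100 reports each, 1000 characters per report, and one byte per character); the variable names are mine. Note that with a fan-out of 10, the 10,000 mark is reached at the fourth hop.

```python
FAN_OUT = 10               # new informers introduced by each person
REPORTS_PER_PERSON = 100
CHARS_PER_REPORT = 1000    # assumed to take one byte each

# Number of new informers that appear at each hop from the user.
people_at_hop = {hop: FAN_OUT ** hop for hop in range(1, 5)}

network_size = people_at_hop[4]
total_reports = network_size * REPORTS_PER_PERSON
total_bytes = total_reports * CHARS_PER_REPORT

for hop, people in people_at_hop.items():
    print(f"hop {hop}: {people:,} new informers")
print(f"{total_reports:,} reports, about {total_bytes / 1e9:.0f} GB")
```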
Every file and process should have a chain of trust leading back to the user. Any file or process without such a chain is being taken on faith, and the user should be warned accordingly.
For example, every process run by a computer should have a chain that looks something like this:
And similarly, every file should also have a chain:
Ideally, management of trust should be done at the lowest levels of computation: in the operating system or even in the microprocessor itself. This limits the ability of malicious software to disrupt the chain of trust back to the user. Outfoxed, because it is just an extension, has many vulnerabilities. Primary is the vulnerability of the locally stored trust database.
The next step would be to have trust storage implemented as a continuously running process that could be queried by other applications. [Note 22/03: The new version does this, using HTTP for queries.] So the browser, email client, and word processor could all draw trust information from the same source.
The best solution would be to have this process integrated into the operating system itself, so that the OS could also take advantage of the trust information by only running trusted applications. Trust managed at this level, combined with a good security methodology, would give us the ultimate trustworthy environment.
The history of internet searching can be divided into three phases.

The primitive search engines of the early internet trusted the metadata of documents completely. So companies hoping to show up in more searches began to fill their metadata tags with popular search phrases, often repeating popular words hundreds of times.

The second phase began with Google, who overcame this problem by not trusting the pages themselves, but by inferring "referrals" from external links. Links to a page by other pages were taken to be positive endorsements of that page. More incoming links meant a better search position. (Google called this measure of a page's importance PageRank.) The implicit logic was that these incoming links could be trusted because they presumably were made by someone other than the author of the page. This made it harder to falsely inflate your search rank, but it wasn't long before tricksters were finding ways to sneak "false links" onto pages to achieve the same inflated search rank for chosen pages. (Common techniques are googlebombing and spamdexing.)

Outfoxed is the tip of the next phase of searching. Instead of blindly assuming that every link on the web is an endorsement placed there in good faith, it allows for explicit assessments of quality to be given. These assessments are not all treated equally, but preference is given to those that are closer to the searcher within a [sort of*] social network. People closer to the searcher are more likely to share the searcher's values. This system allows for high quality sites to shine, and poor quality sites to be weeded out. (By using the unique properties of small-world networks. See also Keeping your network clean.)
Every search query is a question: "What pages are most related to X?" Current search engines assume there is a single correct answer to each query. But consider a query like "Britney Spears." (The most popular Google query for 2004.) If you're a fan, you probably want to see her official site and maybe lyric pages. If you're a musician, you probably want to see reviews and music tabs. Of course, current search engines can't do this because they only consider "objective" measures like the number of links to a page. (See The good, the bad, and the subjective) What is needed is subjective, trusted ratings of the pages.
[A new search engine, Zniff, takes a step in the right direction by using publicly available social bookmarks as indicators of worth. Paradoxically, this approach is doomed to fail if it enjoys any success. If it becomes popular, it would be all too easy for tricksters to create false bookmarks for the sole purpose of inflating the ranks of chosen pages. It's the same lesson that Google is learning now with googlebombing: You can never trust random pages on the internet. Not even social bookmark pages.]
This chart shows a feature by feature comparison of existing products related to online trust. (Clicking a column will jump to the corresponding comparison text.)

These systems are “closed-world,” in that they provide and manage trust primarily within their own system. (Epinions is the exception in that it also gives trust information about products.) All of them use some sort of reputation system.
Uses a reputation system, based on feedback. Description is here
These systems require that trust always be mutual. No information is distributed over the social networks; they are only used for establishing social contacts.
Friendster’s system is described here. (Patent)
Primarily for finding business contacts.
A very elegant web of trust system, where trusted reviewers rise to the top.
These systems attempt, through various methods, to bring an element of trust to web surfing.
More for entertainment than safety, but it is still similar in that it gives users the ability to give evaluations of pages that they view. Pages are chosen for you based on an AI algorithm that compares your evaluations with those of other people, and tries to give you pages which were rated good by people similar to you. Outfoxed provides the same functionality via its "Random Page" option, and additionally can tell how you are connected to the person who gave the good review.
StumbleUpon also has the drawback that its recommendations cannot be completely trusted. It is a business trying to make money, and sells page views to companies willing to pay. (A problem similar to AOL's Weatherbug controversy.)
A browser extension that allows users to report Phishing sites, and then alerts other users who attempt to access these sites. This is similar to Outfoxed, although all users are in one big pool. Users must trust Netcraft to filter out false or malicious reported data. It is worrisome that Netcraft has not ruled out the possibility of including advertising in the toolbar. Unlike Outfoxed (or StumbleUpon), this toolbar cannot share good pages. It also cannot report other sorts of dangerous pages (e.g. dialers, spyware), or simply bad pages. (e.g. too many advertisements, poorly organized data)
This company provides an image (they call it a "seal") that websites can display to prove that they are trustworthy. However, the presence of an image on a webpage gives absolutely no certainty for trust: copying an image is as easy as a right-click. Many spoofed pages prominently display the Trust-e seal, but are gone long before they can be tracked down, and long after innocent people's passwords and money are stolen. In this sense, Trust-e may actually be detrimental to online trust: if an internet user accepts the idea of an image guaranteeing trustworthiness, they are especially vulnerable to these phishing attacks.
A very nice extension which displays the trust status of pages visited by users. However, only trust regarding Medical sites is provided, and communication is always one way: The user simply receives the trust information, and cannot give their own evaluations of sites. In the Outfoxed model, HON would provide a statement which people could choose to trust or not.
A browser extension that helps users to avoid spoofed sites by clearly indicating what site the user is actually on. This provides some measure of protection against phishing sites, but the display is visually annoying and users rarely use this product for long. This is merely a band-aid solution.
A community bookmark system, where people can share and tag their bookmarks. You can get RSS feeds of others' bookmarks, but sadly you cannot chain these feeds as you can in Outfoxed. For example, subscribing to someone's feed does not also subscribe you to their feeds, etc... Outfoxed also has the added benefit of incorporating the tags into search results, and uses tags for indicating areas in which an informer has special expertise. (Detailed examination in Tagging and Folksonomy)
Digg takes the idea of del.icio.us a bit further, by adding slashdot-style community editing of bookmarks. However, this method falls into the problem of trying to distil an absolute quality rating from the opinions of the masses. But as they say, there is no accounting for taste. Consider a politically-biased page: Half of the users will rate the page good ("digg it"), while half will rate it bad. The final verdict of the system will be that the page is halfway between good and bad.
There exist many software products that filter pages for pornography and other objectionable material. This family of software is probably not replaceable by TrustBase, because of the sheer volume of pornography on the internet, and because the people who would want such a filter are very unlikely to actively seek out and identify objectionable content.
These sites provide trust information, but in traditional human-readable form that must be actively sought out. For example, see Clean Software, 419.
Although originally designed to combat phishing and spyware websites, the metadata that is distributed via Outfoxed can be about anything. For example, suppose that Jane is environmentally conscious and would prefer to do her shopping at eco-friendly businesses. On a small scale, Outfoxed allows her to give reports, positive and negative, on businesses that she has encountered in the past. When a friend of hers encounters this business online, the friend will know Jane's opinion. On a larger scale, Outfoxed allows organizations (or activists) to publish massive directories of positive/negative reviews.
These screenshots show how Outfoxed is already being used to distribute this type of information. In the top screenshot, two informers have entered reports about Gap, and have included links to more information.

In the second screenshot we see information from PC Pitstop about companies that advertise via Spyware. Note the "thumbs down" next to the browser address bar. Clicking on this button opens the sidebar, where the full text of the report can be seen. In this case, the report contains a link to a page explaining how the information was gathered.
There is no limit to the variety of information that can be distributed via Outfoxed. Other immediate possibilities include information about political ties, use of open-source software, quality of construction, and price comparisons.
It is a risky move for a small or non-profit organization to classify a product or company as "bad". A company with enough money and lawyers can bully just about anyone into being quiet. (Even Microsoft can be bullied at times.) But imagine instead that there were hundreds and thousands of individuals all giving the same classification, and sharing their opinions via a social network like Outfoxed. The bully now has two problems. First, they now have many more targets and must divide their efforts. This makes bullying too expensive. Second, the targets are now individuals and would-be customers. This makes bullying into a bad PR move.
The success of illegal peer-to-peer filesharing is due to this same quality. If a few websites are distributing pirated MP3's, it is not too much trouble for a company to find them and shut them down. But if hundreds of thousands of individuals are sharing, shutting them down becomes nearly impossible. This is even more true when the individuals are not engaging in piracy, but simply voicing their opinions. To use the Weatherbug example again, AOL may be able to intimidate Microsoft, but they could not bully thousands of computer users for sharing their opinion that Weatherbug is a crappy program.
[TODO: Positive example where Outfoxed promotes good-rated product to top of search results.]
The three ingredients for using metadata effectively on the internet.
You need a standard, computer-readable format & grammar for expressing metadata assertions. "This page is good", "This page is about sheep", "This file is virus-free", etc...
Outfoxed Solution: Use RSS files to express relationships.
In Practice: Use a browser extension to give the user an easy method of generating and publishing these RSS files.
However, the mere existence of these files is not enough, and leaves us only a little better off than before. How do you choose which metadata files to use? When some file says that a program is virus-free, how do you know it's telling the truth? This brings us to the next ingredient:
To select your sources of metadata information, use the features of a social network: only use (i.e. trust) people (or other sources) that you know, and people who know people you know, and so on.
Outfoxed Solution: Include the social network data in the metadata file. After all, assertions of trust are just another sort of metadata. So we have assertions saying things like "I trust/use/import the metadata in the file at location X."
In Practice: Use a browser extension to periodically spider through network of metadata files and build a database of trusted metadata information.
At this point you have a big fat database of trusted metadata, but it's of little use if you have to pro-actively search it. An automated solution is needed, which brings us to the next ingredient:
To make the metadata useful, we need to query our database whenever we encounter data that might have some relevant metadata. Here are the obvious places:
And there are many more!
Upon first learning about Outfoxed, people invariably ask the question, "But what happens when some jerk comes in and rates everything bad, or otherwise misbehaves? Doesn't that ruin the network?"
A key aspect of Outfoxed is that you can rate not only web pages and programs, but also other informers. Every informer in a user's informer network has "authority" over any informer which is less trusted by the user. (In the most simple case, trust goes down with each hop. See Trust Levels for more fine-grained approaches.) In this way, network maintenance is delegated to others, and many users can benefit from the action of a few.

This tree shows an idealized informer network. The user is at the top, with her four informers below. Each of these informers introduces four unique new informers, and so on. Only links which bring new informers into the network are shown. As indicated by the red dotted line, an informer 1 hop away has given a negative report to an informer 2 hops away. This action will remove the second informer from the user's network, and any informers (and their reports) which were only connected via this second informer.
The network of informers trusted by a user can be thought of as an exclusive club, with the user as the club’s founding member. Informers can become members of the club only if a current member is willing to sponsor them. Thus there exists a “chain of sponsorship” from any member back to the founding member. Members with shorter chains have more influence within the club. If more than one member is willing to sponsor an informer, the informer will always maximize his influence by taking the sponsorship of the member with the shortest chain.
But there is one catch: Even if an existing member is willing to sponsor, a potential new member can be barred from joining the club if there is another member closer to the user who has written a complaint about the potential member. This is his right as the more influential member. (Just like the CEO of a company can nix the hiring decision of a subordinate.)
Members may add sponsorships, revoke sponsorships, or write complaints at any time. Members who have lost their sponsor can keep their membership only if there is another member who is willing to sponsor them, and this new sponsor is closer (i.e. more influential) than any members who have written complaints.
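The club rules can be sketched in a few lines of Python. This is my own reading of the rules rather than the Outfoxed code: the network is built outward one hop at a time, and a candidate is admitted only if a current member sponsors them and no member at the sponsor's distance or closer has complained about them. The function name, the dictionaries, and the single-letter informer names are all hypothetical.

```python
from typing import Dict, List


def informer_network(user: str,
                     sponsors: Dict[str, List[str]],
                     complaints: Dict[str, List[str]]) -> Dict[str, int]:
    """Return each club member with the length of their sponsorship chain.

    sponsors:   member -> informers that the member trusts
    complaints: member -> informers that the member has reported as bad
    """
    distance = {user: 0}
    frontier = [user]
    while frontier:
        # Only complaints written by current members count. Everyone found
        # so far is at least as close to the user as the sponsors in this
        # frontier, so their complaints win.
        blocked = set()
        for member in distance:
            blocked.update(complaints.get(member, ()))
        next_frontier = []
        for member in frontier:
            for candidate in sponsors.get(member, ()):
                if candidate in distance or candidate in blocked:
                    continue
                distance[candidate] = distance[member] + 1
                next_frontier.append(candidate)
        frontier = next_frontier
    return distance


sponsors = {"user": ["A", "B"], "B": ["C"], "C": ["D"]}
complaints = {"A": ["C"]}
print(informer_network("user", sponsors, complaints))
```

In this example A (1 hop away) complains about C (2 hops away), so C never joins, and D, who was only reachable through C, drops out as well. Because the network is rebuilt from the current sponsorships and complaints each time, revoking a complaint or adding a closer sponsor takes effect on the next rebuild.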
In the next section, I'll show how this method relates to the infamous "six degrees of separation" and the science of "small world networks".
For the moment, imagine a world in which everyone was friends with only those people who lived within a 5 mile radius. How would this change the degrees of separation? It is clear that people living 1000 miles away from each other will have at least 100 degrees of separation. (Of course, those on different continents or islands would be completely unreachable.) This strange example shows the importance of outlying connections.
However, in the real world we have friends who live very far away and who come from different cliques. Figure 2 shows a graph where just a few nodes were given edges at random. These 3 extra edges are enough to cut the average degrees of separation in half. In the social world, it is these "long-range" edges that can connect a kebab owner in Berlin with Marlon Brando.
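The effect of a few long-range edges is easy to try out. The Python sketch below is a toy experiment, not the graph from the figure: it builds a ring in which everyone knows only their two neighbours, measures the average degrees of separation, then adds three random edges and measures again. The ring size, the fixed random seed, and all function names are assumptions made for illustration.

```python
import random
from collections import deque


def ring_lattice(n):
    """A 'local friends only' world: node i knows only i-1 and i+1."""
    graph = {i: set() for i in range(n)}
    for i in range(n):
        j = (i + 1) % n
        graph[i].add(j)
        graph[j].add(i)
    return graph


def average_separation(graph):
    """Average shortest-path length over all connected pairs (via BFS)."""
    total = pairs = 0
    for start in graph:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbour in graph[current]:
                if neighbour not in dist:
                    dist[neighbour] = dist[current] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def add_random_edges(graph, count, seed=0):
    """Add `count` new long-range edges between randomly chosen nodes."""
    rng = random.Random(seed)
    nodes = list(graph)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in graph[a]:
            graph[a].add(b)
            graph[b].add(a)
            added += 1


world = ring_lattice(20)
before = average_separation(world)
add_random_edges(world, 3)
after = average_separation(world)
print(f"before: {before:.2f}, after: {after:.2f}")
```

How much the average drops depends on where the random edges happen to land, but it always drops: a new edge can only shorten paths, never lengthen them.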
In the simple case, the numerical trust value of an informer (or a report) can be expressed as the inverse of the minimum number of “hops” required to reach the informer (or the informer making the report) starting from the user’s informer file. For example, if the user trusts informer X who in turn expresses trust in informer Y, then informer Y and the reports in Y’s informer file would be two hops away from the user. To prevent values of infinity for cases where the number of hops is zero (i.e. in cases where the user has made the report), we add one to the number of hops before taking the inverse.
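More generally, the trust placed in target t relative to a source informer s is shown in Equation 1, where hops(s,t) is the minimum number of hops needed to reach t starting from s:
trust(s,t) = 1 / (hops(s,t) + 1)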

This is the model which is currently used by Outfoxed. However, looking to the future, other factors can and should be included for a more realistic evaluation of a report's relevance to the users.
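The hop-based rule described above (one over the number of hops plus one) can be sketched with a breadth-first search in Python. The graph layout and the function name are assumptions; only the rule itself comes from the text.

```python
from collections import deque
from typing import Dict, List


def hop_trust(graph: Dict[str, List[str]], user: str) -> Dict[str, float]:
    """Trust of every reachable informer: 1 / (minimum hops + 1).

    graph maps each informer to the informers they trust.
    """
    hops = {user: 0}
    queue = deque([user])
    while queue:
        current = queue.popleft()
        for informer in graph.get(current, ()):
            if informer not in hops:          # first visit = fewest hops
                hops[informer] = hops[current] + 1
                queue.append(informer)
    return {name: 1 / (count + 1) for name, count in hops.items()}


# The example from the text: the user trusts X, who in turn trusts Y.
trust = hop_trust({"user": ["X"], "X": ["Y"]}, "user")
print(trust)
```

The user's own reports get a trust value of 1, X gets 1/2, and Y, two hops away, gets 1/3.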
In the previous section trust and distrust were discrete, and the trust value of each informer was directly related to the number of hops to the user. However, more fine-grained values of trust are possible and should be preferred. In the following examples, trust is defined as having values in the range [0,1], where 1 indicates complete trust and 0 indicates distrust. Trust values may also be undefined in cases where a numerical value cannot be determined. (Thanks to Mike Berger for help with the following formal representations.)
For non-discrete values of trust, the trust value of an informer is found in the following manner:
Trust is defined between two informers, the source s and the target t. Typically, s will be the user of the system. To calculate how much s trusts t, we consider all informers which have an edge to t (i.e. all informers with a report about t). In the figure below, these are labeled as i1 to in. (Note that these informers need not be directly trusted by s, in which case there will be a chain of intermediate informers.)

If there are no informers with an edge to t (i.e. n=0), the trust between s and t is undefined.
trust(s,t) = undefined
If there is only one such informer (i.e. n=1), we multiply the trust of this informer by how much this informer trusts the target. Note that this requires recursively determining the trust value of this informer.
trust(s,t) = trust(s,i1) • edge(i1,t)
If there is more than one such informer (i.e. n>1), we choose the one with the highest trust value, and multiply this value by how much this informer trusts the target, as shown in the equation below. As in the single informer case, trust values are determined by recursively determining each value of trust(s,in) for all n informers. In the equation, imax is the informer which had the highest trust value, max(trust(s,i)).
trust(s,t) = trust(s,imax) • edge(imax,t)
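The three cases translate fairly directly into a recursive function. The Python below is a sketch under a few assumptions that the text does not spell out: the source trusts itself completely (value 1), informers whose own trust is undefined are skipped, and an informer already on the current chain is not revisited, so that cycles terminate. The names and the example values are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# (informer, target) -> how much the informer trusts the target, in [0, 1]
Edges = Dict[Tuple[str, str], float]


def trust(source: str, target: str, edges: Edges,
          _chain: Optional[Set[str]] = None) -> Optional[float]:
    """Trust of `source` in `target`; None means undefined."""
    if target == source:
        return 1.0  # assumption: the source trusts itself completely
    chain = (_chain or set()) | {target}

    best_informer, best_trust = None, None
    for (informer, reported) in edges:
        if reported != target or informer in chain:
            continue
        value = trust(source, informer, edges, chain)
        if value is None:
            continue  # assumption: skip informers with undefined trust
        if best_trust is None or value > best_trust:
            best_informer, best_trust = informer, value

    if best_informer is None:
        return None  # n = 0: no usable informer has an edge to the target
    # Multiply the most trusted informer's trust by their edge to the target.
    return best_trust * edges[(best_informer, target)]


edges = {
    ("me", "alice"): 0.9,
    ("me", "bob"): 0.5,
    ("alice", "carol"): 0.8,
    ("bob", "carol"): 1.0,
}
print(trust("me", "carol", edges))
```

Here alice is the most trusted informer with an edge to carol, so the result is alice's trust (0.9) times her edge to carol (0.8). Note that the rule picks the informer with the highest trust, not the chain with the highest product.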

A report may be for a specific URI, or for a range of URIs. Typically this is the difference between rating a single web page and an entire domain.
It is clear that a report that is more specific is more relevant than one which is general.
The current implementation takes this to the extreme, with specific reports always taking priority over general ones. It may be advantageous in future development or in certain applications for this effect to be moderated.
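As a rough sketch of "specific always beats general", suppose each report is keyed by a URI prefix, so that a report for a whole domain is simply a shorter prefix than a report for a single page. This prefix representation, the function name, and the example URIs are assumptions made for illustration; how Outfoxed actually stores URI ranges is not described here.

```python
from typing import Dict, Optional


def most_specific_report(uri: str, reports: Dict[str, str]) -> Optional[str]:
    """Return the rating whose URI prefix is the longest match for uri."""
    matches = [prefix for prefix in reports if uri.startswith(prefix)]
    if not matches:
        return None  # no report covers this URI
    # The longest matching prefix is treated as the most specific report.
    return reports[max(matches, key=len)]


# Made-up reports: one for a whole site, one for a single page on it.
reports = {
    "http://example.com/": "good",
    "http://example.com/downloads/toolbar.html": "dangerous",
}
print(most_specific_report("http://example.com/downloads/toolbar.html", reports))
print(most_specific_report("http://example.com/about.html", reports))
```

A moderated version might instead weigh specificity against the trust value of the informer, rather than letting specificity win outright.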
Tags can also be used to vary path lengths. When a user adds an informer, they can add tags indicating particular areas where this informer is trusted (or not trusted). For example, if your friend Bob is a good car mechanic but has a very bad sense of humor, you might give him the tags "car repair auto -funny -humor". This means that his reports will take preference on pages tagged as car, repair, or auto, and that his reports will be given lower priority on pages tagged as humor or funny.
Although the current standard does not support it, it may be useful in the future to include a degree attribute with each tag. This would normally represent the degree to which the tag applies to its target, and for informers would represent the degree of trust placed in that informer for the given tag.
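To make the tag notation concrete, here is a small sketch that splits an informer's tag string into trusted and distrusted areas and then scores a page against them. The function names and the simple plus-one / minus-one scoring are assumptions for illustration; how much a matching tag should actually shorten or lengthen a path is left open.

```python
from typing import Iterable, Set, Tuple


def parse_informer_tags(tag_string: str) -> Tuple[Set[str], Set[str]]:
    """Split a tag string into (trusted areas, distrusted areas)."""
    trusted, distrusted = set(), set()
    for token in tag_string.split():
        if token.startswith("-") and len(token) > 1:
            distrusted.add(token[1:])  # a leading minus marks distrust
        elif token != "-":
            trusted.add(token)
    return trusted, distrusted


def tag_adjustment(page_tags: Iterable[str],
                   trusted: Set[str], distrusted: Set[str]) -> int:
    """Positive: prefer this informer here. Negative: give lower priority."""
    page = set(page_tags)
    return len(page & trusted) - len(page & distrusted)


trusted, distrusted = parse_informer_tags("car repair auto -funny -humor")
print(tag_adjustment(["auto", "reviews"], trusted, distrusted))  # 1
print(tag_adjustment(["humor"], trusted, distrusted))            # -1
```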
[TODO: this only slightly modified from Tagging. Need to organize this.]
Outfoxed uses tagging for several purposes. It implements some of the features of del.icio.us, for example allowing users to see all pages which have been given a certain tag.
(NOTE: The following features are not yet supported in the release versions of Outfoxed.)
Outfoxed uses tags to help resolve conflicts within the database. If two equally trusted informers give conflicting reports on a page, tags can be used to break the tie. When a user adds an informer, they can add tags indicating particular areas where this informer is trusted (or not trusted). For example, if your friend Bob is a good car mechanic but has a very bad sense of humor, you might give him the tags "car repair auto -humor -funny". This means that his reports will take preference on pages tagged as car, repair, or auto, and that his reports will be given lower priority on pages tagged as humor or funny.
Outfoxed also uses tags to enhance searching, by checking for any pages tagged with words from the search phrase. For example, if you searched for "History of baseball", in addition to the normal search results you would also get a list of pages tagged with history, with baseball, and with both history and baseball.
Tagging systems are currently enjoying a wave of popularity, but can this success continue with increased growth? The current situation mirrors that of early Usenet, when the relatively small body of users was quite homogeneous: computer-savvy, intelligent, young, and generally liberal. (A survey of the most popular del.icio.us tags bears this out.)
However, as the body of users grows, one can no longer trust the opinion of the majority, and the system becomes vulnerable to googlebomb-style attacks. Imagine if del.icio.us suddenly had a million users using it to find good pages. What do you think would be the top result for "britney spears", or "games"? The system would almost certainly be clogged with porn companies.
A social networking system like that used in Outfoxed can combat this problem by only trusting the tags of people within the network. Commercial and fake users would be weeded out.
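The tag lookup for a search phrase might look something like the following sketch. The tag index (a mapping from tag to the set of pages carrying it), the page names, and the decision to ignore words that are not used as tags at all (such as "of") are assumptions made for the example.

```python
from typing import Dict, Set


def tag_search(phrase: str, tag_index: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return pages per matching tag, plus pages carrying all matching tags."""
    # Keep only words that occur as tags, dropping repeats but keeping order.
    words = list(dict.fromkeys(
        w for w in phrase.lower().split() if w in tag_index))
    results = {w: set(tag_index[w]) for w in words}
    if len(words) > 1:
        # Pages tagged with every matching word together.
        results[" + ".join(words)] = set.intersection(
            *(tag_index[w] for w in words))
    return results


# Made-up index of pages tagged by people in the user's network.
tag_index = {
    "history": {"page-a", "page-b"},
    "baseball": {"page-b", "page-c"},
}
for tags, pages in tag_search("History of baseball", tag_index).items():
    print(tags, sorted(pages))
```

In a social network setting, the index would be built only from the tags of informers within the user's network.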
(From an email exchange with Dave Methvin, CTO of PC Pitstop)
Here I'm relying on "the wikipedia effect." A study found that graffiti in Wikipedia remains there an average of only 5 minutes before it is corrected. Similarly, within my proposed system I'm hoping that boneheads will be quickly detected and marked as untrustworthy. So if the bonehead is (for example) 3 hops away, then I just need anyone within 2 hops of me to notice. (See Keeping your network clean.) And in a social network, there is additional social pressure that Wikipedia doesn't have: No one wants to be the guy that trusted the bonehead and messed things up for everyone downstream! (For example, imagine getting an email saying "Hey Dave, why is your friend Mary saying that Claria.com software is good stuff?!")
Within a web of trust, Googlebombing just doesn't work. If you are the would-be bomber, you have to convince a lot of people to add you as an informer. And then you have to hope that the people you have conned are informers to many other people. You must further hope that none of these other people will notice and report the bogus links. That's just too many levels of failure for googlebombing to be effective. (This also applies for straight-up hacking: Even though most of the trust pages will be presumably stored on low-security web servers, you'd have to hack a ton of pages to have any effect. And as soon as anyone notices, it's all for nothing.)
The other way of googlebombing would be to create tons of dummy users who are all trusted by one "real user". Once the real user is trusted, then all the dummies get in and screw up the trust levels. However, this only works if you have some sort of Bayesian or other distributed trust calculation system (see below) that takes account of the sheer number of people who are giving their opinion. Outfoxed doesn't care about the number of votes, but only about the vote of the person who is closest.
You're absolutely right that trust is not binary. In fact, the underlying RDF structure of the trust files allows for continuous values. It was primarily a user-interface decision to go binary; I wanted the system to be usable by novices, and that meant using nice simple categories like "Good", "Bad", and "Dangerous". But there is no reason why someone couldn't write another client that uses the same RDF files but provides finer-grained information.
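A minimal sketch of this "closest vote wins" idea is a breadth-first search outward from the user that stops at the first informer who has a report. The data layout, the names, and the rule that ties at the same distance go to whoever is listed first are assumptions for illustration, and handling of distrusted informers is left out.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple


def nearest_report(user: str,
                   informers: Dict[str, List[str]],
                   reports: Dict[str, str]) -> Optional[Tuple[str, str, int]]:
    """Return (rating, informer, hops) for the closest informer with a report."""
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        person, hops = queue.popleft()
        if person in reports:
            return reports[person], person, hops
        for nxt in informers.get(person, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None  # nobody in the network has a report


# Made-up network: one conned "real user" (shill) who trusts three dummies.
informers = {"me": ["alice", "shill"], "shill": ["d1", "d2", "d3"]}
reports = {"alice": "bad", "d1": "good", "d2": "good", "d3": "good"}
print(nearest_report("me", informers, reports))  # ('bad', 'alice', 1)
```

Adding more dummies behind the shill changes nothing: alice is one hop away, so her single report is found first. The returned informer and hop count are also what let the system say who made the report and how the user knows them.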
There are two reasons why I went with a simple hop-counting system over anything more complicated. The first is transparency; for something as important as trust, people should be able to understand the computer's system completely. Of course it's going to be wrong at times. However, my hunch is that users prefer a system which is sometimes wrong but always predictable, over a system which is less wrong but not understandable. (This is also the reason why my system only gives commentary, but never prevents the user from performing any actions.) The second reason was to provide a point of reference. When my system says a website is bad, it can tell you exactly who said it and how you know this person. It's important psychologically to be able to identify the source of trust, if only to have someone to blame if your trust turns out to be unwarranted.
A Bayesian system wouldn't meet either of these conditions. The math is so hard that the trust outcomes would seem simply mysterious, even when correct. And when a wrong trust decision is made, people wouldn't trust the system ever again. For example, imagine that someone's daughter has said a website is bad when tons of more outlying people are unanimous in calling the website good. A Bayesian system might conclude mathematically that the site is in fact good. But this math won't mean a thing to the confused and angry father who can't understand why his computer just told him to trust some strangers more than his daughter. And secondly, a Bayesian system just doesn't give you anyone to blame; if a friend of a friend of yours recommended a company which turned out to be terrible, you know who you shouldn't trust in the future. If a similar failure of trust occurred in a Bayesian system, there would be no clear path to fixing the problem.
Compared to other endeavours in the area of trust, Outfoxed is decidedly low-key.
Outfoxed does not attempt to model how or why people make trust or evaluative judgements. It doesn't try to predict anything. "What constitutes a dangerous page?" "What is 'High Quality'?" "How well should you know someone before you add them as an informer?" These are questions which are not addressed. In this regard, Outfoxed is similar to the many folksonomy projects that are currently en vogue. The beauty of not trying to answer these questions is that the users can use the system in any way they like. No one forces you to use a certain ontology for your tagging, and no one will tell you that you can't trust some informer because you don't know them well enough.
Also, Outfoxed never interferes. It never stops you from visiting something rated as dangerous, and never forces you to visit someplace rated good. Its purpose is to expedite the normal word-of-mouth spread of metadata.