PageRank, Content Farms, Search Quality and Black Hat SEO

Last update : June 29, 2013
PageRank is a link analysis algorithm, named after Larry Page and used by the Google Internet search engine, that assigns a numerical weighting to each element of a hyperlinked set of documents. PageRank has been patented; the patent is assigned to Stanford University and exclusively licensed to Google.
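
To make the idea of a "numerical weighting" concrete, here is a minimal Python sketch of the textbook power-iteration formulation of PageRank. The three-page link graph, the damping factor of 0.85, the iteration count and the function name are assumptions chosen for illustration; this is not a description of Google's production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a dict mapping each page to its PageRank score.

    links maps every page to the list of pages it links to;
    every link target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            # A page without outgoing links spreads its rank over all pages.
            targets = outgoing if outgoing else pages
            share = damping * ranks[page] / len(targets)
            for target in targets:
                new_ranks[target] += share
        ranks = new_ranks
    return ranks


# Hypothetical three-page web: A and C both link to B, B links back to A.
example_links = {"A": ["B"], "B": ["A"], "C": ["B"]}
scores = pagerank(example_links)
```

In this toy graph, page B collects links from two pages and ends up with the highest score, while page C, which nobody links to, keeps only the minimum share.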

The PageRank of a website is shown in the Google Toolbar. In October 2009, however, Google removed the PageRank score from Webmaster Tools, explaining that it had long been telling people not to focus on PageRank so much. PageRank is therefore not an important metric for search engine optimisation.

Several providers still find the PageRank data useful and offer free PageRank checkers, for instance at www.prchecker.info.

The term content farm is used to describe a company that employs large numbers of often freelance writers to generate large amounts of textual content which is specifically designed to satisfy algorithms for maximal retrieval by automated search engines. Their main goal is to generate advertising revenue through attracting reader page views. Critics allege that content farms provide relatively low quality content.

Search engines see content farms as a problem because they tend to steer users towards less relevant, lower-quality search results. On February 24, 2011, Google announced a substantial change to its ranking algorithms intended to purge low-quality content from the search results.

A black hat is the bad guy in a western movie. In computing slang the term refers to a hacker who acts with malicious intent. Black hat search engine optimization (SEO) refers to techniques used to obtain higher search rankings in an unethical manner. These techniques include keyword stuffing, invisible (hidden) text, cloaking, duplicate content and doorway pages.

Further information about the Google search algorithms is available at the scriptol website.

Trackbacks and Pingbacks

Trackbacks were originally developed by Six Apart, creators of the Movable Type blog package. A trackback is a notification method between websites that works as follows:

  • Person A writes something on their blog.
  • Person B wants to comment on Person A’s post, but wants her own readers to see what she has to say and to be able to comment on her own blog.
  • Person B posts on her own blog and sends a trackback to Person A’s blog.
  • Person A’s blog receives the trackback and displays it as a comment on the original post. This comment contains a link to Person B’s post.

Most trackbacks send Person A only a small portion of what Person B had to say, a teaser called an “excerpt”. One problem is that incoming trackbacks are not verified in any way, so they can easily be faked.

Pingbacks were designed to solve some of the problems of trackbacks. The official pingback documentation is available on the website www.hixie.ch.

The best way to think about pingbacks is as remote comments:

  • Person A posts something on his blog.
  • Person B posts on her own blog, linking to Person A’s post. This automatically sends a pingback to Person A, provided both blogs are pingback-enabled.
  • Person A’s blog receives the pingback, then automatically goes to Person B’s post to confirm that the pingback did, in fact, originate there.

There are two significant differences between pingbacks and trackbacks: they use drastically different communication technologies (XML-RPC for pingbacks, HTTP POST for trackbacks), and pingbacks do not send any content.
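
The difference in transport can be sketched in Python by building the two request bodies side by side. The field names (url, title, excerpt, blog_name) and the method name pingback.ping follow the published trackback and pingback specifications as I understand them; the function names and example URLs are hypothetical, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode
import xmlrpc.client


def build_trackback_body(url, title, excerpt, blog_name):
    """Form-encoded body of a trackback ping, to be sent with HTTP POST."""
    return urlencode({
        "url": url,              # permalink of Person B's post
        "title": title,
        "excerpt": excerpt,      # the short teaser shown on Person A's blog
        "blog_name": blog_name,
    })


def build_pingback_body(source_uri, target_uri):
    """XML-RPC request body for pingback.ping: two URLs and no content."""
    return xmlrpc.client.dumps((source_uri, target_uri),
                               methodname="pingback.ping")


# Hypothetical posts on two example blogs.
trackback_body = build_trackback_body(
    "http://b.example/post", "My reply", "A short teaser", "Blog B")
pingback_body = build_pingback_body(
    "http://b.example/post", "http://a.example/post")
```

The trackback body carries the excerpt chosen by the sender, whereas the pingback body only names the two posts, which is why the receiving blog has to fetch Person B's post itself to confirm the link.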

A useful guide, “Introduction to Blogging”, with more details about trackbacks and pingbacks, is published by WordPress.

Roles and Capabilities in WordPress Blogs

WordPress uses a concept of Roles, designed to give the blog owner the ability to control and assign what users can and cannot do in the blog.

WordPress has five pre-defined Roles: Administrator, Editor, Author, Contributor and Subscriber. Each Role is allowed to perform a set of tasks called Capabilities. There are many Capabilities including publish_posts, moderate_comments, and edit_users. The default Capabilities are pre-assigned to each Role.

A summary of the roles:

  • Administrator – has access to all the administration features
  • Editor – can publish and manage posts and pages as well as manage other users’ posts, etc.
  • Author – can publish and manage their own posts
  • Contributor – can write and manage their own posts, but cannot publish them or upload files
  • Subscriber – can only read posts and manage their profile
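
The relationship between Roles and Capabilities can be modelled as a mapping from each role to a set of capability names. WordPress itself is written in PHP; the Python sketch below is only a simplified model with a small subset of capabilities, and the helper name role_can is hypothetical.

```python
# Each role is modelled as the previous role plus a few extra capabilities.
# Only a handful of capabilities are listed; the real default sets are larger.
SUBSCRIBER = {"read"}
CONTRIBUTOR = SUBSCRIBER | {"edit_posts"}
AUTHOR = CONTRIBUTOR | {"publish_posts", "upload_files"}
EDITOR = AUTHOR | {"edit_others_posts", "moderate_comments"}
ADMINISTRATOR = EDITOR | {"edit_users"}

ROLE_CAPABILITIES = {
    "subscriber": SUBSCRIBER,
    "contributor": CONTRIBUTOR,
    "author": AUTHOR,
    "editor": EDITOR,
    "administrator": ADMINISTRATOR,
}


def role_can(role, capability):
    """Return True if the given role includes the capability."""
    return capability in ROLE_CAPABILITIES.get(role, set())
```

With this model, role_can("contributor", "publish_posts") is False while role_can("author", "publish_posts") is True, which matches the summary above.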

Wavatars

Wavatars is a WordPress plugin that generates and assigns icons to visitors who leave comments on your site. The icons are derived from the commenter’s email address, so a given visitor gets the same icon each time they comment. It livens up comment threads and gives people memorable “faces” to aid in following conversation threads. It’s also fun.
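
The reason a visitor always gets the same icon is that the icon is a deterministic function of the email address. The Python sketch below illustrates that general idea by hashing the address and using parts of the hash to pick one option per image layer. It is not the plugin's actual algorithm: the hash function, the layer counts and the function name are all assumptions.

```python
import hashlib


def icon_choices(email, options_per_layer=(4, 11, 8)):
    """Pick one option index per image layer, derived from the email address."""
    normalized = email.strip().lower()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    choices = []
    for layer, count in enumerate(options_per_layer):
        # Use two hex characters (one byte) of the hash for each layer.
        chunk = digest[layer * 2:layer * 2 + 2]
        choices.append(int(chunk, 16) % count)
    return tuple(choices)
```

Calling icon_choices twice with the same address (regardless of letter case or surrounding spaces) returns the same tuple, so the assembled icon stays stable across comments.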

McAfee SiteAdvisor

Each day, McAfee visits Websites and tests them for a comprehensive set of security threats. From annoying pop-ups to back door Trojans that can steal your identity, they find the danger zones before you stumble on them. The test computers look for downloadable files like screensavers, toolbars and file sharing programs which may be bundled with adware, spyware, viruses and other malicious computer code. They also examine browser exploits, e-mail, phishing, e-commerce, pop-ups and cookies and affiliations with other sites.

McAfee uses proprietary techniques to visit and test sites and to analyse the data. The results are presented to users in the form of colored icons:

  • green : very low or no risk issues found
  • yellow : minor risk issues found
  • red : serious risk issues found
  • black : not yet rated, use caution

SiteAdvisor software is an award-winning, free browser plug-in that gives safety advice about Websites before you click on a risky site.

Facebook Graph API and access tokens

The Facebook Graph API enables you to read and write objects and connections (relationships) in the Facebook social graph.

There are 14 graph objects available :

  1. user
  2. page
  3. group
  4. application
  5. post
  6. status message
  7. note
  8. event
  9. link
  10. checkin
  11. album
  12. photo
  13. video
  14. subscription

Each object has a collection of properties. The number of properties ranges from 4 for the “status message” object to 23 for the “user” object.

Besides the listed objects, which can be connected to other objects, the following additional connections are defined :

  1. comments
  2. feed
  3. picture
  4. tagged
  5. statuses
  6. insights
  7. maybe
  8. invited
  9. attending
  10. declined
  11. members
  12. likes
  13. source
  14. home
  15. friends
  16. activities
  17. interests
  18. music
  19. books
  20. movies
  21. television
  22. inbox
  23. outbox
  24. updates
  25. accounts

Each object has a unique ID (xxxxxx) and can be accessed with the URL:

http://graph.facebook.com/xxxxxx

A “fields” query parameter can be used to filter the returned data, for instance:

http://graph.facebook.com/xxxxxx?fields=id,name,picture

Alternatively, a name can be used in place of the ID, if one is defined. If the parameter “metadata=1” is added to the request, the available connections are returned in the same response (introspection). Multiple objects can be fetched in a single request by using the “?ids=” parameter. The special identifier “me” refers to the current user.

To fetch a specific connection, for instance who is attending the event zzzzzz, the URL is structured as follows :

http://graph.facebook.com/zzzzzz/attending
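
The URL patterns described so far (object lookup, fields filter, introspection, multiple IDs and connections) can be assembled with a small helper. This is a sketch that only builds strings in the shapes given in this article; the helper name graph_url is hypothetical, the IDs are placeholders, and no request is sent.

```python
from urllib.parse import urlencode

GRAPH_BASE = "http://graph.facebook.com"


def graph_url(object_id, connection=None, **params):
    """Build a Graph API URL for an object or one of its connections."""
    url = f"{GRAPH_BASE}/{object_id}"
    if connection:
        url += f"/{connection}"
    if params:
        # Keep commas readable, as in "fields=id,name,picture".
        url += "?" + urlencode(params, safe=",")
    return url


object_url = graph_url("xxxxxx", fields="id,name,picture")
introspection_url = graph_url("xxxxxx", metadata=1)
attending_url = graph_url("zzzzzz", connection="attending")
# Several objects at once: the path stays empty and the IDs go in "ids".
multi_url = graph_url("", ids="aaaaaa,bbbbbb")
```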

All responses are JSON (JavaScript Object Notation) objects; JSON is a lightweight data-interchange format.

If an object is private, you will receive only the public part of the data or the following error message :

{
   "error": {
      "type": "OAuthAccessTokenException",
      "message": "An access token is required to request this resource."
   }
}
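
A client can tell such an error apart from a normal object by checking for the top-level "error" key. The following Python sketch assumes the response shape shown above; the function name is hypothetical.

```python
import json


def extract_graph_error(body):
    """Return (type, message) if the JSON body is an error, else None."""
    data = json.loads(body)
    error = data.get("error") if isinstance(data, dict) else None
    if error is None:
        return None
    return error.get("type"), error.get("message")


# The error response quoted above, as a string.
sample_body = """
{
   "error": {
      "type": "OAuthAccessTokenException",
      "message": "An access token is required to request this resource."
   }
}
"""
error_info = extract_graph_error(sample_body)
```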

To access a graph object with an active access token (yyyyyy), the following method is used :

https://graph.facebook.com/xxxxxx?access_token=yyyyyy

All calls with access tokens are required to go over HTTPS.

An access token is granted by the user, page or application concerned. Access tokens are based on OAuth 2.0, an open protocol providing specific authorization flows for web applications, desktop applications, mobile phones, and living room devices.

In the initial release, Facebook supports three ways of getting an access token:

  • The default authorization flow is the web server flow, intended for server-side developers. The flow works by redirecting the user to the authorization server (Facebook) and back to the developer’s site. A “Connect URL” with the domain and path of the site must be preregistered.
  • The second method is the user-agent flow, for JavaScript-based applications. Because the code runs on the client device, it cannot rely on embedded secret keys for security: anyone can look at JavaScript source code and trivially extract the secret. The access token is returned directly in the redirect response instead of requiring an extra server call, which calls for specific care in handling security issues.
  • The third method, the client credentials flow, is the simplest: you exchange your client_id and secret for an access token, and no user is involved. It is mainly intended for accessing application-only resources.
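
The three flows differ mainly in the parameters the application sends. The Python sketch below uses the parameter names of the final OAuth 2.0 specification (response_type, client_id, redirect_uri, grant_type, client_secret); Facebook's parameters in this early release may have differed. The endpoint URL is a placeholder and the function names are hypothetical, so treat this as an illustration of the flows rather than of Facebook's actual endpoints.

```python
from urllib.parse import urlencode

# Placeholder endpoint, not a real Facebook URL.
AUTHORIZE_ENDPOINT = "https://auth.example.com/oauth/authorize"


def authorization_redirect_url(client_id, redirect_uri, flow="web_server"):
    """URL the user is redirected to in the web server or user-agent flow."""
    # Web server flow: the site receives a code and exchanges it server-side.
    # User-agent flow: the token comes back directly in the redirect.
    response_type = {"web_server": "code", "user_agent": "token"}[flow]
    query = urlencode({
        "response_type": response_type,
        "client_id": client_id,
        "redirect_uri": redirect_uri,
    })
    return f"{AUTHORIZE_ENDPOINT}?{query}"


def client_credentials_body(client_id, client_secret):
    """Form body for the client credentials flow: no user is involved."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    })
```

Note that the client secret only appears in the client credentials body, which is sent from a server; the user-agent flow never needs it, consistent with the point that JavaScript code cannot keep a secret.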

Open Graph Protocol

Last update : June 29, 2013

The Open Graph protocol enables any web page to become a rich object in a social graph. For instance, this is used on Facebook to enable any web page to have the same functionality as a Facebook Page.

While many different technologies and schemas exist and could be combined together, there isn’t a single technology which provides enough information to richly represent any web page within the social graph. The Open Graph protocol builds on these existing technologies and gives developers one thing to implement.

To turn a web page into a graph object, you need to add basic metadata to the page. The initial version of the protocol is based on RDFa (Resource Description Framework in Attributes, a W3C Recommendation), which means that you place additional <meta> tags in the <head> of your page.
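
As an illustration of this basic metadata, the following Python sketch generates the <meta> tags for the four properties the protocol treats as required (og:title, og:type, og:url and og:image). The function name and the example values are hypothetical.

```python
from html import escape


def open_graph_meta(title, og_type, url, image):
    """Return the basic Open Graph <meta> tags as an HTML string."""
    properties = [
        ("og:title", title),
        ("og:type", og_type),
        ("og:url", url),
        ("og:image", image),
    ]
    return "\n".join(
        # Escape the value so quotes and ampersands cannot break the attribute.
        f'<meta property="{name}" content="{escape(value, quote=True)}" />'
        for name, value in properties
    )


# Hypothetical page.
tags = open_graph_meta(
    "Tom & Jerry",
    "website",
    "http://example.com/tom-and-jerry",
    "http://example.com/poster.jpg",
)
```

The resulting lines are meant to be placed inside the <head> element of the page.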

The Open Graph protocol was originally created at Facebook and is inspired by Dublin Core, link-rel canonical, Microformats, and RDFa. It is related to the semantic web.

IYOUIT – Share Life Blog Play

IYOUIT gathers data around users and about users. This data is called context. Context is centered on the places users visit and the people they meet, and can grow to include all kinds of things that surround users.

IYOUIT is a mobile application developed in Python that runs on Nokia Series 60 phones. Its aim is to make it easy for end users to automatically record, store, and use context information, e.g. for personalization purposes, as an input parameter to information services, to share with family, friends, colleagues or other contacts, or simply to log it for future use or to compute statistics about their own life.

Personal context : Location – Place – Experience – Photo – Sound – Observation – Books and Products – Weather – Marker