URL Canonicalization in Rails
In one of my last posts I showed how I was able to create completely custom URLs for SEO. There is an issue, though, that sometimes comes up when creating custom URLs, migrating URLs, and so on.
Here is a simple way to ensure that the URLs being requested are valid. Google and Yahoo! (and others) crawl your site's links and can on occasion come across an incorrect link from someone else's site that may be old or mistyped. Search engines can penalize a site for having two different URLs pointing to the same page, since it looks like duplicate content. There may also be a need to retire certain URLs or to change the way they are formatted.
Here is an example. The URL:
http://domain.com/d-123456-mountain_viewering
should be redirected to:
http://domain.com/d-123456-mountain_view
Here is the simple solution:
I created a module like the following in the lib directory and mixed it into ActionController::Base (and ActionView::Base).
module MY
  module URL
    # Maps the page code at the start of the URL to the model it identifies.
    def page_code_object_map
      {
        'd' => Destination, 'p' => Photo
      }
    end

    # before_filter hook: only canonicalize when the matched route asks for it.
    def execute_url_post_process
      canonicalize if params[:canonicalize]
    end

    def canonicalize
      # Requested path without the query string or fragment.
      whole_url = request.request_uri.split('?')[0].split('#')[0]
      url_pieces = whole_url.split('-')
      page_type = url_pieces[0].gsub(/\//, '')
      type_id = url_pieces[1]
      begin
        object = page_code_object_map[page_type].find(type_id)
        # custom_<page_type>_path is the custom URL helper for that model.
        canonical_url = send "custom_#{page_type}_path", object, params
      rescue ActiveRecord::RecordNotFound
        render :file => File.join(RAILS_ROOT, 'public', '404.html'), :status => 404
        return
      end
      if canonical_url and canonical_url != whole_url
        headers['Status'] = '301 Moved Permanently'
        # http_base is not defined in this module; it is assumed to be an
        # application helper that returns the scheme and host.
        redirect_to("#{http_base}#{canonical_url}", :status => 301)
      end
    end
  end
end
ActionController::Base.send :include, MY::URL
ActionView::Base.send :include, MY::URL
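To make the string handling in canonicalize concrete, here is a minimal plain-Ruby sketch of the same split steps, runnable outside of Rails. The helper name split_canonical_parts and the sample query string are hypothetical and only there for illustration.

```ruby
# Hypothetical helper mirroring the string handling in canonicalize.
# It takes the raw request URI and returns the path (without query string
# or fragment), the page code, and the record id.
def split_canonical_parts(request_uri)
  # Drop the query string and fragment, as canonicalize does.
  whole_url = request_uri.split('?')[0].split('#')[0]
  url_pieces = whole_url.split('-')
  page_type = url_pieces[0].gsub(/\//, '')  # "/d" becomes "d"
  type_id = url_pieces[1]                   # still a String at this point
  [whole_url, page_type, type_id]
end

whole_url, page_type, type_id = split_canonical_parts('/d-123456-mountain_viewering?ref=feed')
puts whole_url  # prints "/d-123456-mountain_viewering"
puts page_type  # prints "d"
puts type_id    # prints "123456"
```

Note that everything after the second dash is ignored when looking up the record, which is why a mistyped name can still be matched to the right destination.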
In the route below, notice that I am passing a parameter named :canonicalize with the value of true. This parameter is passed through to the controller as a request parameter and can be accessed in the params hash.
map.d '/d-:destination_global_id-:name*other_params', :controller => 'destinations', :action => 'show', :canonicalize => true, :destination_global_id => /\d{1,20}/, :name => /[^-]+/
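As a rough illustration of what that route pattern accepts, the sketch below approximates it with a plain Ruby regular expression. This is not how the Rails router actually compiles the route, and the trailing *other_params glob is simplified to "whatever is left over". ROUTE_PATTERN and match_destination_route are hypothetical names.

```ruby
# Approximation of the route '/d-:destination_global_id-:name*other_params'
# using the same requirements: 1 to 20 digits for the id, no dashes in the name.
ROUTE_PATTERN = %r{\A/d-(?<destination_global_id>\d{1,20})-(?<name>[^-]+)(?<other_params>.*)\z}

# Returns a hash of extracted parameters, or nil when the path does not match.
def match_destination_route(path)
  m = ROUTE_PATTERN.match(path)
  return nil unless m

  {
    destination_global_id: m[:destination_global_id],
    name: m[:name],
    other_params: m[:other_params],
    canonicalize: true # the default set on the route, passed along in params
  }
end

p match_destination_route('/d-123456-mountain_view') # a hash of parameters
p match_destination_route('/d-abc-mountain_view')    # nil, the id is not numeric
```

The point to take away is that the mistyped URL from the example still matches the route, because the name only has to be free of dashes. It is the canonicalize step, not the route, that decides whether the name is the right one.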
How does this all work, you ask? Simple. In your application controller (controllers/application.rb), you need to include something like this:
before_filter :execute_url_post_process
This starts the check by calling the execute_url_post_process method defined in my module above. If the route that matches passes the :canonicalize parameter, the canonicalize method takes the current URL and extracts the important pieces: the page code and the record id. Then, depending on the model that is mapped to the page code (d maps to Destination), it looks up the record and reconstructs the URL that the record should have. If that matches the requested URL, we're golden. If it doesn't, we redirect with a 301 to the correct URL, ensuring that we do not lose page rank or get counted as spam (duplicate content).
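The decision described above can be sketched in plain Ruby, without Rails. This is only an illustration under stated assumptions: the Destination struct and the DESTINATIONS hash stand in for the ActiveRecord model, custom_d_path stands in for the custom path helper, and canonical_decision is a hypothetical name.

```ruby
# Hypothetical stand-in for an ActiveRecord model, for illustration only.
Destination = Struct.new(:id, :name)

# In-memory "table" keyed by id, replacing Destination.find.
DESTINATIONS = { 123456 => Destination.new(123456, 'mountain_view') }

# Stand-in for the custom path helper: builds the one correct path for a record.
def custom_d_path(destination)
  "/d-#{destination.id}-#{destination.name}"
end

# Decides what to do with a requested path:
#   [:ok]                  the path is already canonical
#   [:redirect, new_path]  the record exists but the path differs (301)
#   [:not_found]           no record has that id (404)
def canonical_decision(requested_path)
  type_id = requested_path.split('-')[1]
  destination = DESTINATIONS[type_id.to_i]
  return [:not_found] if destination.nil?

  canonical_path = custom_d_path(destination)
  canonical_path == requested_path ? [:ok] : [:redirect, canonical_path]
end

p canonical_decision('/d-123456-mountain_viewering') # => [:redirect, "/d-123456-mountain_view"]
p canonical_decision('/d-123456-mountain_view')      # => [:ok]
p canonical_decision('/d-999-nowhere')               # => [:not_found]
```

Only the id is trusted from the incoming URL. Everything else is rebuilt from the record, so any stale or mistyped variant collapses onto a single address.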
There are many other things you can do from within this code, such as managing authorization or hiding pages.
I hope you enjoyed this tip. If you have any suggestions, please post them; I am sure some genius will have something to add. :)
Comments
-
Great! Thanks! You made my day
-
Glad that I was able to help.
