Url Canonicalization in Rails

written by Paul on May 8th, 2008 @ 09:43 PM

In one of my last posts I showed how I was able to create completely custom urls for SEO, but there is an issue that sometimes comes up when creating custom urls or when migrating urls, etc.

Here is a simple way to ensure that the urls being requested are valid. Google and Yahoo! (and others) crawl your site’s links and can occasionally come across an incorrect link from someone else’s site that may be old or mistyped. Search engines can penalize you for having two different urls pointing to the same page. There may also be a need to retire certain urls or to change the way they are formatted.

Here is an example. The URL:
http://domain.com/d-123456-mountain_viewering

Should be redirected to:

http://domain.com/d-123456-mountain_view

Here is the simple solution:

I created a module that looked like the following in the lib directory and included it into the ActionController class.

include ActiveRecord

module MY
  module URL

    def page_code_object_map
      { 
        'd' => Destination, 'p' => Photo
      }
    end

    def execute_url_post_process
      canonicalize if params[:canonicalize]
    end

    def canonicalize

      # Strip the query string and fragment so only the path is compared.
      whole_url   = request.request_uri.split('?')[0].split('#')[0]
      url_pieces  = whole_url.split('-')
      page_type   = url_pieces[0].gsub(/\//, '')
      type_id     = url_pieces[1]

      begin
        object = page_code_object_map[page_type].find(type_id)
        canonical_url = send "custom_#{page_type}_path", object, params
      rescue RecordNotFound => e
        render :file => File.join(RAILS_ROOT, 'public', '404.html'), :status => 404
        return
      end

      if canonical_url and canonical_url != whole_url
        headers['Status'] = '301 Moved Permanently'
        redirect_to("#{http_base}#{canonical_url}", :status => 301)
      end

    end

  end
end

ActionController::Base.send :include, MY::URL
ActionView::Base.send :include, MY::URL

In the route below, notice that I am passing a parameter named :canonicalize with the value of true. This parameter is passed through to the controller as a request parameter and can be accessed in the params hash.

map.d '/d-:destination_global_id-:name*other_params', :controller => 'destinations', :action => 'show', :canonicalize => true, :destination_global_id => /\d{1,20}/, :name => /[^-]+/

How does this all work you say? Simple. In your application controller (controllers/application.rb) you need to include something like this:

before_filter :execute_url_post_process

This will start the checking process by calling the execute_url_post_process() method defined above in my module. If the route that matches passes the :canonicalize parameter, the canonicalize() method will grab the current url and split out its important pieces. Then, depending on the object that is mapped to the page code (d), it will reconstruct the url that the destination object should have and compare it to the requested url. If they match then we’re golden; if they don’t, we redirect to the correct url with a 301, ensuring that we do not lose page rank or get counted as spam (duplicate content).
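To make the comparison step concrete, here is a stripped-down sketch in plain Ruby, without Rails. The DESTINATIONS hash stands in for the database lookup and canonical_redirect is a made-up name. The real module above uses find() and the custom_d_path helper instead, so treat this as an illustration of the idea only.

```ruby
# Hypothetical stand-in for the database: destination id => canonical name.
DESTINATIONS = { '123456' => 'mountain_view' }

# Returns the path to redirect to, :not_found when the record is unknown,
# or nil when the requested url is already the canonical one.
def canonical_redirect(request_uri, records = DESTINATIONS)
  # Drop the query string and fragment, as canonicalize() does.
  path = request_uri.split('?')[0].split('#')[0]

  # "/d-123456-mountain_viewering" => page_type "d", id "123456"
  page_type, id = path.sub(%r{\A/}, '').split('-')

  name = records[id]
  return :not_found if page_type != 'd' || name.nil?

  canonical = "/d-#{id}-#{name}"
  canonical == path ? nil : canonical
end

p canonical_redirect('/d-123456-mountain_viewering')
p canonical_redirect('/d-123456-mountain_view?page=2')
```

The first call returns the corrected path to redirect to, and the second returns nil because the path part of the url is already canonical.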

There are many things that you can do within this code. Some of them include managing authorization, hiding pages, etc.

I hope you enjoyed this tip. If you have any suggestions, please post them. I am sure some genius will have something to add. :)

Really Customized Urls for SEO in Rails

written by Paul on May 5th, 2008 @ 10:35 PM

I needed to build urls that were packed with keywords for SEO. I needed to make sure that the url more fully described the contents of the page.

This default rails url does not cut it.

/destinations/12345

This does cut it.

/d-12345-mountain_view

So here is the hack that I did to get the desired effect. (Suggestions or insults on my approach are welcomed!)

First, I added this code into a plugin that I was using for our custom routes stuff. You can probably add this to the environment.rb file or, better yet, to a file within lib, and just make sure that you require the file from within environment.rb. I really needed to add the ’-’ as a delimiter.

This step is important because by default rails uses slashes (/) and a few other characters as delimiters for parts of the url, but not dashes. By adding a dash (-) to the array, things work the way they should.

module ActionController
  module Routing
    SEPARATORS = %w( / ; . , ? -)  
  end
end

Then I added a named route (config/routes.rb) that looked something like this:

map.d '/d-:destination_global_id-:name*other_params', :controller => 'destinations', :action => 'show', :canonicalize => true, :destination_global_id => /\d{1,20}/, :name => /[^-]+/

Now we can create helper methods that take all of these wonderful parameters.

def custom_d_path(destination, params={})
  d_path(
    destination.global_id, 
    string_for_url(destination.name)
  ) + (params.size > 0 ? create_other_parameters(params) : "")
end                                                

The method string_for_url() just replaced spaces with underscores and removed illegal characters.
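I did not post string_for_url(), but a minimal sketch of that kind of helper could look like the following. The exact rules are assumptions for illustration: downcasing and the list of allowed characters are my guesses here, not necessarily what the original helper did. Note that dashes have to go, since the route above only allows a :name matching /[^-]+/.

```ruby
# Sketch of a url-safe name helper (assumed rules, not the original code).
def string_for_url(text)
  text.strip.downcase
      .gsub(/\s+/, '_')         # runs of whitespace become one underscore
      .gsub(/[^a-z0-9_]/, '')   # drop everything else, including dashes
end

puts string_for_url('Mountain View')
puts string_for_url("  St. John's-Bay ")
```

With these rules, 'Mountain View' comes out as mountain_view, which is the form used in the example urls.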

The create_other_parameters() method appended parameters in a subtle way that ensured that Google and Yahoo! wouldn’t be prejudiced against dynamic pages with parameters. (This is another topic for another time.)

In short, now we can simply call custom_d_path(destination) from any view (or controller if we included the helper in both ActionView and ActionController classes).

I realize that there may be a better way to do this to make it simpler to code, but this is a simple example of a way to solve this problem.

Now for a couple of caveats:

  1. For those who have OCRD (obsessive compulsive REST disorder) the urls may not suit your style. I use them for the read-only pages of a site.
  2. You may not need to go to this extreme to keyword pack your urls… there are many other approaches that may be more robust and easier to implement.

Hopefully this example helps someone. :)

Merging Branches with Subversion using CLI and FileMerge

written by Paul on March 21st, 2008 @ 12:52 AM

On small projects I usually work right out of trunk to avoid the need to merge, but when working in teams to implement features that will be released separately, creating a branch or two is the way to go. The only problem with working with branches is that you have to merge your code periodically in order to avoid nightmares. Here are the steps that I use to do a simple merge between a development branch and trunk. If there are better ways or if I missed something please let me know, but this is what worked for me.

1) First, run svn update on your local working copies (both trunk and branches)

2) Change directory to the branch (branches/development)

cd /Users/Paul/Documents/test_svn/repo/

3) Run a merge command similar to the one below as a dry-run to see if everything looks OK:

svn merge --dry-run -r 4:HEAD file:///Users/Paul/Documents/test_svn/repo/trunk

4) Then if you are satisfied with what you see, you can run the real command which will actually update your working copy with the merged files from trunk.

svn merge -r 4:HEAD file:///Users/Paul/Documents/test_svn/repo/trunk

5) If you have conflicts (lines that start with “C”), then it’s time to merge the changes. I use FileMerge to merge the right version with the working version, and then I save the merged file.

6) Check in all of the merged files by doing a svn commit.

7) Now change directories to the trunk working copy and run the following as a dry run.

svn merge --dry-run -r 4:HEAD file:///Users/Paul/Documents/test_svn/repo/branches/development

8) Then if everything checks out, you do the real merge:

svn merge -r 4:HEAD file:///Users/Paul/Documents/test_svn/repo/branches/development

NOTE: when merging the branch back into trunk, you must use the same revision number as you did when you merged trunk into the branch, or the revision number of the commit made after the last merge from trunk to your branch.

If you have not done any regular merges (which you should do, BTW, to avoid really hairy merges), then your revision numbers for both merges will be the same.

9) Resolve any conflicts.

10) Check in all of the merged files by doing a svn commit.

Now your two branches are synced up! Yeah! Happy merging!

The key is making sure you keep track of revision numbers and merges. One way to do that is to create a tag with a date or sequence number. You can also look into the history by using svn log.
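If the dry-run output from the steps above gets long, a few lines of Ruby can pull out just the conflicts. This is only a sketch: conflicted_paths is a made-up helper, the sample output is invented, and it assumes the simple layout where a conflicted file shows up as a line starting with "C" followed by the path. Other svn versions may print extra status columns.

```ruby
# Hypothetical helper: list the paths that svn merge reported as conflicted.
def conflicted_paths(merge_output)
  merge_output.each_line
              .map(&:chomp)
              .select { |line| line =~ /\AC\s+\S/ }   # lines that start with "C"
              .map { |line| line.sub(/\AC\s+/, '') }  # keep only the path
end

# Invented sample of what a dry run might print.
sample = <<~OUTPUT
  U    app/models/photo.rb
  C    app/models/destination.rb
  A    lib/my_url.rb
OUTPUT

puts conflicted_paths(sample)
```

You could feed it the saved output of the --dry-run command and open each listed file in FileMerge.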

FileMerge Command Line Tools for Subversion

written by Paul on March 21st, 2008 @ 12:44 AM

Here are some pretty useful tools for using FileMerge on OSX with the command line subversion.

I stumbled across these tools while I was looking around at how I could better use FileMerge with Subversion.

Here is what I did to fully set them up.


# sudo su
# cd /usr/bin
# svn export http://ssel.vub.ac.be/svn-gen/bdefrain/fmscripts/fmdiff .
# svn export http://ssel.vub.ac.be/svn-gen/bdefrain/fmscripts/fmdiff3 .
# svn export http://ssel.vub.ac.be/svn-gen/bdefrain/fmscripts/fmresolve .
# exit
# vim ~/.subversion/config

Be sure to set your diff and diff3 tools (the diff-cmd and diff3-cmd options) to use fmdiff and fmdiff3.

Thanks Bruno De Fraine for publishing these great tools and making my life a little easier. :)

Bulk Zone file Serial Number Increment

written by admin on January 18th, 2008 @ 01:25 AM

I have way too many domain names, so when I want to make a change to my zone template files, like a search and replace for certain ips or just changing the email in the zone as I do below, I need to do it in bulk. (Or whatever you need to do.)

I first backed up my zone files with a basic but effective cp command:

blah@server ~# cp -r /var/named /var/named-backup

Then I replaced my email with one that would handle the spam and put it in the right mailbox (/dev/null.) :)

blah@server ~# for file in $(ls /var/named/*.db); do sed -i "s/paul.mydomain.com/dns.omniop.com/g" $file; done

Now that all of the zone files are updated, even if I were to restart named, the changes would not reach my slave DNS servers because the serial numbers in the zones have not changed.

...
2008011502    ; serial, todays date+todays
...

So here is a quick little shell script that I wrote that increments the serial number in all of the BIND zone files for my DNS server.

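#!/bin/bash
# Bump the 2008-dated serial number in every zone file by one.
for file in /var/named/*.db;
do
  if [ -f "$file" ];
  then
    # Take the first match only, in case the pattern appears more than once.
    OLD=$(egrep -ho "2008[0-9]*" "$file" | head -n 1)
    # Skip files that have no serial matching the pattern.
    [ -n "$OLD" ] || continue
    NEW=$(($OLD + 1))
    sed -i "s/$OLD/$NEW/g" "$file"
    echo "fixed $file"
  fi
done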

There may be a better way of doing this, but I found this very quick and painless.
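One of those better ways might be to make the bump date-aware, since the serial is really today’s date plus a two-digit counter. Here is a rough Ruby sketch of that idea. It is not what the script above does (the script just adds 1), and next_serial is a made-up name.

```ruby
require 'date'

# Hypothetical helper: return the next serial in YYYYMMDDNN form.
def next_serial(old_serial, today = Date.today)
  today_prefix = today.strftime('%Y%m%d')
  old = old_serial.to_s

  if old.length == 10 && old.start_with?(today_prefix)
    # Already changed today: bump the two-digit counter.
    (old.to_i + 1).to_s
  else
    # First change today: start at 01, but never go backwards,
    # because slaves only pick up a serial that is larger.
    candidate = "#{today_prefix}01"
    candidate.to_i > old.to_i ? candidate : (old.to_i + 1).to_s
  end
end

puts next_serial('2008011502', Date.new(2008, 1, 15))
puts next_serial('2008011502', Date.new(2008, 1, 18))
```

The first call bumps the counter to 2008011503, and the second starts a fresh serial for the new day, 2008011801.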

Hopefully I will get less spam now that the DNS email scrapers can’t get my email from my zone files.

Hope this helps someone!

Google's FREE blog backup service!

written by Paul on January 14th, 2008 @ 12:47 AM

A month ago this blog’s MySQL database with InnoDB tables got totally hosed due to some unfortunate events, and because it was just this blog I did not have a recent sql dump backup. Anyways, thanks to Google’s index cache I was able to recover most of my blog articles. I still didn’t get all of my comments back, but hey, what do you expect from a FREE blog backup service from Google? It was still better than the one that I was paying for that didn’t work. ;)

VPS restoration from backup kills your InnoDB database -- don't let it happen to you!

written by Paul on January 11th, 2008 @ 12:56 AM

A month ago my then VPS provider, JaguarPC, had some really freaky hardware issues (to this day I have no idea what happened), and they ended up restoring a two-week-old backup of the whole server, which included my VPS. When I fired up this blog and a couple of other sites they failed due to mysql table corruption. The corrupt databases that used MyISAM tables seemed to repair just fine, but all of my InnoDB databases (Rails uses InnoDB by default when you use migrations) were unrecoverable, and I ended up having to try other means of getting my data back, or at least as much of it as I could.

Here is what I learned:

  1. Never assume that your host’s backups of your VPS will work when they are restored, because they perform backups while the server is running and databases don’t like that too much.
  2. Always keep backups of your databases, especially the ones that use the InnoDB table engine, in a SQL dump format.

So here is what I do now to prevent this from happening again:

  1. Perform your own backups of your databases using the methods that are suggested for your db and db table engines.
  2. Get the data into SQL so when your VPS is backed up it will properly backup a dump.

Assuming that you have a file that contains a list of databases, one per line, you can do something like the following and then hook up your script to cron.

#!/bin/bash

cd /var/lib/mysql

if [ ! -d sql_backup ]; then 
  mkdir sql_backup
fi

for db in $(cat databases.txt); do echo $db; mysqldump --single-transaction $db > sql_backup/${db}.sql; done

Good luck, and oh BTW, you might want to get this running on your VPS before your host does the restore. ;)

Subject: please cancel my account

written by Paul on January 9th, 2008 @ 12:51 AM

I am looking to change VPS providers from JaguarPC to a company that has a higher level of service and uptime, since I have had over 4 or 5 days of downtime and even lost some data due to the restoration from backup that killed my InnoDB tables in MySQL. (Enough about that.)

Anyways, I found a company called InMotion, and after just a few days, here is my last trouble ticket.


To whomever it may concern,

Please cancel my account and refund the charges to my credit card as I would like to take you up on your 30-day money back guarantee.

In case you have questions about why, here is my best effort at giving you both an explanation and some customer feedback that I hope will be helpful to you as you improve your business.

Over the past couple of days I attempted to get going with you guys in hopes that I could move my current sites/customers over to a more responsive VPS provider with better support, etc. I am sure you are a great company and provide great services to many customers, but to be honest I was not impressed with the process of signing up and how my initial tickets were handled.

1) I was surprised by the $2/ip as there is no mention of this pricing before signup and the sales page implies that you do not charge for ips as long as they are justified. Believe it or not, that was one reason why I considered you guys.

2) The support seemed very reactive overall, and although I asked questions that could imply that I was relatively experienced with system administration, it seemed as though I was talked down to and was practically told to be grateful for your help and that I wasn’t charged for your services. (The “service” in question, which I didn’t ask for, was changing the port that sshd was going to run on. I only told you that I was going to be changing it in order to have you open up a hole in your firewall to accommodate my customization.)

3) By default I didn’t have root access, and after I was “approved” to have it, I was still unable to ssh into my VPS due to your network firewall. Then once I inquired about it, I had to explain that I needed ssh access and was asked for the last 4-digits of my cc number, and then the ticket was closed by a technician, as if my issue had been resolved. After opening the ticket again and providing my cc info, I was asked for the ip address that I was going to be sshing from so you could add a firewall rule that would allow me access, to which I replied that I accessed the server from many locations.

Long story short: I appreciate your help but I don’t feel like InMotion is where I should be.

Good luck and have a most pleasant day!


I don’t like posting negative stuff (unless it’s about Microsoft and Visual Studio), but before I signed up with InMotion I searched Google for stuff like “InMotion sucks” and “InMotion bad”, hoping that I could see what other people’s experiences were, and couldn’t find much. I hope this helps other hackers who will do the same and maybe saves them some grief.

And no, this was not a paid advertisement by InMotion. ;)

UPDATE: InMotion sent me an email apologizing for the miscommunications and asking me to reconsider my decision. I will admit that their email was very nice and that they wanted to keep my business and make sure that I am happy. I am sure that their services will be great for some people, but personally I am looking for a provider who is a little bit more interested in the big picture of what it is that I want to accomplish and get set up. When I have issues or problems, I need to be able to ask a specific question and get taken care of. Each support cycle is expensive for me. I think the issues that I am explaining are due to VPS companies whose primary business is shared hosting and whose secondary business is VPS: their techs are trained to deal with the shared hosting folks. If I decided to stay it wouldn’t be a horrible thing I suppose, but I have already found an alternative VPS provider, ServInt.

I am sure that InMotion is a good company, but it didn’t work out for me.

Active Merchant Support for the Verifi Payment Gateway

written by Paul on August 11th, 2007 @ 09:26 AM

A few months ago I was working on a project that required me to set up credit card processing. My client already had a merchant account using the Verifi gateway, and instead of talking my client into using a different gateway or doing something from scratch, I figured I would just add support for Verifi to the Active Merchant Rails plug-in and then give the code back to the plug-in project. So if you are using the Verifi payment gateway, you have a great rails plug-in with support for it.

Found any good Rails resources... well, here are 74 great ones

written by Paul on May 3rd, 2007 @ 12:54 PM

Let's be open about this: Rich McIver sent me an email informing me that he had written an article on "74 Quality Ruby on Rails Resources and Tutorials." My immediate thought was "what? who? why?" but then I went to the article and after reading through it I found myself interested in the articles that he had put together. Good job Rich for the great article and for the 74 handpicked resources -- even if the number 74 is one article shy of 75. Okay, who is going to write the 75th resource now?

Oh, yeah! Before I forget, here is the link to Rich's article.

74 Quality Ruby on Rails Resources and Tutorials

With all of the resources out there on Rails, it's nice to have a quality list.

MySQL on the move from Latin1 to UTF8

written by Paul on January 25th, 2007 @ 12:45 PM

A few days ago I had to move a Wordpress blog from one server to another, and it turned out to be a bigger project than I had originally thought due to the character set being set to Latin1 on the old server and about 180+ posts that were copied in from Microsoft Word containing strange opening and closing quotes and hyphens. When I did a dump of the database and then reimported the data into a utf8 database, many strange characters showed up in the posts. I did what I usually do in these situations and started to Google for an explanation. I found this article and it referenced this article, and here is what I ended up doing to get the issue solved.

I opened up the raw sql dump file in less and saw the strange characters in the text, and they looked something like this:
Don<C3><A2><E2><82><AC><E2><84><A2>t

I looked at the context of the skewed characters and saw immediately that it was an apostrophe that was made "special" by Word and then copied into Wordpress. I removed the "<" and ">" and got C3A2E282ACE284A2, which I then put in the queries that were posted in the articles that I read (links above).

I repeated the above steps until all of the strange characters were fixed. If you are reading this because you are trying to do the same fix, you may find the queries below helpful.



-- C3A2E282ACE284A2 = ' (apostrophe)
UPDATE wp_posts SET post_content = REPLACE(post_content, UNHEX('C3A2E282ACE284A2'), "’") WHERE post_content REGEXP UNHEX('C3A2E282ACE284A2');


-- C3A2E282ACC29D = " (close quote)
UPDATE wp_posts SET post_content = REPLACE(post_content, UNHEX('C3A2E282ACC29D'), "\"") WHERE post_content REGEXP UNHEX('C3A2E282ACC29D');


-- E28099 = ' (another form of a single quote)
UPDATE wp_posts SET post_content = REPLACE(post_content, UNHEX('E28099'), "'") WHERE post_content REGEXP UNHEX('E28099');


-- C382C2B4 = ' (yet another quote)
UPDATE wp_posts SET post_content = REPLACE(post_content, UNHEX('C382C2B4'), "'") WHERE post_content REGEXP UNHEX('C382C2B4');


-- C3A2E282ACC593 = " (open quote)
UPDATE wp_posts SET post_content = REPLACE(post_content, UNHEX('C3A2E282ACC593'), "\"") WHERE post_content REGEXP UNHEX('C3A2E282ACC593');


-- C3A2E282ACE2809C = - (dash/hyphen)
UPDATE wp_posts SET post_content = REPLACE(post_content, UNHEX('C3A2E282ACE2809C'), "-") WHERE post_content REGEXP UNHEX('C3A2E282ACE2809C');
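
For anyone wondering where those hex strings come from, here is a small Ruby sketch (not from the articles I followed). It assumes the garbling happened the usual way: the UTF-8 bytes of Word's fancy characters were read as a single-byte Windows-1252/Latin1 string and then encoded to UTF-8 a second time. The helper names double_encode_hex and undo_double_encoding are made up for illustration.

```ruby
# Hypothetical helper: simulate the double encoding and return the hex bytes.
def double_encode_hex(char)
  # Reinterpret the UTF-8 bytes as Windows-1252, then convert to UTF-8 again.
  garbled = char.encode('UTF-8').dup.force_encoding('Windows-1252').encode('UTF-8')
  garbled.unpack('H*').first.upcase
end

# Hypothetical helper: reverse the double encoding for a garbled string.
def undo_double_encoding(garbled)
  garbled.encode('Windows-1252').force_encoding('UTF-8')
end

puts double_encode_hex("\u2019")   # Word's apostrophe (right single quote)
puts double_encode_hex("\u201C")   # open quote
puts double_encode_hex("\u2013")   # en dash
puts undo_double_encoding("Don\u00E2\u20AC\u2122t")
```

The first three lines print C3A2E282ACE284A2, C3A2E282ACC593 and C3A2E282ACE2809C, the same byte sequences used in the apostrophe, open quote and dash queries above, and the last line prints the repaired word. The close quote is not covered here: its last byte (9D) has no character assigned in Windows-1252, so Ruby may refuse to convert it, while MySQL's latin1 passes it through.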


I hope posting this helps someone save a few hours of hunting around. :)

Installing RMagick on OSX

written by Paul on September 2nd, 2006 @ 12:26 AM

I am working on a little app (link coming soon) with a friend of mine in an effort to practice my rails and now RMagick skills, since my day job doesn't allow me the opportunity.

One of the things that I am building is a logo generator, so I need to have an image manipulator/generator of some sort. I have used ImageMagick on many projects in the past, so I looked forward to spitting out the classy logos using RMagick.

Like most open-source installs on OSX and Linux, there were some issues that came up along the way.

I first ran the following command on my OSX terminal but got a couple of errors.

# sudo gem install RMagick
...
Can't find Magick-config or GraphicsMagick-config program.
...

I fixed this error by installing the imagemagick-dev version as opposed to imagemagick.

Then when I tried it again I received this error:

...
Can't install RMagick. Can't find libMagick or one of the dependent libraries
...

I resolved this error by searching google and finding this thread so I told fink (one of my osx package managers) that I wanted it to build imagemagick from source with the following command:

# fink --no-use-binary-dist install imagemagick-dev

After I rebuilt ImageMagick from source and included all of the dependent libraries, I was able to successfully run the following command with no problems:

# sudo gem install RMagick

It worked! Yeah!

Now I will get back to the RMagick docs. :)

My OSX Package Managers

written by Paul on July 31st, 2006 @ 12:54 AM

I am no OSX guru, but I will say I love how I can benefit from an operating system that has popular software developed for it yet still contains the *nix-like terminal with its maturity and vast offering of open source programs. I use both Fink and Port and use one or the other depending on the versions of packages they have available. If there is a bug in a package from one, I use the package from the other and it works just great.

Moving from Windows to OSX is a No-Brainer

written by Paul on July 24th, 2006 @ 01:25 AM

A year ago I was considering the move from Windows to Linux Desktop, but was a little bit concerned about not being able to run a few apps that I use day to day. The thought came to me... "why don't I just go to OSX since it is unix and has all of the software that I need?", so here I am today finally moved over to OSX. Yeah!

Last week I decided to purchase a MacBook Pro and start my move from Windows XP. I thought I would report a few of the things that I noticed during the transition.

1) Installing new software is so easy... where did all of the wizards go? :)

The first thing I noticed was that when I downloaded software, in many cases it was just a matter of dragging the application from the mounted package into my applications folder. I was expecting a Windows-esque experience where a wizard always comes up and I have to click next, next, next, yes/agree, next... you get the point, but it only happens when you install larger apps and even then it is fast and easy.

2) The hardware and software just work!

It came time for me to move my Palm Desktop stuff to OSX and setup the bluetooth sync. Now to give you some background, this process took me two days to setup on my windows xp laptop. I installed the palm desktop and turned on my treo 650 and went to the bluetooth sync settings. It quickly saw my macbook and I clicked next. Guess what? Yes, it worked. I was beside myself and wondered what I would do with all of the extra time I had expected to spend. :)

3) There are so many really cool apps out there.

I asked two of my good friends for suggestions for OSX since between the two of them they covered my software needs. It took me a couple of hours or so to install all of the software they recommended. It was easy and very fast.

I love my new MacBook Pro and look forward to becoming a more mature OSX user. When I expressed my concerns about not having the time to learn a new OS, all of the Mac users that I knew said that it would be an easy transition -- they were right.

My Treo Over Dinner

written by Paul on October 20th, 2005 @ 03:00 PM

Not too long ago I was on vacation, out of state, eating dinner with my wife, sister and brother-in-law when my co-worker called me and told me that our mail server was not sending mail from our app. Well, I was not by my computer, nor a wireless network, or so I thought. Just then, I remembered I had installed tuSSH (an ssh client for PalmOS), so I fired it up and was able to poke around our linux mail server to find everything in order. I called my co-worker back and told him that the emails that he was testing were in his mail server's queue. He called me 10 minutes later to confirm that he had received the email. I love my treo because of how useful it is.
