On Translations

July 19, 2011 § Leave a comment

Languages have never really been a strong point of mine.  I took Spanish through high school and probably retain enough to order some beers and find the bathroom in an emergency.  It comes down to the fact that I am not a huge fan of memorization.  Luckily there are some amazing tools out there to help.

Google Translate

The first one is probably one of the most used translation engines out there, Google Translate.  Sadly as many have probably already read a while ago, Google isn’t going to be offering it as a free service anymore citing extensive abuse. Luckily for those with real needs, there will be a paid version.  [link]

Since I spend a lot of my day on a command line I figured that it would be nice to have a little translate tool at my disposal.  One where I didn’t have to keep opening up a web page.  So I wrote a little bash function to call the Google Translate API from the command line and print the results right there.

translate() {
  wget -qO- "http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&q=$1&langpair=${3:-}|${2:-en}" | sed -E -n 's/[[:alnum:]": {}]+"translatedText":"([^"]+)".*/\1/p';
  echo ''
  return 0;
}

You just drop this little guy in your .bashrc file and… Boom! Translations on the command line.

USAGE

translate <phrase> [<destination language>] [<source language>]

The destination language is assumed to be English unless otherwise specified, and the source language is auto-detected if not specified.

$ translate hola
hello
$ translate hello es
hola
$ translate hello fr
bonjour
$ translate hola de es
Hallo
$ translate hola de
Hallo
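One thing the function above skips is escaping: the phrase goes into the URL as-is, so characters like & in the phrase may confuse the query string. Here is a minimal Ruby sketch of building the same request URL with the parameters escaped. The endpoint and parameter names are copied from the bash function; translate_url is a made-up helper name, and nothing here actually calls the service.

```ruby
require 'uri'

# Endpoint copied from the bash function above.
TRANSLATE_ENDPOINT = "http://ajax.googleapis.com/ajax/services/language/translate"

# Hypothetical helper: builds the request URL with every parameter escaped.
# An empty source language leaves the left side of the langpair blank,
# matching the ${3:-} default in the bash function.
def translate_url(phrase, destination = "en", source = "")
  query = URI.encode_www_form(
    "v"        => "1.0",
    "q"        => phrase,
    "langpair" => "#{source}|#{destination}"
  )
  "#{TRANSLATE_ENDPOINT}?#{query}"
end

puts translate_url("good morning", "es")
```

The argument order mirrors the bash version: phrase first, then destination, then source.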

I’m keeping my fingers crossed that Google keeps some sort of free version of the service available.  Check out the source in a more readable format over on github.

Word Lens

One of the other tools that had my jaw on the floor the first time I saw it was the iPhone app Word Lens.  Go over to the site and watch the demo video.  I’ll wait.  I love this thing.  It is an Augmented Reality (AR) app that uses the camera to read text and automatically replace it in the image… in realtime.  Absolutely ridiculous.  It was very interesting to read a lot of the comments about the app when it first came out.  People were complaining about the language packs being $5 and then the price getting upped to $10.  They were whining that it was too slow.  They couldn’t believe that it only did word-by-word translation and didn’t work on phrases.

Seriously?  You’re telling me that being able to point the camera in your phone at any sign in a foreign country (well, any Spanish-speaking foreign country at the moment, anyway) and immediately have it become understandable isn’t worth $10?  That you’d rather spend more on a phrase book and have to thumb through it by hand?  It is a proof of concept and a very good one at that.  Even in its current form it has the ability to change travel forever.  At the moment any English speaker has the ability to at least get the main point of any printed sign in Spanish, instantaneously.

Game changer.  It’s like your iPhone becomes a universal translator from Star Trek, or your mind is tweaked by the TARDIS and suddenly everything is comprehensible.

The Future is Now.


On When Create Is Not Like Create

July 14, 2011 § Leave a comment

I recently ran into an interesting issue in ActiveRecord while trying to set default values on an object using the after_initialize callback.  One would think the following blocks of code would be equivalent:

# new, save
Product.new(:title => "Awesome Product").save

# new w/ block, save
product = Product.new do |p|
  p.title = "Awesome Product"
end
product.save

# create
Product.create(:title => "Awesome Product")

# create w/ block
Product.create do |p|
  p.title = "Awesome Product"
end

In 99.9% of cases this is going to be true.  The last one, create with a block, however, can potentially cause you some problems if you are using after_initialize. The problem arises when you use after_initialize to set default values for attributes that are dependent on other attributes. Let us consider our Awesome Product has two more attributes, msrp and wholesale_price, that are tied to each other. If we have one of them we can always determine what the other should be. In this case, there wouldn’t really be a reason to set both of them when creating a new object. Just set one and let the other one get set automatically.

For our example we’ll say, msrp = 2 * wholesale_price. You might use an after_initialize that looks something like this:

def after_initialize
  # set wholesale_price based on msrp
  if !msrp.nil? && wholesale_price.nil?
    self.wholesale_price = msrp / 2
  # set msrp based on wholesale_price
  elsif msrp.nil? && !wholesale_price.nil?
    self.msrp = wholesale_price * 2
  end
end

We can instantiate an object like this:

product = Product.new(:name => "Awesome Product", :msrp => 20)
 => <Product ...>
product.save
 => true
product.msrp
 => 20
product.wholesale_price
 => 10

Everything is working as it should. Now let’s use create instead of new and save.

product = Product.create(:name => "Awesome Product", :msrp => 20)
 => <Product ...>
product.msrp
 => 20
product.wholesale_price
 => 10

Still works just fine. Now create with a block:

product = Product.create do |p|
  p.name = "Awesome Product"
  p.msrp = 20
end
 => <Product ...>
product.msrp
 => 20
product.wholesale_price
 => nil

Uh, oh… Why didn’t wholesale_price get set? Take a look at the implementation of create in ActiveRecord::Base.

def create(attributes = nil, &block)
  if attributes.is_a?(Array)
    attributes.collect { |attr| create(attr, &block) }
  else
    object = new(attributes)
    yield(object) if block_given?
    object.save
    object
  end
end

Notice in the else block that a new object is created and then the block is yielded. This means that the after_initialize callback is run on the instantiated object BEFORE the block code is run. msrp is not set yet when after_initialize is run, so wholesale_price can’t be set. create without a block works fine because it is literally the same as using new and save.

TL;DR – after_initialize runs before the block code when using create and a block. Be careful when using after_initialize to set default values for attributes that depend on other attributes.
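If you want to poke at the ordering without a Rails app handy, here is a minimal plain-Ruby sketch. SketchProduct is a made-up stand-in that only imitates the new, yield, save sequence shown in create above; it is not ActiveRecord and skips everything else a real model does.

```ruby
# Plain-Ruby stand-in for an ActiveRecord model (hypothetical, no Rails needed).
class SketchProduct
  attr_accessor :name, :msrp, :wholesale_price

  def initialize(attributes = {})
    attributes.each { |key, value| public_send("#{key}=", value) }
    after_initialize # runs as soon as the object exists, like the real callback
  end

  # Same shape as the create shown above: new, then yield, then save.
  def self.create(attributes = {})
    object = new(attributes)
    yield(object) if block_given?
    object.save
    object
  end

  # Stand-in for persistence; always succeeds.
  def save
    true
  end

  private

  def after_initialize
    if !msrp.nil? && wholesale_price.nil?
      self.wholesale_price = msrp / 2
    elsif msrp.nil? && !wholesale_price.nil?
      self.msrp = wholesale_price * 2
    end
  end
end

# Attributes passed as a hash are set before after_initialize runs.
with_hash = SketchProduct.create(:name => "Awesome Product", :msrp => 20)

# Attributes set in the block arrive after after_initialize has already run.
with_block = SketchProduct.create do |p|
  p.name = "Awesome Product"
  p.msrp = 20
end

puts with_hash.wholesale_price.inspect
puts with_block.wholesale_price.inspect
```

One way around it, assuming the derived value is only needed by the time the record is saved, would be to do the derivation in a save-time callback such as before_validation, which runs after the block has had its say.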

On Cross Subdomain Cookies

July 12, 2011 § 1 Comment

The first Ruby gem I ever wrote was tld-cookies.  While it is very poorly named (it probably should have been called root-domain-cookies or something like that), it adds a nice little bit of functionality to the Rails 3 cookie jars.

One of the things about Rails 3 that I thought was really cool was the way cookies were accessed. It’s not a big and fancy piece of code, but to me it is just a slick way to do things. The chaining of the different cookie jars makes it trivial to create the cookies you want and need.

cookies.permanent.signed[:awesome_cookie] = "cookies awesomeness"
cookies.signed[:awesome_cookie]
 => "cookies awesomeness"

At the time I was working on a project at work that required the use of a lot of dynamic subdomains, and we wanted to be able to write cookies across all of the subdomains as well as for individual subdomains. In Rails 3 you could set the domain when you write to the cookie like:

cookies.signed[:awesome_cookie]     = { :value => "cookies awesomeness",           :domain => "example.com" }
cookies.signed[:awesome_cookie_sub] = { :value => "cookies awesomeness subdomain", :domain => "sub.example.com" }

Now that is a lot of extra work and looks pretty ugly. You could set the default domain for your cookies like this:

Rails.application.config.session_store :cookie_store, :key => '_app_name_session', :domain => :all

But I guess I’d rather explicitly say when a cookie is to be used across all subdomains. To that end, tld-cookies adds a tld cookie jar to your Rails 3 app which sets the domain for the cookie to be the root domain, i.e. example.com.

cookies.tld.signed[:tld_cookie] = "ACROSS ALL SUBDOMAINS!!!"
cookies.signed[:tld_cookie]
 => "ACROSS ALL SUBDOMAINS!!!"

As you can see above, you use it similarly to how you would use the permanent cookie jar. The slight difference is when you want to delete the cookie you have to use the tld accessor.
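To give a feel for what "root domain" means here, this is a rough plain-Ruby sketch of pulling it out of a request host. It is not the gem's actual code: root_domain and its tld_length parameter are names I made up for illustration, and the approach simply keeps the last few labels of the host.

```ruby
# Hypothetical helper: keeps the TLD plus one more label.
# tld_length is the number of labels in the TLD itself
# (1 for "com", 2 for something like "co.uk").
def root_domain(host, tld_length = 1)
  host.split(".").last(tld_length + 1).join(".")
end

puts root_domain("sub.example.com")
puts root_domain("deep.sub.example.co.uk", 2)
```

A naive label count like this has no way of knowing that co.uk is a two-part TLD, which is why the sketch makes the caller say so.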

So yeah, first Ruby gem. Poorly named, fun little learning project.

On Selecting A Single Column

July 9, 2011 § 1 Comment

Many times when we are selecting rows out of the database we just want a single column and have no need for the entire object. There are a number of ways to accomplish this with ActiveRecord. One can get all the records from the database and then collect the attribute needed:

Post.where(:status => 'published').collect(&:id)
=> [ 1, 5, 8, 10 ]

This has the benefit of being able to use any overwritten accessors, but has a lot of overhead associated with generating the objects. Another way to do it is to go directly to the database:

ActiveRecord::Base.connection.select_values("SELECT id FROM posts WHERE status = 'published'")
 => [ 1, 5, 8, 10 ]

This is much faster, but requires one to use the direct connection to the database and have the SQL literal prepared. Not particularly user friendly, even if you can get the SQL literal using to_sql:

ActiveRecord::Base.connection.select_values(Post.where(:status => 'published').select(:id).to_sql)
 => [ 1, 5, 8, 10 ]

Wouldn’t it be nicer if you could just do the following:

Post.where(:status => 'published').select_column(:id)
 => [ 1, 5, 8, 10 ]

The select-column gem provides the functionality above.  You can use it in your Rails 3 app or check out the source code over on github.

Usage

select_column accepts a single optional argument. This is the column that you want to have returned in an array. The returned column can also be specified using the select query method.

If neither a select nor an argument is given, :id is assumed to be the column to be returned. If multiple select query methods are present, the first one defined will be the column returned.

Some examples:

# selects an array of ids
Post.select_column

# selects an array of titles
Post.select_column(:title)

# selects an array of ids
Post.where(:status => 'published').select_column

# selects an array of titles
Post.where(:status => 'published').select_column(:title)

# selects an array of titles
Post.select(:title).where(:status => 'published').select_column
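The rules for which column comes back can be sketched in a few lines of plain Ruby. This is only an illustration of the fallback order described above, not the gem's code: column_to_select is a made-up name, select_values stands in for whatever select calls are on the relation, and letting an explicit argument win over a select is my assumption.

```ruby
# Hypothetical helper: explicit argument first (assumed), then the first
# select, then :id as the final fallback.
def column_to_select(argument = nil, select_values = [])
  (argument || select_values.first || :id).to_sym
end

puts column_to_select.inspect                       # nothing given
puts column_to_select(:title).inspect               # argument given
puts column_to_select(nil, [:title, :body]).inspect # only selects given
```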

Update (Jan 21, 2012): It’s like they keep looking at my gems and integrating them into Rails. As of Rails 3.2 this gem’s functionality has been replicated by ActiveRecord::Relation#pluck. Check it out in the release notes.

On Postal Abbreviations

June 30, 2011 § Leave a comment

While working on a Rails form that needed a drop down select box for state postal abbreviations I put together a few Ruby hashes to look things up.  Below are a few snippets from the different hashes so you can see if they work for what you need.  You can check out the full things here.

STATE_ABBR_TO_NAME = {
  'AL' => 'Alabama',
  'AK' => 'Alaska',
  'AS' => 'American Samoa',
  'AZ' => 'Arizona',
  'AR' => 'Arkansas',
  ...
STATE_NAME_TO_ABBR = {
  'Alabama'        => 'AL',
  'Alaska'         => 'AK',
  'American Samoa' => 'AS',
  'Arizona'        => 'AZ',
  'Arkansas'       => 'AR',
  ...
STATE_NAME_TO_ABBR_LOWER = {
  'alabama'        => 'AL',
  'alaska'         => 'AK',
  'american samoa' => 'AS',
  'arizona'        => 'AZ',
  'arkansas'       => 'AR',
  ...
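If you would rather keep only one hash by hand, the other two can be derived from it. Here is a small sketch using a three-entry sample; the SAMPLE_ constants and the abbr_for helper are names made up for this example and are not in the full file.

```ruby
# A tiny sample in the same shape as STATE_ABBR_TO_NAME.
SAMPLE_ABBR_TO_NAME = {
  'AL' => 'Alabama',
  'AK' => 'Alaska',
  'AZ' => 'Arizona'
}

# Swap keys and values to look up by full name.
SAMPLE_NAME_TO_ABBR = SAMPLE_ABBR_TO_NAME.invert

# Downcase the names so lookups can ignore case.
SAMPLE_NAME_TO_ABBR_LOWER = Hash[
  SAMPLE_NAME_TO_ABBR.map { |name, abbr| [name.downcase, abbr] }
]

# Hypothetical helper: forgiving lookup for user-typed state names.
def abbr_for(input)
  SAMPLE_NAME_TO_ABBR_LOWER[input.to_s.strip.downcase]
end

puts abbr_for("  ALASKA ")
```

Spelling the hashes out by hand, as in the snippets above, has the advantage that each one can be read at a glance, so this is a trade-off rather than a strict improvement.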

On MySQL Partial Indexes

June 28, 2011 § 1 Comment

MySQL partial indexes are a great way to reduce the size of your indexes.  In Rails apps, the default string column is a VARCHAR(255) and adding an index to it can create large indexes.  Since very few of the columns you use will ever actually be 255 characters in length, and many everyday attributes and columns have high entropy in some prefix substring, partial indexes make for great compromises.

Another quick thing to note is that if you are using the InnoDB storage engine you can’t use full indexes on VARCHAR(255) columns in compound indexes because of the 767 byte limit on the index key size.

When working with partial indexes it can be helpful to know exactly how much of the column is covered uniquely by an index of a given size.  Fernando Ipar has a pretty nifty little SQL query that will give you a rudimentary peek into how well a partial index will perform.  The query will tell you what percentage of rows are uniquely identified by the index.  You can check out his blog post about it over here.  Here is the general form of the query:

-- SELECT COUNT(DISTINCT(SUBSTR(<column>,1,<partial index length>))) / COUNT(DISTINCT(<column>)) * 100 FROM <table>;
SELECT COUNT(DISTINCT(SUBSTR(name,1,10))) / COUNT(DISTINCT(name)) * 100 FROM customers;
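To see what that ratio is measuring, here is the same arithmetic in plain Ruby over an in-memory list. The prefix_selectivity name and the sample names are made up for illustration; against a real table you would run the SQL.

```ruby
# Hypothetical helper: distinct prefixes divided by distinct full values,
# as a percentage. nil values are skipped, as COUNT(DISTINCT ...) does.
def prefix_selectivity(values, length)
  values = values.compact
  distinct_full = values.uniq.size
  return 0.0 if distinct_full.zero?

  distinct_prefixes = values.map { |value| value[0, length] }.uniq.size
  distinct_prefixes.to_f / distinct_full * 100
end

names = ["Anderson", "Andersen", "Andrews", "Baker"]

puts prefix_selectivity(names, 4) # "Anderson" and "Andersen" share "Ande"
puts prefix_selectivity(names, 8) # every name is distinguished
```

With a prefix of 4 the two Anders* names collapse into one prefix, so 3 distinct prefixes cover 4 distinct names; by 8 characters every name is told apart.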

A Little Problem

With all the goodness that partial indexes offer, I have found at least one drawback. It seems that partial indexes cannot be used with grouping operations like GROUP BY.  Even if the partial index does not uniquely identify each row in the table, one would think that MySQL would be able to use the partial index to at least help the GROUP BY.

Update (11/8/2011): Someone posted an interesting answer to my question about this problem on stackoverflow. They made the point that using an index for a hint can’t really buy you anything when doing grouping operations. If the index doesn’t cover the entire string then the partial index might be able to tell if they are different, but it can’t tell for sure if they are the same. It’ll have to go to the table itself for confirmation, and if it is having to go to the table a bunch for confirmation then it might as well just do a table scan. The table scan will be more likely to have the nicer properties of a sequential read while using a partial index for hints and then going to the table for confirmation could create a bunch of random reads. There is probably some tipping point here that would make using the partial index’s hints favorable, but one would probably be better served shrinking the size of the column and indexing the full thing if you want to use the index with grouping operations.
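The "different for sure, same only maybe" point is easy to see in a toy Ruby example. This says nothing about how MySQL actually plans the query; it just shows that grouping on a prefix can merge values that are really distinct, so the full value has to be checked anyway. The names and the prefix length are made up.

```ruby
names = ["Johnson", "Johnston", "Johnson", "Baker"]

# Grouping on the first 5 characters, as a 5-character prefix index would see them.
by_prefix = names.group_by { |name| name[0, 5] }

# Grouping on the full value, which is what GROUP BY name has to produce.
by_value = names.group_by { |name| name }

puts by_prefix.keys.inspect
puts by_value.keys.inspect
```

"Johnson" and "Johnston" land in the same prefix bucket even though they belong to different groups, while "Baker" can be told apart from both on the prefix alone.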

Update (10/30/2011):  Turns out this post shows up when someone searches for mysql partial index in Google. Figured I might want to make it a little more helpful for those who end up here.

-- The most basic way to create a new partial index on a column

-- CREATE INDEX <index name> ON <table name> (<column name>(<number of characters to index>));
CREATE INDEX part_of_name ON customers (name(10));
# To create a partial index with a Rails migration
# add_index(<table name>, <column name>, :name => <index name>, :length => <partial length>)
add_index(:customers, :name, :name => 'part_of_name', :length => 10)

On Rails irb Logging

June 26, 2011 § Leave a comment

When dealing with Rails apps, logging debug statements is one of the first methods turned to when trying to diagnose a problem.  A lot of times logger.debug is used and will write its output to log/development.log.  This is great when you are running the server, but when you fire up the console and want to work there, you previously had two options. One is to change all of the logger.debug statements into puts so they show up in the console’s terminal window.  The other is to just keep switching back and forth between the console and a tail of the file.

Now you can toggle the app’s logger so that it writes to STDOUT.  Just throw this in your .irbrc file and call toggle_console_logging from the console’s command line.

def toggle_console_logging
  if ActiveRecord::Base.logger == Rails.logger
    # Currently writing to the log file; switch to STDOUT.
    l = Logger.new(STDOUT)
    l.level = Rails.logger.level
    set_logger l
    "console"
  else
    # Currently writing to STDOUT; switch back to the log file.
    set_logger Rails.logger
    "log file"
  end
end

def set_logger(logger)
  ActiveRecord::Base.logger = logger
  ActiveRecord::Base.clear_active_connections!
end

You can check out the gist here.

On Encrypted Cookie Sessions

June 16, 2011 § 1 Comment

Once I had finished up with the encrypted-cookie gem, it seemed like a natural extension to convert it into a Rails 3 session store.  It operates just like the basic cookie session store, just using an encrypted cookie instead of a signed cookie.  It uses the encrypted-cookie gem, so all the encryption is provided by ActiveSupport::MessageEncryptor.  To start using it add the following to your Gemfile:

gem 'encrypted-cookie-store'

And change your session store in config/initializers/session_store.rb

AppName::Application.config.session_store :encrypted_cookie_store, :key => '_app_name_session'

The dependencies will include the encrypted-cookie gem for you.  Accessing the session is the same as always:

session[:tid_bit] = "of information"
session[:tid_bit] # => "of information"

You can check out the source over on github.

Currently this only works with Rails 3.0.*. All of the session code got switched up for Rails 3.1, so it’s going to take some extra work to get it working for the new release of Rails. Update June 18: Got it working with Rails 3.1.  Yay conditional method definitions!!! Sigh…

On Encrypted Cookies

June 16, 2011 § 2 Comments

I love cookies.  Well, despite obviously loving the cookies of the baked goods variety, I also love Rails 3 cookies. I know it’s weird, but if you are looking for things that aren’t weird, you probably are in the wrong place. The specific thing I really like about the Rails 3 cookie system is that it is chainable.  So cool.

There are three types of built-in cookies.  Your basic everyday cookie, a permanent cookie and a signed cookie.  The basic cookie just saves something to a cookie in the user’s browser.  You can add an expiration date if you’d like.  The permanent cookie sets the expiration date 20 years from now and the signed cookie cryptographically signs the contents of the cookie so that people can’t tamper with it.  The way you interact with them is slick.

cookies[:things] = "stuff" # assign
cookies[:things] # read
cookies.permanent[:perm] = "Never going away" # write to a permanent cookie
cookies[:perm] # read a permanent cookie
cookies.signed[:tamper_proof] = "fragile info" # write to a signed cookie
cookies.signed[:tamper_proof] # read a signed cookie
cookies.permanent.signed[:perm_tamper_proof] = "fragile info" # write to a permanent signed cookie
cookies.signed[:perm_tamper_proof] # read the permanent signed cookie
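For the curious, the chaining trick can be sketched in plain Ruby: each jar wraps its parent, tweaks the options on the way in, and hands the write up the chain. SketchJar and SketchPermanentJar are made-up classes for illustration and leave out nearly everything the real Rails jars do.

```ruby
# Hypothetical base jar: stores a hash of options per cookie name.
class SketchJar
  def initialize
    @store = {}
  end

  def []=(name, options)
    options = { :value => options } unless options.is_a?(Hash)
    @store[name] = options
  end

  def [](name)
    (@store[name] || {})[:value]
  end

  def options_for(name)
    @store[name]
  end

  # Chaining: returns a wrapper that writes back into this jar.
  def permanent
    SketchPermanentJar.new(self)
  end
end

# Hypothetical wrapper: adds a far-future expiry, then delegates to the parent.
class SketchPermanentJar
  TWENTY_YEARS = 20 * 365 * 24 * 60 * 60 # in seconds, ignoring leap days

  def initialize(parent)
    @parent = parent
  end

  def []=(name, options)
    options = { :value => options } unless options.is_a?(Hash)
    @parent[name] = options.merge(:expires => Time.now + TWENTY_YEARS)
  end

  def [](name)
    @parent[name]
  end
end

jar = SketchJar.new
jar.permanent[:perm] = "Never going away"
puts jar[:perm]
```

Because the wrapper only changes the options, reading the cookie back goes through the plain jar, which matches how the permanent cookie is read without the permanent accessor in the examples above.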

Now it is best practice to not put highly sensitive information in cookies. There is just no telling what is going on with a user’s browser and their computer.  But there are times where you might have some pseudo-sensitive info that you want to store in a cookie.  Not things like social security number, but maybe some app info that you would rather people not see, but it wouldn’t be the end of the world if they did.

Enter encrypted-cookies.

encrypted-cookies is a ruby gem that provides access to encrypted cookies for Rails 3 apps.  It’s built to work exactly like signed cookies, so there isn’t really anything else to say about the usage except to show the two line example:

cookies.encrypted[:secret_cookie] = "nothing to see here" # write to an encrypted cookie
cookies.encrypted[:secret_cookie] # read an encrypted cookie

Piece of cake.

I should probably mention how I do the encryption, since that is something you are probably curious about. The great thing about it is that I don’t do the encryption!  ActiveSupport has this nifty little module called ActiveSupport::MessageEncryptor.  It does exactly what you think it does.  It encrypts and decrypts messages.  I didn’t want to take a chance writing the encryption code myself.  Waaaaay too many places to make a mistake and really screw over someone using this gem.  The signed cookies in Rails 3 use ActiveSupport::MessageVerifier to make sure the cookie payload hasn’t been messed with.  In reality, encrypted-cookies does little more than swap out the verifier code with the encryptor code.  But simple code has fewer places for bugs to pop up.
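To make the signed versus encrypted difference concrete, here is a rough sketch using only Ruby’s bundled OpenSSL. It is not what ActiveSupport or the gem does internally, and please don’t use it for real secrets (the encrypted message isn’t even authenticated); the secret, the message layout and the method names are all made up. It only shows that a signed payload is still readable, while an encrypted one is not.

```ruby
require 'openssl'

# Hypothetical 32-byte key derived from a made-up secret.
SKETCH_KEY = OpenSSL::Digest::SHA256.digest("hypothetical app secret")

# Signing: the payload is only Base64-encoded, with an HMAC appended.
def sketch_sign(value)
  data = [value].pack("m0")
  digest = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), SKETCH_KEY, data)
  "#{data}--#{digest}"
end

# Encrypting: the payload itself is scrambled with AES and a random IV.
def sketch_encrypt(value)
  cipher = OpenSSL::Cipher.new("aes-256-cbc")
  cipher.encrypt
  cipher.key = SKETCH_KEY
  iv = cipher.random_iv
  encrypted = cipher.update(value) + cipher.final
  [encrypted, iv].map { |part| [part].pack("m0") }.join("--")
end

def sketch_decrypt(message)
  encrypted, iv = message.split("--").map { |part| part.unpack1("m") }
  cipher = OpenSSL::Cipher.new("aes-256-cbc")
  cipher.decrypt
  cipher.key = SKETCH_KEY
  cipher.iv = iv
  cipher.update(encrypted) + cipher.final
end

signed    = sketch_sign("nothing to see here")
encrypted = sketch_encrypt("nothing to see here")

# Anyone can decode the signed payload without knowing the key.
puts signed.split("--").first.unpack1("m")
# The encrypted one needs the key to get the value back.
puts sketch_decrypt(encrypted)
```

This is exactly the kind of code I did not want to be responsible for, which is why the gem leans on ActiveSupport::MessageEncryptor instead.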

Happy encrypting.

On Serialized Accessors

June 16, 2011 § 1 Comment

One of the nice things about brand new Rails apps is that all of your database tables are nice, small and manageable.  You have relatively few columns in any given table and you are probably querying on most of the columns at some point in the app.  As your app grows in size and complexity, so do your tables.  A table that once upon a time had 5 or 10 columns now has 20 or 30.  Not an unreasonable number, but you are starting to accumulate a number of columns that aren’t queried against.  What should we do with these columns?

One option is to just let them build up.  It doesn’t really hurt anything, does it?  I am sure that some database experts out there can weigh in on how the number of columns affects performance.  Option two is where I’d like to show off something a little more interesting.  Check out the data-attributes ruby gem over on github.

The premise is pretty simple.  ActiveRecord allows for easy serialization of objects to a text field in the database. data-attributes makes use of this and adds attributes that are read from and written to a serialized hash that is stored in a text field in the database. From the developer’s perspective, once a data attribute is defined, it can be used just like any other attribute. Validations work just like they do on column based attributes. To use this gem just add the following to your Gemfile in your Rails 3 project.

gem 'data-attributes'

Let’s take a look at it in action.  We start with a user model with a serialized attribute called data.

class User < ActiveRecord::Base
  serialize :data
end

Now, let’s say we have some piece of information that we want to include in the user record, but it isn’t something that we are ever going to have to query on.  We define a data attribute like so:

data_attribute :favorite_food

Now how do we use this?  We can see the results of adding the above to our User object.

u = User.new
u.favorite_food = "watermelon"
puts u.favorite_food
=> watermelon
puts u.data.inspect
=> { "favorite_food" => "watermelon" }

Pretty easy.  The default field that the gem tries to save everything to is data, but that is just based on personal convention.  If you want to change this, it’s as easy as adding the following line to your model:

data_attribute_column :food_preferences

Or if you have multiple serialized attributes and you want to send different data attributes to different serialized attributes you can do the following:

data_attribute :favorite_food, { :serialized_column => :food_preferences }

You can also set the default value to be returned for the data attribute:

data_attribute :favorite_food, { :default => "peanut butter" }

One of the things that makes all of this possible is the fact that ActiveRecord has multiple layers of accessors that come into play when you call something like user.name.  What happens is that the method name gets called, which in turn calls read_attribute(:name).  Similarly name=(val) calls write_attribute(:name, val).

data-attributes contains similar under-the-hood methods, read_data_attribute and write_data_attribute.  This way you can have a little more control over the values that are read and written to your object.
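As a rough idea of how that layering can work, here is a plain-Ruby sketch of a data_attribute macro backed by a hash. It is not the gem’s source: SketchDataAttributes and SketchUser are made-up names, the hash lives in an instance variable instead of a serialized column, and the method signatures are my guess at a reasonable shape.

```ruby
# Hypothetical macro: defines a reader and writer that go through
# read_data_attribute and write_data_attribute.
module SketchDataAttributes
  def data_attribute(name, options = {})
    key = name.to_s
    default = options[:default]

    define_method(name) { read_data_attribute(key, default) }
    define_method("#{name}=") { |value| write_data_attribute(key, value) }
  end
end

class SketchUser
  extend SketchDataAttributes

  attr_writer :data

  # Stand-in for the serialized data column.
  def data
    @data ||= {}
  end

  data_attribute :favorite_food, :default => "peanut butter"

  # The under-the-hood layer: a single place to customize reads and writes.
  def read_data_attribute(key, default = nil)
    data.fetch(key, default)
  end

  def write_data_attribute(key, value)
    data[key] = value
  end
end

u = SketchUser.new
puts u.favorite_food
u.favorite_food = "watermelon"
puts u.data.inspect
```

Having the generated accessors call through one pair of methods is what gives you a hook for things like type coercion without redefining every attribute.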

Some things to note. If your ActiveRecord object has a serialized attribute then that attribute will be saved to the database every time you call save.  This is because it doesn’t know if the serialized object has been edited in place or not, so it just writes it to the database every time for good measure.
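The "edited in place" problem is easy to reproduce in plain Ruby. SketchRecord below is a made-up class that tracks changes only through its setter, which is a simplification and not how ActiveRecord is actually written; it just shows why a change made inside the hash goes unnoticed.

```ruby
# Hypothetical record that only notices changes made through data=.
class SketchRecord
  attr_reader :data

  def initialize
    @data = {}
    @changed = false
  end

  def data=(value)
    @data = value
    @changed = true
  end

  def changed?
    @changed
  end
end

record = SketchRecord.new

record.data["favorite_food"] = "watermelon"  # edited in place, setter never called
in_place_detected = record.changed?

record.data = { "favorite_food" => "mango" } # reassigned through the setter
assignment_detected = record.changed?

puts in_place_detected
puts assignment_detected
```

Since the in-place edit never touches the setter, the safe option is the one described above: write the serialized column on every save.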

Update 2011-12-26: ActiveRecord 3.2 now has some of this basic functionality built in by way of ActiveRecord::Store.
