Ruby Performance :: Use Double Quotes vs. Single Quotes

Posted by Chris Blackburn Tue, 10 Jun 2008 17:31:00 GMT

Regarding Ruby strings: surprisingly, embedding substitutions in double-quoted strings performs better than building the same string from a single-quoted string and the append operator (<<):

require 'profilings'
include PeepcodeProfiler

MAX = 900000

###
# Quoted Strings
###

time_this("Single Quotes (append):") {
  x = ''
  (0..MAX).each do |i|
    x = 'This is a test ' << i.to_s
  end
}
Timings for Single Quotes (append):
Thread ID: 218880
Total: 11.150243

 %self     total     self     wait    child    calls  name
 58.47     11.15     6.52     0.00     4.63        1  Range#each (ruby_runtime:0}
 20.99      2.34     2.34     0.00     0.00   900001  String#<< (ruby_runtime:0}
 20.55      2.29     2.29     0.00     0.00   900001  Fixnum#to_s (ruby_runtime:0}
  0.00      0.00     0.00     0.00     0.00        1  <Class::Object>#allocate (ruby_runtime:0}
  0.00     11.15     0.00     0.00    11.15        0  PeepcodeProfiler#time_this (./profilings.rb:8}

... and the faster double-quoted interpolation:

time_this("Double Quotes (append):") {
  x = ""
  (0..MAX).each do |i|
    x = "This is a test #{i}"
  end
}
Timings for Double Quotes (append):
Thread ID: 218880
Total: 7.385500

 %self     total     self     wait    child    calls  name
 66.64      7.39     4.92     0.00     2.46        1  Range#each (ruby_runtime:0}
 33.36      2.46     2.46     0.00     0.00   900001  Fixnum#to_s (ruby_runtime:0}
  0.00      0.00     0.00     0.00     0.00        1  <Class::Object>#allocate (ruby_runtime:0}
  0.00      7.39     0.00     0.00     7.39        0  PeepcodeProfiler#time_this (./profilings.rb:8}
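
For readers without the PeepcodeProfiler helper, here is a minimal sketch of the same comparison using Ruby's standard Benchmark module. The method names and the iteration count are my own placeholders, not part of the original test, and the timings will differ by Ruby version and machine. Note that in a file starting with `# frozen_string_literal: true` the `<<` form would raise a FrozenError, because it mutates a string literal.

```ruby
require 'benchmark'

# Hypothetical iteration count; the article used MAX = 900000.
APPEND_RUNS = 100_000

# Build the string from a single-quoted literal plus String#<<.
def build_with_append(i)
  'This is a test ' << i.to_s
end

# Build the same string with double-quoted interpolation.
def build_with_interpolation(i)
  "This is a test #{i}"
end

Benchmark.bm(22) do |bm|
  bm.report('single quotes + <<:') { APPEND_RUNS.times { |i| build_with_append(i) } }
  bm.report('interpolation:')      { APPEND_RUNS.times { |i| build_with_interpolation(i) } }
end
```

Wrapping each variant in a method adds the same call overhead to both sides, so the relative difference is still visible, though smaller than in a bare loop.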

Even more interesting is how string substitution barely blinks when it is in the middle of the string rather than at the end. Getting the same effect with two appends on single-quoted strings takes about twice as long as a double-quoted string with the substitution in the middle (15.15 s vs. 7.41 s):

# Very slow
# BEGIN single_quotes_middle
time_this("Single Quotes: (in middle)") {
  x = ''
  (0..MAX).each do |i|
    x = 'This is a test ' << i.to_s << 'x'
  end
}
# END single_quotes_middle
Timings for Single Quotes: (in middle)
Thread ID: 218880
Total: 15.146328

 %self     total     self     wait    child    calls  name
 56.51     15.15     8.56     0.00     6.59        1  Range#each (ruby_runtime:0}
 28.00      4.24     4.24     0.00     0.00  1800002  String#<< (ruby_runtime:0}
 15.49      2.35     2.35     0.00     0.00   900001  Fixnum#to_s (ruby_runtime:0}
  0.00      0.00     0.00     0.00     0.00        1  <Class::Object>#allocate (ruby_runtime:0}
  0.00     15.15     0.00     0.00    15.15        0  PeepcodeProfiler#time_this (./profilings.rb:8}
# BEGIN double_quotes_middle
time_this("Double Quotes (in middle):") {
  x = ""
  (0..MAX).each do |i|
    x = "This is a test #{i}x"
  end
}
# END double_quotes_middle
Timings for Double Quotes (in middle):
Thread ID: 218880
Total: 7.410465

 %self     total     self     wait    child    calls  name
 65.51      7.41     4.85     0.00     2.56        1  Range#each (ruby_runtime:0}
 34.49      2.56     2.56     0.00     0.00   900001  Fixnum#to_s (ruby_runtime:0}
  0.00      0.00     0.00     0.00     0.00        1  <Class::Object>#allocate (ruby_runtime:0}
  0.00      7.41     0.00     0.00     7.41        0  PeepcodeProfiler#time_this (./profilings.rb:8}
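
The in-the-middle case can be sketched the same way with the standard library. This version first checks that both strategies really build identical strings, then times them with `Benchmark.bmbm`, which runs a rehearsal pass before the measured pass. The method names and iteration count are assumptions of mine, not taken from the original profiling script.

```ruby
require 'benchmark'

# Hypothetical iteration count, smaller than the article's MAX.
MIDDLE_RUNS = 100_000

# Two String#<< calls: one for the number, one for the trailing 'x'.
def append_in_middle(i)
  'This is a test ' << i.to_s << 'x'
end

# One interpolated literal with the substitution in the middle.
def interpolate_in_middle(i)
  "This is a test #{i}x"
end

# Sanity check: both strategies must produce the same strings.
identical = (0..1000).all? { |i| append_in_middle(i) == interpolate_in_middle(i) }
puts "identical output: #{identical}"

Benchmark.bmbm do |bm|
  bm.report('two appends')   { MIDDLE_RUNS.times { |i| append_in_middle(i) } }
  bm.report('interpolation') { MIDDLE_RUNS.times { |i| interpolate_in_middle(i) } }
end
```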

Look for my soon-to-be-published Peepcode book for more tips and recipes on how to scale Ruby on Rails.


Comments

  1. Ryan Tomayko said about 1 hour later:

    I haven’t checked the MRI sources but it seems reasonable. You could do quite a bit more at compile/parse time with double-quote interpolation, like setting the string up as a buffer or whatever. Creating a String and then using `#<<` is likely significantly more work for the interpreter.

    I’ve run similar benchmarks to see whether string interpolation is faster than `Array#join` (in Python, joining a list is much faster than building a string with concatenation):

    bling = 'bling'

    vs.

    bling = 'bling'
    "foo#{bling}bar"

    String interpolation was faster in that case as well, IIRC. I imagine for the same reasons as in your case – it’s just less object creation and interpreter work.

  2. Eric Larson said about 1 hour later:

    It really isn’t very surprising. When you are creating the parsed string (double quote), the interpolation requirement is relatively simple. When you use the “<<” operator, you essentially have a string that needs to be copied to a new place in memory where the extra value can then be added. It is then no surprise that doing the operation twice is slower.

    A better test would be to use the C-like interpolation:

    puts 'Some string %s' % 'hello world'

    I think that would provide a better comparison.

  3. website design said about 4 hours later:

    Which version of Ruby? I code primarily in Python, but when these “foo is faster than bar” discussions come up on comp.lang.python, I tend to point out that these things are subject to change over different versions of the language. Thus, it’s not all that useful to bend your code to use the “fast” idioms if everything changes in the next version of the interpreter.

  4. Brenton said about 19 hours later:

    It would be more correct for you to rename this entry “String concatenation vs. interpolation”. Double vs. single quotes is not the issue here (except that interpolation isn’t an option for single quotes).

  5. Eric Anderson said about 20 hours later:

    I am not sure why you find this surprising. A quick look at the Ruby source code shows why.

    When the source code is parsed the double quoted string will be turned into a list of nodes that are either literal strings or code to be evaled.

    During execution it will iterate over this list of nodes. If the node is a string it will just be appended to the resulting string it is building. If it is code to eval then it will be evaled and then appended. The logic is something like this:

    node = node_list.first result = node while node = node.next result << if node.is_a?(String) node else eval(node) end end

    Obviously this is a simplified version in Ruby. If you want to see the actual C code just search for “case NODE_DSTR” in eval.c in the Ruby source code.

    Now, why is this faster than using a single-quoted string and <<? Because the above process happens in C and directly calls rb_str_append to do the concatenation, so there is no overhead. Calling the << Ruby method has to be dispatched to a C method (rb_str_concat), which then finally calls rb_str_append. So in the end rb_str_append is called either way to concatenate the parts, but the << method has a lot more overhead.

    Obviously, if the evaled portion is in the middle, the difference is even more pronounced, because you now have two calls to the Ruby method << with all that overhead, while the double-quoted string has the exact same logic (iterate over a list of nodes and concatenate internally). It would be slightly slower only because there are now three nodes to iterate over instead of two.

    My bigger question is does any of this matter in real applications? Even if you use << and it is twice as slow as double quoted strings it is still amazingly fast. You have to execute Integer::MAX times before you start to see the performance difference. So this benchmark says nothing about real world performance. In real world applications either would be acceptable from a performance standpoint (because I doubt you are going to run your << code that many times) so it really just comes back to code clarity. Does:

    'foo' << bar << 'baz'

    look more readable in the specific block of code you are working with or does

    "foo#{bar}baz"

    look more readable? Performance is not a consideration because either is more than adequate for your application.

  6. Eric Anderson said about 20 hours later:

    Comment markup struck again. The code example is supposed to be:

    node = node_list.first
    result = node
    while node = node.next
      result << if node.is_a?(String)
        node
      else
        eval(node)
      end
    end

  7. Karl said 1 day later:

    Ok, I’m a little weird like this… the whole single quote/double quote thing keeps me up at night (but not too much). I have long tried to make all quoted strings bound by single quotes, unless they absolutely required evaluation. But it’s hard because my pinkies just love to work in concert banging out double quotes. Old habits are hard to kill.

    Very interesting results, actually has me a bit surprised.

    Next question: what about a pure and simple string manipulation using double quotes vs. single quotes, à la "this" + "that" vs. 'this' + 'that'?

  8. Chris Blackburn said 5 days later:

    Thanks for the comments, everyone. I appreciate the feedback. I do need to clarify my comment about finding it surprising. A look at the call stack after running these tests makes it painfully obvious why interpolation is faster. My surprise was not that I didn’t know why one was faster than the other. It was that, coming from the PHP world of web development several years ago, using double-quoted strings was much slower because they were scanned for interpolation. Obviously it is no surprise when you look at how the different operations are being done.

    Kinda like when you are a kid and find a present under the Christmas tree. Well, it is certainly no surprise when you find out how it really got there.

    The surprising thing indeed is that, if single-quoted strings using the append operator are slower… why? Why wouldn’t they be as fast as "This is my #{interpolation}"? Yes the obvious answer is because interpolation is written differently than the append operator. However, I am surprised that there is not more done in the interpreter to minimize the differentiation. It could and should be smarter. That surprises me.

  9. marks said about 1 month later:

    So what is faster:

    "this" + "that"

    or

    'this' + 'that'

    ?

    I would say the '' because the "" indicates that it would have to be searched for special variables inside whereas in '' this is not the case.

  10. Chris Blackburn said 3 months later:

    Marks, that is a good question. I have not timed that particular operation, but my guess is they will be very similar timings. I’ll post what we come up with when time permits.
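
Several alternatives came up in the comments: `Array#join`, C-style formatting with `%`, and plain `+` with either kind of quote. The sketch below lines them up in one benchmark. The `['foo', s, 'bar'].join` form is only my assumption about what the join variant in comment 1 looked like, since that snippet did not survive the comment markup, and all method names and the iteration count are placeholders. As far as I know, a literal with no interpolation or escapes is the same string to the interpreter whichever quote is used, so I would expect the two `+` variants to time about the same, but run it on your own Ruby version before relying on that.

```ruby
require 'benchmark'

# Hypothetical iteration count for this sketch.
ALT_RUNS = 100_000

def via_interpolation(s)
  "foo#{s}bar"
end

# Assumed shape of the Array#join variant from comment 1.
def via_join(s)
  ['foo', s, 'bar'].join
end

# C-style formatting, as suggested in comment 2.
def via_format(s)
  'foo%sbar' % s
end

# Plain concatenation with single-quoted literals.
def via_plus_single(s)
  'foo' + s + 'bar'
end

# Plain concatenation with double-quoted literals (no interpolation).
def via_plus_double(s)
  "foo" + s + "bar"
end

bling = 'bling'

Benchmark.bm(18) do |bm|
  bm.report('interpolation:')   { ALT_RUNS.times { via_interpolation(bling) } }
  bm.report('Array#join:')      { ALT_RUNS.times { via_join(bling) } }
  bm.report('String#%:')        { ALT_RUNS.times { via_format(bling) } }
  bm.report('+ single quotes:') { ALT_RUNS.times { via_plus_single(bling) } }
  bm.report('+ double quotes:') { ALT_RUNS.times { via_plus_double(bling) } }
end
```

As comment 5 points out, differences at this scale rarely matter in a real application, so readability is usually the better reason to pick one form over another.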
