Python: a Technique to Append Strings in a Loop

Google's Python style guide has this interesting advice:

Avoid using the + and += operators to accumulate a string within a loop. Since strings are immutable, this creates unnecessary temporary objects and results in quadratic rather than linear running time. Instead, add each substring to a list and ''.join the list after the loop terminates (or, write each substring to a cStringIO.StringIO buffer).

They gave two examples, one using string append and the other using list append. Here are their examples, slightly modified to be runnable code:

# python
# append string in a loop

employee_list = [["Mary", "Jane"], ["Jenny", "Doe"], ["Alice", "Johnson"]]

employee_table = '<table>'
for last_name, first_name in employee_list:
    employee_table += '<tr><td>%s, %s</td></tr>' % (last_name, first_name)
employee_table += '</table>'

print(employee_table)

# python
# append string in a loop, but using list instead

employee_list = [["Mary", "Jane"], ["Jenny", "Doe"], ["Alice", "Johnson"]]

items = ['<table>']
for last_name, first_name in employee_list:
    items.append('<tr><td>%s, %s</td></tr>' % (last_name, first_name))
items.append('</table>')  # close the table, as in the first example
employee_table = ''.join(items)

print(employee_table)
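
The style guide's advice also mentions a third option: writing each substring to a cStringIO.StringIO buffer. Here is a minimal sketch of that variant, assuming Python 3, where the in-memory text buffer is io.StringIO (cStringIO is Python 2 only). The variable name buffer is my own choice, not from the guide.

```python
# python 3
# append string in a loop, but using an in-memory text buffer
import io

employee_list = [["Mary", "Jane"], ["Jenny", "Doe"], ["Alice", "Johnson"]]

buffer = io.StringIO()  # assumption: Python 3 stand-in for cStringIO.StringIO
buffer.write('<table>')
for last_name, first_name in employee_list:
    buffer.write('<tr><td>%s, %s</td></tr>' % (last_name, first_name))
buffer.write('</table>')

# getvalue() returns everything written so far as a single string
employee_table = buffer.getvalue()

print(employee_table)
```

The shape is the same as the list version: collect the pieces first, then produce the final string once at the end.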

This is interesting in two respects:

  • ① It is a nice Python trick to know. It can make your code an order of magnitude faster when your list has a large number of items.
  • ② It shows that the combination of the Python language and its compiler is not smart enough. Clearly, using a list as an intermediate step to append strings faster is a hack. Direct string append is clear and is what the programmer wants.
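
The speed claim is easy to check on your own machine. Below is a rough sketch, assuming Python 3 and the standard timeit module. The function names build_with_concat and build_with_join, the row format, and the sizes are made up for illustration. The gap you see depends on the interpreter and version, so no expected timings are given.

```python
# python 3
# time string += against list append + join
import timeit

def build_with_concat(n):
    # accumulate directly into a string with +=
    s = ''
    for i in range(n):
        s += '<tr><td>%d</td></tr>' % i
    return s

def build_with_join(n):
    # collect the pieces in a list, join once at the end
    items = []
    for i in range(n):
        items.append('<tr><td>%d</td></tr>' % i)
    return ''.join(items)

n = 10000  # hypothetical size, raise it to see how each version scales

# both versions must build the same string
assert build_with_concat(n) == build_with_join(n)

concat_seconds = timeit.timeit(lambda: build_with_concat(n), number=20)
join_seconds = timeit.timeit(lambda: build_with_join(n), number=20)

print('+=  : %.4f s' % concat_seconds)
print('join: %.4f s' % join_seconds)
```

Try a few values of n. If the += version really is quadratic on your interpreter, its time should grow much faster than the join version as n increases.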
