Server Side Dynamic Elements

I am a big fan of Ruby and most of the things that come with it. The exception to this is the overhead generated by dynamic websites built upon it. Server side dynamic content generation is a big issue that I have run up against many, many times. In light of these issues, I had originally planned to do client side parsing of the Twitter, Flickr, and Delicious streams that I integrate into amdavidson.com.

This worked fine until a business trip to China, where I discovered that the Great Firewall would not outright block the client side scripts from reaching Twitter; it would simply let them time out, which led to awful page loading times.

I have been looking to switch amdavidson.com back to Ruby for some time, as I don't much like working with PHP. This gave me a good opportunity for a rewrite, and here's the server side parsing I worked out.

For Twitter I wanted to pull the JSON data stream and do some basic formatting and linkify the usernames and URLs in any of the tweets. I came up with the following code:

    <%   if twitter_enabled == true then

      require 'open-uri'
      require 'json/ext'

      twitter_url = "https://api.twitter.com/1/statuses/user_timeline.json?screen_name=amdavidson"
      response = open(twitter_url, 'User-agent' => 'amdavidson.com').read
      tweets = JSON.parse(response)

      def linkify(text)
        text = text.gsub(%r{(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?]))}, '<a href="\1">\1</a>')
        text = text.gsub(/@(\w+)/, '<a href="http://twitter.com/\1">@\1</a>')
        text
      end

      for t in tweets[0...5] do
    %>

      <div class="tweet tweet-<%= t["id"] %>">
        <a href="http://twitter.com/<%= t["user"]["screen_name"] %>"><img width="48" height="48" src="<%= t["user"]["profile_image_url"]%>" rel="<%= t["user"]["profile_image_url"] %>" /></a>
        <p class="text">
          <span class="username"><a href="http://twitter.com/<%= t["user"]["screen_name"] %>"><%= t["user"]["screen_name"] %></a>:</span>
          <%= linkify(t["text"]) %>
          <% if t["in_reply_to_screen_name"] then %>
            <span class="time"><%= DateTime.parse(t["created_at"]).strftime("%B %e at %l:%M") %> in reply to 
            <a href="http://twitter.com/<%= t["in_reply_to_screen_name"] %>/status/<%= t["in_reply_to_status_id"]%>"><%= t["in_reply_to_screen_name"] %></a></span>
          <% else %>
            <span class="time"><%= DateTime.parse(t["created_at"]).strftime("%B %e at %l:%M") %></span>
          <% end %>
        </p>
      </div>

    <% end
    end %>

Breaking that down a little: I pulled the stream with open-uri, parsed it with JSON.parse, and linkified it with John Gruber's excellent (and extremely long) URL-matching regex, plus a regex of my own design for linkifying the Twitter usernames mentioned in a tweet. The rest of the code is just formatting.
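Trimmed down to just the username pass, the idea looks like this (a sketch; `linkify_usernames` is a name I'm using here for illustration, not something from the code above):

```ruby
# Wrap each @mention in a link to the matching twitter.com profile.
# \w+ also matches the underscores that Twitter allows in usernames.
def linkify_usernames(text)
  text.gsub(/@(\w+)/, '<a href="http://twitter.com/\1">@\1</a>')
end

linkify_usernames("thanks @amdavidson!")
# => 'thanks <a href="http://twitter.com/amdavidson">@amdavidson</a>!'
```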

Here’s somewhat simpler code for pulling my 12 most recent Flickr images:

    <%   if flickr_enabled == true then

      require 'open-uri'
      require 'json/ext'

      flickr_url = "http://api.flickr.com/services/rest/?method=flickr.people.getPublicPhotos&format=json&nojsoncallback=1&api_key=#{ENV['flickr_key']}&user_id=#{ENV['flickr_id']}&per_page=12"
      response = open(flickr_url, 'User-agent' => 'amdavidson.com').read
      photos = JSON.parse(response)["photos"]["photo"]

      for p in photos[0...12] do
        square = "http://farm#{p["farm"]}.static.flickr.com/#{p["server"]}/#{p["id"]}_#{p["secret"]}_t.jpg"
        medium = "http://farm#{p["farm"]}.static.flickr.com/#{p["server"]}/#{p["id"]}_#{p["secret"]}.jpg"
        url = "http://flickr.com/photos/#{p["owner"]}/#{p["id"]}"
    %>

    <a class="preview" href="<%= url %>" rel="<%= medium %>">
      <img class="flickr-img" src="<%= square %>" alt="" />
    </a>

    <%   end
    end %>
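The two image-URL templates can also be factored into a small helper. This is just a sketch; `flickr_img_url` and the sample photo hash are mine, with the field names matching what the API response above already provides:

```ruby
# Build a Flickr static-image URL from the fields of an API photo record.
# The "_t" suffix selects the small thumbnail; an empty suffix selects
# the default (medium) size, mirroring the template strings above.
def flickr_img_url(photo, suffix = "")
  "http://farm#{photo["farm"]}.static.flickr.com/" \
    "#{photo["server"]}/#{photo["id"]}_#{photo["secret"]}#{suffix}.jpg"
end

photo = { "farm" => 5, "server" => "4148", "id" => "123", "secret" => "abc" }
flickr_img_url(photo, "_t")
# => "http://farm5.static.flickr.com/4148/123_abc_t.jpg"
```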

And my code for Delicious:

    <%

    if delicious_enabled == true then

      require 'open-uri'
      require 'json/ext'

      url = "http://feeds.delicious.com/v2/json/#{ENV["delicious_name"]}"
      response = open(url, 'User-agent' => 'amdavidson.com').read
      links = JSON.parse(response)

      for l in links[0...5] do      

    %>       
        <li>
          <h2><a href="<%= l["u"] %>" title="<%= l["d"]%>" target="_blank"><%= l["d"]%></a></h2>
          <p><%= l["n"] %></p>
        </li>


      <% end 
      end %>     
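For reference, the single-letter keys in the Delicious v2 JSON feed are terse: as used above, `u` is the bookmark URL, `d` its title, and `n` the extended note. A minimal sketch of rendering one item outside of ERB (the sample record is made up):

```ruby
# Hypothetical sample record shaped like one entry from the Delicious
# v2 JSON feed: "u" => URL, "d" => title, "n" => extended note.
link = {
  "u" => "http://example.com/",
  "d" => "An example page",
  "n" => "Notes saved with the bookmark"
}

# Render one list item the same way the ERB template above does.
html = %(<li><h2><a href="#{link["u"]}" title="#{link["d"]}" target="_blank">#{link["d"]}</a></h2><p>#{link["n"]}</p></li>)
```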

None of this code is especially light on the server. If you know of lighter methods, please let me know; I would love to reduce the load, but in the meantime I plan to mitigate it with the Varnish HTTP caching built into Heroku.
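Heroku's Varnish layer keys off standard Cache-Control headers, so one way to take advantage of it is a small Rack middleware. This is a hypothetical sketch (the class name and TTL are mine), not something the site necessarily runs:

```ruby
# Hypothetical Rack middleware: stamp successful responses with a
# Cache-Control header so an upstream Varnish cache (like Heroku's)
# can serve whole pages instead of re-rendering them on every hit.
class PageCache
  def initialize(app, max_age = 300)
    @app = app
    @max_age = max_age
  end

  def call(env)
    status, headers, body = @app.call(env)
    headers["Cache-Control"] = "public, max-age=#{@max_age}" if status == 200
    [status, headers, body]
  end
end

# In config.ru: use PageCache, 600   # cache pages for ten minutes
```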

WordPress XML to Toto

In my efforts to convert my blog at amdavidson.com to toto, I wrote a little script that turns the XML file WordPress can export into text files that toto understands.

It’s extremely hackish and will likely not generate 100% solid data; I had to hand-edit about 10 of my 140 posts. Do not use this on a production system, and check your posts beforehand.

If you’re still inclined, here’s the gist:

    #!/usr/bin/ruby

    require 'rubygems'
    require 'nokogiri'
    require 'date'

    puts 'parsing xml file'
    parsed = Nokogiri::XML(open("./wordpress.2010-10-06.xml"))

    puts 'pulling titles'
    title = parsed.xpath('//item/title').map { |n| n.text }

    puts 'pulling dates'
    date = parsed.xpath('//item/pubDate').map { |n| n.text }

    puts 'pulling content'
    content = parsed.xpath('//item/content:encoded').map { |n| n.text }

    puts 'pulling names'
    name = parsed.xpath('//item/wp:post_name').map { |n| n.text }

    puts 'muxing arrays'
    if title.length == date.length and date.length == content.length and content.length == name.length then
      posts = [title, date, content, name]
    else
      puts 'length broken!'
      exit 1
    end

    puts 'printing'
    i = 0
    while i < title.length do
      filename = "articles/" + DateTime.parse(posts[1][i]).strftime("%Y-%m") + "-" + posts[3][i] + ".txt"

      file = File.new(filename, "w")

      # puts "filename: " + filename
      file.puts "title: " + posts[0][i]
      file.puts "date: " + DateTime.parse(posts[1][i]).strftime("%Y/%m/%d")
      file.puts "author: Andrew"
      file.puts "\n"
      file.puts "#{posts[2][i]}"

      file.close
      i += 1
    end

Note that the filenames and directories are hard-coded; be sure to update them before running.
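To sanity-check the date handling in isolation, here's the filename scheme as a standalone helper (assuming a sample pubDate in the RFC-822 format WordPress exports use; note that toto itself typically expects a full YYYY-MM-DD filename prefix, so adjust to taste):

```ruby
require 'date'

# Rebuild the script's filename scheme as a standalone helper so the
# pubDate parsing can be checked against a sample WordPress date string.
def article_filename(pub_date, slug)
  "articles/" + DateTime.parse(pub_date).strftime("%Y-%m") + "-" + slug + ".txt"
end

article_filename("Wed, 06 Oct 2010 12:00:00 +0000", "hello-world")
# => "articles/2010-10-hello-world.txt"
```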