Server Side Dynamic Elements

posted on 2010-10-06 - amd.im/9PCQ

I am a big fan of Ruby and most of the things that come with it. The exception to this is the overhead generated by dynamic websites built upon it. Server side dynamic content generation is a big issue that I have run up against many, many times. In light of these issues, I had originally planned to do client side parsing of the Twitter, Flickr, and Delicious streams that I integrate into amdavidson.com.

This worked fine until I left the country on a business trip to China and realized that the Great Firewall of China does not cleanly block the client-side scripts from reaching Twitter; it just lets the requests time out. This led to awful page loading times.

I have been looking to switch amdavidson.com back to Ruby for some time, as I don't much like working with PHP. This gave me a good opportunity for a rewrite, so here is the server-side parsing that I worked out.

For Twitter I wanted to pull the JSON data stream and do some basic formatting and linkify the usernames and URLs in any of the tweets. I came up with the following code:

<%  if twitter_enabled == true then
    require 'open-uri'
    require 'json/ext'
    require 'date' # DateTime.parse is used for the timestamps below

    twitter_url = "https://api.twitter.com/1/statuses/user_timeline.json?screen_name=amdavidson"
    response = open(twitter_url, 'User-agent' => 'amdavidson.com').read
    tweets = JSON.parse(response)

    def linkify(text)
      text = text.gsub(/(?i)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?]))/, '<a href="\\1">\\1</a>')
      # \w+ also matches underscores and requires at least one character after the @
      text = text.gsub(/@(\w+)/, '<a href="http://twitter.com/\\1">@\\1</a>')
      text
    end


    for t in tweets[0...5] do
  %>

      <div class="tweet tweet-<%= t["id"] %>">
        <a href="http://twitter.com/<%= t["user"]["screen_name"] %>"><img width="48" height="48" src="<%= t["user"]["profile_image_url"]%>" rel="<%= t["user"]["profile_image_url"] %>" /></a>
        <p class="text">
          <span class="username"><a href="http://twitter.com/<%= t["user"]["screen_name"] %>"><%= t["user"]["screen_name"] %></a>:</span>
          <%= linkify(t["text"]) %>
          <% if t["in_reply_to_screen_name"] then %>
            <span class="time"><%= DateTime.parse(t["created_at"]).strftime("%B %e at %l:%M") %> in reply to
            <a href="http://twitter.com/<%= t["in_reply_to_screen_name"] %>/status/<%= t["in_reply_to_status_id"]%>"><%= t["in_reply_to_screen_name"] %></a></span>
          <% else %>
            <span class="time"><%= DateTime.parse(t["created_at"]).strftime("%B %e at %l:%M") %></span>
          <% end %>
        </p>
      </div>

  <%    end
  end %>

Breaking that down a little: I pulled the stream using open-uri, parsed it using JSON.parse, then linkified it using John Gruber's excellent (and extremely long) URL-matching regex, plus a regex of my own design for linkifying the Twitter usernames that are mentioned in a tweet. The rest of the code is just formatting.
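
As a rough, standalone sketch of the username step (not the exact code from the template), the mention regex can be tried out on its own. The name `linkify_mentions` is made up for illustration, and this version assumes usernames consist of letters, digits, and underscores:

```ruby
# Hypothetical standalone helper for experimenting with the mention regex.
# Assumption: usernames are made of letters, digits, and underscores (\w).
def linkify_mentions(text)
  # \1 in the replacement refers to the captured username (without the @).
  text.gsub(/@(\w+)/, '<a href="http://twitter.com/\1">@\1</a>')
end

puts linkify_mentions("thanks @amdavidson!")
```

One caveat with a pattern this simple is that it will also pick up the part after the @ in an email address, so it is probably only suitable for tweet text.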

Here's some slightly simpler code for my 12 most recent Flickr images:

<%  if flickr_enabled == true then

  require 'open-uri'
  require 'json/ext'

  flickr_url = "http://api.flickr.com/services/rest/?&method=flickr.people.getPublicPhotos&format=json&nojsoncallback=1&api_key=#{ENV['flickr_key']}&user_id=#{ENV['flickr_id']}&per_page=12"
  response = open(flickr_url, 'User-agent' => 'amdavidson.com').read
  photos = JSON.parse(response)["photos"]["photo"]

  for p in photos[0...12] do
    square = "http://farm#{p["farm"]}.static.flickr.com/#{p["server"]}/#{p["id"]}_#{p["secret"]}_t.jpg"
    medium = "http://farm#{p["farm"]}.static.flickr.com/#{p["server"]}/#{p["id"]}_#{p["secret"]}.jpg"
    url = "http://flickr.com/photos/#{p["owner"]}/#{p["id"]}"
%>

<a class="preview" href="<%= url %>" rel="<%= medium %>">
  <img class="flickr-img" src="<%= square %>" alt="" />
</a>

<%  end
end %>
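
The image URLs in that loop all follow one pattern, so the construction could be pulled into a small helper. This is only a sketch: `flickr_image_url` is a made-up name and the sample values are invented. As far as I know, the `_t` suffix asks Flickr for its thumbnail size and no suffix gives the medium size.

```ruby
# Hypothetical helper: builds a static Flickr image URL from one entry of
# the "photo" array, following the URL pattern used in the template above.
# `size` is the optional one-letter size suffix (for example "t").
def flickr_image_url(photo, size = nil)
  suffix = size ? "_#{size}" : ""
  "http://farm#{photo["farm"]}.static.flickr.com/#{photo["server"]}/#{photo["id"]}_#{photo["secret"]}#{suffix}.jpg"
end

# Made-up values, only to show the shape of the result.
sample = { "farm" => 5, "server" => "4001", "id" => "123", "secret" => "abc" }
puts flickr_image_url(sample, "t")
puts flickr_image_url(sample)
```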

And my code for Delicious:

<%

if delicious_enabled == true then

  require 'open-uri'
  require 'json/ext'

  url = "http://feeds.delicious.com/v2/json/#{ENV["delicious_name"]}"
  response = open(url, 'User-agent' => 'amdavidson.com').read
  links = JSON.parse(response)

  for l in links[0...5] do      

%>      
    <li>
      <h2><a href="<%= l["u"] %>" title="<%= l["d"]%>" target="_blank"><%= l["d"]%></a></h2>
      <p><%= l["n"] %></p>
    </li>


  <%    end 
  end %>

None of this code is very light on the server. If you have lighter methods, please let me know. I would love to lighten the load, but in the meantime I plan to mitigate it with the Varnish HTTP caching that is built into Heroku.
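
One possible way to lighten the load on the application side, sketched here under some assumptions, is to keep each feed response in memory for a few minutes so that most page views never touch the remote API. `TtlCache` and the five-minute lifetime are made up for illustration; this is not something the site currently does, and an in-memory cache like this only lives inside a single server process.

```ruby
# Hypothetical in-memory cache with a time-to-live (TTL).
class TtlCache
  def initialize(ttl_seconds)
    @ttl = ttl_seconds
    @store = {} # key => [expires_at, value]
  end

  # Returns the cached value for key while it is still fresh; otherwise
  # runs the block, stores its result, and returns it.
  # `now` can be passed in so that expiry can be tested without sleeping.
  def fetch(key, now = Time.now)
    expires_at, value = @store[key]
    return value if expires_at && now < expires_at

    value = yield
    @store[key] = [now + @ttl, value]
    value
  end
end

FEED_CACHE = TtlCache.new(300) # five minutes, an arbitrary choice

# Possible use in a template; the network call only runs on a cache miss:
#   response = FEED_CACHE.fetch(twitter_url) do
#     open(twitter_url, 'User-agent' => 'amdavidson.com').read
#   end
```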

Wordpress XML to toto

posted on 2010-10-06 - amd.im/Iphx

In my efforts to convert my blog at amdavidson.com to toto, I wrote a little script to convert the XML file that Wordpress can export into text files that toto understands.

It's extremely hackish and will likely not generate 100% solid data; I had to edit ~10 of my 140 posts. Do not use this on a production system, and check your posts beforehand.

If you're still inclined, here's the gist:

#!/usr/bin/ruby

require 'rubygems'
require 'nokogiri'
require 'date' # DateTime.parse is used when writing the articles

puts 'parsing xml file'
parsed = Nokogiri::XML(open("./wordpress.2010-10-06.xml"))

puts 'pulling titles'
title = parsed.xpath('//item/title').map { |n| n.text }

puts 'pulling dates'
date = parsed.xpath('//item/pubDate').map { |n| n.text }

puts 'pulling content'
content = parsed.xpath('//item/content:encoded').map { |n| n.text }

puts 'pulling name'
name = parsed.xpath('//item/wp:post_name').map { |n| n.text }


puts 'muxing arrays'
if title.length == date.length and date.length == content.length and content.length == name.length then
  posts = [title, date, content, name]
else
  # Without matching lengths the arrays cannot be lined up, so stop here.
  abort 'length broken!'
end

puts 'printing'
i = 0
while i < title.length do
  filename = "articles/" + DateTime.parse(posts[1][i]).strftime("%Y-%m") + "-" + posts[3][i] + ".txt"

  # The block form closes the file once the post has been written.
  File.open(filename, "w") do |file|
    # puts "filename: " + filename
    file.puts "title: " + posts[0][i]
    file.puts "date: " + DateTime.parse(posts[1][i]).strftime("%Y/%m/%d")
    file.puts "author: Andrew"
    file.puts "\n"
    file.puts "#{posts[2][i]}"
  end

  i += 1
end

Note that the filenames and directories are hard coded... be sure to update them before running.
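
To make the output format easier to see, here is a small sketch of the text the script writes for a single post: a few header lines, a blank line, then the body. The helper name `toto_article` and the sample post are made up for illustration.

```ruby
require 'date'

# Hypothetical helper: builds the text of one article file in the same
# layout the script writes (title, date, author, blank line, body).
def toto_article(title, pub_date, author, body)
  date = DateTime.parse(pub_date).strftime("%Y/%m/%d")
  "title: #{title}\ndate: #{date}\nauthor: #{author}\n\n#{body}\n"
end

# Invented sample data in the style of a Wordpress export pubDate.
puts toto_article("Hello", "Wed, 06 Oct 2010 12:00:00 +0000", "Andrew", "<p>First post.</p>")
```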

about

amdavidson.com is a simple blog run by Andrew Davidson, a manufacturing engineer with a blogging habit. He sometimes posts 140 character tidbits, shares photos, and saves links. You can also see posts dating back to 2005.
