Update on the .zshrc

I needed a little more information from my prompt, so I extended it a bit:

setopt autocd
setopt promptsubst

autoload -U colors && colors
autoload -Uz vcs_info && vcs_info

precmd() { vcs_info }
zstyle ':vcs_info:*' enable git hg bzr
zstyle ':vcs_info:*' check-for-changes true
zstyle ':vcs_info:*' get-unapplied true
zstyle ':vcs_info:*' unstagedstr "!"
zstyle ':vcs_info:*' formats "%F{5}[%s:%r|%b]%u"
zstyle ':vcs_info:*' actionformats "%F{5}[%s:%r|%b-%a]"

PROMPT="%F{2}%n@%M:%F{6}%d%F{11}» "

The result is a shell that lets me switch into a directory without typing cd. If the directory is version controlled, the prompt shows the versioning system, the repo name, the branch I'm on and whether there are unstaged changes (indicated by !).

DataMapper – Padrino – Warden

I took a break from coding, but was still looking for a useful set of tools for developing web applications. And I think I found a solution that fits my needs (small core but extensible, modular, reasonable features, usable documentation or active user groups at least).

The goal was to create a backend that would output json objects that could be processed in an independent frontend.

First step was to generate a project following the guide

padrino g project -d datamapper -a mysql -e none

I set the renderer (-e option) to none because I am using rabl for templating the JSON output.
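For reference, the kind of JSON a rabl template serializes can be sketched with the stdlib json module alone (the attribute names here are just hypothetical User fields, not anything the generator produces):

```ruby
require 'json'

# Hypothetical user record, standing in for a DataMapper model instance.
user = { username: "alice", email: "alice@example.com" }

# A rabl template along the lines of `object @user; attributes :username, :email`
# would produce roughly this structure:
payload = { "user" => { "username" => user[:username], "email" => user[:email] } }

puts JSON.generate(payload)
# => {"user":{"username":"alice","email":"alice@example.com"}}
```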
For authentication I chose Warden, so I added these to the Gemfile:

gem 'warden'
gem 'rabl'

Then I turned to app/app.rb and added

use Warden::Manager do |manager|
  manager.default_strategies :password
  manager.failure_app = MyApp  # the Rack app that handles failed logins
end

# Store only the user id in the session...
Warden::Manager.serialize_into_session do |user|
  user.id
end

# ...and restore the full user from it on each request.
Warden::Manager.serialize_from_session do |id|
  User.get(id)
end
For creating the model I used the Padrino generator again, since the user model is pretty straightforward (extend as needed):

padrino g model User username:string password:string email:string

After setting up config/database.rb you can create the database by using

padrino rake dm:create

To have some entries in the database to work with, I customized db/seeds.rb, which is covered in the Padrino blog tutorial.

Having done this, Warden should be in the system, but it is not working yet, since we have to define at least one strategy:

For now I like to use a common username/password login, which is already set as the default in manager.default_strategies. (You could add others if you wanted to; see the Warden wiki for details.)

Warden::Strategies.add(:password) do
  def valid?
    ... code goes here ...
  end

  def authenticate!
    ... code goes here ...
    ? success!(user) : fail!("Invalid")
  end
end

So in valid? you define the requirements that have to be met before the authentication process continues. In this case, checking params["username"] && params["password"] would make sense.
After creating a working authenticate! method, requests to a controller can be authenticated by adding env['warden'].authenticate! before the login controller code.
If authentication was successful you can add env['warden'].authenticated? to subsequent controllers and get the user (or whatever you decided to return on success) by calling env['warden'].user.
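Outside of Warden, the core of such a password strategy can be sketched in plain Ruby. This is only an illustration: the USERS hash stands in for the real User model, and the plain-text password comparison is a simplification (a real app would hash passwords):

```ruby
# Toy stand-in for the User model: username => password.
USERS = { "alice" => "secret" }

# valid? equivalent: only run the strategy when both fields are present.
def credentials_present?(params)
  !params["username"].to_s.empty? && !params["password"].to_s.empty?
end

# authenticate! equivalent: look the user up and compare passwords.
# Returns the username on success, nil on failure (Warden would call
# success!(user) or fail!("Invalid") instead).
def authenticate(params)
  return nil unless credentials_present?(params)
  stored = USERS[params["username"]]
  stored == params["password"] ? params["username"] : nil
end

puts authenticate("username" => "alice", "password" => "secret")
# => alice
```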

I tested this with curl, since the frontend is intended to be independent. I put the login process in a POST route, so after starting Padrino

curl -d "username=...&password=..." localhost:3000/login

gave me the defined output of a successful login.

One pitfall when testing a subsequent controller with curl is that in contrast to a browser you have to add the cookie information. In order to get it you could call

curl -vvv -d "username=...&password=..." localhost:3000/login

and extract the rack.session=… value from the response headers; then call the controller with

curl --cookie "rack.session=... ...;" localhost:3000/subsequent_controller
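Extracting the cookie by hand gets tedious; the parsing step can be sketched in a few lines of plain Ruby (the header value here is a made-up example, not a real session):

```ruby
# Example Set-Cookie header as curl -vvv would show it (value is made up).
set_cookie = "rack.session=BAh7B0kiD3Nlc3Npb25faWQ; path=/; HttpOnly"

# The part to send back is everything before the first attribute (";").
session_cookie = set_cookie.split(";").first.strip

puts session_cookie
# => rack.session=BAh7B0kiD3Nlc3Npb25faWQ
# The follow-up request would then be:
#   curl --cookie "<that value>" localhost:3000/subsequent_controller
```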

Sinatra, Datamapper and CouchDB – a trainwreck

Just for purely educational reasons (why not) I decided to combine these three on Heroku. The spoiler is already in the title: it didn't work at all. DataMapper is quite cool and works great with the shared Postgres database, but don't even bother trying to get the mongo or couch adapter working. DataMapper seems to be getting massively reworked; maybe wait for that to be done and then try again. With Postgres or MySQL it looks nice.

Sinatra, Mustache and Heroku – change of plans

Okay, I really like Mustache: it's clean, it has a reasonably small set of options (conditionals, lists etc.) and it has many implementations, one of them being JavaScript (mustache.js).
So I thought: what about using Sinatra to generate JSON responses and serve templates, while the client is responsible for the presentation? The benefit would be a clean RESTful application and a deferred rendering process, allowing me to play with different options for caching JSON objects and to explore AJAX features.
And above all a fun project.
So, the basic idea:

  1. the root path "/" reads an index.html file and hands it out
  2. navigation elements trigger two asynchronous AJAX requests (one for the template, one for the view)
  3. when both are finished, mustache.js kicks in and renders the page (or parts of it)
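Step 3 is where mustache.js combines the two responses. The substitution it performs can be illustrated in a few lines of plain Ruby; this is a toy version handling only simple {{name}} variables, not sections, partials or escaping:

```ruby
# Toy Mustache-style renderer: replaces {{key}} with the matching
# value from the view hash. Real Mustache does much more; this only
# shows the basic idea of logic-less substitution.
def render(template, view)
  template.gsub(/\{\{(\w+)\}\}/) { view[Regexp.last_match(1)].to_s }
end

template = "Hello {{name}}, you have {{count}} new messages."
view = { "name" => "Alice", "count" => 3 }

puts render(template, view)
# => Hello Alice, you have 3 new messages.
```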


get "/" do
  File.open("index.html").read
end


var template = null;
var view = null;
var templateFinished = false;
var viewFinished = false;

function getTemplate(url){
  [...] xhRequest [...]
  if(request.readyState == 4){
    template = request.responseText;
    templateFinished = true;
    process();
  }
}

function getView(url){
  [...] xhRequest [...]
  if(request.readyState == 4){
    // the view comes back as JSON; mustache.js expects an object
    view = JSON.parse(request.responseText);
    viewFinished = true;
    process();
  }
}

function process(){
  if(templateFinished && viewFinished){
    document.getElementById("content").innerHTML = Mustache.to_html(template, view);
  }
}

So, in my case the rendered output replaces the content of the div with the id "content". To elaborate on that, one could pass the id of the element to be replaced as an option to the function and really play with this.

Sinatra, Mustache and Heroku

I’ve been playing around with Sinatra, in order to see how it would do in the wild I decided to use Heroku as a comfortable hosting solution.


Begin by creating your Sinatra app. Add a Gemfile, since Heroku runs bundle install.

Commit to git.

To deploy to Heroku, install the heroku gem and create an instance for your app by calling "heroku create". By default this is enough to get going; if you want the application to run under a specific subdomain (e.g. http://codebrigade.heroku.com) you can do so by adding the desired name to the create call.

If you push your app with "git push heroku master" it should boot and be running. If it doesn't, look into the log files by typing "heroku logs" on your command line.


Mustache is a logic-less templating language derived from ctemplate. Adding it to Sinatra is quite simple: install the mustache gem and add to your app.rb

require 'mustache/sinatra'
set :public => './public/'
register Mustache::Sinatra
require_relative 'views/layout'
set :mustache, {:views => './views/', :templates => './templates/'}

"public" holds static files like CSS. It works without explicitly setting it on my local machine, but won't on Heroku; adding this fixes it.

It is important to include and provide the layout file since the engine looks for it.
The layout file includes the frame for all the views. For example:

<!DOCTYPE html>
<link rel="stylesheet" type="text/css" href="layout.css">
<div id="header">{{> _header}}</div>
<div id="main">{{{yield}}}</div>
<div id="footer">{{> _footer}}</div>

My layout shows a custom title for each view; header and footer are partials.
You will need a view and a template for every page. All views extend the layout if they use it:

class Codebrigade
  module Views
    class Index < Layout
      def title
        "Hello there"
      end
    end
  end
end

Next steps are CouchDB and Sinatra integration…

Elastic Mapreduce streaming job with elasticity

Requires elasticity (https://github.com/rslifka/elasticity) and a registration with Amazon AWS, but works like a charm 🙂

This mainly does the following: make a new bucket for every day the script runs, run the map-reduce job, fetch the result.

@new_bucket = "run-" + Time.now.strftime("%Y%m%d")
@new_job = "job-" + Time.now.strftime("%Y%m%d")

# Create a new result bucket (connection is the fog storage
# connection from the fog post below)
newdir = connection.directories.create(
  :key    => @new_bucket,
  :public => false
)
puts "Results are thrown into bucket " + newdir.key

emr = Elasticity::EMR.new(@key_id, @secret_key)
jobflow_id = emr.run_job_flow({
  :name => @new_job,
  :instances => {
    :ec2_key_name => "test",
    :hadoop_version => "0.20",
    :instance_count => 2,
    :master_instance_type => "m1.small",
    :placement => {
      :availability_zone => "us-east-1a"
    },
    :slave_instance_type => "m1.small"
  },
  :steps => [{
    :action_on_failure => "TERMINATE_JOB_FLOW",
    :hadoop_jar_step => {
      :args => [
        "-input",   "s3n://input/",
        "-output",  "s3n://" + @new_bucket + "/",
        "-mapper",  "s3n://mapper/mapper1.rb",
        "-reducer", "s3n://reducer/reducer1.rb"
      ],
      :jar => "/home/hadoop/contrib/streaming/hadoop-streaming.jar"
    },
    :name => "mr1"
  }]
})

puts jobflow_id + " started"

jobflows = emr.describe_jobflows
state = jobflows[0].state
puts jobflows[0].name + " " + state + "\n\n"

if state == 'COMPLETED'
  result = connection.directories.get(@new_bucket).files.get("part-00000").body
  result.each_line do |line|
    puts line
  end
end


The last if-statement makes no sense as long as we don't add a routine to check for changes in the state of the job…

while state != 'COMPLETED' && state != 'FAILED'
  jobflows = emr.describe_jobflows
  state = jobflows[0].state
  puts jobflows[0].name + " " + state + " (" + Time.now.strftime("%H:%M:%S") + ")\n"
  sleep 300
end

Amazon S3 and ruby with fog

This little script lists the keys of all directories (buckets in S3) and files (objects). Just to get a grip on fog (https://github.com/geemus/fog)…

connection = Fog::Storage.new(
  :provider                 => 'AWS',
  :aws_secret_access_key    => "",
  :aws_access_key_id        => ""
)

dirs = connection.directories
dirs.each do |dir|
  puts "+ " + dir.key
  files = dirs.get(dir.key).files
  files.each do |file|
    puts "  - " + file.key
  end
end

ZSH and versioning systems

Since oh-my-zsh didn't work properly with Ruby I had to remove it, and all the nifty stuff went with it… So I was looking for a quick fix:

First I wanted to have the directory visible, so I wrote this for the left side:
PROMPT="%n@%m:%F{6}%~ %F{11}%# "

Then I loaded vcs_info to get the information from git and put it on the right; that's all I need:
autoload -Uz vcs_info
precmd() { vcs_info }
RPROMPT='%F{4}${vcs_info_msg_0_} ${vcs_info_msg_1_}'
setopt AUTO_CD

It’s enough for me to work.

Hadoop on Cloudera VM

To start the daemons do
$ for service in /etc/init.d/hadoop-0.20-*; do sudo $service start; done

to see if daemons are up

Installation is in

I then uploaded a file into HDFS
hadoop fs -mkdir input
hadoop dfs -put /.../.../test.file input

and ran a local Ruby script as mapper on that input
hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-CDH3B4.jar \
-input input/* \
-output output \
-mapper /home/cloudera/mapper1.rb \
-reducer /home/cloudera/reducer1.rb \
-file /home/cloudera/mapper1.rb \
-file /home/cloudera/reducer1.rb

Afterwards I looked at the result
hadoop fs -cat /user/cloudera/output/part-00000
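The mapper1.rb/reducer1.rb scripts themselves aren't shown above; a minimal pair for Hadoop streaming might look like this sketch (the word-count task is just an assumption, and the per-line logic is kept in functions so it can be shown directly; as actual streaming scripts, each would read STDIN line by line and puts its output):

```ruby
# Mapper: emit "word\t1" for every word on the line.
def map_line(line)
  line.split.map { |word| "#{word.downcase}\t1" }
end

# Reducer: streaming hands the mapper output to the reducer sorted by
# key, so summing per word gives the counts.
def reduce_lines(lines)
  counts = Hash.new(0)
  lines.each do |l|
    word, n = l.split("\t")
    counts[word] += n.to_i
  end
  counts
end

pairs = map_line("hello hadoop hello streaming")
puts reduce_lines(pairs).inspect
# => {"hello"=>2, "hadoop"=>1, "streaming"=>1}
```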