Bundling in Python

There is a nice way to deal with requirements in Python, which is

pip freeze    

with

pip freeze > requirements.txt     

you can easily store your dependencies in a simple text file and with

pip install -r requirements.txt

get pip to install requirements as specified in the file in a different environment.
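To check whether a second environment really matches the file, the two freeze outputs can be compared. This is only a minimal sketch: the helper names and the package pins are made up for illustration, and it only understands simple name==version lines.

```python
def parse_frozen(text):
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a simple pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def diff_pins(expected, actual):
    """Return packages that are missing or pinned to a different version."""
    return {
        name: (version, actual.get(name))
        for name, version in expected.items()
        if actual.get(name) != version
    }


# Hypothetical file contents, for illustration only.
requirements = "Django==1.4.2\npyelasticsearch==0.3\n"
installed = "Django==1.4.2\n"
mismatches = diff_pins(parse_frozen(requirements), parse_frozen(installed))
print(mismatches)
```

An empty result would mean every pinned package is installed in the expected version.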


Indexing with Elasticsearch and Django

So, every decent webapp needs a search feature? Okay, here we go.

It all starts with downloading elasticsearch.
After extracting the archive, start it with

bin/elasticsearch -f

The -f parameter keeps elasticsearch in the foreground, so you get a little output on the console, especially the port and host. By default this would be localhost:9200.

So let’s get to the Django bit.
First thing to check is whether the model object you want to index for search has one or more foreign key fields.
If so, you might not want to index the ids (it is very unlikely that some user would search for an id).
So what to do? Since data is passed to elasticsearch as a JSON object, we will use Django's built-in serializer to convert our model object into a JSON object and then pass that on. The serializer provides an option to use something called natural keys, which is enabled by adding

use_natural_keys = True

to the serializers.serialize('json', modelObject) call as a third argument. (Django 1.7 and later name this option use_natural_foreign_keys.) To use this successfully, the model that the foreign key field references has to be extended with a natural_key method.

As an example, let’s say we have two model classes. One is Product, which has a foreign key field manufacturer that references a model of that name:

Manufacturer
    name
    address
    website...

Product
    prod_id
    name
    manufacturer <- there it is, a foreign key to the above
    price...

So if we want to index products for search, we may want the manufacturer field to be a name (or a name and address combination etc.). Therefore we define a method “natural_key” in the Manufacturer class, e.g.:

def natural_key(self):
  return (self.name)

Thus when serializing a Product the “unsearchable” ID is converted to the manufacturer’s name.
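To make the effect visible without a Django project, here is a plain-Python stand-in for what the serializer does with the foreign key. It is not Django's implementation; the classes, the "shop" app label and the sample values are assumptions for illustration only.

```python
import json


class Manufacturer:
    def __init__(self, pk, name):
        self.pk = pk
        self.name = name

    def natural_key(self):
        return self.name


class Product:
    def __init__(self, pk, name, manufacturer):
        self.pk = pk
        self.name = name
        self.manufacturer = manufacturer


def serialize_product(product, use_natural_keys=False):
    """Mimic the shape of a serialized record: pk, model and fields."""
    manufacturer = product.manufacturer
    # With natural keys the foreign key is replaced by natural_key(),
    # otherwise the bare primary key is emitted.
    fk_value = manufacturer.natural_key() if use_natural_keys else manufacturer.pk
    return json.dumps([{
        "pk": product.pk,
        "model": "shop.product",  # hypothetical app label
        "fields": {"name": product.name, "manufacturer": fk_value},
    }])


acme = Manufacturer(3, "Acme")
widget = Product(7, "Widget", acme)
plain = json.loads(serialize_product(widget))
natural = json.loads(serialize_product(widget, use_natural_keys=True))
print(plain[0]["fields"]["manufacturer"], natural[0]["fields"]["manufacturer"])
```

The first record carries the id 3, the second the searchable name.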

The general idea now is to pass the object as a serialized string to a function that then does the indexing on its own. Doing something like this:

...
new_product = Product(...)
new_product.save()
myIndexModule.add_to_index(serializers.serialize('json', [new_product], use_natural_keys=True))

So, now to the indexing itself. I use pyelasticsearch for no special reason except that its documentation seemed decent.
The indexer is located in a module since I wanted it to be separated from the rest of the application and it is pretty short.

from pyelasticsearch import ElasticSearch
import json

ES = ElasticSearch('http://localhost:9200')

def add_to_index(string):
    deserialized = json.loads(string)
    for element in deserialized:
        element_id=element["pk"]
        # strip the module prefix from e.g. "app.product" (this is just cosmetics)
        name = element["model"].split('.')[1]
        index = name + "-index"
        element_type = name
        data = element["fields"]
        ES.index(index, element_type, data, id=element_id)

That’s it. One could certainly do more sophisticated stuff (like plural for the index and singular for the element type and then do something clever about irregular plurals…) but it does the job.
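To see what the indexer would send without a running elasticsearch, the client can be swapped for a stand-in that only records the calls. A rough sketch: RecordingClient and the sample record are hypothetical, and the client is passed in as an argument here instead of living in a module-level constant.

```python
import json


class RecordingClient:
    """Hypothetical stand-in that records index() calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def index(self, index, doc_type, doc, id=None):
        self.calls.append((index, doc_type, doc, id))


def add_to_index(client, string):
    for element in json.loads(string):
        # "shop.product" -> "product"
        name = element["model"].split(".")[1]
        client.index(name + "-index", name, element["fields"], id=element["pk"])


# A made-up serialized product, shaped like the serializer output.
serialized = json.dumps([{
    "pk": 7,
    "model": "shop.product",
    "fields": {"name": "Widget", "manufacturer": "Acme"},
}])

client = RecordingClient()
add_to_index(client, serialized)
print(client.calls)
```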

Now let’s use ElasticSearch as a datastore for an application.

But why should we do this? Let’s assume we have an application with a member and a non-member area. Members can do stuff on a database and non-members cannot. To give your members a snappy experience, you want to keep the database load from users who add nothing to your service to a minimum. So instead of letting them clog the connection with database requests, you let ElasticSearch handle them.
And anyway, it’s just for fun 🙂

So the idea is to make an ajax call to elasticsearch and show a list of the last ten products added to the index to the user. In one of your views for non-members you put a javascript function like this:

$.getJSON('http://localhost:9200/product-index/_search?sort=added:desc&from=0&size=10', function(response){....})

and in the function you can now start to play around with the fields like

$.each(response.hits.hits, function(i, item){
     item._source.name
     ...
});

and present them to the users.
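The same request and response handling can be sketched in Python, which is handy for poking at the index from a shell. This assumes the URI search form sort=field:desc and an "added" field on the documents; the sample response is made up and only mirrors the hits.hits[]._source nesting used above.

```python
from urllib.parse import urlencode


def build_search_url(base_url, index, sort_field, size=10):
    """Build a URI search request for the newest documents in an index."""
    query = urlencode({"sort": f"{sort_field}:desc", "from": 0, "size": size})
    return f"{base_url}/{index}/_search?{query}"


def product_names(response):
    """Pull the name field out of every hit."""
    return [hit["_source"]["name"] for hit in response["hits"]["hits"]]


# Hypothetical response body, already parsed from JSON.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"name": "Widget"}},
            {"_source": {"name": "Gadget"}},
        ]
    }
}

url = build_search_url("http://localhost:9200", "product-index", "added")
names = product_names(sample_response)
print(url)
print(names)
```

Note that urlencode escapes the colon in the sort value as %3A.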

Custom authentication in Django

After fiddling with Django’s auth app for a while I decided I’d rather have my own (I know, why should one do this? Answer: to learn).
It consists of several steps:

  1. registration
  2. activation
  3. adding a password
  4. login

First I created an app for user management

$ python manage.py startapp user_management

This gave me the structure to work with.
Then I created the user model:

from django.db import models
import bcrypt

class User(models.Model):
    email = models.CharField(max_length=100, unique=True)
    firstname = models.CharField(max_length=30)
    lastname = models.CharField(max_length=30)
    password = models.CharField(max_length=128)
    last_login = models.DateTimeField(auto_now=True)
    registered_at = models.DateTimeField(auto_now_add=True)
    core_member = models.BooleanField()
    # set to True by the activation step described below
    is_active = models.BooleanField(default=False)
    activation_key = models.CharField(max_length=50, null=True)

The idea here was to use the email address as the username and to keep it unique. I don’t consider usernames a good choice for logins but rather a feature for profiles, but that depends on one’s taste, I think.

The registration view is pretty straightforward. I create a RegistrationForm object with fields for email, first and last name.
The activation_key is simply a string of randomly chosen ASCII letters and digits.
Activation itself is just creating a link, sending it and comparing the random part of the link with the stored string. If they match, is_active is set to True and the user can set a password. For passwords I normally store bcrypt hashes in the database (NEVER store plaintext passwords in a database!). This is quite simple and can be done by following this description.
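Generating and checking such a key could look roughly like this. It is a sketch, not the code from my app: the function names are made up, and the length of 40 is simply an assumption that fits into the max_length=50 field.

```python
import hmac
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_activation_key(length=40):
    """Return a random string of ASCII letters and digits."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def keys_match(stored, supplied):
    """Compare the stored key with the one taken from the activation link."""
    if stored is None:
        return False
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(stored.encode("utf-8"), supplied.encode("utf-8"))


key = generate_activation_key()
print(len(key), keys_match(key, key), keys_match(key, "something-else"))
```

The secrets module draws from the operating system's random source, which is the safer choice for anything that ends up in a link.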

The function for setting the password goes into the model. For this to work I use a classmethod. As the name suggests, this is a method bound to the class rather than to an instance of it. That lets it fetch objects through “cls.objects.get()”; cls plays the role in a classmethod that self plays in an instance method.

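@classmethod
def set_password(cls, user_id, plain_pass):
    # bcrypt works on bytes, so encode the plaintext before hashing
    secret = bcrypt.hashpw(plain_pass.encode('utf-8'), bcrypt.gensalt())
    user = cls.objects.get(pk=user_id)
    # store the hash as text in the CharField
    user.password = secret.decode('utf-8')
    user.save()
    return True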
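If classmethods are new to you, the difference is easier to see without Django in the way. In this toy sketch the Member class and its in-memory registry are invented for illustration; the registry stands in for cls.objects.

```python
class Member:
    _registry = {}  # stand-in for the database table

    def __init__(self, member_id, email):
        self.member_id = member_id
        self.email = email
        Member._registry[member_id] = self

    def describe(self):
        # instance method: works on one object via self
        return f"{self.member_id}: {self.email}"

    @classmethod
    def set_email(cls, member_id, email):
        # classmethod: called on the class, looks the object up via cls
        member = cls._registry[member_id]
        member.email = email
        return True


Member(1, "old@example.com")
Member.set_email(1, "new@example.com")  # no instance needed at the call site
print(Member._registry[1].describe())
```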

The login process itself is done via another classmethod which I named authenticate:

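@classmethod
def authenticate(cls, email, password, request):
    try:
        user = cls.objects.get(email__exact=email)
    except cls.DoesNotExist:
        # unknown email address: treat it like a failed login
        return None
    stored = user.password.encode('utf-8')
    # hashing with the stored hash as salt reproduces it if the password is right
    if bcrypt.hashpw(password.encode('utf-8'), stored) == stored:
        request.session['user_id'] = user.id
        user.save() # this is to get last_login updated
        return user
    return None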

(In order for this to work you have to enable the session middleware and the session app in settings.py.)

So, a quick rundown.

Since I use the email address as a unique identifier for the login, the function expects three things: the email address, which is used to find the person to authenticate; the plaintext password (e.g. as entered in an input field); and the request object, to make use of a session. (I use database session handling for development but there are alternatives described in the Django docs.)

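The bcrypt function is given the plaintext password and the stored hash (which carries the salt). If the password is right it returns the stored hash again, so the comparison is True on a match and False if not.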
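The "hash again with the stored salt and compare" idea can be tried out with the standard library alone. Be aware that this sketch swaps in PBKDF2 from hashlib as a stand-in, it is not bcrypt; the storage format and the iteration count are assumptions made for the example.

```python
import hashlib
import hmac
import os


def hash_password(plain, salt=None, iterations=100_000):
    """Derive a hash and pack iterations, salt and digest into one string."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", plain.encode("utf-8"), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def check_password(plain, stored):
    """Re-derive the hash with the stored salt and compare the results."""
    iterations, salt_hex, _ = stored.split("$")
    candidate = hash_password(plain, bytes.fromhex(salt_hex), int(iterations))
    return hmac.compare_digest(candidate, stored)


stored = hash_password("hunter2")
print(check_password("hunter2", stored), check_password("wrong", stored))
```

As with bcrypt, the salt travels inside the stored value, so nothing but that one string has to be kept in the database.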

After having checked that the user has given the right credentials, I store the user_id in the session, which allows me to get the full set of user information should I need it.

I save the user to trigger the auto_now option of the user model, which updates the last_login field to the current time.

Now with

User.authenticate(email, password, request) 

the user is logged in.

Datamapper – Padrino – warden

I took a break from coding, but was still looking for a useful set of tools for developing web applications. And I think I found a solution that fits my needs (small core but extensible, modular, reasonable features, usable documentation or active user groups at least).

The goal was to create a backend that would output json objects that could be processed in an independent frontend.

First step was to generate a project following the guide

padrino g project -d datamapper -a mysql -e none

I set renderer (-e option) to none because I am using rabl for templating the json output.
For authentication I chose warden. So I added these to the Gemfile

gem 'warden'
gem 'rabl'

Then turned to the app/app.rb and added

use Warden::Manager do |manager|
  manager.default_strategies :password
  manager.failure_app = myApp
end

Warden::Manager.serialize_into_session do |user|
  user.id
end

Warden::Manager.serialize_from_session do |id|
  User.get(id)
end

For creating the model I used the padrino generator again, since the user model is pretty straightforward (extend as needed)

padrino g model User username:string password:string email:string

After setting up config/database.rb you can create the database by using

padrino rake dm:create

To have some entries in the database to work with, I customized db/seeds.rb, which is mentioned in the padrino blog tutorial.

Having done this, warden should be in the system but is not working yet, since we have to define at least one strategy.

For now I like to use a common username/password login, which is already set as the default in manager.default_strategies. (You could add others if you wanted to; look at the warden wiki for details.)

Warden::Strategies.add(:password) do
  def valid?
    ... code goes here ...
  end
  def authenticate!
    ... code goes here ...
    ? success!(user) : fail!("Invalid")
  end
end

So in valid? you would define the requirements that have to be met to go on with the authentication process. In this case checking params[“username”] && params[“password”] would make sense.
After creating a usable authenticate! method, a request to a controller can be authenticated by adding env[‘warden’].authenticate! before the login controller code.
If authentication was successful you can add env[‘warden’].authenticated? to the following controllers and get the user (or whatever you decided to return for success) by calling env[‘warden’].user.

I tested this with curl, since the frontend is intended to be independent. I put the login process in a post route, so after starting padrino

curl -d "username=...&password=..." localhost:3000/login

gave me the defined output of a successful login.

One pitfall when testing a subsequent controller with curl is that, in contrast to a browser, you have to add the cookie information yourself. To get it you can call

curl -vvv -d "username=...&password=..." localhost:3000/login

then extract the rack.session=… …; part from the output and call the controller with

curl --cookie "rack.session=... ...;" localhost:3000/subsequent_controller
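If you script these tests instead of copying the value by hand, the session cookie can be pulled out of the Set-Cookie header with the standard library. A small sketch; the header value below is invented and much shorter than a real rack.session.

```python
from http.cookies import SimpleCookie


def extract_session_cookie(set_cookie_header, name="rack.session"):
    """Return 'name=value' for the session cookie, or None if it is missing."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    morsel = cookie.get(name)
    return None if morsel is None else f"{name}={morsel.value}"


# Hypothetical header as printed by a verbose curl call.
header = "rack.session=abc123--def456; path=/; HttpOnly"
session = extract_session_cookie(header)
print(session)
```

The returned string is what goes into the --cookie argument of the next request.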

Sinatra, Mustache and Heroku

I’ve been playing around with Sinatra. To see how it would do in the wild, I decided to use Heroku as a comfortable hosting solution.

Heroku

Begin creating your Sinatra app. Add a Gemfile, since Heroku runs a bundle install.

Commit to git.

To deploy to Heroku, install the heroku gem and create an instance for your app by calling “heroku create”. By default, this is enough to get going. If you want the application to run under a specific subdomain (e.g. http://codebrigade.heroku.com), you can do so by adding the desired name to the create command.

If you push your app to "heroku master" it should boot and be running. If it doesn’t, look into the log files by typing "heroku logs" on your command line.

Mustache

Mustache is a logic-less templating language derived from ctemplate. Adding it to Sinatra is quite simple: install the mustache gem and add this to your app.rb

require 'mustache/sinatra'
set :public => './public/'
register Mustache::Sinatra
require_relative 'views/layout'
set :mustache, {:views => './views/', :templates => './templates/'}

"public" holds static files like CSS. It works without explicitly setting it on my local machine but not on Heroku; adding this line fixes it.

It is important to include and provide the layout file since the engine looks for it.
The layout file includes the frame for all the views. For example:


<!DOCTYPE html>
<html>
<head>
<title>{{title}}</title>
<link rel="stylesheet" type="text/css" href="layout.css">
</head>
<body>
<div id="header">{{> _header}}</div>
<div id="main">{{{yield}}}</div>
<div id="footer">{{> _footer}}</div>
</body>
</html>

My layout shows a custom title for each view; header and footer are partials.
You will need a view and a layout for every page. All views extend the layout if they use it:

class Codebrigade
  module Views
    class Index < Layout
       def title
           "Hello there"
       end
    end
  end
end
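The way a view class feeds the template can be imitated in a few lines of Python. This is a toy, not Mustache: it only handles plain {{name}} tags, has no partials or sections, and the class names merely echo the Ruby example above.

```python
import html
import re


class Layout:
    template = "<title>{{title}}</title>"

    def render(self):
        def lookup(match):
            # call the view method named inside the tag and escape its result
            value = getattr(self, match.group(1))()
            return html.escape(str(value))

        return re.sub(r"\{\{(\w+)\}\}", lookup, self.template)


class Index(Layout):
    def title(self):
        return "Hello there"


page = Index().render()
print(page)
```

The template itself contains no logic; everything it needs is a method on the view.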

Next steps are CouchDB and Sinatra integration…