flask, alembic and blueprints

For some time I could easily do without autogenerated migrations. Now I wanted them, and I wanted to use Flask and not Django. I started, very naively, by installing and importing flask-alembic and then flask-migrate, but both seemed (at that time) to support patterns that I didn’t want (e.g. a manager, a single models.py) or couldn’t understand. At some points I didn’t get migrations to work at all, or they were empty, or blueprints wouldn’t work, or…

What I wanted was
* a folder “models” containing all models with a file for each model
* plain alembic
* a single start file with my setup and configs

After installing alembic via pip, migrations didn’t work. Importing the models in env.py didn’t solve it, and neither did fiddling with target_metadata or several other solutions outlined on StackOverflow. So here is what worked for me:

My start/setup file (start.py in my case) has a function:


from flask import Flask
from models.shared_model import db

def start_app():
    app = Flask(__name__)
    # config stuff
    db.init_app(app)
    return app

and


if __name__ == "__main__":
    app = start_app()
    app.run()

The app is started by just running python start.py without need of a manager.

I created a shared_model module that all models import:


from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy()

This makes it easier since all the models just import this shared model and I can also put some other stuff in here that I want to have access to in my models.

The last thing to do is editing the alembic env.py:
1. Import the start_app function and start the app
2. Import the db from the shared model and initialize it
3. Configure and set target_metadata


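from start import start_app
app = start_app()
from models.shared_model import db
# start_app() already calls db.init_app(app); newer Flask-SQLAlchemy versions
# raise an error on a second call, in which case this line can be dropped
db.init_app(app)
config.set_main_option("sqlalchemy.url", app.config["SQLALCHEMY_DATABASE_URI"])
target_metadata = db.metadata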

That’s about it: models go into the models folder and can be used in blueprints, alembic revision --autogenerate produces more than “pass”, and the app starts as usual.

A sip from Flask

Lately I came to find Django a bit top heavy for one of my projects, so I chose Flask as a lighter and smaller alternative.
After fiddling with the tutorials for a bit I wanted a setup with several modules. Surprisingly, that wasn’t easy to do, as the snippets and examples showed several options and configurations and… So, this is what worked for me. It may not be the true gospel, but I wanted modules to be bound to certain URLs, like mounted apps in Padrino.

This is what I came up with:

    + Project
      -- start.py
      + module1
         -- __init__.py
         -- app.py
      + module2
         -- __init__.py
         -- app.py

So module1 and 2 are two functional units which should answer to specific prefixes (localhost:5000/module1 and localhost:5000/module2) and start.py is the file to run the whole show.

I used Flask’s blueprints to get it all under one roof.

First let’s get the modules to behave like modules. In module1/app.py I added:

    from flask import Blueprint
    app1 = Blueprint('app1', __name__)
    ...
    @app1.route
    ...

For module2 app.py looks similar except that app1 is changed to app2.

So, now we have the blueprints, which the project does not know about yet. In fact we don’t have any app so far. All the nuts and bolts go into start.py:

    from flask import Flask
    from module1.app import app1
    from module2.app import app2

    project = Flask(__name__)
    # the prefixes match the URLs mentioned above
    project.register_blueprint(app1, url_prefix='/module1')
    project.register_blueprint(app2, url_prefix='/module2')

    if __name__ == '__main__':
        project.run()

This is the beauty of blueprints (imho): import the blueprint, register it and put it on a dedicated path.
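
To try the idea without creating the folder layout, here is a single-file sketch. The view functions and their return values are hypothetical, and create_project is just a name chosen for this example; in the real layout the two blueprints would sit in module1/app.py and module2/app.py.

```python
from flask import Blueprint, Flask

# In the layout above these two blueprints would live in
# module1/app.py and module2/app.py.
app1 = Blueprint("app1", __name__)
app2 = Blueprint("app2", __name__)


@app1.route("/")
def index1():  # hypothetical view
    return "hello from module1"


@app2.route("/")
def index2():  # hypothetical view
    return "hello from module2"


def create_project():
    project = Flask(__name__)
    # Each blueprint is mounted under its own URL prefix.
    project.register_blueprint(app1, url_prefix="/module1")
    project.register_blueprint(app2, url_prefix="/module2")
    return project


if __name__ == "__main__":
    # The test client issues requests without starting a server.
    client = create_project().test_client()
    print(client.get("/module1/").data)
    print(client.get("/module2/").data)
```

Flask’s test client is used here only so the routing can be checked without opening a browser.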

Done. Two modules in one Flask application.

Indexing with Elasticsearch and Django

So, every decent webapp needs a search feature? Okay, here we go.

It all starts with downloading Elasticsearch.
After extracting it, start it with

bin/elasticsearch -f

The -f parameter keeps Elasticsearch in the foreground, which gives you a little output, especially the port and host. By default this is localhost:9200.

So let’s get to the Django bit.
First thing to check is whether the model object you want to index for search has one or more foreign key fields.
If so, you might not want to index the ids (it is very unlikely that some user would search for an id).
So what to do? Since data is passed to Elasticsearch as a JSON object, we will use Django’s built-in serializer to convert our model object into JSON and then pass that on. The serializer provides an option called natural keys, which is enabled by adding

use_natural_keys = True

to the serializers.serialize('json', modelObject) call as a third argument. To use this successfully, the model which the foreign key field references has to be extended by a method natural_key.

As an example, let’s say we have two model classes: one is Product, which has a foreign key field manufacturer referencing a model of that name:

Manufacturer
    name
    address
    website...

Product
    prod_id
    name
    manufacturer <- there it is, a foreign key to the above
    price...

So if we want to index products for search we may want the manufacturer field to be a name (or a name and address combination etc.). Therefore we define a method “natural_key” in the Manufacturer class i.e.:

def natural_key(self):
    return (self.name,)  # a tuple, as Django expects for natural keys

Thus when serializing a Product the “unsearchable” ID is converted to the manufacturer’s name.
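
To picture the difference, here are two hand-written stand-ins for roughly what the serializer emits for one Product. The app label "shop" and all values are made up, and the second variant assumes natural_key() returns a one-element tuple, which ends up as a list in JSON.

```python
import json

# Without natural keys the foreign key is just the manufacturer's id.
without_natural_keys = {
    "model": "shop.product",
    "pk": 1,
    "fields": {"prod_id": "P-1", "name": "Widget", "manufacturer": 7},
}

# With natural keys the id is replaced by the value returned from
# Manufacturer.natural_key(); JSON represents the tuple as a list.
with_natural_keys = {
    "model": "shop.product",
    "pk": 1,
    "fields": {"prod_id": "P-1", "name": "Widget", "manufacturer": ["Acme"]},
}

print(json.dumps([with_natural_keys], indent=2))
```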

The general idea now is to pass the object as a serialized string to a function that then does the indexing on its own, doing something like this:

...
new_product = Product(...)
new_product.save()
myIndexModule.add_to_index(serializers.serialize('json', [new_product], use_natural_keys=True))

So, now to the indexing itself. I use pyelasticsearch for no special reason except that its documentation seemed decent.
The indexer is located in a module since I wanted it to be separated from the rest of the application and it is pretty short.

from pyelasticsearch import ElasticSearch
import json

ES = ElasticSearch('http://localhost:9200')

def add_to_index(string):
    deserialized = json.loads(string)
    for element in deserialized:
        element_id = element["pk"]
        # strip the module prefix (purely cosmetic)
        name = element["model"].split('.')[1]
        index = name + "-index"
        element_type = name
        data = element["fields"]
        ES.index(index, element_type, data, id=element_id)

That’s it. One could certainly do more sophisticated stuff (like plural for the index and singular for the element type, and then do something clever about irregular plurals…) but it does the job.
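
The naming logic can be tried without a running Elasticsearch by separating it from the ES.index call. The following is only a sketch: index_actions and the sample data are hypothetical names and values, and the function merely computes what would be passed on to the indexer.

```python
import json


def index_actions(serialized):
    """Turn Django-serialized JSON into (index, type, id, data) tuples."""
    actions = []
    for element in json.loads(serialized):
        # "shop.product" -> "product": drop the app label prefix
        name = element["model"].split(".")[1]
        actions.append((name + "-index", name, element["pk"], element["fields"]))
    return actions


# Made-up sample in the shape Django's JSON serializer emits
sample = json.dumps([
    {"model": "shop.product", "pk": 1,
     "fields": {"name": "Widget", "manufacturer": ["Acme"]}},
])

for index, element_type, element_id, data in index_actions(sample):
    print(index, element_type, element_id, data)
```

Each tuple lines up with the arguments of ES.index(index, element_type, data, id=element_id) in the indexer above.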

Now let’s use Elasticsearch as a datastore for an application.

But why should we do this? Let’s assume we have an application with a member and a non-member area. Members can do stuff on the database and non-members cannot. To provide a snappy experience for your members, you want to keep the database load from users who do not add anything to your service to a minimum. You don’t want them to clog the connection with database requests, so you decide to let Elasticsearch handle that.
And anyway, it’s just for fun 🙂

So the idea is to make an ajax call to elasticsearch and show a list of the last ten products added to the index to the user. In one of your views for non-members you put a javascript function like this:

$.getJSON('http://localhost:9200/product-index/_search?sort=added:desc&from=0&size=10', function(response){....})

and in the function you can now start to play around with the fields like

$.each(response.hits.hits, function(i, item){
     item._source.name
     ...
});

and present them to the users.
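
The same walk through the response can be sketched in Python. The response below is a hand-written stand-in: only the hits.hits[]._source nesting mirrors what the JavaScript above reads, and the values are made up.

```python
import json

# Hand-written stand-in for a search response
response = json.loads("""
{
  "hits": {
    "total": 2,
    "hits": [
      {"_index": "product-index", "_id": "1", "_source": {"name": "Widget"}},
      {"_index": "product-index", "_id": "2", "_source": {"name": "Gadget"}}
    ]
  }
}
""")

# Each hit carries the indexed document under "_source"
names = [hit["_source"]["name"] for hit in response["hits"]["hits"]]
print(names)
```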

Custom authentication in Django

After fiddling with Django’s auth app for a while I decided to rather have my own (I know, why should one do this? Answer: to learn).
It consists of several steps:

  1. registration
  2. activation
  3. adding a password
  4. login

First I created an app for user-management

 $ python manage.py startapp user_management

This gave me the structure to work with.
First I created the usermodel:

 from django.db import models    
 import bcrypt    

 class User(models.Model):

    email = models.CharField(max_length=100, unique=True)
    firstname = models.CharField(max_length=30)
    lastname = models.CharField(max_length=30)
    password = models.CharField(max_length=128)
    last_login = models.DateTimeField(auto_now=True)
    registered_at = models.DateTimeField(auto_now_add=True)
    core_member = models.BooleanField()
    activation_key = models.CharField(max_length=50, null=True)    

The idea here was to use the email as username and to have that unique. I don’t consider usernames a good choice for logins but rather a feature for profiles, but that depends on one’s taste, I think.

The registration view is pretty straightforward. I create a RegistrationForm object with fields for email, first and last name.
The activation_key is simply a string of randomly chosen ASCII characters and digits.
Activation itself is just creating a link, sending it and comparing the random part of the link with the stored string. If they match, is_active is set to True and the user can set his/her password. For passwords I normally store bcrypt hashes in the database (NEVER store plaintext passwords in a database!). This is quite simple and can be done by following this description.
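
One way such a key could be generated and checked, as a sketch: make_activation_key and keys_match are hypothetical helpers, the length of 40 is an arbitrary choice that fits the max_length=50 of the model field, and the secrets module is my pick for the random source, not necessarily what the original code used.

```python
import hmac
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def make_activation_key(length=40):
    """Random key of ASCII letters and digits (fits max_length=50)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def keys_match(key_from_link, stored_key):
    """Compare the key from the activation link with the stored one."""
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(key_from_link.encode(), stored_key.encode())


key = make_activation_key()
print(key, keys_match(key, key))
```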

The function for setting the password goes into the model. For this to work I use a classmethod. As the name suggests, this is a method bound to the class, not to an instance of it. That allows fetching objects via cls.objects.get(); cls is the classmethod’s equivalent of self in instance methods.

@classmethod
def set_password(cls, user_id, plain_pass):    
    secret = bcrypt.hashpw(plain_pass, bcrypt.gensalt())
    user = cls.objects.get(pk=user_id)
    user.password = secret
    user.save()
    return True
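
For readers new to classmethods, here is a stripped-down illustration without Django. The class-level dict _by_id is a made-up stand-in for the objects manager, and the "hash" is just a placeholder string.

```python
class User:
    # Stand-in for Django's objects manager: a class-level dict of instances
    _by_id = {}

    def __init__(self, user_id, email):
        self.id = user_id
        self.email = email
        self.password = None
        User._by_id[user_id] = self

    @classmethod
    def set_password(cls, user_id, hashed):
        # cls is the class itself, so the lookup needs no existing instance
        user = cls._by_id[user_id]
        user.password = hashed
        return True


User(1, "someone@example.com")
User.set_password(1, "not-a-real-hash")  # called on the class, not an instance
print(User._by_id[1].password)
```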

The login process itself is done via another classmethod which I named authenticate:

@classmethod
def authenticate(cls, email, password, request):
    user = cls.objects.get(email__exact=email)
    if bcrypt.hashpw(password, user.password) == user.password:
        request.session['user_id'] = user.id
        user.save() # this is to get last_login updated
        return user
    else:
        return None

(In order for this to work you have to enable the session middleware and the session app in settings.py.)

So, a quick rundown.

Since I use the email as a unique identifier for the login, the function expects an email address, which is used to find the person to authenticate, the plaintext password (e.g. as given from an input field) and the request object to make use of a session. (I use database session handling for development but there are alternatives described in the Django docs.)

bcrypt.hashpw re-hashes the given plaintext password using the salt contained in the stored hash. The comparison is True if the result matches the stored hash and False if not.
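
The re-hash-and-compare idea can be made concrete without installing bcrypt. The sketch below uses PBKDF2 from Python’s standard library as a stand-in; it is not the bcrypt algorithm, and the "salt$digest" storage format, the function names and the iteration count are assumptions made for this illustration. bcrypt itself embeds the salt in its hash, which is why the stored hash can be passed as the second argument to hashpw.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # arbitrary choice for this sketch


def hash_password(plain, salt=None):
    """Return 'salt$digest' (both hex); the format is made up for this sketch."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", plain.encode(), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def check_password(plain, stored):
    """Re-hash plain with the salt taken from stored and compare the results."""
    salt_hex = stored.split("$")[0]
    candidate = hash_password(plain, bytes.fromhex(salt_hex))
    return hmac.compare_digest(candidate, stored)


stored = hash_password("correct horse")
print(check_password("correct horse", stored), check_password("wrong", stored))
```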

After having checked that the user has given the right credentials, I store the user_id in the session, which allows me to get the full set of user information should I need it.

I save the user to trigger the auto_now option of the last_login field in the user model, which updates it to the current time.

Now with

User.authenticate(email, password, request) 

the user is logged in.

Setting up my own flavour of Django

Okay, so I started doing stuff in Python and of course started playing around with Django. Being used to padrinorb’s convenient generators, I had to figure out how to get to my preferred setup. This is what I do:

  1. Run

    django-admin.py startproject projectname

  2. in settings.py
    add

    import os.path
    and add

    os.path.join(os.path.dirname(__file__), 'templates')
    to TEMPLATE_DIRS

  3. Make dir templates/ in the project folder
  4. Make dir views/ in the project folder
  5. Add an __init__.py file to views/
  6. Import your views in __init__.py (e.g.

    from index import hello
    if you have a view file called index.py containing a function hello())

  7. In templates I put subdirs for all sites and a base.html which holds the frame for all sites.
  8. Now in urls.py import all views via

    from views import *

So, this gives me a view and a template dir as well as a frame for the sites.

Now that I got the views and templates going I would like to have a separate dir for static content. Django’s static dir is simply /static, which is fine by me, but making a directory named static and putting stuff in won’t do. You have to put this into settings.py:


STATICFILES_DIRS = (os.path.join(os.path.dirname(__file__), 'static/'),)

After putting


{% load staticfiles %}

into base.html, you can insert static files like CSS, images and so on by putting


{% static "foo/bar.ext" %}

into the template.

Legacy code and the “SuperProgrammer”

I started an online Python course some days ago and part of the assignment is to peer-evaluate other people’s code. The task was to print a message on the screen. Yes, I know, a boring task.
There I came upon something like this:

string = "dxlxrxoxWx xoxlxlxexH"
string = string[::-2]
print string

And this, in three lines, is the essence of the problems I’ve encountered over the years with big, complex projects and legacy code. Remarkably, it seems to be a trap each project’s “Super-Programmer” falls into…

1. Show-off programming
It’s okay to be proud of one’s knowledge but, come on, this is about the job, not your ego.

2. The code is the documentation
NO, definitely not. Code is just a small part of any bigger or more complex project. There usually are configuration, directory structures, external dependencies (libraries) etc. Put it somewhere to be seen (the init file, a readme, a getting-started txt file) but don’t assume.

3. Don’t oversmart
You found this very cool, super cryptic looking function that does unexpected things… Yeah, better use something that can be understood right away, or at least leave a comment about what it does.

4. Modularize to death
Especially in Ruby (but in any other language as well) I found many people building modules around simple functions, metaprogramming things to bits and doing stuff they found in years-old posts somewhere.
Those techniques are all good and useful at times but not every function is predestined to be reused in another project, so why not declare it a helper function?
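
Point 3 can be illustrated with the snippet from the top. This is just a sketch, using a variant of the string that decodes to "Hello World": if the slice trick has to stay, a name and a docstring already help, and for the actual assignment the plain version is the one a reviewer can grade at a glance.

```python
ENCODED = "dxlxrxoxWx xoxlxlxexH"


def decode(text):
    """Undo the padding: take every second character, starting from the end."""
    return text[::-2]


# The version a reviewer can grade at a glance
MESSAGE = "Hello World"

print(MESSAGE)
print(decode(ENCODED) == MESSAGE)
```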

In short:
1. Write code that can be read by an average coder, not just by the “Super-Programmer”. Projects’ or companies’ dev teams seldom have an even knowledge distribution. (And in most cases you don’t even want that.)
2. Documentation!
3. Comments!
4. Put your ego aside. I rarely think stuff like “Oh my, he/she came up with a fancy solution”, mostly it is along the lines of “WTF! Why didn’t he use the obvious solution?” So if there is a reason for doing it differently go back to point 2 or 3.