Posts Tagged ‘python’

Request for comments/volunteers for the Aristotle Metadata Registry

This is a request for comments and volunteers for an open source ISO 11179 metadata registry I have been working on called the Aristotle Metadata Registry (Aristotle-MDR). Aristotle-MDR is a Django/Python application that provides an authoring environment for a wide variety of 11179-compliant metadata objects, with a focus on being multilingual. As such, I’m hoping to raise interest among bug checkers, translators, experienced HTML and Python programmers, and data modelers who can help with mapping ISO 11179 to DDI3.2 (and potentially other formats).

For the eager:

Background

Aristotle-MDR is based on the Australian Institute of Health and Welfare’s METeOR Registry, an ISO 11179 compliant authoring tool that manages several thousand metadata items for tracking health, community services, hospital and primary care statistics. I have undertaken the Aristotle-MDR project to build upon the ideas behind METeOR and extend it to improve compliance with 11179, and also to allow for access and discovery using other standards, including DDI and GSIM.

Aristotle-MDR is built on a number of existing open source frameworks, including Django, Haystack, Bootstrap and jQuery, which allow it to scale easily from mobile to desktop on the client side, and from small shared hosting to full-scale enterprise environments on the server side. Alongside the in-built authoring suite is the Haystack search platform, which allows for a range of search solutions, from enterprise search engines such as Solr or Elasticsearch to smaller-scale search engines.

The goal of Aristotle-MDR is to conform to the ISO/IEC 11179 standard as closely as possible, so while it has a limited range of metadata objects, much like the 11179 standard it allows for the easy extension and inclusion of additional items. Among those already available are extensions for:

Information on how to create custom objects can be found in the documentation: http://aristotle-metadata-registry.readthedocs.org/en/latest/extensions/index.html

Because users need to access information in a wide variety of ways, there is a download extension API that allows for the creation of many different download formats. Included is the ability to generate PDF versions of content from simple HTML templates, and an additional module allows for the creation of DDI3.2 (at the moment this supports only a small number of objects): https://github.com/aristotle-mdr/aristotle-ddi-utils

As mentioned, this is a call for comments and volunteers. First and foremost I’d appreciate as much help as possible with my mapping of 11179 objects to DDI3.2 (or earlier versions), but also with the translations for the user interface – which is currently available in English and Swedish (thanks to Olof Olsson). Partial translations into other languages are available thanks to translations in the Django source code, but additional translations of technical terms would be appreciated. More information on how to contribute to translating is available on the wiki: https://github.com/aristotle-mdr/aristotle-metadata-registry/wiki/Providing-translations.

To aid with this I’ve added a few blank translation files in common languages. Once the repository is forked, it should be relatively straightforward to edit these in GitHub and send a pull request back without having to pull down the entire codebase. These are listed by ISO 639-1 code, and if you don’t see your own language listed, let me know and I can quickly pop a boilerplate translation file in.

https://github.com/aristotle-mdr/aristotle-metadata-registry/tree/master/aristotle_mdr/locale

If you find bugs or identify areas of work, feel free to raise them either by emailing me or by raising a bug on GitHub: https://github.com/aristotle-mdr/aristotle-metadata-registry/issues

Beginning the soft launch of SQBL and Canard

Over the past week I’ve started finalising a version of Canard and SQBL ready for early-Beta testing and public review ahead of IASSIST2013. While I’ll be putting together more documentation later in the week, here is the first of a series of short tutorials on how Canard will eventually be used.

Also, later this week the source code for Canard, as shown in the video below, will be released on GitHub, along with a beta binary for ease of use during testing. For now, the SQBL schemas can be seen on GitHub and the main SQBL website contains more information. In the meantime, enjoy the two videos below to see how a strict structure can make questionnaire design easier than ever before!

How you can and why you should learn to program.

Often people ask me how I learned to program and why I did. The answer is simple: lots of practice, because it’s a useful skill – and it also helps that I enjoy it.

Not everyone will find programming fun or interesting; however, at some point most people will come up against a problem that computers were made to solve. Contrary to popular belief, computers are dumb – at least in the sense that they can only do what they are told. What they can do is these dumb things very, very quickly. So much so that they can fool you into believing they are smart – more than smart, magic even. In fact, if you are even a mediocre programmer, people can become convinced you are a magician.

So why should you learn to program?

Mostly, because your time is valuable. Not to me, but to you. Unlike a computer’s, your time is finite, and if you can make a machine give you more time to do something else, isn’t that in your interest? Even if it isn’t for work, if it’s just sorting your taxes or writing a script to check your email for you, there are plenty of small, repetitive tasks that you probably do that a machine can do quicker. If you enjoy doing repetitive tasks, then there isn’t much I can do for you. But if you want to spend more time understanding why you do these things, then read on…

Where do you begin?

Well, I think there are 3 programs everyone must be able to write. If you can write these 3 programs and adapt them to your needs, you can do most of the big, boring tasks that will come your way.

As I explain each of the 3 programs, I’ll show you a brief example to edit, play with and ultimately understand. These examples are written in Python, a free programming language with a very user-friendly syntax.

Hello, World!

“Hello world” is traditionally the first program many new programmers will write. It is simple: when the program is run, the computer prints “Hello, World!”. In essence, this simple program introduces programming syntax and demonstrates how to display text to a user.

print("Hello, World!")

There isn’t much to this, but it’s a starting point. It teaches you some basic syntax, and with a lot of languages understanding syntax is important – computers don’t speak English, and to make them useful you need to learn to talk to them, rather than the other way around.

Simple user interaction

The second is a simple string manipulator. The goal is to create a simple, persistent user interface, with some error checking, that fulfills a task. Here we see an example that performs an action on a given string based on the command given to it.

while True:
    line = input("> ")
    try:
        cmd, text = line.split(":", 1)
        if cmd == "uc":
            print(text.upper())
        elif cmd == "lc":
            print(text.lower())
        elif cmd == "rev":
            # Strings have no reverse() method, so slice backwards instead
            print(text[::-1])
        elif cmd == "quit":
            print("Bye")
            break
        else:
            print("Command not recognised")
    except ValueError:
        print("Syntax error: enter a command, a colon (:), then a string")

Firstly, we start the loop and set it to never stop looping. As long as the user wants to play with strings this program will keep going. Next we ask for some input from the user.
Now things get a little more complex. First we try to split the string around a colon, into a command and the text. If the user doesn’t enter a colon, the split fails with an error, and we give them some help text (after the line that starts with “except”).
If they do enter a command and text, we set the text to upper case, lower case or reverse it. If they tell us to “quit:” we quit by breaking out of the loop (the break command); otherwise we tell them we didn’t recognise the command.

It’s not perfect, but it gives us an understanding of errors, handling user input and basic user interaction – not bad for 18 lines of code.
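
One way to make the loop above easier to experiment with is to move the command handling into a function of its own, so it can be tried out without typing at a prompt. The sketch below assumes Python 3, and the name run_command is made up purely for this illustration; it is one possible arrangement rather than the only sensible one.

```python
def run_command(line):
    """Apply a 'command:text' instruction and return the resulting string."""
    try:
        # Split at the first colon only, so the text may contain colons itself
        cmd, text = line.split(":", 1)
    except ValueError:
        # No colon at all: split() gave back one piece, so unpacking failed
        return "Syntax error: enter a command, a colon (:), then a string"
    if cmd == "uc":
        return text.upper()
    elif cmd == "lc":
        return text.lower()
    elif cmd == "rev":
        # Slicing with a step of -1 walks the string backwards
        return text[::-1]
    return "Command not recognised"


if __name__ == "__main__":
    # Try a few inputs, including one with no colon
    for example in ["uc:hello", "rev:abc", "oops"]:
        print(run_command(example))
```

Because the function returns a string instead of printing it, the same logic could sit behind the interactive loop, a test, or some other script.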

File manipulation

The last and most important program is a simple file manipulation tool. Again, what the tool does to the file is irrelevant – it might merge files, look for spelling errors, count lines, anything. Perhaps we are looking for entries in a diary that start with numbers (like dates) in a large file, and only want to view these.

diary = open('test')
for line in diary:
    if line[0].isdigit():
        print(line, end='')  # each line already ends with its own newline
diary.close()

Here we open the file, and then line by line we search through it. When we find a line whose first character is a digit we print the line (line[0] essentially means the 0th character from the start – it’s complicated, but it’s how almost everyone deals with subscripting in lists). Lastly, we clean everything up by closing the file.

Again, by no means the best implementation, but easy enough to read and alter. This time in 5 lines we have a simple script that could help us find our tax information, search our diary, look up phone numbers, or anything like this.
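
A slightly more reusable take on the same idea wraps the search in a function and lets a with block close the file automatically. This is only a sketch, assuming Python 3; the function name lines_starting_with_digit and the file name diary.txt are invented for the example.

```python
def lines_starting_with_digit(path):
    """Return the lines of the file at `path` whose first character is a digit."""
    matches = []
    # The with block closes the file for us, even if an error occurs part-way
    with open(path) as diary:
        for line in diary:
            # line[:1] is the first character, or "" if the line is empty
            if line[:1].isdigit():
                matches.append(line.rstrip("\n"))
    return matches


if __name__ == "__main__":
    # Write a tiny sample diary so the example can be run as-is
    with open("diary.txt", "w") as sample:
        sample.write("2013-05-01 Filed taxes\nNote to self\n14 May: dentist\n")
    for entry in lines_starting_with_digit("diary.txt"):
        print(entry)
```

Returning a list rather than printing straight away means the matching lines can be counted, sorted or written to another file afterwards.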

I’ve done this, what now?

Do whatever task it is that needs doing. Odds are, having read these short code snippets, you can get an idea of what can be done. It’s just a matter of getting out and doing it. You may not be the next Bill Gates/Mark Zuckerberg/whoever, but if you learn to paint a wall you won’t be the next Da Vinci either. Learn how to do what you need to do, and keep plugging away. Programming is about pulling together pieces of logic to help us do simple tasks easily and reproducibly. So think lazy, and find all the tasks that you can automate and get something else to do them for you.

Virgil UI 0.0.1 Beta now live!!

After months of development, testing, coding and crying… Virgil UI version 0.0.1b is now available for public beta testing.

This release sees the first public testing of a fully functional classification- and codelist-specific editor based on and supporting the DDI Lifecycle XML format (DLML).

Features in this release of Virgil include:

Known issues in the 0.0.1b release that will be fixed in a future release:

  • Codes or languages cannot be removed once added.
  • New CodeSchemes cannot be added manually, only when importing from CSV.
Also new is an updated version of the standalone CSV to DDI converter tool that fixes some outstanding bugs in multilingual imports and corrects a few mistakes when writing the DLML.
For more information on Virgil-UI there is a list of blog posts outlining the development process, or you can check out the Google Code page, view all the downloads, or submit bugs.

Updates to the Virgil CSV to DDI Converter

A short and sweet update:

There was an oversight with the CSV converter not converting coded values to the proper place in the created DDI XML. This has been fixed and the changes have been pushed into SVN and a new version (0.0.2b) of the executable has been released on Google Code.