Archive for the ‘ Drivel ’ Category

Why I will not be renewing my CCNA

A little over three years ago I was a programmer in the Network Services department of a very successful internet services provider. As part of my career advancement it was recommended that I pursue some certification. As I was working with very skilled network engineers, the obvious choice was to study for a Cisco certification – one of the most common IT certifications available. Over two months of extremely intense study I was able to pass both exams and become a Cisco Certified Network Associate. I was able to better understand my colleagues and improve my understanding of fundamental networking.

Flash forward to now – I’ve long since left that job and now work as an Information Analyst for the Australian Government. Most of my work is in business analysis and theoretical computer science. I barely even administer my own home network and have forgotten most of what I knew about networking.

So why am I not renewing my CCNA?

Because it is intellectually dishonest. When I got certified, it served as proof of my understanding, or at least of my ability to understand. While my career was still in its infancy, this helped demonstrate my capability. Now, however, I have experience as a programmer and analyst, but I no longer work with networking.

Certification is in essence two things: a resume substitute for experience and proof of ability. If I were to renew my certification (which would require quite a bit of relearning) it would only be a disadvantage, for me and for every other person with a CCNA accreditation. If in the future I were hired on the basis of my accreditation, I would soon be found out as someone who is now relatively unskilled with networks, which would only make me look bad. Worse, though, anyone who holds an accreditation without the appropriate experience or skill makes everyone else who holds that certification look bad by association, by reducing the surety of the accreditation as an indicator of that skill.

As such, holding any certificate without the appropriate skill to back it up is both not in my own self-interest and morally bankrupt.

DDI Examples repository now available

One of the topics for discussion during the DDI Developers meeting was a lack of space for examples of actual DDI code. I hope to fix this problem through a new Google Code repository for DDI Examples. While it is a little sparse at the moment, I know there are a few other good examples of DDI metadata available for people to contribute.

At the moment there are two ways that this can be done: I can either make volunteers permanent contributors to the repository so they can share files directly, or people are free to raise tasks and files can be uploaded by myself or another contributor.

I have also split the repository at the moment into three sections:

  1. Simple Examples: used to illustrate a simple concept or use case – such as a Question, Concept or Study – but not ideal for actual implementation.
  2. Complex Examples: to demonstrate more complex ideas, such as referencing, packaging or schemes.
  3. Practical Examples: highly complex examples that replicate an actual problem or real world scenario completely.

At the moment, there are only two files up on the site – the Australian and New Zealand Standard Industry Classification and the Australian Drugs of Concern Classification. Both of these were generated using Virgil-UI and have had some minor tweaks applied so they are DDI3.1 valid.

With time, I hope this can be a comprehensive resource that gives users easy access to DDI examples and lowers the barrier to entry to the standard.

Always double check the standard before writing code

A few weeks ago, I had the privilege of presenting at a collection of DDI Developers in Gothenburg at EDDI. There I presented one of my larger pieces of work, the Virgil-UI DDI Codelist Editor, for critique. While there I received advice, praise and most importantly constructive criticism for which I am grateful. However, this has brought to light a rather large problem.

It was pointed out that I made a small error when dealing with <Code> elements in DDI and accidentally gave them @id attributes, and it was noted that this should be an easy fix. Unfortunately, because I missed this very early on in the development of Virgil, the underlying model relies on Codes having ids to be able to easily make connections between the hierarchical user interface, the <Code>s and the <Category>s that give them meaning.

What this means is that the DDI coming out of Virgil is invalid, and any valid DDI cannot actually be read by Virgil. Essentially, the Virgil model for handling DDI is broken and needs to be almost entirely rewritten, and this might take quite a while.

Unfortunately, at this stage rewriting also means re-examining a lot of the initial ideas about what Virgil should be and has highlighted some interesting questions about the DDI model and DDI software, such as:

  1. Is abstracting the DDI model away from a user a good approach to software design? Yes.
    This was the crux of my talk at EDDI, and I still feel that abstracting the DDI model away from day-to-day users is necessary. The DDI model is complex and covers a wide range of tasks. I believe that designing software that helps users relate the model to specific tasks they are trying to do is a key to getting people to use DDI and think about how they can make their metadata support themselves and those around them.
  2. Is DDI a standard that is suitable to use for day to day management of information? Probably.
    In practice, the DDI standard needs to be able to be passed between software if it is to move from an archival standard to a practical statistical metadata standard. One of the things I wanted to achieve with Virgil, was a tool that not only produced DDI, but could also consume it from other sources. In the simplest case this to me meant being able to take a DDI file, and edit the contents of part of it, leaving the rest untouched, and in a lot of cases this is possible with DDI. However, since having to rethink how to manage classifications using DDI, I have realised that there are some objects that are not captured well within DDI and unfortunately classifications are one such example.
  3. Is the DDI model for managing codelists and classifications good enough? Sadly not.
    One of the reasons I relied so heavily on the invalid <Code> @ids was that I needed a hook to tie codes and categories together and without this it becomes very difficult to manage what a ‘classification’ is in DDI. Furthermore, classifications don’t exist in DDI per se, but are a rather loose agreement that if you combine <CodeScheme>s and <CategoryScheme>s you get a good approximation. However, this falls apart when we try to document the classification itself.
    For example, where do you store the name of a whole classification? There are three viable places (excuse the XPath) – as a //CodeScheme/Label (being the label of the hierarchy), as a //CategoryScheme/Label (being the label of the collection of classifying categories) or as a //LogicalProduct/Label (the label of the immediate parent that contains both the hierarchies and the categories).
    However, each of these approaches has inherent issues, as none of them is the documented way to manage this information, and if three different agencies approached the problem in different ways, their metadata would become incomparable. This needs to be discussed further, as it will become a bigger issue as more tools start trying to manage a piece of metadata that is so important and sits so early in the lifecycle.
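
To make the ambiguity in point 3 concrete, here is a minimal Python sketch that looks for a classification name in each of the three candidate locations. The XML is a deliberately simplified, namespace-free stand-in and is not schema-valid DDI 3.1; the `Wrapper` element, the label text and the function name are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free stand-in for a DDI fragment.
# This is NOT schema-valid DDI; "Wrapper" and the label text are hypothetical.
SAMPLE = """
<Wrapper>
  <LogicalProduct>
    <Label>Industry Classification</Label>
    <CategoryScheme>
      <Label>Industry Categories</Label>
    </CategoryScheme>
    <CodeScheme>
      <Label>Industry Hierarchy</Label>
    </CodeScheme>
  </LogicalProduct>
</Wrapper>
"""

# The three places a classification name could plausibly live.
CANDIDATE_PATHS = [
    ".//CodeScheme/Label",
    ".//CategoryScheme/Label",
    ".//LogicalProduct/Label",
]


def candidate_names(xml_text):
    """Return the label text found at each candidate path (None if absent)."""
    root = ET.fromstring(xml_text)
    names = {}
    for path in CANDIDATE_PATHS:
        node = root.find(path)
        names[path] = node.text if node is not None else None
    return names


for path, name in candidate_names(SAMPLE).items():
    print(path, "->", name)
```

A tool that reads metadata from several agencies would have to check all three paths and then guess which label is the "real" name, which is exactly the comparability problem described above.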

It should be noted that these issues don’t excuse overlooking the actual standard, which is what led to this predicament. However, having to re-examine how to correct the problem in Virgil also gives me a chance to examine some of the issues I came across while trying to maintain classifications within DDI. Over the coming month or so I am going to continue writing up some of the issues I identified with classifications within DDI3.1 and how to work around them in the short term, and look at ways to correct the problem in future versions of the standard.

Lastly, in the short-term there will be an update to correct the Code/id problem in the CSV to DDI conversion, so the original use case of being able to mine legacy systems to produce valid DDI will still be filled.

Thanks again to everyone at EDDI for their input and company.

Farewell to Europe (and EDDI) for another year

Here I sit in Helsinki Airport, awaiting a bitter sweet flight home. While it is always good to go home and be with my family and friends, I know I am leaving quite a few behind here in Europe and beyond.

By all accounts, the European DDI Users Group meetings were a great success. Along with seeing all the work people have done over the last year, we were able to sit and discuss and debate for several days, and we now have a solid plan for future work.

While I was only at the Developers meetings, we covered improvements to the website, new ways of managing large DDI instances in relational and non-relational databases, examined new (and forgotten) ways to design software, debated the best ways to handle automated ID creation, listened to the results of the semantic DDI workshops, learned about the DDI Agency Registry, debated reducing or removing namespaces from DDI, raised the possibility of a shared DDI Blog/News aggregator and started the creation of not one, but two major additions to the DDI community – a new transport element nicknamed “The DDI Bucket” and started laying the groundwork for a DDI RESTful web interface standard.

And that was in just 3 days! And I am still eagerly waiting to see how the “Data Without Borders” and “Longitudinal DDI” workshops went.

The week was made even more productive by the use of Google Docs to create a single, living recollection of the event. Watching everyone type up their notes in real time was great. Over the next few weeks I (and hopefully the rest of the DDI Developers community) will continue to clean up our collaborative notes and look forward to presenting information and recommendations to the whole DDI community in the new year.

We also discussed upcoming meetings for the DDI Developers group and 3 possibilities were raised: at IASSIST in June, RC33 in July and EDDI next December. While meetings will most likely go on at all of these events, I strongly encourage those who can come to RC33, to be held in Sydney next July, to speak up or at the least contact me in private. There is a wealth of talent in Australia and New Zealand that is well worth getting in contact with, and with a large enough group of DDI members in Australia I think a “DDI Developers Down-under” would be well attended and well worth the trip.

So with that in mind, thank you to everyone in the DDI Community for a great week – and especially to Olof Olsson of SND for kindly offering me a place to stay during the week. It was a fantastic week, and served to remind me that if you work hard you can contribute to a community. Being called upon to answer questions during the meetings (and once during the question time of someone else’s talk!) was especially flattering. This has truly re-invigorated my love of metadata (I spent the better part of my evenings in Rome madly writing ideas for tutorials and examples I foolishly volunteered for during the meetings).

So, with that I wish the entire DDI Community a Merry Christmas, Happy Holidays and Happy New Year and look forward to seeing everyone again in the new year, be it in Washington for IASSIST 2012, Sydney for RC33 or wonderful Bergen for EDDI 2012!!!

Arrivederci

https://lh6.googleusercontent.com/-Bt7M3EmQOVo/TuY2PWCS8hI/AAAAAAAAEOc/nmKN-M6tK-M/s512/IMG_20111211_184959.jpg

Sing a song of software, bubbles full of lies; 4 and 20 years of stocks audio-lised with Py

I’ve recently been toying with the idea of using music as a form of exploratory data analysis. While the use of sound to monitor data isn’t new, it’s still relatively uncommon. I occasionally find myself trying to make sense of large data sets, and finding a way to quickly analyse them and locate the points of interest can be quite tedious. So I thought about ways someone with no music skill could generate sound from data and produce something relatively melodic, and useful for highlighting patterns and anomalies in the data.

Sing a song of software,

To test this out I put together a little tune that covers the past 24 years of stock information from Microsoft (Piano), Apple (Clarinet) and Google (Xylophone). The pitch is proportional to the price of the stock, with low tones being low prices and high tones being high prices. The volume of each instrument is proportional to the volume of sales over the period, so a quiet note is a low-volume day, while a loud note is a period of higher trading volume.

There are two versions of the music available:

A shorter 2 minute, up-tempo version using weekly stock prices: OGG, Midi – This one is short and to the point, but some of the nuances, like big daily trade spikes are missed.

A longer, 17 minute, version using daily prices: OGG, Midi – This one is a little monotonous at the start, but you can hear Apple come from a tiny instrument in the background to a larger force much better. It also lets you hear some of Apple’s big trading days.

Bubbles full of lies;

A few things to listen out for:

  • Early on, listen for Microsoft’s speedy ascent during the 2000s tech boom, and an even quicker decline. (About 1:00 in on the quicker version)
  • Apple has for a long time a consistently low trade volume, however occasionally you will hear loud piano strikes starting from the early 2000s. (About x minutes in.) These are peaks of stock sales, probably around MacWorld and iPod/Phone/Pad announcements.
  • After about 2005, you can hear Apple and Google slowly rise in volume and stock price, while Microsoft remains in a consistent range throughout the same period. (After 2:00 in the short version)

4 and 20 years of stocks,

All of this data was pulled from the historical stock price data sets available on Google Finance. Why 24 years’ worth? Because it fit with the theme of the nursery rhyme I was trying to mimic. It’s pretty touch and go as to what data you can download from Google Finance, but to be fair, from my understanding this is an issue with the exchanges rather than with Google.

Audio-lised with Py(thon).

So the nitty gritty on how it works:

It’s a Python script that loops through a set of files of output data from Google Finance and, using midiutil, creates a Midi file. Each day’s (or week’s) data point is weighted so the values remain within a specific range for a specific instrument, and the volumes are adjusted so that each instrument can be detected. Without either of these it really is quite a mish-mash of sound.
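
As a rough illustration of that weighting step, here is a minimal sketch rather than the original script. It assumes the export has `Date,Open,High,Low,Close,Volume` columns with the newest row first; the ticker keys, pitch bands and velocity range are made-up values chosen only to keep the instruments apart. It stops short of writing the Midi file: each resulting tuple could then be handed to midiutil's `addNote`.

```python
import csv
import io

# Hypothetical MIDI pitch bands, one per instrument, so the voices stay apart.
PITCH_BANDS = {"MSFT": (30, 54), "AAPL": (60, 84), "GOOG": (90, 114)}
# Hypothetical velocity range: the floor keeps quiet trading days audible.
VELOCITY_RANGE = (40, 120)


def rescale(value, src_min, src_max, dst_min, dst_max):
    """Linearly map value from [src_min, src_max] onto [dst_min, dst_max]."""
    if src_max == src_min:
        return dst_min
    ratio = (value - src_min) / (src_max - src_min)
    return round(dst_min + ratio * (dst_max - dst_min))


def notes_from_csv(csv_text, symbol):
    """Return (beat, pitch, velocity) tuples, one per row, oldest first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.reverse()  # assumption: the export lists the newest date first
    closes = [float(row["Close"]) for row in rows]
    volumes = [float(row["Volume"]) for row in rows]
    low_pitch, high_pitch = PITCH_BANDS[symbol]
    notes = []
    for beat, (close, volume) in enumerate(zip(closes, volumes)):
        pitch = rescale(close, min(closes), max(closes), low_pitch, high_pitch)
        velocity = rescale(volume, min(volumes), max(volumes), *VELOCITY_RANGE)
        notes.append((beat, pitch, velocity))
    return notes


# Made-up sample data, not real stock prices.
sample = """Date,Open,High,Low,Close,Volume
3-Jan-12,10.0,12.5,9.5,12.0,3000
2-Jan-12,9.0,10.5,8.5,10.0,1000
1-Jan-12,8.0,9.0,7.5,8.0,2000
"""
print(notes_from_csv(sample, "AAPL"))
```

Scaling each stock against its own minimum and maximum means every instrument uses its full band, at the cost of making absolute prices incomparable between instruments.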

This output Midi file is then run through Timidity++ to create an Ogg/Vorbis file. Converting to Ogg is only necessary for consistency, but both the Midi and the Ogg are available.

Future work and ideas

Well the goal is to be able to use a technique like this to listen to large multi-variate datasets, that have either a time dimension, or a continuous dependent variable (heights, weights, etc…). As long as one dimension has values that are relatively evenly and closely distributed with few overlaps and a wide enough spread it should be possible to ‘graph’ probably any dataset meaningfully as audio.

Erdos – A javascript interface to create Graphviz charts

So I found myself needing to generate large graphs, and I’ve found that Graphviz and its DOT language are the easiest way to produce large graph diagrams. Unfortunately, it’s not always easy to get access to the tools you need immediately, so to resolve this, I set about creating an easy online Graphviz compiler.

In looking I did find a few existing tools, but those I found did a lot of the processing themselves, and limited the number of possible nodes. After some searching though, I found the Google Chart API which has a Graphviz option that renders up to 200 nodes and 400 edges.

Using a little jQuery, a little HTML, and very little CSS I was quickly able to put together a tool for drawing graphs on the go.

To see it in action check it out in the sandbox.
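
For anyone curious about what a request to the Chart API's Graphviz option looks like, here is a small sketch in Python rather than the JavaScript the tool itself uses. It assumes the image chart endpoint shown in the code and the `cht=gv` and `chl` parameters; the function name is made up, and the service may no longer be available, so treat this as an illustration of the URL shape only.

```python
from urllib.parse import urlencode

# Assumed endpoint for the Google Chart image API; it may no longer respond.
CHART_ENDPOINT = "https://chart.googleapis.com/chart"


def graphviz_chart_url(dot_source):
    """Build a URL asking the Chart API to render the given DOT source."""
    # cht=gv selects the Graphviz chart type, chl carries the DOT text.
    query = urlencode({"cht": "gv", "chl": dot_source})
    return CHART_ENDPOINT + "?" + query


print(graphviz_chart_url("digraph{A->B}"))
```

Because the whole graph travels in the query string, the DOT source has to be URL-encoded.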

Autocompletion using PyQt4 and QScintilla

I’m currently toying with the idea of creating a specialised code editor for questionnaires in Python using PyQt4. One of the best plugins for Qt for this job is QScintilla, a wrapper around Scintilla - an open source editing component. Scintilla, to its credit, has a huge number of great features, including code-folding, syntax highlighting and auto-completion of words. But the documentation can be a little lacking at times, and there aren’t a whole lot of examples on how to use some of the features.

The one feature that I was trying to implement was auto-completion, but I had no luck finding Python examples that demonstrated everything you needed to get it working. So, after some searching, some hacking, some crying and then finally some reading of the documentation, I came up with the following minimal example with auto-completion in a basic editor:

#!/usr/bin/env python
# -*- coding: latin1 -*-
 
"""
Basic use of the QScintilla2 widget
 
Note : name this file "qt4_sci_ac_test.py"
Base code originally from: http://kib2.free.fr/tutos/PyQt4/QScintilla2.html
"""
 
import sys
from PyQt4.QtGui import QApplication
from PyQt4 import QtCore, QtGui, Qsci
from PyQt4.Qsci import QsciScintilla, QsciScintillaBase, QsciLexerPython
 
if __name__ == "__main__":
    app = QApplication(sys.argv)
    editor = QsciScintilla()
 
    ## Choose a lexer
    ## This can be any Scintilla lexer, but the original example used Python
    lexer = QsciLexerPython()
 
    ## Create an API for us to populate with our autocomplete terms
    api = Qsci.QsciAPIs(lexer)
    ## Add autocompletion strings
    api.add("aLongString")
    api.add("aLongerString")
    api.add("aDifferentString")
    api.add("sOmethingElse")
    ## Compile the api for use in the lexer
    api.prepare()
 
    editor.setLexer(lexer)
 
    ## Set the length of the string before the editor tries to autocomplete
    ## In practice this would be higher than 1
    ## But it's set lower here to make the autocompletion more obvious
    editor.setAutoCompletionThreshold(1)
    ## Tell the editor we are using a QsciAPI for the autocompletion
    editor.setAutoCompletionSource(QsciScintilla.AcsAPIs)
 
    ## Render on screen
    editor.show()
 
    ## Show this file in the editor
    editor.setText(open("qt4_sci_ac_test.py").read())
    sys.exit(app.exec_())

Thanks to Kib2, whose QScintilla2 example code this is based on.

Virgil UI – Beta demo video

Just a quick update that was supposed to have gone up last night. There is a video up on YouTube now, showing off some of the more finalised features of Virgil-UI.

This shows three big features – CSV import, drag-and-drop reordering of classifications and multilingual support for editing. This means a classification with a multilingual component, for example a Canadian Industry Classification, could have the English and French components edited simultaneously.

As stated in the last post, a Windows binary release of a beta version of Virgil-UI and an updated version of the converter tool should be released in early September.

Simple steps to better public speaking

A short while ago, I gave one of the more interesting presentations I’ve ever given. I stood up in front of over 50 of my peers and spoke about an area I was passionate about, talked about how my organisation needed to change and, most importantly, about how people in the field had been doing things wrong for too long. It was tough and I was anxious and scared.

The thing is afterwards people told me how well I did and how much they enjoyed hearing me talk.

Now this is a surprise, as I have never considered myself a good public speaker, and leading up until today the thought of getting on stage was enough to send my heart racing. Granted, I have no problem getting up in front of people and making a fool of myself, but doing a structured talk with all eyes on me – yuck. So what I wanted to do was share a few of the tips I picked up over the last year that helped turn me from a nervous wreck in front of a crowd, into a confident looking nervous wreck.

So in accordance with the idea that you never really know anything well until you explain it in your own words, here are the top 5 tips I took away that helped me deliver a killer performance.

  • Open with a bang – This is a tip from The Naked Presenter, and is all about opening your talk in a way that catches people’s attention. Whether it’s a joke, an anecdote, an inspiring quote or a controversial sentence that catches people off guard, start off on a big, positive note. More than anything else, people will remember the start and the end of your talk so make sure there is something to remember!
  • Ditch the bullet points – One of the big points that Tufte makes is that bullet points are for you, the presenter, and no one else – they are your presenter notes. Feel free to consult them to help refresh your mind, but don’t make them visible. People can read faster than you can talk, so while you are still on the first point they have read to the end and are bored. Not only have you distracted them from your talking, but they are forming their own opinions about your notes – and they might not be coming to the ‘correct conclusions’. Use pictures that illustrate the ideas you are talking about, graphs, quotes (as long as you don’t read them verbatim) or even no slide at all, so there is nothing to distract them.
  • Leave the lectern – The lectern puts a physical and emotional barrier between you and the audience and it makes it harder to connect with them. Where possible, Reynolds suggests moving away from the lectern and getting up close to the audience. This drives home that you want to connect with the audience, which is something you should want. The more the audience feels connected to your talk, the more they take away. The big tip I can offer here is if you do rely on slides, which is often the case, get a presenter’s wand as it gives you the freedom to walk around the stage and frees you from the lectern. One of the big things I found was that it allowed me to gesture in time with the change of slides to emphasise the transition, rather than having to reach for a keyboard.
  • Make a handout – Another suggestion from Tufte is to replace the usual handout of printed slides with a well-written handout of no more than a double-sided A4 sheet. Firstly, this gives you more control over what people take away, as instead of having to remember how your bullet points worked together, they have a solid document with sentences that they can refer back to. In my own talk, I also used it to provide a practical example of what I was talking about for them to take away (homework if you will) and encouraged listeners to read ahead or catch up by consulting the handout. Lastly, it allowed me to include a detailed diagram that would have been too illegible on a slide, for listeners to closely examine at their own pace as I covered smaller parts of it.
  • Love your topic – This last point is one of my own design, but has been said by many people before. If you aren’t passionate about what you are talking about, perhaps you should rethink why you are presenting it. If you are, then this will show in your talk, and you will share that passion with the audience. One of the biggest compliments I routinely receive is from people who talk about how obvious it is how enthusiastic I am about what I do, and it is true. People love to hear from people who love their work, and will genuinely want to understand why you feel that way. Even to the point where, if nothing else, they will overlook all your flaws if they feel that they share that passion too.

The next step to becoming a better presenter can be summed up as: read these two books (Note: The links to these books use my Amazon affiliates code):

The Naked Presenter: Delivering Powerful Presentations With or Without Slides (Voices That Matter) by Garr Reynolds.

The Naked Presenter draws comparisons between public presenting and Japanese nude baths. Garr travelled to Japan prior to writing this book, and had to occasionally attend Japanese baths with colleagues. In the book he talks a lot about the social and physical barriers we put between ourselves and our audience that limit our ability to freely interact. He talks about removing lecterns, approaching the audience before, during and after a talk, and focuses on how to structure talks and the appropriate slides that go with them. People likened my talk to a TED talk, and whether or not that is an apt comparison, a lot of what they meant came down to me following the approach of this book, and actively trying to engage with the audience.

The Cognitive Style of PowerPoint: Pitching Out Corrupts Within, Second Edition by Edward Tufte

Edward Tufte is considered one of the masters in the field of information visualisation and a huge promoter of content and truth in graphics over pizazz. This essay comes out strongly against the traditional use of PowerPoint (and other slideware) as a tool for displaying bland bullet points that are often read verbatim by unprepared presenters. Tufte talks about the information presented in slides and how they are often used as standalone documents, rather than presentation aids. Instead Tufte promotes the use of short reports and handouts to augment talks, and limiting the use of slideware to situations where visual aids enhance a presentation rather than dominate it.

Not everyone may have time to read these books – although they would be of great benefit – but hopefully this abridged version, from someone who used to be uncomfortable in front of crowds, can help others to improve their public speaking skills too.

Updates to the Virgil CSV to DDI Converter

A short and sweet update:

There was an oversight in the CSV converter: coded values were not being written to the proper place in the generated DDI XML. This has been fixed, the changes have been pushed into SVN, and a new version (0.0.2b) of the executable has been released on Google Code.