Opster

Two months ago I released a little command line parsing library for Python called opster (actually it was called finaloption back then, but I renamed it after a remark from a native speaker ;-)). Why write one more command line parser when Python already has getopt and optparse in the standard library, and argparse and optfunc were released not so long ago?

Well… as usual, because I think they are going the wrong way and doing the wrong things. Of course, IMO (but what matters if not opinion? :P).

It all started when Zed Shaw wrote a big article on Python warts and mentioned that Lamson has a much better command line parsing solution than argparse/optparse. I was a little interested in this topic at the time, so I looked at the code. It would be a lie to say that I liked it. In fact, I thought it was heresy on the same level as optparse. ;-)

I wrote on Twitter that it’s funny to say that Lamson has a superior command system, and got some amount of sarcasm from Zed and a clear understanding that Zed sees nothing wrong when:

  • command functions have to be defined in a single module
  • default settings are defined by calling a separate function inside a command function
  • a mistyped option on the command line doesn’t raise an error
  • help text for options is formatted by hand

So I thought that the world needs Mercurial’s command system. ;) And I rewrote it as a library, keeping the main idea.

A small usage example:

from opster import command

@command(usage='[-l HOST] DIR')
def main(dirname,
         listen=('l', 'localhost', 'ip to listen on'),
         port=('p', 8000, 'port to listen on'),
         daemonize=('d', False, 'daemonize process'),
         pid_file=('', '', 'name of file to write process ID to')):
    '''Command with option declaration as keyword arguments

    Otherwise it's the same as previous command
    '''
    print(locals())

if __name__ == '__main__':
    main()

I think it’s easy to see what’s going on here. Every option is required to have a long name (the keyword argument name), possibly a short name (using '' means no short name), a default value and a help string. The default value determines what is done with incoming data:

  • string: nothing happens, the incoming value stays a string
  • int: the incoming value is parsed by calling int()
  • list: the incoming value is appended to the list
  • function: it is called with the incoming value and its output is used
  • True/False/None: the option takes no argument and just switches the default value to its opposite
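
These rules can be sketched in a few lines. This is only an illustration and not opster’s real code: the helper name apply_option and its signature are made up for this example.

```python
def apply_option(default, current, incoming=None):
    """Return the new value of an option, chosen by the type of its default.

    Hypothetical helper for illustration only, not opster's actual code.
    """
    # True/False/None: a flag without an argument, switches the default
    if default is None or isinstance(default, bool):
        return not default
    # function: call it with the raw string and use its output
    if callable(default):
        return default(incoming)
    # int: parse the string (checked after bool, since bool subclasses int)
    if isinstance(default, int):
        return int(incoming)
    # list: collect every occurrence of the option
    if isinstance(default, list):
        return current + [incoming]
    # string: keep the incoming value as is
    return incoming


print(apply_option(8000, 8000, '9000'))  # 9000
```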

After wrapping with @command, your function main() can be called:

  • without any arguments at all; in this case it parses sys.argv
  • with an argument named argv, which has to be a list of strings (same as sys.argv[1:])
  • with the usual arguments/keyword arguments defined in the function

It may not be obvious, but you get clean values in your function (for example, port will contain the value 8000), not some strange three-tuples.
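
To show where the clean values could come from, here is a rough sketch that reads the three-tuples out of a function signature. It assumes Python 3 (for inspect.signature), and split_options is a made-up name; opster itself may do this differently.

```python
import inspect


def split_options(func):
    """Split keyword defaults into option specs and clean default values.

    Hypothetical helper, a sketch of the idea rather than opster's code.
    """
    specs, clean = [], {}
    for name, param in inspect.signature(func).parameters.items():
        if param.default is inspect.Parameter.empty:
            continue  # positional argument such as dirname, not an option
        short, default, helpmsg = param.default
        # four-tuple: short name, long name, default value, help string
        specs.append((short, name, default, helpmsg))
        clean[name] = default
    return specs, clean


def main(dirname,
         listen=('l', 'localhost', 'ip to listen on'),
         port=('p', 8000, 'port to listen on')):
    pass


specs, clean = split_options(main)
print(clean['port'])  # 8000
```

A decorator built on this idea would call the wrapped function with the values from clean (updated by whatever was parsed from the command line), so the function body never sees the tuples.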

And you get help like this for free:

piranha@gto ~/dev/misc/opster>./test_opts.py --help
test_opts.py [-l HOST] DIR

Command with option declaration as keyword arguments

    Otherwise it's the same as previous command

options:

 -l --listen     ip to listen on (default: localhost)
 -p --port       port to listen on (default: 8000)
 -d --daemonize  daemonize process
    --pid-file   name of file to write process ID to
 -h --help       show help

Furthermore, underscores in argument names are converted to hyphens to follow command line conventions. ;)

I should mention that option names (and subcommand names, if you’re using them) can be shortened: e.g. you can say --pi instead of --pid-file.
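
Shortening can be thought of as unique prefix matching. The sketch below is my illustration of that idea, with a made-up function name (expand); it is not taken from opster’s source.

```python
def expand(prefix, names):
    """Expand an abbreviated option or subcommand name to its full form.

    Hypothetical helper, a sketch of the idea rather than opster's code.
    """
    if prefix in names:
        return prefix  # an exact match always wins
    matches = [n for n in names if n.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError('unknown name: %s' % prefix)
    raise ValueError('ambiguous name %s: %s' % (prefix, ', '.join(matches)))


# keyword argument names become long option names with hyphens
arguments = ['listen', 'port', 'daemonize', 'pid_file']
longnames = [a.replace('_', '-') for a in arguments] + ['help']
print(expand('pi', longnames))  # pid-file
```

Note that a bare p would be rejected here as ambiguous, because it matches both port and pid-file.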

If I were to compare opster with something, it would be optfunc by Simon Willison. The most noticeable differences are the syntax of command definitions and subcommand support. Actually, optfunc has subcommand support, but it’s pretty incomplete.

Opster uses getopt internally to parse options, and that’s the reason why it’s somewhat bigger than optfunc (which is essentially an optparse wrapper). Opster’s internal API, where options are a list of four-tuples (short name, long name, default value, help string), is exactly the same as Mercurial’s API for defining options. This means that code like this will work (taken from test_cmd.py):

import opster

config_opts = [('c', 'config', 'webshops.ini', 'config file to use')]

@opster.command(options=config_opts)
def initdb(config):
    """Initialize database"""
    pass

@opster.command(options=config_opts)
def runserver(listen=('l', 'localhost', 'ip to listen on'),
              port=('p', 5000, 'port to listen on'),
              **opts):
    """Run development server"""
    print(locals())

if __name__ == '__main__':
    opster.dispatch()

An interesting thing happens in the definition of runserver, whose help and output look like this:

piranha@gto ~/dev/misc/opster> ./test_cmd.py help runs
test_cmd.py runserver [OPTIONS]

Run development server

options:

 -l --listen  ip to listen on (default: localhost)
 -p --port    port to listen on (default: 5000)
 -c --config  config file to use (default: webshops.ini)
 -h --help    display help

piranha@gto ~/dev/misc/opster> ./test_cmd.py runs
{'port': 5000, 'opts': {'config': 'webshops.ini'}, 'listen': 'localhost'}

You can factor out common options and pass them to the @command decorator, keeping your pants DRY. ;-)
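
For the curious, here is roughly how a list of such four-tuples can be fed to getopt from the standard library. It is a simplified sketch under my own assumptions (only string, int and flag options are handled, and parse_four_tuples is a made-up name), not a copy of opster’s internals.

```python
import getopt


def parse_four_tuples(options, argv):
    """Parse argv using (short, long, default, help) four-tuples.

    Hypothetical helper, a simplified sketch rather than opster's code.
    """
    shortspec, longspec = '', []
    flags = {}     # maps '-p' and '--port' to the long name
    defaults = {}
    for short, name, default, _help in options:
        defaults[name] = default
        # True/False/None defaults mean the option takes no argument
        takes_arg = not (default is None or isinstance(default, bool))
        if short:
            shortspec += short + (':' if takes_arg else '')
            flags['-' + short] = name
        longspec.append(name + ('=' if takes_arg else ''))
        flags['--' + name] = name

    parsed, args = getopt.gnu_getopt(argv, shortspec, longspec)

    values = dict(defaults)
    for flag, value in parsed:
        name = flags[flag]
        default = defaults[name]
        if default is None or isinstance(default, bool):
            values[name] = not default
        elif isinstance(default, int):
            values[name] = int(value)
        else:
            values[name] = value
    return values, args


options = [('l', 'listen', 'localhost', 'ip to listen on'),
           ('p', 'port', 8000, 'port to listen on'),
           ('d', 'daemonize', False, 'daemonize process')]
values, args = parse_four_tuples(options,
                                 ['-p', '9000', '--daemonize', 'somedir'])
print(values, args)
```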

So… Read the documentation and use it! :) Any feedback, questions, suggestions and patches are highly welcome. ;-)

Comments

Real cool. Good point you mention DRY because one usually has to duplicate params name in the called function and in the optparse/getopt code. Hopefully this will go in stdlib one day. Handling subcommands is also handy.

Benjamin Sergeant , 15:17

Thanks for the kind words, I really appreciate this. :-)

Alexander Solovyov , 16:03

Looks nice! However, I’d like to see support for variable numbers of arguments — perhaps I missed it in the documentation. Say I have a program which takes some options, some fixed arguments (like dirname in your first example) and then some additional arguments which are variable in number (just for example, a list of integers to sum). What I’d like to do is to collect all the additional positional arguments into the *args tuple in the decorated function.

Vinay Sajip , 19:21

Unfortunately, there is a Python restriction on this, so you can’t define options as keyword arguments and use *args at the same time, but this code will work:

from opster import command

opts = [('l', 'listen', 'localhost', 'ip to listen on'),
        ('p', 'port', 8000, 'port to listen on')]

@command(opts, usage='[-l HOST] DIR')
def main(*dirs, **opts):
    print(locals())

if __name__ == '__main__':
    main()

Alexander Solovyov , 16:43 (after 1 day)

Thank you, that makes sense. You’ve done a nice job of wrapping getopt functionality in a declarative interface. Have you come across argparse? It started as an evolution of optparse and while it’s no longer compatible with optparse, it has added some useful functionality and is being talked about (on stdlib-sig) as a candidate for inclusion in the stdlib. I think your basic scheme, if applied to wrapping argparse rather than getopt, would be terrific. What would be your thoughts about that?

Vinay Sajip , 20:16 (after 1 day)

Wrapping argparse will probably take less code than wrapping getopt, but argparse is not in the standard library (yet :)… Though I think I can talk with the argparse author about contributing such an interface to argparse itself.

The only things that bother me are that argparse itself is much bigger than getopt + opster (82kb instead of 23kb), plus it’s distributed under the Apache License (as I understand, this is an issue for projects under the BSD License, where I’d like to use opster).

Alexander Solovyov , 20:30 (after 1 day)

It would be great if something could be worked out! I did mention the size discrepancy to Steven Bethard, the author of argparse. Given the small absolute size of these libraries I’m not sure it’s an issue.

Vinay Sajip , 20:43 (after 1 day)

Eh, he doesn’t seem to be interested… though I think I’ll try to get such an interface working just for fun next week (unfortunately I have no time before that). Or maybe tomorrow. ;-)

Alexander Solovyov , 20:54 (after 1 day)

You. Are. My. Hero.

Kenneth Reitz , 22:18

this rocks! nice work.

John , 23:50

Great job! This library is just what I have been waiting for. This library accomplishes exactly what I need with the lowest overhead of any I’ve used. The resulting code is readable and clean. The built in help is effective. Nice use of decorators.

I would also love to see this library make it into stdlib…

(Edit: thanks for the correction about annotations vs decorators. You can tell my day job involves java. ;)

bw , 01:20 (after 1 day)

bw, by annotations I think you mean decorators ;)

Cool library. I like how it’s one module.

Bulkan Evcimen , 03:26 (after 1 day)

“I would also love to see this library make it into stdlib…”

Eh, I think the chances are not that big… At least it needs to become widespread first. :-)

Alexander Solovyov , 16:48 (after 1 day)

Learn Albanian :)

http://zabivator.livejournal.com/345174.html?thread=6725974#t6725974 (a small tutorial on English article usage, in Russian)

Fractal , 14:52 (after 1 day)

Alexander, another question. You use the type of the default value to determine how to handle the argument passed at run-time. How do you handle the case where the mapping between type of default value and run-time value is not a straightforward one?

For example, I might have a --log argument which should default to sys.stderr, but if a string value 'fn' is passed at runtime, I want to use it to call e.g. open('fn', 'w') and make that the value of the argument.

If I make a function

import sys

def log_maker(s):
    if s is None:
        return sys.stderr
    else:
        return open(s, 'w')

and specify ('l', 'log', log_maker, 'File to log to') in the options tuple, then if I don’t specify a --log argument at runtime, my function is never called. Of course if I do specify an argument then the function is called and I get the right value — but how can I get the default value in this case?

Vinay Sajip , 18:17 (after 2 days)

That’s a valid concern… This function should be called every time parsing is done, but then how should it be called? With None or with no arguments at all? Hm, probably None is better.

Alexander Solovyov , 06:17 (after 3 days)

I think it has to be None — the function will need a signature of one argument anyway to deal with the argument value which is passed. It’s worth elaborating this point in the documentation, too :-)

Vinay Sajip , 09:15 (after 3 days)

This module is a great idea — every time I use optparse or the other modules I have to look up the syntax. I have made a couple of tweaks which you might be interested in:

  1. The ability to call from Python without specifying all the arguments and let the parser fill in the defaults as usual.
  2. Something related to the above which makes it possible to use any callable rather than just a function as the default (can be used to implement repeated arguments with type conversions added to a list etc)
  3. The ability to have variable numbers of arguments passed into the wrapped ‘main’ function when a magic ‘args’ argument is given as the first argument. Similar to the *args question above.

These are all very minor tweaks, I’m not sure how to post them but will if you are interested.

John Porter , 11:10 (after 19 days)

Sorry for the late answer (I was offline for some time), I’m definitely interested. If you know how to use Mercurial, use the patchbomb extension to send them to me (piranha AT piranha.org.ua), or just post a diff somewhere (any pastebin).

Alexander Solovyov , 19:48 (after 23 days)

I put my patched version here — http://pastebin.com/m4536b6fc. The changes should be fairly obvious.

John Porter , 15:36 (after 26 days)

I’ve been a big fan of cmdln for a while:

http://code.google.com/p/cmdln/

I use it instead of optparse in all cases. The idea is that you write a class that inherits from cmdln.Cmdln, and this class can be decorated with @cmdln.option decorators that determine what is passed into them.

One downside of cmdln is that it “only” supports the multi-command (svn/hg) style, but I find that any command-line tool I write ends up becoming rich enough to need this style anyway.

See this guide:

http://code.google.com/p/cmdln/wiki/GettingStarted

It would be interesting to see a comparison between Opster and cmdln.

Andrew Montalenti , 19:30 (after 5 days)
