multum im parvo

URL

http://djfroofy.blogspot.com/


November 5, 2008

10:50

I always thought I understood `finally' in the context of exception handling in both Java and Python. Now I'm not sure I really do. Riddle: what does the following print?




def f():
    try:
        return 1
    finally:
        return 2

print(f())


Ah ... never mind ... I understand it again now. It means: finally.
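For anyone who wants to check the answer, here is a minimal sketch in Python 3 syntax. The second function, g(), is a hypothetical addition and not part of the riddle. A return inside `finally' runs last and replaces whatever the try block was about to return, and it also discards an exception that was in flight.

```python
def f():
    try:
        return 1      # the value 1 is computed first...
    finally:
        return 2      # ...but the finally clause runs last and its return wins


def g():
    # Hypothetical companion to f(): the exception raised in the try block
    # is discarded because the finally clause returns.
    try:
        raise ValueError("never seen by the caller")
    finally:
        return "swallowed"


print(f())
print(g())
```

So f() yields 2 rather than 1, and g() returns normally instead of raising.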

Categories: Atlanta Tech

October 28, 2008

09:21

My dreaded multiple inheritance article was published in Python Magazine yesterday. It was tricky enough giving a presentation on the subject, so I look at the article as something of an achievement. Someone contacted me shortly afterwards about my use of a class variable in one of the examples:




class Publishable(object):
    published = False

    def __init__(self, start_date, end_date):
        self.start_date = start_date
        self.end_date = end_date

    def publish(self):
        self.published = True


The inquirer's question:


I've always been under the impression having class & instance
variables with the same name is just confusing and not a good way to
write Python. Why did you do this?



Anyhow, on the subject of "class variables," here are my thoughts:



[Begin email response]




I guess it depends on who you ask. The pattern of using a class-level
variable as a default value for instances is one I like and don't
find confusing. But these kinds of things are always subjective. It's
not my own pattern either, but one I've noticed in other projects.



http://twistedmatrix.com/trac/browser/tags/releases/twisted-8.1.0/twisted/internet/defer.py#L137
http://trac.pythonpaste.org/pythonpaste/browser/Paste/WebOb/trunk/webob/__init__.py#L474


I make a distinction, though, between a "class variable" and a
variable defined at the class level. In terms of the VM, they are of
course the same thing, but they are differentiated in human terms by
their use cases. The former, a "class variable", is useful if you are
tracking state on a class - for example, keeping a count of objects
created through a class method:




class Foo:
    _created = 0

    @classmethod
    def create(cls):
        foo = Foo()
        cls._created += 1
        return foo



The latter "class level variable" can be useful for defining default
values for attributes (especially ones which cannot be specified in
the constructor):




import time

class Task:
    completed = False

    def __init__(self):
        self.start_time = time.time()



(This is the same pattern I'm adopting in my article.) Of course, when
the "completed" attribute is looked up on an instance of Task, it has
to take an extra step - first looking in the instance's dict and then
in the class. However, this also means a decent space optimization if
your ratio of incomplete tasks to complete tasks is very high - once a
task is marked as completed it is likely to get discarded and hence
garbage collected.
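A minimal sketch of that lookup, in Python 3 syntax and assuming a hypothetical complete() method that is not part of the Task class above. The instance's dict gains a "completed" entry only once the task is marked complete, and the class-level default stays untouched.

```python
import time


class Task:
    completed = False  # class-level default, stored once on the class

    def __init__(self):
        self.start_time = time.time()

    def complete(self):
        # Hypothetical method: assignment creates an instance attribute
        # that shadows the class-level default.
        self.completed = True


task = Task()
before = 'completed' in vars(task)   # not in the instance dict yet
value_before = task.completed        # found on the class instead

task.complete()
after = 'completed' in vars(task)    # now stored on the instance

print(before, value_before, after, Task.completed)
```

Until complete() is called, every incomplete task shares the single False stored on the class.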




I think it's only confusing (again, this is very subjective) if you
try to mix the two use cases in one class:




class Confusing:
    completed = 0

    def __init__(self):
        self.completed = False

    @classmethod
    def complete(cls, confusingInstance):
        cls.completed += 1
        confusingInstance.completed = True


Thanks for the feedback.



[End email response]

Categories: Atlanta Tech

October 9, 2008

08:38

So, briefly on the topic of lexical scope, what does the following print?


def f():
    funcs = []
    for i in range(5):
        def g():
            print(i)
        funcs.append(g)
    return funcs

funcs = f()
for func in funcs:
    func()
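Spoiler, as a minimal sketch in Python 3 syntax (make_late and make_early are hypothetical names, and the functions return i instead of printing it so the results can be compared). Each g looks up i in the enclosing scope when it is called, not when it is defined, so every g sees the value the loop left behind. Binding i as a default argument is one common way to capture the value at definition time.

```python
def make_late():
    funcs = []
    for i in range(5):
        def g():
            return i      # i is looked up when g is called, after the loop ended
        funcs.append(g)
    return funcs


def make_early():
    funcs = []
    for i in range(5):
        def g(i=i):       # the default value is evaluated now, once per iteration
            return i
        funcs.append(g)
    return funcs


late = [g() for g in make_late()]
early = [g() for g in make_early()]
print(late)    # [4, 4, 4, 4, 4]
print(early)   # [0, 1, 2, 3, 4]
```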

Categories: Atlanta Tech

October 8, 2008

09:00

Problem: You have a queue of tasks to manage but only n should run at a time. An example of this is a CPU-intensive task where running more than a fixed number of tasks at once decreases your overall throughput due to context switching. Sometimes the precise number is simplified to the number of CPUs you have to work with. (Insert actual numbers here ;)



A simple queuing mechanism is perhaps the most obvious solution to the above problem. I wrote one Task Queue implementation which was pretty terrible and required you to "pump" initial events - but only a certain number - to "start" the queue. I won't post the code for that here. However, I revisited the problem yesterday and came up with some simple code which seems to do the job nicely. (Note, this is for an application using Twisted, hence "Deferred Task Queue". In a producer/consumer thread-based model where the consumers are threads in a fixed-size thread pool, the problem is already solved by just using Python's Queue and letting the consumers pull jobs off the queue - i.e. the size of the thread pool dictates how many jobs can run concurrently.)
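For comparison, a minimal sketch of that thread-based model in Python 3, where the module is spelled queue rather than Queue. The run_jobs helper and its parameters are hypothetical names, not taken from any real application; the only point is that the number of worker threads caps how many jobs run at once.

```python
import queue
import threading


def run_jobs(jobs, workers=3):
    """Run zero-argument callables on `workers` threads; results keep job order."""
    pending = queue.Queue()
    results = [None] * len(jobs)
    for index, job in enumerate(jobs):
        pending.put((index, job))

    def consume():
        # Each consumer pulls jobs until the queue is empty, so at most
        # `workers` jobs are ever running at the same time.
        while True:
            try:
                index, job = pending.get_nowait()
            except queue.Empty:
                return
            results[index] = job()

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


squares = run_jobs([lambda n=n: n * n for n in range(5)], workers=2)
print(squares)   # [0, 1, 4, 9, 16]
```

The Twisted case needs a different approach, which is where the Deferred-based queue comes in.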




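from twisted.internet import defer
from twisted.python import failure


class TaskQueue:

    # cpuCount() is not shown in this post; it is assumed to be a helper
    # defined elsewhere that returns the number of CPUs.
    def __init__(self, concurrentMax=cpuCount()):
        self.concurrentMax = concurrentMax
        self._running = 0
        self._queued = []

    def push(self, f, *args, **kwargs):
        # Run immediately if a slot is free; otherwise park the call and
        # hand back a placeholder Deferred.
        if self._running < self.concurrentMax:
            self._running += 1
            return f(*args, **kwargs).addBoth(self._try_queued)
        d = defer.Deferred()
        self._queued.append((f, args, kwargs, d))
        return d

    def _try_queued(self, r):
        # Called when a task finishes (success or failure): free its slot
        # and start the next queued call, if any.
        self._running -= 1
        if self._running < self.concurrentMax and self._queued:
            f, args, kwargs, d = self._queued.pop(0)
            self._running += 1
            actuald = f(*args, **kwargs).addBoth(self._try_queued)
            actuald.chainDeferred(d)
        if isinstance(r, failure.Failure):
            r.trap()
        return r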



Note that the above implementation is missing a notion of "capacity" - which might be important for a more general solution. My application actually handles capacity external to the queue, but there might be some benefit in internalizing the concept and raising exceptions on push() when capacity is exceeded. I'm still undecided on this.
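To make the capacity idea concrete without committing to a design, here is a rough sketch that swaps Twisted for asyncio from the Python 3 standard library, purely so that it runs on its own. It is not the TaskQueue above: BoundedTaskQueue, QueueFull and the capacity parameter are all hypothetical names, and cancellation is ignored. push() raises as soon as more than `capacity` calls are already waiting.

```python
import asyncio
import collections


class QueueFull(Exception):
    """Hypothetical error raised when too many calls are already waiting."""


class BoundedTaskQueue:
    def __init__(self, concurrent_max, capacity):
        self.concurrent_max = concurrent_max
        self.capacity = capacity              # max number of calls allowed to wait
        self._running = 0
        self._waiting = collections.deque()

    async def push(self, f, *args, **kwargs):
        if self._running < self.concurrent_max:
            self._running += 1                # a slot is free: take it
        elif len(self._waiting) >= self.capacity:
            raise QueueFull("%d calls already waiting" % len(self._waiting))
        else:
            slot = asyncio.get_running_loop().create_future()
            self._waiting.append(slot)
            await slot                        # resumed when a finishing call hands over its slot
        try:
            return await f(*args, **kwargs)
        finally:
            if self._waiting:
                self._waiting.popleft().set_result(None)   # hand the slot over
            else:
                self._running -= 1            # nobody waiting: free the slot


async def demo():
    async def job(n):
        await asyncio.sleep(0.01)
        return n * 10

    q = BoundedTaskQueue(concurrent_max=2, capacity=1)
    return await asyncio.gather(
        *(q.push(job, n) for n in range(1, 5)), return_exceptions=True)


results = asyncio.run(demo())
print(results)
```

In the demo, two calls run at once, a third waits, and the fourth is rejected with QueueFull. Whether rejecting is friendlier than queuing without bound is exactly the part I'm undecided on.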




The interface is pretty straightforward. You have a function f (and its arguments) that returns a Deferred and that you want to call (eventually). For example, doSomeStuff() below simply returns a Deferred object that will fire after 2 seconds have elapsed:


import time
from twisted.internet import defer, reactor

def doSomeStuff(a, b=None):
    print('doSomeStuff(%s, %s) called: %f' % (a, b, time.time()))

    def finishUp():
        print('doSomeStuff(%s, %s) finished: %f' % (a, b, time.time()))
        d.callback('done %d %d' % (a, b))

    d = defer.Deferred()
    reactor.callLater(2.0, finishUp)
    return d


Let's queue up some calls to doSomeStuff()


taskq = TaskQueue(2)  # two at a time, which is what the output below shows
taskq.push(doSomeStuff, 1, b=2)
taskq.push(doSomeStuff, 2, b=3)
taskq.push(doSomeStuff, 3, b=4)
taskq.push(doSomeStuff, 4, b=5)
taskq.push(doSomeStuff, 5, b=6)


The output of the above would be something like this:


doSomeStuff(1, 2) called: 1223472790.943929
doSomeStuff(2, 3) called: 1223472790.944112
doSomeStuff(1, 2) finished: 1223472792.947769
doSomeStuff(3, 4) called: 1223472792.947887
doSomeStuff(2, 3) finished: 1223472792.948004
doSomeStuff(4, 5) called: 1223472792.948162
doSomeStuff(3, 4) finished: 1223472794.951818
doSomeStuff(5, 6) called: 1223472794.951937
doSomeStuff(4, 5) finished: 1223472794.952080
doSomeStuff(5, 6) finished: 1223472796.955836



As you can see, as soon as one of the called functions completes its job, the next one queued is called. There is no need to tell the queue to start doing its work - just push jobs onto the queue.




The technique that makes this work is simply exploiting the elegance of Deferreds by "sneaking" in a check for pushed jobs that have been queued via Deferred's addBoth() method. For those of you unfamiliar with how a Deferred works in Twisted, you should read the deferred section of Twisted's asynchronous programming guide and then this document on Deferreds ... and maybe this one too.

What makes this useful is that I can treat a call to push() as if it were simply a call to the function being queued - no other messy API to tell the queue which callbacks or errbacks need to be invoked when the function is finally called. The internal function _try_queued() acts as a transparent pass-through gateway so the caller to push() doesn't need to worry about adding a funky callback to translate some wrapped value or otherwise - again I'm eschewing unnecessary API details. So for example:




# this
taskq.push(doSomeStuff, 1, b=2).addCallback(cb).addErrback(eb)

# is the same as
doSomeStuff(1, b=2).addCallback(cb).addErrback(eb)

# ... just with queuing behavior under the hood


In closing, in the context of asynchronous programming or otherwise, I've begun to strongly believe "the best API is no API".

Categories: Atlanta Tech

August 21, 2008

16:38

I thought it would be amusing to do a redacted blogpost. After all, the assholes with power do this stuff all the time and for things that really matter - life or death matters ... seriously. The stupid blog post is the exact antithesis - things of relatively no matter whatsoever. Were it to provoke one, such a reaction would be quite absurd. So here it is - a repost of an internal blogpost from work replete with inked out sections (or "so much whiteout") like xxxx xxxx. I hope the all-seeing corporate eye of Sauron looming down on the piteous ant I am - above my head, my cube our ceiling and piercing through the clouded firmament shading all humble living beings on this green earth - looks kindly upon my ramblings here.



[Begin repost]






xxxxx xx xxxxxxxxxx xxx xxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxx, here's a little splash on the fire from my whiskey flask:



http://antoniocangiano.com/2008/03/04/rails-is-the-best-thing-that-ever-happened-to-python/

... Some great humor relating to differing attitudes towards marketing (that's the `sexy', in case you're wondering) between the Ruby and Python communities. My favorite quote from the above:




If Twisted Matrix was implemented in Ruby it would be advertised as the second coming ...



Anyhow, read on, because he's not really bashing Rails:




So what does this mean for me personally? I'll use them both, as I'm a firm believer in using the right tool for the right job.


The last point is the single most important: Use the Right Tool for the Right Job. For the last few years (let's call them the Dark Ages now) in Enterprisey-Land, the motto has been One F***ing Tool for Every F***ing Job. There are also variations on the motto that end in "... Or Else!" x xxxx xxxx xxxxxx xxxxx xxxxxx xxxxx xx xxxxxx xx xxxxx xxxxxxxx xx xxxxxx xxx xxxx xxxx xxxxxx xxxx xxxxxx xxx xxxx x xxxxxx xxxxxxxx xxxx xxxxxxxxxxxxxx xxxx xx xxxxx xx xx xxxxxxx xxxxx xxxxx xx xxxxx xxxxxxxxx xx xxxxx xxxxxxx xxxx xx x xxxxxxxx xx xx xxx xxxxxxxxxxxxx xxxxxx xxxxx xxxx x xxxxxxxxxx xxxxxxxxxx xxxx xxxx xxxx xxxx xxx x xxx xx xxxxx xxx xxxxx xxx x xxxxxxx xxxxxxx xx xxxxxxx xx xx xxxx xx xx xxx xxxxxxxxx xx xxx xxxxx xx xxxxxxx xxx xxx xxx xxx xxx [1]x

Of course I don't have the statistics to back this (who does?), but in my perception (which is arguably valid considering I work day and night as a Java developer and mingle with other prisoners of the `Enterprise'), homebrew Java projects outside the scope of well-understood [2] frameworks (Spring MVC, Webwork) almost always necessitate a great many design decisions to be made up front. To quote Glyph Lefkowitz xxxxx xxx x xxxxxxx in a recent discussion on twisted python:


A design discussion is an unverified hypothesis. There's no point in developing it into a theory until you have some further indication that it might be implemented.


The Spring team realized this and began a successful political campaign [3] against all things J2EE, and focused on implementations, instead of bullsh**, I-Smell-Fear, specifications. (Notice, how politically correct I am in censoring my own writings. I wish I could put asterisks in my speech as well.)

Design decisions are made up front with most Java projects. However, at a higher level, that's a matter of individual programmer/architect or wider spanning community philosophy. The same philosophy could be adopted by Pythonistas, Rubyists or Lispers (and it unfortunately is by some people who haven't read this), and you would end up with similar issues. Note that I use the word `similar', not `same', because the issue is compounded in Java by the instilling of the up-front design into the lower-level implementation. The instilling process isn't optional - it's demanded by static typing. Static typing is not (always) a bad thing, and the rigid stance taken by static type puritans who love languages like Java and Haskell makes some sense when considering the legacy of weakly typed languages like C. But that's a whole other story. Briefly, let's remind ourselves that Python and Ruby are strongly (not weakly) typed, but also dynamically typed, which essentially enables you to punch an API in the face (with 2-5 lines of readable code) if you don't like it, or even modify its ill behavior at runtime [4] if you can't convince the stubborn maintainer to fix bugs or funny smells.
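To show how few lines the face-punching takes, here is a minimal sketch in Python 3 syntax. Greeter stands in for a class from someone else's library, and every name in it is hypothetical. Rebinding the method on the class changes the behavior of instances that already exist, and keeping a reference to the original makes the patch reversible.

```python
class Greeter:
    """Stand-in for a class from someone else's library (hypothetical)."""

    def greet(self, name):
        return "hello " + name     # pretend this is the behavior we dislike


def polite_greet(self, name):
    return "Good day, " + name + "."


greeter = Greeter()                # created before the patch
original_greet = Greeter.greet     # keep a reference so the patch can be undone

Greeter.greet = polite_greet       # the monkey patch: rebind the method on the class
patched = greeter.greet("Glyph")   # existing instances see the new behavior too

Greeter.greet = original_greet     # undo the patch
restored = greeter.greet("Glyph")

print(patched)
print(restored)
```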

Now, one meme that remains in the sinking ship of Enterprise Snake-Oil [5] is that all the conveniences provided by dynamic languages are a wash because the end result is a bundle of cute little scripts [6] which suffer in performance and are hacked out and difficult to maintain. Let me go ahead and call bullsh** on that. x xxxx xxxxx xx xxxxxx xxxxxxxxxx xxxxxxxxxx xxxxxxx xxx xxxxxxxxxxxxx xx xxx xxxxxxx xxxxxxxxx xxx xxxxx xxx xxxxxx xx xxxxxx xxxx xxxxxxxxxxxxxxx xxxxx xx xxxxxxx xxxxxxxxx xx xxxxxxxx xxxxx xx xxxxxxxxxxxx xxxxxxxxxxx xx xxx xxxxxxxx xxxxxxxxxxx xxxxxxxx xxxx xx xxxxxxxx Java combined with our fancy-schmancy IDEs and all their attempts to slap developers on the wrist while they pound away on source code have done very little to keep us from sinning. The only solutions to the problems that arise in bad Java programming are enforcing design patterns we're all slow to adopt: beware of the singleton, write components instead of final service classes with static methods (so you can actually write unit tests around things), use Spring (or replace Java with XML), etc. In sum, understand all the esoteric details of the Java programming language (even the transient modifier, if serialization is important to you) while laying out classes and deciding on method signatures - again, these will be irreversible decisions that will lead to hate from your fellow developers who can't pragmatically work around your anti-patterns to facilitate better unit-testing, adaptive integration points, proper serialization, and other goals easily attained with dynamic languages.

Another meme is that Java (or more generally, static typing) facilitates quality (and security) through type safety. On the security front, this is true in some extreme cases - though it's a problem easily overcome in distributed computing by people with smarts. Quality ensured through compilation is of course a joke - compilers don't know the requirements, they only speak 1s and 0s. Ultimately, the larger task is preventing regressions, which can be done only with unit tests. And there are plenty of Python projects that get this very right. Take the unit test suite for the Twisted project as an example:



Ran 4113 tests in 172.023s

PASSED (skips=62, expectedFailures=19, successes=4032)


Yeah, just a few tests there. And yes, that's 172 seconds (not minutes). I guess there are some equivalents in open source Ruby projects. Ruby people?



Ok, so my little blog entry is supposed to be about how the next alternatives to Java are apparently Ruby and Python and my own personal fear is that Ruby will be just the next Java - and people will claim I'm not sexy enough with my old bag-o-tricks in Python [7] .. blah, blah, even though I know in my heart of hearts that "Python is the only acceptable implementation of Ruby". Really, it takes more energy than I'm willing to expend to get into these silly arguments. xx xx xxxx xxxxxxx xx xxx xxxxxxxxxxxxx xxx xxxx xxxxxxxx xxx xxx xxx xx x xxxx x xxxxx x xxx xxxxxxxxxx xxxx xxxxxxxx xxxx xxxxxxxxxx xxxxxxxxx xxxxxxx xxx xxxxxxxxxx xxxx x xxx xxxxxxxxxx xxxx xxxx xxxxx xxx xxxx xxxx xx xxxxxx xxxxxxxxxx xx xxxxxxxxx xxx xx xx xxxxxx xxx xxx xxxx xxxx xxxx xxxx xxx xxxxxxxxx xxx xxx xxx xxxxxxxxxx xx xxxxx xxxxxxxxx xxxxxxxx xxx xxxxxxx xxx xxxxxxxxxxxx xx xxxxx xxxxx xx xxxxxx xxxx xxxx xxxx xxx x xxxxxxx xxxxxxxxxxxxxxx xxxxxxxxxxx xxx xxxxx xxxxxxxx xxxx at least one painful piledriver.





[1] EST is an acronym for Enterprise Service Toilet.

[2] Well, let's just wave our hands and pretend like they're really well understood.

[3] There were some problems here as well. It appeared the platform they were running on was built of some space-age cardboard, but on closer inspection, it ended up being highly compressed XML - about 500K lines of it. Oh ... And they basically reimplemented J2EE all over again - since that stuff is so tasty, right? The best approach, in my jaded world-view, would have been to slit the Hegemon's throat, instead of knocking its head with a bat and then proceeding to build a neck brace for it.


[4] In Python, this is called monkey-patching (which involves trivial reassignment of unbound methods, functions in a module's namespace, etc), and Ruby facilitates `stepping into' classes. Java offers some incomprehensible bullsh** on this front, which will only necessitate 3 months of training before any developer can partially utilize it - of course it will likely continue to feel like acupuncture in your eyeballs when you revisit code leveraging such awe-inspiring byte-code futzing libraries.

[5] Ahh, rhetoric ... gotta love it.


[6] Yes, you can write applications with modules and namespaces. Isn't that nifty?

[7] This is yet another meme affecting at least part of the Ruby community - Python is some old crusty crap no-one hacks with except dorky research scientists who wear their pants up too high. This is of course nonsense - and to paraphrase Zed Shaw: "Ruby will not lend you more appeal to the ladies." Of course, Zed used his own branded terminology to convey the equivalent concept.

Categories: Atlanta Tech