Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Python iteration (nedbatchelder.com)
113 points by joshbaptiste on April 25, 2012 | hide | past | favorite | 26 comments


while > for > foreach - increasingly higher level abstractions.

article stops at iteration (foreach) but misses an even higher level abstraction: filter, map, reduce. each of these can be implemented on top of foreach, and i haven't yet seen a loop that can't be decomposed into a combination of map, filter, reduce. also note that filter/map/reduce are expressions, not statements, so they are more "mathy" and thus easier to reason about.

its interesting that these are not necessarily idiomatic python despite being a slightly higher level abstraction. note they are idiomatic in all "real" functional languages, which python is decidedly not. also list comprehensions are faster than map/filter in python. (this is probably python's fault and is really dumb because you could just implement map/filter as a list comprehension. some people have criticized Guido for some of his design choices[1].)

[1] http://www.quora.com/What-are-the-main-weaknesses-of-Python-...

also on topic: http://docs.python.org/library/itertools.html


> its interesting that these are not necessarily idiomatic python despite being a slightly higher level abstraction. note they are idiomatic in all "real" functional languages, which python is decidedly not. also list comprehensions are faster than map/filter in python. (this is probably python's fault and is really dumb because you could just implement map/filter as a list comprehension. some people have criticized Guido for some of his design choices.)

I don't miss map/reduce syntax the slightest. List comprehensions (for when I need to read multiple times a list result) and generators expressions (for lazy evaluation and genericity) are incredibly readable and efficient .

Here's an example script[0] I threw in a cron to report on some daily FTP transfers from a legacy system. Those list comprehensions are very descriptive, like a mathematical set syntax. In comparison e.g Ruby's map/inject feels terribly awkward and alien to me. [1] (and [2], although a bit academic) is an invaluable resource on the subject.

[0] https://gist.github.com/2492951

[1] http://www.dabeaz.com/generators/

[2] http://www.dabeaz.com/coroutines/


He does do a "map", although without using the function of that name:

  answers = [f(x) for x in iterable]
Also, the "zip" operation gives loops that seem to me to be unable to be decomposed into map, filter, reduce. E.g., how would you print a file with line numbers using only map, filter, reduce?

Regardless, I agree with your basic point: that thinking of loops in terms of high-level loop primitives, is a Good Thing.


    def zip(*seqs):
      return map(lambda *items: items, *seqs)

    assert [(1, 'a'), (2, 'b'), (3, 'c')] == zip([1,2,3],['a','b','c'])


    def count(start=0):
      while True:
        yield start
        start += 1

    from itertools import izip, islice # lazy versions
    with open('/etc/passwd', 'r') as f:
      zipped = izip(count(), f.readlines())
      for numbered in zipped: print numbered
izip and islice are trivial to implement: https://github.com/dustingetz/sandbox/blob/master/etc/lazy.p...


Well, strictly speaking, you have answered my objection. But I guess I was thinking in Haskell. If Python's map takes multiple iterables, then, yes, zip is a special case. But writing zip by mapping a function without side effects, using a "map" that works only on a single iterable (as in Haskell) strikes me as a rather different thing.

So I suppose that the truth of "this loop can be written using map" depends on exactly what one means by "map".


interesting. how about

    def head(seq): return seq[0]
    def rest(seq): return seq[1:]
    def transpose(mtx):
        a = map(head, mtx)
        b = map(rest, mtx)
        return [a] + ([transpose(b)] if b[0] else [])

    def zip(xs, ys):
        return transpose((xs, ys))

    def zip2(*seqs):
        return transpose(seqs)
this works but is sort of icky in python, any ideas? this is also where functional python starts to fall apart - no tail call recursion, and default data structures aren't persistent. based on a haskell solution which is nicer. http://stackoverflow.com/questions/2578930/understanding-thi...


Line numbering a file with reduce is straightforward:

    def line_number_catamorphism(fp):
        def accumulate(a, line):
            line_number, lines = a
            return line_number + 1, lines + [(line_number + 1, line)]
        return reduce(accumulate, fp, (0, []))[1]
It may be worth nothing that zip can be straightforwardly expressed in terms of map as well, i.e.:

    zip = lambda *a: map(None, *a)


I'm sorry, is this a joke? "straightforward"? Why would you use that code when you could implement the same thing more simply like this?

    def line_number_mogrifier(f):
        num = 0
        for line in f:
            yield (num, line)
            num += 1

The functional proponents say that eventually everything will be functional, I guess I'll wait and see how that goes over. People think procedurally. Functional constructs may have some technical advantages, but if adopting them shrinks the pool of effective developers, it won't catch on.


Can you control the ordering of map/filter/reduce? As in, I want to iterate through a matrix by iterating through the columns and rows in reverse order? As in:

  [9,6,3]
  [8,5,2]
  [7,4,1]
I've just started using map/filter/reduce so I'm not sure.


No. They start from the beginning and go to the end. Conceptually it could work on any iterable, and not just a list container.

You can, however, use the 'reversed()' function, as in "for row in reversed(rows): for col in reversed(row): print col"

There is a special double-underscore method, which lists and other data structures support, to allow iteration in reverse with the normal performance of forward iteration.


What data structure are you using to represent your matrix? A list of lists? If so, it's not terribly difficult:

    >>> a = [[1,2,3], [4,5,6], [7,8,9]]
    >>> b = zip(*map(reversed, reversed(a))
    >>> b
    [(9, 6, 3), (8, 5, 2), (7, 4, 1)]
    >>> reduce(operator.add, b) # could also use a lambda with + instead of operator.add...
    (9, 6, 3, 8, 5, 2, 7, 4, 1)


Simple and useful overview. I find enumerate() quite useful. I should probably define generators more often.

One built-in function I basically never use is zip(). I know how it's used and what it does, and I grok the names/ages types of examples that are always used to explain it. But I've simply never needed it in real code. Anyone else use it often, and in what contexts?


I sometimes use zip as a poor man's matrix transposition operator:

    >>> m = [(1,2,3), (4,5,6), (7,8,9)]
    >>> zip(*m)
    [(1, 4, 7), (2, 5, 8), (3, 6, 9)]


If you want to transpose a matrix-like list of lists, but the sizes don't all match up, you can use izip_longest from itertools and give it a fillvalue to "fill in" for the missing elements:

  >>> m = [(1,2,3), (4,5,6), (7, 8)]
  >>> zip(*m)
  [(1, 4, 7), (2, 5, 8)] # Wrong
  >>> from itertools import izip_longest
  >>> list(izip_longest(*m, fillvalue=None))
  [(1, 4, 7), (2, 5, 8), (3, 6, None)]


That's pretty.


Quickly make a dict:

    >>> keys = ["foo", "bar", "spam", "egg"]
    >>> values = [1, 2, 3, 4]
    >>> dict(zip(keys, values))
    {'egg': 4, 'foo': 1, 'bar': 2, 'spam': 3}
Quite useful when you MGET from Redis.


I use it rarely in Python, although it's sometimes useful. I use it more often in other languages to replicate Python's enumerate, since zip(range(len(seq)),seq) is basically enumerate(seq) (although with this implementation it's not a generator).


Ah, but this is:

  from itertools import izip
  izip(xrange(len(seq)), seq)


And in Python 3 it's a generator with

    zip(range(len(seq)), seq)


i use it all the time. mostly to loop over n things at the same time:

   for a, b in zip(list_a, list_b):
      print a + b


For example, trying to solve this problem (CodeJam 2008 minimum scalar product):

You are given two vectors v1=(x1,x2,...,xn) and v2=(y1,y2,...,yn). The scalar product of these vectors is a single number, calculated as x1y1+x2y2+...+xnyn.

Haskell:

  sum(zipWith (*) [1,2,3][3,2,1])
Python is not inherently functional, just supports a few convenience idioms. It has zip, but no zipWith. As a result you have to use list comprehension to perform the same:

  sum([x*y for (x,y) in zip([1,2,3][3,2,1])])


You can use map with several iterables:

  sum(map(operator.mul, [1,2,3], [3,2,1]))


I've used zip before to find an LCA:

    def lca(a,b)
      a.ancestors.zip(b.ancestors).take_while { |a,b| a == b }.last
    end


First examples of generators that I've seen that made them look useful. I'd never thought of using a generator as a filter, or for turning nested for-loops into one for-loop. I have some code I can go clean up now.


Here's a pile more to blow your mind with: http://www.dabeaz.com/coroutines/ [2009]


I would have liked to have seen mention of this while loop instead of the index-based one.

    while my_list:
        v = my_list.pop(0)
        print v
Although it doesn't fit the theme of the others because it consumes the list, it is certainly a more natural use of "while" to loop over a list.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: