C programming, Python programming

It's said that you can write C in any programming language. I used to do this all the time – C was my first language, and I came to Python from Perl – but I am doing it much less these days and profiting from the change. Look at how I was parsing 2D GeoRSS coordinates or a GML posList into tuples back in 2005 in the 6 lines below.

# value is "33.338611 52.566592 32.537546 52.527238 31.744196 52.409555"

latlons = value.split()
coords = []
for i in range(0, len(latlons), 2):
    lat = float(latlons[i])
    lon = float(latlons[i+1])
    coords.append((lon, lat))

# coords is [(52.566592, 33.338611), ...]

Counters, increments, offsets – It's rudimentary C pointer arithmetic in Python. I recently rewrote this in a different style.

def take_tuples(itr):
    nxt = itr.next
    while True:
        lat = nxt()
        lon = nxt()
        yield lon, lat

latlons = iter(map(float, text.split()))
coords = list(take_tuples(latlons))

Python isn't a functional language, but I still enjoy and benefit from using it in a functional style. The statements in my first, older batch of code are concerned with how to extract coordinate pairs. The statements in the second express what values I want. I want the results of float mapped over all the items split from the initial value, and then I want a list of pairs of these results.

For a more simple example, If what I want is a list of integers from 1 to 10, I write this:

numbers = list(range(1, 11))

instead of this:

numbers = []
for i in range(1, 11):
    numbers.append(i)

And if what I want is integers converted to floats, I write this:

numbers = map(float, range(1, 11))

instead of this:

numbers = []
for i in range(1, 11):
    numbers.append(float(i))

The generator function take_tuples (code section 2, above) is also written in terms of what results I want. It calls next twice on an iterator and yields the pair, swapped. When it hits the end of the iterator, it raises StopIteration. This is expected and normal for functions that accept iterators, like list(). It's the what-not-how way to get things from a sequence, pairwise.

>>> def take_pairs(value):
...     itr = iter(value)
...     while True:
...         yield itr.next(), itr.next()
...
>>> list(take_pairs(range(4)))
[(0, 1), (2, 3)]
>>> list(take_pairs("abc"))
[('a', 'b')]

Now, here's the icing: my new code (section 2 above, with map and list and the generator) benchmarks 10% faster than my old C style code. Binding the itr.next to a local variable (nxt) makes a big difference. I'm eating my cake and having it, too.

Comments

Re: C programming, Python programming

Author: Victor

But the magic really happens when you get rid of the take_tuples function:

latlons = iter(map(float, text.split()))
coords = zip(latlons, latlons)

Re: C programming, Python programming

Author: Sean

Victor, that's super cool. It benchmarks 30% faster than my C style code, and it easily extends to n-tuples (say I expected X,Y,Z triples) ala

<pre>

coords = zip(*[latlons]*n) </pre> but I am swapping the order of lat and lon in take_tuples(), so zip() doesn't quite work.</p>

Re: C programming, Python programming

Author: Michal Migurski

Python does Numpy-style strides, so:

lonlats = zip(latlons[1::2], latlons[0::2])

Re: C programming, Python programming

Author: Sean

Michal, that's even 1-2% faster than Victor's (my original benchmark was a linear ring with 64 vertices, approximating a circle, but difference is the same for 30 and 100 vertices). Even though slices copy data, they are apparently (at least with Python 2.7.3) a bit faster than iterators.

Re: C programming, Python programming

Author: John Drake

"Now, here's the icing: my new code (section 2 above, with map and list and the generator) benchmarks 10% faster than my old C style code. Binding the itr.next to a local variable (nxt) makes a big difference. I'm eating my cake and having it, too." That's really interesting, I wonder why it runs so much faster. It would seem that the opposite should be true.