C programming, Python programming
It's said that you can write C in any programming language. I used to do this all the time – C was my first language, and I came to Python from Perl – but I am doing it much less these days and profiting from the change. Look at how I was parsing 2D GeoRSS coordinates or a GML posList into tuples back in 2005 in the 6 lines below.
# value is "33.338611 52.566592 32.537546 52.527238 31.744196 52.409555"
latlons = value.split()
coords = []
for i in range(0, len(latlons), 2):
    lat = float(latlons[i])
    lon = float(latlons[i+1])
    coords.append((lon, lat))
# coords is [(52.566592, 33.338611), ...]
Counters, increments, offsets – it's rudimentary C pointer arithmetic in Python. I recently rewrote this in a different style.
def take_tuples(itr):
    nxt = itr.next
    while True:
        lat = nxt()
        lon = nxt()
        yield lon, lat

latlons = iter(map(float, value.split()))
coords = list(take_tuples(latlons))
Python isn't a functional language, but I still enjoy and benefit from using it in a functional style. The statements in my first, older batch of code are concerned with how to extract coordinate pairs. The statements in the second express what values I want. I want the results of float mapped over all the items split from the initial value, and then I want a list of pairs of these results.
For a simpler example, if what I want is a list of integers from 1 to 10, I write this:
instead of this:
And if what I want is integers converted to floats, I write this:
instead of this:
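As a sketch of that what-versus-how contrast in Python 3 syntax: the variable names are illustrative assumptions, and range and map are wrapped in list() because both are lazy in Python 3.

```python
# What I want: the integers from 1 to 10.
ints = list(range(1, 11))

# How to build them, C style: a counter, a bound, and an append.
ints_c_style = []
i = 1
while i <= 10:
    ints_c_style.append(i)
    i += 1

# What I want: those integers converted to floats.
floats = list(map(float, ints))

# The how-style equivalent: index into the list one item at a time.
floats_c_style = []
for j in range(len(ints)):
    floats_c_style.append(float(ints[j]))
```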
The generator function take_tuples (code section 2, above) is also written in terms of what results I want. It calls next twice on an iterator and yields the pair, swapped. When it hits the end of the iterator, it raises StopIteration. This is expected and normal for functions that accept iterators, like list(). It's the what-not-how way to get things from a sequence, pairwise.
>>> def take_pairs(value):
...     itr = iter(value)
...     while True:
...         yield itr.next(), itr.next()
...
>>> list(take_pairs(range(4)))
[(0, 1), (2, 3)]
>>> list(take_pairs("abc"))
[('a', 'b')]
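The session above is Python 2. In Python 3, iterators are advanced with the built-in next() rather than a .next() method, and since Python 3.7 (PEP 479) a StopIteration that escapes a generator body is turned into a RuntimeError, so the end of the input has to be handled explicitly. A minimal sketch of the same pairwise idea, assuming Python 3.7 or later:

```python
def take_pairs(value):
    """Yield consecutive items of value two at a time, dropping a trailing odd item."""
    itr = iter(value)
    while True:
        try:
            first = next(itr)
            second = next(itr)
        except StopIteration:
            # Input exhausted (possibly mid-pair): end the generator cleanly.
            return
        yield first, second


print(list(take_pairs(range(4))))  # [(0, 1), (2, 3)]
print(list(take_pairs("abc")))     # [('a', 'b')]
```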
Now, here's the icing: my new code (section 2 above, with map and list and the generator) benchmarks 10% faster than my old C style code. Binding the itr.next to a local variable (nxt) makes a big difference. I'm eating my cake and having it, too.
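One way to check a claim like this on your own machine is the standard library's timeit module. The sketch below is an assumption about how such a comparison could be set up, not the benchmark actually used for the figure above: it is ported to Python 3 (so nxt binds itr.__next__ and the generator catches StopIteration itself), the 64-vertex ring is made-up test data, and the function names c_style and functional_style are hypothetical. Timings vary with interpreter version and hardware.

```python
import math
import timeit

# Hypothetical input: 64 "lat lon" pairs approximating a circle.
value = " ".join(
    "%f %f" % (math.sin(2 * math.pi * k / 64), math.cos(2 * math.pi * k / 64))
    for k in range(64)
)


def c_style(value):
    """Index-based pairing with a counter and offsets."""
    latlons = value.split()
    coords = []
    for i in range(0, len(latlons), 2):
        lat = float(latlons[i])
        lon = float(latlons[i + 1])
        coords.append((lon, lat))
    return coords


def take_tuples(itr):
    """Yield (lon, lat) pairs from an iterator of lat, lon, lat, lon, ..."""
    nxt = itr.__next__  # bind the method to a local name once
    while True:
        try:
            lat = nxt()
            lon = nxt()
        except StopIteration:
            return
        yield lon, lat


def functional_style(value):
    return list(take_tuples(iter(map(float, value.split()))))


# Both approaches must agree before their timings mean anything.
assert c_style(value) == functional_style(value)

for func in (c_style, functional_style):
    seconds = timeit.timeit(lambda: func(value), number=2000)
    print("%s: %.3f s" % (func.__name__, seconds))
```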
Comments
Re: C programming, Python programming
Author: Victor
But the magic really happens when you get rid of the take_tuples function:
Re: C programming, Python programming
Author: Sean
Victor, that's super cool. It benchmarks 30% faster than my C style code, and it easily extends to n-tuples (say I expected X,Y,Z triples) à la
coords = zip(*[latlons]*n)
but I am swapping the order of lat and lon in take_tuples(), so zip() doesn't quite work.
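The zip(*[latlons]*n) idiom works because the list holds n references to the same iterator, so each output tuple consumes n consecutive items. A sketch in Python 3 syntax, with a swap step added as one possible way to get (lon, lat) order; the swap is an assumption for illustration, not something proposed in the thread:

```python
value = "33.338611 52.566592 32.537546 52.527238 31.744196 52.409555"

# One shared iterator; zip pulls from it n times per output tuple.
latlons = iter(map(float, value.split()))
n = 2
pairs = zip(*[latlons] * n)  # yields (lat, lon) tuples

# Swap each pair to (lon, lat), as take_tuples() does.
coords = [(lon, lat) for lat, lon in pairs]
print(coords[0])  # (52.566592, 33.338611)
```

Note that latlons must be an iterator here, not a list: with a list, zip would pair every item with itself.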
Re: C programming, Python programming
Author: Michal Migurski
Python does Numpy-style strides, so:
Re: C programming, Python programming
Author: Sean
Michal, that's even 1-2% faster than Victor's (my original benchmark was a linear ring with 64 vertices, approximating a circle, but the difference is the same for 30 and 100 vertices). Even though slices copy data, they are apparently (at least with Python 2.7.3) a bit faster than iterators.
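Extended slices with a step can do the pairing and the swap in one expression. A sketch in Python 3 syntax of what that might look like; the exact expression is an assumption rather than a quote from the comment above:

```python
value = "33.338611 52.566592 32.537546 52.527238 31.744196 52.409555"
latlons = [float(s) for s in value.split()]

# latlons[1::2] takes every second item starting at index 1 (the lons);
# latlons[::2] takes every second item starting at index 0 (the lats).
# Each slice copies its half of the list.
coords = list(zip(latlons[1::2], latlons[::2]))
print(coords[0])  # (52.566592, 33.338611)
```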
Re: C programming, Python programming
Author: John Drake
"Now, here's the icing: my new code (section 2 above, with map and list and the generator) benchmarks 10% faster than my old C style code. Binding the itr.next to a local variable (nxt) makes a big difference. I'm eating my cake and having it, too." That's really interesting, I wonder why it runs so much faster. It would seem that the opposite should be true.