Python 内置 sum 函数 vs. for 循环性能

Python built-in sum function vs. for loop performance(Python 内置 sum 函数 vs. for 循环性能)
本文介绍了Python 内置 sum 函数 vs. for 循环性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着跟版网的小编来一起学习吧!

问题描述

我注意到 Python 的内置 sum 函数在对 1 000 000 个整数列表求和时比 for 循环快大约 3 倍:

I noticed that Python's built-in sum function is roughly 3x faster than a for loop when summing a list of 1 000 000 integers:

import timeit

def sum1():
    s = 0
    for i in range(1000000):
        s += i
    return s

def sum2():
    return sum(range(1000000))

print 'For Loop Sum:', timeit.timeit(sum1, number=10)
print 'Built-in Sum:', timeit.timeit(sum2, number=10)

# Prints:
# For Loop Sum: 0.751425027847
# Built-in Sum: 0.266746997833

这是为什么呢?sum是如何实现的?

Why is that? How is sum implemented?

推荐答案

速度差异实际上大于 3 倍,但是您通过首先创建一个包含 100 万个整数的巨大内存列表来减慢任一版本的速度.将其与计时赛分开:

The speed difference is actually greater than 3 times, but you slow down either version by first creating a huge in-memory list of 1 million integers. Separate that out of the time trials:

>>> import timeit
>>> def sum1(lst):
...     s = 0
...     for i in lst:
...         s += i
...     return s
... 
>>> def sum2(lst):
...     return sum(lst)
... 
>>> values = range(1000000)
>>> timeit.timeit('f(lst)', 'from __main__ import sum1 as f, values as lst', number=100)
3.457869052886963
>>> timeit.timeit('f(lst)', 'from __main__ import sum2 as f, values as lst', number=100)
0.6696369647979736

现在速度差已经上升到5倍以上了.

The speed difference has risen to over 5 times now.

for 循环作为解释的 Python 字节码执行.sum() 完全在 C 代码中循环.解释字节码和 C 代码之间的速度差异很大.

A for loop is executed as interpreted Python bytecode. sum() loops entirely in C code. The speed difference between interpreted bytecode and C code is large.

此外,如果 C 代码可以将总和保留在 C 类型中,则确保不会创建新的 Python 对象;这适用于 intfloat 结果.

In addition, the C code makes sure not to create new Python objects if it can keep the sum in C types instead; this works for int and float results.

反汇编的 Python 版本是这样做的:

The Python version, disassembled, does this:

>>> import dis
>>> def sum1():
...     s = 0
...     for i in range(1000000):
...         s += i
...     return s
... 
>>> dis.dis(sum1)
  2           0 LOAD_CONST               1 (0)
              3 STORE_FAST               0 (s)

  3           6 SETUP_LOOP              30 (to 39)
              9 LOAD_GLOBAL              0 (range)
             12 LOAD_CONST               2 (1000000)
             15 CALL_FUNCTION            1
             18 GET_ITER            
        >>   19 FOR_ITER                16 (to 38)
             22 STORE_FAST               1 (i)

  4          25 LOAD_FAST                0 (s)
             28 LOAD_FAST                1 (i)
             31 INPLACE_ADD         
             32 STORE_FAST               0 (s)
             35 JUMP_ABSOLUTE           19
        >>   38 POP_BLOCK           

  5     >>   39 LOAD_FAST                0 (s)
             42 RETURN_VALUE        

除了解释器循环比 C 慢之外,INPLACE_ADD 将创建一个新的整数对象(超过 255,CPython 将小的 int 对象缓存为单例).

Apart from the interpreter loop being slower than C, the INPLACE_ADD will create a new integer object (past 255, CPython caches small int objects as singletons).

您可以在Python mercurial 代码存储库,但它在评论中明确指出:

You can see the C implementation in the Python mercurial code repository, but it explicitly states in the comments:

/* Fast addition by keeping temporary sums in C instead of new Python objects.
   Assumes all inputs are the same type.  If the assumption fails, default
   to the more general routine.
*/

这篇关于Python 内置 sum 函数 vs. for 循环性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!

本站部分内容来源互联网,如果有图片或者内容侵犯您的权益请联系我们删除!

相关文档推荐

patching a class yields quot;AttributeError: Mock object has no attributequot; when accessing instance attributes(修补类会产生“AttributeError:Mock object has no attribute;访问实例属性时)
How to mock lt;ModelClassgt;.query.filter_by() in Flask-SqlAlchemy(如何在 Flask-SqlAlchemy 中模拟 lt;ModelClassgt;.query.filter_by())
FTPLIB error socket.gaierror: [Errno 8] nodename nor servname provided, or not known(FTPLIB 错误 socket.gaierror: [Errno 8] nodename nor servname provided, or not known)
Weird numpy.sum behavior when adding zeros(添加零时奇怪的 numpy.sum 行为)
Why does the #39;int#39; object is not callable error occur when using the sum() function?(为什么在使用 sum() 函数时会出现 int object is not callable 错误?)
How to sum in pandas by unique index in several columns?(如何通过几列中的唯一索引对 pandas 求和?)