
        Identifying the closest value in a column for each filter using Pandas


                  Problem description

                  I have a data frame with categories and values. For each category, I need to find the value closest to a given target value. I think I'm close, but I can't get the right output when applying the results of argsort back to the original dataframe.

                  For example, with the input defined in the code below and a target of 2.0, only the rows (a, 1), (b, 2) and (c, 2) should have isClosest set to True; all other isClosest values should be False.

                  If multiple values are equally close, only the first one listed should be marked.

                  Here is the code I have so far. The argsort step works, but I can't work out how to apply its result back to the dataframe correctly. I would love some pointers.

                  import pandas as pd

                  df = pd.DataFrame()
                  df['category'] = ['a', 'b', 'b', 'b', 'c', 'a', 'b', 'c', 'c', 'a']
                  df['values'] = [1, 2, 3, 4, 5, 4, 3, 2, 1, 0]
                  df['isClosest'] = False

                  uniqueCategories = df['category'].unique()
                  for c in uniqueCategories:
                      filteredCategories = df[df['category']==c]
                      # positions within this category, ordered by distance from 2.0
                      sortargs = (filteredCategories['values']-2.0).abs().argsort()
                      #how to use sortargs so that we set column in df isClosest=True if its the closest value in each category to 2.0?
                  

                  Recommended answer

                  You can create a column of absolute differences:

                  df['dif'] = (df['values'] - 2).abs()
                  
                  df
                  Out: 
                    category  values  dif
                  0        a       1    1
                  1        b       2    0
                  2        b       3    1
                  3        b       4    2
                  4        c       5    3
                  5        a       4    2
                  6        b       3    1
                  7        c       2    0
                  8        c       1    1
                  9        a       0    2
                  

                  Then use groupby.transform to check whether each row's difference equals the minimum difference of its group:

                  df['is_closest'] = df.groupby('category')['dif'].transform('min') == df['dif']
                  
                  df
                  Out: 
                    category  values  dif is_closest
                  0        a       1    1       True
                  1        b       2    0       True
                  2        b       3    1      False
                  3        b       4    2      False
                  4        c       5    3      False
                  5        a       4    2      False
                  6        b       3    1      False
                  7        c       2    0       True
                  8        c       1    1      False
                  9        a       0    2      False
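
The same check can be sketched without adding a helper column to the frame. The function name `closest_mask` and its parameters are hypothetical, introduced only for illustration; the calls themselves (`Series.abs`, `Series.groupby`, `transform('min')`) are standard pandas.

```python
import pandas as pd


def closest_mask(df, group_col, value_col, target):
    """Return a boolean Series: True where value_col is nearest to target within its group."""
    # Absolute distance of every row from the target value.
    dif = (df[value_col] - target).abs()
    # Smallest distance per group, broadcast back to the original row order.
    group_min = dif.groupby(df[group_col]).transform('min')
    return dif == group_min


df = pd.DataFrame({
    'category': ['a', 'b', 'b', 'b', 'c', 'a', 'b', 'c', 'c', 'a'],
    'values': [1, 2, 3, 4, 5, 4, 3, 2, 1, 0],
})
df['is_closest'] = closest_mask(df, 'category', 'values', 2)
```

Like the equality check above, this marks every row that ties for the minimum, not just the first one.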
                  

                  df.groupby('category')['dif'].idxmin() would also give you the index label of the closest value for each category. You can use that for selection and assignment too.

                  For selection:

                  df.loc[df.groupby('category')['dif'].idxmin()]
                  Out: 
                    category  values  dif
                  0        a       1    1
                  1        b       2    0
                  7        c       2    0
                  

                  For assignment:

                  df['is_closest'] = False
                  df.loc[df.groupby('category')['dif'].idxmin(), 'is_closest'] = True
                  df
                  Out: 
                    category  values  dif is_closest
                  0        a       1    1       True
                  1        b       2    0       True
                  2        b       3    1      False
                  3        b       4    2      False
                  4        c       5    3      False
                  5        a       4    2      False
                  6        b       3    1      False
                  7        c       2    0       True
                  8        c       1    1      False
                  9        a       0    2      False
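
Returning to the loop from the question, here is a minimal sketch of how the argsort result could be mapped back to the original frame. It assumes the column is named 'values', as in the frame definition. `argsort` yields integer positions within the filtered subset, so the first position has to be translated into an index label before it is used with `df.loc`. A stable sort (`kind='mergesort'`) is chosen here so that the first-listed row wins on ties.

```python
import pandas as pd

df = pd.DataFrame({
    'category': ['a', 'b', 'b', 'b', 'c', 'a', 'b', 'c', 'c', 'a'],
    'values': [1, 2, 3, 4, 5, 4, 3, 2, 1, 0],
})
df['isClosest'] = False

for c in df['category'].unique():
    filtered = df[df['category'] == c]
    # argsort returns integer positions within `filtered`, not labels of `df`.
    # A stable sort keeps the first-listed row ahead of later ties.
    positions = (filtered['values'] - 2.0).abs().argsort(kind='mergesort')
    # Translate the position of the smallest distance back to an index label.
    closest_label = filtered.index[positions.iloc[0]]
    df.loc[closest_label, 'isClosest'] = True
```

This flags the same three rows as the idxmin assignment, though the groupby version avoids the explicit Python loop.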
                  

                  The difference between the transform and idxmin approaches is how they treat ties. If you check equality against the group minimum, every tied row gets True. With idxmin, only the first occurrence in each group gets True (one row per group), which is the behaviour the question asks for.
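
The example data has no ties, so the difference is not visible in the outputs above. The small frame below is hypothetical, made up only to show it: in category 'a', the values 1 and 3 are both at distance 1 from the target 2.

```python
import pandas as pd

# Hypothetical data: within category 'a', both 1 and 3 are at distance 1 from the target 2.
ties = pd.DataFrame({
    'category': ['a', 'a', 'a', 'b'],
    'values': [1, 3, 7, 5],
})
ties['dif'] = (ties['values'] - 2).abs()

# Equality against the group minimum flags every tied row.
ties['all_ties'] = ties.groupby('category')['dif'].transform('min') == ties['dif']

# idxmin returns one label per group (the first occurrence), so only one row is flagged.
ties['first_only'] = False
ties.loc[ties.groupby('category')['dif'].idxmin(), 'first_only'] = True
```

Here `all_ties` is True for both tied rows in category 'a', while `first_only` is True only for the first of them.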
