Solved: sumif in Python on a column and create a new column

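Python has no built-in sumif function. In pandas, the usual equivalent is to build a boolean condition, use it to select the rows that qualify, and then sum the selected values.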

I have a dataframe that looks like this:
<code>import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [2, 3, 4, 5], 'C': [3, 4, 5, 6]})

   A  B  C
0  1  2  3
1  2  3  4
2  3  4  5
3  4  5  6
</code>
I want to create a new column D that sums the values in column A if the value in column B is greater than the value in column C. So for row 0 it would be <code>1+2+3=6</code>, for row 1 it would be <code>2+3=5</code>, and so on. The expected output is:
<code>   A  B  C  D
0  1  2  3  6     # (1+2+3) since B &gt; C for row 0 only
1  2  3  4  5     # (2+3) since B &gt; C for row 1 only
2  3  4  5  0     # no values added since B &lt;= C
3  4  5  6  0     # no values added since B &lt;= C
</code>


The goal is to create a new column D in a pandas DataFrame. D contains the sum of the values in column A, but only where the value in column B is greater than the value in column C.
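One way to read this requirement is row by row: D takes the value of A where B > C and 0 elsewhere. The sketch below shows that reading with `np.where` and `Series.where`. Note that in the sample frame from the question B is smaller than C in every row, so the literal rule yields 0 everywhere. The second frame, `df2`, is a hypothetical variant (not from the question) in which the first two rows satisfy B > C. The column name `D_total` is also an assumption, added to show the alternative reading where the conditional total is repeated on each qualifying row.

```python
import numpy as np
import pandas as pd

# Sample frame from the question: B is smaller than C in every row.
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [2, 3, 4, 5], 'C': [3, 4, 5, 6]})

mask = df['B'] > df['C']               # boolean Series, one entry per row
df['D'] = np.where(mask, df['A'], 0)   # A where B > C, otherwise 0

# Hypothetical variant: rows 0 and 1 now have B > C.
df2 = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 4, 5], 'C': [3, 4, 5, 6]})
mask2 = df2['B'] > df2['C']

# Row-wise reading, written with Series.where instead of np.where.
df2['D'] = df2['A'].where(mask2, 0)

# Alternative reading: the conditional total of A, shown on each qualifying row.
total = df2.loc[mask2, 'A'].sum()
df2['D_total'] = np.where(mask2, total, 0)

print(df)
print(df2)
```

Which reading is right depends on what D is meant to hold. The mask is the same in both cases, only the value placed on the qualifying rows differs.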

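Sumif

SUMIF is a spreadsheet function (for example in Excel) that adds up the values in a range that meet a condition. It is not a Python library. In pandas, the same result comes from filtering with a boolean condition and then calling an aggregation such as sum(), mean(), min(), max(), or quantile() on the filtered values.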
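As a minimal sketch of a SUMIF-style calculation in pandas, the example below sums column A over the rows where B is at least 4. The threshold 4 and the variable names are arbitrary choices for illustration, not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [2, 3, 4, 5], 'C': [3, 4, 5, 6]})

# Condition: keep the rows where B is at least 4 (rows 2 and 3).
mask = df['B'] >= 4

# SUMIF-style total of A over the matching rows: 3 + 4.
sumif_total = df.loc[mask, 'A'].sum()

# The same mask works for other conditional aggregates.
averageif = df.loc[mask, 'A'].mean()   # mean of 3 and 4
countif = int(mask.sum())              # number of matching rows

print(sumif_total, averageif, countif)
```

Because the mask is an ordinary boolean Series, conditions can be combined with `&` and `|` (each side in parentheses) to mimic SUMIFS-style multi-condition sums.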

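Creating columns

In pandas, you create a column in a DataFrame by assigning to it with square brackets; there is no column() function. The syntax is as follows:

df[name] = data

where name is the name of the column and data is what you want to put into it: a single value, a list or array with one entry per row, or a Series.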
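The sketch below shows a few common ways to add a column by assignment. The column names E, F, G and H are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [2, 3, 4, 5]})

# From a list with one entry per row.
df['E'] = [10, 20, 30, 40]

# From a calculation on existing columns.
df['F'] = df['A'] + df['B']

# From a single value, repeated on every row.
df['G'] = 0

# DataFrame.assign returns a new frame and leaves df unchanged.
df2 = df.assign(H=df['A'] * 2)

print(df2)
```

Direct assignment modifies the frame in place, while `assign` is convenient when chaining several steps without touching the original.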

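Working with data and columns

In pandas, you can turn columns into a plain Python dictionary with the DataFrame.to_dict() method. Select the columns you want by passing a list of column names to the DataFrame, then call to_dict(). With orient='list', each key in the resulting dictionary is a column name and each value is the list of values in that column.

For example, to create a dictionary holding the values of the "name" and "age" columns of a DataFrame called data, you can use the following code:

columns = ['name', 'age']
result = data[columns].to_dict(orient='list')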
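For a self-contained version of this idea, the sketch below builds a small DataFrame and converts two of its columns to dictionaries. The sample rows and the extra "city" column are invented for illustration.

```python
import pandas as pd

# Hypothetical data set with one more column than we want to export.
data = pd.DataFrame({
    'name': ['Ann', 'Bob'],
    'age': [31, 27],
    'city': ['Oslo', 'Lima'],
})

columns = ['name', 'age']

# One key per column, each holding the list of that column's values.
by_column = data[columns].to_dict(orient='list')

# One dictionary per row, useful when each record is handled separately.
by_row = data[columns].to_dict(orient='records')

print(by_column)
print(by_row)
```

The `orient` argument controls the shape of the result, so it is worth choosing it based on how the dictionary will be consumed.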
