Solved: sumif on a column and create new column in Python

The main difficulty with a SUMIF in Python is that neither Python nor pandas provides a function by that name. A conditional sum is instead written with a boolean condition on the DataFrame, and the result can then be stored in a new column.

I have a dataframe that looks like this:
<code>import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [2, 3, 4, 5], 'C': [3, 4, 5, 6]})

   A  B  C
0  1  2  3
1  2  3  4
2  3  4  5
3  4  5  6
</code>
I want to create a new column D that takes the value from column A when the value in column B is greater than the value in column C, and 0 otherwise, so that summing D gives the SUMIF-style total of A under that condition. In the sample data B is always smaller than C, so every row fails the condition: for row 0, <code>2 &gt; 3</code> is false, for row 1, <code>3 &gt; 4</code> is false, and so on. The expected output is:
<code>   A  B  C  D
0  1  2  3  0     # B &lt;= C, so nothing is added
1  2  3  4  0     # B &lt;= C, so nothing is added
2  3  4  5  0     # B &lt;= C, so nothing is added
3  4  5  6  0     # B &lt;= C, so nothing is added
</code>


The task, in short, is Python code that creates a new column D in a pandas DataFrame. The new column D contains the sum of the values in column A, but only where the value in column B is greater than the value in column C.
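A minimal sketch of that idea, assuming a row-wise reading of the condition (D takes the value of A where B is greater than C, and 0 elsewhere). The use of `numpy.where` is one possible choice here, not necessarily the original author's method. With the sample data, B is never greater than C, so D comes out as 0 in every row.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [2, 3, 4, 5], 'C': [3, 4, 5, 6]})

# Row-wise condition: True where B is greater than C.
condition = df['B'] > df['C']

# Keep A where the condition holds, otherwise fill with 0.
df['D'] = np.where(condition, df['A'], 0)

# Summing D gives the SUMIF-style total of A under the condition.
total = df['D'].sum()

print(df)
print(total)
```

Flipping the comparison to `df['B'] < df['C']` would make every row qualify, in which case D would simply repeat column A.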

Sumif

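SUMIF is not a Python library. It is a spreadsheet function (for example in Excel) that sums the values meeting a condition. In pandas the same result comes from selecting rows with a boolean condition and then aggregating them. The same pattern can be used to calculate the sum, average, minimum, maximum, or a percentile of the selected values.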
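As a rough illustration of a SUMIF-style calculation in pandas, the sketch below selects rows with a boolean mask and aggregates the selection. The threshold `A > 2` is an arbitrary example condition chosen for illustration, not something taken from the question.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [2, 3, 4, 5], 'C': [3, 4, 5, 6]})

# Condition from the question: no row has B greater than C in this data.
sum_b_gt_c = df.loc[df['B'] > df['C'], 'A'].sum()

# Illustrative condition: rows where A is greater than 2 (rows 2 and 3).
mask = df['A'] > 2
sum_a_gt_2 = df.loc[mask, 'A'].sum()
mean_a_gt_2 = df.loc[mask, 'A'].mean()
max_a_gt_2 = df.loc[mask, 'A'].max()

print(sum_b_gt_c, sum_a_gt_2, mean_a_gt_2, max_a_gt_2)
```

Summing an empty selection returns 0, which matches how a spreadsheet SUMIF behaves when nothing meets the condition.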

Creating columns

In pandas, you create a column in a DataFrame by assigning to it by name; there is no column() function. The syntax is as follows:

df[name] = data

where name is the name of the column and data is the data you want to put in that column (a single value, or a list or array with one value per row).
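For illustration, here is a small sketch of two common ways to add a column in pandas: direct assignment, which modifies the DataFrame in place, and `DataFrame.assign`, which returns a new DataFrame. The column names `total` and `double_a` are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [2, 3, 4, 5]})

# Direct assignment adds the column to df itself.
df['total'] = df['A'] + df['B']

# assign() leaves df untouched and returns a copy with the extra column.
df2 = df.assign(double_a=df['A'] * 2)

print(df)
print(df2)
```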

Working with data and columns

In Python, you can prepare data for columns with the dict() function. Given pairs of a column name and that column's values, it returns a dictionary object. Each key in this dictionary is a column name, and each value is the corresponding list of values from the data set.

For example, to create a dictionary object holding the values of a data set in the columns "name" and "age", you can use the following code:

columns = ['name', 'age']
values = [['Ana', 'Ben'], [31, 27]]  # placeholder values, one list per column
data = dict(zip(columns, values))
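To show how such a dictionary relates to a DataFrame, the sketch below builds a DataFrame from a dictionary of columns and converts it back. The names and ages are invented placeholder values, and the column labels `name` and `age` are assumed for the example.

```python
import pandas as pd

# Placeholder data: one key per column, one list of values per key.
data = {'name': ['Ana', 'Ben', 'Cara'], 'age': [31, 27, 45]}

# Each dictionary key becomes a column of the DataFrame.
people = pd.DataFrame(data)

# Convert back to a dictionary of column name -> list of values.
as_dict = people.to_dict(orient='list')

# Build a lookup from one column to another.
age_by_name = dict(zip(people['name'], people['age']))

print(people)
print(as_dict)
```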
