Solved: pandas filter rows by fuzzy values

In data analysis, it is common to run into large datasets that need cleaning and manipulation. One recurring problem is filtering rows based on fuzzy (approximate) values, especially when working with text data. Pandas, a popular Python library for data manipulation, can be combined with a string-matching library to address this. In this article, we look at how to use Pandas to filter rows by fuzzy values, walk through the code step by step, and discuss the libraries and functions that help with this kind of problem.

To tackle this problem, we will use the Pandas library together with the fuzzywuzzy library, which computes the similarity between strings. fuzzywuzzy is based on Levenshtein distance, a measure of similarity defined as the number of single-character edits (insertions, deletions, or substitutions) needed to turn one string into another.
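To make the idea of edit distance concrete, here is a minimal pure-Python sketch of Levenshtein distance. The function name levenshtein is our own and is not part of fuzzywuzzy; it is only meant to illustrate the measure, not to reproduce how the library computes its scores.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits that turn a into b."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into an empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (free if the characters match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

A smaller distance means the two strings are more alike; similarity scores such as the ones used later in this article turn that idea into a percentage.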

Installing and importing the required libraries

To get started, we need to install and import the necessary libraries. You can use pip to install Pandas and fuzzywuzzy:

pip install pandas
pip install fuzzywuzzy

Once they are installed, import the libraries in your Python code:

import pandas as pd
from fuzzywuzzy import fuzz, process

Filtering Rows Based on Fuzzy Values

Now that the required libraries are imported, let's create a made-up dataset and show how to filter rows based on fuzzy values. In this example, the dataset contains garment names and their styles.

data = {'Garment': ['T-shirt', 'Polo shirt', 'Jeans', 'Leather jacket', 'Winter coat'],
        'Style': ['Casual', 'Casual', 'Casual', 'Biker', 'Winter']}
df = pd.DataFrame(data)

Suppose we want to keep only the rows whose garment is similar to "Tee shirt". We can use the fuzzywuzzy library to do this.

search_string = "Tee shirt"
threshold = 70

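def filter_rows(df, column, search_string, threshold):
    # Score every value in the column against the search string (0 to 100),
    # then keep only the rows whose score reaches the threshold.
    return df[df[column].apply(lambda x: fuzz.token_sort_ratio(x, search_string)) >= threshold]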

filtered_df = filter_rows(df, 'Garment', search_string, threshold)

In the code above, we define a function, filter_rows, that takes four parameters: the DataFrame, the column name, the search string, and the similarity threshold. It returns a DataFrame containing only the rows whose similarity score, computed with the fuzz.token_sort_ratio function from the fuzzywuzzy library, is at or above the given threshold.
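To give a feel for what a token-sort comparison does, the sketch below approximates it with the standard library only. The helper token_sort_score is a hypothetical name of ours; it assumes the scorer lowercases the text, replaces punctuation with spaces, sorts the words, and then compares the results. The scores fuzzywuzzy itself returns may differ, so treat this as an illustration rather than the library's implementation.

```python
import re
from difflib import SequenceMatcher


def token_sort_score(left: str, right: str) -> int:
    """Approximate a token-sort similarity score between 0 and 100."""
    def normalize(text: str) -> str:
        # Lowercase, turn anything that is not a letter or digit into a space,
        # then sort the words so that word order no longer matters.
        words = re.sub(r"[^a-z0-9]+", " ", text.lower()).split()
        return " ".join(sorted(words))

    ratio = SequenceMatcher(None, normalize(left), normalize(right)).ratio()
    return round(100 * ratio)


# Word order and letter case do not affect the score.
print(token_sort_score("shirt Tee", "Tee shirt"))  # 100

# With a threshold of 70, "T-shirt" is kept and "Polo shirt" is dropped.
for garment in ["T-shirt", "Polo shirt"]:
    print(garment, token_sort_score(garment, "Tee shirt") >= 70)
```

Sorting the words before comparing is what makes this kind of score a reasonable choice for names where the same words may appear in a different order.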

Understanding the Code Step by Step

  • First, we create a DataFrame called df that holds our dataset.
  • Next, we define the search string as "Tee shirt" and set the threshold to 70. Scores range from 0 to 100, so you can raise or lower the threshold depending on how close a match you want.
  • Then we create a function named filter_rows, which filters the DataFrame using a Levenshtein-based similarity score between the search string and the value in each row of the specified column.
  • Finally, we call filter_rows to get the filtered DataFrame, filtered_df.
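To see how the choice of threshold changes the result, the sketch below tries several cutoffs on the same garment names. It uses difflib.get_close_matches from the Python standard library as a stand-in scorer, so the cutoffs are on a 0 to 1 scale rather than fuzzywuzzy's 0 to 100, and the exact scores will not match fuzzywuzzy's. The helper close_garments is a hypothetical name used only for this illustration.

```python
from difflib import get_close_matches

garments = ["T-shirt", "Polo shirt", "Jeans", "Leather jacket", "Winter coat"]

# Map lowercased names back to the originals so that matching ignores case.
by_lower = {name.lower(): name for name in garments}


def close_garments(search: str, cutoff: float) -> list:
    """Return garment names whose similarity to search is at least cutoff (0 to 1)."""
    hits = get_close_matches(
        search.lower(), list(by_lower), n=len(by_lower), cutoff=cutoff
    )
    return [by_lower[hit] for hit in hits]  # best match first


for cutoff in (0.6, 0.7, 0.8):
    print(cutoff, close_garments("Tee shirt", cutoff))
# 0.6 ['T-shirt', 'Polo shirt']
# 0.7 ['T-shirt']
# 0.8 []
```

A low cutoff lets in loosely related names, while a high one can reject every row, so it is worth checking a few values against real data before settling on one.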

In conclusion, Pandas combined with the fuzzywuzzy library is a practical way to filter rows based on fuzzy values. Understanding these libraries and their functions lets us handle text data more effectively and take on more complex data-cleaning tasks.
