Solved: pandas table to PostgreSQL

In the world of data analysis and manipulation, one of the most popular Python libraries is Pandas. It provides a variety of powerful tools for working with structured data, making that data easy to manipulate, visualize, and analyze. One of the many tasks a data analyst may encounter is importing data from a CSV file into a PostgreSQL database. In this article, we will discuss how to do this efficiently using both Pandas and the psycopg2 library. We will also look at the functions and libraries involved in the process, to give a complete picture of the solution.

Introduction to Pandas and PostgreSQL

Pandas is a Python library that provides easy-to-use data structures and data manipulation functions for data analysis. It is especially useful when dealing with large datasets or when you need to perform complex data transformations. PostgreSQL, on the other hand, is a free and open-source object-relational database management system (ORDBMS) that emphasizes extensibility and SQL compliance. It is widely used for large, complex data management workloads.

Now, let's say we have a CSV file containing a large amount of data, and we want to import it into a PostgreSQL database. A common way to do this is to use Pandas together with the psycopg2 library, a PostgreSQL database adapter that lets us talk to the database from Python.

Pandas: Reading CSV files

The first step in our process is to read the contents of the CSV file using Pandas.

import pandas as pd

filename = "example.csv"
df = pd.read_csv(filename)

This code uses the pd.read_csv() function, which reads the CSV file and returns a DataFrame object. With the DataFrame in hand, we can easily manipulate and analyze the data.
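Before loading anything into a database, it can help to check what read_csv() actually inferred. The following is a minimal sketch, assuming a small made-up CSV held in memory (the column names id, name, and score are illustrative and not taken from any real file):

```python
import io

import pandas as pd

# Hypothetical CSV content; io.StringIO lets read_csv() treat a string as a file.
csv_text = "id,name,score\n1,Ada,9.5\n2,Grace,8.0\n"

df = pd.read_csv(io.StringIO(csv_text))

# Inspect the inferred column types and the first rows before loading them.
print(df.dtypes)
print(df.head())
```

The inferred dtypes matter later, because they determine which PostgreSQL column types are a sensible match for each column.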

Connecting to the PostgreSQL database

The next step is to connect to our PostgreSQL database using the psycopg2 library. To do this, we first need to install psycopg2, which can be done with pip:

pip install psycopg2

Once the library is installed, we can connect to our PostgreSQL database:

import psycopg2

connection = psycopg2.connect(
    dbname="your_database_name",
    user="your_username",
    password="your_password",
    host="your_hostname",
    port="your_port",
)

The psycopg2.connect() function establishes a connection to the database server using the credentials provided. If the connection succeeds, the function returns a connection object that we will use to interact with the database.
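Hard-coding credentials in a script is easy to get wrong, so one option is to read them from the environment. The sketch below is only an illustration: the helper names connection_params() and open_connection() are hypothetical, the variable names PGDATABASE, PGUSER, PGPASSWORD, PGHOST, and PGPORT follow the standard libpq environment variables, and the fallback values (localhost, port 5432) are assumptions rather than settings taken from this article.

```python
import os


def connection_params(env=None):
    """Collect keyword arguments for psycopg2.connect() from environment variables."""
    env = os.environ if env is None else env
    return {
        "dbname": env.get("PGDATABASE", "your_database_name"),
        "user": env.get("PGUSER", "your_username"),
        "password": env.get("PGPASSWORD", ""),
        "host": env.get("PGHOST", "localhost"),  # assumed default
        "port": env.get("PGPORT", "5432"),  # PostgreSQL's default port
    }


def open_connection():
    """Open a connection; requires psycopg2 and a reachable PostgreSQL server."""
    import psycopg2  # imported lazily so connection_params() works without the driver

    return psycopg2.connect(**connection_params())
```

Calling open_connection() then behaves like the explicit psycopg2.connect(...) call above, with the values supplied by the environment instead of the source file.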

Creating a table in PostgreSQL

Now that we have our data in a DataFrame and a connection to the PostgreSQL database, we can create a table in the database to hold the data.

cursor = connection.cursor()
create_table_query = '''
CREATE TABLE IF NOT EXISTS example_table (
    column1 data_type,
    column2 data_type,
    ...
)
'''
cursor.execute(create_table_query)
connection.commit()

In this code snippet, we first create a cursor object using the connection.cursor() method. The cursor is used to perform database operations such as creating tables and inserting data. Next, we define the SQL query that creates the table and execute it with the cursor.execute() method. The names column1 and column2 and the data_type entries are placeholders; replace them with the real column names and PostgreSQL types that match your CSV file. Finally, we commit the changes to the database with connection.commit().
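Writing the column list by hand gets tedious for wide files, so one possible approach is to derive it from the DataFrame's dtypes. The following is a rough sketch under stated assumptions: the helpers postgres_type() and build_create_table() are hypothetical, and the dtype-to-type mapping (BIGINT, DOUBLE PRECISION, BOOLEAN, TIMESTAMP, with TEXT as the fallback) is one reasonable choice, not a rule from Pandas or PostgreSQL.

```python
import pandas as pd
from pandas.api import types as ptypes


def postgres_type(dtype):
    """Map a pandas dtype to a PostgreSQL column type (assumed mapping)."""
    if ptypes.is_bool_dtype(dtype):
        return "BOOLEAN"
    if ptypes.is_integer_dtype(dtype):
        return "BIGINT"
    if ptypes.is_float_dtype(dtype):
        return "DOUBLE PRECISION"
    if ptypes.is_datetime64_any_dtype(dtype):
        return "TIMESTAMP"
    return "TEXT"  # fallback for strings and anything unrecognized


def quote_identifier(name):
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + str(name).replace('"', '""') + '"'


def build_create_table(df, table_name):
    """Build a CREATE TABLE IF NOT EXISTS statement from a DataFrame's columns."""
    columns = ",\n".join(
        f"    {quote_identifier(name)} {postgres_type(dtype)}"
        for name, dtype in df.dtypes.items()
    )
    return f"CREATE TABLE IF NOT EXISTS {quote_identifier(table_name)} (\n{columns}\n)"


# Illustrative data only; in the article's flow, df would come from pd.read_csv().
sample = pd.DataFrame({"id": [1, 2], "score": [1.5, 2.0], "name": ["a", "b"]})
create_table_query = build_create_table(sample, "example_table")
print(create_table_query)
```

The resulting string can be passed to cursor.execute() in place of the hand-written query. Note that double-quoted identifiers are case-sensitive in PostgreSQL, so mixed-case column names must be quoted the same way in later queries.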

Inserting data into the PostgreSQL database

Now that we have a table, we can insert the data from the DataFrame into the PostgreSQL database using the to_sql() method provided by Pandas. This step also requires SQLAlchemy, which can be installed with pip install sqlalchemy.

from sqlalchemy import create_engine

engine = create_engine("postgresql://your_username:your_password@your_hostname:your_port/your_database_name")
df.to_sql("example_table", engine, if_exists="append", index=False)

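In this code snippet, we first create a database engine using the create_engine() function from the SQLAlchemy library, which takes a connection string containing our database credentials. For a postgresql:// URL, SQLAlchemy uses psycopg2 as its default driver. Then we use the to_sql() method to insert the data from our DataFrame into the "example_table" table in the PostgreSQL database. With if_exists="append", rows are added to the existing table instead of replacing it, and index=False keeps the DataFrame index from being written as an extra column.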
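To see what if_exists="append" and index=False do in practice, here is a small sketch that runs without a database server. It assumes an in-memory SQLite database as a stand-in for PostgreSQL, which to_sql() accepts through a plain sqlite3 connection; the data is made up, and with PostgreSQL you would pass the SQLAlchemy engine instead.

```python
import sqlite3

import pandas as pd

# sqlite3 stands in for PostgreSQL here so the sketch runs without a server.
connection = sqlite3.connect(":memory:")
df = pd.DataFrame({"id": [1, 2], "name": ["Ada", "Grace"]})

# The first call creates the table, because it does not exist yet.
df.to_sql("example_table", connection, if_exists="append", index=False)
# The second call appends the same rows again rather than replacing them.
df.to_sql("example_table", connection, if_exists="append", index=False)

result = pd.read_sql_query("SELECT * FROM example_table", connection)
print(len(result))
print(list(result.columns))
connection.close()
```

The table ends up with four rows and only the id and name columns, which shows that running an append-mode import twice duplicates data. If a rerun should start from a clean table, if_exists="replace" is the alternative, at the cost of dropping the existing table first.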

In conclusion, this article has walked through how to import data from a CSV file into a PostgreSQL database using Pandas and psycopg2, with SQLAlchemy supplying the engine for to_sql(). By combining the ease of data manipulation in Pandas with the power and scalability of PostgreSQL, we get a seamless and efficient solution to the common task of loading CSV data into a database.
