目 录CONTENT

文章目录

【GaussDB】安装测试GaussDB基于psycopg3修改的python驱动

DarkAthena
2026-03-02 / 0 评论 / 0 点赞 / 4 阅读 / 0 字

【GaussDB】安装测试GaussDB基于psycopg3修改的python驱动

前言

前面几篇提过了这个驱动,这个psycopg3是比psycopg2更符合规范的驱动,但是上一篇最后提到了直接pip install gaussdb用起来可能会coredump,所以这篇来尝试基于源码来进行安装,并简单分析一下这个驱动的实现方式。

先按照官方文档来安装

1. 安装libpq

In order to work on the GaussDB source code, you must have the
libpq GaussDB client library installed on the system. For instance, on
EulerOS x86_64 systems, you can obtain it by running::

useradd -m gaussdbUser
usermod -aG wheel gaussdbUser
echo "gaussdbUser ALL=(ALL) NOPASSWD: ALL" | sudo tee /etc/sudoers.d/gaussdbUser
passwd gaussdbUser

su - gaussdbUser
source tools/install_gaussdb_driver.sh

这里其实就是新建了个操作系统用户,添加了免密sudo权限,然后进入这个用户执行了source tools/install_gaussdb_driver.sh,这个脚本里面是在线下载GaussDB 505.2的驱动包,解压,然后再解当前操作系统版本的python驱动,即psycopg2,再把里面的lib目录添加到 /etc/ld.so.conf.d/[操作系统用户].conf 文件里,执行了sudo ldconfig刷新了LD_LIBRARY_PATH,最后source了一把.bashrc,也就是说在执行完这个脚本后,LD_LIBRARY_PATH里就有了这个lib目录了。

其实这里就有点反直觉了,我原本以为这个脚本会解压libpq那个包,没想到实际上解压的是psycopg2的包。不过我后面对比了一下,psycopg2的lib目录会比完整的libpq包解压出来的少了很多文件,只要不基于libpq开发,而是直接使用libpq.so动态库,那么直接用psycopg2里的lib的确能节省一些空间。但是psycopg2这个无用的目录没有被删掉,所以暂且只能认为这个脚本只是为了方便开发,真正部署安装还是得自己手动配置libpq添加LD_LIBRARY_PATH环境变量。

2. 关于uv环境

文档里建议用uv环境,但是下面的pip命令前面没加uv,那么执行pip install就不会安装到uv环境里,我对比了这段和原版psycopg3,写得是一样的,可能是新版uv有变化但文档没更新?暂且不管了,我先手动在pip命令前都加上uv

3. 使用源码安装

# Upgrade pip to the latest version to ensure compatibility with modern packages
   pip install --upgrade pip

   # Install all required dependencies listed in the requirements.txt file
   pip install -r requirements.txt

   # Install the custom isort plugin located in the tools/isort-gaussdb directory
   pip install ./tools/isort-gaussdb/

   # Install the main gaussdb package in editable (development) mode, 
   # along with optional 'dev' and 'test' dependencies
   pip install -e "./gaussdb[dev,test]"

   # Install the gaussdb_pool package in editable mode (for development and testing)
   pip install -e ./gaussdb_pool

这里有个坑,执行 uv pip install ./tools/isort-gaussdb/ 会报错

Using Python 3.11.12 environment at: /home/aop/.env311
  × Failed to build `isort-gaussdb @ file:///home/aop/gaussdb-python/tools/isort-gaussdb`
  ├─▶ Failed to parse: `/home/aop/gaussdb-python/pyproject.toml`
  ╰─▶ TOML parse error at line 5, column 1
        |
      5 | [project]
        | ^^^^^^^^^
      `pyproject.toml` is using the `[project]` table, but the required `project.version` field is neither set nor present in the `project.dynamic` list

但是使用非UV环境下相同的python版本就不会报这个错。
让GPT-5.3-Codex排查了下,是因为UV的校验更严格,对子目录安装时,会向上读取仓库根目录的pyproject.toml,去找里面的version,如果没有就会报错,但实际上这个项目的三个包都是在子目录里,根目录的pyproject.toml是用来跑测试的。
AI建议我把子包目录复制到其他目录后再去执行uv pip install,或者在根目录的pyproject.toml里加上版本号。实测两种方式都可以,我暂时先把版本号加上去了(打包whl不需要在这里加版本号)。

4. 测试

源码里的测试是把连接串配置到环境变量里,然后直接执行example里的demo.py文件,我直接改了个,把连接串写到py文件里去了,

from gaussdb import connect

# Connect to an existing database
with connect("dbname=postgres user=ogadmin password=Enmo@123 host=192.168.1.131 port=8000",connect_timeout=3) as conn:

    # Open a cursor to perform database operations
    with conn.cursor() as cur:

        # Execute a command: this creates a new table
        cur.execute("""
            CREATE TABLE test (
                id serial PRIMARY KEY,
                num integer,
                data text)
            """)

        # Pass data to fill a query placeholders and let Psycopg perform
        # the correct conversion (no SQL injections!)
        cur.execute(
            "INSERT INTO test (num, data) VALUES (%s, %s)",
            (100, "abc'def"))

        # Query the database and obtain data as Python objects.
        cur.execute("SELECT * FROM test")
        print(cur.fetchone())
        # will print (1, 100, "abc'def")

        # You can use `cur.executemany()` to perform an operation in batch
        cur.executemany(
            "INSERT INTO test (num) values (%s)",
            [(33,), (66,), (99,)])

        # You can use `cur.fetchmany()`, `cur.fetchall()` to return a list
        # of several records, or even iterate on the cursor
        cur.execute("SELECT id, num FROM test order by num")
        for record in cur:
            print(record)

        # Make the changes to the database persistent
        conn.commit()

测试正常,但是要注意,test.py不能放到这个项目的根目录执行,因为这里有个gaussdb目录,import会识别错误

[aop@kylinv10sp3-node1 example]$ python test.py
(1, 100, "abc'def")
(2, 33)
(3, 66)
(4, 99)
(1, 100) 

另外,如果卸载gaussdb-pool和isort-gaussdb,这个测试用例也是可以跑通的。gaussdb-pool是连接池组件,isort-gaussdb是用来优化improt顺序的。

打包whl

先安装打包需要的几个库

uv pip install -U pip
uv pip install -U setuptools wheel build

这个项目里没有一键打包的脚本,我就随手写了个(基于AI生成改的)
build_all_whl.sh

#!/bin/bash
ROOT=$PWD
OUT_DIR="${1:-$ROOT/all_dist}"
rm -rf "$OUT_DIR"
mkdir -p "$OUT_DIR"

cd $ROOT/gaussdb && python -m build --wheel --outdir "$OUT_DIR"
cd $ROOT/gaussdb_pool && python -m build --wheel --outdir "$OUT_DIR"
cd $ROOT/tools/isort-gaussdb && python -m build --wheel --outdir "$OUT_DIR"

echo "==> built wheels:"
ls -1 "$OUT_DIR"/*.whl

之前为什么会coredump?

libpq是动态库,gaussdb(psycopg3)也还是需要libpq,也就是从python访问到c,但是和psycopg2不一样,psycopg2是 psycopg2->_psycopg.so->libpq.so ,而gaussdb(psycopg3)是 psycopg->ctypes->libpq.so ,这么写,对于不了解的人来说可能不太明白。简单来说,psycopg2里是有C代码的,会编译出动态库so文件,这样就必须编译出N个版本,在现在的国产化平台上不得不手动编译了;而gaussdb(psycopg3)里没有C代码,是纯python的,这样就不用和CPU、操作系统强绑定,发布一个包就够了。所以华为在pypi上只传了一个版本的whl文件。

至于我上一篇文章最后出现coredump,其实也还是因为当时LD_LIBRARY_PATH里的libpq版本不对,不是GaussDB的,而是操作系统自带的

(.env311) [aop@kylinv10sp3-node1 example]$ ldconfig -p |grep libpq
        libpq.so.5 (libc6,x86-64) => /usr/lib64/libpq.so.5
(.env311) [aop@kylinv10sp3-node1 example]$ python test.py
段错误 (核心已转储)
(.env311) [aop@kylinv10sp3-node1 example]$ export LD_LIBRARY_PATH=/home/aop/GaussDB-connector-python-psycopg2/depend/lib/
(.env311) [aop@kylinv10sp3-node1 example]$ python test.py
Traceback (most recent call last):
  File "/home/aop/gaussdb-python/example/test.py", line 10, in <module>
    cur.execute("""
  File "/home/aop/.env311/lib/python3.11/site-packages/gaussdb/cursor.py", line 98, in execute
    raise ex.with_traceback(None)
gaussdb.errors.DuplicateTable: Relation "test" already exists in schema "ogadmin".
DETAIL:  Creating new table with existing name in the same schema.
(.env311) [aop@kylinv10sp3-node1 example]$

如上,手动设置了LD_LIBRARY_PATH后就不会coredump了。由此也可以判断,原生PG的libpq和GaussDB的libpq接口是存在不兼容的。

如果环境中完全不存在任何libpq,则不会coredump,如下:

(.env311) [aop@kylinv10sp3-node1 example]$ sudo  mv /usr/lib64/libpq.so.5 /tmp/
(.env311) [aop@kylinv10sp3-node1 example]$ python test.py
Traceback (most recent call last):
  File "/home/aop/gaussdb-python/example/test.py", line 1, in <module>
    from gaussdb import connect
  File "/home/aop/.env311/lib/python3.11/site-packages/gaussdb/__init__.py", line 27, in <module>
    from . import pq  # noqa: F401 import early to stabilize side effects
    ^^^^^^^^^^^^^^^^
  File "/home/aop/.env311/lib/python3.11/site-packages/gaussdb/pq/__init__.py", line 116, in <module>
    import_from_libpq()
  File "/home/aop/.env311/lib/python3.11/site-packages/gaussdb/pq/__init__.py", line 108, in import_from_libpq
    raise ImportError(
ImportError: no pq wrapper available.
Attempts made:
- couldn't import gaussdb 'c' implementation: No module named 'gaussdb_c'
- couldn't import gaussdb 'binary' implementation: No module named 'gaussdb_binary'
- couldn't import gaussdb 'python' implementation: libpq.so.5: cannot open shared object file: No such file or directory
(.env311) [aop@kylinv10sp3-node1 example]$

其实这里也可以看到,这个驱动是有三种实现 ,对应源码:

    # The best implementation: fast but requires the system libpq installed
    if not impl or impl == "c":
        try:
            from gaussdb_c import pq as module  # type: ignore
        except Exception as e:
            handle_error("c", e)

    # Second best implementation: fast and stand-alone
    if not module and (not impl or impl == "binary"):
        try:
            from gaussdb_binary import pq as module  # type: ignore
        except Exception as e:
            handle_error("binary", e)

    # Pure Python implementation, slow and requires the system libpq installed.
    if not module and (not impl or impl == "python"):
        try:
            from . import pq_ctypes as module  # type: ignore[assignment]
        except Exception as e:
            handle_error("python", e)

华为改出来的这个gaussdb驱动,里面是没有gaussdb_c和gaussdb_binary的,所以只能走到最后这个纯python实现的分支。
暂不确定这三种实现的性能差异有多大,但是只用纯python的这个方式,的确更方便维护和发布。

下一篇开始整基于gaussdb(psycopg3)的sqlalchemy方言。

0
  1. 支付宝打赏

    qrcode alipay
  2. 微信打赏

    qrcode weixin
博主关闭了所有页面的评论