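In Python, packages are used to organize modules.
Python packages
There are two types of packages:
- Regular packages
- Namespace packages
Regular packages
A regular package is a directory that contains an __init__.py file. The file marks the directory as a package. __init__.py may be empty, but it is often used to initialize the package, define what the package exposes, or run setup code when the package is imported.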
mypackage/
    __init__.py
    module1.py
    module2.py
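To make the role of __init__.py concrete, here is a minimal sketch that builds a throwaway regular package in a temporary directory and imports it. The package name demo_regular_pkg, the function names, and the initialized flag are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway regular package on disk (all names are illustrative only).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_regular_pkg"
pkg.mkdir()
(pkg / "module1.py").write_text("def func1():\n    return 'module1'\n")
(pkg / "module2.py").write_text("def func2():\n    return 'module2'\n")

# __init__.py runs once, on first import, and decides what the package exposes.
(pkg / "__init__.py").write_text(
    "from .module1 import func1\n"
    "__all__ = ['func1']\n"
    "initialized = True\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_regular_pkg = importlib.import_module("demo_regular_pkg")

print(demo_regular_pkg.func1())       # re-exported by __init__.py
print(demo_regular_pkg.initialized)   # set while __init__.py ran
print(demo_regular_pkg.__file__)      # path of the package's __init__.py
```

Note that module2 is not loaded here, because __init__.py never imports it; it only becomes available after an explicit import.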
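Namespace packages
A namespace package is a package-like structure that needs no __init__.py file and can span multiple directories. This allows a package to be assembled from parts that live in different locations.
Example:
Imagine a namespace package mynamespace spread across two directories:
Directory 1:
project1/mynamespace/
    module1.py
Directory 2:
project2/mynamespace/
    module2.py
Usage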
import mynamespace.module1
import mynamespace.module2
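Python merges the two mynamespace directories into a single namespace package, so the modules from both are importable, provided that both parent directories (project1 and project2) are on sys.path.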
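The merge can be reproduced without installing anything. The sketch below recreates the two directories in a temporary location and adds both parents to sys.path; the NAME constant written into each module is an assumption added for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())
for project, module in (("project1", "module1"), ("project2", "module2")):
    portion = base / project / "mynamespace"
    portion.mkdir(parents=True)                     # note: no __init__.py is created
    (portion / f"{module}.py").write_text(f"NAME = '{module}'\n")
    sys.path.append(str(base / project))            # each parent must be on sys.path

importlib.invalidate_caches()
import mynamespace.module1
import mynamespace.module2

print(mynamespace.module1.NAME, mynamespace.module2.NAME)
print(list(mynamespace.__path__))   # two directories, one per portion
print(mynamespace.__file__)         # None: no __init__.py backs the package
```

The __path__ attribute is the quickest way to see which directories currently contribute to the namespace.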
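Comparison
Namespace package example
As we have seen, the source code of a namespace package can be spread across multiple directories. This helps decouple a large project along team lines, so that the components of a package can be developed and distributed independently.
For example, one team (in one repository) can work on one component while another team works on a different component of the same package, each developed and distributed independently.
Let us look at how namespace packages make this possible.
Suppose a company (log-corp) provides a large, useful package whose subpackages and modules are spread across different repositories (directories) because different teams work on them.
- log_corp is the main package; it contains many subpackages and modules that are developed by different teams and distributed independently.
- Versioning and distribution happen independently for each component.
- The only thing the components share is the namespace. (All subpackages and modules live under the main package log_corp.)
Team 1 (repository 1)
Uses the src layout.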
LOG_CORP_COMP1/
├── src/
│ └── log_corp/
│ ├── sub_pkg1/
│ │ └── sub_module1.py
│ └── module1.py
├── pyproject.toml
└── README.md
module1.py
def func1():
    print('From log-corp module1')
sub_module1.py
def sub_func1():
    print('From log-corp submodule1')
pyproject.toml
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
[project]
name = "log-corp1" # as if it appears in pypi
authors = [
{name = "Logesh", email = "log@gmail.com"},
]
description = "My package description"
readme = "README.md"
requires-python = ">=3.10"
license = {text = "MIT License"}
classifiers = [
"Programming Language :: Python :: 3",
]
version = "1.0.1" # version 1.0.1
Building the package
PS C:\Users\L\Desktop\log_corp_comp1> ls
Directory: C:\Users\L\Desktop\log_corp_comp1
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 12/28/2024 11:32 AM src
-a---- 12/28/2024 11:41 AM 387 pyproject.toml
-a---- 12/28/2024 11:31 AM 34 README.md
PS C:\Users\L\Desktop\log_corp_comp1> python -m build
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
- setuptools
* Getting build dependencies for sdist...
running egg_info
creating src\log_corp1.egg-info
writing src\log_corp1.egg-info\PKG-INFO
writing dependency_links to src\log_corp1.egg-info\dependency_links.txt
writing top-level names to src\log_corp1.egg-info\top_level.txt
writing manifest file 'src\log_corp1.egg-info\SOURCES.txt'
reading manifest file 'src\log_corp1.egg-info\SOURCES.txt'
writing manifest file 'src\log_corp1.egg-info\SOURCES.txt'
* Building sdist...
running sdist
running egg_info
writing src\log_corp1.egg-info\PKG-INFO
writing dependency_links to src\log_corp1.egg-info\dependency_links.txt
writing top-level names to src\log_corp1.egg-info\top_level.txt
reading manifest file 'src\log_corp1.egg-info\SOURCES.txt'
writing manifest file 'src\log_corp1.egg-info\SOURCES.txt'
running check
creating log_corp1-1.0.1
creating log_corp1-1.0.1\src\log_corp
creating log_corp1-1.0.1\src\log_corp1.egg-info
creating log_corp1-1.0.1\src\log_corp\sub_pkg1
copying files to log_corp1-1.0.1...
copying README.md -> log_corp1-1.0.1
copying pyproject.toml -> log_corp1-1.0.1
copying src\log_corp\module1.py -> log_corp1-1.0.1\src\log_corp
copying src\log_corp1.egg-info\PKG-INFO -> log_corp1-1.0.1\src\log_corp1.egg-info
copying src\log_corp1.egg-info\SOURCES.txt -> log_corp1-1.0.1\src\log_corp1.egg-info
copying src\log_corp1.egg-info\dependency_links.txt -> log_corp1-1.0.1\src\log_corp1.egg-info
copying src\log_corp1.egg-info\top_level.txt -> log_corp1-1.0.1\src\log_corp1.egg-info
copying src\log_corp\sub_pkg1\sub_module1.py -> log_corp1-1.0.1\src\log_corp\sub_pkg1
copying src\log_corp1.egg-info\SOURCES.txt -> log_corp1-1.0.1\src\log_corp1.egg-info
Writing log_corp1-1.0.1\setup.cfg
Creating tar archive
removing 'log_corp1-1.0.1' (and everything under it)
* Building wheel from sdist
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
- setuptools
* Getting build dependencies for wheel...
running egg_info
writing src\log_corp1.egg-info\PKG-INFO
writing dependency_links to src\log_corp1.egg-info\dependency_links.txt
writing top-level names to src\log_corp1.egg-info\top_level.txt
reading manifest file 'src\log_corp1.egg-info\SOURCES.txt'
writing manifest file 'src\log_corp1.egg-info\SOURCES.txt'
* Building wheel...
running bdist_wheel
running build
running build_py
creating build\lib\log_corp
copying src\log_corp\module1.py -> build\lib\log_corp
creating build\lib\log_corp\sub_pkg1
copying src\log_corp\sub_pkg1\sub_module1.py -> build\lib\log_corp\sub_pkg1
running egg_info
writing src\log_corp1.egg-info\PKG-INFO
writing dependency_links to src\log_corp1.egg-info\dependency_links.txt
writing top-level names to src\log_corp1.egg-info\top_level.txt
reading manifest file 'src\log_corp1.egg-info\SOURCES.txt'
writing manifest file 'src\log_corp1.egg-info\SOURCES.txt'
installing to build\bdist.win-amd64\wheel
running install
running install_lib
creating build\bdist.win-amd64\wheel
creating build\bdist.win-amd64\wheel\log_corp
copying build\lib\log_corp\module1.py -> build\bdist.win-amd64\wheel\.\log_corp
creating build\bdist.win-amd64\wheel\log_corp\sub_pkg1
copying build\lib\log_corp\sub_pkg1\sub_module1.py -> build\bdist.win-amd64\wheel\.\log_corp\sub_pkg1
running install_egg_info
Copying src\log_corp1.egg-info to build\bdist.win-amd64\wheel\.\log_corp1-1.0.1-py3.11.egg-info
running install_scripts
creating build\bdist.win-amd64\wheel\log_corp1-1.0.1.dist-info\WHEEL
creating 'C:\Users\L\Desktop\log_corp_comp1\dist\.tmp-ryh5ya6f\log_corp1-1.0.1-py3-none-any.whl' and adding 'build\bdist.win-amd64\wheel' to it
adding 'log_corp/module1.py'
adding 'log_corp/sub_pkg1/sub_module1.py'
adding 'log_corp1-1.0.1.dist-info/METADATA'
adding 'log_corp1-1.0.1.dist-info/WHEEL'
adding 'log_corp1-1.0.1.dist-info/top_level.txt'
adding 'log_corp1-1.0.1.dist-info/RECORD'
removing build\bdist.win-amd64\wheel
Successfully built log_corp1-1.0.1.tar.gz and log_corp1-1.0.1-py3-none-any.whl
PS C:\Users\L\Desktop\log_corp_comp1>
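A wheel is a zip archive, so its contents can be checked from Python. The sketch below is one way to confirm that no log_corp/__init__.py was shipped. The helper name package_files is hypothetical, and the archive it inspects is a stand-in created on the spot with the member names from the build log above, not the real build artifact.

```python
import tempfile
import zipfile
from pathlib import Path

def package_files(wheel_path):
    """Return the archive members of a wheel, skipping *.dist-info metadata."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name for name in wheel.namelist()
            if ".dist-info/" not in name
        )

# Stand-in archive that mirrors the member names reported by the build log.
demo_wheel = Path(tempfile.mkdtemp()) / "log_corp1-1.0.1-py3-none-any.whl"
with zipfile.ZipFile(demo_wheel, "w") as archive:
    for member in (
        "log_corp/module1.py",
        "log_corp/sub_pkg1/sub_module1.py",
        "log_corp1-1.0.1.dist-info/METADATA",
    ):
        archive.writestr(member, "")

files = package_files(demo_wheel)
print(files)
print("log_corp/__init__.py" in files)   # False: the namespace root ships no __init__.py
```

Pointing package_files at the wheel in the dist directory would run the same check on the real build output.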
Team 2 (repository 2)
Uses the src layout.
LOG_CORP_COMP2/
├── src/
│ └── log_corp/
│ ├── sub_pkg2/
│ │ └── sub_module2.py
│ └── module2.py
├── pyproject.toml
└── README.md
module2.py
def func2():
    print('From log-corp module2')
sub_module2.py
def sub_func2():
    print('From log-corp submodule2')
pyproject.toml
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
[project]
name = "log-corp2" # Different package name from the previous one
authors = [
{name = "Logesh", email = "log@gmail.com"},
]
description = "My package description"
readme = "README.md"
requires-python = ">=3.10"
license = {text = "MIT License"}
classifiers = [
"Programming Language :: Python :: 3",
]
version = "1.0.2" # Different version
Building the package
PS E:\Documents\log_corp_comp2> ls
Directory: E:\Documents\log_corp_comp2
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 12/28/2024 11:25 AM src
-a---- 12/28/2024 11:31 AM 34 README.md
-a---- 12/28/2024 11:41 AM 387 pyproject.toml
PS E:\Documents\log_corp_comp2> python -m build
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
- setuptools
* Getting build dependencies for sdist...
running egg_info
creating src\log_corp2.egg-info
writing src\log_corp2.egg-info\PKG-INFO
writing dependency_links to src\log_corp2.egg-info\dependency_links.txt
writing top-level names to src\log_corp2.egg-info\top_level.txt
writing manifest file 'src\log_corp2.egg-info\SOURCES.txt'
reading manifest file 'src\log_corp2.egg-info\SOURCES.txt'
writing manifest file 'src\log_corp2.egg-info\SOURCES.txt'
* Building sdist...
running sdist
running egg_info
writing src\log_corp2.egg-info\PKG-INFO
writing dependency_links to src\log_corp2.egg-info\dependency_links.txt
writing top-level names to src\log_corp2.egg-info\top_level.txt
reading manifest file 'src\log_corp2.egg-info\SOURCES.txt'
writing manifest file 'src\log_corp2.egg-info\SOURCES.txt'
running check
creating log_corp2-1.0.2
creating log_corp2-1.0.2\src\log_corp
creating log_corp2-1.0.2\src\log_corp2.egg-info
creating log_corp2-1.0.2\src\log_corp\sub_pkg2
copying files to log_corp2-1.0.2...
copying README.md -> log_corp2-1.0.2
copying pyproject.toml -> log_corp2-1.0.2
copying src\log_corp\module2.py -> log_corp2-1.0.2\src\log_corp
copying src\log_corp2.egg-info\PKG-INFO -> log_corp2-1.0.2\src\log_corp2.egg-info
copying src\log_corp2.egg-info\SOURCES.txt -> log_corp2-1.0.2\src\log_corp2.egg-info
copying src\log_corp2.egg-info\dependency_links.txt -> log_corp2-1.0.2\src\log_corp2.egg-info
copying src\log_corp2.egg-info\top_level.txt -> log_corp2-1.0.2\src\log_corp2.egg-info
copying src\log_corp\sub_pkg2\sub_module2.py -> log_corp2-1.0.2\src\log_corp\sub_pkg2
Writing log_corp2-1.0.2\setup.cfg
Creating tar archive
removing 'log_corp2-1.0.2' (and everything under it)
* Building wheel from sdist
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
- setuptools
* Getting build dependencies for wheel...
running egg_info
writing src\log_corp2.egg-info\PKG-INFO
writing dependency_links to src\log_corp2.egg-info\dependency_links.txt
writing top-level names to src\log_corp2.egg-info\top_level.txt
reading manifest file 'src\log_corp2.egg-info\SOURCES.txt'
writing manifest file 'src\log_corp2.egg-info\SOURCES.txt'
* Building wheel...
running bdist_wheel
running build
running build_py
creating build\lib\log_corp
copying src\log_corp\module2.py -> build\lib\log_corp
creating build\lib\log_corp\sub_pkg2
copying src\log_corp\sub_pkg2\sub_module2.py -> build\lib\log_corp\sub_pkg2
running egg_info
writing src\log_corp2.egg-info\PKG-INFO
writing dependency_links to src\log_corp2.egg-info\dependency_links.txt
writing top-level names to src\log_corp2.egg-info\top_level.txt
reading manifest file 'src\log_corp2.egg-info\SOURCES.txt'
writing manifest file 'src\log_corp2.egg-info\SOURCES.txt'
installing to build\bdist.win-amd64\wheel
running install
running install_lib
creating build\bdist.win-amd64\wheel
creating build\bdist.win-amd64\wheel\log_corp
copying build\lib\log_corp\module2.py -> build\bdist.win-amd64\wheel\.\log_corp
creating build\bdist.win-amd64\wheel\log_corp\sub_pkg2
copying build\lib\log_corp\sub_pkg2\sub_module2.py -> build\bdist.win-amd64\wheel\.\log_corp\sub_pkg2
running install_egg_info
Copying src\log_corp2.egg-info to build\bdist.win-amd64\wheel\.\log_corp2-1.0.2-py3.11.egg-info
running install_scripts
creating build\bdist.win-amd64\wheel\log_corp2-1.0.2.dist-info\WHEEL
creating 'E:\Documents\log_corp_comp2\dist\.tmp-bn0t41tx\log_corp2-1.0.2-py3-none-any.whl' and adding 'build\bdist.win-amd64\wheel' to it
adding 'log_corp/module2.py'
adding 'log_corp/sub_pkg2/sub_module2.py'
adding 'log_corp2-1.0.2.dist-info/METADATA'
adding 'log_corp2-1.0.2.dist-info/WHEEL'
adding 'log_corp2-1.0.2.dist-info/top_level.txt'
adding 'log_corp2-1.0.2.dist-info/RECORD'
removing build\bdist.win-amd64\wheel
Successfully built log_corp2-1.0.2.tar.gz and log_corp2-1.0.2-py3-none-any.whl
PS E:\Documents\log_corp_comp2>
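Using log_corp
Create a virtual environment.
Install one component of log-corp.
Install the other component of log-corp.
We have now installed two different distributions that share the same namespace (the main package). The two distributions, log_corp1 and log_corp2, can be developed, distributed, installed, upgraded, and versioned independently while still sharing one namespace. That shared namespace is what makes log_corp a namespace package.
Here the sources are merged automatically during installation, and the merge produces no conflicts because the two distributions have no file in common. This is one of the main benefits of namespace packages.
log-corp can then be used as a whole, even though it was installed from different sources.
One obvious drawback is that from log_corp import * does not work as expected. It raises no error, but it does not bring the package's modules or subpackages into scope unless they have already been imported elsewhere, because there is no __init__.py file to initialize the package or define __all__.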
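The wildcard behavior is easy to observe in isolation. This sketch uses a stand-in namespace package called demo_ns (a hypothetical name, created in a temporary directory) rather than the installed log_corp, and runs the star-import inside a scratch dictionary so that the names it binds can be listed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
portion = root / "demo_ns"
portion.mkdir()                                    # no __init__.py
(portion / "module1.py").write_text("def func1():\n    return 'module1'\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Run the star-import in a scratch namespace so its effect is easy to inspect.
scope = {}
exec("from demo_ns import *", scope)
star_names = sorted(name for name in scope if not name.startswith("__"))
print(star_names)                                  # no public names were bound

# Explicit imports keep working.
from demo_ns import module1
print(module1.func1())
```

The star-import succeeds silently, which is arguably worse than an error: the missing names only show up later as a NameError at the point of use.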
Trying the same thing with regular packages
Add an __init__.py to both components and do some initialization in it.
Component 1
LOG_CORP_COMP1/
├── src/
│ └── log_corp/
│ ├── sub_pkg1/
│ │ ├── __init__.py
│ │ └── sub_module1.py
│ ├── __init__.py
│ └── module1.py
├── pyproject.toml
└── README.md
log_corp/__init__.py
from . import module1
from . import sub_pkg1
log_corp/sub_pkg1/__init__.py
from . import sub_module1
The remaining files are unchanged.
Component 2
LOG_CORP_COMP2/
├── src/
│ └── log_corp/
│ ├── sub_pkg2/
│ │ ├── __init__.py
│ │ └── sub_module2.py
│ ├── __init__.py
│ └── module2.py
├── pyproject.toml
└── README.md
log_corp/__init__.py
from . import module2
from . import sub_pkg2
log_corp/sub_pkg2/__init__.py
from . import sub_module2
The remaining files are unchanged.
Building both packages
Component 1
PS C:\Users\L\Desktop\log_corp_comp1> python -m build
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
- setuptools
* Getting build dependencies for sdist...
running egg_info
creating src\log_corp1.egg-info
<...truncated output>
removing build\bdist.win-amd64\wheel
Successfully built log_corp1-1.0.1.tar.gz and log_corp1-1.0.1-py3-none-any.whl
PS C:\Users\L\Desktop\log_corp_comp1>
PS C:\Users\L\Desktop\log_corp_comp1> ls
Directory: C:\Users\L\Desktop\log_corp_comp1
Mode LastWriteTime Length Name
---- ------------- ------ ----
d----- 12/28/2024 1:31 PM dist
d----- 12/28/2024 1:31 PM src
-a---- 12/28/2024 11:41 AM 387 pyproject.toml
-a---- 12/28/2024 11:31 AM 34 README.md
PS C:\Users\L\Desktop\log_corp_comp1>
Component 2
PS E:\Documents\log_corp_comp2> python -m build
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
- setuptools
* Getting build dependencies for sdist...
running egg_info
creating src\log_corp2.egg-info
writing src\log_corp2.egg-info\PKG-INFO
writing dependency_links to src\log_corp2.egg-info\dependency_links.txt
writing top-level names to src\log_corp2.egg-info\top_level.txt
writing manifest file 'src\log_corp2.egg-info\SOURCES.txt'
reading manifest file 'src\log_corp2.egg-info\SOURCES.txt'
writing manifest file 'src\log_corp2.egg-info\SOURCES.txt'
* Building sdist...
running sdist
running egg_info
<...truncated output>
Writing log_corp2-1.0.2\setup.cfg
Creating tar archive
removing 'log_corp2-1.0.2' (and everything under it)
* Building wheel from sdist
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
- setuptools
* Getting build dependencies for wheel...
running egg_info
<...truncated output>
Successfully built log_corp2-1.0.2.tar.gz and log_corp2-1.0.2-py3-none-any.whl
PS E:\Documents\log_corp_comp2>
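Usage
The previously installed packages have been uninstalled.
Install the log_corp1 package.
Use it in a script.
module1 and sub_pkg1 are accessible.
Install the log_corp2 package.
This installation overwrites the existing log_corp/__init__.py file.
Now only the modules and subpackages of log_corp2 are loaded when log_corp is imported.
However, the modules and subpackages of log_corp1 are still on disk and can be reached through explicit imports under log_corp, such as import log_corp.module1.
The one thing to watch out for is that log_corp/__init__.py has been overwritten by log_corp2. Any critical initialization that log_corp1 relied on no longer runs, which can leave the package broken.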
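The effect of the overwrite can be simulated without pip. In the sketch below, a temporary directory stands in for site-packages, and a hypothetical install helper writes each component's files into the same package directory, so the second __init__.py replaces the first. The package name demo_log_corp and the VALUE constants are assumptions made for the demo.

```python
import importlib
import sys
import tempfile
from pathlib import Path

site = Path(tempfile.mkdtemp())          # stands in for site-packages
pkg = site / "demo_log_corp"

def install(component):
    """Write one component's files into the shared package directory."""
    pkg.mkdir(exist_ok=True)
    (pkg / f"module{component}.py").write_text(f"VALUE = {component}\n")
    # Both components ship this path, so the later write replaces the earlier one.
    (pkg / "__init__.py").write_text(f"from . import module{component}\n")

install(1)
install(2)                               # overwrites component 1's __init__.py

sys.path.insert(0, str(site))
importlib.invalidate_caches()
import demo_log_corp

# Only what the surviving __init__.py imported is loaded automatically.
auto_loaded = sorted(name for name in vars(demo_log_corp) if name.startswith("module"))
print(auto_loaded)

import demo_log_corp.module1             # the file is still on disk, so this works
print(demo_log_corp.module1.VALUE)
```

Component 1's modules remain importable, but whatever its own __init__.py was supposed to set up is gone.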
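Conclusion
Namespace packages are a modern way to build flexible, scalable Python packages. They let the parts of a package live in different directories or be distributed across multiple projects, which makes them a good fit for large projects, plugin systems, and independently developed components.
Behavior of import *
Namespace packages do not usefully support wildcard imports (from package import *) because they lack the __init__.py file in which a regular package would normally define an __all__ list to control what is imported. The statement runs without error but exports none of the package's modules unless they have already been imported.
Advantages of namespace packages
- Modularity: different parts of a package can be split across multiple projects or directories.
- Scalability: well suited to large systems with plugins or independently developed components.
- Ease of maintenance: specific parts of a package can be updated without affecting the rest.
- No __init__.py required: simplifies package creation and avoids unnecessary boilerplate.
- Better collaboration: teams can work on different parts of a package in parallel, each owning its own directory.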
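For the plugin use case, one common pattern is to walk the namespace with pkgutil.iter_modules and import whatever is found. The sketch below assumes a namespace named demo_plugins with one plugin module contributed by each of two separately located directories; all of these names are hypothetical.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())
for project, plugin in (("dist_a", "plugin_a"), ("dist_b", "plugin_b")):
    portion = base / project / "demo_plugins"
    portion.mkdir(parents=True)                    # namespace portion, no __init__.py
    (portion / f"{plugin}.py").write_text(f"LABEL = '{plugin}'\n")
    sys.path.append(str(base / project))

importlib.invalidate_caches()
import demo_plugins

# Walk every directory that contributes to the namespace and import what is found.
discovered = {
    info.name: importlib.import_module(info.name)
    for info in pkgutil.iter_modules(demo_plugins.__path__, demo_plugins.__name__ + ".")
}
print(sorted(discovered))
```

Because discovery is driven by __path__, a newly installed distribution that adds a module under the same namespace is picked up without any change to the host code.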
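Disadvantages of namespace packages
- No __init__.py functionality: without an __init__.py file you cannot initialize the package, define default imports, or set up package-level logic.
- Wildcard imports: import * does not pick up the package's modules, which can be inconvenient in some use cases.
- Runtime composition: the full package structure is resolved at runtime, which can cause subtle problems if directories are misconfigured.
- Harder debugging: because the parts of the package are spread across multiple locations, tracing problems can be more difficult.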
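When tracking down such problems, it helps to ask Python what it actually resolved. The helper below is a sketch, not a standard API: describe_package is a hypothetical name, and it relies on the fact that a namespace package has a __path__ but no backing __init__.py, so its __file__ is None on current Python versions.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def describe_package(name):
    """Report whether an importable package is a namespace package and where it lives."""
    module = importlib.import_module(name)
    locations = list(getattr(module, "__path__", []))
    is_namespace = bool(locations) and getattr(module, "__file__", None) is None
    return {"namespace": is_namespace, "locations": locations}

# Throwaway namespace portion used only to exercise the helper.
root = Path(tempfile.mkdtemp())
(root / "demo_debug_ns").mkdir()
(root / "demo_debug_ns" / "module1.py").write_text("VALUE = 1\n")
sys.path.append(str(root))
importlib.invalidate_caches()

ns_report = describe_package("demo_debug_ns")
regular_report = describe_package("json")   # json is a regular standard-library package
print(ns_report["namespace"], len(ns_report["locations"]))
print(regular_report["namespace"])
```

If a directory is missing from the reported locations, the usual cause is that its parent is not on sys.path, or that a stray __init__.py has turned one portion into a regular package that shadows the rest.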
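Use namespace packages for distributed, modular systems or frameworks. For simpler projects, or when compatibility with older Python versions is required (native namespace packages need Python 3.3 or later), stick with regular packages.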