Chapter 14: Project Path Utility

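Project

Overview

In this hands-on chapter, we build a tiny, allocator-friendly path helper that plays well with Zig's standard library and works across platforms. We develop it test-first, then provide a small CLI demo so you can see real output without the test harness. Along the way, we deliberately introduce a leak, watch Zig's testing allocator catch it, then fix the leak and verify the result.

The goal is not to replace std.fs.path, but to practice API design, test-driven development (TDD), and leak-safe allocation in a realistic, bite-sized utility. See 13__testing-and-leak-detection.xml and path.zig.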

Learning Goals

A Small API Surface

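We will implement five helper functions in the pathutil namespace:

  • joinAlloc(allocator, parts) → ![]u8: joins components with a single separator, preserving an absolute root
  • basename(path) → []const u8: the last component, ignoring trailing separators
  • dirpath(path) → []const u8: the directory portion, without trailing separators ("." for a bare name, "/" for the root)
  • extname(path) → []const u8: the extension of the last component without the dot, or "" if there is none
  • changeExtAlloc(allocator, path, new_ext) → ![]u8: a newly allocated path with the extension replaced by new_ext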

These functions favor predictable, teachable behavior; for production-grade edge cases, prefer std.fs.path.

Zig
const std = @import("std");

/// Tiny, allocator-friendly path utilities for didactic purposes.
/// Note: These do not attempt full platform semantics; they aim to be predictable
/// and portable for teaching. Prefer std.fs.path for production code.
pub const pathutil = struct {
    /// Join parts with exactly one separator between components.
    /// - Collapses duplicate separators at boundaries
    /// - Preserves a leading root (e.g. "/" on POSIX) if the first non-empty part starts with a separator
    /// - Does not resolve dot segments or drive letters
    pub fn joinAlloc(allocator: std.mem.Allocator, parts: []const []const u8) ![]u8 {
        var list: std.ArrayListUnmanaged(u8) = .{};
        defer list.deinit(allocator);

        const sep: u8 = std.fs.path.sep;
        var has_any: bool = false;

        for (parts) |raw| {
            if (raw.len == 0) continue;

            // Trim leading/trailing separators from this component
            var start: usize = 0;
            var end: usize = raw.len;
            while (start < end and isSep(raw[start])) start += 1;
            while (end > start and isSep(raw[end - 1])) end -= 1;

            const had_leading_sep = start > 0;
            const core = raw[start..end];

            if (!has_any) {
                if (had_leading_sep) {
                    // Preserve absolute root
                    try list.append(allocator, sep);
                    has_any = true;
                }
            } else {
                // Ensure exactly one separator between components if we have content already
                if (list.items.len == 0 or list.items[list.items.len - 1] != sep) {
                    try list.append(allocator, sep);
                }
            }

            if (core.len != 0) {
                try list.appendSlice(allocator, core);
                has_any = true;
            }
        }

        return list.toOwnedSlice(allocator);
    }

    /// Return the last path component. Trailing separators are ignored.
    /// Examples: "a/b/c" -> "c", "/a/b/" -> "b", "/" -> "/", "" -> "".
    pub fn basename(path: []const u8) []const u8 {
        if (path.len == 0) return path;

        // Skip trailing separators
        var end = path.len;
        while (end > 0 and isSep(path[end - 1])) end -= 1;
        if (end == 0) {
            // path was all separators; treat it as root
            return path[0..1];
        }

        // Find previous separator
        var i: isize = @intCast(end);
        while (i > 0) : (i -= 1) {
            if (isSep(path[@intCast(i - 1)])) break;
        }
        const start: usize = @intCast(i);
        return path[start..end];
    }

    /// Return the directory portion (without trailing separators).
    /// Examples: "a/b/c" -> "a/b", "a" -> ".", "/" -> "/".
    pub fn dirpath(path: []const u8) []const u8 {
        if (path.len == 0) return ".";

        // Skip trailing separators
        var end = path.len;
        while (end > 0 and isSep(path[end - 1])) end -= 1;
        if (end == 0) return path[0..1]; // all separators -> root

        // Find previous separator
        var i: isize = @intCast(end);
        while (i > 0) : (i -= 1) {
            const ch = path[@intCast(i - 1)];
            if (isSep(ch)) break;
        }
        if (i == 0) return ".";

        // Skip any trailing separators in the dir portion
        var d_end: usize = @intCast(i);
        while (d_end > 1 and isSep(path[d_end - 1])) d_end -= 1;
        if (d_end == 0) return path[0..1];
        return path[0..d_end];
    }

    /// Return the extension (without dot) of the last component or "" if none.
    /// Examples: "file.txt" -> "txt", "a.tar.gz" -> "gz", ".gitignore" -> "".
    pub fn extname(path: []const u8) []const u8 {
        const base = basename(path);
        if (base.len == 0) return base;
        if (base[0] == '.') {
            // A leading '.' marks a hidden file, not an extension. Search for the last dot
            // after it so that ".a.tar.gz" yields "gz", consistent with non-hidden names.
            if (std.mem.lastIndexOfScalar(u8, base[1..], '.')) |idx2| {
                const idx = 1 + idx2;
                if (idx + 1 < base.len) return base[(idx + 1)..];
                return "";
            } else return "";
        }
        if (std.mem.lastIndexOfScalar(u8, base, '.')) |idx| {
            if (idx + 1 < base.len) return base[(idx + 1)..];
        }
        return "";
    }

    /// Return a newly-allocated path with the extension replaced by `new_ext` (no dot).
    /// If there is no existing extension, appends one if `new_ext` is non-empty.
    pub fn changeExtAlloc(allocator: std.mem.Allocator, path: []const u8, new_ext: []const u8) ![]u8 {
        const base = basename(path);
        const dir = dirpath(path);
        const sep: u8 = std.fs.path.sep;

        var base_core = base;
        if (std.mem.lastIndexOfScalar(u8, base, '.')) |idx| {
            if (!(idx == 0 and base[0] == '.')) {
                base_core = base[0..idx];
            }
        }

        const need_dot = new_ext.len != 0;
        const dir_has = dir.len != 0 and !(dir.len == 1 and dir[0] == '.' and base.len == path.len);
        // Skip the joining separator when dir already ends with one (e.g. the root "/"),
        // so "/file.txt" becomes "/file.md" rather than "//file.md".
        const need_sep = dir_has and !isSep(dir[dir.len - 1]);
        // Compute the exact output length up front so we allocate once.
        var new_len: usize = 0;
        if (dir_has) new_len += dir.len;
        if (need_sep) new_len += 1;
        new_len += base_core.len;
        if (need_dot) new_len += 1 + new_ext.len;

        var out = try allocator.alloc(u8, new_len);
        errdefer allocator.free(out);

        var w: usize = 0;
        if (dir_has) {
            @memcpy(out[w..][0..dir.len], dir);
            w += dir.len;
        }
        if (need_sep) {
            out[w] = sep;
            w += 1;
        }
        @memcpy(out[w..][0..base_core.len], base_core);
        w += base_core.len;
        if (need_dot) {
            out[w] = '.';
            w += 1;
            @memcpy(out[w..][0..new_ext.len], new_ext);
            w += new_ext.len;
        }
        return out;
    }
};

inline fn isSep(ch: u8) bool {
    return ch == std.fs.path.sep or isOtherSep(ch);
}

inline fn isOtherSep(ch: u8) bool {
    // Be forgiving in parsing: treat both '/' and '\\' as separators on any platform
    // but only emit std.fs.path.sep when joining.
    return ch == '/' or ch == '\\';
}

For teaching purposes, we accept both '/' and '\' as separators when parsing on any platform, but always emit the native separator (std.fs.path.sep) when joining.
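The parsing half of this policy can be seen in isolation with a minimal, self-contained sketch. The names isAnySep and lastComponent are hypothetical stand-ins invented for illustration, not part of path_util.zig; they only mirror the idea of accepting either separator on input.

```zig
const std = @import("std");

// Hypothetical helper: accept both '/' and '\\' as separators when parsing.
fn isAnySep(ch: u8) bool {
    return ch == '/' or ch == '\\';
}

// Hypothetical helper: return the text after the last separator of either kind.
// Unlike basename in path_util.zig, this sketch does not skip trailing separators.
fn lastComponent(path: []const u8) []const u8 {
    var start: usize = 0;
    for (path, 0..) |ch, i| {
        if (isAnySep(ch)) start = i + 1;
    }
    return path[start..];
}

test "mixed separators are accepted on input" {
    try std.testing.expectEqualStrings("c.txt", lastComponent("a/b\\c.txt"));
    try std.testing.expectEqualStrings("c.txt", lastComponent("a\\b/c.txt"));
    try std.testing.expectEqualStrings("plain", lastComponent("plain"));
}
```

Being lenient on input and strict on output keeps the helpers usable with paths copied from either platform while still producing native-looking results.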

Try It: Run the Demo (Visible Output)

To keep output visible outside the test runner, here is a tiny CLI that calls our helpers and prints the results.

Zig
const std = @import("std");
const pathutil = @import("path_util.zig").pathutil;

pub fn main() !void {
    var out_buf: [2048]u8 = undefined;
    var out_writer = std.fs.File.stdout().writer(&out_buf);
    const out = &out_writer.interface;

    // Demonstrate join
    const j1 = try pathutil.joinAlloc(std.heap.page_allocator, &.{ "a", "b", "c" });
    defer std.heap.page_allocator.free(j1);
    try out.print("join a,b,c => {s}\n", .{j1});

    const j2 = try pathutil.joinAlloc(std.heap.page_allocator, &.{ "/", "usr/", "/bin" });
    defer std.heap.page_allocator.free(j2);
    try out.print("join /,usr/,/bin => {s}\n", .{j2});

    // Demonstrate basename/dirpath
    const p = "/home/user/docs/report.txt";
    try out.print("basename({s}) => {s}\n", .{ p, pathutil.basename(p) });
    try out.print("dirpath({s}) => {s}\n", .{ p, pathutil.dirpath(p) });

    // Extension helpers
    try out.print("extname({s}) => {s}\n", .{ p, pathutil.extname(p) });
    const changed = try pathutil.changeExtAlloc(std.heap.page_allocator, p, "md");
    defer std.heap.page_allocator.free(changed);
    try out.print("changeExt({s}, md) => {s}\n", .{ p, changed });

    try out.flush();
}
Run
Shell
$ zig run chapters-data/code/14__project-path-utility-tdd/path_util_demo.zig
Output
Shell
join a,b,c => a/b/c
join /,usr/,/bin => /usr/bin
basename(/home/user/docs/report.txt) => report.txt
dirpath(/home/user/docs/report.txt) => /home/user/docs
extname(/home/user/docs/report.txt) => txt
changeExt(/home/user/docs/report.txt, md) => /home/user/docs/report.md

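Test First: Encoding Behavior and Edge Cases

TDD helps clarify intent and lock down edge cases. We keep the tests small and fast; they run under Zig's testing allocator, which catches leaks by default. This chapter includes tests because the content plan calls for TDD; elsewhere, we favor zig run-style demos for visible output. See 13__testing-and-leak-detection.xml and testing.zig.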

Zig
const std = @import("std");
const testing = std.testing;
const pathutil = @import("path_util.zig").pathutil;

// Helper to allocate-join path parts with the testing allocator
fn ajoin(parts: []const []const u8) ![]u8 {
    return try pathutil.joinAlloc(testing.allocator, parts);
}

test "joinAlloc basic and absolute" {
    const p1 = try ajoin(&.{ "a", "b", "c" });
    defer testing.allocator.free(p1);
    try testing.expectEqualStrings("a" ++ [1]u8{std.fs.path.sep} ++ "b" ++ [1]u8{std.fs.path.sep} ++ "c", p1);

    const p2 = try ajoin(&.{ "/", "usr/", "/bin" });
    defer testing.allocator.free(p2);
    try testing.expectEqualStrings("/usr/bin", p2);

    const p3 = try ajoin(&.{ "", "a", "", "b" });
    defer testing.allocator.free(p3);
    try testing.expectEqualStrings("a" ++ [1]u8{std.fs.path.sep} ++ "b", p3);

    const p4 = try ajoin(&.{ "a/", "/b/" });
    defer testing.allocator.free(p4);
    try testing.expectEqualStrings("a" ++ [1]u8{std.fs.path.sep} ++ "b", p4);
}

test "basename and dirpath edges" {
    try testing.expectEqualStrings("c", pathutil.basename("a/b/c"));
    try testing.expectEqualStrings("b", pathutil.basename("/a/b/"));
    try testing.expectEqualStrings("/", pathutil.basename("////"));
    try testing.expectEqualStrings("", pathutil.basename(""));

    try testing.expectEqualStrings("a/b", pathutil.dirpath("a/b/c"));
    try testing.expectEqualStrings(".", pathutil.dirpath("a"));
    try testing.expectEqualStrings("/", pathutil.dirpath("////"));
}

test "extension and changeExtAlloc" {
    try testing.expectEqualStrings("txt", pathutil.extname("file.txt"));
    try testing.expectEqualStrings("gz", pathutil.extname("a.tar.gz"));
    try testing.expectEqualStrings("", pathutil.extname(".gitignore"));
    try testing.expectEqualStrings("", pathutil.extname("noext"));

    const changed1 = try pathutil.changeExtAlloc(testing.allocator, "a/b/file.txt", "md");
    defer testing.allocator.free(changed1);
    try testing.expectEqualStrings("a/b/file.md", changed1);

    const changed2 = try pathutil.changeExtAlloc(testing.allocator, "a/b/file", "md");
    defer testing.allocator.free(changed2);
    try testing.expectEqualStrings("a/b/file.md", changed2);

    const changed3 = try pathutil.changeExtAlloc(testing.allocator, "a/b/.profile", "txt");
    defer testing.allocator.free(changed3);
    try testing.expectEqualStrings("a/b/.profile.txt", changed3);
}
Run
Shell
$ zig test chapters-data/code/14__project-path-utility-tdd/path_util_test.zig
Output
Shell
All 3 tests passed.

Catch a Deliberate Leak → Fix It

The testing allocator flags leaks at the end of each test. First, a failing example that forgets to free:

Zig
const std = @import("std");
const testing = std.testing;
const pathutil = @import("path_util.zig").pathutil;

test "deliberate leak caught by testing allocator" {
    const joined = try pathutil.joinAlloc(testing.allocator, &.{ "/", "tmp", "demo" });
    // Intentionally forget to free: allocator leak should be detected by the runner
    // defer testing.allocator.free(joined);
    try testing.expect(std.mem.endsWith(u8, joined, "demo"));
}
Run (expected failure)
Shell
$ zig test chapters-data/code/14__project-path-utility-tdd/leak_demo_fail.zig
Output (excerpt)
[gpa] (err): memory address 0x… leaked:
… path_util.zig:49:33: … in joinAlloc
… leak_demo_fail.zig:6:42: … in test.deliberate leak caught by testing allocator

All 1 tests passed.
1 errors were logged.
1 tests leaked memory.
error: the following test command failed with exit code 1:
…/test --seed=0x…

Then fix it with defer and watch the test suite go green:

Zig
const std = @import("std");
const testing = std.testing;
const pathutil = @import("path_util.zig").pathutil;

test "fixed: no leak after adding defer free" {
    // After the fix: the defer frees the buffer, so no leak is reported.
    const joined = try pathutil.joinAlloc(testing.allocator, &.{ "/", "tmp", "demo" });
    defer testing.allocator.free(joined);
    try testing.expect(std.mem.endsWith(u8, joined, "demo"));
}
Run
Shell
$ zig test chapters-data/code/14__project-path-utility-tdd/leak_demo_fix.zig
Output
Shell
All 1 tests passed.

See 10__allocators-and-memory-management.xml and heap.zig.

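Notes & Caveats

  • For production path handling, consult std.fs.path for platform nuances (UNC paths, drive letters, special roots).
  • Prefer writing defer allocator.free(buf) immediately after a successful allocation; the buffer is then released on both the success and error paths by construction. See 04__errors-resource-cleanup.xml.
  • Prefer zig run examples when you need visible output (tutorials, demos), and zig test when you need guarantees (CI). This chapter demonstrates both because it is explicitly TDD-focused. See 13__testing-and-leak-detection.xml.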
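The cleanup advice above can be sketched as follows. makeGreeting and error.EmptyName are hypothetical, invented only for illustration: the function uses errdefer because ownership passes to the caller on success, while the caller uses defer because it owns the buffer until the end of its scope.

```zig
const std = @import("std");

// Hypothetical helper: returns an owned buffer, so it must free only on failure.
fn makeGreeting(allocator: std.mem.Allocator, name: []const u8) ![]u8 {
    const prefix = "hello, ";
    const buf = try allocator.alloc(u8, prefix.len + name.len);
    errdefer allocator.free(buf); // runs only if a later step returns an error

    // Checked after allocating purely to exercise the errdefer path.
    if (name.len == 0) return error.EmptyName;

    @memcpy(buf[0..prefix.len], prefix);
    @memcpy(buf[prefix.len..], name);
    return buf; // ownership passes to the caller
}

test "caller frees with defer; callee cleans up with errdefer" {
    const allocator = std.testing.allocator;

    const greeting = try makeGreeting(allocator, "zig");
    defer allocator.free(greeting); // placed right after the successful allocation
    try std.testing.expectEqualStrings("hello, zig", greeting);

    // The error path must not leak either; the testing allocator would report it.
    try std.testing.expectError(error.EmptyName, makeGreeting(allocator, ""));
}
```

Because the testing allocator checks for leaks when the test ends, removing either the defer or the errdefer in this sketch should make the test run report a leak.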

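Exercises

  • Extend joinAlloc to drop . segments and collapse intermediate .. pairs (take care near the root). Add tests for the edge cases, then demo the result with zig run.
  • Add stem(path), which returns the basename without its extension; verify the behavior for .gitignore, multi-dot names, and trailing dots.
  • Write a tiny CLI that accepts --change-ext md file1 file2 … and prints the results, using the page allocator and a buffered writer. See 28__filesystem-and-io.xml.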

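Alternatives & Edge Cases

  • On Windows, this teaching utility treats both '/' and '\' as input separators but always emits the native separator. std.fs.path has richer semantics if you need exact Windows behavior.
  • Allocation-failure handling: the demo uses std.heap.page_allocator and simply exits with the error returned from main if an allocation fails; the tests use std.testing.allocator to catch leaks systematically. See 10__allocators-and-memory-management.xml.
  • If you embed these helpers in a larger tool, thread the allocator through your API and keep ownership rules explicit; avoid global state. See 36__style-and-best-practices.xml.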
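As a rough illustration of threading the allocator through an API, the sketch below passes two different allocators to the same function. withSuffixAlloc is a hypothetical stand-in for any *Alloc helper, and the 4-byte std.heap.FixedBufferAllocator is used here only as an assumed, simple way to force error.OutOfMemory.

```zig
const std = @import("std");

// Hypothetical helper standing in for any *Alloc function: the allocator is a
// parameter, and the returned slice is owned by the caller.
fn withSuffixAlloc(allocator: std.mem.Allocator, name: []const u8, suffix: []const u8) ![]u8 {
    const out = try allocator.alloc(u8, name.len + suffix.len);
    @memcpy(out[0..name.len], name);
    @memcpy(out[name.len..], suffix);
    return out;
}

test "the same helper works with any allocator the caller chooses" {
    // Normal case: the testing allocator also checks for leaks.
    const ok = try withSuffixAlloc(std.testing.allocator, "report", ".md");
    defer std.testing.allocator.free(ok);
    try std.testing.expectEqualStrings("report.md", ok);

    // Failure case: a 4-byte fixed buffer cannot hold 9 bytes, so the helper
    // returns error.OutOfMemory for the caller to handle.
    var tiny: [4]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&tiny);
    try std.testing.expectError(
        error.OutOfMemory,
        withSuffixAlloc(fba.allocator(), "report", ".md"),
    );
}
```

Since the function never reaches for a global allocator, the caller decides both where memory comes from and how a failure is handled.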
