Drizzle Seed

IMPORTANT

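drizzle-seed can only be used with drizzle-orm@0.36.4 or higher. Lower versions may work at runtime but can have type issues and identity column issues, because the relevant patch was introduced in drizzle-orm@0.36.4.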

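drizzle-seed is a TypeScript library that helps you generate deterministic yet realistic fake data to populate your database. By leveraging a seedable pseudorandom number generator (pRNG), it ensures that the data you generate is consistent and reproducible across runs. This is especially useful for testing, development, and debugging.

What is deterministic data generation?

Deterministic data generation means that the same input always produces the same output. In the context of drizzle-seed, when you initialize the library with the same seed number, it generates the same sequence of fake data every time. This makes data sets predictable and repeatable.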

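Pseudorandom number generator (pRNG)

A pseudorandom number generator is an algorithm that produces a sequence of numbers approximating the properties of random numbers. However, because it starts from an initial value called a seed, you can control its randomness. Given the same seed, the pRNG produces the same sequence of numbers, which makes your data generation process reproducible.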
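The "same seed, same sequence" property can be illustrated with a small generator. The sketch below is a mulberry32-style pRNG written for this example only; the names createPrng and firstValues are hypothetical, and this is not the generator drizzle-seed uses internally.

```typescript
// Illustrative seedable pRNG (mulberry32-style).
// Assumption: this is NOT drizzle-seed's internal generator; it only
// demonstrates that a fixed seed yields a fixed sequence.
function createPrng(seed: number): () => number {
  let state = seed >>> 0; // keep the state as an unsigned 32-bit integer
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Draw the first `n` values from a generator created with `seed`.
function firstValues(seed: number, n: number): number[] {
  const next = createPrng(seed);
  return Array.from({ length: n }, () => next());
}

const runA = firstValues(12345, 5);
const runB = firstValues(12345, 5);

// true: two runs with the same seed produce identical sequences
console.log(runA.every((value, i) => value === runB[i]));
```

Because the sequence depends only on the seed, sharing the seed number is enough for two people to obtain the same values.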

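Benefits of using a pRNG:

Consistency: ensures your tests run on the same data every time.
Debugging: makes it easier to reproduce and fix bugs by providing a consistent data set.
Collaboration: team members can share seed numbers to work with the same data sets.

With drizzle-seed, you get the best of both worlds: the ability to generate realistic fake data and the control to reproduce it whenever needed.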

Installation

npm i drizzle-seed

yarn add drizzle-seed

pnpm add drizzle-seed

bun add drizzle-seed

Basic usage

In this example, we create 10 users with random names and IDs.

import { pgTable, integer, text } from "drizzle-orm/pg-core";
import { drizzle } from "drizzle-orm/node-postgres";
import { seed } from "drizzle-seed";

const users = pgTable("users", {
  id: integer().primaryKey(),
  name: text().notNull(),
});

async function main() {
  const db = drizzle(process.env.DATABASE_URL!);
  await seed(db, { users });
}

main();

Options

count

By default, the seed function creates 10 entities. If your tests need more, you can specify the number in the seed options object.

await seed(db, schema, { count: 1000 });

seed

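If you need a seed that generates a different set of values for all subsequent runs, you can define a different number in the seed option. Any new number generates a unique set of values.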

await seed(db, schema, { seed: 12345 });

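Reset database

With drizzle-seed, you can easily reset your database and seed it with new values, for example in your test suites.

// path to a file with the schema you want to reset
import * as schema from "./schema.ts";
import { drizzle } from "drizzle-orm/node-postgres";
import { reset } from "drizzle-seed";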

async function main() {
  const db = drizzle(process.env.DATABASE_URL!);
  await reset(db, schema);
}

main();

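Different dialects use different strategies for resetting the database.

For PostgreSQL, the drizzle-seed package generates a TRUNCATE statement with the CASCADE option to ensure that all tables are empty after running the reset function.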

TRUNCATE tableName1, tableName2, ... CASCADE;

For MySQL, the drizzle-seed package first disables FOREIGN_KEY_CHECKS to ensure the next step won't fail, and then generates TRUNCATE statements to empty the content of all tables.

SET FOREIGN_KEY_CHECKS = 0;
TRUNCATE tableName1;
TRUNCATE tableName2;
...
SET FOREIGN_KEY_CHECKS = 1;

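For SQLite, the drizzle-seed package first disables the foreign_keys pragma to ensure the next step won't fail, and then generates DELETE FROM statements to empty the content of all tables.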

PRAGMA foreign_keys = OFF;
DELETE FROM tableName1;
DELETE FROM tableName2;
...
PRAGMA foreign_keys = ON;
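The three strategies above can be put side by side in code. The following is only a sketch that builds the statement lists as described; buildResetStatements and the Dialect type are hypothetical names for this illustration and do not reflect how drizzle-seed is implemented internally. Table names are inserted as given, with no quoting or escaping.

```typescript
type Dialect = "postgresql" | "mysql" | "sqlite";

// Hypothetical helper: returns the reset statements described in the text
// for a given dialect. Assumption: table names are trusted identifiers.
function buildResetStatements(dialect: Dialect, tableNames: string[]): string[] {
  switch (dialect) {
    case "postgresql":
      // One statement for all tables; CASCADE also empties dependent tables.
      return [`TRUNCATE ${tableNames.join(", ")} CASCADE;`];
    case "mysql":
      // Foreign key checks are switched off so TRUNCATE cannot fail on references.
      return [
        "SET FOREIGN_KEY_CHECKS = 0;",
        ...tableNames.map((name) => `TRUNCATE ${name};`),
        "SET FOREIGN_KEY_CHECKS = 1;",
      ];
    case "sqlite":
      // SQLite has no TRUNCATE, so each table is emptied with DELETE FROM.
      return [
        "PRAGMA foreign_keys = OFF;",
        ...tableNames.map((name) => `DELETE FROM ${name};`),
        "PRAGMA foreign_keys = ON;",
      ];
  }
}

console.log(buildResetStatements("sqlite", ["users", "posts"]).join("\n"));
```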

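Refinements

If you need to change the behavior of the seed generator functions that drizzle-seed uses by default, you can specify your own implementation and even use your own list of values for the seeding process.

.refine is a callback that receives a list of all available generator functions from drizzle-seed. It should return an object whose keys are the tables you want to refine, defining their behavior as needed. Each table can specify several properties to simplify seeding your database:

columns: refine the default behavior of each column by specifying the required generator function.
count: specify the number of rows to insert into the database. The default is 10. If a global count is defined in the seed() options, the count defined here overrides it for this particular table.
with: define how many referenced entities to create for each parent table if you want to generate associated entities.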
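The precedence between the default, the global count, and a per-table count can be summarized in a few lines. The helper name resolveCount is hypothetical and is not part of drizzle-seed; this is only a sketch of the rule as stated above.

```typescript
// Default number of rows per table, as stated in the text.
const DEFAULT_COUNT = 10;

// Hypothetical helper: a per-table count wins over the global count,
// which in turn wins over the default of 10.
function resolveCount(globalCount?: number, tableCount?: number): number {
  return tableCount ?? globalCount ?? DEFAULT_COUNT;
}

console.log(resolveCount());         // 10   (nothing specified)
console.log(resolveCount(1000));     // 1000 (global count only)
console.log(resolveCount(1000, 20)); // 20   (table count overrides global)
```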

info

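You can also specify a weighted random distribution for the number of referenced values to create. For details on this API, see the Weighted Random section of the docs.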

API

await seed(db, schema).refine((f) => ({
  users: {
    columns: {},
    count: 10,
    with: {
        posts: 10
    }
  },
}));

Let's look at a few examples and explain what happens in each:

import { pgTable, integer, text } from "drizzle-orm/pg-core";

export const users = pgTable("users", {
  id: integer().primaryKey(),
  name: text().notNull(),
});

export const posts = pgTable("posts", {
  id: integer().primaryKey(),
  description: text(),
  userId: integer().references(() => users.id),
});

Example 1: seed only the users table with 20 entities, using refined seed logic for the name column.

import { drizzle } from "drizzle-orm/node-postgres";
import { seed } from "drizzle-seed";
import * as schema from './schema.ts'

async function main() {
  const db = drizzle(process.env.DATABASE_URL!);

  await seed(db, { users: schema.users }).refine((f) => ({
    users: {
        columns: {
            name: f.fullName(),
        },
        count: 20
    }
  }));
}

main();

Example 2: seed the users table with 20 entities and add 10 posts for each user by seeding the posts table and generating a reference from posts to users.

import { drizzle } from "drizzle-orm/node-postgres";
import { seed } from "drizzle-seed";
import * as schema from './schema.ts'

async function main() {
  const db = drizzle(process.env.DATABASE_URL!);

  await seed(db, schema).refine((f) => ({
    users: {
        count: 20,
        with: {
            posts: 10
        }
    }
  }));
}

main();

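Example 3: seed the users table with 5 entities and populate the database with 100 posts without connecting them to the users entities. Refine id generation for users so that it yields unique integers from 10000 to 20000, and refine posts to take values from a custom array.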

import { drizzle } from "drizzle-orm/node-postgres";
import { seed } from "drizzle-seed";
import * as schema from './schema.ts'

async function main() {
  const db = drizzle(process.env.DATABASE_URL!);

  await seed(db, schema).refine((f) => ({
    users: {
        count: 5,
        columns: {
            id: f.int({
              minValue: 10000,
              maxValue: 20000,
              isUnique: true,
            }),
        }
    },
    posts: {
        count: 100,
        columns: {
            description: f.valuesFromArray({
                values: [
                    "The sun set behind the mountains, painting the sky in hues of orange and purple",
                    "I can't believe how good this homemade pizza turned out!",
                    "Sometimes, all you need is a good book and a quiet corner.",
                    "Who else thinks rainy days are perfect for binge-watching old movies?",
                    "Tried a new hiking trail today and found the most amazing waterfall!",
                    // ...
                ],
            })
        }
    }
  }));
}

main();
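To make the isUnique constraint in Example 3 more concrete, here is one way unique integers in a range could be drawn from a seeded source. This is a sketch under stated assumptions: uniqueInts and lcg are hypothetical names, the generator is a simple linear congruential generator chosen for brevity, and none of this is claimed to be the algorithm drizzle-seed uses.

```typescript
// Small seeded generator (linear congruential) so the sketch is reproducible.
// Assumption: chosen for brevity, not taken from drizzle-seed.
function lcg(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296; // float in [0, 1)
  };
}

// Hypothetical helper: draw `count` distinct integers from [minValue, maxValue].
function uniqueInts(minValue: number, maxValue: number, count: number, seed: number): number[] {
  const rangeSize = maxValue - minValue + 1;
  if (count > rangeSize) {
    throw new Error(`cannot draw ${count} unique values from a range of ${rangeSize}`);
  }
  const next = lcg(seed);
  const seen = new Set<number>();
  // Rejection sampling: duplicates are ignored by the Set until enough values exist.
  while (seen.size < count) {
    seen.add(minValue + Math.floor(next() * rangeSize));
  }
  return Array.from(seen);
}

const userIds = uniqueInts(10000, 20000, 5, 1);
console.log(userIds.length); // 5
```

Rejection sampling is adequate when the requested count is small relative to the range, as it is here (5 values out of 10001); for counts close to the range size, shuffling the range would be the more predictable choice.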

IMPORTANT

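We will describe more possibilities in these docs over time, but for now you can browse a few sections of this documentation. Check the Generators section to get familiar with all the available generator functions.

A particularly useful feature is weighted randomization, which can be applied both to the generator values created for a column and to the number of related entities that drizzle-seed generates.

See the Weighted Random docs for more information.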

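Weighted Random

In some cases you may need several data sets with different priorities to be inserted into your database during the seeding stage. For such cases, drizzle-seed provides an API called weighted random.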
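As background, weighted selection is commonly implemented by walking the options and accumulating their weights until a random roll falls inside a bucket. The sketch below shows that general technique; pickWeighted and WeightedOption are hypothetical names, and this is not presented as drizzle-seed's implementation.

```typescript
interface WeightedOption<T> {
  weight: number; // probability of this option; weights are assumed to sum to 1
  value: T;
}

// Pick one option given `roll`, a number in [0, 1) from any (seeded) random source.
function pickWeighted<T>(options: WeightedOption<T>[], roll: number): T {
  if (options.length === 0) {
    throw new Error("at least one option is required");
  }
  let cumulative = 0;
  for (const option of options) {
    cumulative += option.weight;
    if (roll < cumulative) return option.value;
  }
  // Floating-point rounding can leave the final sum slightly below 1;
  // fall back to the last option in that case.
  return options[options.length - 1].value;
}

const priceRanges: WeightedOption<string>[] = [
  { weight: 0.3, value: "10-100" },
  { weight: 0.7, value: "100-300" },
];

console.log(pickWeighted(priceRanges, 0.1)); // "10-100"  (0.1 < 0.3)
console.log(pickWeighted(priceRanges, 0.5)); // "100-300" (0.3 <= 0.5 < 1.0)
```

Feeding the roll from a seeded generator keeps the weighted choice reproducible across runs.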

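There are two places in the Drizzle Seed package where weighted random can be used:

Column refinements inside each table
The with property, which determines the number of related entities to create

Let's look at an example covering both cases: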

import { pgTable, integer, text, varchar, doublePrecision } from "drizzle-orm/pg-core";

export const orders = pgTable(
  "orders",
  {
    id: integer().primaryKey(),
    name: text().notNull(),
    quantityPerUnit: varchar().notNull(),
    unitPrice: doublePrecision().notNull(),
    unitsInStock: integer().notNull(),
    unitsOnOrder: integer().notNull(),
    reorderLevel: integer().notNull(),
    discontinued: integer().notNull(),
  }
);

export const details = pgTable(
  "details",
  {
    unitPrice: doublePrecision().notNull(),
    quantity: integer().notNull(),
    discount: doublePrecision().notNull(),

    orderId: integer()
      .notNull()
      .references(() => orders.id, { onDelete: "cascade" }),
  }
);

Example 1: refine the unitPrice generation logic to generate 5000 random prices, with a 30% chance of a price between 10 and 100 and a 70% chance of a price between 100 and 300.

import { drizzle } from "drizzle-orm/node-postgres";
import { seed } from "drizzle-seed";
import * as schema from './schema.ts'

async function main() {
  const db = drizzle(process.env.DATABASE_URL!);

  await seed(db, schema).refine((f) => ({
    orders: {
       count: 5000,
       columns: {
           unitPrice: f.weightedRandom(
               [
                   {
                       weight: 0.3,
                       value: f.int({ minValue: 10, maxValue: 100 })
                   },
                   {
                       weight: 0.7,
                       value: f.number({ minValue: 100, maxValue: 300, precision: 100 })
                   }
               ]
           ),
       }
    }
  }));
}

main();

Example 2: for each order, generate 1 to 3 details with a probability of 60%, 5 to 7 details with a probability of 30%, and 8 to 10 details with a probability of 10%.

import { drizzle } from "drizzle-orm/node-postgres";
import { seed } from "drizzle-seed";
import * as schema from './schema.ts'

async function main() {
  const db = drizzle(process.env.DATABASE_URL!);

  await seed(db, schema).refine((f) => ({
    orders: {
       with: {
           details:
               [
                   { weight: 0.6, count: [1, 2, 3] },
                   { weight: 0.3, count: [5, 6, 7] },
                   { weight: 0.1, count: [8, 9, 10] },
               ]
       }
    }
  }));
}

main();
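It can help to estimate how many rows such a weighted configuration produces. The sketch below computes the expected number of details per order for the weights in Example 2, assuming that each number inside a bucket's count array is equally likely (an assumption, not something the text above states). The names expectedCount and WeightedCount are hypothetical.

```typescript
interface WeightedCount {
  weight: number;  // probability of choosing this bucket
  count: number[]; // candidate counts inside the bucket
}

// Expected value, ASSUMING a uniform pick among the numbers in each bucket.
function expectedCount(buckets: WeightedCount[]): number {
  return buckets.reduce((total, bucket) => {
    const mean = bucket.count.reduce((sum, n) => sum + n, 0) / bucket.count.length;
    return total + bucket.weight * mean;
  }, 0);
}

const detailsPerOrder: WeightedCount[] = [
  { weight: 0.6, count: [1, 2, 3] },
  { weight: 0.3, count: [5, 6, 7] },
  { weight: 0.1, count: [8, 9, 10] },
];

// 0.6 * 2 + 0.3 * 6 + 0.1 * 9 = 3.9
console.log(expectedCount(detailsPerOrder).toFixed(1)); // "3.9"
```

Under that assumption, the default of 10 orders would yield roughly 39 detail rows on average.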

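Complex example

main.ts

import { drizzle } from "drizzle-orm/node-postgres";
import { seed } from "drizzle-seed";
import * as schema from "./schema.ts";

const main = async () => {
    // Connection string comes from the environment, as in the earlier examples.
    const db = drizzle(process.env.DATABASE_URL!);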
    const titlesOfCourtesy = ["Ms.", "Mrs.", "Dr."];
    const unitsOnOrders = [0, 10, 20, 30, 50, 60, 70, 80, 100];
    const reorderLevels = [0, 5, 10, 15, 20, 25, 30];
    const quantityPerUnit = [
        "100 - 100 g pieces",
        "100 - 250 g bags",
        "10 - 200 g glasses",
        "10 - 4 oz boxes",
        "10 - 500 g pkgs.",
        "10 - 500 g pkgs."
    ];
    const discounts = [0.05, 0.15, 0.2, 0.25];

    await seed(db, schema).refine((funcs) => ({
        customers: {
            count: 10000,
            columns: {
                companyName: funcs.companyName(),
                contactName: funcs.fullName(),
                contactTitle: funcs.jobTitle(),
                address: funcs.streetAddress(),
                city: funcs.city(),
                postalCode: funcs.postcode(),
                region: funcs.state(),
                country: funcs.country(),
                phone: funcs.phoneNumber({ template: "(###) ###-####" }),
                fax: funcs.phoneNumber({ template: "(###) ###-####" })
            }
        },
        employees: {
            count: 200,
            columns: {
                firstName: funcs.firstName(),
                lastName: funcs.lastName(),
                title: funcs.jobTitle(),
                titleOfCourtesy: funcs.valuesFromArray({ values: titlesOfCourtesy }),
                birthDate: funcs.date({ minDate: "2010-12-31", maxDate: "2010-12-31" }),
                hireDate: funcs.date({ minDate: "2010-12-31", maxDate: "2024-08-26" }),
                address: funcs.streetAddress(),
                city: funcs.city(),
                postalCode: funcs.postcode(),
                country: funcs.country(),
                homePhone: funcs.phoneNumber({ template: "(###) ###-####" }),
                extension: funcs.int({ minValue: 428, maxValue: 5467 }),
                notes: funcs.loremIpsum()
            }
        },
        orders: {
            count: 50000,
            columns: {
                shipVia: funcs.int({ minValue: 1, maxValue: 3 }),
                freight: funcs.number({ minValue: 0, maxValue: 1000, precision: 100 }),
                shipName: funcs.streetAddress(),
                shipCity: funcs.city(),
                shipRegion: funcs.state(),
                shipPostalCode: funcs.postcode(),
                shipCountry: funcs.country()
            },
            with: {
                details:
                    [
                        { weight: 0.6, count: [1, 2, 3, 4] },
                        { weight: 0.2, count: [5, 6, 7, 8, 9, 10] },
                        { weight: 0.15, count: [11, 12, 13, 14, 15, 16, 17] },
                        { weight: 0.05, count: [18, 19, 20, 21, 22, 23, 24, 25] },
                    ]
            }
        },
        suppliers: {
            count: 1000,
            columns: {
                companyName: funcs.companyName(),
                contactName: funcs.fullName(),
                contactTitle: funcs.jobTitle(),
                address: funcs.streetAddress(),
                city: funcs.city(),
                postalCode: funcs.postcode(),
                region: funcs.state(),
                country: funcs.country(),
                phone: funcs.phoneNumber({ template: "(###) ###-####" })
            }
        },
        products: {
            count: 5000,
            columns: {
                name: funcs.companyName(),
                quantityPerUnit: funcs.valuesFromArray({ values: quantityPerUnit }),
                unitPrice: funcs.weightedRandom(
                    [
                        {
                            weight: 0.5,
                            value: funcs.int({ minValue: 3, maxValue: 300 })
                        },
                        {
                            weight: 0.5,
                            value: funcs.number({ minValue: 3, maxValue: 300, precision: 100 })
                        }
                    ]
                ),
                unitsInStock: funcs.int({ minValue: 0, maxValue: 125 }),
                unitsOnOrder: funcs.valuesFromArray({ values: unitsOnOrders }),
                reorderLevel: funcs.valuesFromArray({ values: reorderLevels }),
                discontinued: funcs.int({ minValue: 0, maxValue: 1 })
            }
        },
        details: {
            columns: {
                unitPrice: funcs.number({ minValue: 10, maxValue: 130 }),
                quantity: funcs.int({ minValue: 1, maxValue: 130 }),
                discount: funcs.weightedRandom(
                    [
                        { weight: 0.5, value: funcs.valuesFromArray({ values: discounts }) },
                        { weight: 0.5, value: funcs.default({ defaultValue: 0 }) }
                    ]
                )
            }
        }
    }));
}

main();

schema.ts

import type { AnyPgColumn } from "drizzle-orm/pg-core";
import { integer, numeric, pgTable, text, timestamp, varchar } from "drizzle-orm/pg-core";

export const customers = pgTable('customer', {
	id: varchar({ length: 256 }).primaryKey(),
	companyName: text().notNull(),
	contactName: text().notNull(),
	contactTitle: text().notNull(),
	address: text().notNull(),
	city: text().notNull(),
	postalCode: text(),
	region: text(),
	country: text().notNull(),
	phone: text().notNull(),
	fax: text(),
});

export const employees = pgTable(
	'employee',
	{
		id: integer().primaryKey(),
		lastName: text().notNull(),
		firstName: text(),
		title: text().notNull(),
		titleOfCourtesy: text().notNull(),
		birthDate: timestamp().notNull(),
		hireDate: timestamp().notNull(),
		address: text().notNull(),
		city: text().notNull(),
		postalCode: text().notNull(),
		country: text().notNull(),
		homePhone: text().notNull(),
		extension: integer().notNull(),
		notes: text().notNull(),
		reportsTo: integer().references((): AnyPgColumn => employees.id),
		photoPath: text(),
	},
);

export const orders = pgTable('order', {
	id: integer().primaryKey(),
	orderDate: timestamp().notNull(),
	requiredDate: timestamp().notNull(),
	shippedDate: timestamp(),
	shipVia: integer().notNull(),
	freight: numeric().notNull(),
	shipName: text().notNull(),
	shipCity: text().notNull(),
	shipRegion: text(),
	shipPostalCode: text(),
	shipCountry: text().notNull(),

	customerId: text().notNull().references(() => customers.id, { onDelete: 'cascade' }),

	employeeId: integer().notNull().references(() => employees.id, { onDelete: 'cascade' }),
});

export const suppliers = pgTable('supplier', {
	id: integer().primaryKey(),
	companyName: text().notNull(),
	contactName: text().notNull(),
	contactTitle: text().notNull(),
	address: text().notNull(),
	city: text().notNull(),
	region: text(),
	postalCode: text().notNull(),
	country: text().notNull(),
	phone: text().notNull(),
});

export const products = pgTable('product', {
	id: integer().primaryKey(),
	name: text().notNull(),
	quantityPerUnit: text().notNull(),
	unitPrice: numeric().notNull(),
	unitsInStock: integer().notNull(),
	unitsOnOrder: integer().notNull(),
	reorderLevel: integer().notNull(),
	discontinued: integer().notNull(),

	supplierId: integer().notNull().references(() => suppliers.id, { onDelete: 'cascade' }),
});

export const details = pgTable('order_detail', {
	unitPrice: numeric().notNull(),
	quantity: integer().notNull(),
	discount: numeric().notNull(),

	orderId: integer().notNull().references(() => orders.id, { onDelete: 'cascade' }),

	productId: integer().notNull().references(() => products.id, { onDelete: 'cascade' }),
});

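Limitations

Type limitations for `with`

Due to certain TypeScript limitations and the current Drizzle API, references between tables cannot be inferred properly, especially when there are circular dependencies between tables.

This means the with option will display all tables in the schema, and you will need to manually select the ones that have a one-to-many relationship.

warning

The with option works for one-to-many relationships. For example, if you have one user and many posts, you can use users with posts, but you cannot use posts with users.

Type limitations for the third parameter in Drizzle tables:

Currently there is no type support for the third parameter in Drizzle tables. It works at runtime, but it does not function correctly at the type level.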