
feat(new-compiler): migrate metadata storage from lockfile to LMDB#1955

Open
AndreyHirsa wants to merge 7 commits into main from feat/metadata-lmdb-storage

Conversation

AndreyHirsa (Contributor) commented Feb 3, 2026

Summary

Migrate metadata storage from lockfile-guarded JSON files to an LMDB database for improved concurrent access and to eliminate ELOCKED errors.

Changes

  • Replace proper-lockfile with LMDB: Use lmdb-js for metadata storage with MVCC (Multi-Version Concurrency Control) for safe concurrent access across bundler workers (see the sketch after this list)
  • Remove proper-lockfile dependency: No longer needed with LMDB's built-in concurrency handling
  • Update related docs
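To make the MVCC point above concrete, here is a minimal lmdb-js sketch (the path and entry shape are illustrative, not the PR's exact code): each bundler worker opens the same database directory, reads never block, and LMDB serializes writes internally, so no application-level lockfile is needed.

import { open } from "lmdb";

// Any worker process can open the same directory independently.
const db = open({ path: ".lingo/metadata-dev", compression: true });

// Batch related writes into a single commit; a lone put() would also be safe,
// just committed through the async write queue.
db.transactionSync(() => {
  db.put("a1b2c3", { hash: "a1b2c3", sourceText: "Hello" });
});

await db.close();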

Testing

Business logic tests added:

  • Basic operations with db
  • Stats accuracy tracking
  • Cleanup logic
  • Error handling
  • All tests pass locally

Visuals

N/A - No UI changes

Checklist

  • Changeset added
  • Tests cover business logic
  • No breaking changes (or documented below)

Closes #1636

Summary by CodeRabbit

  • New Features

    • Metadata storage migrated from a JSON file to an LMDB-backed database with separate dev/build metadata directories and short-lived access.
    • Server runtime API updated to accept translation hashes and return { t, locale, translations }.
  • Documentation

    • Compiler, architecture, transformation pipeline, and runtime docs updated to reflect new storage layout and APIs.
  • Tests

    • Added comprehensive metadata tests covering creation, save/load, cleanup, large batches, and concurrency-like scenarios.
  • Chores

    • Removed file-locking dependency in favor of LMDB-backed persistence.

coderabbitai bot (Contributor) commented Feb 3, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Replaces JSON+file-lock metadata with LMDB-backed storage, removing the MetadataManager class and proper-lockfile; adds stateless functions (loadMetadata, saveMetadata, cleanupExistingMetadata, getMetadataPath), new .lingo/metadata-dev and .lingo/metadata-build locations, and tests/docs updates.
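As a rough illustration of the directory layout above, the path helper can be thought of as follows (a hypothetical reconstruction; the real getMetadataPath takes the config with sourceRoot, lingoDir, and environment described later in this thread):

import path from "node:path";

// Dev and build metadata live in separate LMDB directories so `dev` and
// `build` runs never share an environment.
function getMetadataPath(config: {
  sourceRoot: string;
  lingoDir: string; // e.g. ".lingo"
  environment: "dev" | "build";
}): string {
  return path.join(
    config.sourceRoot,
    config.lingoDir,
    `metadata-${config.environment}`,
  );
}

// getMetadataPath({ sourceRoot: ".", lingoDir: ".lingo", environment: "dev" })
// resolves to ".lingo/metadata-dev".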

Changes

  • Metadata Core
    Files: packages/new-compiler/src/metadata/manager.ts, packages/new-compiler/src/metadata/manager.test.ts, packages/new-compiler/src/types.ts
    Summary: Switched metadata from a JSON file to an LMDB DB; replaced the class-based MetadataManager with exported functions loadMetadata, saveMetadata, cleanupExistingMetadata, and getMetadataPath; added LMDB lifecycle helpers, transactional writes, cleanup behavior, and comprehensive tests; changed MetadataSchema to Record<string, TranslationEntry>.
  • Plugin Integration & Runtime
    Files: packages/new-compiler/src/plugin/unplugin.ts, packages/new-compiler/src/plugin/next-compiler-loader.ts, packages/new-compiler/src/plugin/next.ts, packages/new-compiler/src/plugin/build-translator.ts
    Summary: Replaced MetadataManager usage with direct saveMetadata calls and updated imports/usages; adjusted metadata shape handling (flat map vs. entries object); minor formatting tweaks.
  • Translation Service & Translators
    Files: packages/new-compiler/src/translators/translation-service.ts, packages/new-compiler/src/translators/pluralization/service.ts, packages/new-compiler/src/translation-server/translation-server.ts
    Summary: Updated code paths to treat metadata as a flat map (Object.keys(metadata) / metadata[hash]) instead of metadata.entries; adapted related loops, lookups, and initialization to the new schema.
  • Docs & Architecture
    Files: packages/new-compiler/README.md, packages/new-compiler/docs/TRANSLATION_ARCHITECTURE.md, packages/new-compiler/src/plugin/transform/TRANSFORMATION_PIPELINE.md, packages/new-compiler/src/translation-server/README.md
    Summary: Replaced references to .lingo/metadata.json with LMDB directories (.lingo/metadata-dev/, .lingo/metadata-build/); documented the new LMDB-backed APIs and runtime shape changes (e.g., getServerTranslations usage examples).
  • Tests & Test Docs
    Files: packages/new-compiler/src/plugin/transform/TESTING.md
    Summary: Updated test expectations to check for LMDB metadata directories instead of a single metadata.json file.
  • Dependencies & Changeset
    Files: .changeset/common-teeth-reply.md, packages/new-compiler/package.json
    Summary: Added the lmdb dependency, removed proper-lockfile (and its types), and added a patch-level changeset file.
  • Transformation Pipeline Docs
    Files: packages/new-compiler/src/plugin/transform/TRANSFORMATION_PIPELINE.md
    Summary: Documented the data model migration and concurrency notes (LMDB multi-reader, single-writer); updated the runtime fetch pattern to accept hashes and return { t, locale, translations }.
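For orientation, the updated runtime shape could be consumed like this (a sketch only; the argument packaging and the declared signature are assumptions, while the { t, locale, translations } result comes from the summary above):

// Assumed signature, for illustration only.
declare function getServerTranslations(args: {
  locale: string;
  hashes: string[];
}): Promise<{
  t: (hash: string) => string;
  locale: string;
  translations: Record<string, string>;
}>;

const { t, locale, translations } = await getServerTranslations({
  locale: "de",
  hashes: ["a1b2c3d4", "e5f6a7b8"], // translation hashes emitted by the compiler
});

// `translations` maps hash -> translated string for the resolved locale.
console.log(locale, translations["a1b2c3d4"], t("e5f6a7b8"));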

Sequence Diagram

sequenceDiagram
    participant Compiler as Compiler (Plugin)
    participant SaveFunc as saveMetadata(dbPath, entries)
    participant LoadFunc as loadMetadata(dbPath)
    participant DB as LMDB Database

    rect rgba(100,150,200,0.5)
    Note over Compiler,SaveFunc: Save metadata flow
    Compiler->>SaveFunc: saveMetadata(dbPath, entries)
    SaveFunc->>DB: openDatabase(dbPath)
    DB-->>SaveFunc: db handle
    SaveFunc->>DB: write entries (transaction)
    SaveFunc->>DB: update __stats__ key
    DB-->>SaveFunc: commit
    SaveFunc->>DB: close connection
    SaveFunc-->>Compiler: confirmation
    end

    rect rgba(150,200,100,0.5)
    Note over Compiler,LoadFunc: Load metadata flow
    Compiler->>LoadFunc: loadMetadata(dbPath)
    LoadFunc->>DB: openDatabase(dbPath)
    DB-->>LoadFunc: db handle
    LoadFunc->>DB: read all entries & __stats__
    DB-->>LoadFunc: entries + stats
    LoadFunc->>DB: close connection
    LoadFunc-->>Compiler: MetadataSchema
    end
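Read alongside the diagram, the two flows reduce to roughly the following sketch (illustrative only: the TranslationEntry type is simplified here, and the __stats__ bookkeeping shown in the diagram was dropped later in the review):

import { open } from "lmdb";

type TranslationEntry = { hash: string; sourceText: string };

const STATS_KEY = "__stats__";

export async function saveMetadata(dbPath: string, entries: TranslationEntry[]) {
  const db = open({ path: dbPath, compression: true });
  try {
    // Batch all puts into a single commit.
    db.transactionSync(() => {
      for (const entry of entries) db.put(entry.hash, entry);
      db.put(STATS_KEY, { lastUpdated: new Date().toISOString() });
    });
  } finally {
    await db.close(); // short-lived connection, released before returning
  }
}

export async function loadMetadata(dbPath: string) {
  const db = open({ path: dbPath, compression: true });
  try {
    const entries: Record<string, TranslationEntry> = {};
    for (const { key, value } of db.getRange()) {
      if (key !== STATS_KEY) entries[String(key)] = value as TranslationEntry;
    }
    return entries;
  } finally {
    await db.close();
  }
}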

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • vrcprl

Poem

🐰 Hopped from files to LMDB delight,
Short-lived connections, transactions tight.
No locks to burrow, just keys that gleam,
Metadata snug in a speedy stream.
Translation carrots grown in a dream 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 3
❌ Failed checks (2 warnings, 1 inconclusive)

  • Linked Issues check (⚠️ Warning): Issue #1636 describes a 'traverse is not a function' error with React Router v7, but this PR only addresses metadata storage migration to LMDB and does not appear to fix the traverse issue mentioned in the linked issue. Resolution: verify that this PR actually resolves issue #1636 or clarify the connection; if unrelated, consider removing the issue link or splitting into separate PRs.
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 38.89%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Description check (❓ Inconclusive): The description follows the template with Summary, Changes, Testing, and Checklist sections, but the 'No breaking changes' checkbox is left unchecked despite the PR changing the MetadataSchema structure from an interface to a type alias, which is a significant API change. Resolution: clarify whether the MetadataSchema change constitutes a breaking change and document any migration steps for consumers of this API.

✅ Passed checks (2 passed)

  • Title check (✅ Passed): The title accurately describes the main change: migrating metadata storage from lockfile (proper-lockfile) to LMDB, which is the primary objective of this PR.
  • Out of Scope Changes check (✅ Passed): All changes are focused on migrating metadata storage from file-based JSON with lockfile to LMDB; no out-of-scope changes detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/metadata-lmdb-storage

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@packages/new-compiler/src/metadata/manager.test.ts`:
- Around line 309-314: The test is incorrectly treating the synchronous
MetadataManager constructor as async; update the "should throw descriptive error
for invalid path" test to directly assert the synchronous throw by replacing the
await expect(async () => new MetadataManager(invalidPath)).rejects.toThrow()
pattern with a synchronous assertion such as expect(() => new
MetadataManager(invalidPath)).toThrow() (or toThrowError/... if you want to
assert message content), referencing the MetadataManager constructor in that
test.
🧹 Nitpick comments (4)
packages/new-compiler/src/metadata/manager.ts (1)

205-226: Avoid O(n) full scans when updating stats.
Counting all keys on every save scales with database size and can slow large builds. Consider incrementally updating totalEntries per write inside the transaction.

♻️ Suggested refactor
-      for (const entry of entries) {
-        this.db.put(entry.hash, entry);
-      }
-
-      // Count entries explicitly (excluding stats key) for clarity
-      let entryCount = 0;
-      for (const { key } of this.db.getRange()) {
-        if (key !== STATS_KEY) {
-          entryCount++;
-        }
-      }
-
-      const stats = {
-        totalEntries: entryCount,
-        lastUpdated: new Date().toISOString(),
-      };
+      const prevStats =
+        (this.db.get(STATS_KEY) as MetadataSchema["stats"] | undefined) ?? {
+          totalEntries: 0,
+          lastUpdated: new Date().toISOString(),
+        };
+      let entryCount = prevStats.totalEntries;
+
+      for (const entry of entries) {
+        const existed = this.db.get(entry.hash) !== undefined;
+        this.db.put(entry.hash, entry);
+        if (!existed) entryCount++;
+      }
+
+      const stats = {
+        totalEntries: entryCount,
+        lastUpdated: new Date().toISOString(),
+      };
packages/new-compiler/src/metadata/manager.test.ts (3)

106-108: Awaiting a synchronous function is unnecessary.

cleanupExistingMetadata returns void (synchronous), so await is a no-op here. While this doesn't break functionality, it's misleading and inconsistent with the function signature.

Suggested fix
   afterEach(async () => {
-    await cleanupExistingMetadata(testDbPath);
+    cleanupExistingMetadata(testDbPath);
   });

380-384: Avoid returning values from forEach callbacks.

worker.kill() returns a boolean, triggering the Biome lint warning. Use for...of loop instead for cleaner semantics when you don't need the return values.

Suggested fix
       } finally {
-        workers.forEach((w) => w.kill());
+        for (const w of workers) {
+          w.kill();
+        }
       }

237-258: Same issue: awaiting synchronous cleanupExistingMetadata calls.

Lines 243, 303, 304, and 357 also await the synchronous cleanupExistingMetadata function. Consider removing await from all these call sites for consistency.

Also applies to: 286-307

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@packages/new-compiler/src/metadata/manager.test.ts`:
- Around line 213-217: The test "should throw descriptive error for invalid
path" is flaky because it uses a hard-coded /root/... path; change it to create
a guaranteed-failing target using the tmp directory: create a temporary file
(e.g., via fs.writeFileSync or fs.mkdtempSync under os.tmpdir()) and pass that
file path to loadMetadata so mkdirSync will deterministically fail; update the
test code around the it block (the test function invoking loadMetadata) to
create and clean up the temp file and assert that loadMetadata(tmpFilePath)
rejects.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@packages/new-compiler/src/plugin/transform/TRANSFORMATION_PIPELINE.md`:
- Around line 164-167: Replace the unsupported hard numbers and the "built-in"
compression claim in the TRANSFORMATION_PIPELINE.md bullet list: remove or
replace the "~1M ops/sec write speed vs ~50K for SQLite" line with a
conservative, sourced or generic statement such as "orders of magnitude faster
than SQLite for non-transactional writes" (or include specific benchmark
citations if you prefer exact figures), and change "Built-in LZ4 compression" to
"Optional LZ4 compression (enabled in this implementation)" to reflect that
compression is not enabled by default in lmdb-js; update the bullets containing
the exact strings "~1M ops/sec write speed vs ~50K for SQLite" and "Built-in LZ4
compression" accordingly.
🧹 Nitpick comments (2)
packages/new-compiler/src/plugin/transform/TRANSFORMATION_PIPELINE.md (2)

204-204: Clarify the concurrency model for bundler workers.

The statement "multi-reader, single-writer" is accurate for LMDB's MVCC model, but needs clarification in the context of multiple bundler workers. The PR summary mentions "MVCC for safe concurrent access across bundler workers," but:

  • How does LMDB handle concurrent write attempts from multiple worker processes?
  • Are writes automatically serialized, or do workers need to retry on conflicts?
  • Does each worker process get its own write transaction, or is there cross-process coordination?

Consider adding a sentence or two explaining how concurrent writes from multiple bundler workers are handled to help readers understand the practical implications.


268-268: Consider clarifying the distinction between metadata and cache storage.

The document mentions two different storage locations:

  • Metadata storage: .lingo/metadata-{env}/ (LMDB database for source strings - Line 160)
  • Translation cache: .lingo/cache/{locale}.json (JSON files for translated strings - Line 268)

While this separation makes sense architecturally, readers might benefit from a brief clarification about when each is used. Consider adding a note explaining that:

  • Metadata stores source strings and context extracted at build time
  • Cache stores the actual translations generated for each locale at runtime

This would help readers understand why there are two different storage systems.

Comment on lines +164 to +167
- ~1M ops/sec write speed vs ~50K for SQLite
- Zero-copy reads from memory-mapped files
- Built-in LZ4 compression
- Simple key-value API without SQL overhead
coderabbitai bot (Contributor) commented Feb 6, 2026

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

lmdb-js LZ4 compression support documentation

💡 Result:

lmdb-js supports optional off-thread LZ4 compression for values written via its async write path. It’s off by default and is enabled with the compression option when opening the DB. [1]

Enable compression

import { open } from 'lmdb';

const db = open('my-db', { compression: true });

[1]

Configure compression

import fs from 'node:fs';
import { open } from 'lmdb';

const db = open('my-db', {
  compression: {
    threshold: 500,                 // only compress values > 500 bytes
    dictionary: fs.readFileSync('dict.txt') // optional shared dictionary (Buffer)
  }
});

Supported settings documented:

  • threshold (default 1000 when compression is enabled)
  • dictionary (Buffer; defaults to an internal shared dictionary) [1][2]

Important notes from the docs

  • Decompression happens automatically on reads (including get / getRange). [1][2]
  • If you enable compression for a DB, you must always open it with the same compression setting so data can be decompressed correctly. [1]
  • Child DBs opened from a root DB inherit the root DB’s compression settings by default. [1]

Sources: lmdb-js README “Compression” section [1] and the lmdb npm “Compression” docs [2].

References:
[1] https://github.com/kriszyp/lmdb-js (README → “Compression”)
[2] https://www.npmjs.com/package/lmdb (Compression section)


🌐 Web query:

LMDB write performance benchmark SQLite comparison ops per second

💡 Result:

Benchmarks that report write throughput (ops/sec) for LMDB vs SQLite

1) LumoSQL speedtest-style benchmark (SQLite B-tree vs “SQLite+LMDB” storage backend)

LumoSQL publishes example output (timings) for the same benchmark run across targets, including vanilla SQLite and SQLite built with an LMDB backend. From the example table: “1000 INSERTs” and “25000 INSERTs in a transaction” times are shown for each target. Converting those to inserts/sec:

Test (from LumoSQL example) | SQLite B-tree time | SQLite B-tree ops/sec | SQLite+LMDB time | SQLite+LMDB ops/sec
1000 INSERTs | 2.789 s | ~359 ops/s | 1.484–1.429 s | ~674–700 ops/s
25000 INSERTs in a transaction | 0.085 s | ~294k ops/s | 0.094–0.096 s | ~266k–260k ops/s

Source timings: LumoSQL benchmark-filter example output. [1]

Key takeaway from this specific example: LMDB-backend helps a lot for many small autocommit INSERTs, while batched INSERTs inside one transaction are already extremely fast on SQLite’s default B-tree and are in the same ballpark (or faster in that sample) than SQLite+LMDB. [1]


2) LMDB-only “typical comparison” numbers (contextual reference)

Some LMDB docs/libraries cite typical insert rates such as ~350k random inserts/sec and ~580k sequential inserts/sec (workload- and hardware-dependent). This is not a SQLite comparison, but it provides an order-of-magnitude reference for LMDB writes. [2]


Notes for interpreting “ops/sec”

  • SQLite write throughput varies wildly depending on whether each INSERT is its own transaction (slow) vs explicit batching in a transaction / WAL / sync settings (often orders-of-magnitude faster). LumoSQL’s example illustrates this directly (hundreds ops/s vs hundreds of thousands ops/s). [1]

Sources:
[1] LumoSQL “benchmark-filter” example output showing per-test times for SQLite vs SQLite+LMDB targets.
[2] zlmdb LMDB examples page showing “Random Insert” and “Sequential Insert” ops/sec figures.


🌐 Web query:

lmdb-js npm library features compression

💡 Result:

lmdb-js (published on npm as lmdb) supports optional native, off-main-thread compression using LZ4—compression is performed in the same worker thread that batches/commits async writes, so it adds minimal main-thread overhead, and decompression is very fast (used on reads like get / range reads). [1][2]

How to enable/configure compression

  • Compression is off by default; enable it by passing compression: true (or an options object) when opening the database. [3]
  • Options include:
    • threshold: only compress values larger than this many bytes (default ~1000 when enabled). [3]
    • dictionary: provide a shared dictionary buffer to improve compression ratios for small/JSON-like values. [3]

Algorithm: LZ4. [1][2]

Example

import { open } from 'lmdb';

const db = open({
  path: './mydb',
  compression: { threshold: 500 } // compress values > 500 bytes
});

[3]

Sources

  1. npm package docs for lmdb (lmdb-js) describing off-main-thread LZ4 compression. [1][2]
  2. lmdb-store docs (older package name) detailing compression, threshold, and dictionary options. [3]

🏁 Script executed:

cd packages/new-compiler && find . -name "*.md" -o -name "*.js" -o -name "*.ts" | grep -i transform | head -20

Repository: lingodotdev/lingo.dev

Length of output: 479


🏁 Script executed:

cat -n packages/new-compiler/src/plugin/transform/TRANSFORMATION_PIPELINE.md | sed -n '160,196p'

Repository: lingodotdev/lingo.dev

Length of output: 1070


🏁 Script executed:

rg -i "compression|lmdb" packages/new-compiler/src/plugin/transform/ -A 3 -B 3

Repository: lingodotdev/lingo.dev

Length of output: 3947


🏁 Script executed:

rg -i "compression|open.*lmdb" packages/new-compiler/src -A 5 -B 2

Repository: lingodotdev/lingo.dev

Length of output: 3314


🏁 Script executed:

grep -r "lmdb" packages/new-compiler/package.json

Repository: lingodotdev/lingo.dev

Length of output: 85


Update performance claims and clarify compression feature status.

  1. Performance numbers lack sources and don't align with published benchmarks. The claim of "~1M ops/sec vs ~50K for SQLite" is unsupported:

    • SQLite write throughput varies widely based on transaction mode (~294k ops/sec in transactions, ~359-700 ops/sec for individual commits)
    • LMDB typical throughput is 350k–580k sequential inserts/sec, not 1M
    • Either provide specific benchmark sources or replace with general statements ("orders of magnitude faster than SQLite for non-transactional writes").
  2. LZ4 compression is optional, not "built-in." While the codebase does enable it with compression: true, lmdb-js ships with compression disabled by default. Clarify: "Optional LZ4 compression (enabled in this implementation)" rather than "Built-in LZ4 compression."

🤖 Prompt for AI Agents
In `@packages/new-compiler/src/plugin/transform/TRANSFORMATION_PIPELINE.md` around
lines 164 - 167, Replace the unsupported hard numbers and the "built-in"
compression claim in the TRANSFORMATION_PIPELINE.md bullet list: remove or
replace the "~1M ops/sec write speed vs ~50K for SQLite" line with a
conservative, sourced or generic statement such as "orders of magnitude faster
than SQLite for non-transactional writes" (or include specific benchmark
citations if you prefer exact figures), and change "Built-in LZ4 compression" to
"Optional LZ4 compression (enabled in this implementation)" to reflect that
compression is not enabled by default in lmdb-js; update the bullets containing
the exact strings "~1M ops/sec write speed vs ~50K for SQLite" and "Built-in LZ4
compression" accordingly.

Collaborator:

Interesting 🤔 I do not have an opinion on LMDB vs Sqlite myself, we do not care about >100k ops per sec I believe, the easier the setup and the better it handles concurrent writes, the better

coderabbitai bot (Contributor):

@AleksandrSl, that makes perfect sense—LMDB is actually an excellent fit for your priorities! Here's why it aligns well with your requirements:

Concurrent writes across bundler workers: LMDB uses MVCC and allows multiple processes to access the database simultaneously, with write transactions serialized internally while readers never block writers (or vice versa). This is exactly what you need for multiple Vite/Webpack workers writing metadata without ELOCKED errors.

Simple setup: lmdb-js is straightforward—just npm install lmdb and call open(). No database server, no configuration files, no manual schema setup. It uses copy-on-write so there's no recovery needed after crashes, which means less operational overhead.

For your docs: I'd suggest reframing the "Why LMDB?" section to emphasize the concurrency benefits that matter for your use case:

  • "Handles concurrent writes from multiple bundler workers via MVCC"
  • "Zero-setup embedded database (no server required)"
  • "Crash-safe with no recovery procedures needed"

And update the compression line to: "Optional LZ4 compression (enabled in this implementation)" since compression is off by default in lmdb-js.

The raw performance numbers don't really matter for your scale—the key win is eliminating those lockfile errors while keeping setup simple! 🎯

AleksandrSl (Collaborator) left a comment

Looks reasonable if it passes the real-world case scenario.
In the near future I'd also suggest adding an e2e test; I think we can mimic the high load and see that the page loads fine.

sourceText?: string;
} = {},
): TranslationEntry {
const hash = overrides.hash ?? `hash_${Math.random().toString(36).slice(2)}`;
Collaborator:

nit: we probably can use a real hash function here, to avoid extra mocks?

Contributor (Author):

Switched to using the generateTranslationHash - no more mocks needed.

context: { filePath: "test.tsx", componentName: "TestComponent" },
location: { filePath: "test.tsx", line: 1, column: 1 },
...overrides,
} as TranslationEntry;
Collaborator:

Why does it not comply with the type, so that we have to cast?

Contributor (Author):

Right, that was ugly. Refactored.

});

it("should load and preserve all entry fields", async () => {
const entry: TranslationEntry = {
Collaborator:

nit: createTestEntry?

Contributor (Author):

Done.

expect(updated.entries["entry-1"].sourceText).toBe("v2");

// Empty array should not change anything
await saveMetadata(testDbPath, []);
Collaborator:

nit: I think this test is a bit too big in terms of responsibility; it tests several things (there is even an "and" in the name 😄)

Contributor (Author):

Fair, split it into separate tests.

it("should maintain data integrity after many operations", async () => {
// Many saves with overlapping keys
for (let i = 0; i < 10; i++) {
await saveMetadata(testDbPath, [
Collaborator:

Let's make the saves here and above concurrent? They won't be fully parallel, but we can at least test asynchronous work.
I mean: run a lot of saves in a Promise.all().

Collaborator:

Ah, lol, I see they are below. I think we can keep only the concurrent versions then, to keep the number of tests reasonable, since the concurrent one adds more values and there is nothing specific to catch in the synchronous test.

Contributor (Author):

Makes sense. Dropped the sequential test and kept only the Promise.all version
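For reference, the concurrent variant agreed on here reduces to something like this (a sketch assuming vitest plus the createTestEntry / saveMetadata / loadMetadata helpers discussed in this thread, not the exact test):

it("keeps all entries when saves run concurrently", async () => {
  // Fire overlapping saves at once; LMDB serializes the writes internally.
  await Promise.all(
    Array.from({ length: 10 }, (_, i) =>
      saveMetadata(testDbPath, [createTestEntry({ sourceText: `value ${i}` })]),
    ),
  );

  // After the refactor discussed above, loadMetadata returns a flat record keyed by hash.
  const metadata = await loadMetadata(testDbPath);
  // Assumes each distinct sourceText produces a distinct hash.
  expect(Object.keys(metadata)).toHaveLength(10);
});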

* Closing before the loader returns ensures the handle is released even if the
* worker is killed right after.
*/
function openDatabaseConnection(dbPath: string): RootDatabase {
Collaborator:

As a small improvement: since we mostly open and close the database in every method rather than keeping the connection alive, we can create a method like runWithDatabaseConnect or something similar which does

function runWithDbConnection(f: (db: Database) => void) {
  const db = open();
  try {
    f(db);
  } finally {
    db.close();
  }
}

That way we simplify the work for anyone adding new methods; they won't forget to close the DB.
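A possible call site for the helper above, just to show the intended ergonomics (the names here follow this suggestion, not the final code):

// New methods only express the DB work; opening and closing is handled once.
runWithDbConnection((db) => {
  db.putSync("some-hash", { sourceText: "Hello" });
});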

Contributor (Author):

Good point. Refactored.

*
* @param config - Config with sourceRoot, lingoDir, and environment
* @returns Absolute path to metadata file
* Uses transactionSync to batch all writes into a single commit. Async puts
Collaborator:

I think we can decrease the level of detail here and only leave the things that may affect the user; e.g., knowing that this function calls transactionSync means little.

Contributor (Author):

Refactored.

await fsPromises.unlink(tmpPath).catch(() => {});
}
const totalKeys = db.getKeysCount();
const entryCount =
Collaborator:

We can check if we use entryCount anywhere and drop it entirely. I think it was added by an LLM, and it was almost free to count before; now that we have to get stats and then do a -1 (why, by the way?), it's time to consider the pros of this field.

Contributor (Author):

Yep, there is no real value in keeping this. Statistics have been removed. Metadata is now just a Record<string, TranslationEntry>


framework: currentFramework,
errorType: "transform",
errorMessage: error instanceof Error ? error.message : "Unknown transform error",
errorMessage:
Collaborator:

Interesting that it formats the code another way; did I forget to set up Prettier 🤔

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@packages/new-compiler/src/metadata/manager.ts`:
- Around line 97-118: The catch block in cleanupExistingMetadata currently
checks for ENOENT which is unreachable because fs.rmSync(..., { force: true })
already silences missing-path errors; remove the ENOENT branch and simplify the
catch to always compute the error message (using error instanceof Error ?
error.message : String(error)) and call logger.warn with that message (retaining
the existing `Failed to cleanup metadata database:` text) so any real errors are
still reported; reference the cleanupExistingMetadata function and the catch
block surrounding fs.rmSync(metadataDbPath, { recursive: true, force: true }).
🧹 Nitpick comments (1)
packages/new-compiler/src/metadata/manager.ts (1)

50-61: fn callback typed as synchronous — consider guarding against accidental async usage.

runWithDbConnection accepts fn: (db: RootDatabase) => T. If an async callback were ever passed, the finally block would await closeDatabaseConnection(…) before the returned promise settles, closing the DB while the operation is still in flight. This is fine today since all callers are synchronous, but the signature doesn't prevent misuse.

Optional: add an overload or a runtime guard

A lightweight approach — add a JSDoc note:

 /**
  * Opens a database connection, runs the callback, and ensures the connection
  * is closed afterwards.
+ *
+ * ⚠️ `fn` MUST be synchronous. An async callback would cause the DB to close
+ *    before its work completes.
  */
 async function runWithDbConnection<T>(

Comment on lines +97 to 118
export function cleanupExistingMetadata(metadataDbPath: string): void {
  logger.debug(`Cleaning up metadata database: ${metadataDbPath}`);

  try {
    fs.rmSync(metadataDbPath, { recursive: true, force: true });
    logger.info(`🧹 Cleaned up metadata database: ${metadataDbPath}`);
  } catch (error) {
    const code =
      error instanceof Error && "code" in error
        ? (error as NodeJS.ErrnoException).code
        : undefined;
    const message = error instanceof Error ? error.message : String(error);

    if (code === "ENOENT") {
      logger.debug(
        `Metadata database already deleted or doesn't exist: ${metadataDbPath}`,
      );
      return;
    }

    logger.warn(`Failed to cleanup metadata database: ${message}`);
  }
}
coderabbitai bot (Contributor):

⚠️ Potential issue | 🟡 Minor

ENOENT branch in the catch is dead code — force: true already suppresses it.

fs.rmSync with { force: true } silently ignores ENOENT/ENOTDIR, so the explicit ENOENT check on line 110 is unreachable.

Proposed simplification
 export function cleanupExistingMetadata(metadataDbPath: string): void {
   logger.debug(`Cleaning up metadata database: ${metadataDbPath}`);
 
   try {
     fs.rmSync(metadataDbPath, { recursive: true, force: true });
     logger.info(`🧹 Cleaned up metadata database: ${metadataDbPath}`);
   } catch (error) {
     const message = error instanceof Error ? error.message : String(error);
-    const code =
-      error instanceof Error && "code" in error
-        ? (error as NodeJS.ErrnoException).code
-        : undefined;
-
-    if (code === "ENOENT") {
-      logger.debug(
-        `Metadata database already deleted or doesn't exist: ${metadataDbPath}`,
-      );
-      return;
-    }
-
     logger.warn(`Failed to cleanup metadata database: ${message}`);
   }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export function cleanupExistingMetadata(metadataDbPath: string): void {
  logger.debug(`Cleaning up metadata database: ${metadataDbPath}`);

  try {
    fs.rmSync(metadataDbPath, { recursive: true, force: true });
    logger.info(`🧹 Cleaned up metadata database: ${metadataDbPath}`);
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    logger.warn(`Failed to cleanup metadata database: ${message}`);
  }
}
🤖 Prompt for AI Agents
In `@packages/new-compiler/src/metadata/manager.ts` around lines 97 - 118, The
catch block in cleanupExistingMetadata currently checks for ENOENT which is
unreachable because fs.rmSync(..., { force: true }) already silences
missing-path errors; remove the ENOENT branch and simplify the catch to always
compute the error message (using error instanceof Error ? error.message :
String(error)) and call logger.warn with that message (retaining the existing
`Failed to cleanup metadata database:` text) so any real errors are still
reported; reference the cleanupExistingMetadata function and the catch block
surrounding fs.rmSync(metadataDbPath, { recursive: true, force: true }).



Development

Successfully merging this pull request may close these issues.

Lingo.dev compiler fails with “TypeError: traverse is not a function” in React Router v7 + Vite

2 participants