feat(new-compiler): migrate metadata storage from lockfile to LMDB #1955
AndreyHirsa wants to merge 7 commits into main
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough
Replaces JSON + file-lock metadata with LMDB-backed storage, removing the MetadataManager class and proper-lockfile; adds stateless functions (loadMetadata, saveMetadata, cleanupExistingMetadata, getMetadataPath), new .lingo/metadata-dev and .lingo/metadata-build locations, and tests/docs updates.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Compiler as Compiler (Plugin)
    participant SaveFunc as saveMetadata(dbPath, entries)
    participant LoadFunc as loadMetadata(dbPath)
    participant DB as LMDB Database

    rect rgba(100,150,200,0.5)
        Note over Compiler,SaveFunc: Save metadata flow
        Compiler->>SaveFunc: saveMetadata(dbPath, entries)
        SaveFunc->>DB: openDatabase(dbPath)
        DB-->>SaveFunc: db handle
        SaveFunc->>DB: write entries (transaction)
        SaveFunc->>DB: update __stats__ key
        DB-->>SaveFunc: commit
        SaveFunc->>DB: close connection
        SaveFunc-->>Compiler: confirmation
    end

    rect rgba(150,200,100,0.5)
        Note over Compiler,LoadFunc: Load metadata flow
        Compiler->>LoadFunc: loadMetadata(dbPath)
        LoadFunc->>DB: openDatabase(dbPath)
        DB-->>LoadFunc: db handle
        LoadFunc->>DB: read all entries & __stats__
        DB-->>LoadFunc: entries + stats
        LoadFunc->>DB: close connection
        LoadFunc-->>Compiler: MetadataSchema
    end
```
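For orientation, a minimal usage sketch of the flow above, assuming the stateless signatures implied by the walkthrough and diagram (`getMetadataPath(config)`, `saveMetadata(dbPath, entries)`, `loadMetadata(dbPath)`); the exact types live in the actual module.

```ts
// Hypothetical call sites; import path and config shape follow the walkthrough above.
import { getMetadataPath, saveMetadata, loadMetadata } from "./metadata/manager";

const dbPath = getMetadataPath(config); // e.g. <sourceRoot>/.lingo/metadata-dev or .lingo/metadata-build
await saveMetadata(dbPath, entries);    // opens the LMDB database, writes all entries in one transaction, closes it
const metadata = await loadMetadata(dbPath); // reads everything back for the plugin
```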
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 3 failed (2 warnings, 1 inconclusive)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/new-compiler/src/metadata/manager.test.ts`:
- Around line 309-314: The test is incorrectly treating the synchronous
MetadataManager constructor as async; update the "should throw descriptive error
for invalid path" test to directly assert the synchronous throw by replacing the
await expect(async () => new MetadataManager(invalidPath)).rejects.toThrow()
pattern with a synchronous assertion such as expect(() => new
MetadataManager(invalidPath)).toThrow() (or toThrowError/... if you want to
assert message content), referencing the MetadataManager constructor in that
test.
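A minimal sketch of the suggested synchronous assertion (Vitest-style); `invalidPath` stands for whatever invalid input the existing test already uses:

```ts
it("should throw descriptive error for invalid path", () => {
  // The constructor is synchronous, so assert the throw directly instead of awaiting a rejection.
  expect(() => new MetadataManager(invalidPath)).toThrow();
});
```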
🧹 Nitpick comments (4)
packages/new-compiler/src/metadata/manager.ts (1)
205-226: Avoid O(n) full scans when updating stats.
Counting all keys on every save scales with database size and can slow large builds. Consider incrementally updating `totalEntries` per write inside the transaction.
♻️ Suggested refactor

```diff
-    for (const entry of entries) {
-      this.db.put(entry.hash, entry);
-    }
-
-    // Count entries explicitly (excluding stats key) for clarity
-    let entryCount = 0;
-    for (const { key } of this.db.getRange()) {
-      if (key !== STATS_KEY) {
-        entryCount++;
-      }
-    }
-
-    const stats = {
-      totalEntries: entryCount,
-      lastUpdated: new Date().toISOString(),
-    };
+    const prevStats =
+      (this.db.get(STATS_KEY) as MetadataSchema["stats"] | undefined) ?? {
+        totalEntries: 0,
+        lastUpdated: new Date().toISOString(),
+      };
+    let entryCount = prevStats.totalEntries;
+
+    for (const entry of entries) {
+      const existed = this.db.get(entry.hash) !== undefined;
+      this.db.put(entry.hash, entry);
+      if (!existed) entryCount++;
+    }
+
+    const stats = {
+      totalEntries: entryCount,
+      lastUpdated: new Date().toISOString(),
+    };
```

packages/new-compiler/src/metadata/manager.test.ts (3)
106-108: Awaiting a synchronous function is unnecessary.
`cleanupExistingMetadata` returns `void` (synchronous), so `await` is a no-op here. While this doesn't break functionality, it's misleading and inconsistent with the function signature.
Suggested fix

```diff
 afterEach(async () => {
-  await cleanupExistingMetadata(testDbPath);
+  cleanupExistingMetadata(testDbPath);
 });
```
380-384: Avoid returning values from `forEach` callbacks.
`worker.kill()` returns a boolean, triggering the Biome lint warning. Use a `for...of` loop instead for cleaner semantics when you don't need the return values.
Suggested fix

```diff
 } finally {
-  workers.forEach((w) => w.kill());
+  for (const w of workers) {
+    w.kill();
+  }
 }
```
237-258: Same issue: awaiting synchronous `cleanupExistingMetadata` calls.
Lines 243, 303, 304, and 357 also await the synchronous `cleanupExistingMetadata` function. Consider removing `await` from all these call sites for consistency.
Also applies to: 286-307
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/new-compiler/src/metadata/manager.test.ts`:
- Around line 213-217: The test "should throw descriptive error for invalid
path" is flaky because it uses a hard-coded /root/... path; change it to create
a guaranteed-failing target using the tmp directory: create a temporary file
(e.g., via fs.writeFileSync or fs.mkdtempSync under os.tmpdir()) and pass that
file path to loadMetadata so mkdirSync will deterministically fail; update the
test code around the it block (the test function invoking loadMetadata) to
create and clean up the temp file and assert that loadMetadata(tmpFilePath)
rejects.
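A sketch of the deterministic variant, assuming `loadMetadata` rejects when its target path cannot be created as a directory; the temp-file names here are illustrative:

```ts
import fs from "node:fs";
import os from "node:os";
import path from "node:path";

it("should throw descriptive error for invalid path", async () => {
  // A plain file where a directory is expected makes mkdirSync fail deterministically.
  const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "lingo-metadata-"));
  const tmpFile = path.join(tmpDir, "not-a-directory");
  fs.writeFileSync(tmpFile, "");
  try {
    await expect(loadMetadata(tmpFile)).rejects.toThrow();
  } finally {
    fs.rmSync(tmpDir, { recursive: true, force: true });
  }
});
```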
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/new-compiler/src/plugin/transform/TRANSFORMATION_PIPELINE.md`:
- Around line 164-167: Replace the unsupported hard numbers and the "built-in"
compression claim in the TRANSFORMATION_PIPELINE.md bullet list: remove or
replace the "~1M ops/sec write speed vs ~50K for SQLite" line with a
conservative, sourced or generic statement such as "orders of magnitude faster
than SQLite for non-transactional writes" (or include specific benchmark
citations if you prefer exact figures), and change "Built-in LZ4 compression" to
"Optional LZ4 compression (enabled in this implementation)" to reflect that
compression is not enabled by default in lmdb-js; update the bullets containing
the exact strings "~1M ops/sec write speed vs ~50K for SQLite" and "Built-in LZ4
compression" accordingly.
🧹 Nitpick comments (2)
packages/new-compiler/src/plugin/transform/TRANSFORMATION_PIPELINE.md (2)
204-204: Clarify the concurrency model for bundler workers.
The statement "multi-reader, single-writer" is accurate for LMDB's MVCC model, but needs clarification in the context of multiple bundler workers. The PR summary mentions "MVCC for safe concurrent access across bundler workers," but:
- How does LMDB handle concurrent write attempts from multiple worker processes?
- Are writes automatically serialized, or do workers need to retry on conflicts?
- Does each worker process get its own write transaction, or is there cross-process coordination?
Consider adding a sentence or two explaining how concurrent writes from multiple bundler workers are handled to help readers understand the practical implications.
268-268: Consider clarifying the distinction between metadata and cache storage.
The document mentions two different storage locations:
- Metadata storage: `.lingo/metadata-{env}/` (LMDB database for source strings - Line 160)
- Translation cache: `.lingo/cache/{locale}.json` (JSON files for translated strings - Line 268)

While this separation makes sense architecturally, readers might benefit from a brief clarification about when each is used. Consider adding a note explaining that:
- Metadata stores source strings and context extracted at build time
- Cache stores the actual translations generated for each locale at runtime
This would help readers understand why there are two different storage systems.
```md
- ~1M ops/sec write speed vs ~50K for SQLite
- Zero-copy reads from memory-mapped files
- Built-in LZ4 compression
- Simple key-value API without SQL overhead
```
🧩 Analysis chain
🌐 Web query:
lmdb-js LZ4 compression support documentation
💡 Result:
lmdb-js supports optional off-thread LZ4 compression for values written via its async write path. It’s off by default and is enabled with the compression option when opening the DB. [1]
Enable compression
```js
import { open } from 'lmdb';
const db = open('my-db', { compression: true });
```
[1]
Configure compression
```js
import fs from 'node:fs';
import { open } from 'lmdb';
const db = open('my-db', {
  compression: {
    threshold: 500, // only compress values > 500 bytes
    dictionary: fs.readFileSync('dict.txt') // optional shared dictionary (Buffer)
  }
});
```
Supported settings documented:
- `threshold` (default `1000` when compression is enabled)
- `dictionary` (Buffer; defaults to an internal shared dictionary) [1][2]
Important notes from the docs
- Decompression happens automatically on reads (including `get`/`getRange`). [1][2]
- If you enable compression for a DB, you must always open it with the same compression setting so data can be decompressed correctly. [1]
- Child DBs opened from a root DB inherit the root DB’s compression settings by default. [1]
Sources: lmdb-js README “Compression” section [1] and the lmdb npm “Compression” docs [2].
References:
[1] https://github.com/kriszyp/lmdb-js (README → “Compression”)
[2] https://www.npmjs.com/package/lmdb (Compression section)
🌐 Web query:
LMDB write performance benchmark SQLite comparison ops per second
💡 Result:
Benchmarks that report write throughput (ops/sec) for LMDB vs SQLite
1) LumoSQL speedtest-style benchmark (SQLite B-tree vs “SQLite+LMDB” storage backend)
LumoSQL publishes example output (timings) for the same benchmark run across targets, including vanilla SQLite and SQLite built with an LMDB backend. From the example table: “1000 INSERTs” and “25000 INSERTs in a transaction” times are shown for each target. Converting those to inserts/sec:
| Test (from LumoSQL example) | SQLite B-tree time | SQLite B-tree ops/sec | SQLite+LMDB time | SQLite+LMDB ops/sec |
|---|---|---|---|---|
| 1000 INSERTs | 2.789 s | ~359 ops/s | 1.484–1.429 s | ~674–700 ops/s |
| 25000 INSERTs in a transaction | 0.085 s | ~294k ops/s | 0.094–0.096 s | ~266k–260k ops/s |
Source timings: LumoSQL benchmark-filter example output. [1]
Key takeaway from this specific example: LMDB-backend helps a lot for many small autocommit INSERTs, while batched INSERTs inside one transaction are already extremely fast on SQLite’s default B-tree and are in the same ballpark (or faster in that sample) than SQLite+LMDB. [1]
2) LMDB-only “typical comparison” numbers (contextual reference)
Some LMDB docs/libraries cite typical insert rates such as ~350k random inserts/sec and ~580k sequential inserts/sec (workload- and hardware-dependent). This is not a SQLite comparison, but it provides an order-of-magnitude reference for LMDB writes. [2]
Notes for interpreting “ops/sec”
- SQLite write throughput varies wildly depending on whether each INSERT is its own transaction (slow) vs explicit batching in a transaction / WAL / sync settings (often orders-of-magnitude faster). LumoSQL’s example illustrates this directly (hundreds ops/s vs hundreds of thousands ops/s). [1]
Sources:
[1] LumoSQL “benchmark-filter” example output showing per-test times for SQLite vs SQLite+LMDB targets.
[2] zlmdb LMDB examples page showing “Random Insert” and “Sequential Insert” ops/sec figures.
🌐 Web query:
lmdb-js npm library features compression
💡 Result:
lmdb-js (published on npm as lmdb) supports optional native, off-main-thread compression using LZ4—compression is performed in the same worker thread that batches/commits async writes, so it adds minimal main-thread overhead, and decompression is very fast (used on reads like get / range reads). [1][2]
How to enable/configure compression
- Compression is off by default; enable it by passing `compression: true` (or an options object) when opening the database. [3]
- Options include:
  - `threshold`: only compress values larger than this many bytes (default ~1000 when enabled). [3]
  - `dictionary`: provide a shared dictionary buffer to improve compression ratios for small/JSON-like values. [3]
Algorithm: LZ4. [1][2]
Example
```js
import { open } from 'lmdb';
const db = open({
  path: './mydb',
  compression: { threshold: 500 } // compress values > 500 bytes
});
```
[3]
Sources
- npm package docs for `lmdb` (lmdb-js) describing off-main-thread LZ4 compression. [1][2]
- `lmdb-store` docs (older package name) detailing `compression`, `threshold`, and `dictionary` options. [3]
🏁 Script executed:
```shell
cd packages/new-compiler && find . -name "*.md" -o -name "*.js" -o -name "*.ts" | grep -i transform | head -20
```
Repository: lingodotdev/lingo.dev
Length of output: 479
🏁 Script executed:
```shell
cat -n packages/new-compiler/src/plugin/transform/TRANSFORMATION_PIPELINE.md | sed -n '160,196p'
```
Repository: lingodotdev/lingo.dev
Length of output: 1070
🏁 Script executed:
rg -i "compression|lmdb" packages/new-compiler/src/plugin/transform/ -A 3 -B 3Repository: lingodotdev/lingo.dev
Length of output: 3947
🏁 Script executed:
rg -i "compression|open.*lmdb" packages/new-compiler/src -A 5 -B 2Repository: lingodotdev/lingo.dev
Length of output: 3314
🏁 Script executed:
grep -r "lmdb" packages/new-compiler/package.jsonRepository: lingodotdev/lingo.dev
Length of output: 85
Update performance claims and clarify compression feature status.
- Performance numbers lack sources and don't align with published benchmarks. The claim of "~1M ops/sec vs ~50K for SQLite" is unsupported:
- SQLite write throughput varies widely based on transaction mode (~294k ops/sec in transactions, ~359-700 ops/sec for individual commits)
- LMDB typical throughput is 350k–580k sequential inserts/sec, not 1M
- Either provide specific benchmark sources or replace with general statements ("orders of magnitude faster than SQLite for non-transactional writes").
- LZ4 compression is optional, not "built-in." While the codebase does enable it with `compression: true`, lmdb-js ships with compression disabled by default. Clarify: "Optional LZ4 compression (enabled in this implementation)" rather than "Built-in LZ4 compression."
🤖 Prompt for AI Agents
In `@packages/new-compiler/src/plugin/transform/TRANSFORMATION_PIPELINE.md` around
lines 164 - 167, Replace the unsupported hard numbers and the "built-in"
compression claim in the TRANSFORMATION_PIPELINE.md bullet list: remove or
replace the "~1M ops/sec write speed vs ~50K for SQLite" line with a
conservative, sourced or generic statement such as "orders of magnitude faster
than SQLite for non-transactional writes" (or include specific benchmark
citations if you prefer exact figures), and change "Built-in LZ4 compression" to
"Optional LZ4 compression (enabled in this implementation)" to reflect that
compression is not enabled by default in lmdb-js; update the bullets containing
the exact strings "~1M ops/sec write speed vs ~50K for SQLite" and "Built-in LZ4
compression" accordingly.
Interesting 🤔 I do not have an opinion on LMDB vs SQLite myself; we do not care about >100k ops per sec, I believe. The easier the setup and the better it handles concurrent writes, the better.
@AleksandrSl, that makes perfect sense—LMDB is actually an excellent fit for your priorities! Here's why it aligns well with your requirements:
Concurrent writes across bundler workers: LMDB uses MVCC and allows multiple processes to access the database simultaneously, with write transactions serialized internally while readers never block writers (or vice versa). This is exactly what you need for multiple Vite/Webpack workers writing metadata without ELOCKED errors.
Simple setup: lmdb-js is straightforward—just npm install lmdb and call open(). No database server, no configuration files, no manual schema setup. It uses copy-on-write so there's no recovery needed after crashes, which means less operational overhead.
For your docs: I'd suggest reframing the "Why LMDB?" section to emphasize the concurrency benefits that matter for your use case:
- "Handles concurrent writes from multiple bundler workers via MVCC"
- "Zero-setup embedded database (no server required)"
- "Crash-safe with no recovery procedures needed"
And update the compression line to: "Optional LZ4 compression (enabled in this implementation)" since compression is off by default in lmdb-js.
The raw performance numbers don't really matter for your scale—the key win is eliminating those lockfile errors while keeping setup simple! 🎯
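For reference, the zero-setup path looks roughly like this with the real lmdb-js API (the path and key below are illustrative; `compression: true` mirrors what this PR reportedly enables):

```ts
import { open } from "lmdb";

// Opens (or creates) an embedded database directory; no server process involved.
const db = open({ path: ".lingo/metadata-dev", compression: true });

await db.put("hash_abc", { sourceText: "Hello" }); // async write, batched off the main thread
const entry = db.get("hash_abc");                  // synchronous read
await db.close();
```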
AleksandrSl
left a comment
Looks reasonable if it passes the real-world case scenario.
In the near future I'd also suggest adding an e2e test; I think we can mimic the high load and see that the page loads fine.
```ts
  sourceText?: string;
} = {},
): TranslationEntry {
  const hash = overrides.hash ?? `hash_${Math.random().toString(36).slice(2)}`;
```
nit: we probably can use a real hash function here, to avoid extra mocks?
Switched to using the generateTranslationHash - no more mocks needed.
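For illustration, the factory might now derive its key from the entry's own text, roughly like this; the import path and `generateTranslationHash` signature shown here are assumptions, not the actual code:

```ts
import { generateTranslationHash } from "../utils/hash"; // assumed location and signature

function createEntry(overrides: Partial<TranslationEntry> = {}): TranslationEntry {
  const sourceText = overrides.sourceText ?? "Hello world";
  return {
    hash: generateTranslationHash(sourceText), // real hash instead of Math.random()
    sourceText,
    context: { filePath: "test.tsx", componentName: "TestComponent" },
    location: { filePath: "test.tsx", line: 1, column: 1 },
    ...overrides,
  };
}
```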
```ts
  context: { filePath: "test.tsx", componentName: "TestComponent" },
  location: { filePath: "test.tsx", line: 1, column: 1 },
  ...overrides,
} as TranslationEntry;
```
Why doesn't it comply with the type, so that we have to cast?
Right, that was ugly. Refactored.
```ts
});

it("should load and preserve all entry fields", async () => {
  const entry: TranslationEntry = {
```
```ts
expect(updated.entries["entry-1"].sourceText).toBe("v2");

// Empty array should not change anything
await saveMetadata(testDbPath, []);
```
nit: I think this test is a bit too big in terms of responsibility, it tests several things (there is even an "and" in the name 😄)
Fair, split it into separate tests.
| it("should maintain data integrity after many operations", async () => { | ||
| // Many saves with overlapping keys | ||
| for (let i = 0; i < 10; i++) { | ||
| await saveMetadata(testDbPath, [ |
Let's make the saves here and above concurrent? They won't be fully parallel, but we can at least test asynchronous work.
I mean - run a lot of saves in a Promise.all()
Ah, lol, I see they are below. I think we can keep only the concurrent versions then, to keep the number of tests reasonable, since the concurrent ones add more value and there is nothing specific to catch in the synchronous test
Makes sense. Dropped the sequential test and kept only the Promise.all version
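A sketch of what the remaining concurrent test might look like, reusing the same `saveMetadata`/`loadMetadata` helpers and a `createEntry`-style factory as in the excerpts above; the `.entries` shape follows the quoted test code and may differ from the final version:

```ts
it("should keep all entries when saves run concurrently", async () => {
  // Fire overlapping saves at once; LMDB serializes the writes internally.
  await Promise.all(
    Array.from({ length: 10 }, (_, i) =>
      saveMetadata(testDbPath, [createEntry({ hash: `key-${i}`, sourceText: `v${i}` })]),
    ),
  );

  const metadata = await loadMetadata(testDbPath);
  for (let i = 0; i < 10; i++) {
    expect(metadata.entries[`key-${i}`].sourceText).toBe(`v${i}`);
  }
});
```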
```ts
 * Closing before the loader returns ensures the handle is released even if the
 * worker is killed right after.
 */
function openDatabaseConnection(dbPath: string): RootDatabase {
```
As a small improvement, since we are mostly opening and closing the database in every method rather than keeping the connection alive, we could create a method like runWithDatabaseConnect or something similar which will do
```ts
function runWithDbConnection(f: (db: Database) => void) {
  const db = open();
  try {
    f(db);
  } finally {
    db.close();
  }
}
```
That way we simplify things for anyone adding new methods: they won't forget to close the DB.
Good point. Refactored.
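A sketch of what the refactored helper might look like, consistent with the `openDatabaseConnection`/`closeDatabaseConnection` names and the `runWithDbConnection<T>(fn: (db: RootDatabase) => T)` signature quoted elsewhere in this review; the real implementation may differ:

```ts
async function runWithDbConnection<T>(dbPath: string, fn: (db: RootDatabase) => T): Promise<T> {
  const db = openDatabaseConnection(dbPath);
  try {
    // fn is expected to be synchronous so the connection is not closed mid-operation.
    return fn(db);
  } finally {
    await closeDatabaseConnection(db);
  }
}
```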
```ts
 *
 * @param config - Config with sourceRoot, lingoDir, and environment
 * @returns Absolute path to metadata file
 * Uses transactionSync to batch all writes into a single commit. Async puts
```
I think we can decrease the level of detail here and only leave the things that may affect the user; e.g. knowing that this function calls transactionSync means little.
```ts
  await fsPromises.unlink(tmpPath).catch(() => {});
}
const totalKeys = db.getKeysCount();
const entryCount =
```
We can check whether we use entryCount anywhere and drop it entirely. I think it was added by an LLM, and it was almost free to count it before; now that we have to get stats and then do a `- 1` (why, by the way?), it's time to consider the pros of this field.
Yep, there is no real value in keeping this. Statistics have been removed. Metadata is now just a Record<string, TranslationEntry>
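Conceptually the stored shape then reduces to a plain record, with any counts derived on demand; a rough sketch of what that implies for callers (assumed, not the exact code):

```ts
type Metadata = Record<string, TranslationEntry>;

const metadata: Metadata = await loadMetadata(dbPath);
const totalEntries = Object.keys(metadata).length; // computed when needed; no __stats__ key persisted
```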
```diff
   framework: currentFramework,
   errorType: "transform",
-  errorMessage: error instanceof Error ? error.message : "Unknown transform error",
+  errorMessage:
```
Interesting that it formats the code another way; did I forget to set up Prettier? 🤔
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/new-compiler/src/metadata/manager.ts`:
- Around line 97-118: The catch block in cleanupExistingMetadata currently
checks for ENOENT which is unreachable because fs.rmSync(..., { force: true })
already silences missing-path errors; remove the ENOENT branch and simplify the
catch to always compute the error message (using error instanceof Error ?
error.message : String(error)) and call logger.warn with that message (retaining
the existing `Failed to cleanup metadata database:` text) so any real errors are
still reported; reference the cleanupExistingMetadata function and the catch
block surrounding fs.rmSync(metadataDbPath, { recursive: true, force: true }).
🧹 Nitpick comments (1)
packages/new-compiler/src/metadata/manager.ts (1)
50-61: `fn` callback typed as synchronous — consider guarding against accidental async usage.
`runWithDbConnection` accepts `fn: (db: RootDatabase) => T`. If an async callback were ever passed, the `finally` block would `await closeDatabaseConnection(…)` before the returned promise settles, closing the DB while the operation is still in flight. This is fine today since all callers are synchronous, but the signature doesn't prevent misuse.
Optional: add an overload or a runtime guard
A lightweight approach — add a JSDoc note:
```diff
 /**
  * Opens a database connection, runs the callback, and ensures the connection
  * is closed afterwards.
+ *
+ * ⚠️ `fn` MUST be synchronous. An async callback would cause the DB to close
+ * before its work completes.
  */
 async function runWithDbConnection<T>(
```
```diff
+export function cleanupExistingMetadata(metadataDbPath: string): void {
+  logger.debug(`Cleaning up metadata database: ${metadataDbPath}`);
+
-  try {
-    await fsPromises.access(this.filePath);
-  } catch {
-    await fsPromises.writeFile(
-      this.filePath,
-      JSON.stringify(createEmptyMetadata(), null, 2),
-      "utf-8",
+  try {
+    fs.rmSync(metadataDbPath, { recursive: true, force: true });
+    logger.info(`🧹 Cleaned up metadata database: ${metadataDbPath}`);
+  } catch (error) {
+    const code =
+      error instanceof Error && "code" in error
+        ? (error as NodeJS.ErrnoException).code
+        : undefined;
+    const message = error instanceof Error ? error.message : String(error);
+
+    if (code === "ENOENT") {
+      logger.debug(
+        `Metadata database already deleted or doesn't exist: ${metadataDbPath}`,
+      );
+      return;
+    }
+
-    const release = await lockfile.lock(this.filePath, {
-      retries: {
-        retries: 20,
-        minTimeout: 50,
-        maxTimeout: 2000,
-      },
-      stale: 5000,
-    });
-
-    try {
-      // Re-load metadata inside lock to get latest state
-      const currentMetadata = await this.loadMetadata();
-      for (const entry of entries) {
-        currentMetadata.entries[entry.hash] = entry;
-      }
-      await this.saveMetadata(currentMetadata);
-      return currentMetadata;
-    } finally {
-      await release();
-    }
+    logger.warn(`Failed to cleanup metadata database: ${message}`);
+  }
 }
```
ENOENT branch in the catch is dead code — force: true already suppresses it.
fs.rmSync with { force: true } silently ignores ENOENT/ENOTDIR, so the explicit ENOENT check on line 110 is unreachable.
Proposed simplification
```diff
 export function cleanupExistingMetadata(metadataDbPath: string): void {
   logger.debug(`Cleaning up metadata database: ${metadataDbPath}`);
   try {
     fs.rmSync(metadataDbPath, { recursive: true, force: true });
     logger.info(`🧹 Cleaned up metadata database: ${metadataDbPath}`);
   } catch (error) {
     const message = error instanceof Error ? error.message : String(error);
-    const code =
-      error instanceof Error && "code" in error
-        ? (error as NodeJS.ErrnoException).code
-        : undefined;
-
-    if (code === "ENOENT") {
-      logger.debug(
-        `Metadata database already deleted or doesn't exist: ${metadataDbPath}`,
-      );
-      return;
-    }
-
     logger.warn(`Failed to cleanup metadata database: ${message}`);
   }
 }
```
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
export function cleanupExistingMetadata(metadataDbPath: string): void {
  logger.debug(`Cleaning up metadata database: ${metadataDbPath}`);
  try {
    fs.rmSync(metadataDbPath, { recursive: true, force: true });
    logger.info(`🧹 Cleaned up metadata database: ${metadataDbPath}`);
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    logger.warn(`Failed to cleanup metadata database: ${message}`);
  }
}
```
🤖 Prompt for AI Agents
In `@packages/new-compiler/src/metadata/manager.ts` around lines 97 - 118, The
catch block in cleanupExistingMetadata currently checks for ENOENT which is
unreachable because fs.rmSync(..., { force: true }) already silences
missing-path errors; remove the ENOENT branch and simplify the catch to always
compute the error message (using error instanceof Error ? error.message :
String(error)) and call logger.warn with that message (retaining the existing
`Failed to cleanup metadata database:` text) so any real errors are still
reported; reference the cleanupExistingMetadata function and the catch block
surrounding fs.rmSync(metadataDbPath, { recursive: true, force: true }).
Summary
Migrate metadata storage from JSON files with lockfile to LMDB database for improved concurrent access and elimination of ELOCKED errors.
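Conceptually, the write path changes from lock-then-rewrite-JSON to a single LMDB transaction; a simplified sketch (not the exact code, variable names illustrative):

```ts
// Before: serialize writers with proper-lockfile and rewrite the whole JSON file.
const release = await lockfile.lock(metadataFilePath, { stale: 5000 });
try {
  const metadata = JSON.parse(await fsPromises.readFile(metadataFilePath, "utf-8"));
  for (const entry of entries) metadata.entries[entry.hash] = entry;
  await fsPromises.writeFile(metadataFilePath, JSON.stringify(metadata, null, 2));
} finally {
  await release();
}

// After: let LMDB coordinate concurrent writers; batch the puts in one transaction.
const db = open({ path: metadataDbPath, compression: true });
db.transactionSync(() => {
  for (const entry of entries) db.putSync(entry.hash, entry);
});
await db.close();
```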
Changes
Testing
Business logic tests added:
Visuals
N/A - No UI changes
Checklist
Closes #1636
Summary by CodeRabbit
New Features
Documentation
Tests
Chores