Tutorial: JSONL Incremental Parser

Build a high-performance streaming JSON Lines parser using synkit’s incremental parsing infrastructure.

Source Code

📦 Complete source: examples/jsonl-parser

What You’ll Learn

  1. ChunkBoundary - Define where to split token streams
  2. IncrementalLexer - Buffer partial input, emit complete tokens
  3. IncrementalParse - Parse from token buffers with checkpoints
  4. Async streaming - tokio and futures integration
  5. Stress testing - Validate memory stability under load

JSON Lines Format

JSON Lines uses newline-delimited JSON:

{"user": "alice", "action": "login"}
{"user": "bob", "action": "purchase", "amount": 42.50}
{"user": "alice", "action": "logout"}

Each line is a complete JSON value. This makes JSONL ideal for:

  • Log processing
  • Event streams
  • Large dataset processing
  • Network protocols
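Because each line is an independent JSON value, a consumer can handle records one line at a time with no cross-line state. A minimal illustration using only the Rust standard library (the `records` helper is hypothetical, not part of synkit):

```rust
use std::io::{BufRead, BufReader, Cursor};

/// Split a JSONL document into records: one per non-empty line.
fn records(input: &str) -> Vec<String> {
    BufReader::new(Cursor::new(input))
        .lines()
        .filter_map(|l| l.ok())
        .filter(|l| !l.trim().is_empty())
        .collect()
}

fn main() {
    let data = "{\"user\": \"alice\"}\n{\"user\": \"bob\"}\n";
    // Each record stands alone, so no parser state survives a line.
    for (i, rec) in records(data).iter().enumerate() {
        println!("record {i}: {rec}");
    }
}
```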

Why Incremental Parsing?

Traditional parsing loads the entire input into memory:

let input = fs::read_to_string("10gb_logs.jsonl")?;  // ❌ OOM
let docs: Vec<Log> = parse(&input)?;

Incremental parsing processes chunks:

let mut lexer = JsonIncrementalLexer::new();
while let Some(chunk) = reader.read_chunk().await {
    for token in lexer.feed(&chunk)? {
        // Process tokens as they arrive
    }
}
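The loop above glosses over the hard part: a chunk can end mid-record, so `feed` must buffer the incomplete tail until the next chunk arrives. A concept sketch of that buffering using only the standard library (the `LineBuffer` type is hypothetical, not synkit's actual `IncrementalLexer` API):

```rust
/// Hypothetical buffer that carries a partial trailing line across
/// chunk boundaries -- the core job of an incremental lexer.
struct LineBuffer {
    pending: String,
}

impl LineBuffer {
    fn new() -> Self {
        Self { pending: String::new() }
    }

    /// Feed one chunk; return every *complete* line it closes.
    /// Bytes after the last newline stay buffered for the next call.
    fn feed(&mut self, chunk: &str) -> Vec<String> {
        self.pending.push_str(chunk);
        let mut out = Vec::new();
        while let Some(pos) = self.pending.find('\n') {
            let line: String = self.pending.drain(..=pos).collect();
            let line = line.trim_end_matches('\n');
            if !line.is_empty() {
                out.push(line.to_string());
            }
        }
        out
    }
}

fn main() {
    let mut buf = LineBuffer::new();
    // A record split across two chunks is emitted only once complete.
    assert!(buf.feed("{\"user\": \"ali").is_empty());
    let lines = buf.feed("ce\"}\n{\"user\": \"bob\"}\n");
    assert_eq!(lines, vec!["{\"user\": \"alice\"}", "{\"user\": \"bob\"}"]);
    println!("ok");
}
```

Memory stays bounded by the longest single record, not by the total input size, which is what makes the 10 GB case tractable.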

Prerequisites

  • Completed the TOML Parser Tutorial (or familiarity with synkit basics)
  • Understanding of async Rust (for chapters 5-6)

Chapters

| Chapter | Topic | Key Concepts |
|---|---|---|
| 1. Token Definition | Token enum and parser_kit! | logos patterns, #[no_to_tokens] |
| 2. Chunk Boundaries | ChunkBoundary trait | depth tracking, boundary detection |
| 3. Incremental Lexer | IncrementalLexer trait | buffering, offset tracking |
| 4. Incremental Parse | IncrementalParse trait | checkpoints, partial results |
| 5. Async Streaming | tokio/futures integration | channels, backpressure |
| 6. Stress Testing | Memory stability | 1M+ events, leak detection |
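To preview chapter 2's "depth tracking": a newline is only a safe place to split when it actually terminates a record, since strictly speaking a `\n` could appear inside nested structure or a string literal in malformed or pretty-printed input. A std-only concept sketch of boundary detection (this is not synkit's `ChunkBoundary` trait, just the underlying idea):

```rust
/// A newline is a safe split point only at nesting depth 0
/// and outside a string literal.
fn safe_boundaries(input: &str) -> Vec<usize> {
    let mut depth = 0i32;       // {} / [] nesting depth
    let mut in_string = false;  // inside a JSON string literal?
    let mut escaped = false;    // previous char was a backslash
    let mut out = Vec::new();
    for (i, c) in input.char_indices() {
        if in_string {
            match c {
                _ if escaped => escaped = false,
                '\\' => escaped = true,
                '"' => in_string = false,
                _ => {}
            }
        } else {
            match c {
                '"' => in_string = true,
                '{' | '[' => depth += 1,
                '}' | ']' => depth -= 1,
                '\n' if depth == 0 => out.push(i),
                _ => {}
            }
        }
    }
    out
}

fn main() {
    // The '}' inside the string does not close the object;
    // only the real newline at byte 14 is a boundary.
    let input = "{\"msg\": \"a}b\"}\n";
    assert_eq!(safe_boundaries(input), vec![14]);
    println!("ok");
}
```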