The Core Dichotomy: Structure vs. Content
⚙️
Universal Grammar (UG) as "Firmware"
Represents the immutable, deep structures of language. It's the stable, rule-governed framework (syntax, phrase structures) that is universal across contexts.
🎨
Systemic Functional Linguistics (SFL) as "Software"
Represents the malleable, functional content of language. It's how specific utterances create meaning in a particular context (e.g., mood, theme, participant roles).
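To make the firmware/software split concrete, here is a minimal sketch of how the two layers could be represented. All class and field names (`PhraseNode`, `ClauseAnalysis`, `process_type`, and so on) are hypothetical illustrations, not the schema used by this architecture. The structural layer is frozen to mirror the "immutable" framing; the functional layer stays mutable.

```python
from dataclasses import dataclass, field, FrozenInstanceError
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class PhraseNode:
    """Structural ("firmware") layer: one node of a phrase-structure tree."""
    label: str                                   # e.g. "S", "NP", "VP"
    children: Tuple["PhraseNode", ...] = ()      # empty for leaf nodes
    token: Optional[str] = None                  # surface word, leaves only

    def leaves(self) -> List[str]:
        """Return the surface tokens under this node, left to right."""
        if self.token is not None:
            return [self.token]
        words: List[str] = []
        for child in self.children:
            words.extend(child.leaves())
        return words


@dataclass
class ClauseAnalysis:
    """Functional ("software") layer: SFL-style annotations for one clause."""
    structure: PhraseNode                        # link back to the UG layer
    mood: str                                    # e.g. "declarative"
    theme: str                                   # point of departure
    process_type: str                            # e.g. "material"
    participants: Dict[str, str] = field(default_factory=dict)


# Hypothetical example clause: "The company acquired a startup"
tree = PhraseNode("S", (
    PhraseNode("NP", (PhraseNode("DT", token="The"),
                      PhraseNode("NN", token="company"))),
    PhraseNode("VP", (PhraseNode("VBD", token="acquired"),
                      PhraseNode("NP", (PhraseNode("DT", token="a"),
                                        PhraseNode("NN", token="startup"))))),
))

clause = ClauseAnalysis(
    structure=tree,
    mood="declarative",
    theme="The company",
    process_type="material",
    participants={"Actor": "The company", "Goal": "a startup"},
)
```

Keeping a reference from each functional record back to its structural node is one possible way to let a query cross both layers; other linkages would work as well.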
Hybrid Data Processing Pipeline
This 5-stage pipeline processes raw text to populate both the UG and SFL models.
Comparison with Standard RAG
✅ Advantages of this Architecture
- Contextual Precision: Retrieves information by grammatical role and function, not just by keyword or embedding similarity.
- Reduced Hallucinations: Gives the LLM structured, less ambiguous context, which should reduce factual errors.
- Explainability: Queries can be traced through a formal linguistic structure, making results more transparent.
- Complex Queries: Enables queries that combine structural and semantic criteria (e.g., "Find all clauses where 'the company' is the actor in a material process").
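The example query in the last bullet can be sketched against a small relational store. This is an illustration under stated assumptions: the two-table schema (`clause`, `participant`), the column names, and the sample rows are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical schema: one row per clause, one row per participant role.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clause (
    id           INTEGER PRIMARY KEY,
    text         TEXT NOT NULL,
    process_type TEXT NOT NULL      -- e.g. 'material', 'relational'
);
CREATE TABLE participant (
    clause_id INTEGER NOT NULL REFERENCES clause(id),
    role      TEXT NOT NULL,        -- e.g. 'Actor', 'Goal', 'Carrier'
    phrase    TEXT NOT NULL
);
""")

# Illustrative sample data, not real corpus output.
conn.executemany("INSERT INTO clause VALUES (?, ?, ?)", [
    (1, "The company acquired a startup.", "material"),
    (2, "The company is profitable.", "relational"),
    (3, "Regulators fined the company.", "material"),
])
conn.executemany("INSERT INTO participant VALUES (?, ?, ?)", [
    (1, "Actor", "the company"),
    (1, "Goal", "a startup"),
    (2, "Carrier", "the company"),
    (2, "Attribute", "profitable"),
    (3, "Actor", "regulators"),
    (3, "Goal", "the company"),
])


def clauses_with_actor(db, phrase, process_type):
    """Return clause texts where `phrase` fills the Actor role
    in a process of the given type."""
    rows = db.execute(
        """
        SELECT c.text
        FROM clause AS c
        JOIN participant AS p ON p.clause_id = c.id
        WHERE p.role = 'Actor'
          AND LOWER(p.phrase) = LOWER(?)
          AND c.process_type = ?
        ORDER BY c.id
        """,
        (phrase, process_type),
    ).fetchall()
    return [row[0] for row in rows]


matches = clauses_with_actor(conn, "the company", "material")
print(matches)  # ['The company acquired a startup.']
```

All three sample clauses mention "the company", so a plain keyword match would return every one of them. The role and process-type filters keep only the clause where the company is the one acting.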
❌ Challenges & Disadvantages
- Complexity: Requires significant upfront design and linguistic expertise to model the database schema.
- Performance Overhead: Parsing text into deep linguistic structures is computationally more expensive than vectorization.
- Scalability: Complex relational queries may be slower at massive scale than optimized vector-index lookups.
- Brittleness: The formal structure might struggle to flexibly handle highly idiomatic or ungrammatical language.