R&D Innovation Lab

Advanced AI Research

Code-switched NLP, multi-tier memory retrieval, and on-demand social graph extraction built for the 500M people who think in more than one language at once

EN · HI · PA

3 Languages + 2 Code-Switches

3-tier

Memory Architecture

E2E

AES-256 Encrypted

0ms

Memory Overhead

Active Research Areas

Hard engineering problems at the intersection of code-switched NLP, episodic memory, and social graph inference

Code-Switched Emotion Detection

A single tokenization pipeline handles intra-sentence language switches across three languages and their scripts: English (Latin), Hindi (Devanagari), and Punjabi (Gurmukhi), plus two code-switched modes, Hinglish and Punjabi-English. Script detection runs inside the same pass and routes each segment to the appropriate sub-model, with no separate preprocessing step. Emotion labels are unified across all five surface forms.

Code-Switching · Script Detection · Unified Tokenizer
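As a rough illustration of script-based segment routing, the sketch below splits a message into runs by Unicode block. The block ranges for Devanagari (U+0900 to U+097F) and Gurmukhi (U+0A00 to U+0A7F) come from the Unicode Standard; the function names and the rule that spaces and punctuation attach to the preceding run are assumptions made for this example, not a description of the production tokenizer.

```python
from itertools import groupby

# Unicode block ranges (Unicode Standard).
DEVANAGARI = range(0x0900, 0x0980)
GURMUKHI = range(0x0A00, 0x0A80)


def script_of(ch: str) -> str:
    """Classify one character; 'common' covers spaces, digits, and punctuation."""
    cp = ord(ch)
    if cp in DEVANAGARI:
        return "devanagari"
    if cp in GURMUKHI:
        return "gurmukhi"
    if ch.isascii() and ch.isalpha():
        return "latin"
    return "common"


def segment_by_script(text: str) -> list[tuple[str, str]]:
    """Split text into (script, segment) runs.

    Assumption for this sketch: neutral characters join the run that precedes them.
    """
    labels = []
    current = "common"
    for ch in text:
        script = script_of(ch)
        if script != "common":
            current = script
        labels.append(current)
    runs = []
    for script, group in groupby(zip(labels, text), key=lambda pair: pair[0]):
        runs.append((script, "".join(ch for _, ch in group)))
    return runs
```

Each run could then be handed to a per-script sub-model; how romanized Hindi inside a Latin run is recognized is a separate question that script ranges alone cannot answer.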

Multi-Tier Episodic Memory

Three retrieval tiers are assembled on demand per request: a hot cache (last 7 days, O(1) key lookup), a warm tier (emotional episode extraction, triggered by temporal phrases such as "last week" or "kal"), and a cold tier (dense vector search over older episodes, using cosine similarity against a 768-dimensional embedding index). On a hot-cache hit, memory injection costs only a constant-time key lookup, with no vector search on the request path.

Vector Search · Episodic Retrieval · Temporal Triggers
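The tiering can be sketched in a few lines. Everything named here (`MemoryStore`, `TEMPORAL_TRIGGERS`, the trigger list, the in-memory dictionaries) is hypothetical, and the toy vectors are 2-dimensional rather than 768-dimensional. The sketch also assumes that the warm and cold tiers are consulted only when a temporal phrase appears, which is one reading of the description above.

```python
import math
import re
from dataclasses import dataclass, field

# Hypothetical trigger phrases; "kal" is Hindi for yesterday/tomorrow.
TEMPORAL_TRIGGERS = re.compile(r"\b(last week|yesterday|kal)\b", re.IGNORECASE)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class MemoryStore:
    hot: dict[str, str] = field(default_factory=dict)         # user_id -> recent summary
    warm: dict[str, list[str]] = field(default_factory=dict)  # user_id -> emotional episodes
    cold: dict[str, list[tuple[list[float], str]]] = field(default_factory=dict)  # (embedding, text)

    def retrieve(self, user_id: str, message: str, query_vec: list[float], k: int = 2) -> list[str]:
        """Return memory snippets in priority order: hot, then warm, then cold."""
        context = []
        recent = self.hot.get(user_id)  # O(1) keyed lookup, always attempted
        if recent is not None:
            context.append(recent)
        if TEMPORAL_TRIGGERS.search(message):  # slower tiers only on temporal cues
            context.extend(self.warm.get(user_id, []))
            ranked = sorted(
                self.cold.get(user_id, []),
                key=lambda item: cosine(query_vec, item[0]),
                reverse=True,
            )
            context.extend(text for _, text in ranked[:k])
        return context
```

A real cold tier would query an approximate nearest-neighbour index rather than sorting every stored episode; the linear scan here only keeps the example self-contained.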

Social Graph Extraction

After each conversation turn, a background extraction pipeline identifies named entities, infers relationship types (colleague, partner, family), and writes structured records to a partitioned vector collection — one partition per community context. This builds a persistent social knowledge graph from unstructured natural conversation with no explicit user input.

NER Pipeline · Relationship Inference · Partitioned Graph
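To make the shape of the extracted records concrete, here is a deliberately simplified sketch. A regular expression over cue words stands in for the NER and relationship-inference models, and a dictionary of sets stands in for the partitioned vector collection. `RELATION_CUES`, `RelationRecord`, and `PartitionedGraph` are illustrative names, not the actual schema.

```python
import re
from dataclasses import dataclass

# Hypothetical cue words mapped to the relationship types named above.
RELATION_CUES = {
    "colleague": "colleague", "boss": "colleague",
    "wife": "partner", "husband": "partner",
    "brother": "family", "sister": "family", "mom": "family", "dad": "family",
}
PATTERN = re.compile(r"\bmy (" + "|".join(RELATION_CUES) + r") ([A-Z][a-z]+)")


@dataclass(frozen=True)
class RelationRecord:
    person: str
    relation: str
    evidence: str  # the text span that produced the record


def extract_relations(turn: str) -> list[RelationRecord]:
    """Toy stand-in for NER plus relationship inference on one conversation turn."""
    return [
        RelationRecord(m.group(2), RELATION_CUES[m.group(1)], m.group(0))
        for m in PATTERN.finditer(turn)
    ]


class PartitionedGraph:
    """One set of records per community context (an in-memory stand-in)."""

    def __init__(self) -> None:
        self._partitions: dict[str, set[RelationRecord]] = {}

    def write(self, community: str, records: list[RelationRecord]) -> None:
        self._partitions.setdefault(community, set()).update(records)

    def people(self, community: str) -> list[str]:
        return sorted(r.person for r in self._partitions.get(community, ()))
```

In this sketch, writing the same record twice is idempotent because records are hashable and stored in a set; a production store would need its own deduplication rule.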

Zero-Knowledge Conversation Storage

All conversation messages are AES-256 encrypted at the application layer before write. Keys are derived per user and never stored alongside ciphertext, so the storage layer sees only opaque blobs. Memory retrieval, embedding, and summarization all operate on decrypted payloads in-process, with no plaintext persisted. Encryption adds no user-facing latency.

AES-256 · Per-User Key Derivation · Zero Plaintext at Rest
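The page does not say how per-user keys are derived. One common approach, shown here purely as an assumption, is HKDF (RFC 5869) over a master secret with the user ID as context. The sketch uses only the Python standard library and yields a 32-byte key suitable for AES-256; the salt string and function names are made up for the example, and the encryption step itself would be delegated to a vetted cryptography library.

```python
import hashlib
import hmac


def hkdf_sha256(master_key: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF with HMAC-SHA256 (RFC 5869): extract a pseudorandom key, then expand it."""
    prk = hmac.new(salt, master_key, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand step: T(i) = HMAC(PRK, T(i-1) | info | i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_user_key(master_key: bytes, user_id: str) -> bytes:
    """Derive a 32-byte (AES-256 sized) key bound to one user.

    The salt value is a hypothetical, application-wide constant.
    """
    return hkdf_sha256(master_key, salt=b"conversation-store-v1", info=user_id.encode("utf-8"))
```

Because the key is recomputed from the master secret and the user ID on demand, nothing key-like needs to sit next to the ciphertext in the database.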

Open Problems

Three problems remain unsolved: reliable emotion classification on intra-sentence switches where the sentiment word and the subject are in different scripts; coreference resolution across sessions without a persistent entity store; and measuring response quality for emotional support without ground-truth labels. These are the real frontiers.

Live Architecture

Request Pipeline

Four sequential stages. Every message, every time.

Stage 01

Language Detection

Script analysis, Hinglish marker matching, language routing — single tokenization pass before the LLM call

< 2ms
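Hinglish is usually typed in Latin script, so script analysis alone cannot tell it apart from English. Marker matching can be approximated with a small lexicon of romanized Hindi function words. The lexicon, the threshold, and the function names below are illustrative assumptions, not the production values.

```python
import re

# Hypothetical marker lexicon: common romanized Hindi function words.
HINGLISH_MARKERS = frozenset(
    {"hai", "hoon", "nahi", "kya", "kal", "yaar", "bahut", "mujhe", "tum", "kyun"}
)
WORD = re.compile(r"[a-z]+")


def hinglish_score(message: str) -> float:
    """Fraction of Latin-script words that appear in the marker lexicon."""
    words = WORD.findall(message.lower())
    if not words:
        return 0.0
    return sum(word in HINGLISH_MARKERS for word in words) / len(words)


def route_latin_text(message: str, threshold: float = 0.2) -> str:
    """Route a Latin-script message to 'hinglish' or 'english' (assumed labels)."""
    return "hinglish" if hinglish_score(message) >= threshold else "english"
```

For example, "yaar mujhe bahut tension hai" scores 4 of 5 words as markers and routes to Hinglish, while "I feel tired today" scores zero and routes to English.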

Stage 02

Memory Assembly

Hot cache lookup → warm episode extraction → cold vector retrieval, resolved in priority order and token-budgeted

O(1) hot · ~40ms cold
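Priority-ordered, token-budgeted assembly might look like the following sketch. Word counting stands in for a real tokenizer, and the policy of skipping a snippet that does not fit (rather than stopping at it) is an assumption chosen for the example.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def assemble_memory(tiers: list[list[str]], budget: int) -> list[str]:
    """Pick snippets tier by tier (highest priority first) within a token budget.

    A snippet that would exceed the remaining budget is skipped, and later,
    smaller snippets may still be included.
    """
    selected = []
    remaining = budget
    for tier in tiers:
        for snippet in tier:
            cost = estimate_tokens(snippet)
            if cost <= remaining:
                selected.append(snippet)
                remaining -= cost
    return selected
```

Called with the hot, warm, and cold results in that order, higher tiers claim budget first and lower tiers fill whatever is left.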

Stage 03

Grounded Generation

Language pack, memory context, and persona signals compose a structured prompt: culturally grounded, with generation anchored to retrieved memory to avoid hallucinated facts

LLM latency only
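A structured prompt of this kind could be composed as below. The section labels, the `PromptInputs` fields, and the instruction wording are all hypothetical; the point is only that each signal gets its own clearly delimited block and that retrieved memories are the sole source of facts offered to the model.

```python
from dataclasses import dataclass


@dataclass
class PromptInputs:
    language: str         # e.g. "Hinglish", from the detection stage
    style_notes: str      # language-pack guidance (tone, slang, script)
    memories: list[str]   # snippets selected by memory assembly
    persona: str          # persona description
    user_message: str


def compose_prompt(inputs: PromptInputs) -> str:
    """Compose a sectioned prompt; section names are illustrative only."""
    memory_block = "\n".join(f"- {m}" for m in inputs.memories) or "- (none)"
    return (
        f"[PERSONA]\n{inputs.persona}\n\n"
        f"[LANGUAGE]\nReply in {inputs.language}. {inputs.style_notes}\n\n"
        f"[KNOWN FACTS]\n{memory_block}\n"
        "Only refer to facts listed above; ask rather than guess.\n\n"
        f"[USER]\n{inputs.user_message}"
    )
```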

Stage 04

Async Extraction

Post-response: entity extraction, session stats, memory indexing, graph writes — all non-blocking, zero impact on response latency

0ms (async)
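The non-blocking pattern can be illustrated with `asyncio`. In this sketch the handler schedules the bookkeeping as a background task and returns the response without awaiting it. The function names and the in-memory `extraction_log` are placeholders for the real extraction, stats, and indexing work.

```python
import asyncio

# Strong references keep scheduled tasks from being garbage-collected mid-flight.
background_tasks: set[asyncio.Task] = set()
extraction_log: list[str] = []


async def extract_and_index(message: str) -> None:
    """Placeholder for entity extraction, session stats, indexing, and graph writes."""
    await asyncio.sleep(0.01)
    extraction_log.append(message)


async def handle_message(message: str) -> str:
    response = f"echo: {message}"  # placeholder for grounded generation
    task = asyncio.create_task(extract_and_index(message))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return response  # returned before the background work has run


async def demo() -> tuple[str, int, int]:
    response = await handle_message("hello")
    logged_at_return = len(extraction_log)   # background work not done yet
    await asyncio.gather(*background_tasks)  # only the demo waits for it
    return response, logged_at_return, len(extraction_log)
```

Truly fire-and-forget tasks can fail silently, so a production version would likely attach error logging in the done callback.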

Engineering Principles

Encrypt Everything, Trust Nothing

AES-256 at the application layer before any DB write. The storage tier is treated as untrusted. Decryption happens in-process, never cached to disk.

Retrieval on the Critical Path Must Be O(1)

Hot-tier memory is a keyed cache lookup. Cold-tier vector search triggers only when temporal signals appear in the message, so no semantic search runs on every request.

Language as a First-Class Signal

Script and language identity are detected per-message, not per-session. Response style, vocabulary, and slang mirroring adapt within a single conversation turn.

Post-Response Work Is Always Async

Entity extraction, graph writes, session stats, and embedding indexing are all fire-and-forget after the stream completes. Response latency is bounded by generation, not bookkeeping.

Engineering Roadmap

Concrete technical objectives, not aspirational metrics

Cross-Session Coreference

Resolve entity mentions across sessions without a full entity store — link "he" in session 12 to the named person from session 3

Intra-Sentence Switch Classification

Accurate emotion label when sentiment word and subject are in different scripts within the same clause

Adaptive Persona Calibration

Learn per-user drama intensity and slang mirroring coefficients from conversation history rather than static defaults

Response Quality Signal Without Labels

Proxy metrics for emotional support quality — return rate, session depth, sentiment shift — without requiring human annotation
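As a starting point, the three proxies could be computed from session logs roughly as follows. The `Session` shape and the exact definitions (a 7-day return window, depth as user turns per session, shift as last-turn minus first-turn sentiment) are assumptions for illustration; choosing and validating the definitions is the open part of the problem.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    user_id: str
    day: int                 # day index of the session
    sentiments: list[float]  # per-user-turn sentiment in [-1, 1], from any classifier


def return_rate(sessions: list[Session], within_days: int = 7) -> float:
    """Share of users with at least one session that follows another within the window."""
    by_user: dict[str, list[int]] = {}
    for s in sessions:
        by_user.setdefault(s.user_id, []).append(s.day)
    if not by_user:
        return 0.0
    returned = 0
    for days in by_user.values():
        days.sort()
        if any(0 < later - earlier <= within_days for earlier, later in zip(days, days[1:])):
            returned += 1
    return returned / len(by_user)


def session_depth(sessions: list[Session]) -> float:
    """Mean number of user turns per session (raises on an empty list)."""
    return mean(len(s.sentiments) for s in sessions)


def sentiment_shift(sessions: list[Session]) -> float:
    """Mean change in sentiment from the first to the last user turn of each session."""
    return mean(s.sentiments[-1] - s.sentiments[0] for s in sessions if s.sentiments)
```

None of these requires human annotation, but each can move for reasons unrelated to support quality, so they are best read together rather than optimized individually.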
