Projects

Things I've Built

Most of these started with a question I couldn't stop thinking about. Some turned into production systems. Others taught me why the problem was hard.

---

Temporal CNN-Based Intrusion Detection System

Capstone Project · Active

The Problem: DDoS detection is a sequence problem, not a snapshot problem. A single packet tells you almost nothing. A sequence of packets — their timing, inter-arrival deltas, rate ramp — tells you everything. Most detection systems either ignore this (static thresholds), overfit it (LSTM on sparse flows), or drown in it (317 features per flow, all noise).

The Architecture: Temporal Convolutional Network trained on BCCC-Cloud-DDoS-2024: 540,000+ real flows, a mix of DDoS and benign traffic. TCNs handle sequential patterns better than LSTMs on this kind of data: far less exposure to vanishing gradients, parallelizable training, and a fixed, predictable receptive field across variable-length flows.
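To make the receptive-field point concrete, here is a minimal pure-Python sketch of a dilated causal convolution and the history window a stack of them can see. The kernel size and the doubling dilation schedule are illustrative assumptions, not the configuration used in this project.

```python
def receptive_field(kernel_size, dilations, convs_per_level=1):
    """Timesteps of history visible to the last output of a dilated causal stack."""
    return 1 + convs_per_level * (kernel_size - 1) * sum(dilations)


def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution: out[t] uses only x[t], x[t-d], x[t-2d], ...

    Positions before the start of the sequence count as zero (left padding).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


# Assumed schedule: kernel size 3, dilations doubling per level.
dilations = [1, 2, 4, 8]
history = receptive_field(kernel_size=3, dilations=dilations)

# Causality check: changing a future input must not change earlier outputs.
a = causal_dilated_conv([1, 2, 3, 4], [1, 1], dilation=2)
b = causal_dilated_conv([1, 2, 3, 99], [1, 1], dilation=2)
```

With two convolutions per level, as in a typical residual block, pass `convs_per_level=2`. The receptive field depends only on the architecture, not on how long a given flow is.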

Feature Engineering: The dataset provides 317 features per flow. The model doesn't need all of them; most are correlated or irrelevant to attack signatures.

Built an automated feature selection pipeline that identified 32 critical indicators, a 90% dimensionality reduction (317 down to 32). Training efficiency improved 85%. Detection accuracy held. The 32 features that survived are the ones that actually discriminate attacks, not statistical noise amplified by high dimensionality.
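One common first pass in a selection pipeline like this is correlation pruning: drop constant columns, then drop any feature that is nearly a linear copy of one already kept. The sketch below is a generic illustration of that step, not the project's actual pipeline. The column names, the sample values, and the 0.95 threshold are all assumptions.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def prune_correlated(columns, threshold=0.95):
    """Greedy pruning over a dict of name -> values, in insertion order.

    A feature is kept only if it varies and its |r| with every
    already-kept feature is below the threshold.
    """
    kept = []
    for name, values in columns.items():
        if len(set(values)) <= 1:  # constant column carries no signal
            continue
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical flow features, for illustration only.
flows = {
    "pkt_rate": [1, 2, 3, 4, 5],
    "byte_rate": [10, 20, 30, 40, 50],  # perfectly correlated with pkt_rate
    "iat_mean": [5, 1, 4, 2, 3],
    "reserved_flag": [0, 0, 0, 0, 0],   # constant
}
survivors = prune_correlated(flows)
```

Correlation pruning only removes redundancy. Deciding which of the remaining features actually discriminate attacks still needs a model-aware step such as importance ranking or wrapper search.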

Explainability: Integrating the Gemini API to generate natural language alert explanations. Instead of `ALERT: flow_id=47293 label=anomaly confidence=0.94`, security teams get:

> "Coordinated SYN flood — 47 source IPs, packet rate 400× baseline, sustained 3-minute window targeting port 443. Consistent with volumetric DDoS. Recommend upstream rate limiting."

The detection accuracy matters. So does whether the team can act on it in 30 seconds.
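A rough sketch of the step that sits between the detector and the language model: turning a structured alert into a prompt. The evidence field names (`source_ips`, `dst_port`, `rate_vs_baseline`) and the prompt wording are hypothetical, and the call to the Gemini API is deliberately left out. This function only builds the text that would be sent.

```python
def build_alert_prompt(alert):
    """Turn a structured alert dict into a plain-text prompt for a language model."""
    required = ("flow_id", "label", "confidence")
    missing = [key for key in required if key not in alert]
    if missing:
        raise ValueError(f"alert is missing fields: {missing}")

    # Anything beyond the required fields is passed along as supporting evidence.
    evidence = [
        f"- {key}: {value}"
        for key, value in sorted(alert.items())
        if key not in required
    ]
    lines = [
        "You are assisting a network security analyst.",
        f"Alert: flow_id={alert['flow_id']} label={alert['label']} "
        f"confidence={alert['confidence']:.2f}",
        "Evidence:",
        *(evidence or ["- (none provided)"]),
        "In two or three sentences, state the likely attack type, the evidence "
        "for it, and one recommended action. Do not invent numbers.",
    ]
    return "\n".join(lines)


example = build_alert_prompt({
    "flow_id": 47293, "label": "anomaly", "confidence": 0.94,
    "source_ips": 47, "dst_port": 443, "rate_vs_baseline": 400,
})
```

Passing only measured evidence, and telling the model not to invent numbers, keeps the explanation tied to what the detector actually observed.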

Tech: Python · TensorFlow · Temporal CNN · Gemini API · Pandas · scikit-learn · network flow analysis

Status: Tuning false positive rate. Detection without alert fatigue is the actual benchmark.

---

QUIC Router Simulation

Side Project · Ongoing

The Question: When 10,000 flows compete for 1 Gbps, which packet gets sent first, and what does that choice do to tail latency?

What I'm Building: A simulated QUIC router environment for testing queue scheduling algorithms under realistic load conditions.

Measuring P99 tail latency specifically — not mean, not median. P99 is where video streaming stutters, where WebRTC calls drop frames, where the user experience degrades. Mean latency looks fine right up until it doesn't.
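A small worked example of why the mean hides the tail. The percentile uses the nearest-rank definition, and the latency samples are synthetic numbers chosen for illustration, not measurements from the simulator.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Synthetic: 98 packets at 10 ms, 2 packets stuck behind a burst at 500 ms.
latencies_ms = [10.0] * 98 + [500.0] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
median_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
```

Here the mean comes out under 20 ms and the median is 10 ms, while P99 is 500 ms. Two slow packets in a hundred barely move the average, yet they are exactly the ones a video frame or a WebRTC call ends up waiting on.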

Why QUIC: QUIC is the transport protocol underneath HTTP/3, and every major browser supports it. Understanding queue scheduling behavior at this layer means understanding where real internet bottlenecks form, not in theory but under concurrent mixed-priority traffic.
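To make the scheduling question concrete, here is a toy single-link model comparing FIFO with strict priority. It is a sketch under heavy simplifying assumptions (one output link, no packet loss, no QUIC stream semantics) and is not the simulator itself. The packet tuple layout and the priority convention are assumptions.

```python
import heapq


def simulate(packets, link_bps, use_priority):
    """Serve packets over one link and return per-packet latency in input order.

    packets: list of (arrival_s, size_bytes, priority); lower priority value
    means more urgent. Latency is finish time minus arrival time.
    """
    order = sorted(range(len(packets)), key=lambda i: packets[i][0])
    latencies = [0.0] * len(packets)
    queue = []  # heap of (sort_key, arrival_sequence, packet_index)
    now, nxt = 0.0, 0
    while nxt < len(order) or queue:
        if not queue:
            # Link is idle: jump ahead to the next arrival.
            now = max(now, packets[order[nxt]][0])
        # Admit everything that has arrived by now.
        while nxt < len(order) and packets[order[nxt]][0] <= now:
            i = order[nxt]
            key = packets[i][2] if use_priority else 0
            heapq.heappush(queue, (key, nxt, i))
            nxt += 1
        _, _, i = heapq.heappop(queue)
        now += packets[i][1] * 8 / link_bps  # transmission time
        latencies[i] = now - packets[i][0]
    return latencies


# Three 1000-byte packets arrive together on an 8 kbps link (1 s each to send).
# The third is marked urgent (priority 0).
burst = [(0.0, 1000, 1), (0.0, 1000, 1), (0.0, 1000, 0)]
fifo = simulate(burst, link_bps=8000, use_priority=False)
prio = simulate(burst, link_bps=8000, use_priority=True)
```

Total work is identical in both runs. Strict priority only moves the waiting: the urgent packet finishes first and the other two each wait one extra slot, which is the trade that shows up in tail latency under load.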

Tech: Python · network simulation · QUIC protocol stack · algorithmic queue management

---

DSA Coaching Platform

Full-Stack Project

The Problem: Most DSA prep resources are either passive (YouTube videos) or expensive (paid tutors). Students who learn by doing and explaining need something in between — a space where they can work through problems, see structured breakdowns, and build intuition rather than memorize patterns.

What I Built: Full-stack web platform for interactive DSA coaching. Students work through curated problem sets organized by concept — not just difficulty. Each problem includes structured approach breakdowns, edge case analysis, and complexity proofs. Built the backend API, content system, and frontend from scratch.

What It Taught Me: Content organization is a design problem. How you structure information determines how people learn from it. The graph thinking applies: prerequisites are edges, concepts are nodes, and good pedagogy is shortest-path from confusion to understanding.
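The prerequisite-graph idea can be sketched with a breadth-first search: concepts are nodes, "unlocks" relationships are directed edges, and the shortest learning path is the fewest-hops route from what a student knows to what they want to learn. The concept names and edges below are made up for illustration and are not the platform's actual curriculum.

```python
from collections import deque


def learning_path(unlocks, start, goal):
    """Shortest chain of concepts from start to goal, or None if unreachable.

    unlocks maps a concept to the list of concepts it is a direct prerequisite for.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in unlocks.get(node, []):
            if nxt not in parent:  # first visit is the shortest route in BFS
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical curriculum graph.
curriculum = {
    "arrays": ["recursion", "hashing"],
    "recursion": ["trees"],
    "trees": ["graphs", "heaps"],
    "graphs": ["shortest_paths"],
}
path = learning_path(curriculum, "arrays", "shortest_paths")
```

BFS is enough here because every edge costs the same. If concepts had different difficulty weights, Dijkstra's algorithm would be the natural replacement.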

Tech: Python · FastAPI · React · PostgreSQL · Docker

---

Self-Hosted AI Inference Stack

Infrastructure Project · Ongoing

Why Self-Host: Running inference through APIs gives you outputs. Running it on your own hardware gives you understanding. Quantization tradeoffs, memory bandwidth ceilings, latency under concurrent requests, thermal throttling at sustained load — these are the constraints that matter when you're building production AI systems, and you don't learn them from API calls.

What I Run: Local LLM inference stack on personal hardware. Quantized models (GGUF/GGML), inference backends, load testing under simulated concurrent requests. Measuring actual throughput, tokens/sec at various batch sizes, and latency degradation curves.
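A minimal sketch of how throughput and latency can be summarized from a load-test log. The record layout (start time, end time, tokens generated) is an assumed format, and the numbers are synthetic rather than measurements from this stack.

```python
def summarize(records):
    """Summarize a load test from (start_s, end_s, tokens_generated) tuples."""
    if not records:
        raise ValueError("no records to summarize")
    wall_s = max(end for _, end, _ in records) - min(start for start, _, _ in records)
    total_tokens = sum(tokens for _, _, tokens in records)
    per_request = [tokens / (end - start) for start, end, tokens in records]
    return {
        # Tokens produced by the whole system per second of wall-clock time.
        "aggregate_tok_per_s": total_tokens / wall_s,
        # Generation speed as experienced by a single client, averaged.
        "mean_request_tok_per_s": sum(per_request) / len(per_request),
        "max_latency_s": max(end - start for start, end, _ in records),
    }


# Synthetic: one request alone, then four identical requests served concurrently.
solo = summarize([(0.0, 4.0, 100)])
batch = summarize([(0.0, 8.0, 100)] * 4)
```

In this made-up case, going from one to four concurrent requests doubles aggregate throughput while halving the speed each client sees. Plotting both numbers against concurrency gives the latency degradation curve.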

What I've Learned:

Tech: Linux · CUDA/CPU inference backends · llama.cpp · Python · local networking

---

Network Intrusion Detection Using Deep Learning

Research Project · Anna University

The Challenge: Detect 9 different attack types (DoS, exploits, reconnaissance, fuzzers, backdoors, and more) from raw network flow records. Dataset: UNSW-NB15, with 2.5 million flows, 49 raw features, and real attack captures.

The Architecture: Hybrid LSTM-CNN: LSTM to capture sequential flow behavior across time, CNN to detect local feature patterns within individual flows. Together they handle both temporal dependencies and spatial feature interactions.

The Optimization Problem: 42 candidate features. Many are correlated. Some carry no signal for attack discrimination. Training on all 42 adds noise and inflates model complexity.

Built a two-stage metaheuristic pipeline:

  1. Sine Cosine Algorithm (SCA) — global search across feature space
  2. Particle Swarm Optimization (PSO) — local refinement of candidate feature sets

Result: 42 → 18 features. False positive rate dropped 25%. Real-time detection latency improved. The interesting part: PSO, a technique from the swarm intelligence literature, solved a network security problem better than manual feature engineering.
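A compact sketch of how a two-stage SCA then PSO search over feature masks can be wired together. It follows the textbook update rules for both algorithms but is not the project's implementation. The toy fitness function, the hyperparameters, the 0.5 threshold for turning positions into masks, and seeding PSO with SCA's final population are all assumptions. In practice the fitness would be a validation score from the detector.

```python
import math
import random


def mask_of(position):
    """A feature is selected when its coordinate exceeds 0.5."""
    return [p > 0.5 for p in position]


def clip(v):
    return min(1.0, max(0.0, v))


def sca_stage(fitness, n_features, rng, agents=12, iters=30, a=2.0):
    """Sine Cosine Algorithm: wide oscillating moves around the best agent."""
    pop = [[rng.random() for _ in range(n_features)] for _ in range(agents)]
    best = max(pop, key=lambda p: fitness(mask_of(p)))[:]
    for t in range(iters):
        r1 = a - t * a / iters  # step amplitude shrinks over time
        for pos in pop:
            for j in range(n_features):
                r2 = rng.uniform(0, 2 * math.pi)
                r3 = rng.uniform(0, 2)
                wave = math.sin(r2) if rng.random() < 0.5 else math.cos(r2)
                pos[j] = clip(pos[j] + r1 * wave * abs(r3 * best[j] - pos[j]))
            if fitness(mask_of(pos)) > fitness(mask_of(best)):
                best = pos[:]
    return pop, best


def pso_stage(fitness, pop, best, rng, iters=30, w=0.6, c1=1.5, c2=1.5):
    """Particle Swarm Optimization: refine around personal and global bests."""
    n = len(best)
    vel = [[0.0] * n for _ in pop]
    pbest = [p[:] for p in pop]
    gbest = best[:]
    for _ in range(iters):
        for i, pos in enumerate(pop):
            for j in range(n):
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[j])
                             + c2 * rng.random() * (gbest[j] - pos[j]))
                pos[j] = clip(pos[j] + vel[i][j])
            score = fitness(mask_of(pos))
            if score > fitness(mask_of(pbest[i])):
                pbest[i] = pos[:]
            if score > fitness(mask_of(gbest)):
                gbest = pos[:]
    return mask_of(gbest)


def select_features(fitness, n_features, seed=0):
    rng = random.Random(seed)
    pop, best = sca_stage(fitness, n_features, rng)
    return pso_stage(fitness, pop, best, rng)


# Toy stand-in for a validation score: reward three "informative" features,
# charge a small cost per selected feature.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    return sum(1 for j in INFORMATIVE if mask[j]) - 0.1 * sum(mask)


selected = select_features(toy_fitness, n_features=8, seed=0)
```

The per-feature cost in the fitness is what pushes the search toward smaller subsets. Without it, selecting everything is never worse than selecting the right few.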

What I Learned: Feature selection is the work. Not a preprocessing step. The 18 features that survived are the ones that actually separate attack behavior from normal flow patterns. Everything else is noise that makes your model worse.

Tech: Python · TensorFlow · LSTM · CNN · metaheuristic optimization · UNSW-NB15

---

Secure Digital Library Platform

Production System · Anna University

The Context: Different access levels for students, faculty, and administrators. The platform had to handle traffic spikes during exam season without falling over.

What I Built: Full-stack platform with OAuth 2.0 authentication, role-based access control, and containerized microservices. PostgreSQL backend. Docker deployment for horizontal scaling.

The Results: 99.5% uptime. Sub-200ms API responses under peak load. Students could access resources when they needed them, exam night included.

What Mattered: Session management correctness, RBAC implementation that didn't have privilege escalation gaps, and database queries that didn't degrade under concurrent reads. The boring infrastructure correctness that makes or breaks production systems.
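The core of an RBAC check without escalation gaps is deny-by-default: a request is allowed only if the role is known and the permission is explicitly granted. A minimal sketch of that rule follows, written in Python for illustration even though the platform itself runs on Node.js. The three roles come from the description above, while the permission names are hypothetical.

```python
# Explicit grants per role. Anything not listed is denied.
ROLE_PERMISSIONS = {
    "student": frozenset({"catalog:read", "resource:download"}),
    "faculty": frozenset({"catalog:read", "resource:download", "resource:upload"}),
    "admin": frozenset({
        "catalog:read", "resource:download", "resource:upload", "user:manage",
    }),
}


def is_allowed(role, permission):
    """Deny by default: unknown roles and unlisted permissions both fail."""
    return permission in ROLE_PERMISSIONS.get(role, frozenset())


def authorize(session, permission):
    """Check a request using the role stored in the server-side session.

    The role is never read from client-supplied request data, which is the
    usual source of privilege escalation bugs.
    """
    if not is_allowed(session.get("role"), permission):
        raise PermissionError(f"{permission} denied")
    return True


student_session = {"user_id": 101, "role": "student"}
can_download = authorize(student_session, "resource:download")
```

Keeping grants in one explicit table also makes them easy to audit: every permission a role holds is visible in a single place.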

Tech: Node.js · PostgreSQL · React · Docker · OAuth 2.0 · RBAC

---

Face Detection Mobile App

Undergrad Project

The Architecture: Android app (Java) → FastAPI backend on Heroku → OpenCV Haar Cascade detection → bounding box coordinates → real-time camera overlay.

Async request handling on the backend so multiple concurrent users don't block each other. RESTful API with documented endpoints so the mobile client stays decoupled from detection logic.

Why I Built It: Wanted to understand the full stack: mobile client, API design, computer vision algorithm, cloud deployment, how data actually flows from camera pixel to screen annotation. The full graph, end to end.

Tech: Android · Java · FastAPI · OpenCV · Heroku · REST

Source: GitLab

---

NFT Tracking Dashboard

24-Hour Hackathon · NIT Trichy · First Place

The Situation: NIT Trichy blockchain competition. 24 hours. Tool I'd never touched: Google Apps Script.

What I Built: Real-time dashboard tracking NFT transactions across the blockchain. Live data pulls, transaction pattern visualization, working in 24 hours.

What It Taught Me: You can absorb a new tool fast when you're shipping something real. Also: sometimes the right tool is whatever gets it working in time, not the theoretically perfect solution.

Tech: Google Apps Script · blockchain APIs · rapid prototyping

---

3D Campus Digital Twin

Undergrad Exploration

Modeled my entire undergrad campus in 3D using Blender. Completely unrelated to my career. Taught me a lot about representing complex spatial systems — how you abstract real-world geometry into navigable, queryable representations.

Sometimes the detours are worth it.

---

What Connects These

Feature selection appears in two separate projects. Graph thinking shows up in every one. The gap between benchmark accuracy and production usefulness is the recurring constraint.

Whether it's packets through router queues, network flows hiding attacks, or users hitting a database during exams — the structure is the same. Understand the graph. Find where it breaks. Build the thing that holds.

---

Want to talk about any of these? lokeshlks01@gmail.com · GitHub