
Amiru
Umavin Mallawa Arachchi
I'm a Gen AI Engineer and Software Engineering student passionate about building intelligent systems that actually work in production. I specialize in the full AI engineering stack, from designing LLM-powered pipelines to deploying multi-agent systems. My core toolkit spans Python, LangChain, LangGraph, the OpenAI API, HuggingFace Transformers, Pinecone, FAISS, and RAG architecture. I work with agentic frameworks including CrewAI and AutoGen, and I'm experienced in prompt engineering, vector database design, and LLM fine-tuning workflows.

On the software engineering side, I bring solid full-stack depth: React, Next.js, FastAPI, Node.js, TypeScript, REST APIs, and SQL/NoSQL databases. I've worked across the stack in Java, Python, JavaScript, and C#, with a strong foundation in object-oriented design, system architecture, and version control with Git. Beyond the code, I think carefully about how AI systems are structured (memory management, tool use, orchestration, observability), because building agents that are reliable matters more than building agents that are impressive.

I'm currently in my final year of the BSc (Hons) Software Engineering at Cardiff Metropolitan University, enrolled in the STEMLink AI Engineer Bootcamp, and publishing technical content under my personal brand, Amiru AI Lab, on GitHub and Medium. I'm always open to AI engineering internships, collaborations, and conversations about what's next in the space.
Tech stack
Projects
Multi-Task NLP Intelligence Suite (Natural Language Processing - ML Project)
🔖 Introduction

About the project
The Multi-Task NLP Intelligence Suite is a unified neural computing platform designed for developers and AI enthusiasts in the Software & AI Research industry. The project aims to centralize five core NLP operations (Summarization, Sentiment Analysis, Zero-Shot Classification, Named Entity Recognition, and Question Answering) into a single, high-performance interface. I built this suite to bridge the gap between complex backend transformer models and a developer-centric user experience.

📝 1. Summarization
Target Model: facebook/bart-large-cnn
Input Text (Long-form): The James Webb Space Telescope (JWST) is a space telescope developed by NASA with contributions from the European Space Agency (ESA) and the Canadian Space Agency (CSA). It is planned to succeed the Hubble Space Telescope as NASA's flagship mission in astrophysics. JWST was launched on 25 December 2021 on an Ariane 5 rocket from Kourou, French Guiana, and arrived at the Sun–Earth L2 Lagrange point in January 2022. The first image from JWST was released to the public on 11 July 2022. JWST is designed to provide improved infrared resolution and sensitivity over Hubble, viewing objects up to 100 times fainter than the faintest objects detectable by Hubble. This will enable a broad range of investigations across the fields of astronomy and cosmology, such as observation of some of the oldest, most distant, events and objects in the universe, and characterizing the atmospheres of potentially habitable exoplanets.

😊 2. Sentiment Analysis
Target Model: cardiffnlp/twitter-roberta-base-sentiment-latest
Positive Fragment: The new software update is absolutely incredible! Performance has doubled on my device and the interface is stunning.
Negative Fragment: I am deeply disappointed with the latest service outage. It has completely disrupted our workflow for three days.
Neutral Fragment: The meeting is scheduled for 3:00 PM in the conference room. Please bring your updated reports.
🎯 3. Zero-Shot Classification
Target Model: facebook/bart-large-mnli
Input Text: The local team won the championship after a thrilling overtime victory, securing their third trophy in five years.

🏷️ 4. Named Entity Recognition (NER)
Target Model: dslim/bert-base-NER
Input Text: Elon Musk, the CEO of Tesla, met with officials in Berlin to discuss the expansion of Giga Berlin.

❓ 5. Contextual Q&A
Target Model: deepset/roberta-base-squad2
Context: The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France. It is named after the engineer Gustave Eiffel, whose company designed and built the tower. Constructed from 1887 to 1889 as the entrance to the 1889 World's Fair, it was initially criticized by some of France's leading artists and intellectuals for its design, but it has become a global cultural icon of France and one of the most recognizable structures in the world.
Questions:
Who designed the Eiffel Tower?
When was it constructed?
Where is it located?

🤔 Problem space

Problems to solve/Requirements to Create
NLP model testing is often fragmented: developers have to switch between disparate tools or write custom scripts for even simple inference checks.

👉 Fragmented Inference Workflows
Developers often need to interact with multiple HuggingFace models separately, leading to a disconnected workflow when building multi-task AI pipelines.
Current solution
Users typically resort to a mix of Python scripts, Jupyter Notebooks, or the HuggingFace web playground, none of which offers a unified interface for custom multi-task testing.
How do we know it is a problem
Excessive context-switching between model playgrounds.
High barrier to entry for non-technical stakeholders to "see" the model performance.

👉 Lack of "Code-First" Visualization
Most NLP playgrounds are designed for general users and omit the technical metadata (like confidence scores, compression ratios, and probability matrices) that developers need for debugging.
Current solution
Console logs or raw JSON blobs are the standard way developers inspect model outputs, and both are difficult to scan visually for patterns.
How do we know it is a problem
Developer feedback on the difficulty of visualizing NER entity spans in raw text.
Debugging is slowed by manually mapping generic labels such as "LABEL_0" to human-readable sentiment.

Why solve these problems?
Addressing these friction points is critical for accelerating the prototyping phase of AI-driven applications.
Speed of Prototyping: Allows engineers to validate model suitability for specific datasets in seconds.
Improved Collaboration: Provides a visual "Ground Truth" that can be shared with non-developer team members.

Goals

Company objective 🎯
To establish a leading developer-focused AI toolset that simplifies the integration of Large Language Models (LLMs) and Transformers into production environments.

Project goals
Unified Interface: Build a single-page application (SPA) that seamlessly switches between 5 complex NLP engines without reloading.
DevKit Aesthetic: Craft a minimalist, high-contrast UI that mirrors a code editor, making it feel native to a developer’s workflow.
Low-Latency Feedback: Optimize the FastAPI backend to handle multi-task inference with minimal overhead, using GPU acceleration where available.

User Stories

AI Engineer
A technical user who needs to quickly verify whether facebook/bart-large-cnn is suitable for their specific technical documentation datasets.
Goals: Validate summarization quality and compression ratios.
Needs: Access to raw inference metrics and easy cURL export for integration tests.

Product Manager
A non-technical user who wants to "feel" the model's performance on customer feedback before approving a feature roadmap.
Goals: See sentiment analysis trends and NER entity extractions.
Needs: A clean, understandable UI that visually highlights entities and positive/negative scores without looking at code.
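The "LABEL_0" mapping problem noted in the problem space can be handled with a small lookup applied to each prediction. The following is only a minimal sketch, not the project's code: the names LABEL_MAP and humanize are hypothetical, the scores are made up, and the class order (negative, neutral, positive) is an assumption that should be checked against the chosen model's own label configuration.

```python
from typing import Dict, List

# Assumed class order; verify against the model's own label configuration.
LABEL_MAP: Dict[str, str] = {
    "LABEL_0": "negative",
    "LABEL_1": "neutral",
    "LABEL_2": "positive",
}


def humanize(predictions: List[dict]) -> List[dict]:
    """Return predictions with readable labels and rounded scores, best first."""
    readable = [
        {
            # Labels that are already readable pass through unchanged.
            "label": LABEL_MAP.get(p["label"], p["label"]),
            "score": round(p["score"], 4),
        }
        for p in predictions
    ]
    return sorted(readable, key=lambda p: p["score"], reverse=True)


# Raw output shaped like a text-classification result (values are invented).
raw = [
    {"label": "LABEL_0", "score": 0.0213},
    {"label": "LABEL_2", "score": 0.9412},
    {"label": "LABEL_1", "score": 0.0375},
]
print(humanize(raw)[0])
```

Keeping the lookup in one place means the UI only ever sees readable labels, whichever model variant produced them.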
🌟 Design space

UI Design
The design follows a "Side-by-Side Editor" flow. The left panel serves as the Terminal/Editor, where users input raw text (mocked as a TypeScript constant), and the right panel serves as the Inference Report, mimicking an IDE’s output window.

High-fidelity design
The interface uses a sleek dark mode with neon accents: green for success/summarization, purple for classification, and orange for Q&A. This color coding provides instant categorical recognition.

Design system 🎨
We used Tailwind CSS 4 and Lucide React icons to build a custom design system focused on:
Consistency: Uniform border-radius and "glassmorphism" effects across all tool modules.
Modularity: Each NLP tool is a standalone script simulation, making it easy to add new "files" to the sidebar explorer.
Responsiveness: Recently updated to support a mobile sidebar and stacked panel architecture for use on the go.

Development Phase

Technology Stack Selection

1. Backend - FastAPI & HuggingFace Transformers
Why FastAPI?
Performance: Asynchronous request handling allows multiple inference requests to be queued efficiently.
Auto-Documentation: Automatic Swagger/OpenAPI generation for the 5 NLP endpoints.
Why HuggingFace Transformers?
SOTA Accuracy: Access to industry-leading models like BART for summarization and RoBERTa for sentiment.
Unified Pipeline API: Allows the backend to switch between different model tasks with a consistent implementation pattern.

2. Frontend - Next.js 15 (App Router)
Reusable UI Components: Used for building modular tool views.
TypeScript First: Ensures type safety between the complex backend schemas and the frontend state.

High-Level Architecture Diagram
The system uses a decoupled client-server architecture. The Next.js frontend communicates with the FastAPI neural engine via structured JSON POST requests.
[Figure: NLP high-level architecture diagram]

Key Features of the Software

1. Adaptive Summarization Engine
Decision: Chose facebook/bart-large-cnn for its high compression quality, but ran into its 1024-token input limit.
Implementation: Developed a recursive chunking algorithm that splits long documents into 900-word segments, summarizes them individually, and then re-summarizes the combined fragments to ensure a coherent final output.

2. NER Token Visualizer
Decision: Standard text output was insufficient for identifying entities.
Implementation: Built a custom regex-based annotation parser that injects Tailwind-styled components into the text stream, allowing users to "see" entities (PER, ORG, LOC) inline with vivid color coding.

Challenges Faced and Solutions

Problem: 1024 Token Window Constraint
BART-family models such as facebook/bart-large-cnn have a hard input limit of 1024 tokens. When users pasted long articles, the model would fail or truncate the input, potentially dropping important content.

Solution: Recursive Sliding Window Summarization
Instead of simple truncation, we implemented chunking logic:
Split: Data is split into manageable chunks (approx. 900 words).
Batch Process: Each chunk is summarized by the neural engine.
Consolidate: The results are combined, and if they still exceed the limit, a second-pass "Meta-Summarization" is triggered to produce the final brief.

Future Vision / next steps

Long-term vision
V2: Custom Model Uploads: Allow developers to provide their own HuggingFace model IDs to test on the suite.
V3: Batch File Processing: Enable drag-and-drop for .txt and .pdf files to process massive datasets in one click.
UI Enhancement: Add interactive "Confidence Tuning" sliders to filter NER and Sentiment results in real time.
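
The Split, Batch Process, and Consolidate steps described under Challenges Faced and Solutions can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the suite's actual code. The names split_into_chunks, summarize_long, and stub_summarizer are hypothetical, and the stub stands in for the real model call so the example runs without downloading a model. The max_passes guard is an added assumption to guarantee termination. Word count is only a rough proxy for tokens, so a 900-word chunk may still exceed 1024 tokens depending on the tokenizer.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int = 900) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize_long(
    text: str,
    summarize: Callable[[str], str],
    max_words: int = 900,
    max_passes: int = 3,  # assumption: cap on passes so recursion always ends
) -> str:
    """Summarize chunk by chunk, re-summarizing while the result is too long."""
    chunks = split_into_chunks(text, max_words)
    if len(chunks) <= 1:
        # Short enough for a single model call.
        return summarize(text)

    # Batch process: summarize every chunk, then consolidate.
    combined = " ".join(summarize(chunk) for chunk in chunks)

    # Second pass ("meta-summarization") only if still over the limit.
    if len(combined.split()) > max_words and max_passes > 1:
        return summarize_long(combined, summarize, max_words, max_passes - 1)
    return combined


def stub_summarizer(chunk: str) -> str:
    """Stand-in for a model call: keeps the first 10 words of the chunk."""
    return " ".join(chunk.split()[:10])


long_text = " ".join(f"w{i}" for i in range(2000))
brief = summarize_long(long_text, stub_summarizer)
print(len(brief.split()))
```

Passing the summarizer in as a callable keeps the chunking logic testable on its own; in a real backend the callable would wrap the summarization model instead of the stub.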