Conversational NLP

Artificial Intelligence

AI, Java, Research, Natural Language Processing

Summary

A research-driven AI personal assistant prototype utilizing Natural Language Processing (NLP) and the SURR algorithm to categorize user input and provide context-aware responses.

Details

Sundae was developed as a conversational personal assistant that understands and reacts to specific user topics, mimicking intelligent assistants such as Siri or Alexa. The core objective was to implement a Natural Language Processing (NLP) stack capable of handling casual conversation through structured text analysis. The architecture centers on Spoken Language Understanding (SLU): a series of deterministic processing stages that translate raw input into actionable intents.

The SURR Algorithm and Semantic Parsing

The project utilizes the SURR Algorithm, which processes user text through tree structures consisting of parent and child nodes to identify optimal solutions.

Semantic Parsing: The engine breaks down input into chunks rather than individual words, passing them through a Semantic Parser to identify linguistic structures.
Recursive Pathfinding: A FindPath(Slots) procedure recursively splits the input slots to map them onto nodes within the decision trees.
Task Classification: Once tagged, the input is categorized into functional domains such as greetings, device navigation, or scheduling.
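
The recursive pathfinding step can be illustrated with a small Java sketch. The SURR algorithm's internals are not described here, so everything below is an assumption made for illustration: the Node class, the head/tail way of splitting slots, the exact-label matching rule, and the sample tree are all hypothetical, and only the FindPath(Slots) idea comes from the project.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: the node layout and matching rule are assumptions,
// not the project's actual SURR implementation.
final class SlotPathFinder {

    /** A tree node labelled with a chunk; children are keyed by their label. */
    static final class Node {
        final String label;
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String label) {
            this.label = label;
        }

        /** Returns the child with this label, creating it if it does not exist. */
        Node child(String childLabel) {
            return children.computeIfAbsent(childLabel, Node::new);
        }
    }

    /**
     * Splits the slots into a head and a tail. The head selects a child of the
     * current node, and the tail is resolved recursively against that child.
     * Returns the labels visited, stopping at the first slot with no matching child.
     */
    static List<String> findPath(Node node, List<String> slots) {
        List<String> path = new ArrayList<>();
        path.add(node.label);
        if (slots.isEmpty()) {
            return path;
        }
        Node next = node.children.get(slots.get(0));
        if (next == null) {
            return path;
        }
        path.addAll(findPath(next, slots.subList(1, slots.size())));
        return path;
    }

    /** A made-up tree covering the three example domains. */
    static Node sampleTree() {
        Node root = new Node("root");
        root.child("greetings").child("good morning");
        root.child("scheduling").child("set alarm").child("7 am");
        root.child("navigation").child("open settings");
        return root;
    }

    public static void main(String[] args) {
        List<String> slots = List.of("scheduling", "set alarm", "7 am");
        System.out.println(findPath(sampleTree(), slots));
    }
}
```

In this sketch the first label after the root doubles as the functional domain, which is one simple way the pathfinding and task classification steps could connect.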

AIML and Pandorabots Integration

To manage conversation flows and knowledge retrieval, Sundae leverages Pandorabots and AIML (Artificial Intelligence Markup Language).

Pattern Categorization: User input is tagged with specific tokens that allow the engine to categorize intents (e.g., "happy", "greetings") and follow distinct response patterns.
Contextual Response Generation: The system utilizes a Natural Language Generator to produce accurate, specific responses based on the identified semantic frames.
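
The tag-then-pattern flow described above might look roughly like the following Java sketch. It does not use a real AIML interpreter or any Pandorabots API. The TagResponder class, its keyword-based tagger, the sample patterns, and the simplified trailing "*" wildcard are all assumptions that only mimic how AIML pairs an input pattern with a response template.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of tag-based pattern matching in the spirit of AIML.
final class TagResponder {
    static final String FALLBACK = "Sorry, I did not understand that.";

    // tag -> (pattern -> template); each inner map stands in for one set of AIML categories.
    private static final Map<String, Map<String, String>> CATEGORIES = new LinkedHashMap<>();

    static {
        Map<String, String> greetings = new LinkedHashMap<>();
        greetings.put("HELLO", "Hello! How can I help?");
        greetings.put("HELLO *", "Hello there!");
        CATEGORIES.put("greetings", greetings);

        Map<String, String> happy = new LinkedHashMap<>();
        happy.put("I AM HAPPY", "Glad to hear it!");
        CATEGORIES.put("happy", happy);
    }

    /** Upper-cases the input, drops punctuation, and collapses whitespace. */
    static String normalize(String input) {
        return input.toUpperCase(Locale.ROOT)
                .replaceAll("[^A-Z0-9 ]", "")
                .trim()
                .replaceAll("\\s+", " ");
    }

    /** Assigns a tag from marker tokens (an assumed, deliberately naive tagger). */
    static String tagOf(String normalized) {
        List<String> words = Arrays.asList(normalized.split(" "));
        if (words.contains("HELLO") || words.contains("HI")) {
            return "greetings";
        }
        if (words.contains("HAPPY")) {
            return "happy";
        }
        return "unknown";
    }

    /** Exact match, or prefix match when the pattern ends in " *" (one or more extra words). */
    static boolean matches(String pattern, String input) {
        if (pattern.endsWith(" *")) {
            String prefix = pattern.substring(0, pattern.length() - 1); // keeps the trailing space
            return input.startsWith(prefix) && input.length() > prefix.length();
        }
        return pattern.equals(input);
    }

    /** Tags the input, then returns the first matching template within that tag. */
    static String respond(String rawInput) {
        String input = normalize(rawInput);
        Map<String, String> patterns = CATEGORIES.getOrDefault(tagOf(input), Map.of());
        for (Map.Entry<String, String> entry : patterns.entrySet()) {
            if (matches(entry.getKey(), input)) {
                return entry.getValue();
            }
        }
        return FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(respond("Hello, Sundae!"));
        System.out.println(respond("I am happy."));
    }
}
```

Restricting the pattern search to the categories under one tag keeps each lookup small, which is a plausible reason for tagging the input before matching.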

Feature and Keyword Extraction

A critical component of the pipeline is the Context Catcher, which performs three vital tasks after semantic analysis:

Feature Extraction: Identifying the core intent of the user request.
Keyword Extraction: Pulling specific parameters from the input to increase task completion accuracy.
User Remembrance: The system was modified to learn and retain basic user information, such as names, to provide a personalized conversational experience.
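
The three Context Catcher tasks can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the project's code: the ContextCatcher class shape, the keyword rules for intent, the time-of-day parameter, and the "my name is" pattern are all hypothetical choices.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the three tasks; rules and patterns are assumptions.
final class ContextCatcher {
    private static final Pattern NAME =
            Pattern.compile("\\bmy name is (\\w+)", Pattern.CASE_INSENSITIVE);
    private static final Pattern TIME =
            Pattern.compile("\\b(\\d{1,2}(?::\\d{2})?\\s?(?:am|pm))\\b", Pattern.CASE_INSENSITIVE);

    // Facts remembered about the user across turns, for example their name.
    private final Map<String, String> userFacts = new HashMap<>();

    /** Feature extraction: picks a coarse intent from marker words. */
    String extractIntent(String input) {
        String text = input.toLowerCase(Locale.ROOT);
        if (text.contains("alarm") || text.contains("remind")) {
            return "scheduling";
        }
        if (text.contains("open")) {
            return "navigation";
        }
        if (text.contains("hello") || text.contains("my name is")) {
            return "greetings";
        }
        return "unknown";
    }

    /** Keyword extraction: pulls a time parameter such as "7 am" if one is present. */
    Optional<String> extractTime(String input) {
        Matcher matcher = TIME.matcher(input);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    /** User remembrance: stores the user's name when they introduce themselves. */
    void remember(String input) {
        Matcher matcher = NAME.matcher(input);
        if (matcher.find()) {
            userFacts.put("name", matcher.group(1));
        }
    }

    /** Uses the remembered name, if any, to personalize a greeting. */
    String greet() {
        String name = userFacts.get("name");
        return name == null ? "Hello!" : "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        ContextCatcher catcher = new ContextCatcher();
        catcher.remember("Hi, my name is Priya");
        System.out.println(catcher.greet());
        System.out.println(catcher.extractIntent("Set an alarm for 7 am"));
        System.out.println(catcher.extractTime("Set an alarm for 7 am").orElse("no time found"));
    }
}
```

Keeping remembered facts in a plain map is the simplest option for a prototype. A persistent store would be needed for the assistant to recall the user between sessions.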

Research Outcomes

The prototype demonstrated a complete NLP pipeline, from Automatic Speech Recognition (speech converted to text) through to Speech Synthesizer output.

Linguistic Depth: Each tag represents a whole .aiml file, so the algorithm selects an entire set of response patterns per tag and can interpret natural language beyond simple keyword matching.
Scalability: Future iterations of the project aim to incorporate Neural Networks and Machine Learning to expand the assistant's memory and learning capabilities at a wider scale.