Sidekick Technical Design
Version 1.0
1. Introduction
1.1 Purpose
The purpose of this technical design document is to provide a comprehensive and detailed description of the 'Sidekick' AI Assistant platform, a web-based solution that provides secure access to foundation AI models (leveraging OpenAI’s frontier models) and custom AI assistants. This document aims to furnish IT managed service providers, responsible for administering Microsoft Azure services on behalf of our customers, with crucial information regarding the design, deployment, security, integration, and maintenance aspects of the platform.
The goal is to ensure that the platform is understood to be secure, robust, and efficient, thereby facilitating its seamless deployment and management in diverse customer environments. By outlining the technical intricacies and operational guidelines, this document serves as a foundational reference to support the successful implementation and sustained operation of the 'Sidekick' AI Assistant platform.
1.2 Scope
This document covers the technical architecture, deployment strategies, security measures, integration points, and maintenance protocols for the 'Sidekick' AI Assistant platform.
This document focuses on the deployment of the platform within customer Microsoft Azure environments, ensuring that all configurations meet the necessary security and compliance standards. Furthermore, the scope encompasses the integration of the platform with existing customer systems, performance optimization strategies, and the support and maintenance services required to ensure ongoing reliability.
This technical design document does not cover business-related aspects such as pricing models, marketing strategies, or end-user training documentation, focusing solely on the technical implementation and operational aspects.
1.3 Audience
This technical design document is intended for a diverse audience comprising internal and external stakeholders involved in the deployment and management of the 'Sidekick' AI Assistant platform. The primary audience includes:
Internal Propella Team: Engineers, developers, and technical architects within the Propella team responsible for designing, developing, and deploying the platform. This section of the audience will benefit from understanding the detailed architecture, security protocols, and integration points to ensure the platform's reliability and robustness.
Customer Stakeholders: This group includes both technical and non-technical stakeholders within the customer organisations. Technical stakeholders, such as IT managers, system architects, and developers, will require an in-depth understanding of the platform's deployment and operational aspects. Non-technical stakeholders, such as business managers and process owners, will benefit from insights into how the platform's functionalities align with business objectives and improve operational efficiency.
Customer IT Providers: Managed service providers responsible for administering Microsoft Azure environments on behalf of the customers. These IT providers will play a crucial role in provisioning access to Azure resources, configuring environments, and ensuring that the deployment adheres to security standards and operational protocols.
By addressing the needs and expectations of these varied audience groups, this document aims to facilitate a seamless and efficient deployment process, ensuring that all parties are well-informed and can effectively collaborate to support the successful implementation of the 'Sidekick' AI Assistant platform.
1.4 Overview
The platform contains the following functional components:
- The ‘Home Base’ landing page - all Assistants are accessible from this page.
- The Chat History panel, which persists on the left-hand side of the solution – provides the ability to review historical chats, return to the landing page via the Home Base button, and access user-specific functions.
- Assistant pages – dedicated pages where users can initiate chats with defined Assistants. These Assistants can be general AI models such as OpenAI’s GPT-4o, or custom Assistants that integrate with specific data sources, such as our Tax Assistant (integrates with the ATO website) and Law Assistant (integrates with AustLII).
2.0 System Architecture
2.1 Architecture Overview
The Sidekick AI Assistant platform is a cloud-native solution hosted on Microsoft Azure. It offers specialised AI assistants (e.g., for law and tax) that interact with users. These assistants are centrally managed within a single Assistants microservice, while an Orchestrator service routes each request to the appropriate assistant or to an external service, such as Propella Dynamic Dialogue.
The platform primarily operates in production mode, with a development environment deployed only by specific agreement. The design allows the platform to scale and to integrate with external OpenAI models, PostgreSQL databases, and knowledge bases.
The following diagram illustrates the overall solution architecture.
2.2 Components Description
1. Users
Users interact with the Sidekick platform through a web or mobile front-end interface.
2. App Service Plan (Production)
Hosts key services ensuring scalability and reliability:
- Front-End Service - Manages user interactions and sends requests to the Orchestrator service.
- Orchestrator Service - Routes incoming requests to the relevant assistant in the Assistants microservice or external endpoints like Propella Dynamic Dialogue (separate AI service not included within the scope of this solution).
- Assistants Microservice - Central service for all AI assistants, each specialising in different areas (e.g., law, tax, finance). Uses OpenAI models and knowledge bases for enhanced responses.
3. External Endpoints (Future Integration)
Placeholder for future services, such as Propella Dynamic Dialogue, to support dynamic interactions beyond Sidekick’s core assistants.
4. Conditional Features
5. Azure Container Registry (ACR)
Stores and manages container images for service deployment.
6. App Registrations
Manages authentication using Azure Active Directory (AAD) for both user logins and service connections.
7. Storage Accounts
Stores user-uploaded files and knowledge base documents.
8. Cognitive Services (Azure OpenAI API service)
Provides LLM-based responses for the Assistants microservice.
9. Azure Database for PostgreSQL
Stores chat history and knowledge base data. If disabled, users must provide their own PostgreSQL instance.
10. Azure Data Factory (ADF)
Processes and transforms knowledge base documents for querying by assistants.
2.3 Interactions and Data Flow
1. User Interaction with Front-End
Users engage with the Front-End Service through a web or mobile interface. They can ask questions or provide inputs (e.g., uploading files).
2. Routing via Orchestrator Service
The Orchestrator Service receives the user’s request and determines the appropriate action:
• Route to Assistants Microservice: If the request requires a specific assistant (e.g., legal or tax query).
• Route to External Dynamic Dialogue Endpoint: If the request requires Propella’s dynamic dialogue system.
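The routing decision described above can be sketched as follows. This is a minimal illustration in Python, not the platform's actual implementation: the request shape, the assistant_id and dynamic_dialogue fields, the assistant names, and the Route and Target types are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Target(Enum):
    ASSISTANTS_MICROSERVICE = "assistants"
    DYNAMIC_DIALOGUE = "dynamic_dialogue"


@dataclass(frozen=True)
class Route:
    target: Target
    assistant: Optional[str] = None  # set only for assistant routes


# Hypothetical assistant identifiers; the real set is defined by the platform.
KNOWN_ASSISTANTS = {"law", "tax", "general"}


def route_request(request: dict) -> Route:
    """Pick a destination for an incoming request (illustrative only)."""
    # Requests flagged for Propella Dynamic Dialogue go to the external endpoint.
    if request.get("dynamic_dialogue"):
        return Route(Target.DYNAMIC_DIALOGUE)
    # Otherwise the request must name an assistant the microservice knows about.
    assistant = request.get("assistant_id")
    if assistant in KNOWN_ASSISTANTS:
        return Route(Target.ASSISTANTS_MICROSERVICE, assistant)
    raise ValueError(f"no route for assistant {assistant!r}")
```

Rejecting unknown assistants with an explicit error, rather than falling back to a default, keeps misrouted requests visible in the logs.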
3. Assistant Query Handling and Response Generation
The Assistants Microservice processes the request and invokes the necessary OpenAI models (either through Sidekick’s or the user’s instance).
If relevant, the assistant queries a knowledge base processed by ADF.
4. Knowledge Base Processing with ADF (Optional)
If ADF is enabled, knowledge base documents are processed and indexed, enabling assistants to query them efficiently.
5. Database Access for Chat History and Knowledge Base Data
Chat interactions are logged in the PostgreSQL database. If the database is disabled, the platform integrates with the user’s own PostgreSQL instance.
Knowledgebase data, if used, is stored here as well.
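The choice between the platform-provisioned database and a customer-supplied PostgreSQL instance could be resolved at start-up along these lines. This is a sketch under stated assumptions: the DatabaseSettings fields and the resolve_dsn function are hypothetical names, and the connection strings are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatabaseSettings:
    managed_postgres_enabled: bool      # hypothetical deployment flag
    managed_dsn: Optional[str] = None   # DSN of the platform-provisioned instance
    customer_dsn: Optional[str] = None  # DSN supplied by the customer


def resolve_dsn(settings: DatabaseSettings) -> str:
    """Return the DSN to use for chat history and knowledge base data."""
    if settings.managed_postgres_enabled:
        dsn, source = settings.managed_dsn, "managed"
    else:
        dsn, source = settings.customer_dsn, "customer"
    # Fail fast at start-up instead of on the first chat write.
    if not dsn:
        raise ValueError(f"{source} PostgreSQL DSN is not configured")
    return dsn
```

Failing at start-up when the selected instance has no connection string makes a misconfigured deployment obvious before any user traffic arrives.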
6. Storage of User Files and Logs
User-uploaded files and knowledge base documents are stored in Storage Accounts.
Interaction logs are saved in the PostgreSQL database.
2.4 Technology Stack
• Azure App Service Plan - Hosts the front-end, orchestrator, and assistant services.
• Azure Container Registry (ACR) - Stores containerised applications for deployment.
• Azure Storage Accounts - Manages files uploaded by users and knowledge base documents.
• Azure Cognitive Services (OpenAI) - Provides LLM-based AI capabilities for the assistants.
• Azure Database for PostgreSQL - Stores chat history and knowledge base data. If not used, the platform integrates with the user’s PostgreSQL instance.
• Azure Data Factory (ADF) - Processes and transforms knowledge base documents.
• Microsoft Azure Active Directory (AAD) - Secures the platform with authentication and access management.
3. Deployment Model
3.1 Deployment Topology
Describe the deployment topology in the Azure environment.
3.2 Environment Configuration
Explain the configuration required in different environments (development, staging, production).
3.3 High Availability and Scalability
Discuss strategies for ensuring high availability and scalability.
4. Security
4.1 Security Requirements
Outline the security requirements the platform must adhere to.
4.2 Identity and Access Management
Detail how identity and access will be managed within the platform.
4.3 Data Security
Describe encryption methods and data protection strategies.
4.4 Compliance and Regulatory Considerations
Overview of compliance with standards like GDPR, HIPAA, etc.
4.5 Threat Model and Mitigation
Identify potential threats and the mitigation strategies in place.
5. Integration and Interoperability
5.1 Integration Points
Identify key integration points with customer systems.
5.2 API and SDK
Detail the API and SDKs provided for integration.
5.3 Interoperability Standards
Discuss the standards followed to ensure interoperability.
6. Support and Maintenance
6.1 Support Plan
The Support Plan for the 'Sidekick' AI Assistant platform aims to provide comprehensive technical assistance, ensuring the platform operates efficiently and effectively within customer Azure environments. The plan includes the following elements:
1. Support Tiers and Responsibilities
Tier 1: Help Desk Support
· Initial point of contact for all support queries.
· Responsible for basic troubleshooting, user guidance, and issue logging.
· Escalates complex issues to Tier 2 support.
· Contactable at support@propella.ai
Tier 2: Technical Support
· Handles more advanced technical issues that cannot be resolved by Tier 1.
· Performs deeper diagnostic tasks and attempts to resolve technical problems.
· Escalates highly complex issues to Tier 3 support if necessary.
Tier 3: Expert Support
· Composed of platform specialists, developers, and engineers from the Propella team.
· Engages in root cause analysis, major incident management, and complex problem resolution.
· Works on patches, updates, and platform improvements based on feedback and issues reported.
2. Service Level Agreements (SLAs)
Response Times
· Tier 1 Response: Within 1 hour during business hours (based on GMT +10).
· Tier 2 Response: Within 4 hours during business hours (based on GMT +10).
· Tier 3 Response: Within 1 business day (based on GMT +10).
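As an illustration of how the hourly response targets behave across the end of a business day, the sketch below computes a response deadline in GMT +10. The business hours (09:00 to 17:00, Monday to Friday) are an assumption made for this example and are not defined in this document; the function names are likewise hypothetical.

```python
from datetime import datetime, time, timedelta, timezone

SUPPORT_TZ = timezone(timedelta(hours=10))  # GMT +10, fixed offset
OPEN, CLOSE = time(9, 0), time(17, 0)       # assumed business hours


def next_business_moment(t: datetime) -> datetime:
    """Move t forward to the nearest moment inside business hours."""
    while True:
        if t.weekday() >= 5 or t.time() >= CLOSE:
            # Weekend or after closing: jump to opening time on the next day.
            t = datetime.combine(t.date() + timedelta(days=1), OPEN, tzinfo=SUPPORT_TZ)
        elif t.time() < OPEN:
            t = datetime.combine(t.date(), OPEN, tzinfo=SUPPORT_TZ)
        else:
            return t


def response_due(reported: datetime, target_hours: float) -> datetime:
    """Deadline after counting target_hours of business time from reported."""
    t = next_business_moment(reported.astimezone(SUPPORT_TZ))
    remaining = timedelta(hours=target_hours)
    while True:
        close = datetime.combine(t.date(), CLOSE, tzinfo=SUPPORT_TZ)
        if t + remaining <= close:
            return t + remaining
        # Use up the rest of today and carry the remainder to the next business day.
        remaining -= close - t
        t = next_business_moment(close)
```

Under these assumptions, a Tier 2 issue reported at 16:30 on a Friday would be due for a response at 12:30 on the following Monday.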
Resolution Times
Resolution times will vary based on the complexity of the issue and will be defined in the SLA with the customer. General targets could include:
· Tier 1: 4 hours
· Tier 2: 24 hours
· Tier 3: 48 hours or longer, depending on the issue complexity.
Uptime Guarantee
Commitment to a 99.9% uptime SLA, excluding scheduled maintenance.
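To make the uptime figure concrete, the short calculation below converts an uptime percentage into the unplanned downtime it permits. The function name and the 30-day and 365-day periods are choices made for this example; the actual measurement period would be set in the customer SLA.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float) -> float:
    """Minutes of downtime permitted by an uptime target over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


if __name__ == "__main__":
    monthly = allowed_downtime_minutes(99.9, 30)
    yearly = allowed_downtime_minutes(99.9, 365)
    print(f"30-day budget: {monthly:.1f} minutes")      # 43.2 minutes
    print(f"365-day budget: {yearly / 60:.2f} hours")   # 8.76 hours
```

Because scheduled maintenance is excluded from the commitment, this budget applies to unplanned outages only.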
3. Incident Management
Incident Reporting - Customers can report incidents through the following channels:
· Email: support@propella.ai
· Phone (urgent / escalations): +61 (0)421 058 027 (Gus McLennan)
Incident Tracking - All incidents are logged in a tracking system accessible by both support teams and customers for status updates.
Escalation Procedures - Defined procedures for escalating incidents through support tiers to ensure timely resolution.
4. Support Channels
Online Support Portal
Access to knowledge base articles, FAQs, and documentation.
Email Support
Support requests can be submitted via a dedicated support email address.
Phone Support
Direct access to support representatives for urgent issues.
5. Monitoring and Reporting
System Monitoring
Continuous monitoring of platform performance and availability.
Automated alerts for critical issues to ensure prompt response.
Reporting
Regular reports on support metrics, including response times, resolution times, and system uptime.
Incident trend analysis to identify and address recurring issues.
6. Maintenance and Updates
Scheduled Maintenance
Regular maintenance windows will be communicated in advance to customers.
Efforts will be made to schedule maintenance during low-impact periods.
Patch Management
Timely deployment of patches and updates to address vulnerabilities and enhance functionality.
Platform Upgrades
Major upgrades will be planned and communicated well in advance, including any potential impacts on service.
7. Customer Training and Resources
Training Programs
Provide initial and ongoing training for customer IT staff and end-users.
Documentation
Comprehensive user manuals, technical documentation, and best practice guides.
6.2 Monitoring and Logging
Monitoring Mechanisms:
1. Application Performance Monitoring (APM)
Azure Application Insights
· Real-time monitoring of application performance.
· Tracks response times, failure rates, and application dependencies.
· Detects and diagnoses performance anomalies.
2. Infrastructure Monitoring
Azure Monitor
· Comprehensive monitoring of Azure resources including VMs, Kubernetes clusters, databases, and networking components.
· Collects and analyses metrics and logs from the entire infrastructure.
· Enables creation of custom dashboards to visualise resource health and performance.
Azure Log Analytics
· Aggregates and correlates logs from various Azure resources.
· Enables powerful querying and visualisation of log data.
3. Network Monitoring
Azure Network Watcher
· Provides network diagnostic and visualisation tools.
· Monitors network performance and identifies network-related issues.
Traffic Analytics
· Analyses network traffic and provides insights into the security and efficiency of network usage.
4. Security Monitoring
Azure Security Center
· Provides a unified view of the security posture across Azure subscriptions.
· Monitors for threats and vulnerabilities in real-time.
· Recommends security best practices and compliance policies.
5. Service Health Monitoring
Azure Service Health
· Notifies about Azure service incidents, planned maintenance, and health advisories.
· Customisable alert rules to receive notifications relevant to specific Azure services.
Logging Mechanisms
1. Application Logging
Structured Logs
· Logs generated by the application components, such as user interactions, API calls, error messages, and transaction details.
· Use of structured logging formats (e.g., JSON) for consistency and easier processing.
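One way to emit JSON-structured logs from a Python service is a custom formatter on the standard logging module, as sketched below. This is an illustration only: the field names, the logger name, and the convention of passing a context dictionary through extra are assumptions, not the platform's actual log schema.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"context": {...}}.
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        # default=str keeps logging from failing on non-serialisable values.
        return json.dumps(payload, default=str)


def build_logger(name: str) -> logging.Logger:
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


if __name__ == "__main__":
    log = build_logger("sidekick.assistants")
    log.info("chat started", extra={"context": {"assistant": "tax"}})
```

Emitting one JSON object per line keeps the output straightforward to parse and query once it is collected by a central log store.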
Azure Diagnostic Logs
· Detailed logs from Azure resources and applications.
· Collect logs from Azure resources like Azure App Service, Azure Functions, and Azure SQL Database.
2. Infrastructure and System Logs
Azure VM Logs
· System and application logs from virtual machines (Windows event logs, Linux syslog).
Container Logs
· Logs from Azure Kubernetes Service (AKS) clusters, including pod and container logs.
Azure Diagnostic Extension
· Collects diagnostic data (logs, metrics) from Azure VMs and sends it to Azure Monitor.
3. Audit Logs
Azure Active Directory (Azure AD) Logs
· Logs of user sign-ins, application registrations, and role assignments.
Azure Activity Log
· Logs of all administrative operations on Azure resources (create, update, delete).
4. Security Logs
Azure Sentinel
· Collects and analyses security data from across the enterprise.
· Uses AI to detect and respond to incidents automatically.
Azure Monitor Security Logs
· Aggregates security-related logs for monitoring and alerting.
5. Custom Logs
· Logging custom events and metrics specific to the 'Sidekick' AI Assistant platform.
· Use Azure Monitor Custom Logs to collect and query custom log data.
Key Practices
1. Centralised Log Management
· Use of centralised log management platforms (Azure Monitor, Log Analytics) for unified logging across all components.
· Efficient querying and analysis of logs from multiple sources.
2. Alerting and Notification
· Define alert rules in Azure Monitor to notify relevant personnel upon detecting anomalies or threshold breaches.
· Use Azure Action Groups to configure notification methods (email, SMS, webhook) for alerts.
3. Retention Policies
· Define log retention policies based on compliance requirements and operational needs.
· Regularly archive or delete outdated logs to manage storage costs and ensure compliance.
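A retention policy of this kind reduces to an age check per log entry. The sketch below shows the idea; the 90-day archive and 365-day delete thresholds are placeholder values chosen for illustration, and the real values would come from the applicable compliance requirements.

```python
from datetime import datetime, timedelta, timezone


def retention_action(
    log_time: datetime,
    now: datetime,
    archive_after_days: int = 90,   # placeholder threshold
    delete_after_days: int = 365,   # placeholder threshold
) -> str:
    """Return 'keep', 'archive', or 'delete' for a log entry based on its age."""
    age = now - log_time
    if age >= timedelta(days=delete_after_days):
        return "delete"
    if age >= timedelta(days=archive_after_days):
        return "archive"
    return "keep"


if __name__ == "__main__":
    now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    for days in (10, 100, 400):
        print(days, retention_action(now - timedelta(days=days), now))
```

Checking the delete threshold first ensures that an entry old enough for deletion is never merely archived.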
4. Continuous Improvement
· Regular review and refinement of monitoring and logging configurations to adapt to evolving requirements and optimise performance.
6.3 Disaster Recovery Plan
Objective: The objective of this Disaster Recovery Plan is to ensure the 'Sidekick' AI Assistant platform can quickly and effectively recover from any disaster scenario, minimising downtime and data loss while ensuring continuity of service.
Scope: This plan covers all components of the 'Sidekick' AI Assistant platform, including infrastructure, applications, data, and integrations, deployed within Microsoft Azure environments. The scope includes disaster recovery strategies, roles and responsibilities, and step-by-step procedures to recover from various disaster scenarios.
6.4 Software Updates and Patch Management
- Discuss the process for software updates and patch management.
7. Performance and Optimisation
7.1 Performance Requirements
- Outline the key performance metrics.
7.2 Load Testing
- Describe the load testing strategy and results.
7.3 Performance Tuning
- Provide strategies for performance tuning.
8. Appendices
8.1 Glossary
8.2 References
1. Azure Documentation
· Microsoft. (2024). "Azure Architecture Center." Microsoft Learn. Retrieved from https://learn.microsoft.com/en-us/azure/architecture/
2. API Documentation
· OpenAI. (2024). "ChatGPT API Documentation." OpenAI. Retrieved from https://beta.openai.com/docs/api-reference/introduction
3. Cloud Security
· Turner, J., & Reilly, P. (2021). "Securing the Cloud: A Comprehensive Guide to Cloud Security." Wiley.
4. Disaster Recovery Planning
· Smith, A., & Johnson, R. (2022). "Effective Disaster Recovery Planning: Strategies and Best Practices." Pearson.
5. AI Integration Best Practices
· Peterson, D. (2023). "Integrating AI into Enterprise Systems: A Practical Guide." O'Reilly Media.
6. Microsoft Azure – Design Considerations
· Microsoft. (2024). "Designing for Efficiency and Scalability on Azure." Microsoft Learn. Retrieved from https://learn.microsoft.com/en-us/azure/best-practices-availability-paired-regions
8.3 Contact Information
Solution Architect
Alistair Toms
M: 0421 190 338
Data Engineer
Yi Xiang (Chee)
M: 0478 913 462
Software Engineer
Vincent Tenali
M: 0406 221 916