Infrastructure

SapienAI consists of several components. This document outlines the infrastructure required to run SapienAI.

1. Architecture Overview

SapienAI is built from five core components.

  • API: The acid_backend service in the Docker Compose file. This houses the core application logic.
  • Frontend Service: The acid_frontend service in the Docker Compose file. This serves the user interface.
  • Python Service: The acid_python_svc service in the Docker Compose file. This handles tasks and integrations that are better suited to Python or that are not readily available to the API service (which is built with Node.js). For example, this service provides a TikToken interface for accurate token counting.
  • File Service: The acid_file_svc service in the Docker Compose file. This service handles all file-related operations. Additional replicas can be added to handle larger volumes of file uploads (see the sketch after this list).
  • Document Preview: The acid_doc_preview_svc service in the Docker Compose file. This service provides a sandboxed environment for compiling LaTeX documents and generating previews, and is isolated from the main application logic for security and stability. It is a large image, so if storage space is a concern and you are not writing LaTeX documents, you can safely remove this service from the docker-compose.yml file.
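
As a rough illustration of the replica scaling mentioned above, a docker-compose.override.yml along the following lines can be used. The acid_file_svc name comes from the stock Compose file, but everything else here is an example rather than the project's actual configuration.

```yaml
# docker-compose.override.yml — illustrative sketch only.
# Docker Compose merges this file with docker-compose.yml automatically.
services:
  acid_file_svc:
    deploy:
      replicas: 3   # run three copies of the file service to absorb upload spikes
```

The same effect can be achieved ad hoc with `docker compose up -d --scale acid_file_svc=3`. Note that a service scaled this way cannot also publish a single fixed host port, so check the stock service definition before scaling.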

2. Other Required Services

In addition to the core components, SapienAI requires several other services to function. These services are defined in the Docker Compose file, but can be hosted separately if needed. If you choose to host them separately, ensure that the environment variables in the .env file are updated accordingly.

  • Redis: The redis service in the Docker Compose file. Used for caching and session management. Both Redis Cluster and standalone Redis setups are supported. Configuration instructions can be found here.
  • Weaviate: The weaviate service in the Docker Compose file. An open-source vector database, currently used as the core database for the application. Configuration instructions can be found here.
  • File Storage: The application requires a file storage service to handle uploaded files. This can be a local service (such as the minio example in the Docker Compose file, sketched below) or an external service such as AWS S3, Azure Blob Storage, or Google Cloud Storage.
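
For the local option, a minimal MinIO service looks roughly like the following; the credentials, ports, and volume name are placeholders, and the stock docker-compose.yml may differ. An external S3, Azure Blob, or GCS bucket replaces this service entirely, with the relevant .env variables pointed at the external endpoint instead.

```yaml
# Illustrative sketch only — a standalone MinIO instance for local development.
services:
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: example-access-key       # placeholder credentials
      MINIO_ROOT_PASSWORD: example-secret-key
    ports:
      - "9000:9000"   # S3-compatible API
      - "9001:9001"   # web console
    volumes:
      - minio_data:/data

volumes:
  minio_data:
```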

3. Component Communication

The API service exposes a number of endpoints that the other core components use to interact with the application and the configured AI models. To secure this communication and reduce the risk of exposing the endpoints that call AI models, mTLS is required for any component that communicates with the API service's AI endpoints.

Each component must therefore be configured with the appropriate certificates and keys to establish mTLS connections. To simplify this, the docker-compose.yml file includes a certs volume that is mounted into the services that need it, along with a certgen service that generates the required certificates.

The certgen service is a simple container that runs the script in certgen/gen.sh to generate the necessary certificates and keys for mTLS. This will run automatically when you start the Docker Compose stack. The generated certificates will be stored in the certs volume, which is shared among the services that require mTLS.
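
The pattern looks roughly like the sketch below. It assumes a certgen service built from the certgen/ directory and a depends_on condition to order startup; the stock docker-compose.yml may wire this up differently, so treat the snippet as illustrative rather than the project's actual configuration.

```yaml
# Illustrative sketch of the shared-certificate pattern described above.
services:
  certgen:
    build: ./certgen            # runs certgen/gen.sh once and exits
    volumes:
      - certs:/certs            # writes the CA, server, and client material here

  acid_backend:
    # ... image, ports, environment, etc. ...
    depends_on:
      certgen:
        condition: service_completed_successfully   # wait until the certs exist
    volumes:
      - certs:/certs:ro         # read-only access to the shared certificates

volumes:
  certs:
```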
