Proof of concept for an LLM-powered chat application built with Zero, as requested by Aaron Boodman in this Twitter thread.
The goal of this project is to show how Zero can be used to stream LLM responses to the client. The other requirements mentioned are:
- svelte
- spa-only, no ssr (it would be cool to ssr the read-only snapshot, but a different project)
- chats are saved in the sidebar and can be switched between instantly
- LIKE-based text filter over previous chats
- deploy w/ sst
- I don't care if you can choose models, just pick one. The point is not to demonstrate AI.
- Integration with the AI should be at the server level. Responses should be accumulated into chunks (roughly paragraphs) and saved into the db at that granularity on the server. Use Zero's sync for streaming to the UI.
- Support ✨sharing✨ a chat in either r/o or r/w mode
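The LIKE-based filter could be sketched like this; `toLikePattern` and the `chat`/`title` names are illustrative assumptions, not code from this repo:

```typescript
// Hypothetical helper for the LIKE-based chat filter. User input is escaped
// so that `%`, `_`, and `\` match literally, then wrapped in wildcards for
// a contains-style search.
function toLikePattern(input: string): string {
  const escaped = input.replace(/[\\%_]/g, (c) => `\\${c}`);
  return `%${escaped}%`;
}

// Illustrative usage with a Zero (ZQL) query, assuming a `chat` table
// with a `title` column:
// z.query.chat.where("title", "LIKE", toLikePattern(searchTerm));
```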
The tech stack:
- Zero for the sync engine
- Svelte (SvelteKit with static adapter) for the frontend
- Hono for the GitHub OAuth flow
- SST for deployment to AWS
- OpenAI SDK for generating the LLM responses
- and, of course, some other things I'd call standard
This monorepo is based on the SST monorepo template:
- `infra`: SST & Pulumi resources (infrastructure as code)
- `packages/web`: Svelte frontend
- `packages/functions`: Lambda functions
- `packages/database`: Database schema and migrations
The deployment consists of:
- S3 for static assets
- CloudFront for CDN
- RDS Postgres for the database
- Lambda for the Hono server and for generating the LLM responses
- ECS for the Zero sync engine
- Amazon VPC for communication between the services
- Cloudflare for DNS and Proxy
The `infra` directory contains the SST & Pulumi resources which describe the infrastructure.
How it works:
- The Svelte frontend is a single-page application that uses the Zero sync client to insert new data.
- The Zero sync client is connected to the Zero sync cache running in AWS ECS.
- The Zero sync cache replicates an RDS Postgres database.
- On user message insert, the RDS database triggers a Lambda function to generate a response from the LLM.
- The LLM response is accumulated into chunks and saved into the RDS database.
- The Zero cache syncs the changes to all the clients.
- The frontend displays the message and the LLM response.
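The accumulate-and-save step could be sketched as follows; `ParagraphAccumulator` and the placeholder names in the comments are illustrative assumptions, not code from this repo. The idea is to buffer streamed tokens and emit a chunk whenever a paragraph break (blank line) completes:

```typescript
// Sketch: buffer streamed LLM tokens and emit completed paragraphs, so the
// server can save roughly paragraph-sized chunks to Postgres and let Zero
// sync them to clients.
class ParagraphAccumulator {
  private buffer = "";

  // Feed the next streamed token; returns any paragraphs it completed.
  push(token: string): string[] {
    this.buffer += token;
    const parts = this.buffer.split(/\n{2,}/);
    this.buffer = parts.pop() ?? ""; // keep the unfinished paragraph
    return parts.map((p) => p.trim()).filter((p) => p.length > 0);
  }

  // Flush whatever remains when the stream ends.
  flush(): string | null {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? rest : null;
  }
}

// Illustrative server loop (llmStream and db.insertChunk are placeholders):
// const acc = new ParagraphAccumulator();
// for await (const token of llmStream) {
//   for (const para of acc.push(token)) await db.insertChunk(chatId, para);
// }
// const rest = acc.flush();
// if (rest) await db.insertChunk(chatId, rest);
```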
First things first, install the dependencies:
```sh
pnpm install
```
Follow the instructions in the SST docs to set up your AWS account and permissions, then log in to AWS with the CLI:

```sh
pnpm sso --sso-session={your-sso-session}
```
> [!NOTE]
> I use Cloudflare for DNS and proxying. If you do too, you need to set the Cloudflare environment variables:
```sh
CLOUDFLARE_ZONE_ID={your-cloudflare-zone-id}
CLOUDFLARE_API_TOKEN={your-cloudflare-api-token}
CLOUDFLARE_ACCOUNT_ID={your-cloudflare-account-id}
```
In `infra/constants.ts` you can set the domains you want to use. If you don't want custom domains, they can be removed from the resources (`infra/web` and `infra/zero`).
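A possible shape for `infra/constants.ts`, with placeholder domains (the actual file in the repo may differ):

```typescript
// Hypothetical sketch of infra/constants.ts; domain values are placeholders.
export const DOMAIN = "chat.example.com";      // used by infra/web
export const ZERO_DOMAIN = "zero.example.com"; // used by infra/zero
```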
> [!IMPORTANT]
> You can't proxy WebSockets through Cloudflare, so the proxy is disabled for the Zero instance.
Now you can deploy the infrastructure with a single `sst deploy`. (Note: you need to be root to deploy to production because of the tunnel that needs to be registered.)

```sh
sudo sst deploy --stage production
```