diff --git a/DESIGN.md b/DESIGN.md
new file mode 100644
index 0000000..3c23f21
--- /dev/null
+++ b/DESIGN.md
@@ -0,0 +1,98 @@
+# System Design
+
+This document contains a system design for a product that helps software engineers by accepting prompts
+from a variety of sources (GitHub, Jira, IDEs, etc.) and autonomously retrieving context, creating a plan to solve the
+prompt, executing the plan, verifying the results, and then pushing the results back to the most appropriate tool.
+
+At a very high level the system looks like this: \
+
+
+Components:
+
+* API - A REST-based API capable of accepting incoming webhooks from a variety of tools. The webhooks generate
+ tasks that require context gathering, planning, execution, and testing to resolve. These tasks are passed to the
+ Indexer to gather context.
+* Indexer - An event-based processor that accepts tasks from the API and gathers context (e.g. Git files / commits, Jira
+ ticket status, etc.), which is indexed and stored for efficient retrieval by later stages.
+* Agent Runner - Takes the task and associated context generated by the Indexer and generates an execution plan. Works
+ synchronously with the Task Executor to execute the plan. As steps are executed the plan should be adjusted to ensure
+ the task is accomplished.
+* Task Executor - A protected process running with [gVisor](https://gvisor.dev/) (an application kernel that
+ sandboxes containers) that has a set of tools available to it that the Agent Runner can execute to perform its task.
+ The executor will have a Network Policy applied such that network access is restricted to the bare minimum required
+ to use the tools. \
+ Example tools include:
+ * Code Context - Select appropriate context for code generators.
+ * Code Generators - Generate code for given tasks.
+ * Compilers - Ensure code compiles.
+ * Linters - Ensure code is well formatted and doesn't violate any defined coding standards.
+ * Run Tests - Ensure existing tests (unit, integration, system) continue to pass after making changes and that new
+ tests pass.
+* Result Processor - Responsible for receiving the result of a task from the Agent Runner and disseminating it to
+ interested parties through the API and directly to integrated services.
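+
+The interaction between the Agent Runner and the Task Executor can be sketched as a plan / execute / adjust loop. This
+is a minimal illustration under stated assumptions, not the actual implementation: `Step`, `run_task`, the `replan`
+callback, and the tool names are all hypothetical, and the tools are plain Python callables standing in for calls into
+the sandboxed executor.
+
+```python
+from dataclasses import dataclass
+from typing import Callable
+
+
+@dataclass
+class Step:
+    tool: str       # hypothetical tool name, e.g. "compile" or "run_tests"
+    argument: str   # input handed to the tool
+
+
+# A tool takes an argument and reports (success, output).
+Tool = Callable[[str], tuple[bool, str]]
+
+
+def run_task(plan, tools, replan, max_iterations=10):
+    """Execute the plan step by step, asking `replan` for a new plan on failure.
+
+    Returns (succeeded, iterations_used).
+    """
+    pending = list(plan)
+    iterations = 0
+    while pending:
+        if iterations >= max_iterations:
+            return False, iterations
+        step = pending.pop(0)
+        iterations += 1
+        ok, output = tools[step.tool](step.argument)
+        if not ok:
+            # Adjust the plan using the failed step and its output.
+            pending = replan(step, output, pending)
+            if not pending:
+                return False, iterations  # the planner gave up
+    return True, iterations
+```
+
+The iteration count returned here is the kind of value that could feed the "number of agentic iterations" metric
+described in the testing section.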
+
+## System Dependencies
+
+* The solution sits on top of `Amazon Web Services` as an industry standard compute provider. We intentionally will
+ not use AWS products that do not have good analogs with other compute providers (e.g. Kinesis, Nova, Bedrock, etc.) to
+ avoid vendor lock-in.
+* The solution is built and deployed on top of `Elastic Kubernetes Service` to provide a flexible orchestration layer
+ that will allow us to deploy, scale, monitor, and repair the application with relatively low effort. Updates with EKS
+ can be orchestrated such that they are delivered without downtime to consumers.
+* For asynchronous event flows we'll use `Apache Kafka`. This will allow us to handle a very large event volume with
+ low performance overhead. Events can be processed as cluster capacity allows, and events will not be dropped if the
+ application becomes unavailable.
+* For observability, we'll use `Prometheus` and `Grafana` in the application to provide metrics. For logs we'll use
+ `Grafana Loki`. This will allow us to see how the application is performing as well as identify any issues as they
+ arise.
+* To provide large language and embedding models we can host `ollama` on top of GPU-equipped EKS nodes. Models can be
+ distributed via persistent volumes and pre-loaded into vRAM with an init container. Autoscalers can be
+ used to scale specific model versions up and down based on demand. This doesn't preclude using LLM-as-a-service
+ providers.
+* Persistent data storage will be done via `PostgreSQL` hosted on top of `Amazon Relational Database Service`. The
+ `pgvector` extension will be used to provide efficient vector searches over embeddings.
+
+## Scaling Considerations
+The system can dynamically scale based on load. A horizontal pod autoscaler can be used on each component of the system
+to allow the system to scale up or down based on the current load. For "compute" instances CPU utilization can be used
+to determine when to scale. For "gpu" instances an external metric measuring GPU utilization can be used to determine
+when scaling operations are appropriate. For efficient usage of GPU vRAM and load spreading, models can be packaged
+together into groups that saturate most of the available vRAM, and each group can be scaled independently.
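+
+The scaling decision above follows the replica calculation documented for the Kubernetes horizontal pod autoscaler:
+`desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, skipped when the ratio is within
+a tolerance of 1.0 (0.1 by default). The sketch below reproduces that arithmetic for illustration only; the function
+name, the replica bounds, and the example numbers are assumptions rather than values taken from this design.
+
+```python
+import math
+
+
+def desired_replicas(current_replicas, current_metric, target_metric,
+                     min_replicas=1, max_replicas=10, tolerance=0.1):
+    """Return the replica count an HPA-style controller would aim for."""
+    ratio = current_metric / target_metric
+    # Within tolerance of the target: leave the replica count alone.
+    if abs(ratio - 1.0) <= tolerance:
+        return current_replicas
+    desired = math.ceil(current_replicas * ratio)
+    # Clamp to the configured bounds.
+    return max(min_replicas, min(max_replicas, desired))
+
+
+# Hypothetical example: 4 pods at 90% CPU against a 60% target.
+print(desired_replicas(4, 90, 60))
+```
+
+The same calculation applies to "gpu" instances when the metric is an externally reported GPU utilization figure.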
+
+In order to accommodate load bursts the system will operate largely asynchronously. Boundaries between system components
+will be buffered with Kafka to allow the components to only consume what they're able to handle without data getting
+lost or the need for complex retry mechanisms. If the number of models gets large a proxy could be developed that can
+dynamically route requests to the appropriate backend with the appropriate model pre-loaded.
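+
+A minimal sketch of the routing idea, assuming each backend advertises which models it has pre-loaded. `ModelRouter`,
+the backend URLs, and the model names are hypothetical placeholders, and a real proxy would also need health checks
+and load awareness.
+
+```python
+import itertools
+
+
+class ModelRouter:
+    """Round-robin requests across the backends that have a model pre-loaded."""
+
+    def __init__(self, backends):
+        # backends: mapping of backend URL -> set of model names loaded there
+        by_model = {}
+        for url, models in backends.items():
+            for model in models:
+                by_model.setdefault(model, []).append(url)
+        # One endless iterator per model, over a stable ordering of its backends.
+        self._cycles = {m: itertools.cycle(sorted(urls)) for m, urls in by_model.items()}
+
+    def route(self, model):
+        if model not in self._cycles:
+            raise LookupError(f"no backend has model {model!r} pre-loaded")
+        return next(self._cycles[model])
+
+
+router = ModelRouter({
+    "http://gpu-a:11434": {"code-model", "embed-model"},
+    "http://gpu-b:11434": {"code-model"},
+})
+print(router.route("code-model"))
+```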
+
+## Testing / Model Migration Strategy
+
+An essential property of any AI-based system is the ability to measure the performance of the system over time. This is
+important to ensure that models can be safely migrated as the market evolves and better models are released.
+
+A simple approach to measure performance over time is to create a representative set of example tasks that should be
+run when changes are introduced. Performance should be measured against the baseline on a number of different metrics
+such as:
+
+* Number of agentic iterations required to solve the task (fewer is better).
+* Amount of code / responses generated (less is better).
+* Success rate (more is better).
+
+Over time, as problematic areas are identified, new tasks should be added to the set to improve its coverage.
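+
+One way to compare a candidate configuration against the baseline on these metrics is sketched below. It is an
+illustration only: `TaskResult`, `summarize`, and `regressions` are hypothetical names, generated output is measured
+in characters as an assumed proxy for "amount of code", and the sample numbers are made up.
+
+```python
+from dataclasses import dataclass
+from statistics import mean
+
+
+@dataclass
+class TaskResult:
+    task_id: str
+    succeeded: bool
+    iterations: int        # agentic iterations used
+    generated_chars: int   # size of the generated code / responses
+
+
+def summarize(results):
+    """Aggregate per-task results into the three metrics listed above."""
+    return {
+        "success_rate": mean(1.0 if r.succeeded else 0.0 for r in results),
+        "mean_iterations": mean(r.iterations for r in results),
+        "mean_generated_chars": mean(r.generated_chars for r in results),
+    }
+
+
+def regressions(baseline, candidate):
+    """Return the names of metrics on which the candidate is worse."""
+    base, cand = summarize(baseline), summarize(candidate)
+    worse = []
+    if cand["success_rate"] < base["success_rate"]:   # more is better
+        worse.append("success_rate")
+    for metric in ("mean_iterations", "mean_generated_chars"):  # less is better
+        if cand[metric] > base[metric]:
+            worse.append(metric)
+    return worse
+
+
+baseline = [TaskResult("t1", True, 3, 400), TaskResult("t2", False, 8, 900)]
+candidate = [TaskResult("t1", True, 2, 500), TaskResult("t2", True, 4, 1000)]
+print(regressions(baseline, candidate))
+```
+
+A model migration could then be gated on this list being empty, or on the regressions staying within an agreed margin.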
+
+## Migrating / Managing Embeddings
+
+One particularly sensitive area for migrating models is embedding models. Better models are routinely published,
+but it is expensive to re-index data, especially if the volumes are large.
+
+The vector database should store the model that produced each embedding. When new embedding models are introduced the
+indexer should use the new embedding model to index new content, but should allow old content to be searched using old
+models. If a model is permanently retired, its content should be re-indexed with a supported embedding model. The
+vector database should allow the same content to be indexed with multiple models at the same time.
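+
+The bookkeeping described above can be sketched with an in-memory stand-in for the vector database. This is not the
+`pgvector` schema; `EmbeddingStore`, its methods, and the model names are hypothetical, and the vectors are toy values.
+The key point it illustrates is that a query is only compared against vectors produced by the same model.
+
+```python
+import math
+from collections import defaultdict
+
+
+def cosine(a, b):
+    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
+    dot = sum(x * y for x, y in zip(a, b))
+    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
+    return dot / norm if norm else 0.0
+
+
+class EmbeddingStore:
+    def __init__(self):
+        # model name -> {content id -> vector}; content may appear under several models
+        self._by_model = defaultdict(dict)
+
+    def add(self, content_id, model, vector):
+        self._by_model[model][content_id] = vector
+
+    def search(self, model, query_vector, top_k=3):
+        """Rank only the content embedded with `model` against the query."""
+        vectors = self._by_model.get(model, {})
+        scored = [(cosine(query_vector, v), cid) for cid, v in vectors.items()]
+        return [cid for _, cid in sorted(scored, reverse=True)[:top_k]]
+
+    def needs_reindex(self, retired_model, supported_model):
+        """Content ids that exist only under the retired model."""
+        old = set(self._by_model.get(retired_model, {}))
+        new = set(self._by_model.get(supported_model, {}))
+        return sorted(old - new)
+
+
+store = EmbeddingStore()
+store.add("a.py", "old-model", [1.0, 0.0])
+store.add("b.py", "old-model", [0.0, 1.0])
+store.add("b.py", "new-model", [0.0, 1.0, 0.0])
+print(store.needs_reindex("old-model", "new-model"))
+```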
+
+## Data Flow
+
+
+
+## Sequence Diagram
+
+
\ No newline at end of file
diff --git a/README.md b/README.md
index 716a3d5..6a51bc4 100644
--- a/README.md
+++ b/README.md
@@ -20,6 +20,8 @@ work.
The reason this approach was taken was that it could be implemented quickly and produces reasonable looking results
while being a solid platform for further iteration.
+A design for a more full-featured and robust implementation can be found in [DESIGN.md](DESIGN.md).
+
## Code Structure
| Package | Description |
diff --git a/diagrams/arch-diagram.drawio b/diagrams/arch-diagram.drawio
new file mode 100644
index 0000000..b4c1f61
--- /dev/null
+++ b/diagrams/arch-diagram.drawio
@@ -0,0 +1,174 @@
diff --git a/diagrams/arch-diagram.png b/diagrams/arch-diagram.png
new file mode 100644
index 0000000..d4f6521
Binary files /dev/null and b/diagrams/arch-diagram.png differ
diff --git a/diagrams/dataflow-diagram.drawio b/diagrams/dataflow-diagram.drawio
new file mode 100644
index 0000000..a881d4e
--- /dev/null
+++ b/diagrams/dataflow-diagram.drawio
@@ -0,0 +1,193 @@
diff --git a/diagrams/dataflow-diagram.png b/diagrams/dataflow-diagram.png
new file mode 100644
index 0000000..8c6e5b7
Binary files /dev/null and b/diagrams/dataflow-diagram.png differ
diff --git a/diagrams/sequence-diagram.drawio b/diagrams/sequence-diagram.drawio
new file mode 100644
index 0000000..c56eb58
--- /dev/null
+++ b/diagrams/sequence-diagram.drawio
@@ -0,0 +1,217 @@
diff --git a/diagrams/sequence-diagram.png b/diagrams/sequence-diagram.png
new file mode 100644
index 0000000..4f734c4
Binary files /dev/null and b/diagrams/sequence-diagram.png differ