For context, this is an assignment I was given. Below is a detailed page on how I implemented the entire logic, my thought process, and everything that constitutes the foundation of this application.
These are the core steps I typically follow for most of my developments:
Step 1: Ideation & Technical Document
I take the business user inputs, structure them in a more technical way, and conduct a brainstorming session with an AI assistant (Gemini, Claude, or OpenAI). Once I am satisfied that all the missing pieces are covered and I have a clear picture of the APIs, database schema, and folder structure, I convert all of this into a comprehensive technical document.
Step 2: Boilerplate Code
Once the technical document is clear and unambiguous, I start by creating very basic boilerplate code that strictly adheres to the folder structure outlined in the technical document. This establishes the foundation.
Step 3: Pragmatic TDD
Per the requirements, I tried to stay as strictly as possible to Test-Driven Development (Red → Green → Refactor). However, demonstrating TDD through a granular commit history can be tedious (committing each individual failing test, then the working code, then the next test, and so on).
To be pragmatic, I allowed myself some leniency and committed grouped functional tests at once, alongside the working code that satisfied them.
Step 4: Layered Step-by-Step Build
I also like to write down a step-by-step plan of what I want to build. As the commit history shows, I first write very basic, minimally functional code without validation or overarching business rules, and then iteratively add each layer until the application feels complete and robust.
Step 5: Production-Ready Additions
I added a substantial amount of "extra" code like validations, user authentication, advanced filters (e.g., ilike matching), offsets for pagination, rate limiting, and global logging.
The primary reason for this is that a core requirement stated this code needs to be production-ready, which implies these foundational security and usability features are must-have basic needs the code should support out of the box.
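To illustrate two of those additions, here is a minimal sketch of ilike-style filtering combined with limit/offset pagination. This is plain Python operating on dicts for illustration only; the real implementation would express the same logic as SQLAlchemy query clauses, and the function and field names here are placeholders:

```python
def filter_and_paginate(employees, name_query=None, limit=10, offset=0):
    """Case-insensitive substring match (ilike-style) plus limit/offset pagination."""
    rows = employees
    if name_query:
        needle = name_query.lower()
        rows = [e for e in rows if needle in e["full_name"].lower()]
    total = len(rows)  # total matches, reported alongside the page for the client
    return {"total": total, "items": rows[offset : offset + limit]}

people = [{"full_name": "Alice Martin"}, {"full_name": "Bob Marley"}, {"full_name": "Carol"}]
page = filter_and_paginate(people, name_query="mar", limit=1, offset=1)
print(page)  # {'total': 2, 'items': [{'full_name': 'Bob Marley'}]}
```

In the actual service layer the filter would compile to a SQL `ILIKE '%mar%'` clause and the slice to `LIMIT`/`OFFSET`, so the database does the work instead of Python.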
Technical Reference
The following sections outline the precise technical architecture, system design, and API contracts used to build this backend application.
1. Project Overview
A high-performance, TDD-driven Python backend utilizing FastAPI, SQLAlchemy, and Alembic. The system manages employee data and calculates dynamic salary deductions based on country-specific tax policies stored in a versioned, data-driven format.
2. Test Credentials
The application endpoints are protected by Basic Authentication. Use the following default seeded credentials depending on the role you want to evaluate:
| Role | Username | Password | Permissions |
|---|---|---|---|
| Admin | admin | adminpass | CREATE, READ, UPDATE, DELETE |
| Manager | manager | managerpass | CREATE, READ, UPDATE |
| Staff | staff | staffpass | READ (View Only) |
3. Architecture Principles
- Clean Architecture (Hybrid): Layered separation between API (Interface), Services (Business Logic), and Models (Persistence).
- Active Record Pattern: Database models are "smart" and encapsulate their own CRUD logic (e.g., `Employee.save()`, `Country.get_by_iso()`).
- Strict TDD: Development follows the Red → Green → Refactor cycle.
- Data-Driven Policy: Tax logic is not hardcoded; it is retrieved from the `countries` table at runtime.
- Statelessness: The API is designed to be horizontally scalable and Docker-ready.
We are utilizing a Hybrid Repository Pattern:
- The Models: Act as "Smart Entities" containing self-contained CRUD logic.
- The Repositories: Act as the orchestration layer used by Services, keeping raw SQL generation out of the business logic and allowing easy mocking in TDD.
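The split can be sketched as follows. This is a deliberately simplified illustration: an in-memory dict stands in for the SQLAlchemy session, and the class and method names (beyond `save()` and `get_by_iso()`, which the principles above mention) are hypothetical:

```python
from typing import ClassVar, Optional

class Country:
    """'Smart entity': the model carries its own minimal CRUD logic."""
    _store: ClassVar[dict] = {}  # stands in for the database table (iso_code -> Country)

    def __init__(self, iso_code: str, name: str, current_deduction: float):
        self.iso_code = iso_code
        self.name = name
        self.current_deduction = current_deduction

    def save(self) -> "Country":
        Country._store[self.iso_code] = self
        return self

    @classmethod
    def get_by_iso(cls, iso_code: str) -> Optional["Country"]:
        return cls._store.get(iso_code)

class CountryRepository:
    """Thin orchestration layer the services depend on; trivially mockable in tests."""
    def upsert(self, iso_code: str, name: str, deduction: float) -> Country:
        return Country(iso_code, name, deduction).save()

    def find(self, iso_code: str) -> Optional[Country]:
        return Country.get_by_iso(iso_code)

repo = CountryRepository()
repo.upsert("IN", "India", 10.0)
print(repo.find("IN").name)  # India
```

Services never touch the session directly; they call the repository, which delegates to the model's own CRUD methods.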
4. Technology Stack
- Language: Python 3.10+
- Framework: FastAPI (Asynchronous support)
- ORM: SQLAlchemy 2.0+
- Database: SQLite (Initial), PostgreSQL (Target Production)
- Migrations: Alembic
- Validation: Pydantic v2
- Testing: Pytest (Unit & Integration)
- Logging: Structured Logging
5. System Design & Data Models
| Table | Column | Type | Constraints |
|---|---|---|---|
| Country | iso_code | String(2) | Primary Key, Unique |
| | name | String | Unique |
| | current_deduction | Float | Default TDS % |
| | tax_slabs | JSON | Meta-info for versioned slabs |
| Employee | id | UUID | Primary Key |
| | full_name | String | Required |
| | salary | Float | Precision managed in Service |
| | country_iso | String(2) | Foreign Key -> Country |
| UserAccess | username | String | Primary Key, Indexed |
| | role_name | String | Required |
| | permissions | JSON | Required |
6. API Contract (v1)
All API endpoints return a unified response schema via the `StandardResponse` model wrapper:
```json
{
  "status": "success|error",
  "status_code": 200,
  "message": "Operation description",
  "data": { ... },
  "errors": null,
  "pagination": null
}
```
Core Endpoints
- `POST /api/v1/countries`: Upsert a country.
- `GET /api/v1/countries/list`: List all countries.
- `POST /api/v1/employees/create`: Create a new employee.
- `GET /api/v1/employees/list`: List all employees.
- `GET /api/v1/employees/salary-breakdown/{id}`: Returns gross salary, total deductions (calculated via the Country DB record), and net salary.
- `GET /api/v1/analytics/salary/by-country/{iso}`: Min, max, and average salary.
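The salary-breakdown endpoint is the heart of the data-driven tax policy: the service reads the country's `tax_slabs` JSON and applies it to the gross salary. The sketch below assumes a progressive-slab shape of `{"up_to": float | None, "rate": float}` (with `None` meaning "no upper bound"); the exact JSON schema of `tax_slabs` is an assumption for illustration:

```python
def salary_breakdown(gross: float, tax_slabs: list[dict]) -> dict:
    """Apply progressive tax slabs to a gross salary and return the breakdown.

    Slab format is assumed: {"up_to": float | None, "rate": float},
    ordered by ascending upper bound, None meaning no upper bound.
    """
    deductions = 0.0
    lower = 0.0
    for slab in tax_slabs:
        upper = slab["up_to"] if slab["up_to"] is not None else gross
        taxable = max(0.0, min(gross, upper) - lower)  # portion of salary inside this slab
        deductions += taxable * slab["rate"]
        lower = upper
        if gross <= upper:
            break
    return {
        "gross": gross,
        "total_deductions": round(deductions, 2),
        "net": round(gross - deductions, 2),
    }

slabs = [{"up_to": 50000, "rate": 0.0}, {"up_to": 100000, "rate": 0.1}, {"up_to": None, "rate": 0.2}]
print(salary_breakdown(120000, slabs))
# {'gross': 120000, 'total_deductions': 9000.0, 'net': 111000.0}
```

In the real service the slabs come from `Country.tax_slabs` at request time, which is what makes the policy versionable without code changes.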
7. Directory Structure
```text
.
├── src/                  # Application Source
│   ├── api/              # v1 routers and schemas
│   ├── core/             # Config, logger, exceptions, middlewares
│   ├── database/         # Base model (CRUD logic), session
│   ├── models/           # DB Entities
│   ├── repositories/     # Data Orchestration
│   ├── services/         # Math calculation and Analytics
│   ├── migrations/       # Alembic
│   ├── app.py            # FastAPI Initialization
│   └── migration.py      # CLI Utility
├── tests/                # Pytest Suite
└── Dockerfile            # Multi-stage production build
```
Future Enhancement Roadmap
The following enhancements can be implemented to further optimize performance and scalability as the system grows to handle millions of records:
1. Caching Mechanism
Implement multi-layer caching (Redis/Memcached) for frequently accessed data:
- Cache country tax configurations with TTL-based invalidation
- Cache salary analytics results to avoid recalculation on repeated requests
- Implement cache warming strategies for peak-access periods
- Use ETag/HTTP caching headers for client-side optimization
Benefits: Reduced database load, significantly improved response times for analytics queries, better scalability under high concurrent load.
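A minimal in-process sketch of the TTL-based invalidation idea (Redis or Memcached would replace this dict in production; the class and key names are illustrative):

```python
import time

class TTLCache:
    """Tiny in-process stand-in for Redis/Memcached with TTL-based invalidation."""
    def __init__(self):
        self._data = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl_seconds: float):
        self._data[key] = (time.monotonic() + ttl_seconds, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # lazy invalidation on read
            return None
        return value

cache = TTLCache()
cache.set("country:IN", {"current_deduction": 10.0}, ttl_seconds=300)
print(cache.get("country:IN"))  # {'current_deduction': 10.0}
```

The same get/set/TTL contract maps directly onto Redis `SETEX`/`GET`, so the service code would not need to change when swapping the backend.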
2. Pre-Computed Employee Counts with Database Triggers
Store aggregated employee counts per country with automatic synchronization via database triggers:
- Add `employee_count` and `total_salary_sum` fields to the `Country` model
- Create `AFTER INSERT/DELETE/UPDATE` triggers on the `Employee` table to maintain country totals automatically
- Index these aggregated fields for O(1) lookups and fast sorting operations
- Implement similar pre-computed sum and average fields for salary statistics by country
Why This Matters: When the employee count reaches millions, computing aggregations on-the-fly becomes prohibitively expensive. Pre-computed values with trigger-maintained accuracy enable constant-time lookups while ensuring data consistency.
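The mechanism can be demonstrated end to end with in-memory SQLite (table and trigger names are illustrative; in the real app this DDL would live in an Alembic migration run via `op.execute`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (
    iso_code TEXT PRIMARY KEY,
    employee_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    full_name TEXT,
    country_iso TEXT REFERENCES countries(iso_code)
);

-- Keep the per-country counter in sync automatically.
CREATE TRIGGER employees_ai AFTER INSERT ON employees BEGIN
    UPDATE countries SET employee_count = employee_count + 1
    WHERE iso_code = NEW.country_iso;
END;

CREATE TRIGGER employees_ad AFTER DELETE ON employees BEGIN
    UPDATE countries SET employee_count = employee_count - 1
    WHERE iso_code = OLD.country_iso;
END;
""")
conn.execute("INSERT INTO countries (iso_code) VALUES ('IN')")
conn.executemany(
    "INSERT INTO employees (full_name, country_iso) VALUES (?, ?)",
    [("Alice", "IN"), ("Bob", "IN")],
)
count = conn.execute(
    "SELECT employee_count FROM countries WHERE iso_code = 'IN'"
).fetchone()[0]
print(count)  # 2
```

Reading the count is now a single indexed-row lookup regardless of how many employee rows exist; the trigger pays the (tiny) maintenance cost at write time instead.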
3. Job Title-Based Analytics with Pre-Computation
Extend the trigger-based pre-computation pattern to job titles and departments:
- Create a `JobTitleAnalytics` table with fields: `job_title`, `employee_count`, `total_salary`, `avg_salary`, `min_salary`, `max_salary`
- Implement database triggers to update aggregates whenever employee records are created, updated, or deleted
- Support salary range queries by job title with pre-computed min/max values
- Enable cross-department comparison and cost center analysis without expensive joins
- Store update timestamps for audit trails and change tracking
Why This Matters: Real-time analytics across multiple dimensions (country, job title, department) requires pre-computed aggregations at scale. This approach maintains data consistency while enabling sub-millisecond query responses.
Implementation Strategy
These enhancements follow pragmatic TDD practices:
- Start with integration tests for the new features (triggers, caching logic)
- Implement database migrations using Alembic with reversible scripts
- Gradually refactor existing analytics endpoints to leverage pre-computed data
- Add cache invalidation strategies to maintain consistency
- Profile and benchmark against the previous implementation
- Roll out feature flags to enable gradual adoption
Conclusion
The complexity may be lower in some places and higher in others than strictly necessary, but this is what I was able to securely architect and code in approximately five hours. I tried my best to stay as close as possible to the TDD workflow. With more time, there are certainly further enhancements I would like to implement.
Specifically, this application is architected with scalability in mind. The three key enhancements outlined in the Future Enhancement Roadmap section are designed to handle growth to millions of employees without performance degradation:
- Caching Mechanism: Dramatically reduce database load and response times for repeated queries
- Pre-Computed Aggregates with Database Triggers: Enable O(1) lookups for country-level statistics at massive scale
- Job Title Analytics: Real-time multi-dimensional analytics without expensive joins
🚀 Live Deployment
This application has been deployed and is running live at:
You can access the live API, test credentials, and Swagger documentation directly from the deployed instance.