
The API was built on stateless HTTP from day one, but we had no shared standard for naming, implementation, or when to use which pattern. Routes were action-oriented instead of resource-oriented (e.g. “get this” or “do that” in the URL). That made understanding the API non-trivial and made it hard to reuse data across endpoints, since each response was custom-tailored to a specific action.
We used SQLAlchemy and had wrappers for common DB operations (bulk insert/update, commit/rollback), but ORM usage was on a case-by-case basis - no standard approach and no systematic optimization of queries. That was fine at first; as traffic and data grew, it became a bottleneck.
We introduced a RESTful API standard, documented in a Confluence page at the time - jumping to the present, the standard is now part of our AI agents' skills. The rules are clear: resource-based URLs in kebab-case and plural, HTTP verbs with defined semantics (GET for read, PUT for update, etc.), filter/select/range and `with_nested` query params, and sub-collections under a primary resource. We shared this with the team and made adherence a requirement in PRs.
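As an illustration (resource names and the exact query strings here are made up for the example), requests under the standard look like:

```
GET /seasonal-activities?with_nested=gateway_config    read a collection, nested data on demand
PUT /seasonal-activities                               bulk update against the collection
GET /carbon-templates/<id>/seasonal-activities         sub-collection under a primary resource
```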
Because the approach was generic (resources, filters, range, nested data), we could implement generic request handlers: shared logic for parsing query params and building queries. That let us implement standard GET and PUT collection endpoints with very little route-specific code.
GET – Parse query params, build the query (filters, order, range), choose schema, run query, serialize:
```python
@admin_blueprint2.route('/seasonal-activities', methods=['GET'])
@roles_accepted(*ALL_ADMINS, UserRoles.AWS)
def get_seasonal_activities():
    query_params = {**request.args}
    schema = QueryParams(query_params).get_schema(
        SeasonalActivity, SeasonalActivityWithNestedSchema,
        seasonal_activities_with_nested_schema, seasonal_activities_schema, True
    )
    return collection_get(request, SeasonalActivity, schema)
```
PUT – Parse body (list of entities with `id` or filter + single object), apply changes, commit:
```python
@admin_blueprint2.route('/seasonal-activities', methods=['PUT'])
@roles_accepted(*ALL_ADMINS)
def update_seasonal_activities():
    return collection_put(request, SeasonalActivity)
```
We still implement custom logic where a resource needs it (e.g. special validations or side effects), but the majority of collection GET and PUT flows go through these handlers. Over time we extended the request handlers to support join tables, extra conditions, dynamic Marshmallow schema selection, and range (`X-Range` / `X-Items-Range`) for pagination, so more complex resources could stay in the same pattern.
Our SQLAlchemy ORM usage and `db_utils` wrappers gave us a consistent way to do commit/rollback, clear faulty data, and handle bulk actions. The main challenge was how we used the ORM across the codebase, especially around joins and relationship loading.
Lazy loading was convenient: you touch `entity.relationship` and SQLAlchemy runs a query. The trap was serialization: when a Marshmallow schema rendered a collection of entities, it often accessed a relationship on each entity. That meant one query per entity (N+1 queries), which didn’t scale.
We started continuously reviewing endpoint performance, looking for N+1s and heavy queries. The main levers, each covered below, were eager loading only what the client requested, table partitioning, partition-aware indexes, and continuous query monitoring.
The generic request handlers support selective relationship loading end-to-end. The client can request nested data via the `with_nested` query param (e.g. `with_nested=gateway_config,sensor_config`). We then:
- Choose or create a schema that includes only the requested nested fields (`get_schema`).
- Apply load options only for those relationships (`filter_load_options_by_with_nested`), so we don’t over-fetch.
```python
@admin_blueprint2.route('/carbon-templates', methods=['GET'])
@roles_accepted(*ALL_ADMINS)
def get_carbon_templates():
    query_params = {**request.args}
    query_params_parser = QueryParams(query_params, raise_error_on_invalid_column_filter=False)
    schema = query_params_parser.get_schema(
        CarbonTemplate, carbon_template_schema,
        carbon_templates_with_nested_schema, carbon_templates_schema
    )
    load_options = query_params_parser.filter_load_options_by_with_nested({
        'seasonal_activities': [subqueryload(CarbonTemplate.seasonal_activities)],
    })
    return collection_get(
        request, CarbonTemplate, schema,
        raise_error_on_invalid_column=False,
        load_options=load_options,
    )
```
So: dynamic schema (flat vs nested, or a subset of nested fields) and dynamic load options (only what the client asked for) keep both DB and response size under control.
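Conceptually, the load-option filtering can be as simple as this sketch (illustrative names and standalone form; the real helper lives on our query-params parser):

```python
def filter_load_options_by_with_nested(with_nested_param, load_options_by_relationship):
    """Keep only the load options for relationships the client asked for,
    so relationships that were not requested are never eagerly fetched."""
    requested = set(with_nested_param.split(',')) if with_nested_param else set()
    return [
        option
        for relationship_name, options in load_options_by_relationship.items()
        if relationship_name in requested
        for option in options
    ]


# e.g. with_nested=seasonal_activities keeps only that relationship's options
# (strings stand in for real subqueryload/joinedload objects here)
selected = filter_load_options_by_with_nested(
    'seasonal_activities',
    {
        'seasonal_activities': ['subqueryload(CarbonTemplate.seasonal_activities)'],
        'gateway_config': ['joinedload(CarbonTemplate.gateway_config)'],
    },
)
```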
For large tables, we introduced table partitioning (e.g. by time or another partition key). That only helps if queries actually filter on the partition key, so the database can prune partitions instead of scanning all of them.
We also reconsidered indexes: on partitioned tables, indexes should be compound with the partition key to be effective. Adding an index without the partition key often doesn’t help as much as we’d expect.
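For example (hypothetical table, expressed as SQLAlchemy DDL for Postgres), a time-partitioned table whose index leads with the partition key:

```python
from sqlalchemy import Column, DateTime, Index, Integer, MetaData, Table

metadata = MetaData()

# Hypothetical measurements table, range-partitioned by time. In Postgres the
# partition key must be part of the primary key / unique constraints.
measurements = Table(
    "measurements",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("measured_at", DateTime, primary_key=True),
    Column("sensor_id", Integer, nullable=False),
    postgresql_partition_by="RANGE (measured_at)",
)

# Compound index that includes the partition key: a query constrained by
# measured_at can prune partitions and then seek on sensor_id. An index on
# sensor_id alone would have to be probed in every single partition.
idx = Index(
    "ix_measurements_measured_at_sensor_id",
    measurements.c.measured_at,
    measurements.c.sensor_id,
)
```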
We used RDS Performance Insights (and DB logs in general) to see which queries caused load or wait. For the heaviest ones we applied the levers above: tighter load options, better indexes, or partitioning.
This was iterative: find a slow or busy query, optimize it, then look for the next one.
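Performance Insights does the heavy lifting on RDS, but a slow-query log inside the app helps close the loop locally. A minimal sketch with SQLAlchemy engine events (threshold and names are illustrative, and this mirrors the recipe in the SQLAlchemy docs rather than our production code):

```python
import logging
import time

from sqlalchemy import create_engine, event, text

SLOW_QUERY_SECONDS = 0.5  # illustrative threshold
engine = create_engine("sqlite://")


@event.listens_for(engine, "before_cursor_execute")
def start_timer(conn, cursor, statement, parameters, context, executemany):
    # Stash the start time on the connection's scratch dict.
    conn.info["query_start"] = time.monotonic()


@event.listens_for(engine, "after_cursor_execute")
def log_if_slow(conn, cursor, statement, parameters, context, executemany):
    elapsed = time.monotonic() - conn.info["query_start"]
    if elapsed >= SLOW_QUERY_SECONDS:
        logging.warning("slow query (%.2fs): %s", elapsed, statement)


with engine.connect() as conn:
    result = conn.execute(text("SELECT 1")).scalar()
```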
On the client we wanted to call the API less and reuse data more across components.
Example: fetching sensors with parallel range requests via `getWithParallelRequests`, and using it from a React Query hook:
```typescript
export const getWithParallelRequests = <T>(
  url: string,
  queryParams: any = {},
  pageSize: number = DEFAULT_PAGE_SIZE,
  maxParallelRequests: number = DEFAULT_MAX_PARALLEL_REQUESTS
): Promise<T[]> => {
  return lastValueFrom(
    fetchFirstPageWithRetry<T>(url, queryParams, pageSize).pipe(
      mergeMap(({ data, headers }) => {
        // Everything fit in the first page - no extra requests needed.
        if (headers['x-items-all-range'] === 'true') return of(data);
        const totalItems = parseInt(headers['x-items-range']?.split('/')[1], 10);
        const resultsArray = Array(totalItems);
        fillResultsArray(resultsArray, data, 0);
        const pageRanges: { start: number; end: number }[] = [];
        for (let i = pageSize; i < totalItems; i += pageSize) {
          pageRanges.push({ start: i, end: Math.min(i + pageSize - 1, totalItems - 1) });
        }
        return from(pageRanges).pipe(
          mergeMap(
            ({ start, end }) => from(fetchRangeWithRetry<T>(url, queryParams, start, end)),
            maxParallelRequests
          ),
          reduce((results, { start, data }) => {
            fillResultsArray(results, data, start);
            return results;
          }, resultsArray)
        );
      })
    )
  );
};
```
```typescript
export const fetchSensors = ({ queryKey }: { queryKey: QueryKey }): Promise<Sensor[]> => {
  const [_key, queryParams, isParallel] = queryKey;
  if (isParallel) {
    return getWithParallelRequests<Sensor>('/sensors', queryParams);
  }
  return axios.get<Sensor[]>('/sensors', { params: queryParams }).then((res) => res.data);
};
```
We have not moved to client-side paging (e.g. “fetch only a specific page and show it in the UI until the user moves to the next page”) because our users need to filter quickly; we don't want to rely on server-side filters, which would require an additional server call every time the user changes a filter. If we ever hit a case where the dataset was too large even with range and filters, we'd consider client-side paging or virtualized lists for that flow.
Scaling the API server came from the combination above: a consistent RESTful standard, generic request handlers, systematic ORM and query optimization, and fewer, smarter client requests. Together, these changes let us grow traffic and data size while keeping the API predictable for both backend and frontend teams.