Beyond REST maturity levels:
Real life, high-load REST APIs
ConFoo, Montreal, Canada - February 24th, 2023
© David Buchmann
David Buchmann - david@liip.ch
PHP Engineer, Liip SA, Switzerland
Richardson REST maturity model
- Remote function calls over HTTP
- Resources
- HTTP verbs
- Hypermedia Controls
REST maturity levels are a good way to think conceptually about APIs
As promised, this talk is not about the REST maturity concept, but about tools for building APIs
You've got a "situation"
- Data in enterprise systems
- Websites take a lot of effort
- Data access is slow
- Data distributed over several systems
Delivering Data
API: Symfony Application
- FOSRest: Routing, Content Negotiation
- ElasticSearch queries
- JMSSerializer: JSON to Model to JSON / XML
API: Versioning
- JMSSerializer feature
- FOSRest handles version detection: query string / Accept header
- Only when not backwards compatible
- Alternatives:
- Elasticsearch index per version
- Run old application version in parallel
/**
 * The description as HTML
 *
 * @Serializer\Since("2")
 */
public string $description;
Api-Version: 2
/**
 * Plain text version of description.
 *
 * @Serializer\Until("1")
 * @Serializer\VirtualProperty
 * @Serializer\SerializedName("description")
 */
public function getDescPlaintext(): string
{
    return strip_tags($this->description);
}
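The version detection step can be pictured roughly as follows. This is a minimal sketch in plain PHP, not FOSRestBundle's actual code: the function name `detectApiVersion`, the `version` query parameter, the default of "1", and the precedence (header before query string) are all assumptions made for illustration. Only the `Api-Version` header comes from the slide.

```php
// Hypothetical helper: resolve the API version from the request.
// Precedence is an assumption: Api-Version header first, then ?version=...
function detectApiVersion(array $headers, array $query, string $default = '1'): string
{
    // Header names are case-insensitive, so normalize the keys first.
    $headers = array_change_key_case($headers, CASE_LOWER);

    $candidate = $headers['api-version'] ?? $query['version'] ?? $default;

    // Only accept plain numeric versions such as "2" or "1.1".
    return preg_match('/^\d+(\.\d+)?$/', (string) $candidate) === 1
        ? (string) $candidate
        : $default;
}
```

The resolved version would then be handed to the serializer, which decides via `Since` / `Until` which properties to expose.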
Varnish Cache
- Handle more load
- Super fast responses
Varnish Cache: Prerequisites
- Maturity level 2: Correct usage of HTTP verbs
- Get your HTTP status codes right
- API needs to send correct Cache-Control headers
- Challenge: Individual permissions
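One way to think about the Cache-Control prerequisite is as a small decision function. The sketch below is plain PHP under stated assumptions: the helper name `cacheControlFor`, the 300 second default lifetime, and the choice to cache only successful GET / HEAD responses are illustrative, not taken from the talk.

```php
// Hypothetical helper: build a Cache-Control value for an API response.
// $sharedMaxAge is the lifetime in seconds for shared caches such as Varnish.
function cacheControlFor(string $method, int $status, bool $userSpecific, int $sharedMaxAge = 300): string
{
    // In this sketch only safe, successful reads are cached.
    $cacheable = in_array($method, ['GET', 'HEAD'], true) && $status === 200;

    if (!$cacheable) {
        return 'no-store';
    }

    // Responses that depend on individual permissions must not be shared.
    if ($userSpecific) {
        return 'private, max-age=0, must-revalidate';
    }

    return sprintf('public, s-maxage=%d', $sharedMaxAge);
}
```

This is also why correct verbs and status codes matter: the cache can only make the right decision if the API reports them truthfully.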
Varnish Cache: Access Control
- Basic authentication with htpasswd file
- Or: Pre-flight call to backend to verify credentials, cache pre-flight result
- FOSHttpCache user context: share cache between users with the same set of permissions
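The idea behind sharing cache entries between users with the same permissions can be sketched as a hash over the permission set. This is an illustration of the concept in plain PHP, not FOSHttpCache's implementation; `permissionHash` and the permission names in the usage below are hypothetical.

```php
// Hypothetical helper: derive a cache-sharing key from a user's permissions.
function permissionHash(array $permissions): string
{
    // Order and duplicates must not matter, so normalize before hashing.
    $permissions = array_values(array_unique($permissions));
    sort($permissions, SORT_STRING);

    return hash('sha256', implode("\n", $permissions));
}
```

Two users with the permissions `['read', 'export']` and `['export', 'read']` get the same hash and can therefore be served the same cached response, while a user with only `['read']` gets a different one.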
Varnish Cache: Routing
- Single point of entry, reroute to other backends based on request paths
- Alternate backends, e.g. newest version of PHP and fallback to old on error
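In Varnish this routing lives in VCL; the underlying decision is simple enough to sketch in plain PHP. The function `pickBackend`, the path prefixes, and the backend names are assumptions for illustration only.

```php
// Hypothetical routing table: path prefix => backend name.
function pickBackend(string $path, array $routes, string $default): string
{
    // Check longer prefixes first so the most specific route wins.
    uksort($routes, fn (string $a, string $b): int => strlen($b) <=> strlen($a));

    foreach ($routes as $prefix => $backend) {
        if (str_starts_with($path, $prefix)) {
            return $backend;
        }
    }

    return $default;
}
```

With `['/api' => 'symfony', '/api/legacy' => 'old_app']`, a request for `/api/legacy/items` goes to `old_app` and everything else under `/api` to `symfony`.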
Varnish Cache: To BAN or not to BAN
- BAN: Invalidate by regular expression matched against the request
- Every request checked against ban list
- If you ban a lot, this list becomes long
- Is outdated cached data acceptable?
- Alternatives: PURGE single path or use xkey / ykey
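The difference between a ban list and tag-based invalidation (the xkey / ykey style) can be illustrated with a small in-memory stand-in. This is a conceptual sketch only; `TaggedCache` and the tag names are hypothetical and say nothing about how Varnish stores objects internally.

```php
// Hypothetical in-memory stand-in for a cache that supports tags.
final class TaggedCache
{
    /** @var array<string, string> path => cached body */
    private array $entries = [];

    /** @var array<string, array<string, true>> tag => set of paths */
    private array $pathsByTag = [];

    public function store(string $path, string $body, array $tags): void
    {
        $this->entries[$path] = $body;
        foreach ($tags as $tag) {
            $this->pathsByTag[$tag][$path] = true;
        }
    }

    public function get(string $path): ?string
    {
        return $this->entries[$path] ?? null;
    }

    // Drop every entry carrying the tag, without scanning unrelated entries.
    // Returns the number of paths the tag referenced.
    public function invalidateTag(string $tag): int
    {
        $paths = array_keys($this->pathsByTag[$tag] ?? []);
        foreach ($paths as $path) {
            unset($this->entries[$path]);
        }
        unset($this->pathsByTag[$tag]);

        return count($paths);
    }
}
```

Invalidating the tag `product-1` removes the detail page and every list that contains that product, while a ban would instead add one more expression to the list checked on each request.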
Varnish Cache: Increase Cache Hits
- Normalize Api-Version and version query
- Normalize Accept and format extension in URL
- Normalize Accept-Language
- Normalize Accept-Encoding
- Suppress Cookies if possible
- vcl_error with custom status codes to abort early
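Normalization means collapsing the many header values clients send into the few variants the API really distinguishes. In Varnish this is done in VCL; the logic is sketched here in plain PHP with hypothetical function names, and it deliberately ignores q-values to stay short.

```php
// Hypothetical normalizer: reduce Accept-Language to one supported language.
function normalizeAcceptLanguage(string $header, array $supported, string $default): string
{
    // Walk the client's list in the order given; q-values are ignored here.
    foreach (explode(',', $header) as $part) {
        $lang = strtolower(substr(trim(explode(';', $part)[0]), 0, 2));
        if (in_array($lang, $supported, true)) {
            return $lang;
        }
    }

    return $default;
}

// Hypothetical normalizer: keep a single well-known encoding or none.
function normalizeAcceptEncoding(string $header): string
{
    return stripos($header, 'gzip') !== false ? 'gzip' : '';
}
```

Every distinct header value that reaches the cache key is a separate cache entry, so fewer distinct values directly means more cache hits.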
Get people to use your API
Documentation
- Self-documenting names are not enough
- Document relationships
- Tutorials
- Changelog and migration guides
Zircote Swagger-PHP
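use OpenApi\Attributes as OA;

class ProductController
{
    #[OA\Get(
        path: "/product/{id}",
        tags: ["Products"]
    )]
    #[OA\Response(
        response: 200,
        // OpenAPI requires a description; this text is a placeholder.
        description: "A single product",
        content: new OA\JsonContent(
            ref: "#/components/schemas/Product"
        )
    )]
    public function getAction($id)
    {
        // ...
    }
}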
Indexing Architecture
Workflow
Architecture
- Loader (with batch option)
- SOAP, REST APIs
- CSV & XML files
- Doctrine (Oracle)
- Message Queue
- Persistor
- Indexer
Indexer
- Factory pattern
- Translate source data into our models
- Plug mapper into factory for each aspect of the product => partial updates
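The factory with pluggable mappers could look roughly like this. It is a minimal sketch, not the project's code: the interface `ProductMapper`, the two mappers, and the source field names `TITLE` and `PRICE` are made up for illustration.

```php
// Hypothetical mapper contract: each mapper fills one aspect of the model.
interface ProductMapper
{
    public function aspect(): string;

    public function map(array $source, array $model): array;
}

final class TitleMapper implements ProductMapper
{
    public function aspect(): string
    {
        return 'title';
    }

    public function map(array $source, array $model): array
    {
        $model['title'] = trim((string) $source['TITLE']);

        return $model;
    }
}

final class PriceMapper implements ProductMapper
{
    public function aspect(): string
    {
        return 'price';
    }

    public function map(array $source, array $model): array
    {
        $model['price'] = (float) $source['PRICE'];

        return $model;
    }
}

final class ProductFactory
{
    /** @param ProductMapper[] $mappers */
    public function __construct(private array $mappers)
    {
    }

    // Run all mappers, or only those for the aspects that changed.
    public function build(array $source, array $model = [], ?array $aspects = null): array
    {
        foreach ($this->mappers as $mapper) {
            if ($aspects === null || in_array($mapper->aspect(), $aspects, true)) {
                $model = $mapper->map($source, $model);
            }
        }

        return $model;
    }
}
```

A price change message then only needs `build($source, $existingModel, ['price'])`, which leaves the other aspects of the stored model untouched.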
Symfony Workers and Commands
- Commands for cronjobs
- Message workers to handle RabbitMQ messages
- Long-running PHP tasks: Rejuvenation and restart
- Cloud task scheduler for workers
- Or: Supervisord
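Rejuvenation means a worker deliberately exits after a while and lets the scheduler or Supervisord start a fresh process. A minimal sketch of such a loop, assuming hypothetical `$receive` and `$handle` callables in place of a real queue client:

```php
// Hypothetical worker loop: stop once a message or memory budget is used up
// and rely on the process manager to start a fresh process.
function runWorker(callable $receive, callable $handle, int $maxMessages, int $maxMemoryBytes): string
{
    $handled = 0;

    while (true) {
        if ($handled >= $maxMessages) {
            return 'message limit reached';
        }
        if (memory_get_usage(true) >= $maxMemoryBytes) {
            return 'memory limit reached';
        }

        // A real worker would block and wait here; the sketch just stops.
        $message = $receive();
        if ($message === null) {
            return 'queue empty';
        }

        $handle($message);
        $handled++;
    }
}
```

Exiting cleanly between two messages avoids killing a worker in the middle of a task and keeps memory leaks from accumulating.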
Data Quality
- Rules to pick right data
- Rules when to ignore whole record
- Offer control API to override visibility to hide problematic records
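The three kinds of rules can be sketched as small functions. The rule shown (use the first non-empty title, hide records without one) and the field name `title` are invented for illustration; the actual rules depend on the source systems.

```php
// Hypothetical rule: pick the title from the first source that has one.
// $sources is expected in priority order.
function pickTitle(array $sources): ?string
{
    foreach ($sources as $record) {
        $title = trim((string) ($record['title'] ?? ''));
        if ($title !== '') {
            return $title;
        }
    }

    return null;
}

// Hypothetical rule: decide whether the record is shown at all.
function isVisible(array $sources, ?bool $override): bool
{
    // A manual override from the control API wins over automatic rules.
    if ($override !== null) {
        return $override;
    }

    // Ignore the whole record when no source provides a usable title.
    return pickTitle($sources) !== null;
}
```

Keeping the override separate from the automatic rules lets editors hide a problematic record immediately, without waiting for the source data to be fixed.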
Monitoring and Diagnostics
- API access to raw data for debugging
- API call to get status of items
- Central logging (e.g. Graylog)
- Monitoring: New Relic, Dynatrace, ...
- Alerting (e.g. OpsGenie)
Elasticsearch
Elasticsearch: Indexes
- One index per language
- De-normalize everything for fast responses
- No joins or parent-child relations
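De-normalizing per language means that each document already contains everything a response needs. A rough sketch, assuming a hypothetical source shape with translations keyed by language code and an index naming scheme of `products_<language>`:

```php
// Hypothetical builder: one self-contained document per language index.
function buildDocuments(array $product, array $category, array $languages): array
{
    $documents = [];
    foreach ($languages as $lang) {
        // Copy the category into the product document instead of joining at query time.
        $documents["products_$lang"] = [
            'id' => $product['id'],
            'title' => $product['title'][$lang] ?? null,
            'category' => [
                'id' => $category['id'],
                'name' => $category['name'][$lang] ?? null,
            ],
        ];
    }

    return $documents;
}
```

The price for the fast reads is paid at indexing time: when a category is renamed, every product document that embeds it has to be rebuilt.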
Elasticsearch: Schema Changes
- ES guesses field types (dynamic mapping)
- Not always correctly => manual definitions
- No way to change the mapping of existing fields in an index
- Deploy new code with the new schema, but do not put it online yet
- Use the reindex API to copy data from the old index to the new schema
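The reindex call itself is a `POST /_reindex` request whose body names a source and a destination index. The sketch below only builds that JSON body in plain PHP; the helper name `reindexBody` and the index names in the usage are assumptions, and sending the request is left to whatever HTTP or Elasticsearch client the application uses.

```php
// Build the JSON body for Elasticsearch's POST /_reindex request.
function reindexBody(string $oldIndex, string $newIndex): string
{
    return json_encode([
        'source' => ['index' => $oldIndex],
        'dest' => ['index' => $newIndex],
    ], JSON_THROW_ON_ERROR);
}
```

The destination index has to exist with the new manual mapping before the copy starts; otherwise Elasticsearch would guess the field types again.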
Elasticsearch: Clustering
- ES runs on several servers
- They communicate among themselves
- Indexes are sharded
- Automatic aggregation and load balancing
Outlook
Beyond REST
- Aggregated information, tailored to actual needs
- Fewer requests, less bandwidth, but more server effort
- Layered Architecture: App API, Facade
Alternatives: Stateless, but...
- React and data pull APIs
- Falcor: Graph as JSON, specify wanted data fields
- GraphQL: Schema for graph data, query language
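What Falcor and GraphQL have in common is that the client states which fields it wants. The following plain PHP sketch illustrates only that idea; it is neither GraphQL nor Falcor, and `selectFields` with its selection format (key => true for a leaf, key => array for a nested selection) is a hypothetical construct.

```php
// Hypothetical selector: return only the requested fields of nested data.
function selectFields(array $data, array $fields): array
{
    $result = [];
    foreach ($fields as $key => $selection) {
        if (!array_key_exists($key, $data)) {
            continue;
        }

        // Recurse for nested selections, copy the value for leaves.
        $result[$key] = is_array($selection) && is_array($data[$key])
            ? selectFields($data[$key], $selection)
            : $data[$key];
    }

    return $result;
}
```

Asking for `['title' => true, 'category' => ['name' => true]]` returns just those two values in one request, which is where the bandwidth savings and the extra server effort both come from.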
Thank you!
@dbu
David Buchmann, Liip SA