Today, we are pleased to introduce Zuul, our answer to these challenges and the latest addition to our open source suite of software. Although Zuul is an edge service originally designed to front the Netflix API, it is now being used in a variety of ways by a number of systems throughout Netflix.
Figure: Zuul in Netflix's Cloud Architecture
How Does Zuul Work?
At the center of Zuul is a series of filters that are capable of performing a range of actions during the routing of HTTP requests and responses. The following are the key characteristics of a Zuul filter:
- Type: most often defines the stage during the routing flow when the filter will be applied (although it can be any custom string)
- Execution Order: applied within the Type, defines the order of execution across multiple filters
- Criteria: the conditions required in order for the filter to be executed
- Action: the action to be executed if the Criteria are met
class DeviceDelayFilter extends ZuulFilter {
Zuul provides a framework to dynamically read, compile, and run these filters. Filters do not communicate with each other directly; instead, they share state through a RequestContext, which is unique to each request.
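To make the four filter characteristics and the shared per-request state more concrete, here is a minimal sketch in plain Java. It does not use the Zuul library: `SketchContext`, `SketchFilter`, both filter classes, and the `deviceType`, `flagged`, and `delayMillis` keys are hypothetical stand-ins invented for illustration, so treat it as one possible shape rather than Zuul's actual API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-request context: a plain map of shared state.
class SketchContext {
    final Map<String, Object> state = new HashMap<>();
}

// Hypothetical stand-in for a filter, mirroring the four characteristics.
interface SketchFilter {
    String type();                        // Type: stage of the routing flow, e.g. "pre"
    int order();                          // Execution Order within the type
    boolean shouldRun(SketchContext ctx); // Criteria
    void run(SketchContext ctx);          // Action
}

// A "pre" filter that flags requests from an assumed device type.
class FlagDeviceFilter implements SketchFilter {
    public String type() { return "pre"; }
    public int order() { return 5; }
    public boolean shouldRun(SketchContext ctx) {
        return "BrokenDevice".equals(ctx.state.get("deviceType"));
    }
    public void run(SketchContext ctx) {
        ctx.state.put("flagged", Boolean.TRUE);
    }
}

// A later "pre" filter that reads the flag without referencing FlagDeviceFilter.
class DelayFlaggedFilter implements SketchFilter {
    public String type() { return "pre"; }
    public int order() { return 10; }
    public boolean shouldRun(SketchContext ctx) {
        return Boolean.TRUE.equals(ctx.state.get("flagged"));
    }
    public void run(SketchContext ctx) {
        ctx.state.put("delayMillis", 2000); // record a delay for later stages to apply
    }
}
```

The second filter only ever looks at the context, which is the point: the two filters are coupled through request-scoped state rather than through direct calls to each other.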
Filters are currently written in Groovy, although Zuul supports any JVM-based language. The source code for each filter is written to a specified set of directories on the Zuul server that are periodically polled for changes. Updated filters are read from disk, dynamically compiled into the running server, and are invoked by Zuul for each subsequent request.
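The polling step can be sketched roughly as follows. This is not Zuul's loader: `FilterDirectoryPoller` and `scanForChanges` are hypothetical names, the `.groovy` suffix check is an assumption, and the sketch stops at detecting new or modified files, leaving compilation and registration as a comment.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical poller: reports filter source files that are new or modified since the last scan.
class FilterDirectoryPoller {
    private final File directory;
    private final Map<String, Long> lastSeen = new HashMap<>();

    FilterDirectoryPoller(File directory) {
        this.directory = directory;
    }

    // Returns files whose last-modified time differs from the one recorded on the previous scan.
    List<File> scanForChanges() {
        List<File> changed = new ArrayList<>();
        File[] files = directory.listFiles((dir, name) -> name.endsWith(".groovy"));
        if (files == null) {
            return changed; // directory is missing or unreadable
        }
        for (File file : files) {
            long modified = file.lastModified();
            Long previous = lastSeen.put(file.getName(), modified);
            if (previous == null || previous != modified) {
                changed.add(file); // a real server would compile and register the filter here
            }
        }
        return changed;
    }
}
```

In a running server, something like a `ScheduledExecutorService` could call `scanForChanges()` at a fixed interval, so that an updated file is picked up on the next poll and used for subsequent requests.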
Figure: Zuul Core Architecture
There are several standard filter types that correspond to the typical lifecycle of a request:
- PRE filters execute before routing to the origin. Examples include request authentication, choosing origin servers, and logging debug info.
- ROUTING filters handle routing the request to an origin. This is where the origin HTTP request is built and sent using Apache HttpClient or Netflix Ribbon.
- POST filters execute after the request has been routed to the origin. Examples include adding standard HTTP headers to the response, gathering statistics and metrics, and streaming the response from the origin to the client.
- ERROR filters execute when an error occurs during one of the other phases.
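The lifecycle above can be approximated with a small runner. This is a deliberately simplified sketch, not Zuul's request handling: `LifecycleFilter` and `LifecycleRunner` are hypothetical, the type labels `"pre"`, `"routing"`, `"post"`, and `"error"` are assumed strings, and a shared list of strings stands in for the request state.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical filter record: a type, an order within that type, and an action on a shared trace.
class LifecycleFilter {
    final String type;
    final int order;
    final Consumer<List<String>> action;

    LifecycleFilter(String type, int order, Consumer<List<String>> action) {
        this.type = type;
        this.order = order;
        this.action = action;
    }
}

class LifecycleRunner {
    private final List<LifecycleFilter> filters;

    LifecycleRunner(List<LifecycleFilter> filters) {
        this.filters = filters;
    }

    // Runs every filter of one type, lowest order first.
    private void runType(String type, List<String> trace) {
        filters.stream()
               .filter(f -> f.type.equals(type))
               .sorted(Comparator.comparingInt(f -> f.order))
               .forEach(f -> f.action.accept(trace));
    }

    // Simplified flow: pre, routing, post; error filters run if any phase throws.
    List<String> handle() {
        List<String> trace = new ArrayList<>();
        try {
            runType("pre", trace);
            runType("routing", trace);
            runType("post", trace);
        } catch (RuntimeException e) {
            runType("error", trace);
        }
        return trace;
    }
}
```

Registration order does not matter here; the runner groups by type and sorts by order on each request, which mirrors how Type and Execution Order together determine when a filter runs.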
Figure: Request Lifecycle
How We Use Zuul
There are many ways in which Zuul helps us run the Netflix API and the overall Netflix streaming application. Here is a short list of some of the more common examples, a few of which we describe in more detail below:
- Authentication
- Insights
- Stress Testing
- Canary Testing
- Dynamic Routing
- Load Shedding
- Security
- Static Response handling
- Multi-Region Resiliency
Insights
Because Zuul can add, change, and compile filters at run-time, system behavior can be quickly altered. We add new routes, assign authorization access rules, and categorize routes all by adding or modifying filters. And when unexpected conditions arise, Zuul has the ability to quickly intercept requests so we can explore, work around, or fix the problem.
The dynamic filtering capability of Zuul allows us to find and isolate problems that would normally be difficult to locate among our large volume of requests. A filter can be written to route a specific customer or device to a separate API cluster for debugging. This technique was used when a new page from the website needed tuning. Performance problems, as well as unexplained errors, were observed. It was difficult to debug the issues because the problems were only happening for a small set of customers. By isolating the traffic to a single instance, patterns and discrepancies in the requests could be seen in real time.
Zuul has what we call a “SurgicalDebugFilter”. This is a special “pre” filter that will route a request to an isolated cluster if patternMatches() returns true. Adding this filter to match for the new page allowed us to quickly identify and analyze the problem. Prior to using Zuul, Hadoop was being used to query through billions of logged requests to find the several thousand requests for the new page. With the filter in place, we were able to reduce the problem to a search through a relatively small log file on a few servers and observe behavior in real time.
The following is an example of the SurgicalDebugFilter that is used to route matched requests to a debug cluster:
class SharpDebugFilter extends SurgicalDebugFilter {
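The idea behind this kind of filter, a match criterion that diverts selected requests to an isolated cluster, can be sketched without the Zuul library. Everything here is an assumption made for illustration: the `DebugRoutingSketch` class, the device IDs, the `/newpage` path fragment, and the cluster names are placeholders, and only the `patternMatches` name echoes the text above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of criteria-based debug routing; not Zuul's SurgicalDebugFilter itself.
class DebugRoutingSketch {
    // Placeholder device IDs and cluster names, chosen only for illustration.
    static final Set<String> DEBUG_DEVICE_IDS = new HashSet<>(Arrays.asList("device-a", "device-b"));
    static final String DEFAULT_ORIGIN = "api-prod";
    static final String DEBUG_ORIGIN = "api-debug";

    // Criteria: match a known device ID or a request path for the page under investigation.
    static boolean patternMatches(String deviceId, String requestUri) {
        return DEBUG_DEVICE_IDS.contains(deviceId)
                || (requestUri != null && requestUri.contains("/newpage"));
    }

    // Action: matched requests go to the isolated cluster; everything else is left alone.
    static String chooseOrigin(String deviceId, String requestUri) {
        return patternMatches(deviceId, requestUri) ? DEBUG_ORIGIN : DEFAULT_ORIGIN;
    }
}
```

Because the match is narrow, the debug cluster sees only the requests of interest, which is what makes its logs small enough to inspect directly.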
Stress Testing
Gauging the performance and capacity limits of our systems is important for us to predict our EC2 instance demands, tune our autoscaling policies, and keep track of general performance trends as new features are added. An automated process that uses dynamic Archaius configurations within a Zuul filter steadily increases the traffic routed to a small cluster of origin servers. As the instances receive more traffic, their performance characteristics and capacity are measured. This informs us of how many EC2 instances will be needed to run at peak, whether our autoscaling policies need to be modified, and whether or not a particular build has the required performance characteristics to be pushed to production.
Multi-Region Resiliency
Zuul is central to our multi-region ELB resiliency project called Isthmus. As part of Isthmus, Zuul is used to bridge requests from the west coast cloud region to the east coast to help us have multi-region redundancy in our ELBs for our critical domains. Stay tuned for a tech blog post about our Isthmus initiative.
Zuul OSS
Today, we are open sourcing Zuul as a few different components:
- zuul-core - A library containing a set of core features.
- zuul-netflix - An extension library using many Netflix OSS components:
  - Servo for insights, metrics, monitoring
  - Hystrix for real time metrics with Turbine
  - Eureka for instance discovery
  - Ribbon for routing
  - Archaius for real-time configuration
  - Astyanax for filter persistence in Cassandra
- zuul-filters - Filters to work with zuul-core and zuul-netflix libraries
- zuul-webapp-simple - A simple example of a web application built on zuul-core including a few basic filters
- zuul-netflix-webapp - A web application putting zuul-core, zuul-netflix, and zuul-filters together.
Figure: Netflix OSS libraries in Zuul
- Weighted load balancing to balance a percentage of load to a certain server or cluster for capacity testing
- Request debugging
- Routing filters for Apache HttpClient and Netflix Ribbon
- Statistics collecting
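The weighted load balancing item in the list above, which also underlies the stress-testing approach described earlier, comes down to sending a configurable percentage of requests to a chosen cluster. The following is a minimal sketch under stated assumptions: `WeightedRouterSketch`, its method names, and the cluster names are hypothetical, and a plain setter stands in for whatever dynamic configuration source drives the percentage.

```java
import java.util.Random;

// Hypothetical sketch of percentage-based routing to a test cluster.
class WeightedRouterSketch {
    private final Random random;
    private volatile int testPercent; // 0..100, adjustable while the server runs

    WeightedRouterSketch(Random random, int testPercent) {
        this.random = random;
        setTestPercent(testPercent);
    }

    // In the stress-testing scenario, a dynamic configuration property would drive this value.
    void setTestPercent(int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        this.testPercent = percent;
    }

    // Sends roughly testPercent of requests to the test cluster, the rest to the default one.
    String chooseCluster() {
        return random.nextInt(100) < testPercent ? "test-cluster" : "default-cluster";
    }
}
```

Raising the percentage step by step while measuring the test cluster is one way to picture the automated capacity test: the routing decision stays the same, and only the configured weight changes.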
Mikey Cohen - API Platform
Matt Hawthorne - API Platform