
Cloudflare quits Nginx for Pingora, its in-house proxy written in Rust


Cloudflare has long relied on Nginx as part of its HTTP proxy stack, but the company has now announced that it has replaced Nginx with Pingora, in-house software written in Rust: “We’ve built a faster, more efficient, and more general internal proxy as a platform for our current and future products.”

According to the announcement, the software handles more than one trillion requests per day and delivers better performance while using only about a third of the CPU and memory resources of the old service.

“As Cloudflare has scaled, we’ve outgrown NGINX. It’s been great over the years, but over time its limitations at our scale meant it made sense to build something new. We could no longer get the performance we needed, and NGINX didn’t have the features we needed for our very complex environment.”

Pingora powers the Cloudflare services that proxy traffic between its network and origin servers on the internet, including its CDN, Workers fetch, Tunnel, Stream, R2, and many other features and products.

Cloudflare said it chose to build a new proxy because of the many limitations it had run into with NGINX over the years. These include architectural limitations that hurt performance and make certain types of functionality difficult to add. The company also noted that the NGINX community is not very active and that development often happens “behind closed doors.”

Rust was chosen as the project’s language because it can do what C can do in a memory-safe way without compromising performance.

Cloudflare also implemented its own HTTP library for Rust to meet all of its varied needs. Pingora uses a multi-threaded architecture instead of a multi-process one.
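To make the multi-threaded idea concrete, here is a toy thread-per-connection TCP forwarder in Rust. It is only a sketch under assumed placeholder addresses, not Pingora’s actual design (Pingora multiplexes many requests across threads rather than dedicating a thread per connection):

```rust
use std::io;
use std::net::{TcpListener, TcpStream};
use std::thread;

fn proxy(client: TcpStream) -> io::Result<()> {
    // Connect to an upstream (placeholder address) and copy bytes in both
    // directions until either side closes.
    let upstream = TcpStream::connect("127.0.0.1:9000")?;
    let (mut cr, mut cw) = (client.try_clone()?, client);
    let (mut ur, mut uw) = (upstream.try_clone()?, upstream);
    let t = thread::spawn(move || {
        let _ = io::copy(&mut cr, &mut uw); // client -> upstream
    });
    let _ = io::copy(&mut ur, &mut cw); // upstream -> client
    let _ = t.join();
    Ok(())
}

fn main() -> io::Result<()> {
    // Threads within one process share memory cheaply; a multi-process
    // design would instead fork separate worker processes.
    let listener = TcpListener::bind("127.0.0.1:8080")?;
    for client in listener.incoming() {
        let client = client?;
        thread::spawn(move || {
            if let Err(e) = proxy(client) {
                eprintln!("proxy error: {e}");
            }
        });
    }
    Ok(())
}
```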


Overall traffic on Pingora shows a 5 ms reduction in median TTFB (time to first byte) and an 80 ms reduction at the 95th percentile.

Across all customers, Pingora makes only a third as many new connections per second as the old service.

For one major customer, it increased connection reuse from 87.1% to 99.92%, a 160x reduction in new connections to that customer’s origins.

“To visualize this number more clearly, by switching to Pingora, we are saving our customers and users 434 years of handshake time every day.”

In production, Pingora consumes about 70% less CPU and 67% less memory than the old service under the same traffic load.

In addition to the performance benefits, Pingora is also considered to be more secure, thanks in large part to the use of Rust.

Pingora is not open source yet; Cloudflare says it is working on plans to release it, but for now the HTTP proxy is not publicly available.

More details can be found on the official blog.


What are the advantages and disadvantages of different load balancing methods?



What is load balancing?

In the early days of a website, a single machine is generally enough to provide the service centrally, but as business volume grows, performance and stability come under greater pressure. At that point, we naturally think of scaling out to provide better service.

We typically group multiple machines into a cluster that provides the service together. However, the website exposes only one entry point to the outside world, such as www.xxx.com. So when a user enters www.xxx.com in a browser, how does that user’s request get distributed to the different machines in the cluster? That is exactly what load balancing does.

Most Internet systems today use server cluster technology: the same service is deployed on multiple servers, which together form a cluster that serves the outside world. These clusters can be web application server clusters, database server clusters, distributed cache server clusters, and so on.

In practice, a load balancer always sits in front of the web server cluster. Its task is to act as the entrance for web traffic: it selects the most suitable web server and forwards the client’s request to it, relaying transparently between the client and the real server.

The “cloud computing” and distributed architectures popular in recent years essentially treat back-end servers as computing and storage resources, packaged by a management layer into a service. The client does not need to care which machine actually provides the service; from its perspective it is facing a single server with almost unlimited capacity, while in reality the back-end cluster is the real service provider.

Software load balancing solves two core problems: which server to choose, and how to forward the request to it. The classic implementation is LVS (Linux Virtual Server).

The topology of a typical Internet application looks like this:

(Figure: topology of a typical Internet application)

Load balancing classification

We now know that load balancing is a computer networking technique used to distribute load across multiple computers (a cluster), network connections, CPUs, disk drives, or other resources, in order to optimize resource usage, maximize throughput, minimize response time, and avoid overload. There are many ways to implement it.

It can be roughly divided into the following types, of which layer-4 and layer-7 load balancing are the most commonly used:

Layer-2 load balancing

The load balancer exposes a VIP (virtual IP) externally. The machines in the cluster share that same IP address but have different MAC addresses.

When the load balancer receives a request, it forwards it to the target machine by rewriting the packet’s destination MAC address, thereby achieving load balancing.

Layer-3 load balancing

Similar to layer-2 load balancing, the load balancer exposes a VIP (virtual IP) externally, but the machines in the cluster use different IP addresses.

When the load balancer receives a request, it forwards it by IP to one of the real servers according to the load balancing algorithm.

Layer-4 load balancing

Layer-4 load balancing works at the transport layer of the OSI model, where the TCP and UDP protocols live. In addition to source and destination IP addresses, these protocols also carry source and destination port numbers.

After receiving a client request, a layer-4 load balancer forwards the traffic to an application server by rewriting the packet’s address information (IP address and port number).

Layer-7 load balancing

Layer-7 load balancing works at the application layer of the OSI model. There are many application layer protocols, such as HTTP, RADIUS, and DNS, and layer-7 load balancing can operate on any of them.

These application layer protocols carry a lot of meaningful content. For example, when balancing load across web servers, in addition to the IP and port, the balancer can decide where to send a request based on the layer-7 URL, browser type, or language.

(Figure: layer-4 and layer-7 load balancing)

For ordinary applications, Nginx is enough; it can perform layer-7 load balancing. Large websites, however, generally adopt multi-level load balancing: DNS + layer-4 + layer-7.
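As an illustration, a minimal Nginx layer-7 load balancing configuration might look like the following sketch; the upstream name and backend addresses are placeholders, not recommendations for any particular deployment:

```nginx
# Sketch: proxy incoming HTTP traffic across three placeholder backends.
upstream backend {
    server 10.0.0.1:8080 weight=3;  # weighted round robin: receives 3x the traffic
    server 10.0.0.2:8080;
    server 10.0.0.3:8080 backup;    # used only when the other servers are down
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;  # forward each request to the upstream group
    }
}
```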

Common load balancing tools

Hardware load balancers offer superior performance and comprehensive features, but they are expensive, so they generally only suit companies that can afford the long-term investment.

Software load balancing is therefore widely used across the Internet industry. Nginx, LVS, and HAProxy are currently the three most widely used software load balancers.

1. LVS

LVS (Linux Virtual Server) is a free software project initiated by Dr. Zhang Wensong.

The goal of LVS is to use the load balancing technology it provides, together with the Linux operating system, to build a high-performance, highly available server cluster with good reliability, scalability, and manageability, achieving optimal service performance at low cost.

LVS is mainly used for layer-4 load balancing.

LVS architecture

A server cluster built with LVS consists of three parts: the front-end load balancing layer (Load Balancer), the middle server group layer (Server Array), and the bottom shared storage layer (Shared Storage). To users, the whole thing is transparent: they are simply using the high-performance service of what appears to be a single virtual server.

A closer look at each LVS layer:

Load Balancer layer: sits at the very front of the cluster and consists of one or more load schedulers (Director Servers) with the LVS modules installed. A Director works much like a router: it maintains the LVS routing tables and uses them to dispatch user requests to the application servers (Real Servers) in the Server Array layer. The Director Server also runs Ldirectord, a monitoring module that checks the health of each Real Server service: it removes a Real Server from the LVS routing table when the server becomes unavailable and re-adds it when it recovers.

Server Array layer: consists of the machines that actually run the application services. A Real Server can be one or more of a web server, mail server, FTP server, DNS server, or video server. The Real Servers are connected by a high-speed LAN or distributed across WANs in different locations. In practice, a Director Server can also double as a Real Server.

Shared Storage layer: a storage area that provides shared storage space and content consistency for all the Real Servers. Physically it generally consists of disk array devices. To keep content consistent, data can generally be shared through the NFS network file system, but NFS performs poorly on a busy system; in that case a cluster file system can be used instead, such as Red Hat’s GFS or Oracle’s OCFS2.

As the overall structure shows, the Director Server is the core of LVS. Currently a Director Server can only run Linux or FreeBSD: Linux 2.6 kernels support LVS out of the box without any patches, while FreeBSD is rarely used as a Director Server and does not perform as well. A Real Server, by contrast, can run on almost any platform: Linux, Windows, Solaris, AIX, and the BSD family are all well supported.

2. Nginx

Nginx (pronounced “engine x”) is a web server that can reverse proxy HTTP, HTTPS, SMTP, POP3, and IMAP connections, as well as act as a load balancer and an HTTP cache.

Nginx is mainly used for layer-7 load balancing.

Concurrency: officially, Nginx supports 50,000 concurrent connections; in practice around 20,000 is typical, and tuned deployments have reached 100,000. Actual performance depends on the application scenario.

Features:

  • Modular design: good extensibility; functionality can be extended through modules.
  • High reliability: based on the master/worker model; if one worker fails, another worker is started immediately.
  • Low memory consumption: 10,000 keep-alive connections consume only about 2.5 MB of memory.
  • Hot deployment: the configuration file, log files, and server binary can all be updated without stopping the server.
  • Strong concurrency: official figures cite support for 50,000 concurrent connections.
  • Rich functionality: excellent reverse proxying and flexible load balancing strategies.

The basic working mode of Nginx

 

A master process spawns one or more worker processes. The master is started as root because Nginx needs to listen on port 80, and only root may bind ports below 1024.

The master is mainly responsible for starting workers, loading the configuration file, and performing smooth upgrades; all other work is handed to the workers.

A worker itself handles only the simplest parts of serving a request; the rest is done by the modules the worker invokes. The modules implement their functionality in a pipeline: a single user request is handled by combining the functions of multiple modules in turn.

For example, a first module only parses the request headers, a second only finds the data, and a third only compresses the data; each completes its own task in sequence until the whole request is done.
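As a rough sketch of this pipeline idea (the stage names and types below are invented for illustration, not Nginx’s actual module API):

```rust
// A toy pipeline: each "module" handles one stage of a request.
struct Request {
    path: String,
    body: Vec<u8>,
}

fn parse(mut req: Request) -> Request {
    req.path = req.path.trim().to_string(); // stage 1: parse/normalize the request
    req
}

fn fetch(mut req: Request) -> Request {
    req.body = format!("contents of {}", req.path).into_bytes(); // stage 2: find the data
    req
}

fn compress(mut req: Request) -> Request {
    req.body.truncate(8); // stage 3: stand-in for real compression
    req
}

fn main() {
    let stages: &[fn(Request) -> Request] = &[parse, fetch, compress];
    let mut req = Request { path: " /index.html ".into(), body: Vec::new() };
    for stage in stages {
        req = stage(req); // each module completes its own task in turn
    }
    println!("{} -> {} bytes", req.path, req.body.len());
}
```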

How is hot deployment implemented? As mentioned, the master does not process requests itself; it calls workers to do the work and is responsible for reading the configuration file.

So when a module is modified or the configuration file changes, it is the master that reads the change; the workers’ work is unaffected at that moment. After reading the new configuration, the master does not immediately push it to the workers. Instead, the existing workers continue running with the old configuration, and when a worker finishes its current work, the old process exits and is replaced by a new worker that uses the new rules.

3. HAProxy

HAProxy is another widely used load balancer. It provides high availability, load balancing, and proxying for TCP- and HTTP-based applications, supports virtual hosts, and is a free, fast, and reliable solution, especially well suited to heavily loaded web sites.

Its operating modes make it easy and safe to integrate into an existing architecture while shielding your web servers from direct exposure to the network.

HAProxy is free and open source software written in C, and it is mainly used for layer-7 load balancing.

 

Common load balancing algorithms

As mentioned above, when the load balancer decides which real server to forward a request to, it does so by running a load balancing algorithm.

Load balancing algorithms fall into two categories: static and dynamic.

  • Static algorithms include: round robin, ratio, and priority.
  • Dynamic algorithms include: least connections, fastest response, observed mode, predictive mode, dynamic ratio, dynamic server replenishment, quality of service, type of service, and rule-based modes.

Round Robin:

Requests are distributed to the servers in sequence, cycling through the list. When one of the servers suffers a failure at layers 2 through 7, BIG-IP removes it from the rotation and does not include it in subsequent rounds until it returns to normal.

In practice, the servers are usually weighted. This has two advantages: load can be allocated according to differences in server performance, and a node can be drained simply by setting its weight to 0 (see the sketch after the list below).

  • Advantages: simple and efficient to implement; easy to scale horizontally
  • Disadvantages: a request may land on any node, which makes it unsuitable for write scenarios (caching, database writes)
  • Application scenario: read-only scenarios at the database or application service layer
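Here is a minimal sketch of weighted round robin in Rust, using the “smooth” variant similar to Nginx’s scheduler; the server names and weights are made up:

```rust
struct Server {
    name: &'static str,
    weight: i64,  // setting this to 0 effectively drains the node
    current: i64, // running score used by the smooth algorithm
}

fn pick(servers: &mut [Server]) -> &'static str {
    let total: i64 = servers.iter().map(|s| s.weight).sum();
    // Every server's score grows by its weight each round...
    for s in servers.iter_mut() {
        s.current += s.weight;
    }
    // ...the highest scorer wins and is penalized by the total weight,
    // which interleaves picks instead of bunching them together.
    let best = servers
        .iter_mut()
        .max_by_key(|s| s.current)
        .expect("non-empty server list");
    best.current -= total;
    best.name
}

fn main() {
    let mut servers = vec![
        Server { name: "a", weight: 5, current: 0 },
        Server { name: "b", weight: 1, current: 0 },
        Server { name: "c", weight: 1, current: 0 },
    ];
    // Over 7 picks, "a" is chosen 5 times and "b" and "c" once each.
    for _ in 0..7 {
        print!("{} ", pick(&mut servers));
    }
    println!();
}
```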

Random method:

Requests are distributed randomly across the nodes; with a large enough volume of requests, the distribution evens out.

  • Advantages: simple to implement and easy to scale horizontally
  • Disadvantages: like round robin, it cannot be used for write scenarios
  • Application scenario: database load balancing, again for read-only scenarios

Hash method:

The target node is computed by hashing the key, which guarantees that the same key always lands on the same server (a sketch follows the list below);

  • Advantages: the same key always lands on the same node, so this can be used for cache scenarios with both writes and reads
  • Disadvantages: after a node fails, the hash keys are redistributed, causing a sharp drop in hit rate
  • Solution: use consistent hashing, or use keepalived to keep every node highly available so that another node takes over after a failure
  • Application scenario: caches, with reads and writes
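A minimal sketch of hash-based selection in Rust, with placeholder server names (note that DefaultHasher is not guaranteed to be stable across Rust versions, so a real system would pin a specific hash function):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Map a key to a server index: the same key always hashes to the same server.
fn pick_server(key: &str, server_count: usize) -> usize {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    (h.finish() % server_count as u64) as usize
}

fn main() {
    let servers = ["cache-1", "cache-2", "cache-3"];
    for key in ["user:42", "user:43", "user:42"] {
        // "user:42" is routed to the same server both times.
        println!("{key} -> {}", servers[pick_server(key, servers.len())]);
    }
}
```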

Consistent hashing:

When a server node fails, only the keys on that node are affected, so the hit rate is largely preserved; the ketama scheme in twemproxy is one example. In production, sub-key hashing can also be planned so that keys with similar locality are placed on the same server (a ring sketch follows the list below);

  • Advantages: only a limited drop in hit rate after a node failure
  • Application scenario: caches
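Here is a minimal consistent-hash ring in Rust with virtual nodes, using made-up server names; production systems use schemes such as ketama, so this only shows the ring-lookup idea:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

fn hash_of<T: Hash>(t: &T) -> u64 {
    let mut h = DefaultHasher::new();
    t.hash(&mut h);
    h.finish()
}

struct Ring {
    ring: BTreeMap<u64, String>, // hash point -> server name
}

impl Ring {
    fn new(servers: &[&str], vnodes: u32) -> Self {
        let mut ring = BTreeMap::new();
        for s in servers {
            // Each server owns many points on the ring to even out the load.
            for v in 0..vnodes {
                ring.insert(hash_of(&format!("{s}#{v}")), s.to_string());
            }
        }
        Ring { ring }
    }

    fn pick(&self, key: &str) -> &str {
        let h = hash_of(&key);
        // Take the first ring point at or after the key's hash, wrapping
        // around; removing a server only remaps keys that sat on its points.
        self.ring
            .range(h..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, s)| s.as_str())
            .expect("ring is not empty")
    }
}

fn main() {
    let ring = Ring::new(&["a", "b", "c"], 100);
    println!("user:42 -> {}", ring.pick("user:42"));
}
```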

Load according to the range of keys:

The keys are partitioned by range: for example, the first 100 million keys are stored on the first server, keys from 100 million to 200 million on the second node, and so on.

  • Advantages: easy to scale horizontally; when storage runs out, just add a server for the subsequent new data
  • Disadvantages: uneven load; uneven database distribution
  • (Data has hot and cold parts; recently registered users are generally the most active, so the later servers become very busy while the earlier nodes sit mostly idle)
  • Applicable scenario: database sharding

Load based on the key modulo the number of server nodes:

For example, with 4 servers, keys whose value modulo 4 is 0 land on the first node, 1 on the second node, and so on.

  • Advantages: hot and cold data are spread evenly, so database node load is balanced
  • Disadvantages: difficult to scale horizontally (adding a node changes the modulus and remaps most keys)
  • Applicable scenario: database sharding

Pure dynamic node load balancing:

The next request is scheduled according to each node’s CPU, I/O, and network processing capacity.

  • Advantages: makes full use of server resources and keeps load balanced across nodes
  • Disadvantages: complicated to implement; rarely used in practice

No active load balancing needed:

Use a message queue to switch to an asynchronous model, which eliminates the load balancing problem altogether. Load balancing is a push model: something must always decide which node to send the data to. If instead all user requests are sent to a message queue and every downstream node pulls work whenever it is free, then under this pull model the problem of balancing load across downstream nodes disappears (see the sketch after the list below).

  • Advantages: the message queue buffers requests and protects the back-end system, so the back-end servers will not be overwhelmed when requests surge; horizontal scaling is easy, since a newly added node simply starts pulling from the queue
  • Disadvantages: not real-time
  • Application scenarios: workloads that do not need a real-time response; for example, after you place an order on 12306 it immediately returns a prompt such as “your order has been queued…”, and then notifies you asynchronously once processing is complete
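A minimal sketch of the pull model in Rust, using a shared channel as the queue; the worker count and timings are arbitrary:

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    // The channel plays the role of the message queue.
    let (tx, rx) = mpsc::channel::<u32>();
    let rx = Arc::new(Mutex::new(rx));

    let workers: Vec<_> = (0..3)
        .map(|id| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // A free worker pulls the next job itself; the lock is held
                // only while dequeuing, not while doing the work.
                let job = rx.lock().unwrap().recv();
                match job {
                    Ok(job) => {
                        println!("worker {id} handles job {job}");
                        thread::sleep(Duration::from_millis(10)); // simulate work
                    }
                    Err(_) => break, // queue closed and drained
                }
            })
        })
        .collect();

    // The "upstream" just pushes requests into the queue; nothing has to
    // decide which worker gets which job.
    for job in 0..9 {
        tx.send(job).unwrap();
    }
    drop(tx); // close the queue so the workers can exit

    for w in workers {
        w.join().unwrap();
    }
}
```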

Ratio:

Each server is assigned a weight, and user requests are allocated across the servers according to these ratios. When one of the servers suffers a failure at layers 2 through 7, BIG-IP removes it from the queue and does not assign it user requests until it returns to normal.

Priority:

All servers are grouped and each group is assigned a priority. BIG-IP assigns user requests to the highest-priority server group (within a group, requests are distributed by the round robin or ratio algorithm). Only when all servers in the highest-priority group fail does BIG-IP send requests to the second-priority group. This effectively provides users with a hot-standby arrangement.

Least Connections:

New connections are passed to the server currently handling the fewest connections. When one of the servers suffers a failure at layers 2 through 7, BIG-IP removes it from the queue and does not assign it user requests until it returns to normal.

Fastest mode (Fastest):

Connections are passed to the server that responds fastest. When one of the servers suffers a failure at layers 2 through 7, BIG-IP removes it from the queue, and it does not receive user requests until it returns to normal.

Observed mode (Observed):

A server is selected for each new request based on the best balance between its connection count and its response time. When one of the servers suffers a failure at layers 2 through 7, BIG-IP removes it from the queue, and it does not receive user requests until it returns to normal.

Predictive mode (Predictive): BIG-IP analyzes the servers’ current performance metrics to predict which server’s performance will be best in the next time slice, and directs user requests to that server. (The detection is performed by BIG-IP.)

Dynamic Ratio (Dynamic Ratio-APM):

BIG-IP collects various performance parameters of applications and application servers, and dynamically adjusts traffic allocation.

Dynamic Server Replenishment (Dynamic Server Act.):

When the number of servers in the primary server group decreases due to failures, backup servers are dynamically added to the primary group.
