What is a CDN?
A Content Delivery Network (CDN) is a group of geographically distributed servers that cache files close to users for faster downloads in domestic and international markets. Large CDNs operate hundreds of data center locations near the world’s most populated regions and use load balancing to route each request to the closest available server.
CDNs are used in web hosting to cache HTML, CSS, JavaScript, images, video, and other files for download to reduce strain on a central server. When the DNS settings for a domain are changed to point to a CDN, requests for the site are distributed across multiple data centers internationally instead of all arriving at the main web server, and load balancing spreads the traffic to improve performance.
CDNs offer integrated security features for web publishers that include Distributed Denial-of-Service (DDoS) protection, Web Application Firewalls (WAFs), and SSL/TLS certificate management. On high-traffic websites and mobile applications, caching reduces the number of requests that reach the main server, while compression and minification reduce the total file size of downloads for faster browsing of content by users. CDNs also support streaming media, gaming, and ecommerce applications.
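As a rough illustration of how compression shrinks a text asset before transfer, the following Python sketch gzips a repetitive stylesheet with the standard library. The sample CSS is made up for the example, and real CDNs apply their own compression settings (often gzip or Brotli) at the edge.

```python
import gzip

# Hypothetical stylesheet: repetitive text, like most CSS, compresses well.
css = ".card { margin: 0 auto; padding: 16px; color: #333333; }\n" * 200
raw = css.encode("utf-8")

# Compress the way a server might before sending "Content-Encoding: gzip".
compressed = gzip.compress(raw)

# The client decompresses and gets back exactly the original bytes.
restored = gzip.decompress(compressed)

print(len(raw), "bytes before,", len(compressed), "bytes after")
```

Compression is lossless, so the browser always receives the original file; only the number of bytes on the wire changes.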
What is the difference between CDN and hosting?
In web hosting, an HTTP(S) server must support database and programming language processing to store data, retrieve information, and publish dynamic content. In contrast, a Content Delivery Network (CDN) only stores the cached files for a website or mobile application. When content is updated on the main server, the affected cache entries are cleared (purged) so that the CDN repopulates them with fresh copies.
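The clear-and-repopulate cycle can be sketched with a toy in-memory cache. The class name `EdgeCache`, its methods, and the 300-second lifetime are all assumptions made for this illustration; they do not correspond to any particular CDN's API.

```python
import time

class EdgeCache:
    """Toy in-memory cache with per-entry expiry and explicit purge."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self.entries = {}           # path -> (body, stored_at)

    def put(self, path, body):
        self.entries[path] = (body, self.clock())

    def get(self, path):
        """Return the cached body, or None if missing or expired."""
        item = self.entries.get(path)
        if item is None:
            return None
        body, stored_at = item
        if self.clock() - stored_at >= self.ttl:
            del self.entries[path]  # expired: force a refetch from the main server
            return None
        return body

    def purge(self, path):
        """Drop one entry, e.g. after the page was edited on the main server."""
        self.entries.pop(path, None)


cache = EdgeCache(ttl_seconds=300)
cache.put("/pricing", "<html>old prices</html>")
cache.purge("/pricing")            # content changed on the main server
print(cache.get("/pricing"))       # nothing cached, so the next request repopulates it
```

Expiry handles content that changes on a schedule, while an explicit purge covers edits that must become visible immediately.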
A CDN is most efficient for serving static pages to anonymous users at scale. For example, a WordPress website running on shared, VPS, or dedicated server hosting can cache web content through a CDN service to distribute pages globally for faster download speeds. The main web server is still used for logged-in user pages, which cannot be cached efficiently on a CDN because their content differs for each user.
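One common way for the main server to tell a CDN what it may store is the HTTP `Cache-Control` response header. The sketch below shows an illustrative policy only: the function name, the file extensions, and the lifetimes are assumptions, not recommendations from any specific provider.

```python
def cache_control_for(path, logged_in):
    """Pick a Cache-Control header value for a response (illustrative policy)."""
    if logged_in:
        # Personalized pages must not be stored by shared caches such as a CDN.
        return "private, no-store"
    if path.endswith((".css", ".js", ".png", ".jpg", ".woff2")):
        # Static assets: let shared caches keep them for a day.
        return "public, max-age=86400"
    # Anonymous HTML: cache briefly so edits show up within minutes.
    return "public, max-age=300"


for path, logged_in in [("/account", True), ("/theme/site.css", False), ("/about", False)]:
    print(path, "->", cache_control_for(path, logged_in))
```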
A web server will also support email, FTP, SSH access, etc., which are not part of a CDN’s core functionality. An edge server network is a hybrid solution that combines the distributed global network features of a CDN with a web server’s ability to process code. Cloudflare’s Railgun product, for example, was designed to speed up delivery of dynamic content that cannot be fully cached.
An example of a CDN
Some of the most popular CDN solutions in the current marketplace are Cloudflare, Akamai, EdgeCast, Fastly, Limelight, AWS CloudFront, Azure, Google Cloud, OnApp, and GoDaddy. Cloudflare offers a free plan, operates hundreds of data centers internationally, and integrates with cPanel hosting accounts. Other CDNs are built into public cloud platforms.
Akamai, EdgeCast, and Limelight are primarily used by high-traffic enterprise websites and mobile applications to support cloud software in production. These services offer streaming video optimization that will scale to support the requirements of live events online like the Olympics or the Super Bowl. The use of geolocation across multiple data centers improves the ability of streaming media websites to support millions of simultaneous users for live events.
Connecting a web/mobile application to a CDN from AWS, Google Cloud, or Azure allows developers to serve dynamic content as cached flat files that are compressed and minified for the fastest download speeds in production. Cloudflare offers a free plan and is more commonly used in shared hosting environments. Web publishers can improve the performance of WordPress, Joomla, Drupal, and other CMS websites by integrating with a CDN for the global distribution of files.
How does a CDN work?
A CDN works through reverse-proxy load balancing, which operates in front of the web server and routes web traffic requests to cached pages across data centers internationally. By analyzing network conditions in real time, the load balancer can direct requests to the data center that is closest to the geo-location of the user. The combination of server proximity and web page caching allows the content to be downloaded in less than 1 second.
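A minimal sketch of the "closest data center" decision, assuming the user's coordinates are already known and using great-circle distance as a stand-in for network proximity. The PoP list is hypothetical, and production CDNs generally rely on techniques such as anycast routing and live latency measurements rather than geography alone.

```python
import math

# Hypothetical PoP list: (name, latitude, longitude) in degrees.
POPS = [
    ("new-york", 40.71, -74.01),
    ("frankfurt", 50.11, 8.68),
    ("tokyo", 35.68, 139.69),
    ("sydney", -33.87, 151.21),
]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def nearest_pop(lat, lon, pops=POPS):
    """Return the name of the PoP with the smallest distance to the user."""
    return min(pops, key=lambda pop: haversine_km(lat, lon, pop[1], pop[2]))[0]

print(nearest_pop(48.86, 2.35))    # a user in Paris
print(nearest_pop(34.69, 135.50))  # a user in Osaka
```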
A CDN can often deliver faster download speeds in production than even a dedicated web server can manage. A CDN does not need to access the database or run code for a website or mobile application in order to serve cached pages to the user. The load balancer can determine whether to route requests to cached pages or to the main web server based on the availability of files.
Anonymous requests are routed to the cached files in the data center that is closest to the user by geo-location. If the user requires personalized data and displays, the requests are routed to the root server for processing. The root server can operate on shared hosting, VPS plans, or a dedicated server. Developers can also integrate CDN functionality with elastic cloud and multi-cluster hardware. The key is the CDN load balancer which manages global web traffic.
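The routing rule described here can be sketched as a small decision function. The cookie prefixes used to detect a logged-in user are assumptions for the example (WordPress, for instance, sets cookies beginning with `wordpress_logged_in_`), and real CDNs expose this kind of logic through their own rule engines.

```python
# Assumed cookie-name prefixes that mark a request as personalized.
SESSION_PREFIXES = ("wordpress_logged_in_", "sessionid")

def cookie_names(cookie_header):
    """Extract cookie names from a raw Cookie header value."""
    names = set()
    for part in cookie_header.split(";"):
        name, _, _ = part.strip().partition("=")
        if name:
            names.add(name)
    return names

def route(method, path, cookie_header, cached_paths):
    """Return "edge" when a cached copy may be served, else "root"."""
    if method not in ("GET", "HEAD"):
        return "root"    # form posts and other writes always reach the root server
    if any(name.startswith(SESSION_PREFIXES) for name in cookie_names(cookie_header)):
        return "root"    # personalized request, needs server-side processing
    return "edge" if path in cached_paths else "root"


cached = {"/blog", "/about"}
print(route("GET", "/blog", "theme=dark", cached))
print(route("GET", "/blog", "wordpress_logged_in_ab12=alice; theme=dark", cached))
```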
CDN building blocks (PoPs, caching, SSD/HDD + RAM)
The main building blocks of a Content Delivery Network (CDN) are web servers located in multiple international data centers known as Points of Presence (PoPs). The largest CDNs support hundreds of different PoPs in a unified network with load balancing. Web content is stored on hard-disk drives (HDD), solid-state drives (SSD), or RAM depending on the service.
Solid-state drives (SSD) outperform traditional hard-disk drive (HDD) technology, with I/O request processing that can be as much as 20x faster on the same web server depending on the workload. Serving cached data from system RAM is the fastest solution. Reverse-proxy load balancing can support network resources with HDD and SSD storage according to project requirements, optimizing transfers with RAM caching for the most popular requests. Many CDNs use machine learning (ML) to improve load balancing.
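Keeping the most popular requests in RAM while everything else stays on disk is commonly done with a least-recently-used (LRU) policy. The sketch below is one simple way to model it; the `RamTier` class is hypothetical and a plain dictionary stands in for SSD or HDD storage.

```python
from collections import OrderedDict

class RamTier:
    """Small LRU cache standing in for RAM, in front of a slower store."""

    def __init__(self, capacity, slow_store):
        self.capacity = capacity
        self.slow_store = slow_store      # dict standing in for SSD/HDD
        self.hot = OrderedDict()          # most recently used entries at the end
        self.ram_hits = 0
        self.disk_reads = 0

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)     # mark as recently used
            self.ram_hits += 1
            return self.hot[key]
        value = self.slow_store[key]      # slower path: read from "disk"
        self.disk_reads += 1
        self.hot[key] = value
        if len(self.hot) > self.capacity:
            self.hot.popitem(last=False)  # evict the least recently used entry
        return value


disk = {"/logo.png": b"png-bytes", "/app.js": b"js-bytes", "/old.css": b"css-bytes"}
tier = RamTier(capacity=2, slow_store=disk)
for path in ["/logo.png", "/app.js", "/logo.png", "/old.css", "/logo.png"]:
    tier.get(path)
print("RAM hits:", tier.ram_hits, "disk reads:", tier.disk_reads)
```

The frequently requested logo stays in the fast tier, while the file that has gone longest without a request is the one evicted when space runs out.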
The more Points of Presence (PoPs) a CDN has available, the better its response times will usually be. This is due to greater geographic coverage of the most populated urban centers globally. Smaller CDNs may just include single data center locations in North America, Europe, and Asia. Each PoP represents a server location in a unique data center that is integrated with global network load balancing. In this way, users in New York or Tokyo will receive local server access.
What is a reverse proxy?
A reverse proxy server is used to manage load balancing, website caching, and security on a Content Delivery Network (CDN). A traditional (forward) proxy server sits in front of clients and protects them by implementing the unified security policies required by an organization. A reverse proxy server sits in front of servers and protects the server infrastructure from malicious requests.
A reverse proxy server acts as an intermediary between the requests of internet users and the CDN infrastructure. The reverse proxy server can be configured to recognize the geolocation, device type, and language of the user in order to route the request to the specific files or data center hardware that will deliver the information with the best performance and the correct display. Load balancing is implemented to manage PoPs across CDN infrastructure.
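Serving the correct display per device and language usually means storing a separate cached variant for each combination. A minimal sketch, assuming a deliberately crude device check and a fixed list of supported languages (both invented for the example, and ignoring the `q` weights that a full `Accept-Language` parser would honor):

```python
def device_class(user_agent):
    """Very rough device detection (illustrative only)."""
    return "mobile" if "Mobi" in user_agent else "desktop"

def primary_language(accept_language, supported=("en", "de", "ja"), default="en"):
    """Pick the first supported language listed in an Accept-Language header."""
    for item in accept_language.split(","):
        tag = item.split(";")[0].strip().lower()   # drop any ";q=0.8" weight
        base = tag.split("-")[0]                   # "en-US" -> "en"
        if base in supported:
            return base
    return default

def cache_key(path, user_agent, accept_language):
    """Build a key so each device/language variant is cached separately."""
    return f"{path}|{device_class(user_agent)}|{primary_language(accept_language)}"


print(cache_key("/home", "Mozilla/5.0 (iPhone) Mobile/15E148", "de-DE,de;q=0.9,en;q=0.8"))
print(cache_key("/home", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)", "fr-FR,fr;q=0.9"))
```

Keeping the number of variants small matters, because every extra dimension in the key divides the cache and lowers the hit rate.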
A reverse proxy server will identify which requests are sent from anonymous users of a website or mobile application and route their requests to cached pages for swift fulfillment through regional data centers. Requests for pages or information that are not cached are sent to the root server. In this manner, a CDN can manage hundreds of PoPs internationally for a single cloud application, replicating data for regional users to reduce response times to milliseconds.
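The hit-or-forward behaviour described above can be sketched as a pull-through cache: the first request for a page travels to the root server, and later requests are answered from the stored copy. The class and the `fake_root_server` function are hypothetical stand-ins for illustration.

```python
class PullThroughCache:
    """Serve from the local store when possible, else fetch and remember."""

    def __init__(self, fetch_from_root):
        self.fetch_from_root = fetch_from_root   # callable: path -> body
        self.store = {}
        self.root_requests = 0

    def get(self, path):
        if path in self.store:
            return self.store[path], "HIT"
        body = self.fetch_from_root(path)        # cache miss: ask the root server
        self.root_requests += 1
        self.store[path] = body
        return body, "MISS"


def fake_root_server(path):
    """Stand-in for the real web server rendering a page."""
    return f"<html>rendered {path}</html>"

edge = PullThroughCache(fake_root_server)
print(edge.get("/about")[1])   # first request travels to the root server
print(edge.get("/about")[1])   # repeat request is answered from the cache
```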
What are the benefits of using a CDN?
Research published by Google has found that as page load time increases from 1 second to 5 seconds, the probability of a visitor bouncing increases by 90%. This is important for ecommerce publishers who need to convert browsers into customers. Google also uses page speed as one of the signals in its search rankings, so a website that loads slowly is at a disadvantage in search listings.
CDNs are the best way to achieve page load times that are less than 1 second in production. This is accomplished through the compression and minification of HTML, CSS, JavaScript, and other files required for a page to load. When a web page is reduced to a flat file on a CDN, it no longer requires running the database or code on the root server and can be replicated with geo-location across hundreds of instances globally. This outperforms even dedicated servers.
For businesses operating in streaming media to large audiences, a CDN provides the infrastructure to scale to millions of simultaneous users. The same technology supports the websites and mobile applications with the highest rates of traffic in production. The cost of a CDN service is often lower than that of a standard web server because the CDN only needs to serve flat files. A web server can deliver cached and compressed files much faster than it can build pages from database queries.
How does a CDN improve website speed & performance?
The main principle of Content Delivery Network (CDN) optimization is geo-proximity. The internet operates on the basis of high-speed fiber optic cables between data centers, with backbone links that commonly carry 100 Gbit/s or more. CDNs locate their servers at or near the junction points of these cables, such as internet exchange points, in order to serve web pages with the lowest possible latency.
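A back-of-the-envelope calculation shows why proximity matters regardless of link capacity. Light travels through optical fiber at roughly 200,000 km/s (about two-thirds of its speed in a vacuum), so distance alone sets a floor on round-trip time. The distances below are illustrative, and real paths add routing detours, queuing, and processing delays on top.

```python
FIBER_KM_PER_MS = 200.0   # approx. 200,000 km/s expressed in km per millisecond

def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time from propagation delay alone."""
    return 2 * distance_km / FIBER_KM_PER_MS

for label, km in [("same-metro PoP", 50), ("cross-continent", 4000), ("intercontinental", 10000)]:
    print(f"{label}: at least {min_round_trip_ms(km):.1f} ms per round trip")
```

A page load involves many round trips (connection setup, TLS, then each resource), so the per-trip difference between a nearby PoP and a distant server is multiplied several times over.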
Geo-proximity is also the reason for establishing multiple PoPs in international data centers on CDN infrastructure. Rather than serving all network requests from a single hardware unit, CDNs multiply data across regional servers for faster download speeds. This allows East Coast, West Coast, European, Asian, African, and Australian users to connect to a regional data center instead of a central server and reduces the distance data needs to travel to the end user.
By combining geo-location with website caching and file size minification for HTML, CSS, JavaScript, images, video streams, etc., the total size of the response is reduced along with the distance the data needs to travel. In this manner, the reverse proxy load balancing facilities of the CDN are able to serve the data to the user by region in the shortest time the hardware allows. This improves website and mobile application performance significantly overall.
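To make minification concrete, here is a deliberately naive CSS minifier that strips comments and unneeded whitespace. It is a sketch only: it does not understand quoted strings or every selector form, so production sites should use an established minifier rather than these regular expressions.

```python
import re

def minify_css(css):
    """Naive CSS minifier: strips comments and unneeded whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)   # remove comments
    css = re.sub(r"\s+", " ", css)                         # collapse whitespace runs
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)           # trim around punctuation
    return css.strip()


source = """
/* card layout */
.card {
    margin: 0 auto;
    color: #333;
}
"""
small = minify_css(source)
print(small)
print(len(source), "characters before,", len(small), "after")
```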
How does a CDN keep a website always online?
A CDN will usually support anywhere from 5 to several hundred different PoPs in data center locations globally. The existence of cached files for your website or mobile application across all of the PoP servers in the network eliminates a single point of failure for hardware. Even if your root web server fails, the CDN can keep serving its cached copies of your pages until the server is back online.
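The "keep serving while the root server is down" behaviour can be sketched as a refresh that falls back to the last good copy on failure. The function and exception names are assumptions for this example; HTTP caches express a similar idea through the `stale-if-error` cache directive.

```python
class RootServerDown(Exception):
    """Raised by the fetch function when the root server cannot be reached."""

def refresh_or_serve_stale(path, cache, fetch_from_root):
    """Try to refresh a page; fall back to the last cached copy on failure."""
    try:
        body = fetch_from_root(path)
    except RootServerDown:
        if path in cache:
            return cache[path], "STALE"   # root server is down, reuse the old copy
        raise                             # nothing cached, so the error is passed on
    cache[path] = body
    return body, "FRESH"


def healthy_root(path):
    return f"<html>page for {path}</html>"

def failed_root(path):
    raise RootServerDown(path)

cache = {}
print(refresh_or_serve_stale("/", cache, healthy_root)[1])
print(refresh_or_serve_stale("/", cache, failed_root)[1])
```

Only pages that were cached before the failure can be served this way, which is why anonymous, cacheable traffic benefits most.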
A single web server running a properly configured CMS can also serve cached pages to anonymous users. However, if the hardware fails or if web traffic scales to take the unit offline, there are no inherent backup or failover mechanisms in place. A CDN not only reduces the strain on the root server through distributed caching, but also preserves the data in production in case of hardware failure. To accomplish this, you must point the DNS settings for your domain to the CDN.
A CDN will also protect a web server against DDoS attacks by inspecting incoming requests and identifying harmful activity based on IP address or patterns of behavior. Once an attack is detected, the offending IP addresses are quarantined and blacklisted by the firewall protection of the CDN. DDoS attacks are mitigated by isolating the malicious activity and preventing the requests from affecting production hardware. In this manner, a CDN can keep a website always online.
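One simple building block for spotting abusive patterns of behavior is a per-IP rate limit. The sketch below uses a sliding time window; the thresholds and the class name are assumptions, and real DDoS mitigation combines many more signals than request counts.

```python
from collections import defaultdict, deque

class RateLimiter:
    """Allow at most max_requests per IP within a sliding time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.history = defaultdict(deque)   # ip -> timestamps of accepted requests

    def allow(self, ip, now):
        recent = self.history[ip]
        while recent and now - recent[0] >= self.window:
            recent.popleft()                # forget requests outside the window
        if len(recent) >= self.max_requests:
            return False                    # over the limit: reject or challenge
        recent.append(now)
        return True


limiter = RateLimiter(max_requests=3, window_seconds=10)
ip = "203.0.113.9"                          # address from a documentation range
print([limiter.allow(ip, t) for t in (0, 1, 2, 3, 10.5)])
```

Requests from other addresses are counted separately, so one noisy client does not affect legitimate visitors.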
How does a CDN protect data?
Most CDN providers have incorporated the use of machine learning (ML) on network traffic to identify common patterns of usage for firewall protection. CDNs are designed to recognize malicious activity as part of the load balancing capabilities of the reverse proxy server. Known sources of spam and attacks can be blacklisted by individual IP address or by the address ranges of an entire network or ISP.
A CDN provider is one of the most effective ways to mitigate DDoS attacks for websites and mobile applications in active production. The CDN service prevents malicious attacks from reaching the root server. Script bots commonly target login and contact forms on a website for SQL injection attacks. CDN hosting with a reverse proxy server will inspect requests in advance to recognize the activity of script bots and prevent them from targeting pages or forms.
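As a rough idea of what inspecting a request in advance can mean, the sketch below checks a form value against a few signatures typical of SQL injection attempts. The patterns are a tiny, hypothetical sample chosen for the example; production firewalls rely on maintained rule sets, and the application itself should still use parameterized queries.

```python
import re
from urllib.parse import unquote_plus

# A few illustrative signatures; nowhere near a complete rule set.
SUSPICIOUS = [
    re.compile(r"\bunion\b\s+\bselect\b", re.IGNORECASE),
    re.compile(r"\bor\b\s+1\s*=\s*1\b", re.IGNORECASE),
    re.compile(r";\s*drop\s+table\b", re.IGNORECASE),
]

def looks_like_sql_injection(raw_value):
    """Return True if a URL-encoded form value matches a known signature."""
    value = unquote_plus(raw_value)   # decode %xx escapes and "+" as space
    return any(pattern.search(value) for pattern in SUSPICIOUS)


for sample in ["alice", "1+UNION+SELECT+password+FROM+users", "x'%3B%20DROP%20TABLE%20users"]:
    print(repr(sample), "->", looks_like_sql_injection(sample))
```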
A CDN is distributed across multiple data centers with a failover design that will protect the root web server in case of traffic surges that would otherwise take the system offline. Some CDN service providers also offer SSL/TLS certificate management to encrypt data connections and maintain cloud security. Web Application Firewalls (WAFs) protect running software from attacks by filtering HTTP requests at the application layer before they reach the application.
How does a CDN reduce bandwidth costs?
A traditional CDN cache server does not need to run any programming language, script, or database functionality. By limiting the CDN service to cached files that are compressed and minified, the hardware can function with greater efficiency and I/O speeds. Because the servers are located in multiple data centers internationally, most traffic is delivered from a location near the client, which reduces the bandwidth the root server has to supply.
The use of ML with a CDN load balancer allows the system to calculate the distance from the user to the server with greater accuracy for better optimization. By removing the main web traffic load from the root server, web publishers can often pay less for bandwidth through a CDN provider than through a web hosting company. A CDN outage is rare in comparison to the crash of a single server in production. Removing the traffic strain from a server leads to better uptime.
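The effect of removing traffic from the root server comes down to simple arithmetic on the cache hit ratio. The monthly volume and hit ratios below are hypothetical figures chosen only to show the calculation; actual savings depend on each provider's pricing.

```python
def root_server_egress_gb(total_gb, cache_hit_ratio):
    """Traffic the root server still serves when the CDN absorbs the cache hits."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_gb * (1.0 - cache_hit_ratio)


monthly_total_gb = 10_000            # hypothetical site-wide traffic per month
for ratio in (0.0, 0.5, 0.75):
    remaining = root_server_egress_gb(monthly_total_gb, ratio)
    print(f"hit ratio {ratio:.0%}: root server sends {remaining:,.0f} GB")
```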
How to choose a CDN provider?
The choice of the best CDN provider depends on the budget of the project, the experience of the development team, the total amount of web traffic to be supported, and the platform that is used for cloud hosting. Cloudflare has become one of the world’s most widely used CDN providers, in part by offering free services to shared hosting users on cPanel. Other services like Akamai, EdgeCast, and Limelight focus on the requirements of enterprise ecommerce and streaming media. AWS, Google Cloud, and Azure CDN services are integrated with cloud platforms.
Websites with the highest levels of traffic to support can consider utilizing CDN services from telco providers. One of the most important deciding factors between CDN providers is the number of PoPs or data center locations that are available internationally. However, the scale of a Content Delivery Network (CDN) must be matched with performance. The load balancing facilities, reverse proxy server, and hardware configuration of a CDN provider must all be fast enough to deliver cached web pages that load in less than 1 second.
Enterprise brands, streaming media services, and independent ecommerce websites each have different requirements for a CDN platform, although the core functionality remains the same. The cost-to-performance ratio drives CDN adoption to fulfill the requirements of web publishing, ecommerce, social networking, and mobile application support. Look for CDN services with the most experience in providing infrastructure for world-class events with streaming media support.
Choose cost-effective CDN services for budget projects that take the strain off of web servers without breaking the bank in production. Favor load balancing with the largest number of data centers across an international network, and look for innovation in CDN service providers.
While it is difficult to benchmark CDN providers across platforms, by clearly knowing the development requirements that need to be supported in production, IT managers can make the best decision on vendors. Trust your development team and choose the CDN platform that supports the code of your applications. Balance the total cost of services with the optimized performance that is delivered by the platform to choose the best CDN hosting for your projects.