BigPipe study and research
April 26, 2011
Imagine a site you visit frequently that takes 6 seconds to open each page, while another site offering a similar service responds in only 3 seconds. Which would you choose? Data indicate that when users open a website and still see no response after 3 to 4 seconds, they become irritable and anxious, complain, or even close the page and never return. Page loading speed is therefore very important, especially for a site as large as Facebook, the world's largest social networking service, with 500 million users worldwide. Given its large number of concurrent requests and massive data volumes, speed is inevitably one of the challenges it must overcome.
At the beginning of 2010, the Facebook team started a front-end performance optimization project. After six months of effort, they cut the load time of the personal home page from 5 seconds to 2.5 seconds. This is a great achievement and gives users a much better experience. During the project, the engineers proposed a new page loading technique called BigPipe. Taobao currently faces problems very similar to Facebook's: massive data and very large pages. If the detail pages and list pages used BigPipe, or if BigPipe were integrated into webx, page loading speed would improve significantly.
2.1 The importance of web front-end optimization
According to the book "High Performance Web Sites", only 10% to 20% of the end-user response time is spent getting the HTML document from the web server to the browser. To reduce the response time of a page effectively, you must pay attention to the remaining 80% to 90%. For comparison, if the back-end business logic is made 50% more efficient, the final page response time drops by only 5% to 10%, because the back end accounts for such a small share. If the front end is made 50% more efficient, the final page response time drops by 40% to 45%. That is a significant figure. In addition, front-end performance is generally easier to optimize than business logic. Front-end optimization therefore requires little investment, pays off quickly, and is highly cost-effective, so it deserves more attention.
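The arithmetic behind these figures can be checked with a short sketch. The function name `overallReduction` and the use of whole percentage points are illustrative assumptions, not something taken from the book.

```javascript
// Overall reduction in page response time (in percent) when a part of the
// process that accounts for `sharePercent` of the total time is made
// `improvementPercent` faster. The name and signature are illustrative.
function overallReduction(sharePercent, improvementPercent) {
  return (sharePercent * improvementPercent) / 100;
}

// Back end: 10% to 20% of the response time, optimized by 50%.
const backEnd = [overallReduction(10, 50), overallReduction(20, 50)];
// Front end: 80% to 90% of the response time, optimized by 50%.
const frontEnd = [overallReduction(80, 50), overallReduction(90, 50)];

console.log(backEnd);  // [ 5, 10 ]
console.log(frontEnd); // [ 40, 45 ]
```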
2.2 BigPipe and AJAX
An important feature of Web 2.0 is that pages show a lot of dynamic content; that is, Web 2.0 focuses on interaction between the page and the user. Its core technology is AJAX, which all major sites now use to some extent. Like AJAX, BigPipe delivers the page in pieces, so that the page can be output step by step, one part of the content at a time. The following discusses the differences between BigPipe and AJAX.
Simply put, BigPipe has three advantages over AJAX:
1. The core of AJAX is XMLHttpRequest: the client sends an asynchronous request to the server and then adds the returned content to the page dynamically. The shortcoming of this approach is that the round trip of the request takes time. With BigPipe, the browser does not need to send an XMLHttpRequest, which saves that time.
2. With AJAX, the browser and the server work in sequence. The server must wait for the browser's request, which leaves the server idle. While the browser works, the server waits; while the server works, the browser waits. This wastes performance. With BigPipe, the browser and the server work in parallel: the server does not wait for a browser request but keeps generating page content, which greatly improves efficiency.
3. The browser sends fewer requests. On a site with 500 million users, avoiding the extra requests that AJAX brings reduces the load on the server and yields a large performance gain.
For these three reasons, Facebook adopted BigPipe in its page optimization. Taobao's main search results page currently has to load categories, related searches, the item list, advertisements, and so on. The front end uses PHP curl to fetch data from the engines concurrently in batches and outputs the page step by step. This pattern differs somewhat from BigPipe, as discussed later. In general, the larger the page and the more numerous and complex its style sheets and scripts, the more suitable BigPipe is for optimizing page output. Another very important point is that BigPipe does not change the browser or the network protocol. It can be implemented with JavaScript alone, and users need no special settings to see a significant reduction in access time.
Next, we discuss the existing bottlenecks. As pages grow larger, and in particular as more CSS and JS files have to be loaded, the traditional page loading model can hardly keep up. The direct result is slow page loads, which nobody wants to see. With current technology, the complete page load process after the user requests a page is as follows:
1. The user visits the page, and the browser sends an HTTP request to the web server.
2. The server parses the request, fetches data from the storage layer, generates the HTML content, and sends it to the client in an HTTP response.
3. The HTTP response travels over the network.
4. The browser parses the response, builds the DOM tree, and downloads the required CSS and JS files.
5. Once the CSS files are downloaded, the browser parses them and applies them to the corresponding content.
6. Once the JS files are downloaded, the browser parses and executes them.
Figure 1.
Figure 1 shows the complete process. The left side of the figure represents the server and the right side the browser. The browser first sends a request; the server then looks up the data, generates the page, and returns the HTML code; finally the browser renders the page. This model has an obvious flaw: the operations run in a strict order. An operation cannot start until the previous one has finished, so operations cannot overlap. This creates a performance bottleneck: while the server generates the page, the browser is idle and displays a blank page; while the browser renders the page, the server is idle. Both time and performance are wasted.
Figure 2.
Consider Figure 2, which shows the existing service model, with the horizontal axis representing elapsed time. Yellow represents the time the server spends generating page content, white the network transmission time, and blue the time the browser spends rendering the page. As can be seen, the existing model wastes a great deal of time. Now consider Figure 3, where green represents the time the server spends fetching data from the storage layer. With huge amounts of data, a query can take a long time (as shown at the lower right of the figure). The server is blocked there and does nothing else, and the browser gets no feedback. This makes for a very unfriendly user experience: users wait a long time without knowing why.
Figure 3.
To address these problems, let us look at the BigPipe solution. BigPipe introduces the concept of blocks: according to where content appears on the page, the entire page is divided into pieces called pagelets. The designer of the technique, Changhao Jiang, holds a PhD and studied electronic circuits, and may have been inspired by pipelining in microprocessors. Like items on an assembly line, multiple pagelets pass through different stages of execution on the server and in the browser at the same time, so the server's run time and the browser's run time overlap and the two work in parallel. BigPipe not only reduces load time; because pagelets are output step by step, part of the page appears sooner, which gives a better user experience. With BigPipe, the complete page load process after the user requests a page is as follows:
1. Request parsing: the server parses and checks the HTTP request.
2. Data fetching: the server fetches data from the storage layer.
3. Markup generation: the server generates the HTML markup.
4. Network transport: the response is transferred from the server to the browser.
5. CSS downloading: the browser downloads the required CSS.
6. DOM tree construction and CSS styling: the browser builds the DOM tree and applies the CSS rules.
7. JavaScript downloading: the browser downloads the JS files referenced by the page.
8. JavaScript execution: the browser executes the page's JS code.
These eight stages are almost the same as in the existing model. The difference is that each pagelet goes through the whole process on its own, and different pagelets can be at different stages at the same time, as on an assembly line.
Figure 4
Figure 4 shows how BigPipe improves on the original model. The browser sends a request, and the server then returns the content of the different pagelets step by step; the specific implementation is described later. Consider Figure 5, which shows the improvement. BigPipe breaks the original sequence by dividing the page into different pagelets. The execution times of all the pagelets still add up to the original time. However, because different pagelets overlap at different stages of execution, the total running time is greatly reduced. This is the secret of how BigPipe reduces page load time.
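The effect of overlapping stages can be illustrated with a small timing model. This is only a sketch under simplifying assumptions: each pagelet has one server-side cost and one browser-side cost, network time is ignored, and the numbers and function names are made up for illustration.

```javascript
// Each pagelet has a server-side cost (fetch data, generate markup) and a
// browser-side cost (download CSS, render). Units are arbitrary and the
// numbers are invented for illustration.
const pagelets = [
  { name: "a", server: 2, browser: 1 },
  { name: "b", server: 2, browser: 1 },
  { name: "c", server: 2, browser: 1 },
];

// Traditional model: the browser starts only after the server has finished
// the whole page, so the two phases never overlap.
function sequentialTime(pagelets) {
  const server = pagelets.reduce((sum, p) => sum + p.server, 0);
  const browser = pagelets.reduce((sum, p) => sum + p.browser, 0);
  return server + browser;
}

// Pipelined model: the server works through the pagelets one by one, and the
// browser handles a pagelet as soon as it has arrived and the browser is free.
function pipelinedTime(pagelets) {
  let serverDone = 0;
  let browserDone = 0;
  for (const p of pagelets) {
    serverDone += p.server;
    browserDone = Math.max(browserDone, serverDone) + p.browser;
  }
  return browserDone;
}

console.log(sequentialTime(pagelets)); // 9
console.log(pipelinedTime(pagelets));  // 7
```

The total amount of work is the same in both cases (9 units); only the overlap shortens the elapsed time. In the pipelined case the first pagelet is also fully rendered at time 3, well before the page as a whole is finished.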
A Facebook page is divided into many different pagelets, as shown in the figure:
Figure 5
5 BigPipe implementation principle
Now that we understand the core idea of BigPipe, we discuss how it is implemented. When the browser accesses the server, the server accepts the request and checks it. If the request is valid, the server does not run any queries but immediately returns an HTTP response to the browser whose content is HTML code containing a template for the page.
The template uses CSS and div tags to describe the structure of the page. Different div tags correspond to different pagelets, and the id of each div is the name of its pagelet. After this response has been returned to the browser, the server starts to query, load, and generate the content of each pagelet. As soon as the content of a pagelet is ready, the server immediately calls the flush() function to return it to the client. The data is transmitted in JSON format and includes the CSS and JS that the pagelet needs, its HTML content, and some metadata. For example:
big_pipe.onPageletArrive(
{id: "pagelet_composer",
content: "",
css: "[..]",
js: "[..]",
…}
);
Here, "content" is the content of the pagelet, that is, its HTML source; special characters such as quotation marks and slashes need to be escaped. "id" indicates where the content should appear: it is the id of the pagelet's corresponding tag. "css" gives the paths of the CSS resources that need to be downloaded, and "js" gives the paths of the JS scripts that need to be downloaded. To keep file paths from getting too long, the CSS and JS file paths are converted beforehand into 5-character strings. Different pagelets may load the same CSS or JS file, so duplicate downloads must be avoided.
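As a rough illustration of how a server might build such a chunk, the sketch below serializes one pagelet and escapes its markup. It is not Facebook's implementation: `shortId`, `renderPageletChunk`, the counter-based 5-character ids, and the file paths are all assumptions made for the example; only `big_pipe.onPageletArrive` and the field names come from the snippet above.

```javascript
// Maps a resource path to a short id, so that pagelets sharing a file refer
// to the same id. The 5-character, base-36 counter scheme is an assumption.
const resourceIds = new Map();
function shortId(path) {
  if (!resourceIds.has(path)) {
    const id = resourceIds.size.toString(36).padStart(5, "0");
    resourceIds.set(path, id);
  }
  return resourceIds.get(path);
}

// Builds the chunk the server would flush for one pagelet (hypothetical helper).
function renderPageletChunk(pagelet) {
  const payload = {
    id: pagelet.id,
    content: pagelet.html,
    css: pagelet.css.map(shortId),
    js: pagelet.js.map(shortId),
  };
  // JSON.stringify escapes quotes and backslashes; "</" is escaped as well so
  // that markup inside the string cannot close the surrounding <script> tag.
  const json = JSON.stringify(payload).replace(/<\//g, "<\\/");
  return `<script>big_pipe.onPageletArrive(${json});</script>\n`;
}

const chunk = renderPageletChunk({
  id: "pagelet_composer",
  html: '<div class="composer">Hello</div>',
  css: ["/css/composer.css", "/css/common.css"],
  js: ["/js/composer.js"],
});
console.log(chunk);
```

Because the ids come from a shared map, a second pagelet that lists "/css/common.css" receives the same id, which is what lets the client recognize a file it has already downloaded.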
Although each pagelet has JS files to load, all JS files are loaded last, which helps speed up page loading. On the client, when the "onPageletArrive(json)" function is called, the script first parses the JSON data passed in, then downloads the required CSS, and then displays the HTML content in the corresponding div tag. The CSS files of several pagelets can be downloaded at the same time; a pagelet is displayed once its CSS download is complete.
In BigPipe, JS is given a lower priority than CSS and content. BigPipe therefore starts downloading JS files only after all pagelets have been displayed. Once all JS files have been downloaded, the JS initialization code of the pagelets starts to execute, in the order in which the downloads completed. In this highly parallel system, several pagelets can be at different stages of execution at the same time. For example, the browser can be downloading the CSS resources of two pagelets and rendering the content of another, while the server is still generating the HTML source of yet another. From the user's point of view, the page renders progressively. The initial page content appears sooner, which effectively shortens the delay that users perceive.
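The ordering rules described above (CSS first, then content, JS only after every pagelet is displayed) can be sketched as follows. This is a simplified model, not the real BigPipe client: resource loading and DOM insertion are injected as functions so the scheduling can run outside a browser, the number of pagelets is assumed to be known in advance, and every name except `onPageletArrive` is hypothetical. Execution of JS initialization code is left out.

```javascript
// Creates a tiny scheduler. `loadCss(file, done)`, `loadJs(file)` and
// `render(id, html)` are supplied by the caller (assumed interfaces).
function createBigPipe({ loadCss, loadJs, render, totalPagelets }) {
  const cssState = new Map(); // file -> { done: boolean, waiters: [] }
  const deferredJs = [];      // JS files held back until all pagelets are shown
  let displayed = 0;

  // Requests each CSS file at most once and calls back when it is available.
  function whenCssReady(file, callback) {
    let state = cssState.get(file);
    if (!state) {
      state = { done: false, waiters: [] };
      cssState.set(file, state);
      loadCss(file, () => {
        state.done = true;
        state.waiters.forEach((waiter) => waiter());
        state.waiters = [];
      });
    }
    if (state.done) callback();
    else state.waiters.push(callback);
  }

  function onPageletArrive(pagelet) {
    let pending = pagelet.css.length;
    const show = () => {
      // Display the pagelet once all of its CSS is available.
      render(pagelet.id, pagelet.content);
      deferredJs.push(...pagelet.js);
      displayed += 1;
      // Only after the last pagelet is displayed does JS loading start.
      if (displayed === totalPagelets) {
        [...new Set(deferredJs)].forEach((file) => loadJs(file));
      }
    };
    if (pending === 0) {
      show();
      return;
    }
    pagelet.css.forEach((file) =>
      whenCssReady(file, () => {
        pending -= 1;
        if (pending === 0) show();
      })
    );
  }

  return { onPageletArrive };
}

// Usage with fake loaders that record what happens and finish immediately.
const log = [];
const pipe = createBigPipe({
  totalPagelets: 2,
  loadCss: (file, done) => { log.push(`css ${file}`); done(); },
  loadJs: (file) => log.push(`js ${file}`),
  render: (id) => log.push(`render ${id}`),
});
pipe.onPageletArrive({ id: "a", content: "<p>A</p>", css: ["x.css"], js: ["a.js"] });
pipe.onPageletArrive({ id: "b", content: "<p>B</p>", css: ["x.css"], js: ["b.js"] });
console.log(log);
// [ 'css x.css', 'render a', 'render b', 'js a.js', 'js b.js' ]
```

In a browser, `loadCss` and `loadJs` would typically append link and script elements and `render` would set the content of the div whose id matches the pagelet; those parts are deliberately omitted here.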
6 Discussion of BigPipe implementation
6.1 Server-side parallelism
Ideally, the server processes the content of different pagelets in parallel, which improves performance: the server handles several pagelets concurrently, and as soon as the content of one pagelet is ready, it is flushed to the browser. PHP, however, does not support threads, so the server cannot use multiple threads to load the content of several pagelets concurrently. For small sites, loading the pagelet content serially is already enough to optimize the request. Large sites that need more speed can have the server process the content of different pagelets concurrently, in the following ways:
1. Java multithreading. If the back-end logic is written in Java, Java's multithreading mechanism can load the content of different pagelets simultaneously and return each pagelet to the browser once its content is loaded. The online references at the end include an example that uses Java multithreading.
2. PHP. PHP does not support threads, so it cannot process different pagelets concurrently with multiple threads the way Java can. However, Facebook and Taobao's main search both implement their business logic in PHP, so we must consider how to achieve concurrent processing in PHP. PHP has a curl extension module, and with it a curl_multi_fetch() function can process requests in batches, so that requests that would otherwise run serially are made concurrently. It can be written as follows:
Posted: January 3rd, 2012
at 3:14am by admin
