Friday, August 21, 2009

Business Server Pages: Using Proxies

This troubleshooting guide describes all the different problems observed with the use of proxies.

WARNING IN RED: Unfortunately, this is a very complex topic! It is recommended at all times to first fetch a cup of coffee and secondly to close the office door. Then read this text slowly, word for word. Read it twice! Experience has shown that a number of GoLive! dates could easily have been kept by reading this one topic carefully. This warning is in the spirit of a helping hand. Let us learn from past experience!


What Proxies are Supported?



When the term proxy is used within this text, we usually and silently refer to a SAP Web Dispatcher. However, in principle, any form of proxy can be used, as long as it obeys the constraints that we require from a proxy. Most of these constraints are also listed below as part of the troubleshooting guide.


Documentation



Most aspects of the generation of URLs in the context of a proxy are described in URL Generation in an AS-ABAP - Web Dispatcher Configuration. This page is written and continuously updated from OSS support work. As such, you will find large amounts of duplicate information. The documentation itself is written in the form of A to Z, covering the topic systematically. This page is written more in the sense of a troubleshooting guide, containing questions to help troubleshoot a problem to a final solution. It effectively highlights the documentation again. What to read? Both! Twice! (Don't forget the coffee and to close the office door.)

For general information about SAP Web Dispatcher, see the corresponding documentation. Just to stress this point one final time: A number of complex OSS trouble tickets could have been completely avoided by one careful reading session! Make the time this once, it is really worth it!

URL Generation Concepts



Although far off-topic for the immediate discussion of proxies, the basic concepts must be understood.

A URL for a page is written in the form of http(s)://server.domain.ext:port/URL. After the HTML for the page is loaded, all other URLs on the page itself are server-relative. They are either of the form /URL2 or just URL3 or ../URL3. The URLs written within the HTML code will (usually!) never contain a protocol (http(s)://) part, and as such do not contain any hard-coded server references.

The first important aspect to understand is that each and every request from the browser is always server-absolute. This is a must in the world of HTTP! What it means is that each and every incoming HTTP request is triggered by a URL of the form http(s)://server.domain.ext:port/URL, that is, with protocol, server, port and server-relative URL. (For the interested reader: if the port has the default value of 80 for HTTP or 443 for HTTPS, it is usually not listed.)

So how does the browser build up these correct URLs? The trick is that the browser will always take the missing parts of information from the URL of the page that is currently displayed. In the case of the absolute path /URL2 shown above, the browser will take the currently active protocol and server information, and generate a new server-absolute URL: http(s)://server.domain.ext:port/URL2.

In the case of a relative URL, the browser will again use the information from the currently loaded page (including its path) to construct a server-absolute URL, for example http(s)://server.domain.ext:port/URL3 when the current page lies directly below the root. What we notice in this game is that, as long as URLs in HTML pages never contain a protocol (http(s)://) or any form of server information (server.domain.ext:port), URL generation will always work correctly, independent of the actual protocol in use. This is why most code will work correctly in the context of a proxy.
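The resolution rules described above can be tried out with Python's standard urllib.parse.urljoin, which follows the same rules a browser applies. This is only an illustrative sketch: the page URL and the link names are made-up examples, not taken from any real system.

```python
from urllib.parse import urljoin

# URL of the page currently displayed in the browser (hypothetical example).
current_page = "https://server.domain.ext:1443/app/pages/start.htm"

# The three link styles mentioned in the text.
links = ["/URL2", "URL3", "../URL3"]


def resolve_all(base, hrefs):
    """Resolve each href against the current page, as a browser would."""
    return {href: urljoin(base, href) for href in hrefs}


resolved = resolve_all(current_page, links)
for href, absolute in resolved.items():
    print(f"{href:10} -> {absolute}")
```

Note that protocol, server name and port are always copied from the current page. None of the links had to know them, which is exactly why such links survive a proxy.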

So why do we even have problems with proxies? (And why all this complex text?) There are typically two (plus one special one) cases where server-absolute URLs must be generated. And these usually fail if not carefully done.



Case 1: A protocol switch is required. A typical example would be a logon application or a website that needs to switch from HTTP to HTTPS to transfer critical information for a short period of time. In this case, a completely new URL of the form https://server.domain.ext:port_secure/URL/secure/pages is required.

Case 2: The use of Java applets on pages that are run within an <iframe>. Typically, a BSP application that uses a Java applet and runs in a portal will fall into this case.

Case 3: A special case is when the workbench (or any SAP GUI transaction) wishes to generate a test URL to start a BSP application.

Scenarios



As a first step to understand the complexities of the problem space, let us look at some proxy scenarios.

A proxy is just a machine standing somewhere in the Internet, with one leg in one network, and the other leg in another network. In simplistic terms, a proxy will be placed at the point where a corporate network is connected to the Internet, so that browsers in the Internet can talk to the proxy, which then forwards all requests to the server for answering.



In the first scenario (Simple Proxy Configuration), we have a relatively simple configuration where the ports used on the proxy match exactly those on the server. So if we wish to make a protocol switch to HTTPS, we can take the name that the browser uses to talk to us (see section on Host: header for the gory details), and then just look at our own port number for HTTPS.

In the second scenario (More Complex Proxy Configuration), this naive approach already fails. Here we have a port mapping active. The browser talks to the proxy on the default ports (80 for HTTP and 443 for HTTPS). However, the proxy forwards the requests to the server on different ports. If a protocol switch to HTTPS is now required, the server does not know the port number to use! It can only extract the server name from the Host: header, but has no immediate clue as to what ports are active on the proxy. Here configuration data already becomes critical.

In the third scenario (Really Complex Proxy Configuration), we see in addition a protocol switch. On the Internet, HTTPS is used to encrypt all data and ensure a safe transfer to your corporate network. However, once inside the controlled network, normal HTTP is used from the proxy to the server. The reason for this is pure performance: HTTP transfers are considerably faster than HTTPS. Typical problem: on the server, the logon application (or webshop) sees that we are using HTTP (it cannot see the proxy). So it generates a new HTTPS URL. This second URL hits the proxy and is again transferred with HTTP to the server. Hmmm... unsafe, so again a HTTPS URL is generated. This causes a redirect loop that continues until the browser refuses to follow one more redirect. So, in this case we need to recognize that the browser is already using HTTPS, and has a safe communication channel, even though on the server we can only see HTTP traffic.

The fourth scenario (Double Complex Proxy Configuration) is theoretically possible, especially in a hosted environment, although we seldom see it. In cases such as these, when generating a new server-absolute URL, we need to know what name the browser is using to talk to the proxy, and have to know which protocols/ports are allowed for each of these names.

Host Header



When a HTTP request is sent to the server, the full URL is written in the browser, including the protocol, the host name, port, and path. However, within the HTTP protocol a slightly different format is sent. Let us assume that this URL is entered in the browser: http://server.domain.ext:1080/URL. Then the browser will issue the following HTTP request to the server:

GET /URL HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Accept-Language: de,en-us;q=0.5
Host: server.domain.ext:1080
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)


The request line (first line) contains the HTTP verb, the URL to fetch (/URL) and the protocol version. A Host: header is set containing the full name of the server that the browser sees as its communication partner. If the URL points to a non-standard port (HTTP uses by default port 80 and HTTPS port 443), then the port will be included in the Host: header.
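How a URL is split into request line and Host: header can be sketched in a few lines of Python. The helper names host_header and request_line are purely illustrative; the sketch only mirrors the rule stated above (default ports are left out of the Host: header) and is not how any particular browser is implemented.

```python
from urllib.parse import urlsplit

# Default ports per protocol, as stated in the text.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_header(url):
    """Value of the Host: header a browser would send for this URL."""
    parts = urlsplit(url)
    # The port is only included when it is not the default for the protocol.
    if parts.port is None or parts.port == DEFAULT_PORTS.get(parts.scheme):
        return parts.hostname
    return f"{parts.hostname}:{parts.port}"


def request_line(url, verb="GET"):
    """First line of the HTTP request: verb, server-relative URL, version."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return f"{verb} {path} HTTP/1.1"


print(request_line("http://server.domain.ext:1080/URL"))
print("Host:", host_header("http://server.domain.ext:1080/URL"))
```

The protocol itself (http or https) appears in neither the request line nor the Host: header, which is why the server cannot see which protocol the browser is using towards a proxy.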

The Host: header reflects the name that the browser uses to reach the server. It does not have to be the actual name of the server. It is just a name that can be mapped onto an IP address to find the server, or be resolved onto the server via a proxy. It is even possible that different names can be used to address the same server. Important: the Host: header is the name (and port) that the browser thinks it is talking to (the server data), and which will always be used again for all subsequent HTTP requests.

For a small example, let us assume that on one WebAS, two different companies are hosted in two clients. Then one Web Dispatcher will have two names in the Internet, say www.funAreUs.info and www.seriousBusiness.com. The Web Dispatcher forwards all HTTP requests to the server. The server must now use the name under which it was addressed, stored in the Host: header, (that has nothing to do with its own name!) for generating URLs and setting domain based cookies.

Rule 1: The Host: header must be preserved by the proxy. The proxy must not in any way change the Host: header.

The Web Dispatcher will always preserve the Host: header, and works correctly on this aspect. The Apache proxy in version 1.x cannot preserve the Host: header, and therefore cannot be used together with the WebAS. Only from Apache version 2.0 onwards is it possible to configure Apache so that the Host: header is forwarded unchanged. For this, set the configuration option ProxyPreserveHost on.

Protocol Switch



The third scenario with a protocol switch from HTTPS to HTTP is the most used option. This provides secure data communications over the Internet to the Web Dispatcher and also places the processor intensive work of handling the encryption on the Web Dispatcher, and not on the WebAS box itself. On the final leg between the Web Dispatcher and the WebAS machine, normal HTTP is used. Although the most used scenario, it is also the most complex scenario.

The typical problem situation that we have is that of a logon application, or a web shop, that wishes to switch into HTTPS mode at some stage to securely transport data. First a check is made on the WebAS to see if the application is already using HTTPS. Here the answer is no, as the incoming request on the WebAS is using HTTP. So the application generates a new HTTPS URL, and orders the browser to redirect there. The next incoming request from the browser is HTTPS to the Web Dispatcher, and then HTTP to the WebAS. The application checks to see if it is HTTPS, finds that HTTP is in use, and then again generates a HTTPS URL. The situation now reached in the browser is a continuous redirect loop. Each HTTPS request fired to the Web Dispatcher is answered with a redirect back onto the same HTTPS URL! Anything looking like a continuous redirect should cause the next paragraphs to be read carefully!

What is required is a flag that indicates to the WebAS that out there, the browser is already talking HTTPS to our representative, the proxy. This flag is a special HTTP header field, called ClientProtocol, that is set by the Web Dispatcher. When this HTTP header field has the value https, it indicates that a protocol switch was done. The software on the server should then treat the connection as already in HTTPS mode, even though only HTTP is used into the WebAS.

GET /URL HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Accept-Language: de,en-us;q=0.5
Host: server.domain.ext:1080
ClientProtocol: https
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)


Rule 2: The ClientProtocol HTTP header must be set.
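The redirect loop, and how the ClientProtocol header breaks it, can be simulated with a small Python sketch. This is not the WebAS implementation. The function names are hypothetical, and it is assumed that incoming header names are available in lower case in a dictionary.

```python
def effective_protocol(server_protocol, headers):
    """Protocol the browser is using, as far as the server can tell."""
    # Assumption: header names have been normalized to lower case.
    client = headers.get("clientprotocol", "").strip().lower()
    return client if client in ("http", "https") else server_protocol


def count_redirects(proxy_sets_header, max_redirects=10):
    """Simulate scenario three: browser -> HTTPS -> proxy -> HTTP -> server."""
    redirects = 0
    browser_protocol = "http"  # the very first request arrives over plain HTTP
    while redirects < max_redirects:
        headers = {"host": "server.domain.ext"}
        if proxy_sets_header:
            headers["clientprotocol"] = browser_protocol
        # The proxy always talks plain HTTP to the server.
        if effective_protocol("http", headers) == "https":
            return redirects  # the application is satisfied and serves the page
        browser_protocol = "https"  # the server answers with a redirect to HTTPS
        redirects += 1
    return redirects  # the browser gives up: redirect loop


print("with ClientProtocol:   ", count_redirects(True), "redirect(s)")
print("without ClientProtocol:", count_redirects(False), "redirect(s)")
```

With the header set, exactly one redirect (the intended switch to HTTPS) takes place. Without it, the simulation runs into the redirect limit, just as a browser would.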

Web Dispatcher Access Points



Starting 620>=SP57, 640>=SP16 and 700>=SP06. In addition, kernel patch 640>=100, 700>=34.

The biggest problem, not yet addressed, is how to know the port numbers of the proxy. If we look at scenario two, we can see that an absolute URL must use port numbers that match those of the proxy, and not those of the server. However, this information is not immediately available.

The best idea would be if the proxy could just inform us, with each incoming HTTP request, of all port numbers that it has active. The Web Dispatcher has now been extended to set a new access point HTTP header (x-sap-webdisp-ap). The value of this header lists the ports that the Web Dispatcher itself has active, and which protocol (HTTP or HTTPS) is configured on each port. Once this information is available in the incoming HTTP request, the server can itself generate completely new server-absolute URLs that use the proxy port numbers without any further configuration required.

This configuration setting on the Web Dispatcher can be activated with: wdisp/handle_webdisp_ap_header = 1.

GET /URL HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Accept-Language: de,en-us;q=0.5
Host: server.domain.ext:80
ClientProtocol: https
X-SAP-WebDisp-AP: http=80,https=443
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)


Interesting aspect: if the Web Dispatcher has more than one identity into the Internet, it will forward to the WebAS only those ports that match the identity (Host: HTTP header!) with which it was addressed. With these two HTTP header fields together, a lot of work can already be completed without the need for any further configuration.

Rule 3: The X-SAP-WebDisp-AP HTTP header must be set (starting at appropriate service pack level).
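A minimal sketch of how a server could combine the Host: header and the access point header to build a server-absolute URL follows. It only assumes the header format shown in the example request above (protocol=port pairs separated by commas); the function names are hypothetical, header names are assumed to be lower case, and IPv6 host literals are not handled.

```python
DEFAULT_PORTS = {"http": 80, "https": 443}


def parse_access_points(value):
    """Parse 'http=80,https=443' into {'http': 80, 'https': 443}."""
    points = {}
    for item in value.split(","):
        protocol, sep, port = item.strip().partition("=")
        if sep and port.isdigit():
            points[protocol.lower()] = int(port)
    return points


def absolute_url(headers, protocol, path):
    """Build a server-absolute URL that is valid from the browser's side."""
    name = headers["host"].split(":")[0]  # name the browser used, without port
    ports = parse_access_points(headers.get("x-sap-webdisp-ap", ""))
    if protocol not in ports:
        raise LookupError(f"proxy announced no {protocol} access point")
    port = ports[protocol]
    # Default ports are left out of the URL, as browsers do.
    suffix = "" if port == DEFAULT_PORTS[protocol] else f":{port}"
    return f"{protocol}://{name}{suffix}{path}"


incoming = {"host": "server.domain.ext:80", "x-sap-webdisp-ap": "http=80,https=443"}
print(absolute_url(incoming, "https", "/URL/secure/pages"))
```

The server's own port numbers never appear in the result. Only the name from the Host: header and the ports announced by the proxy are used.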

HTTPURLLOC Table



Starting 620>=SP43, 640>=SP05 and 700>=SP00.

Theoretically, once the proxy can attach its access points onto each incoming HTTP request, we have sufficient information to generate new server-absolute URLs. However, there are two cases where this still fails: (1) with the use of older service packs and (2) with the first/startup URL.

For old service packs, before the support of access points by the Web Dispatcher (and also the matching changes in the WebAS), the HTTPURLLOC table is available. This is basically a table that lists all the names and ports available on the proxy. The names are those by which browsers will address the proxy; the ports are those on which the proxy listens. Please read the documentation referenced in the first section for the exact syntax of how the table is constructed. Simply stated, just list all the access points in the table.

For the second case, the HTTPURLLOC table entries are still required, even if access points are active. Let us assume a SAPGUI program needs to start a BSP application; in the worst case it could be for testing an application from transaction SE80. Now, no incoming HTTP request is available, and thus no information is available about the proxy out there. If such a scenario is to be supported, then the relevant information must be configured.

MANDT  SORT_KEY  PROTOCOL  APPL  HOST                          PORT
100    010       HTTP      *     WWW.YOURCOMPANY.COM           80
100    011       HTTPS     *     WWW.YOURCOMPANY.COM           443
100    100       HTTP      *     WEBAS.SERVERS.MYCOMPANY.CORP  1080
100    101       HTTPS     *     WEBAS.SERVERS.MYCOMPANY.CORP  1443
200    021       HTTPS     *     WWW.MYCOMPANY.COM             443
200    100       HTTP      *     WEBAS.SERVERS.MYCOMPANY.CORP  1080
200    101       HTTPS     *     WEBAS.SERVERS.MYCOMPANY.CORP  1443
300    031       HTTPS     *     WWW.OURCOMPANY.COM            443
300    100       HTTP      *     WEBAS.SERVERS.MYCOMPANY.CORP  1080
300    101       HTTPS     *     WEBAS.SERVERS.MYCOMPANY.CORP  1443

The above table takes the last (fourth) scenario as its example. Effectively, this table lists all the access points with which the WebAS can be addressed, via a proxy or directly. The important aspect is that in the case of a protocol switch, the name is used to decide which entries to read, and from these the correct port for the requested protocol can be determined. The internal names are also listed for those people that access the WebAS directly, without the proxy.

The sort sequence is of importance when a start URL is to be generated. In this case, the first entry in sort sequence is used to generate the URL. So if start URLs should always be generated to go via the proxy, place these entries first. For example, for www.yourCompany.com, the HTTP entry is first in sort sequence, causing HTTP URLs to be generated by default.

Rule 4: The HTTPURLLOC table must be configured if no Web Dispatcher Access Points are used, or in all cases if the start URL must be generated from the WebAS system.
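The two ways in which the table is read (by name for a protocol switch, by sort sequence for a start URL) can be sketched as follows. This is an illustration only, using the client 100 rows from the table above. The APPL column is left out for brevity, and the function names are hypothetical; the real lookup inside the WebAS may differ in detail.

```python
from typing import NamedTuple


class UrlLoc(NamedTuple):
    mandt: str
    sort_key: str
    protocol: str
    host: str
    port: int


# Client 100 rows copied from the example table (APPL column omitted).
TABLE = [
    UrlLoc("100", "010", "HTTP", "WWW.YOURCOMPANY.COM", 80),
    UrlLoc("100", "011", "HTTPS", "WWW.YOURCOMPANY.COM", 443),
    UrlLoc("100", "100", "HTTP", "WEBAS.SERVERS.MYCOMPANY.CORP", 1080),
    UrlLoc("100", "101", "HTTPS", "WEBAS.SERVERS.MYCOMPANY.CORP", 1443),
]


def port_for_switch(table, client, host, protocol):
    """Protocol switch: the host name selects the entries, the protocol the port."""
    wanted = (client, host.upper(), protocol.upper())
    for row in sorted(table, key=lambda r: r.sort_key):
        if (row.mandt, row.host, row.protocol) == wanted:
            return row.port
    return None  # no matching entry configured


def start_entry(table, client):
    """Start URL: the first entry in sort sequence for the client wins."""
    rows = sorted((r for r in table if r.mandt == client), key=lambda r: r.sort_key)
    return rows[0] if rows else None


print(port_for_switch(TABLE, "100", "www.yourcompany.com", "https"))
print(start_entry(TABLE, "100"))
```

Because the entry with sort key 010 comes first, a start URL for client 100 would be generated for HTTP on www.yourcompany.com, which matches the description above.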

HTTPURLLOC in Client/Mandant 000



Recommended only SP43<=620<SP57, SP05<=640<SP16 and 700<SP06.

Usually the HTTPURLLOC table is only configured for the one specific client where the information is required. However, there is one additional case where this is not sufficient. When a special logon application is used, for example in 620 the BSP SYSTEM application or from 640 the ICF SYSTEM logon, then the first part of the logon runs without any user or client information. This information only becomes available after a successful logon. Up to that point, all requests are processed in client 000! Thus, when any of these logon applications are used, and no access point information is available, the HTTPURLLOC table must also have entries in client 000 for the switch onto HTTPS to work.

MANDT  SORT_KEY  PROTOCOL  APPL  HOST                 PORT
000    901       HTTP      *     WWW.YOURCOMPANY.COM  80
000    902       HTTPS     *     WWW.YOURCOMPANY.COM  443
000    902       HTTPS     *     WWW.MYCOMPANY.COM    443
000    902       HTTPS     *     WWW.OURCOMPANY.COM   443

Rule 5: The HTTPURLLOC table must also be configured with client/mandant 000 entries.

Testing the Proxy Configuration



WORDS OF CAUTION: If you should decide all of this text is fluff, then read at least this one section. It just might save your bacon!

For 620<SP57, 640<SP16 and 700<SP06



The problem is that it is very difficult to quickly look into the proxy configuration. As the proxy updates all HTTP requests that it processes with special headers, one approach could be to make an ICM trace at level 3 on the WebAS, and look at these results. However, this is difficult and time consuming on a WebAS that is used by other HTTP applications. The simplest approach is to fire a HTTP request through the proxy at the WebAS, and have it echo back the complete header. This way we can see exactly what changes the proxy made to the HTTP request. For all tests, we use the BSP application IT00, page misc_echo.htm. The complete URL will be:

http(s)://server.domain.ext:port/sap/bc/bsp/sap/it00/misc_echo.htm

Run the URL and look at the corresponding output for the three different tests.



1. Confirm that the Host: header matches exactly, including the port number, to the string that was entered as URL in the browser. This header must be preserved 1:1.

2. Check that the ClientProtocol HTTP header is set. Even if a HTTPS to HTTP protocol switch is not done, we can only very strongly recommend to set this HTTP header.

3. Check that the Access Point HTTP header is set. Important: this is only available with new systems (>=620SP57, >=640SP16 or >=700SP06).

4. In addition, use transaction SE16 and confirm that the HTTPURLLOC table is configured correctly both for the client NNN and for client 000.
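The three header checks can also be written down as a small checklist function. This sketch does not call the echo page itself. It assumes that the echoed request headers have already been copied into a dictionary with lower-case names, and the function name check_echo is hypothetical.

```python
def check_echo(typed_host, browser_protocol, echoed, expect_access_points=True):
    """Return a list of problems found in the headers echoed by the server.

    typed_host       -- host (and port) exactly as entered in the browser URL
    browser_protocol -- 'http' or 'https', as used between browser and proxy
    echoed           -- echoed request headers, lower-case names (assumption)
    """
    problems = []
    # Test 1: the Host: header must be preserved 1:1.
    if echoed.get("host") != typed_host:
        problems.append("Host header was changed by the proxy")
    # Test 2: ClientProtocol must be set and match the browser's protocol.
    if echoed.get("clientprotocol", "").lower() != browser_protocol:
        problems.append("ClientProtocol header missing or wrong")
    # Test 3: access points, only available on new systems.
    if expect_access_points and "x-sap-webdisp-ap" not in echoed:
        problems.append("X-SAP-WebDisp-AP header missing")
    return problems


echoed_headers = {
    "host": "server.domain.ext:1080",
    "clientprotocol": "https",
    "x-sap-webdisp-ap": "http=80,https=443",
}
print(check_echo("server.domain.ext:1080", "https", echoed_headers) or "all checks passed")
```

On older systems without access point support, call the function with expect_access_points=False and check the HTTPURLLOC table by hand instead.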

Starting 620>=SP57, 640>=SP16 and 700>=SP06



To make life easier, a new BSP test application SYSTEM_TEST is shipped. In this application is also the page test_proxy.htm to automate the tests. The complete URL will be:

http(s)://server.domain.ext:port/sap/bc/bsp/sap/system_test/test_proxy.htm

The program will test above points semi-automagically and give the following feedback:



What this proxy test page does is test for the first three rules (Host header preservation, ClientProtocol HTTP header, and X-SAP-WebDisp-AP (Access Point) HTTP header). If no Access Point HTTP header is found, then the HTTPURLLOC table is also checked against the two HTTPURLLOC rules defined above (Rule 4 and Rule 5).

SAP Web Dispatcher Configuration



SAP Web Dispatcher

The Web Dispatcher will always preserve the Host header, and no further configuration is required for this aspect.

For HTTPS to HTTP protocol switching, the Web Dispatcher must be configured to also set the ClientProtocol HTTP header. This is done with the option in profile:

wdisp/add_clientprotocol_header = 1

It is recommended to also activate Access Points, as this is the best and simplest way to get a complete and consistent configuration (starting 620>=SP57, 640>=SP16 and 700>=SP06). This is achieved with the following profile option:

wdisp/handle_webdisp_ap_header = 1

Apache Configuration



Apache version 1.x is NOT supported, as it cannot preserve the Host header. Only Apache from version 2.0 onwards can be used. For Host header preservation, the configuration option ProxyPreserveHost on must be activated!

In the case of a HTTPS to HTTP protocol switch, the Apache proxy must be configured to set the ClientProtocol HTTP header with the RequestHeader option.

Here is a small extract from an Apache configuration:


# HTTP access point (port 80), forwarded to the WebAS over HTTP
ProxyPass / http://us4049.wdf.sap.corp:1080/
ProxyPreserveHost on
RequestHeader set ClientProtocol http
RequestHeader set x-sap-webdisp-ap HTTP=80,HTTPS=443
AllowCONNECT 80


# HTTPS access point (port 443), forwarded to the WebAS over HTTPS
ProxyPass / https://us4049.wdf.sap.corp:1443/
ProxyPreserveHost on
RequestHeader set ClientProtocol https
RequestHeader set x-sap-webdisp-ap HTTP=80,HTTPS=443
AllowCONNECT 443
SSLProxyEngine on
SSLEngine on
SSLCertificateFile conf/ssl/test-cert.crt
SSLCertificateKeyFile conf/ssl/test-cert.key
