Downloading a file from an HTTP server

Victor Sharov (vic@softspb.com), October 26, 2001.

Introduction

Wireless connectivity for the Pocket PC is becoming more and more popular, and programs often need to read a web resource. This article describes how to download a file from an HTTP server using both eVC++ and eVB. It also describes how to use HTTP basic authorization.

eVC++

Step by step

The following steps should be performed to download a file:

  1. To begin an HTTP session, create a CInternetSession object.
  2. To connect to an HTTP server, call the GetHttpConnection method of the CInternetSession object; it returns a pointer to a CHttpConnection object.
  3. To open an HTTP request, call the OpenRequest method of the CHttpConnection object; it returns a pointer to a CHttpFile object.
  4. You may also need to add one or more request headers. For example, to pass basic authentication, prepare a header and add it to the HTTP request:
    1. The basic authentication request header looks like this: "Authorization: Basic BASE64_STRING", where BASE64_STRING is the string "login:password" encoded in Base64. See the remarks below for how to encode a string in Base64.
    2. Use the AddRequestHeaders method of the CHttpFile object to add the request header.
  5. To send the request, call the SendRequest member of the CHttpFile object.
  6. Now you can read the specified number of bytes from the file into a buffer you supply.
    Note: the file you receive may be ASCII encoded, in which case the ReadString method will not work properly. Use the Read method instead and then convert the data to Unicode. See www.pocketpcdn.com/articles/strings.html for how to perform such a conversion.
  7. To handle exceptions, use the CInternetException class.
  8. Deleting the CInternetSession object automatically cleans up open file handles and connections.
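
The note in step 6 can be illustrated with a small portable sketch. It is not the conversion from the article referenced above; it only assumes that the downloaded bytes are plain 7-bit ASCII and widens them one by one. The name AsciiToWide and the '?' replacement for non-ASCII bytes are assumptions made for this example; on the device itself an API such as MultiByteToWideChar would normally be used so that real code pages are handled.

```cpp
#include <cstddef>
#include <string>

// Widen a buffer of 7-bit ASCII bytes (for example, one filled by a
// Read-style call) into a wide string. Bytes above 127 are not ASCII and
// are replaced with '?', which is an arbitrary choice for this sketch.
std::wstring AsciiToWide(const char *data, std::size_t length)
{
    std::wstring result;
    result.reserve(length);
    for (std::size_t i = 0; i < length; ++i)
    {
        unsigned char byte = static_cast<unsigned char>(data[i]);
        result += (byte < 128) ? static_cast<wchar_t>(byte) : L'?';
    }
    return result;
}
```

For instance, AsciiToWide(buffer, bytesRead) could be called on each chunk returned by Read, where buffer and bytesRead are whatever names the reading loop uses.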

Source code

Here is sample code that downloads a file from the server:

    CInternetSession *session = NULL;
    CHttpConnection *connection = NULL;
    CHttpFile *file = NULL;
    TRY
    {
        session = new CInternetSession();
        // To do: specify actual server, port, login name and password here
        connection = session->GetHttpConnection(server, 80, login, password);
        // Open a GET request for the given path
        file = connection->OpenRequest(CHttpConnection::HTTP_VERB_GET, path);
        // Add the basic authentication header
        CString strRequestHeader = _T("Authorization: Basic ");
        strRequestHeader += EncodeBase64(login + _T(":") + password);
        file->AddRequestHeaders(strRequestHeader);
        file->SendRequest();
        // Read the response body one byte at a time
        char ch;
        while (file->Read(&ch, 1))
        {
            // To do: store file content somewhere
        }
    }
    CATCH(CInternetException, pEx)
    {
        // To do: handle internet exceptions here
    }
    END_CATCH
    // Release the objects whether or not an exception occurred;
    // the session is deleted last because the others depend on it
    delete file;
    delete connection;
    delete session;

Remarks

The Base64 Content-Transfer-Encoding is designed to represent arbitrary sequences of octets in a form that need not be humanly readable. The encoding and decoding algorithms are simple, but the encoded data are consistently only about 33 percent larger than the un-encoded data.

The encoding process represents 24-bit groups of input bits as output strings of 4 encoded characters. Proceeding from left to right, a 24-bit input group is formed by concatenating 3 8-bit input groups. These 24 bits are then treated as 4 concatenated 6-bit groups, each of which is translated into a single digit in the base64 alphabet. When encoding a bit stream via the base64 encoding, the bit stream must be presumed to be ordered with the most-significant-bit first. That is, the first bit in the stream will be the high-order bit in the first byte, and the eighth bit will be the low-order bit in the first byte, and so on.
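
The regrouping described above can be sketched in a few lines of portable C++. The function name SplitTriplet is an assumption made for this illustration; it only shows how three 8-bit bytes become four 6-bit indices and does not handle a short final group.

```cpp
#include <array>

// Split three 8-bit input bytes into four 6-bit Base64 indices,
// most significant bits first.
std::array<int, 4> SplitTriplet(unsigned char b1, unsigned char b2, unsigned char b3)
{
    // Concatenate the three bytes into one 24-bit group.
    unsigned long group = (static_cast<unsigned long>(b1) << 16)
                        | (static_cast<unsigned long>(b2) << 8)
                        | static_cast<unsigned long>(b3);

    // Cut the 24 bits into four 6-bit values, left to right.
    std::array<int, 4> indices = {{
        static_cast<int>((group >> 18) & 0x3F),
        static_cast<int>((group >> 12) & 0x3F),
        static_cast<int>((group >> 6) & 0x3F),
        static_cast<int>(group & 0x3F)
    }};
    return indices;
}
```

For the input "Man" (bytes 77, 97 and 110) the indices are 19, 22, 5 and 46, which correspond to the characters T, W, F and u, so "Man" is encoded as "TWFu".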

Each 6-bit group is used as an index into an array of 64 printable characters. The character referenced by the index is placed in the output string. These characters, identified in the table below, are selected so as to be universally representable, and the set excludes characters with particular significance to SMTP (e.g., ".", CR, LF). The padding character is "=".

Value  Encoding    Value  Encoding    Value  Encoding    Value  Encoding
    0  A              16  Q              32  g              48  w
    1  B              17  R              33  h              49  x
    2  C              18  S              34  i              50  y
    3  D              19  T              35  j              51  z
    4  E              20  U              36  k              52  0
    5  F              21  V              37  l              53  1
    6  G              22  W              38  m              54  2
    7  H              23  X              39  n              55  3
    8  I              24  Y              40  o              56  4
    9  J              25  Z              41  p              57  5
   10  K              26  a              42  q              58  6
   11  L              27  b              43  r              59  7
   12  M              28  c              44  s              60  8
   13  N              29  d              45  t              61  9
   14  O              30  e              46  u              62  +
   15  P              31  f              47  v              63  /

The output stream (encoded bytes) must be represented in lines of no more than 76 characters each. All line breaks or other characters not found in the table must be ignored by decoding software. In base64 data, characters other than those in the table, line breaks, and other white space probably indicate a transmission error, about which a warning message or even a message rejection might be appropriate under some circumstances.

Special processing is performed if fewer than 24 bits are available at the end of the data being encoded. A full encoding quantum is always completed at the end of a body. When fewer than 24 input bits are available in an input group, zero bits are added (on the right) to form an integral number of 6-bit groups. Padding at the end of the data is performed using the '=' character. Since all base64 input is an integral number of octets, only the following cases can arise:

  1. The final quantum of encoding input is an integral multiple of 24 bits; here, the final unit of encoded output will be an integral multiple of 4 characters with no "=" padding.
  2. The final quantum of encoding input is exactly 8 bits; here, the final unit of encoded output will be two characters followed by two "=" padding characters.
  3. The final quantum of encoding input is exactly 16 bits; here, the final unit of encoded output will be three characters followed by one "=" padding character.

Because it is used only for padding at the end of the data, the occurrence of any '=' characters may be taken as evidence that the end of the data has been reached. No such assurance is possible, however, when the number of octets transmitted was a multiple of three.
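
The padding cases above can be expressed as simple arithmetic on the input length, which is also useful for sizing an output buffer. The following is a minimal sketch; the function names Base64PaddingCount and Base64EncodedLength are assumptions made for this example, and line breaks are not counted.

```cpp
#include <cstddef>

// Number of '=' characters appended for an input of the given byte length.
std::size_t Base64PaddingCount(std::size_t inputLength)
{
    switch (inputLength % 3)
    {
    case 1:  return 2; // final quantum is 8 bits: two characters plus "=="
    case 2:  return 1; // final quantum is 16 bits: three characters plus "="
    default: return 0; // input is a multiple of 24 bits: no padding
    }
}

// Number of output characters, padding included: every started group of
// 3 input bytes produces exactly 4 characters.
std::size_t Base64EncodedLength(std::size_t inputLength)
{
    return 4 * ((inputLength + 2) / 3);
}
```

For example, the 14-character string "login:password" encodes to 20 characters, the last of which is a single "=" because 14 leaves a remainder of 2 when divided by 3.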

Please refer to RFC 1521 Part 5.2 for details.

Sample code

Here is sample code that encodes a string in Base64. Note that it does not break the output into lines of 76 characters, which is acceptable for an Authorization header because the value is sent on a single line:

    CString EncodeBase64(CString str)
    {
        static const char base64alphabet[] =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/";

        // GetAnsiString (not shown) is assumed to return a null-terminated
        // ANSI copy of the string; release it as that helper requires
        char *pAnsiString = GetAnsiString(str);
        int l = (int)strlen(pAnsiString);
        int t = l % 3;                // bytes in a partial final group
        int nTriplets = (l + 2) / 3;  // number of 3-byte groups, rounded up
        // Each group yields 4 characters, plus the terminating '\0'
        char *encodedString = new char[4 * nTriplets + 1];
        int pos = 0;
        for (int i = 0; i < nTriplets; i++)
        {
            // True only for a final group of 1 or 2 bytes
            bool partial = (i == l / 3);
            // Use unsigned bytes so that values above 127 shift correctly;
            // missing bytes of a partial group count as zero
            unsigned char ch1 = pAnsiString[3*i];
            unsigned char ch2 = (partial && t == 1) ? 0 : pAnsiString[3*i+1];
            unsigned char ch3 = partial ? 0 : pAnsiString[3*i+2];
            int code1 = ch1 >> 2;
            int code2 = ((0x03 & ch1) << 4) + (ch2 >> 4);
            int code3 = ((0x0F & ch2) << 2) + (ch3 >> 6);
            int code4 = 0x3F & ch3;
            encodedString[pos]   = base64alphabet[code1];
            encodedString[pos+1] = base64alphabet[code2];
            // Pad with '=' where no input bits were available
            encodedString[pos+2] = (partial && t == 1) ? '=' : base64alphabet[code3];
            encodedString[pos+3] = partial ? '=' : base64alphabet[code4];
            pos += 4;
        }
        encodedString[pos] = '\0';
        CString res = encodedString;
        delete [] encodedString;
        return res;
    }

eVB

Step by step

The simplest way to download a file from a server is to use the Winsock control. The following steps should be performed to download a file:

  1. First, add the Microsoft CE Winsock Control to your eVB project (using the Project/Components menu).
  2. Specify the protocol to use via the Protocol property (Winsock supports TCP, UDP and IrDA connections):
        WinSock1.Protocol = CInt(0)
  3. Specify the port to connect to via the RemotePort property:
        WinSock1.RemotePort = "80"
     or
        WinSock1.RemotePort = 80
     depending on the version of eVB.
  4. Specify the host name to connect to via the RemoteHost property:
        WinSock1.RemoteHost = "localhost"
  5. To open the connection, use the Connect method:
        WinSock1.Connect
  6. Then comes one of the most important steps: sending a request to the server. Take care to prepare a correct request; refer to RFC 1945 for the details of request messages. In the simplest case the request looks like this:
        WinSock1.SendData "GET /params/getparams.cgi" & vbCrLf
     Do not forget the terminating CR/LF pair, because without it the server will not send a response. See the remarks below for how to pass basic authorization.
  7. Provide a handler for the DataArrival event. It looks like this:
        Private Sub WinSock1_DataArrival(ByVal bytesTotal As Long)
            Dim sockdata
            WinSock1.GetData sockdata
        End Sub
     Here you can use the data received from the server. Again, you have to parse and handle all the received headers yourself. Refer to RFC 1945 for the details of response messages.
  8. Finally, close the connection to the server via the Close method:
        WinSock1.Close

Source code

A good sample of using the Winsock control can be found in the "wce211\ms palm size pc\samples\eVB\wsock\" folder of Windows CE Tools.

Remarks

See the remarks in the previous section for how to implement basic authorization in your client program. Here is a sample request to be sent:

    GET /index.html HTTP/1.1
    Host: localhost
    Range: bytes=0-
    User-Agent: PocketPCDownloader/1.0
    Accept: *.*, */*
    Authorization: Basic bAsE64eNcOdEdLoGiNaNdPaSsWoRd==
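
A request like the one above is just a string in which every line ends with a CR/LF pair and an empty line marks the end of the headers. The following portable C++ sketch shows one way to assemble it; the function name BuildGetRequest and its parameters are assumptions made for this illustration, and the resulting string is what would be handed to a send call such as SendData.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Assemble an HTTP GET request. Every line ends with CR LF, and an empty
// line terminates the header section.
std::string BuildGetRequest(
    const std::string &path,
    const std::vector<std::pair<std::string, std::string> > &headers)
{
    std::string request = "GET " + path + " HTTP/1.1\r\n";
    for (std::size_t i = 0; i < headers.size(); ++i)
    {
        // Each header is written as "Name: value"
        request += headers[i].first + ": " + headers[i].second + "\r\n";
    }
    request += "\r\n"; // the blank line tells the server the request is complete
    return request;
}
```

The header list would typically hold pairs such as ("Host", "localhost") and ("Authorization", "Basic " followed by the encoded "login:password" string).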

The response received will look like the following:

    HTTP/1.1 200 OK
    Date: Tue, 30 Oct 2001 14:26:41 GMT
    Server: Apache/1.3.20 (Win32)
    Transfer-Encoding: chunked
    Content-Type: text/plain
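
Since the received headers have to be parsed by hand, the first thing to check is usually the status code in the first line of the response. Below is a minimal sketch in portable C++; the function name ParseStatusCode and the -1 error value are assumptions made for this example, and the remaining headers are not examined.

```cpp
#include <string>

// Extract the numeric status code from a response that starts with a
// status line such as "HTTP/1.1 200 OK". Returns -1 if the text does not
// have that shape.
int ParseStatusCode(const std::string &response)
{
    if (response.compare(0, 5, "HTTP/") != 0)
        return -1;

    // The code is the three digits that follow the first space.
    std::string::size_type space = response.find(' ');
    if (space == std::string::npos || space + 4 > response.size())
        return -1;

    int code = 0;
    for (int i = 1; i <= 3; ++i)
    {
        char c = response[space + i];
        if (c < '0' || c > '9')
            return -1;
        code = code * 10 + (c - '0');
    }
    return code;
}
```

For the response shown above the function returns 200; a caller would normally continue reading the body only for codes it knows how to handle.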
