Build Smarter ASP.NET File Downloading Into Your Web Applications



This article discusses:
  • Dynamic downloading from ASP.NET sites
  • Generating links on the fly
  • Resumable downloads and custom handlers
  • Security concerns involved with custom downloading mechanisms
This article uses the following technologies:
ASP.NET


Code download available at: Downloading2006_09.exe (174KB)


Chances are good that your users need to download files from your organization's Web site. And since providing a download is as easy as providing a link, you certainly don't need to read an article about the process, right? Well, thanks to so many Web advances, there are many reasons it might not be that easy. Maybe you want the file to be downloaded as a file rather than shown as content in the browser. Maybe you don't yet know the path to the files (or maybe they're not on disk at all), so those simple HTML links aren't possible. Maybe you need to worry about your users losing connectivity during large downloads.

In this article I'll present some solutions to these problems so your users will have a faster, error-free downloading experience. Along the way I'll discuss dynamically generated links, explain how to bypass default file behaviors, and illustrate resumable ASP.NET-driven downloads using HTTP 1.1 features.


The Basic Download Link

Let's tackle the missing link problem first. If you don't know in advance what the path to a file is going to be, you could pull the list of links from a database. You could even build the link list dynamically by enumerating the files in a given directory at run time. Here I'll explore that second approach.

Imagine I built a DataGrid in Visual Basic® 2005 and filled it with links to all the files in the download directory, like you see in Figure 1. This could be done by using Server.MapPath within the page to retrieve the full path to the download directory (./downloadfiles/ in this case), retrieving a list of all files in that directory using DirectoryInfo.GetFiles, and then from the resulting array of FileInfo objects building up a DataTable with columns for each of the relevant properties. That DataTable can be bound to a DataGrid on the page, through which links can be generated with a HyperLinkColumn definition as follows:

<asp:HyperLinkColumn DataNavigateUrlField="Name"  
    DataNavigateUrlFormatString="downloadfiles/{0}" 
    DataTextField="Name" 
    HeaderText="File Name:" 
    SortExpression="Name" />
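The enumeration steps just described might be sketched as follows. This is a minimal sketch, not the article's exact sample code; the grid name (FileGrid), the column set, and the placement in Page_Load are illustrative assumptions:

```vb
' Requires Imports System.IO and Imports System.Data.
' Enumerate the download directory and bind the results to the grid.
Dim downloadDir As New DirectoryInfo(Server.MapPath("./downloadfiles/"))

Dim table As New DataTable()
table.Columns.Add("Name", GetType(String))
table.Columns.Add("Length", GetType(Long))
table.Columns.Add("LastWriteTime", GetType(DateTime))

For Each file As FileInfo In downloadDir.GetFiles()
    table.Rows.Add(file.Name, file.Length, file.LastWriteTime)
Next

FileGrid.DataSource = table
FileGrid.DataBind()
```

The HyperLinkColumn shown above then formats each row's Name value into a relative link under downloadfiles/.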
If you were to click on the links, you would see that the browser treats each file type differently, depending on which helper applications are registered to open each file type. By default, if you clicked on the .asp page, the .html page, the .jpg, the .gif, or the .txt file, it would open in the browser itself and no Save As dialog would appear. The reason for this is that these file extensions are of known MIME types. So either the browser itself knows how to render the file, or the operating system has a helper application that the browser will use. Webcasts (.wmv, .avi, and so on), podcasts (.mp3 or .wma), PowerPoint® files, and all Microsoft® Office documents are of known MIME types, presenting a problem if you don't want them opened inline by default.

Figure 1 Simple HTML Links in a DataGrid

In addition, if you allow downloading in this manner, you have only a very general access control mechanism at your disposal. You can control download access on a directory-by-directory basis, but controlling access to individual files or file types would require detailed access control settings, a very labor-intensive process for Webmasters and system administrators. Fortunately, ASP.NET and the .NET Framework provide a number of solutions. They include:

  • Using the Response.WriteFile method
  • Streaming the file using the Response.BinaryWrite method
  • Using the Response.TransmitFile method in ASP.NET 2.0
  • Using an ISAPI filter
  • Writing a custom browser control


Forcing Downloads for All File Types

The most easily employed of the solutions I just listed is the Response.WriteFile method. The basic syntax is very simple; this complete ASPX page looks for a file path specified as a query string parameter and serves that file up to the client:

<%@ Page language="VB" AutoEventWireup="false" %>
<html>
   <body>
        <%
            If Not String.IsNullOrEmpty(Request.QueryString("FileName")) Then
                Response.Clear()
                Response.WriteFile(Request.QueryString("FileName"))
                Response.End()
            End If
        %>
   </body>
</html>
When your code, which is running in an IIS worker process (aspnet_wp.exe on IIS 5.0 or w3wp.exe on IIS 6.0), calls Response.WriteFile, the ASP.NET worker process starts to send data to the IIS process (inetinfo.exe or dllhost.exe). As the data is sent from the worker process to the IIS process, it is buffered in memory. In many cases this is not a cause for concern. However, it's not a great solution for very large files.

On the plus side, because the HTTP response that sends the file is created in the ASP.NET code, you have full access to all of ASP.NET authentication and authorization mechanisms and can therefore make decisions based on authentication status, on the existence of Identity and Principal objects at run time, or any other mechanism you see fit.

Thus, you can integrate existing security mechanisms like the built-in ASP.NET user and group mechanisms, Microsoft server add-ins such as Authorization Manager and defined role groups, Active Directory® Application Mode (ADAM) or even Active Directory, to provide granular control over download permissions.

Initiating the download from inside your application code also lets you supersede the default behavior for known MIME types. To accomplish this you need to change the link you display. Here is code to construct a hyperlink that will post back to the ASPX page:

<!-- in the DataGrid definition in FileFetch.aspx -->
<asp:HyperLinkColumn DataNavigateUrlField="Name"          
    DataNavigateUrlFormatString="FileFetch.aspx?FileName={0}" 
    DataTextField="Name" 
    HeaderText="File Name:" 
    SortExpression="Name" />

Next you need to check the Query String when the page is requested to see if the request is a postback that includes a filename argument to be sent to the client's browser (see Figure 2). Now, thanks to the Content-Disposition response header, when you click on one of the links in the grid, you get the save dialog regardless of the MIME type (see Figure 3). Notice, too, that I've restricted what files can be downloaded based on the result of calling a method named IsSafeFileName. For more information on why I'm doing this and on what this method accomplishes, see the "Unintended File Access" sidebar.
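Figure 2 is not reproduced in this text, but based on the description above, the check presumably resembles the following sketch. IsSafeFileName is the article's validation method, discussed in the "Unintended File Access" sidebar; its body is not shown here, and the handler layout is an assumption:

```vb
' Sketch of the postback check: serve the requested file with a
' Content-Disposition header so the browser shows a Save dialog
' regardless of the file's MIME type.
Private Sub Page_Load(ByVal sender As Object, ByVal e As EventArgs) _
    Handles Me.Load

    Dim fileName As String = Request.QueryString("FileName")
    If Not String.IsNullOrEmpty(fileName) AndAlso IsSafeFileName(fileName) Then
        Dim path As String = Server.MapPath("./downloadfiles/" & fileName)
        Response.Clear()
        Response.AddHeader("Content-Disposition", _
            "attachment; filename=" & fileName)
        Response.WriteFile(path)
        Response.End()
    End If
End Sub
```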

Figure 3 Forcing a File Download Dialog

An important metric to consider when using this technique is the size of the file download. You must limit the size of the file or you'll expose your site to denial-of-service attacks. Attempts to download files that are larger than resources permit will generate a runtime error stating that the page cannot be displayed or will display an error like this:

Server Application Unavailable

The Web application you are 
attempting to access on this Web 
server is currently unavailable. 
Please hit the "Refresh" button in your Web 
browser to retry your request.

Administrator Note: An error message detailing 
the cause of this specific request failure can be 
found in the system event log of the Web server. 
Please review this log entry to discover what 
caused this error to occur.

The maximum downloadable file size is a factor of the hardware configuration and runtime state of the server. To deal with this issue, see the Knowledge Base article "FIX: Downloading Large Files Causes a Large Memory Loss and Causes the Aspnet_wp.exe Process to Recycle" at support.microsoft.com/kb/823409.

These symptoms are most likely to appear when downloading large files such as videos, particularly on Web servers running Windows 2000 and IIS 5.0 (or Windows Server 2003 with IIS 6.0 running in compatibility mode). The issue is exacerbated on Web servers configured with minimal memory, since the file must be loaded into server memory before it can be downloaded to the client.

Empirical evidence generated on my test machine, a server running IIS 5.0 with 2GB of RAM, indicates download failure when file sizes approach 200MB. In a production environment, the more user downloads running concurrently, the more server memory constraints will result in user download failures. The solution to this problem requires a few more straightforward lines of code.


Downloading Huge Files in Small Pieces

The file size problem with the previous code sample stems from the single call to Response.WriteFile, which buffers the entire source file in memory. A better approach for a large file is to read and send it to the client in smaller, manageable chunks, an example of which is shown in Figure 4. This version of the Page_Load event handler uses a while loop to read the file 10,000 bytes at a time and then sends those chunks to the browser. Therefore, no significant portion of the file is held in memory at run time. The chunk size is currently set as a constant, but it could also be modified programmatically, or even moved into a configuration file so it can be changed to meet server constraints and performance needs. I tested this code with files up to 1.6GB, and the downloads were fast and resulted in no significant server memory consumption.
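Figure 4 is not reproduced in this text, but based on the description above, the read/send loop presumably resembles this sketch. The buffer handling and variable names are assumptions, and fileName is assumed to have been validated already:

```vb
' Requires Imports System.IO. Read the file 10,000 bytes at a time
' and send each chunk to the client, so no significant portion of
' the file is ever held in server memory.
Const ChunkSize As Integer = 10000
Dim buffer(ChunkSize - 1) As Byte
Dim path As String = Server.MapPath("./downloadfiles/" & fileName)

Using stream As FileStream = File.OpenRead(path)
    Response.Clear()
    Response.AddHeader("Content-Disposition", _
        "attachment; filename=" & fileName)
    Response.AddHeader("Content-Length", stream.Length.ToString())

    Dim bytesRead As Integer = stream.Read(buffer, 0, ChunkSize)
    While bytesRead > 0 AndAlso Response.IsClientConnected
        Response.OutputStream.Write(buffer, 0, bytesRead)
        Response.Flush()
        bytesRead = stream.Read(buffer, 0, ChunkSize)
    End While
End Using
Response.End()
```

Moving ChunkSize into a configuration file, as suggested above, would let administrators tune it without recompiling.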

IIS itself does not support file downloads greater than 2GB in size. If you require larger downloads, you will need to use FTP, a third-party control, the Microsoft Background Intelligent Transfer Service (BITS), or a custom solution like streaming the data through sockets to a browser-hosted custom control.


A Better Solution

The commonality of file download requirements, and the ever-increasing size of the files in general, caused the ASP.NET development team to add a specific method to ASP.NET for downloading files without buffering the file in memory before sending it to the browser. That method is Response.TransmitFile, which is available in ASP.NET 2.0.

TransmitFile can be used just like WriteFile, but typically yields better performance characteristics. TransmitFile also comes complete with additional functionality. Take a look at the code in Figure 5, which uses some of these additional features to avoid the aforementioned memory usage problems.

I was able to add some security and fault tolerance with just a few additional lines of code. First, I added a bit of security and logic constraint using the file extension of the requested file to determine the MIME type and specifying the requested MIME type in an HTTP Header by setting the "ContentType" property of the Response object:

Response.ContentType = "application/x-zip-compressed"
This allowed me to limit downloads to only certain content types, and map different file extensions to a single content type. Notice also the statement that adds a Content-Disposition header. This statement let me specify the file name to download, separate from the original file name on the server's hard disk.

In this code I create a new file name by appending a prefix to the original name. While the prefix here is static, I could dynamically create a prefix so that the downloaded file name will never conflict with a file name already on the user's hard disk.
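Figure 5 is likewise not reproduced here; based on the description, the TransmitFile version might be sketched as follows. The "dl_" prefix and the single hard-coded MIME type are illustrative assumptions, not the article's exact values:

```vb
' TransmitFile sends the file without buffering it in server memory.
' The Content-Disposition header renames the downloaded file,
' independent of its name on the server's disk.
Response.Clear()
Response.ContentType = "application/x-zip-compressed"
Response.AddHeader("Content-Disposition", _
    "attachment; filename=dl_" & fileName)
Response.TransmitFile(Server.MapPath("./downloadfiles/" & fileName))
Response.End()
```

In practice the ContentType would be looked up from the requested file's extension, which is also where downloads can be limited to approved content types.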

But what if, halfway through fetching a large file, my download fails? While the code thus far has come a long way from a simple download link, I still can't gracefully handle a failed download and resume transferring a file that has already been partially moved from the server to the client. All the solutions I have examined so far would require the user to start the download over again from the beginning in the event of a failure.


Resuming Downloads that Fail

To address the question of resuming a failed download, let's go back to the approach of manually chunking a file for transmission. While not as simple as the code that uses the TransmitFile method, there is an advantage to manually writing the code to read and send a file in chunks. At any given point in time, the runtime state contains the number of bytes that have already been sent to the client, and by subtracting that from the total file size, you get the number of bytes remaining to be transmitted in order for the file to be complete.

If you look back at the code, you'll see that the read/send loop checks the result of Response.IsClientConnected as a loop condition. This test ensures that transmission is suspended if the client is no longer connected. At the first loop iteration in which this test is false (the Web browser that initiated the file download is no longer connected), the server stops sending data and the remaining bytes required to complete the file can be recorded. What's more, the partial file received by the client can be saved in the event the user attempts to complete the failed download.

The rest of the resumable download solution comes via some little-known features in the HTTP 1.1 protocol. Normally, HTTP's stateless nature is the bane of the Web developer's existence, but in this case the HTTP specification is a big help. Specifically, there are two HTTP 1.1 header elements relevant to the task at hand: Accept-Ranges and ETag.

The Accept-Ranges header quite simply tells the client, the Web browser in this case, that this process supports resumable downloads. The entity tag, or ETag, header specifies a unique identifier for the particular version of the file being sent. So the HTTP headers that the ASP.NET application might send to the browser to begin a resumable download might look like this:

HTTP/1.1 200 OK
Connection: close
Date: Mon, 22 May 2006 11:09:13 GMT
Accept-Ranges: bytes
Last-Modified: Mon, 22 May 2006 08:09:13 GMT
ETag: "58afcc3dae87d52:3173"
Cache-Control: private
Content-Type: application/x-zip-compressed
Content-Length: 39551221

Because of the ETag and Accept-Ranges headers, the browser knows that the Web server will support resumable downloads.

If the download fails, when the file is requested again, Internet Explorer will send the ETag, file name, and the value range indicating how much of the file has been successfully downloaded before the interruption so that the Web server (IIS) can attempt to resume the download. That second request might look something like this.

GET http://192.168.0.1/download.zip HTTP/1.0
Range: bytes=933714-
Unless-Modified-Since: Sun, 26 Sep 2004 15:52:45 GMT
If-Range: "58afcc3dae87d52:3173"
Notice that the If-Range element contains the original ETag value that the server can use to identify the file to be resent. You'll also see that the Unless-Modified-Since element contains the date and time that the original download began. The server will use this to determine whether the file has been modified since the original download began. If it has, the server will restart the download from the beginning.

The Range element in the request header tells the server the byte offset at which the interrupted download stopped, which the server can use to determine where in the partially downloaded file it should resume.
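A minimal sketch of extracting that resume offset on the server side might look like this. It assumes a single-range request of the form "bytes=933714-"; multi-range requests and suffix ranges are ignored here for brevity:

```vb
' Parse a header such as "Range: bytes=933714-" to find the byte
' offset at which the file transfer should resume.
Dim rangeHeader As String = Request.Headers("Range")
Dim startByte As Long = 0

If Not String.IsNullOrEmpty(rangeHeader) AndAlso _
   rangeHeader.StartsWith("bytes=") Then
    Dim range As String = rangeHeader.Substring("bytes=".Length)
    Dim dash As Integer = range.IndexOf("-"c)
    If dash > 0 Then
        startByte = Long.Parse(range.Substring(0, dash))
    End If
End If
' Seek to startByte in the source file before restarting the send loop.
```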

Different browsers use these headers a bit differently. Other HTTP headers that a client might send to uniquely identify the file are If-Match, If-Unmodified-Since, and Unless-Modified-Since. Note that the HTTP 1.1 specification is not specific about which headers a client is required to support. It is therefore possible that some Web browsers will not support any of these HTTP headers, and others may use different headers than those expected by Internet Explorer®.

By default, IIS will include a header set like the following:

HTTP/1.1 206 Partial Content
Content-Range: bytes 933714-39551221/39551222
Accept-Ranges: bytes
Last-Modified: Sun, 26 Sep 2004 15:52:45 GMT
ETag: "58afcc3dae87d52:3173"
Cache-Control: private
Content-Type: application/x-zip-compressed
Content-Length: 2021408
This header set includes a different response code than that of the original request. The original response included a code of 200, whereas this one uses a response code of 206, Partial Content, which tells the client that the data to follow is not a complete file, but rather the continuation of a previously initiated download, identified by the ETag.

While some Web browsers rely on the file name itself, Internet Explorer very specifically requires the ETag header. If the ETag header is not present in the initial download response or the download resumption, Internet Explorer will not attempt to resume the download; it will simply begin a new one.

In order for the ASP.NET download application to implement a resumable download feature, you need to intercept the download-resumption request from the browser and use the HTTP headers in the request to formulate an appropriate response in the ASP.NET code. To do this, you should catch the request a little earlier in the normal sequence of processing.

Thankfully, the .NET Framework is here to help. This is a great example of a fundamental design premise of .NET—providing a well-factored object library of functionality for a large portion of the standard plumbing work that developers are called on to perform daily.

In this case, you can take advantage of the IHttpHandler interface provided by the System.Web namespace in the .NET Framework in order to build your own custom HTTP handler. By creating your own class that implements the IHttpHandler, you will be able to intercept Web requests for a specific file type and respond to those requests in your own code rather than simply allowing IIS to respond with its default behaviors.
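The shape of such a handler, sketched minimally. The class name and the logic outlined in the comments are assumptions; the full, working implementation is what the article's code download provides:

```vb
Imports System.Web

' Skeleton of a custom HTTP handler that could intercept download
' requests before IIS applies its default behavior.
Public Class DownloadHandler
    Implements IHttpHandler

    Public Sub ProcessRequest(ByVal context As HttpContext) _
        Implements IHttpHandler.ProcessRequest
        ' Inspect context.Request.Headers("Range") and ("If-Range"),
        ' then write either a 200 response containing the whole file
        ' or a 206 Partial Content response with the remaining bytes.
    End Sub

    Public ReadOnly Property IsReusable() As Boolean _
        Implements IHttpHandler.IsReusable
        Get
            Return True
        End Get
    End Property
End Class
```

The handler is then mapped to a path in web.config with an entry under <httpHandlers>, for example <add verb="*" path="*.zip" type="DownloadHandler"/>, and the corresponding extension must be mapped to aspnet_isapi.dll in IIS so the request reaches ASP.NET at all.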

The code download for this article contains a working implementation of an HTTP handler that supports resumable downloads. While there is quite a bit of code to this feature, and its implementation requires some understanding of HTTP mechanics, the .NET Framework nevertheless makes this a relatively simple implementation. This solution provides the capability to download very large files, and after the download is initiated, browsing can continue. However, there are certain infrastructure considerations that will be beyond your control.

For example, many companies and Internet service providers maintain their own caching mechanisms. Broken or misconfigured Web cache servers can cause large downloads to fail due to file corruption or premature session termination, especially if your file size is greater than 255MB.

If you require file downloads in excess of 255MB or other custom functions, you may want to consider custom or third-party download managers. You may, for example, build a custom browser control or browser helper function to manage the downloads, hand them off to BITS, or even hand off the file request to an FTP client in the custom code. The options are endless and should be tailored to your specific needs.

From large file downloads in two lines of code to segmented, resumable downloads with custom security, the .NET Framework and ASP.NET provide a full range of options for building the most suitable download experience for the Web site's end users.



Joe Stagner joined Microsoft in 2001 as a Technical Evangelist and is now a Program Manager for Developer Community in the Tools and Platform Products group. His 30 years of development experiences have afforded him the opportunity to create commercial software applications across a wide diversity of technical platforms.

From the September 2006 issue of MSDN Magazine.

© 2007 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.
