How to load and save documents in TWebBrowser in a Delphi-like way

Why do it

I use the TWebBrowser control quite a lot. But, since the control is simply a wrapper round the Microsoft® control, it doesn't really do things the "Delphi way".

If, like me, you've found yourself saving dynamically generated HTML code to disk just so you can access it using TWebBrowser.Navigate() you probably know what I mean. You wouldn't have to do such a thing with a native Delphi control such as a TMemo – you would simply access a relevant property like TMemo.Lines[] or use a method like TMemo.LoadFromStream. You can also do this with TWebBrowser, but it's not straightforward – not very "Delphi" – you have to query and manipulate interfaces and all sorts of stuff. It's all so very COM!

So, I decided to create a wrapper class for TWebBrowser that makes navigating, loading and saving a whole lot easier and more intuitive. This article walks through the development of that class and examines some of the key techniques for working with TWebBrowser along the way.

A word of caution before we get started. The code I'll present here is for illustration purposes only. Don't expect it to be perfect for production code, although you should be able to use it as a basis.

Do feel free to take the core unit from the demo and modify, specialise or generalise it for your own purposes. It's open source and licensed under the MPL / GPL / LGPL tri-license, which should suit most open-source hackers.

Requirements

The approach we will take is to develop a wrapper class for TWebBrowser rather than derive a new class from it. Let's first decide which of the main functions of the web browser we want to be able to access easily – these will be our requirements for the wrapper class. Here's the list I drew up:

  • Load HTML code into an existing document from a local file, a stream or a string.
  • Navigate to a URL (as now) but without worrying about remembering the file:// protocol when loading a local file and with easy access to HTML resources using the res:// protocol.
  • Programmatically save the browser's content as HTML source code. We should also be able to store the content in a string or write it to a stream or file.
  • Provide access to the underlying TWebBrowser object so that we can use it directly to access any functionality not provided by the new code.

Now we're living in a Unicode world, the following extra requirements will be added:

  • Be able to read and write HTML code in ANSI or Unicode encodings. And be able to specify the encoding used when saving a document or when loading from a string.
  • Get the encoding used by the browser object for the current document.

From this list you can see we are focussing on navigating, loading and saving documents.

Implementation

Now we have our specification we will approach implementation in two stages:

  • Stage 1: Basic implementation of the first set of requirements. This code will be suitable for most Delphi compilers.
  • Stage 2: Add on the required Unicode support. This will require Delphi 2009 or later.

Stage 1: Basic Implementation

This first stage won't worry about Unicode support other than that needed to make the code compile and work with Delphi 2009 and later.

Here is an outline of a class that, once implemented, meets our basic requirements:

    type
      TWebBrowserWrapper = class(TObject)
      private
        fWebBrowser: TWebBrowser; // wrapped control
      protected
        procedure InternalLoadDocumentFromStream(const Stream: TStream);
        procedure InternalSaveDocumentToStream(const Stream: TStream);
      public
        constructor Create(const WebBrowser: TWebBrowser);
        procedure LoadFromFile(const FileName: string);
        procedure LoadFromStream(const Stream: TStream);
        procedure LoadFromString(const HTML: string); overload;
        function NavigateToLocalFile(const FileName: string): Boolean;
        procedure NavigateToResource(const Module: HMODULE;
          const ResName: PChar; const ResType: PChar = nil); overload;
        procedure NavigateToResource(const ModuleName: string;
          const ResName: PChar; const ResType: PChar = nil); overload;
        procedure NavigateToURL(const URL: string);
        function SaveToString: string;
        procedure SaveToStream(const Stm: TStream); overload;
        procedure SaveToFile(const FileName: string); overload;
        property WebBrowser: TWebBrowser read fWebBrowser;
      end;
Listing 1

The first thing to notice is the WebBrowser property that enables access to the wrapped control. The public methods fall naturally into several groups and, rather than explaining the purpose of each method now, we will look at them in groups. The protected "helper" methods will be discussed along with the public methods they service.

Constructor

The constructor is very simple – it just stores a reference to the TWebBrowser control that the object is wrapping. This control reference is passed as a parameter to the constructor:

    constructor TWebBrowserWrapper.Create(const WebBrowser: TWebBrowser);
    begin
      inherited Create;
      fWebBrowser := WebBrowser;
    end;
Listing 2

Navigation Methods

We have declared three different methods for navigation:

  • NavigateToURL – A thin wrapper around the existing TWebBrowser.Navigate method that only deals with standard URLs and intelligently sets the cache and history flags. The main difference between this method and the underlying control's method is that our method blocks until the required document has completely downloaded.
  • NavigateToLocalFile – This convenience method simply adds the required file:// protocol to the URL before calling NavigateToURL.
  • NavigateToResource – Loads HTML code from the program's (or other module's) resources into the browser. Overloaded versions of this method allow resources to be accessed by instance handle (like the TResourceStream constructors) or by providing the name of the module containing the resources.

NavigateToLocalFile and both versions of NavigateToResource work by creating the required URL from the parameters passed to them and then calling NavigateToURL to do the actual navigation. Let's first look at how NavigateToURL handles the navigation then come back to look at how the other routines put the URLs together.

    procedure TWebBrowserWrapper.NavigateToURL(const URL: string);

      procedure Pause(const ADelay: Cardinal);
      var
        StartTC: Cardinal; // tick count when routine called
      begin
        StartTC := Windows.GetTickCount;
        repeat
          Application.ProcessMessages;
        until Int64(Windows.GetTickCount) - Int64(StartTC) >= ADelay;
      end;

    var
      Flags: OleVariant; // flags that determine action
    begin
      // Don't record in history
      Flags := navNoHistory;
      if AnsiStartsText('res://', URL)
        or AnsiStartsText('file://', URL)
        or AnsiStartsText('about:', URL)
        or AnsiStartsText('javascript:', URL)
        or AnsiStartsText('mailto:', URL) then
        // don't use cache for local files
        Flags := Flags or navNoReadFromCache or navNoWriteToCache;

      // Do the navigation and wait for it to complete
      WebBrowser.Navigate(URL, Flags);
      while WebBrowser.ReadyState <> READYSTATE_COMPLETE do
        Pause(5);
    end;
Listing 3

This method falls into two parts. First we decide whether to use the browser's cache to access the document. The decision is based on whether the document is stored locally or is on the internet: we check the start of the URL string for some known local protocols and bypass the cache if the URL uses one of them. It would be a simple matter to adapt the method by adding a default parameter to let the user of the code specify what, if any, caching should take place. This is left as an exercise.

The second part of the routine uses the browser object's Navigate() method to load the resource into the document. We then go into a loop and wait for the document to load completely. The local Pause procedure does a busy wait, polling the message queue for about 5ms at a time.
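One possible approach to the caching exercise is sketched here. It is not code from the demo: the TCachePolicy type and the IsLocalURL and NavigationFlags functions are hypothetical names, and the three flag constants are redeclared locally (using the BrowserNavConstants values) so that the sketch compiles as a stand-alone console program.

```pascal
program NavFlagsSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

const
  // BrowserNavConstants values, redeclared so SHDocVw is not needed here
  navNoHistory = $02;
  navNoReadFromCache = $04;
  navNoWriteToCache = $08;

type
  // Hypothetical caching policy: not part of the article's class
  TCachePolicy = (cpAuto, cpAlways, cpNever);

// Same test for "local" protocols that NavigateToURL performs
function IsLocalURL(const URL: string): Boolean;
begin
  Result := AnsiStartsText('res://', URL)
    or AnsiStartsText('file://', URL)
    or AnsiStartsText('about:', URL)
    or AnsiStartsText('javascript:', URL)
    or AnsiStartsText('mailto:', URL);
end;

// Calculates the flags to pass to TWebBrowser.Navigate
function NavigationFlags(const URL: string;
  const Policy: TCachePolicy = cpAuto): Integer;
var
  BypassCache: Boolean;
begin
  Result := navNoHistory; // never record in history
  case Policy of
    cpAlways: BypassCache := False;
    cpNever: BypassCache := True;
  else
    BypassCache := IsLocalURL(URL); // cpAuto: the article's behaviour
  end;
  if BypassCache then
    Result := Result or navNoReadFromCache or navNoWriteToCache;
end;

begin
  WriteLn(NavigationFlags('about:blank'));           // 14
  WriteLn(NavigationFlags('http://example.com/'));   // 2
  WriteLn(NavigationFlags('about:blank', cpAlways)); // 2
end.
```

NavigateToURL could then be given a Policy parameter that defaults to cpAuto and assign the function's result to Flags, leaving existing callers unchanged.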

Now let us review the specialised navigation methods. The simplest of these is the NavigateToLocalFile method. This method simply checks if the file exists and, if so, prefixes the given file name with the file:// protocol then calls NavigateToURL. If the file doesn't exist no action is taken. A boolean value indicating whether the file exists is returned. You may prefer to modify the method to raise an exception when the file does not exist.

    function TWebBrowserWrapper.NavigateToLocalFile(
      const FileName: string): Boolean;
    begin
      Result := FileExists(FileName);
      if Result then
        NavigateToURL('file://' + FileName)
    end;
Listing 4
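For readers who prefer the exception-raising behaviour, here is a minimal sketch of how the URL could be built. The EWBWrapperError class and the LocalFileURL function are hypothetical names that do not appear in the demo, and the call to ExpandFileName is an addition of mine so that relative file names are resolved first.

```pascal
program LocalFileURLSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical exception class: not part of the article's code
  EWBWrapperError = class(Exception);

// Builds the URL with the same file:// prefix that NavigateToLocalFile uses,
// but raises an exception rather than returning False for a missing file
function LocalFileURL(const FileName: string): string;
var
  FullName: string;
begin
  // Resolve relative names: the URL needs a fully specified path
  FullName := ExpandFileName(FileName);
  if not FileExists(FullName) then
    raise EWBWrapperError.CreateFmt('File not found: %s', [FullName]);
  Result := 'file://' + FullName;
end;

begin
  try
    WriteLn(LocalFileURL(ParamStr(0))); // the running program always exists
    WriteLn(LocalFileURL('no-such-file.html'));
  except
    on E: EWBWrapperError do
      WriteLn('Error: ', E.Message);
  end;
end.
```

With a helper like this, a procedure version of NavigateToLocalFile would reduce to a single call: NavigateToURL(LocalFileURL(FileName)).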

NavigateToResource is slightly more complicated in that we need to create the required URL using the IE specific res:// protocol URL. We discussed this protocol in article #10 where we also developed some functions to return res:// formatted URLs. (We will re-use these functions later).

Two overloaded methods are provided. Both methods create the required URL for a given module, resource name and an optional resource type. They then call NavigateToURL to do the actual navigation. The overloaded methods vary in the way the module is described. The first method accepts the handle of a loaded module (pass HInstance to access the current program). The second method is simply passed a module name as a string. If the resource type parameter is omitted then it is left out of the URL – the res:// protocol simply defaults to expecting an RT_HTML (=MakeIntResource(23)) resource in such cases. Here are the methods:

    procedure TWebBrowserWrapper.NavigateToResource(const Module: HMODULE;
      const ResName, ResType: PChar);
    begin
      NavigateToURL(MakeResourceURL(Module, ResName, ResType));
    end;

    procedure TWebBrowserWrapper.NavigateToResource(const ModuleName: string;
      const ResName, ResType: PChar);
    begin
      NavigateToURL(MakeResourceURL(ModuleName, ResName, ResType));
    end;
Listing 5

As can be seen, the methods simply rely on overloaded versions of the MakeResourceURL function. The implementation of these functions is described in article #10. In both cases we could improve the methods by checking that the required resources exist and raising an exception or returning false if not. This is left as an exercise (hint: use the Windows FindResource function to check for the resource's existence).
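Following the hint above, the existence check for the module-handle overload might look something like this sketch. HTMLResourceExists is a hypothetical name, and the fallback to resource type 23 simply mirrors the default that the res:// protocol is described as using when no type is given.

```pascal
program ResourceExistsSketch;

{$APPTYPE CONSOLE}

uses
  Windows;

// Returns True if the named resource exists in the given loaded module.
// A nil ResType is treated as RT_HTML, i.e. MakeIntResource(23).
function HTMLResourceExists(const Module: HMODULE; const ResName: PChar;
  const ResType: PChar = nil): Boolean;
var
  ActualType: PChar; // resource type actually searched for
begin
  if Assigned(ResType) then
    ActualType := ResType
  else
    ActualType := MakeIntResource(23);
  // FindResource returns 0 when the resource can't be found
  Result := FindResource(Module, ResName, ActualType) <> 0;
end;

begin
  // HInstance is the handle of the current program
  WriteLn(HTMLResourceExists(HInstance, 'NO_SUCH_PAGE'));
end.
```

The overload that takes a module name would first need to obtain a handle to the module before it could perform the same check.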

The MakeResourceURL functions are included in the demo source code that accompanies this article.

Document Loading Methods

There are three methods that load new content into an existing document:

  • LoadFromString – Replaces the current document with the HTML code stored in a string.
  • LoadFromFile – Replaces the current document with the HTML code read from a file.
  • LoadFromStream – Replaces the current document with the HTML code read from a stream.

At first sight LoadFromFile is similar to NavigateToLocalFile, but NavigateToLocalFile reads a file into the browser, creating a new document whereas LoadFromFile requires that a document already exists and replaces its HTML code – i.e. the HTML is changed dynamically.

Both LoadFromString and LoadFromFile simply create a suitable stream and call LoadFromStream, which in turn uses the protected InternalLoadDocumentFromStream method to perform the actual loading of the code. Let's first look at the file and string methods, which are very similar:

    procedure TWebBrowserWrapper.LoadFromFile(const FileName: string);
    var
      FileStream: TFileStream;
    begin
      FileStream := TFileStream.Create(
        FileName, fmOpenRead or fmShareDenyNone
      );
      try
        LoadFromStream(FileStream);
      finally
        FileStream.Free;
      end;
    end;

    procedure TWebBrowserWrapper.LoadFromString(const HTML: string);
    var
      StringStream: TStringStream;
    begin
      StringStream := TStringStream.Create(HTML);
      try
        LoadFromStream(StringStream);
      finally
        StringStream.Free;
      end;
    end;
Listing 6

As can be seen this is all quite straightforward if you're used to using TStreams. LoadFromFile opens a read-only TFileStream onto the file and passes the stream to LoadFromStream. Similarly LoadFromString uses the very useful TStringStream class to open a stream that can read the HTML code string.
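From the caller's point of view, using these methods might look like the following fragment. ShowGreeting is a hypothetical routine, and the fragment assumes that SHDocVw (for TWebBrowser) and whichever unit declares TWebBrowserWrapper are in the uses clause.

```pascal
// Hypothetical helper: displays a fixed snippet of HTML in a browser
// control that is owned by a form elsewhere in the program.
procedure ShowGreeting(const Browser: TWebBrowser);
var
  Wrapper: TWebBrowserWrapper;
begin
  Wrapper := TWebBrowserWrapper.Create(Browser);
  try
    // No temporary file needed: the HTML goes straight into the document
    Wrapper.LoadFromString('<html><body><p>Hello</p></body></html>');
  finally
    // The constructor only stores a reference to the control, so
    // freeing the wrapper leaves the control itself untouched
    Wrapper.Free;
  end;
end;
```

In a real program it would probably make more sense to create the wrapper once, say when the form is created, and reuse it.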

LoadFromStream itself is quite straightforward because it hands most of its work off to InternalLoadDocumentFromStream:

    procedure TWebBrowserWrapper.LoadFromStream(const Stream: TStream);
    begin
      NavigateToURL('about:blank');
      InternalLoadDocumentFromStream(Stream);
    end;
Listing 7

The main thing to note here is that we must ensure there is a document present in TWebBrowser since, as we will see in a moment, we need it in order to load the code from the stream. We do this by navigating to the special about:blank document, which simply creates a blank document in the web browser control. Once we have our blank document we load the stream into it using InternalLoadDocumentFromStream, which is defined below:

    procedure TWebBrowserWrapper.InternalLoadDocumentFromStream(
      const Stream: TStream);
    var
      PersistStreamInit: IPersistStreamInit;
      StreamAdapter: IStream;
    begin
      if not Assigned(WebBrowser.Document) then
        Exit;
      // Get IPersistStreamInit interface on document object
      if WebBrowser.Document.QueryInterface(
        IPersistStreamInit, PersistStreamInit
      ) = S_OK then
      begin
        // Clear document
        if PersistStreamInit.InitNew = S_OK then
        begin
          // Get IStream interface on stream
          StreamAdapter := TStreamAdapter.Create(Stream);
          // Load data from Stream into WebBrowser
          PersistStreamInit.Load(StreamAdapter);
        end;
      end;
    end;
Listing 8

And this is where it gets more complicated – for the first time we have to mess around with the COM stuff.

We check that the web browser control's document object is available and bail out if not. We then check to see if the document supports the IPersistStreamInit interface, getting a reference to the supporting object. IPersistStreamInit is used to effectively "clear" the document object (using the interface's InitNew method). If this succeeds we finally load the stream's content into the document by calling the IPersistStreamInit.Load method.

IPersistStreamInit.Load accepts a stream object, but the stream it expects is a COM one that must support the IStream interface. Since TStream does not natively support this interface, we have to find some way to provide it. TStreamAdapter from Delphi's Classes unit comes to the rescue here – this object implements IStream and translates IStream's method calls into equivalent calls onto the TStream object that it wraps. We create the needed TStreamAdapter object by passing a reference to our stream in its constructor. Finally, we pass the adapted stream to IPersistStreamInit.Load, and we're done.

If you're wondering why we don't free StreamAdapter and PersistStreamInit, it's because they are both interfaced objects and will be automatically destroyed at the end of the method by Delphi's built-in interface reference counting. Note that even though the stream adapter object is freed, the underlying TStream object continues to exist, which is what we want.
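The relationship between the adapter and the stream it wraps can be tried out in isolation, without a browser control. In this small console sketch, WriteViaAdapter is a hypothetical name; the program writes to a memory stream through the IStream interface, releases the adapter and then confirms that the stream is still usable.

```pascal
program StreamAdapterSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, ActiveX;

// Writes Data to Target via an IStream adapter, releases the adapter and
// returns the size of the underlying stream afterwards
function WriteViaAdapter(const Target: TStream;
  const Data: AnsiString): Int64;
var
  Adapter: IStream;
begin
  // By default the adapter only references the stream: it doesn't own it
  Adapter := TStreamAdapter.Create(Target);
  if Data <> '' then
    Adapter.Write(PAnsiChar(Data), Length(Data), nil);
  // Dropping the last reference destroys the adapter but not Target
  Adapter := nil;
  Result := Target.Size;
end;

var
  MS: TMemoryStream;
begin
  MS := TMemoryStream.Create;
  try
    WriteLn(WriteViaAdapter(MS, 'hello')); // 5
  finally
    MS.Free; // the TStream remains our responsibility
  end;
end.
```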

Note that the browser control interprets the encoding of the stream and sets the document's character set accordingly. However, the character set can also be specified in HTML code.

Document Saving Methods

There are three methods that are used to save a document's code. They complement the three LoadXXX methods as follows:

  • SaveToString – Returns the document contents in a string.
  • SaveToFile – Saves the document contents to a specified file.
  • SaveToStream – Saves the document contents to a given stream.

Like their LoadXXX counterparts, the SaveToFile and SaveToString methods simply map down onto SaveToStream, after creating suitable output streams. Their operation is quite simple and needs little explanation:

    procedure TWebBrowserWrapper.SaveToFile(const FileName: string);
    var
      FileStream: TFileStream;
    begin
      FileStream := TFileStream.Create(FileName, fmCreate);
      try
        SaveToStream(FileStream);
      finally
        FileStream.Free;
      end;
    end;

    function TWebBrowserWrapper.SaveToString: string;
    var
      StringStream: TStringStream;
    begin
      StringStream := TStringStream.Create('');
      try
        SaveToStream(StringStream);
        Result := StringStream.DataString;
      finally
        StringStream.Free;
      end;
    end;
Listing 9

The only thing of note in the above methods is the use of TStringStream's DataString property to read out the completed string after writing to the stream.

The SaveToStream method follows. It's as simple as it could be: it just calls InternalSaveDocumentToStream.

    procedure TWebBrowserWrapper.SaveToStream(const Stm: TStream);
    begin
      InternalSaveDocumentToStream(Stm);
    end;
Listing 10

Finally, we get to see how the protected InternalSaveDocumentToStream method is implemented. It interacts with the browser control to save the whole document to a stream.

    procedure TWebBrowserWrapper.InternalSaveDocumentToStream(
      const Stream: TStream);
    var
      StreamAdapter: IStream;
      PersistStreamInit: IPersistStreamInit;
    begin
      if not Assigned(WebBrowser.Document) then
        Exit;
      if WebBrowser.Document.QueryInterface(
        IPersistStreamInit, PersistStreamInit
      ) = S_OK then
      begin
        StreamAdapter := TStreamAdapter.Create(Stream);
        PersistStreamInit.Save(StreamAdapter, True);
      end;
    end;
Listing 11

Note that, like in InternalLoadDocumentFromStream, we again try to get the web browser document's IPersistStreamInit interface and then use its Save method to write the document to the stream. We also use TStreamAdapter once again to provide the required IStream interface for the TStream.

Note that the browser control writes the stream in the correct encoding for the document.

Stage 2: Adding Unicode Support

Since the browser control supports different character encodings we need to add support for this to our code. For much of this we're going to rely on the encoding support built into Delphi 2009 and later, so get ready for some conditionally defined code.

Thanks to Mauricio Julio for suggesting some of the ideas and code used in this section. Particular thanks are due for the GetStreamEncoding function (see Listing 14) from his TcyComponents pack.

When we discussed requirements we decided we needed to be able to specify an encoding when writing to files and streams and when reading from a string. To handle this we define new overloads of the SaveToStream, SaveToFile and LoadFromString methods.

The second requirement was to provide access to the encoding used for the browser control's current document.

Taking these into account the definition of our TWebBrowserWrapper class becomes:

    type
      TWebBrowserWrapper = class(TObject)
      private
        fWebBrowser: TWebBrowser; // wrapped control
      protected
        procedure InternalLoadDocumentFromStream(const Stream: TStream);
        procedure InternalSaveDocumentToStream(const Stream: TStream);
        {$IFDEF UNICODE}
        function GetDocumentEncoding: TEncoding;
        {$ENDIF}
      public
        constructor Create(const WebBrowser: TWebBrowser);
        procedure LoadFromFile(const FileName: string);
        procedure LoadFromStream(const Stream: TStream);
        procedure LoadFromString(const HTML: string); overload;
        {$IFDEF UNICODE}
        procedure LoadFromString(const HTML: string;
          const Encoding: TEncoding); overload;
        {$ENDIF}
        function NavigateToLocalFile(const FileName: string): Boolean;
        procedure NavigateToResource(const Module: HMODULE;
          const ResName: PChar; const ResType: PChar = nil); overload;
        procedure NavigateToResource(const ModuleName: string;
          const ResName: PChar; const ResType: PChar = nil); overload;
        procedure NavigateToURL(const URL: string);
        function SaveToString: string;
        procedure SaveToStream(const Stm: TStream); overload;
        {$IFDEF UNICODE}
        procedure SaveToStream(const Stm: TStream;
          const Encoding: TEncoding); overload;
        {$ENDIF}
        procedure SaveToFile(const FileName: string); overload;
        {$IFDEF UNICODE}
        procedure SaveToFile(const FileName: string;
          const Encoding: TEncoding); overload;
        {$ENDIF}
        property WebBrowser: TWebBrowser read fWebBrowser;
        {$IFDEF UNICODE}
        property Encoding: TEncoding read GetDocumentEncoding;
        {$ENDIF}
      end;
Listing 12

Encoding Property

We provide access to the browser's current document encoding via the read only Encoding property which has a read accessor method named GetDocumentEncoding, defined in the following listing.

    {$IFDEF UNICODE}
    function TWebBrowserWrapper.GetDocumentEncoding: TEncoding;
    var
      Doc: IHTMLDocument2;
      DocStm: TStream;
    begin
      Assert(Assigned(WebBrowser.Document));
      Result := TEncoding.Default;
      if WebBrowser.Document.QueryInterface(IHTMLDocument2, Doc) = S_OK then
      begin
        DocStm := TMemoryStream.Create;
        try
          InternalSaveDocumentToStream(DocStm);
          Result := GetStreamEncoding(DocStm);
        finally
          DocStm.Free;
        end;
      end;
    end;
    {$ENDIF}
Listing 13

To get the document encoding we need to examine the structure of the stream that is generated when the document is saved. We first record the default encoding to return if we can't examine the document for any reason. Once we have a reference to the current document in Doc we create a memory stream object and save the browser content into it by calling InternalSaveDocumentToStream. The resulting stream is then examined by Mauricio Julio's GetStreamEncoding function to get the encoding. Listing 14 shows the implementation of GetStreamEncoding.

    {$IFDEF UNICODE}
    function GetStreamEncoding(const Stream: TStream): TEncoding;
    var
      Bytes: TBytes;
      Size: Int64;
    begin
      Stream.Seek(0, soFromBeginning);
      Size := Stream.Size;
      SetLength(Bytes, Size);
      Stream.ReadBuffer(Pointer(Bytes)^, Size);
      Result := nil; // must initialise Result to pass as var param below
      TEncoding.GetBufferEncoding(Bytes, Result);
    end;
    {$ENDIF}
Listing 14

This routine simply copies the provided stream into a TBytes array then uses the GetBufferEncoding class method of TEncoding to determine the encoding.
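To see what GetBufferEncoding reports, it can be tried on a couple of hand-made buffers. This is just an illustrative console program for Delphi 2009 or later: DetectEncoding is a hypothetical wrapper that also returns the length of any byte order mark found.

```pascal
program DetectBOMSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Returns the encoding indicated by any byte order mark at the start of
// Bytes and passes out the length of the mark (0 if there isn't one)
function DetectEncoding(const Bytes: TBytes;
  out PreambleLen: Integer): TEncoding;
begin
  Result := nil; // nil asks GetBufferEncoding to detect the encoding
  PreambleLen := TEncoding.GetBufferEncoding(Bytes, Result);
end;

var
  Enc: TEncoding;
  Len: Integer;
begin
  // "<p>" preceded by the UTF-8 byte order mark EF BB BF
  Enc := DetectEncoding(TBytes.Create($EF, $BB, $BF, $3C, $70, $3E), Len);
  WriteLn(Enc = TEncoding.UTF8, ' ', Len);
  // "<p>" with no byte order mark: the default encoding is reported
  Enc := DetectEncoding(TBytes.Create($3C, $70, $3E), Len);
  WriteLn(Enc = TEncoding.Default, ' ', Len);
end.
```

Notice that a buffer with no byte order mark is reported as using the default encoding, whatever its content actually is.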

Revised Document Loading Methods

As noted already we will provide a new overloaded version of LoadFromString that takes a TEncoding parameter that determines the encoding that will be used to load the string containing the HTML. We will also need to re-implement the original LoadFromString method. Here's the new code:

    {$IFDEF UNICODE}
    procedure TWebBrowserWrapper.LoadFromString(const HTML: string;
      const Encoding: TEncoding);
    var
      HTMLStm: TMemoryStream;
    begin
      Assert(Assigned(Encoding));
      HTMLStm := TMemoryStream.Create;
      try
        StringToStreamBOM(HTML, HTMLStm, Encoding);
        HTMLStm.Position := 0;
        LoadFromStream(HTMLStm);
      finally
        HTMLStm.Free;
      end;
    end;
    {$ENDIF}

    procedure TWebBrowserWrapper.LoadFromString(const HTML: string);
    {$IFDEF UNICODE}
    begin
      LoadFromString(HTML, TEncoding.Default);
    end;
    {$ELSE}
    var
      StringStream: TStringStream;
    begin
      StringStream := TStringStream.Create(HTML);
      try
        LoadFromStream(StringStream);
      finally
        StringStream.Free;
      end;
    end;
    {$ENDIF}
Listing 15

The first thing to note is that, on non-Unicode compilers, the original version of LoadFromString is unchanged. However the Unicode version now calls the new overloaded version of the method, passing the default encoding in the Encoding parameter.

The new, Unicode only, overloaded method first writes the string to a temporary memory stream, encoded according to the Encoding parameter. The stream is prefixed by any byte order mark required by the encoding. All this work is done by the StringToStreamBOM helper routine that is shown in Listing 16. Once we have the stream we simply call the existing LoadFromStream method to load the stream into the document.

    {$IFDEF UNICODE}
    procedure StringToStreamBOM(const S: string; const Stm: TStream;
      const Encoding: TEncoding);
    var
      Bytes: TBytes;
      Preamble: TBytes;
    begin
      Assert(Assigned(Encoding));
      Bytes := Encoding.GetBytes(S);
      Preamble := Encoding.GetPreamble;
      if Length(Preamble) > 0 then
        Stm.WriteBuffer(Preamble[0], Length(Preamble));
      // Guard against empty strings: Bytes[0] is out of range when
      // Bytes is empty and range checking is enabled
      if Length(Bytes) > 0 then
        Stm.WriteBuffer(Bytes[0], Length(Bytes));
    end;
    {$ENDIF}
Listing 16

StringToStreamBOM first converts the string into a byte array according to the required encoding. It then writes any required byte order mark to the stream (stored in the Preamble variable) followed by the byte array.
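The effect of the encoding on what gets written is easy to check with a little arithmetic. The following console sketch (Delphi 2009 or later) uses a hypothetical EncodedSize function to add up the two parts that StringToStreamBOM writes, i.e. the preamble and the encoded text.

```pascal
program EncodedSizeSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Number of bytes needed to store S in the given encoding, including
// any byte order mark the encoding requires
function EncodedSize(const S: string; const Encoding: TEncoding): Integer;
begin
  Result := Length(Encoding.GetPreamble) + Length(Encoding.GetBytes(S));
end;

begin
  WriteLn(EncodedSize('abc', TEncoding.UTF8));    // 6: 3 byte BOM + 3
  WriteLn(EncodedSize('abc', TEncoding.Unicode)); // 8: 2 byte BOM + 6
  WriteLn(EncodedSize('abc', TEncoding.ASCII));   // 3: no BOM + 3
end.
```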

Revised Document Saving Methods

In addition to providing new overloaded versions of SaveToStream and SaveToFile we must re-implement SaveToString when compiling with Unicode compilers to take account of the browser document's encoding.

We will first look at the revised SaveToString method before discussing the new overloaded methods. Here is the new implementation of SaveToString.

    function TWebBrowserWrapper.SaveToString: string;
    {$IFDEF UNICODE}
    var
      MS: TMemoryStream;
      Encoding: TEncoding;
      Bytes: TBytes;
    begin
      MS := TMemoryStream.Create;
      try
        SaveToStream(MS);
        // This stream may have a pre-amble indicating encoding
        Encoding := GetStreamEncoding(MS);
        MS.Position := Length(Encoding.GetPreamble);
        SetLength(Bytes, MS.Size - MS.Position);
        // Guard against an empty stream: Bytes[0] is out of range when
        // Bytes is empty and range checking is enabled
        if Length(Bytes) > 0 then
          MS.ReadBuffer(Bytes[0], Length(Bytes));
        Result := Encoding.GetString(Bytes);
      finally
        MS.Free;
      end;
    end;
    {$ELSE}
    var
      StringStream: TStringStream;
    begin
      StringStream := TStringStream.Create('');
      try
        SaveToStream(StringStream);
        Result := StringStream.DataString;
      finally
        StringStream.Free;
      end;
    end;
    {$ENDIF}
Listing 17

Just as with LoadFromString, the non-Unicode version of SaveToString remains unchanged.

The Unicode version of the method writes the browser's document into a temporary memory stream. This is done because we need to interpret the output stream according to its encoding. So we call GetStreamEncoding to find the encoding used to generate the stream. Next we set the memory stream's position to skip over any byte order mark (preamble). The remainder of the stream is then copied into a byte array that we can pass to the GetString method of TEncoding which, in turn, returns the required string.

Having disposed of the most complex method we now turn to the new overloaded versions of SaveToFile and SaveToStream:

    {$IFDEF UNICODE}
    procedure TWebBrowserWrapper.SaveToFile(const FileName: string;
      const Encoding: TEncoding);
    var
      FileStream: TFileStream;
    begin
      FileStream := TFileStream.Create(FileName, fmCreate);
      try
        SaveToStream(FileStream, Encoding);
      finally
        FileStream.Free;
      end;
    end;
    {$ENDIF}

    {$IFDEF UNICODE}
    procedure TWebBrowserWrapper.SaveToStream(const Stm: TStream;
      const Encoding: TEncoding);
    var
      HTML: string;
    begin
      HTML := SaveToString;
      StringToStreamBOM(HTML, Stm, Encoding);
    end;
    {$ENDIF}
Listing 18

SaveToFile is very similar to the version that does not take an Encoding. It creates a stream onto the file then passes the stream and the encoding to the overloaded version of SaveToStream.

In contrast the overload of SaveToStream is very different. First it calls SaveToString to retrieve a string containing the document's content as HTML. It does this because SaveToString handles the document encoding correctly. We then save the string to the stream using the encoding provided in the Encoding parameter, prefixed by any required byte order mark.

If Encoding is the same as the document's encoding then calling this method is wasteful. You should call the SaveToStream overload that does not take an Encoding parameter instead.
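As a usage illustration only, the two flavours of SaveToFile might be called like this. ExportDocument and the file naming scheme are hypothetical, and the fragment assumes the unit declaring TWebBrowserWrapper is in the uses clause along with SysUtils.

```pascal
// Hypothetical helper: saves the current document twice, once as the
// browser control writes it and once re-encoded as UTF-8.
procedure ExportDocument(const Wrapper: TWebBrowserWrapper;
  const BaseName: string);
begin
  // Stream is written exactly as the browser control produces it
  Wrapper.SaveToFile(BaseName + '.html');
  {$IFDEF UNICODE}
  // Document is converted to UTF-8 and prefixed by the UTF-8 byte
  // order mark: only available on Unicode compilers
  Wrapper.SaveToFile(BaseName + '-utf8.html', TEncoding.UTF8);
  {$ENDIF}
end;
```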

Conclusion

So there we have it, a class that makes TWebBrowser a lot more friendly to use when loading and saving documents.

There is obviously a lot more we could do. A couple of things immediately spring to mind.

  1. We could retro-fit some of the Unicode functionality and support for non-ANSI encodings to the pre-Unicode compiler code. The present code when compiled with anything earlier than Delphi 2009 will not save document content to strings correctly if the document character set is not ANSI.
  2. Although this code checks input and output streams for preambles that identify HTML file and document encodings, it does not check any encoding included explicitly in the HTML.

    Hint

    Check the IHTMLDocument2.charset property or the <meta http-equiv="Content-Type" …> declaration.
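As a starting point for the hint, the document's character set name can be read as shown in this fragment. DocumentCharset is a hypothetical function, not part of the demo, and it assumes SysUtils, SHDocVw and MSHTML are in the uses clause. Mapping the returned name onto a TEncoding is not attempted here.

```pascal
// Hypothetical helper: returns the character set name reported by the
// browser's current document, or '' if there is no HTML document.
function DocumentCharset(const Browser: TWebBrowser): string;
var
  Doc: IHTMLDocument2;
begin
  Result := '';
  if Assigned(Browser.Document)
    and Supports(Browser.Document, IHTMLDocument2, Doc) then
    Result := Doc.charset;
end;
```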

Demo Program

A demo program to accompany this article can be found in the delphidabbler/article-demos Git repository on GitHub.

You can view the code in the article-14 sub-directory. Alternatively download a zip file containing all the demos by going to the repository's landing page and clicking the Clone or download button and selecting Download ZIP.

See the demo's README.md file for details.

As noted above, the code presented in this article does not work correctly when loading and saving in Unicode (or UTF-8) when built with a non-Unicode version of Delphi. These limitations are also present in the demo. Choosing Unicode or UTF-8 examples when loading strings into the demo will fail when it is built with compilers earlier than Delphi 2009.

This source code is merely a proof of concept and is intended only to illustrate this article. It is not designed for use in its current form in finished applications. The code is provided on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or implied.

The demo is open source. See the demo's LICENSE.md file for licensing details.

Acknowledgements

Several people's ideas and code samples have been used in developing this code. In particular I'd like to thank Christian Schwarz, Nick Hodges and Babur Saylan for writing some very useful tips on working with TWebBrowser at the now defunct "Delphi Pool" website.

Various useful articles were also found on the Microsoft Developer Network. Unfortunately the links to the articles have been lost.

Thanks are also due to Mauricio Julio for his ideas and code.

Feedback

I hope you found this article useful.

If you have any observations, comments, or have found any errors there are two places you can report them.

  1. For anything to do with the article content, but not the downloadable demo code, please use this website's Issues page on GitHub. Make sure you mention that the issue relates to "article #".
  2. For bugs in the demo code see the article-demo project's README.md file for details of how to report them.