Implementation
Now that we have our specification, we will approach the implementation in two stages:
- Stage 1: Basic implementation of the first set of requirements. This code will be suitable for most Delphi compilers.
- Stage 2: Add on the required Unicode support. This will require Delphi 2009 or later.
Stage 1: Basic Implementation
This first stage won't worry about Unicode support other than that needed to make the code compile and work with Delphi 2009 and later.
Here is an outline of a class that, once implemented, meets our basic requirements:
type
  TWebBrowserWrapper = class(TObject)
  private
    fWebBrowser: TWebBrowser;
  protected
    procedure InternalLoadDocumentFromStream(const Stream: TStream);
    procedure InternalSaveDocumentToStream(const Stream: TStream);
  public
    constructor Create(const WebBrowser: TWebBrowser);
    procedure LoadFromFile(const FileName: string);
    procedure LoadFromStream(const Stream: TStream);
    procedure LoadFromString(const HTML: string); overload;
    function NavigateToLocalFile(const FileName: string): Boolean;
    procedure NavigateToResource(const Module: HMODULE;
      const ResName: PChar; const ResType: PChar = nil); overload;
    procedure NavigateToResource(const ModuleName: string;
      const ResName: PChar; const ResType: PChar = nil); overload;
    procedure NavigateToURL(const URL: string);
    function SaveToString: string;
    procedure SaveToStream(const Stm: TStream); overload;
    procedure SaveToFile(const FileName: string); overload;
    property WebBrowser: TWebBrowser read fWebBrowser;
  end;
Listing 1
The first thing to notice is the WebBrowser property that enables access to the wrapped control. The public methods fall naturally into several groups and, rather than explaining the purpose of each method now, we will look at them in groups. The protected "helper" methods will be discussed along with the public methods they service.
Constructor
The constructor is very simple – it just stores a reference to the TWebBrowser control that the object is wrapping. This control reference is passed as a parameter to the constructor:
constructor TWebBrowserWrapper.Create(const WebBrowser: TWebBrowser);
begin
  inherited Create;
  fWebBrowser := WebBrowser;
end;
Listing 2
Navigation Methods
We have declared three different methods for navigation:
- NavigateToURL – A thin wrapper around the existing TWebBrowser.Navigate method that only deals with standard URLs and intelligently sets the cache and history flags. The main difference between this method and the underlying control's method is that our method blocks until the required document has completely downloaded.
- NavigateToLocalFile – This convenience method simply adds the required file:// protocol to the URL before calling NavigateToURL.
- NavigateToResource – Loads HTML code from the program's (or other module's) resources into the browser. Overloaded versions of this method allow resources to be accessed by instance handle (like the TResourceStream constructors) or by providing the name of the module containing the resources.
NavigateToLocalFile and both versions of NavigateToResource work by creating the required URL from the parameters passed to them and then calling NavigateToURL to do the actual navigation. Let's first look at how NavigateToURL handles the navigation then come back to look at how the other routines put the URLs together.
procedure TWebBrowserWrapper.NavigateToURL(const URL: string);

  // Busy-waits for about ADelay ms while servicing the message queue
  procedure Pause(const ADelay: Cardinal);
  var
    StartTC: Cardinal;
  begin
    StartTC := Windows.GetTickCount;
    repeat
      Application.ProcessMessages;
    until Int64(Windows.GetTickCount) - Int64(StartTC) >= ADelay;
  end;

var
  Flags: OleVariant;
begin
  // Never record the navigation in the history list
  Flags := navNoHistory;
  if AnsiStartsText('res://', URL)
    or AnsiStartsText('file://', URL)
    or AnsiStartsText('about:', URL)
    or AnsiStartsText('javascript:', URL)
    or AnsiStartsText('mailto:', URL) then
    // Local or special URL: neither read from nor write to the cache
    Flags := Flags or navNoReadFromCache or navNoWriteToCache;

  // Navigate, then wait until the document has loaded completely
  WebBrowser.Navigate(URL, Flags);
  while WebBrowser.ReadyState <> READYSTATE_COMPLETE do
    Pause(5);
end;
Listing 3
This method falls into two parts. First we decide whether to use the browser's cache to access the document. The decision is based on whether the document is stored locally or is on the internet: we check the start of the URL string for some known local protocols and similar schemes, and bypass the cache if the URL uses one of them. It would be a simple matter to adapt the method by adding a default parameter that lets the user of the code specify what caching, if any, should take place. This is left as an exercise. The second part of the routine uses the browser object's Navigate method to load the resource into the document. We then loop, waiting for the document to load completely. The local Pause procedure performs a busy wait, processing the message queue for about 5ms at a time.
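As a starting point for that exercise, the cache decision can be separated into a helper function. The following is only a sketch: TCachePolicy and NavigationFlags are hypothetical names that do not appear in the article's code, and the prefix list is copied from Listing 3.

```pascal
// Sketch only: TCachePolicy and NavigationFlags are hypothetical names.
// Needs StrUtils (AnsiStartsText) and SHDocVw (nav* constants) in the
// uses clause of the unit that contains this code.

type
  TCachePolicy = (cpAuto, cpUseCache, cpBypassCache);

// Returns the flags to pass to TWebBrowser.Navigate for the given URL.
// cpAuto reproduces the decision made in Listing 3.
function NavigationFlags(const URL: string;
  const Policy: TCachePolicy = cpAuto): Integer;
const
  // Same prefixes as tested in Listing 3
  cUncachedPrefixes: array[0..4] of string = (
    'res://', 'file://', 'about:', 'javascript:', 'mailto:'
  );
var
  Idx: Integer;
  BypassCache: Boolean;
begin
  BypassCache := Policy = cpBypassCache;
  if Policy = cpAuto then
    for Idx := Low(cUncachedPrefixes) to High(cUncachedPrefixes) do
      if AnsiStartsText(cUncachedPrefixes[Idx], URL) then
        BypassCache := True;
  // History is never recorded, whatever the cache policy
  Result := navNoHistory;
  if BypassCache then
    Result := Result or navNoReadFromCache or navNoWriteToCache;
end;
```

NavigateToURL could then be given the same optional Policy parameter and assign the result of NavigationFlags(URL, Policy) to Flags before calling Navigate.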
Now let us review the specialised navigation methods. The simplest of these is the NavigateToLocalFile method. It checks whether the file exists and, if so, prefixes the given file name with the file:// protocol then calls NavigateToURL. If the file doesn't exist no action is taken. A Boolean value indicating whether the file exists is returned. You may prefer to modify the method to raise an exception when the file does not exist.
function TWebBrowserWrapper.NavigateToLocalFile(
  const FileName: string): Boolean;
begin
  Result := FileExists(FileName);
  if Result then
    NavigateToURL('file://' + FileName);
end;
Listing 4
NavigateToResource is slightly more complicated in that we need to create the required URL using the IE-specific res:// protocol. We discussed this protocol in article #10 where we also developed some functions to return res:// formatted URLs. (We will re-use these functions later.)
Two overloaded methods are provided. Both methods create the required URL for a given module, resource name and an optional resource type. They then call NavigateToURL to do the actual navigation. The overloaded methods vary in the way the module is described. The first method accepts the handle of a loaded module (pass HInstance to access the current program). The second method is simply passed a module name as a string. If the resource type parameter is omitted then it is left out of the URL, in which case the res:// protocol defaults to expecting an RT_HTML (= MakeIntResource(23)) resource. Here are the methods:
procedure TWebBrowserWrapper.NavigateToResource(const Module: HMODULE;
  const ResName, ResType: PChar);
begin
  NavigateToURL(MakeResourceURL(Module, ResName, ResType));
end;

procedure TWebBrowserWrapper.NavigateToResource(const ModuleName: string;
  const ResName, ResType: PChar);
begin
  NavigateToURL(MakeResourceURL(ModuleName, ResName, ResType));
end;
Listing 5
As can be seen, the methods simply rely on overloaded versions of the MakeResourceURL function. The implementation of these functions is described in article #10. In both cases we could improve the methods by checking that the required resources exist and raising an exception or returning false if not. This is left as an exercise (hint: use the Windows FindResource function to check for the resource's existence).
The MakeResourceURL functions are included in the demo source code that accompanies this article.
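The resource check suggested in the hint above might look something like the following sketch. ResourceExists and ResourceExistsInFile are hypothetical helper names, not part of the article's code. A nil resource type is treated as RT_HTML to mirror the default of the res:// protocol, and the constant is declared locally in case the Windows unit of an older compiler does not define it.

```pascal
// Sketch only: hypothetical helpers. Needs Windows in the uses clause.

// Returns True if the loaded module contains the named resource.
function ResourceExists(const Module: HMODULE;
  const ResName: PChar; const ResType: PChar = nil): Boolean;
const
  cRT_HTML = MakeIntResource(23);  // value of RT_HTML
var
  ActualType: PChar;
begin
  if Assigned(ResType) then
    ActualType := ResType
  else
    ActualType := cRT_HTML;
  Result := Windows.FindResource(Module, ResName, ActualType) <> 0;
end;

// As above, but for a module named by file. The module is mapped as a
// data file only, so none of its code is executed.
function ResourceExistsInFile(const ModuleName: string;
  const ResName: PChar; const ResType: PChar = nil): Boolean;
var
  Module: HMODULE;
begin
  Result := False;
  Module := Windows.LoadLibraryEx(
    PChar(ModuleName), 0, LOAD_LIBRARY_AS_DATAFILE
  );
  if Module = 0 then
    Exit;
  try
    Result := ResourceExists(Module, ResName, ResType);
  finally
    Windows.FreeLibrary(Module);
  end;
end;
```

Each NavigateToResource overload could call the matching helper first and raise an exception, or return False, when the resource is missing.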
Document Loading Methods
There are three methods that load new content into an existing document:
- LoadFromString – Replaces the current document with the HTML code stored in a string.
- LoadFromFile – Replaces the current document with the HTML code read from a file.
- LoadFromStream – Replaces the current document with the HTML code read from a stream.
At first sight LoadFromFile is similar to NavigateToLocalFile, but NavigateToLocalFile reads a file into the browser, creating a new document, whereas LoadFromFile requires that a document already exists and replaces its HTML code, i.e. the HTML is changed dynamically.
Both LoadFromString and LoadFromFile simply create a suitable stream and call LoadFromStream, which in turn uses the protected InternalLoadDocumentFromStream method to perform the actual loading of the code. Let's first look at the file and string methods, which are very similar:
procedure TWebBrowserWrapper.LoadFromFile(const FileName: string);
var
  FileStream: TFileStream;
begin
  FileStream := TFileStream.Create(
    FileName, fmOpenRead or fmShareDenyNone
  );
  try
    LoadFromStream(FileStream);
  finally
    FileStream.Free;
  end;
end;

procedure TWebBrowserWrapper.LoadFromString(const HTML: string);
var
  StringStream: TStringStream;
begin
  StringStream := TStringStream.Create(HTML);
  try
    LoadFromStream(StringStream);
  finally
    StringStream.Free;
  end;
end;
Listing 6
As can be seen this is all quite straightforward if you're used to using TStreams. LoadFromFile opens a read-only TFileStream onto the file and passes the stream to LoadFromStream. Similarly LoadFromString uses the very useful TStringStream class to open a stream that can read the HTML code string.
LoadFromStream itself is quite straightforward because it hands most of its work off to InternalLoadDocumentFromStream:
procedure TWebBrowserWrapper.LoadFromStream(const Stream: TStream);
begin
  NavigateToURL('about:blank');
  InternalLoadDocumentFromStream(Stream);
end;
Listing 7
The main thing to note here is that we must ensure there is a document present in TWebBrowser since, as we will see in a moment, we need it in order to load the code from the stream. We do this by navigating to the special about:blank document, which simply creates a blank document in the web browser control. Once we have our blank document we load the stream into it using InternalLoadDocumentFromStream, which is defined below:
procedure TWebBrowserWrapper.InternalLoadDocumentFromStream(
  const Stream: TStream);
var
  PersistStreamInit: IPersistStreamInit;
  StreamAdapter: IStream;
begin
  if not Assigned(WebBrowser.Document) then
    Exit;
  // Get the IPersistStreamInit interface of the document object
  if WebBrowser.Document.QueryInterface(
    IPersistStreamInit, PersistStreamInit
  ) = S_OK then
  begin
    // Clear the document
    if PersistStreamInit.InitNew = S_OK then
    begin
      // Wrap the TStream in an object that implements IStream
      StreamAdapter := TStreamAdapter.Create(Stream);
      // Load the stream's content into the document
      PersistStreamInit.Load(StreamAdapter);
    end;
  end;
end;
Listing 8
And this is where it gets more complicated – for the first time we have to mess around with the COM stuff.
We check that the web browser control's document object is available and bail out if not. We then check to see if the document supports the IPersistStreamInit interface, getting a reference to the supporting object. IPersistStreamInit is used to effectively "clear" the document object (using the interface's InitNew method). If this succeeds we finally load the stream's content into the document by calling the IPersistStreamInit.Load method.
IPersistStreamInit.Load accepts a stream object, but the stream it expects is a COM one that must support the IStream interface. Since TStream does not natively support this interface, we have to find some way to provide it. TStreamAdapter from Delphi's Classes unit comes to the rescue here: this object implements IStream and translates IStream's method calls into equivalent calls onto the TStream object that it wraps. We create the needed TStreamAdapter object by passing a reference to our stream in its constructor. Finally, we pass the adapted stream to IPersistStreamInit.Load, and we're done.
If you're wondering why we don't free StreamAdapter and PersistStreamInit, it's because they are both interface references, and the objects behind them are released automatically at the end of the method by Delphi's built-in interface reference counting. Note that even though the stream adapter object is freed, the underlying TStream object continues to exist, which is what we want.
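The lifetime rule can be checked in isolation. The small console program below is an illustrative sketch rather than part of the article's code: it wraps a string stream in a TStreamAdapter, releases the adapter, and then reads the stream again. It relies on the adapter's default ownership setting (soReference), which leaves the wrapped stream owned by the caller.

```pascal
program StreamAdapterLifetime;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes, ActiveX;

var
  Source: TStringStream;
  Adapter: IStream;
begin
  Source := TStringStream.Create('<p>Hello</p>');
  try
    // Default ownership is soReference: the adapter does not own Source
    Adapter := TStreamAdapter.Create(Source);
    // Releasing the last interface reference destroys the adapter only
    Adapter := nil;
    // Source is still alive and usable here
    WriteLn(Source.DataString);
  finally
    Source.Free;
  end;
end.
```

Passing soOwned as the constructor's second parameter would reverse this behaviour, with the adapter freeing the stream when it is destroyed, which is not what the wrapper class needs.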
Note that the browser control interprets the encoding of the stream and sets the document's character set accordingly. However, the character set can also be specified in HTML code.
Document Saving Methods
There are three methods that are used to save a document's code. They complement the three LoadXXX methods as follows:
- SaveToString – Returns the document contents in a string.
- SaveToFile – Saves the document contents to a specified file.
- SaveToStream – Saves the document contents to a given stream.
Like their LoadXXX counterparts, the SaveToFile and SaveToString methods simply map down onto SaveToStream, after creating suitable output streams. Their operation is quite simple and needs little explanation:
procedure TWebBrowserWrapper.SaveToFile(const FileName: string);
var
  FileStream: TFileStream;
begin
  FileStream := TFileStream.Create(FileName, fmCreate);
  try
    SaveToStream(FileStream);
  finally
    FileStream.Free;
  end;
end;

function TWebBrowserWrapper.SaveToString: string;
var
  StringStream: TStringStream;
begin
  StringStream := TStringStream.Create('');
  try
    SaveToStream(StringStream);
    Result := StringStream.DataString;
  finally
    StringStream.Free;
  end;
end;
Listing 9
The only thing of note in the above methods is the use of TStringStream's DataString property to read out the completed string after writing to the stream.
The SaveToStream method follows. It's as simple as it could be: it just calls InternalSaveDocumentToStream.
procedure TWebBrowserWrapper.SaveToStream(const Stm: TStream);
begin
  InternalSaveDocumentToStream(Stm);
end;
Listing 10
Finally, we get to see how the protected InternalSaveDocumentToStream method is implemented. It interacts with the browser control to save the whole document to a stream.
procedure TWebBrowserWrapper.InternalSaveDocumentToStream(
  const Stream: TStream);
var
  StreamAdapter: IStream;
  PersistStreamInit: IPersistStreamInit;
begin
  if not Assigned(WebBrowser.Document) then
    Exit;
  if WebBrowser.Document.QueryInterface(
    IPersistStreamInit, PersistStreamInit
  ) = S_OK then
  begin
    StreamAdapter := TStreamAdapter.Create(Stream);
    PersistStreamInit.Save(StreamAdapter, True);
  end;
end;
Listing 11
As in InternalLoadDocumentFromStream, we again try to get the web browser document's IPersistStreamInit interface and then use its Save method to write the document to the stream. We also use TStreamAdapter once again to provide the required IStream interface for the TStream.
Note that the browser control writes the stream in the correct encoding for the document.
Stage 2: Adding Unicode Support
Since the browser control supports different character encodings we need to add support for this to our code. For much of this we're going to rely on the encoding support built into Delphi 2009 and later, so get ready for some conditionally defined code.
Thanks to Mauricio Julio for suggesting some of the ideas and code used in this section. Particular thanks are due for the GetStreamEncoding function (see Listing 14) from his TcyComponents pack.
When we discussed requirements we decided we needed to be able to specify an encoding when writing to files and streams and when reading from a string. To handle this we define new overloads of the SaveToStream, SaveToFile and LoadFromString methods.
The second requirement was to provide access to the encoding used for the browser control's current document.
Taking these into account the definition of our TWebBrowserWrapper class becomes:
type
  TWebBrowserWrapper = class(TObject)
  private
    fWebBrowser: TWebBrowser;
  protected
    procedure InternalLoadDocumentFromStream(const Stream: TStream);
    procedure InternalSaveDocumentToStream(const Stream: TStream);
    {$IFDEF UNICODE}
    function GetDocumentEncoding: TEncoding;
    {$ENDIF}
  public
    constructor Create(const WebBrowser: TWebBrowser);
    procedure LoadFromFile(const FileName: string);
    procedure LoadFromStream(const Stream: TStream);
    procedure LoadFromString(const HTML: string); overload;
    {$IFDEF UNICODE}
    procedure LoadFromString(const HTML: string;
      const Encoding: TEncoding); overload;
    {$ENDIF}
    function NavigateToLocalFile(const FileName: string): Boolean;
    procedure NavigateToResource(const Module: HMODULE;
      const ResName: PChar; const ResType: PChar = nil); overload;
    procedure NavigateToResource(const ModuleName: string;
      const ResName: PChar; const ResType: PChar = nil); overload;
    procedure NavigateToURL(const URL: string);
    function SaveToString: string;
    procedure SaveToStream(const Stm: TStream); overload;
    {$IFDEF UNICODE}
    procedure SaveToStream(const Stm: TStream;
      const Encoding: TEncoding); overload;
    {$ENDIF}
    procedure SaveToFile(const FileName: string); overload;
    {$IFDEF UNICODE}
    procedure SaveToFile(const FileName: string;
      const Encoding: TEncoding); overload;
    {$ENDIF}
    property WebBrowser: TWebBrowser read fWebBrowser;
    {$IFDEF UNICODE}
    property Encoding: TEncoding read GetDocumentEncoding;
    {$ENDIF}
  end;
Listing 12
Encoding Property
We provide access to the browser's current document encoding via the read only Encoding property which has a read accessor method named GetDocumentEncoding, defined in the following listing.
{$IFDEF UNICODE}
function TWebBrowserWrapper.GetDocumentEncoding: TEncoding;
var
  Doc: IHTMLDocument2;
  DocStm: TStream;
begin
  Assert(Assigned(WebBrowser.Document));
  Result := TEncoding.Default;
  if WebBrowser.Document.QueryInterface(IHTMLDocument2, Doc) = S_OK then
  begin
    DocStm := TMemoryStream.Create;
    try
      InternalSaveDocumentToStream(DocStm);
      Result := GetStreamEncoding(DocStm);
    finally
      DocStm.Free;
    end;
  end;
end;
{$ENDIF}
Listing 13
To get the document encoding we need to examine the structure of the stream that is generated when the document is saved. We first record the default encoding to return if we can't examine the document for any reason. Once we have a reference to the current document in Doc we create a memory stream object and save the browser content into it by calling InternalSaveDocumentToStream. The resulting stream is then examined by Mauricio Julio's GetStreamEncoding function to get the encoding. Listing 14 shows the implementation of GetStreamEncoding.
{$IFDEF UNICODE}
function GetStreamEncoding(const Stream: TStream): TEncoding;
var
  Bytes: TBytes;
  Size: Int64;
begin
  Stream.Seek(0, soFromBeginning);
  Size := Stream.Size;
  SetLength(Bytes, Size);
  Stream.ReadBuffer(Pointer(Bytes)^, Size);
  Result := nil;
  TEncoding.GetBufferEncoding(Bytes, Result);
end;
{$ENDIF}
Listing 14
This routine simply copies the provided stream into a TBytes array then uses the GetBufferEncoding class method of TEncoding to determine the encoding.
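To see what GetBufferEncoding reports, it can be tried on hand-made byte arrays. The console program below is an illustrative sketch, assuming Delphi 2009 or later. It feeds the method a buffer that starts with the UTF-8 byte order mark and one that has no byte order mark. The detection works from the byte order mark, so data without one is reported as the default encoding.

```pascal
program DetectBOM;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Bytes: TBytes;
  Detected: TEncoding;
  PreambleSize: Integer;
begin
  // UTF-8 byte order mark followed by the ASCII characters "<p>"
  Bytes := TBytes.Create($EF, $BB, $BF, $3C, $70, $3E);
  Detected := nil;  // nil asks GetBufferEncoding to detect the encoding
  PreambleSize := TEncoding.GetBufferEncoding(Bytes, Detected);
  if Detected = TEncoding.UTF8 then
    WriteLn('UTF-8 detected, preamble size: ', PreambleSize);

  // The same text with no byte order mark
  Bytes := TBytes.Create($3C, $70, $3E);
  Detected := nil;
  PreambleSize := TEncoding.GetBufferEncoding(Bytes, Detected);
  if Detected = TEncoding.Default then
    WriteLn('Default encoding reported, preamble size: ', PreambleSize);
end.
```

The function result, which GetStreamEncoding discards, is the size of the byte order mark that was found. This is the same value that Length(Encoding.GetPreamble) yields for the detected encoding.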
Revised Document Loading Methods
As noted already we will provide a new overloaded version of LoadFromString that takes a TEncoding parameter that determines the encoding that will be used to load the string containing the HTML. We will also need to re-implement the original LoadFromString method. Here's the new code:
{$IFDEF UNICODE}
procedure TWebBrowserWrapper.LoadFromString(const HTML: string;
  const Encoding: TEncoding);
var
  HTMLStm: TMemoryStream;
begin
  Assert(Assigned(Encoding));
  HTMLStm := TMemoryStream.Create;
  try
    StringToStreamBOM(HTML, HTMLStm, Encoding);
    HTMLStm.Position := 0;
    LoadFromStream(HTMLStm);
  finally
    HTMLStm.Free;
  end;
end;
{$ENDIF}

procedure TWebBrowserWrapper.LoadFromString(const HTML: string);
{$IFDEF UNICODE}
begin
  LoadFromString(HTML, TEncoding.Default);
end;
{$ELSE}
var
  StringStream: TStringStream;
begin
  StringStream := TStringStream.Create(HTML);
  try
    LoadFromStream(StringStream);
  finally
    StringStream.Free;
  end;
end;
{$ENDIF}
Listing 15
The first thing to note is that, on non-Unicode compilers, the original version of LoadFromString is unchanged. However the Unicode version now calls the new overloaded version of the method, passing the default encoding in the Encoding parameter.
The new, Unicode-only, overloaded method first writes the string to a temporary memory stream, encoded according to the Encoding parameter. The stream is prefixed by any byte order mark required by the encoding. All this work is done by the StringToStreamBOM helper routine that is shown in Listing 16. Once we have the stream we simply call the existing LoadFromStream method to load the stream into the document.
{$IFDEF UNICODE}
procedure StringToStreamBOM(const S: string; const Stm: TStream;
  const Encoding: TEncoding);
var
  Bytes: TBytes;
  Preamble: TBytes;
begin
  Assert(Assigned(Encoding));
  Bytes := Encoding.GetBytes(S);
  Preamble := Encoding.GetPreamble;
  if Length(Preamble) > 0 then
    Stm.WriteBuffer(Preamble[0], Length(Preamble));
  // Guard against an empty string: Bytes[0] is not valid for an empty array
  if Length(Bytes) > 0 then
    Stm.WriteBuffer(Bytes[0], Length(Bytes));
end;
{$ENDIF}
Listing 16
StringToStreamBOM first converts the string into a byte array according to the required encoding. It then writes any required byte order mark (stored in the Preamble variable) to the stream, followed by the byte array.
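To make the effect of the encoding concrete, the following console program prints the preamble and encoded bytes that TEncoding produces for a single non-ASCII character. It is an illustrative sketch, assuming Delphi 2009 or later. BytesToHex is a hypothetical helper written for this example.

```pascal
program EncodeWithBOM;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Formats a byte array as space-separated hex pairs (hypothetical helper)
function BytesToHex(const Bytes: TBytes): string;
var
  B: Byte;
begin
  Result := '';
  for B in Bytes do
    Result := Result + IntToHex(B, 2) + ' ';
  Result := TrimRight(Result);
end;

var
  S: string;
begin
  S := #$00E9;  // U+00E9, LATIN SMALL LETTER E WITH ACUTE
  WriteLn('UTF-8 preamble:  ', BytesToHex(TEncoding.UTF8.GetPreamble));
  WriteLn('UTF-8 bytes:     ', BytesToHex(TEncoding.UTF8.GetBytes(S)));
  WriteLn('UTF-16 preamble: ', BytesToHex(TEncoding.Unicode.GetPreamble));
  WriteLn('UTF-16 bytes:    ', BytesToHex(TEncoding.Unicode.GetBytes(S)));
end.
```

For UTF-8 the preamble is EF BB BF and the character encodes as C3 A9. For little-endian UTF-16 (TEncoding.Unicode) the preamble is FF FE and the character encodes as E9 00. StringToStreamBOM writes the preamble followed by the encoded bytes.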
Revised Document Saving Methods
In addition to providing new overloaded versions of SaveToStream and SaveToFile we must re-implement SaveToString when compiling with Unicode compilers to take account of the browser document's encoding.
We will first look at the revised SaveToString method before discussing the new overloaded methods. Here is the new implementation of SaveToString.
function TWebBrowserWrapper.SaveToString: string;
{$IFDEF UNICODE}
var
  MS: TMemoryStream;
  Encoding: TEncoding;
  Bytes: TBytes;
begin
  MS := TMemoryStream.Create;
  try
    SaveToStream(MS);
    // Detect the stream's encoding, then decode what follows any BOM
    Encoding := GetStreamEncoding(MS);
    MS.Position := Length(Encoding.GetPreamble);
    SetLength(Bytes, MS.Size - MS.Position);
    // Guard against an empty document: Bytes[0] is not valid if empty
    if Length(Bytes) > 0 then
      MS.ReadBuffer(Bytes[0], Length(Bytes));
    Result := Encoding.GetString(Bytes);
  finally
    MS.Free;
  end;
end;
{$ELSE}
var
  StringStream: TStringStream;
begin
  StringStream := TStringStream.Create('');
  try
    SaveToStream(StringStream);
    Result := StringStream.DataString;
  finally
    StringStream.Free;
  end;
end;
{$ENDIF}
Listing 17
Just like with LoadFromString the non-Unicode version of SaveToString remains unchanged.
The Unicode version of the method writes the browser's document into a temporary memory stream. This is done because we need to interpret the output stream according to its encoding. So we call GetStreamEncoding to find the encoding used to generate the stream. Next we set the memory stream's position to skip over any byte order mark (preamble). The remainder of the stream is then copied into a byte array that we can pass to the GetString method of TEncoding which, in turn, returns the required string.
Having disposed of the most complex method we now turn to the new overloaded versions of SaveToFile and SaveToStream:
{$IFDEF UNICODE}
procedure TWebBrowserWrapper.SaveToFile(const FileName: string;
  const Encoding: TEncoding);
var
  FileStream: TFileStream;
begin
  FileStream := TFileStream.Create(FileName, fmCreate);
  try
    SaveToStream(FileStream, Encoding);
  finally
    FileStream.Free;
  end;
end;
{$ENDIF}

{$IFDEF UNICODE}
procedure TWebBrowserWrapper.SaveToStream(const Stm: TStream;
  const Encoding: TEncoding);
var
  HTML: string;
begin
  HTML := SaveToString;
  StringToStreamBOM(HTML, Stm, Encoding);
end;
{$ENDIF}
Listing 18
SaveToFile is very similar to the version that does not take an Encoding. It creates a stream onto the file then passes the stream and the encoding to the overloaded version of SaveToStream.
In contrast the overload of SaveToStream is very different. First it calls SaveToString to retrieve a string containing the document's content as HTML. It does this because SaveToString handles the document encoding correctly. We then save the string to the stream using the encoding provided in the Encoding parameter, prefixed by any required byte order mark.
If Encoding is the same as the document's encoding then calling this method is wasteful. You should call the SaveToStream overload that does not take an Encoding parameter instead.