How to dynamically add data to an executable file
Introduction
In article #2 we discussed how it is often useful to distribute data that is embedded in a program's executable file. That article solved the problem by writing the data to resource files and linking them into the program at compile time. This article solves the same problem in a different way – by appending the data to the executable file. This has the advantage that the data doesn't need to be linked in at compile time and can be added or updated dynamically. Data can be attached to an already compiled file. The only disadvantage of this method is that it's slightly harder to read the data at run time than it is when using resources.
A typical use for this technique would be in creating install programs. We have the actual installer as a program stub and append the files to be installed to the end of the stub. The installer stub would contain code to extract the files.
Overview
The Windows PE file format permits additional data to be appended to the executable file. This data is ignored by the Windows program loader, so we can use the data for our own purposes without affecting the program code. For the purposes of this article let's call this data the payload.
The problem to be solved is how to denote that an executable file has a payload and how to find out what size it is. We must be able to do this without modifying the executable portion of the file. Following Daniel Polistchuck we will use a special record to identify the payload. This record will follow the payload data.
So, an executable file that contains a payload has three main components (in order):
- The original executable file.
- The payload data.
- A footer record that identifies that a payload is present and records the size of the executable code and the payload.
Our task, then, is to produce code that can create, modify, read, delete and check for the existence of a payload. To enable this we must be able to detect, read and update the payload footer.
We begin our discussion by investigating how to handle this footer.
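As a rough illustration of the three-part layout, the following sketch builds a scratch file and reads its footer back. It is a sketch under assumptions, not a definitive implementation. The field names ExeSize and DataSize and the routine names InitFooter and ReadFooter are the ones used later in this article, but the WaterMark field, its value, the routine bodies and the file name are hypothetical.

```pascal
program FooterSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  cFileName = 'footer_sketch.bin';     // hypothetical scratch file
  cWaterMark: LongWord = $50594C44;    // hypothetical marker value

type
  // Assumed layout: only ExeSize and DataSize are named in the article text.
  TPayloadFooter = packed record
    WaterMark: LongWord;  // identifies the record as a payload footer
    ExeSize: LongInt;     // size of the original executable in bytes
    DataSize: LongInt;    // size of the payload in bytes
  end;

procedure InitFooter(out Footer: TPayloadFooter);
begin
  FillChar(Footer, SizeOf(Footer), 0);
  Footer.WaterMark := cWaterMark;
end;

// Reads the last SizeOf(TPayloadFooter) bytes of an open untyped file
// (record size 1) and reports whether they carry the watermark.
function ReadFooter(var F: File; out Footer: TPayloadFooter): Boolean;
var
  Len: Integer;
begin
  Result := False;
  Len := FileSize(F);
  if Len < SizeOf(Footer) then
    Exit;
  Seek(F, Len - SizeOf(Footer));
  BlockRead(F, Footer, SizeOf(Footer));
  Result := Footer.WaterMark = cWaterMark;
end;

var
  F: File;
  Footer: TPayloadFooter;
  Stub: array[0..9] of Byte;
  Payload: AnsiString;
begin
  FillChar(Stub, SizeOf(Stub), 0);
  Payload := 'hello';
  AssignFile(F, cFileName);
  Rewrite(F, 1);
  try
    BlockWrite(F, Stub, SizeOf(Stub));           // 1. stand-in for the executable
    BlockWrite(F, Payload[1], Length(Payload));  // 2. payload data
    InitFooter(Footer);
    Footer.ExeSize := SizeOf(Stub);
    Footer.DataSize := Length(Payload);
    BlockWrite(F, Footer, SizeOf(Footer));       // 3. footer record
  finally
    CloseFile(F);
  end;

  Reset(F, 1);  // reopen the same file to read the footer back
  try
    if ReadFooter(F, Footer) then
      WriteLn('Payload of ', Footer.DataSize, ' bytes starts at offset ',
        Footer.ExeSize);
  finally
    CloseFile(F);
  end;
  DeleteFile(cFileName);
end.
```

Because the footer sits at a fixed distance from the end of the file, it can be found without knowing anything about the executable that precedes it.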
Payload management class
In this section we will develop a class that lets us manage payloads. Let us first define the requirements of the class. They are:
- To check if an executable file contains a payload.
- To find the size of the payload data.
- To extract the payload from a file into a suitably sized buffer.
- To delete a payload from a file.
- To store payload data in a file.
The class is declared as follows:
type
  TPayload = class(TObject)
  private
    fFileName: string;
    fOldFileMode: Integer;
    fFile: File;
    procedure Open(Mode: Integer);
    procedure Close;
  public
    constructor Create(const FileName: string);
    function HasPayload: Boolean;
    function PayloadSize: Integer;
    procedure SetPayload(const Data; const DataSize: Integer);
    procedure GetPayload(var Data);
    procedure RemovePayload;
  end;
Listing 4
The public methods are:
- Create – Creates an object to work on a named file.
- HasPayload – Returns true if the file contains a payload.
- PayloadSize – Returns the size of the payload. This information is required when allocating a buffer into which payload data can be read using GetPayload.
- SetPayload – Copies a specified number of bytes from a buffer and stores them as a payload at the end of the file. Overwrites any existing payload.
- GetPayload – Copies the file's payload into a given buffer. The buffer must be large enough to store all the payload. The required buffer size is given by PayloadSize.
- RemovePayload – Deletes any payload from the file and removes the footer record. This method restores the file to its original condition.
In addition there are two private helper methods:
- Open – Opens the file in a specified mode.
- Close – Closes the file and restores the original file mode.
The class also has three fields:
- fFileName – Stores the name of the file we are manipulating.
- fOldFileMode – Preserves the current Pascal file mode.
- fFile – Pascal file descriptor that records the details of an open file.
As can be seen from the discussion of the fields we will be using standard un-typed Pascal file routines to manipulate the file.
We will discuss the implementation of the class in several chunks. We begin in Listing 5 where we look at the constructor, some required constants and the two private helper methods.
const
  // Pascal file modes used when opening the file
  cReadOnlyMode = 0;
  cReadWriteMode = 2;

constructor TPayload.Create(const FileName: string);
begin
  // Record the name of the file we will work on
  inherited Create;
  fFileName := FileName;
end;

procedure TPayload.Open(Mode: Integer);
begin
  // Save the current file mode, then open the file with a record size of 1
  fOldFileMode := FileMode;
  AssignFile(fFile, fFileName);
  FileMode := Mode;
  Reset(fFile, 1);
end;

procedure TPayload.Close;
begin
  // Close the file and restore the saved file mode
  CloseFile(fFile);
  FileMode := fOldFileMode;
end;
Listing 5
The two constants define the two Pascal file modes we will require. The constructor simply records the name of the file associated with the class. The Open method first stores the current file mode then opens the file using the required file mode. Finally, Close closes the file and restores the original file mode.
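One thing to be aware of is that FileMode is a global variable, so it stays altered if Reset raises an exception inside Open, because the calling methods only enter their try blocks after Open returns. The following is a minimal sketch of a more defensive variant, not the article's code. OpenUntyped and the file name are hypothetical. It relies on FileMode being consulted only by Reset, so the old value can be put back as soon as Reset has finished.

```pascal
program FileModeSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  cReadOnlyMode = 0;

// Hypothetical helper: opens FileName as an untyped file with a record size
// of 1 and restores the global FileMode even when Reset fails.
procedure OpenUntyped(var F: File; const FileName: string; Mode: Integer);
var
  OldMode: Integer;
begin
  OldMode := FileMode;
  AssignFile(F, FileName);
  FileMode := Mode;
  try
    Reset(F, 1);
  finally
    FileMode := OldMode;
  end;
end;

var
  F: File;
  ModeBefore: Integer;
begin
  ModeBefore := FileMode;
  try
    // The file is assumed not to exist, so Reset raises EInOutError.
    OpenUntyped(F, 'no_such_file.bin', cReadOnlyMode);
    CloseFile(F);
  except
    on E: EInOutError do
      WriteLn('Open failed: ', E.Message);
  end;
  if FileMode = ModeBefore then
    WriteLn('FileMode was restored.');
end.
```

With this arrangement a Close routine would only need to call CloseFile, since there is no longer a saved mode to restore.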
We next consider the two methods that provide information about a file's payload – PayloadSize and HasPayload:
function TPayload.PayloadSize: Integer;
var
  Footer: TPayloadFooter;
begin
  // Assume there is no payload
  Result := 0;
  Open(cReadOnlyMode);
  try
    // A valid footer tells us the payload size
    if ReadFooter(fFile, Footer) then
      Result := Footer.DataSize;
  finally
    Close;
  end;
end;

function TPayload.HasPayload: Boolean;
begin
  // A payload is present if its size is greater than zero
  Result := PayloadSize > 0;
end;
Listing 6
The only method here of any substance is PayloadSize. We first assume a payload size of zero in case there is no payload. Next we open the file in read mode and attempt to read the footer. The ReadFooter helper routine is used to do this. If the footer is read successfully we get the size of the payload from the footer record's DataSize field. The file is then closed.
HasPayload simply calls PayloadSize and checks if the payload size it returns is greater than zero.
Now we move on to consider GetPayload which is described in Listing 7. This method's Data parameter is a data buffer which must have a size of at least PayloadSize bytes.
procedure TPayload.GetPayload(var Data);
var
  Footer: TPayloadFooter;
begin
  Open(cReadOnlyMode);
  try
    // Proceed only if there is a footer and the payload is not empty
    if ReadFooter(fFile, Footer) and (Footer.DataSize > 0) then
    begin
      // The payload starts where the executable ends
      Seek(fFile, Footer.ExeSize);
      BlockRead(fFile, Data, Footer.DataSize);
    end;
  finally
    Close;
  end;
end;
Listing 7
GetPayload opens the file in read only mode and tries to read the footer record. If we succeed in reading a footer and the payload contains data we move the file pointer to the start of the payload, then read the payload into the Data buffer. Note that we use the footer record's ExeSize field to perform the seek operation and the DataSize field to determine how many bytes to read. The method ends by closing the file.
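Because Data is an untyped var parameter, the caller decides what kind of buffer to supply. A point that is easy to get wrong is that a dynamic array or long string must be passed by its first element, since the variable itself is only a reference to the data. The sketch below shows the same Seek and BlockRead pattern on a scratch file. ReadRegion, TByteBuffer and the file name are hypothetical, and TPayload itself is not used so that the example stands alone.

```pascal
program BufferSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TByteBuffer = array of Byte;

// Hypothetical helper: reads Size bytes starting at Offset into Data,
// mirroring the Seek and BlockRead calls made by GetPayload.
procedure ReadRegion(const FileName: string; Offset, Size: Integer; var Data);
var
  F: File;
  OldMode: Integer;
begin
  OldMode := FileMode;
  AssignFile(F, FileName);
  FileMode := 0;  // read only
  try
    Reset(F, 1);
  finally
    FileMode := OldMode;
  end;
  try
    Seek(F, Offset);
    BlockRead(F, Data, Size);
  finally
    CloseFile(F);
  end;
end;

const
  cFileName = 'buffer_sketch.bin';
  cContent: AnsiString = 'EXECODEpayload';
var
  F: File;
  Buf: TByteBuffer;
  I: Integer;
begin
  // Scratch file: 7 bytes standing in for code, followed by 7 bytes of data.
  AssignFile(F, cFileName);
  Rewrite(F, 1);
  BlockWrite(F, cContent[1], Length(cContent));
  CloseFile(F);

  SetLength(Buf, 7);
  ReadRegion(cFileName, 7, Length(Buf), Buf[0]);  // Buf[0], not Buf
  for I := 0 to High(Buf) do
    Write(Chr(Buf[I]));
  WriteLn;  // the line printed is: payload
  DeleteFile(cFileName);
end.
```

With TPayload the equivalent steps would be to size the buffer with SetLength(Buf, PayloadSize) and then call GetPayload(Buf[0]).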
Finally we examine the implementation of the two methods that modify the file – RemovePayload and SetPayload.
procedure TPayload.RemovePayload;
var
  PLSize: Integer;
  FileLen: Integer;
begin
  // There is nothing to do unless a payload is present
  PLSize := PayloadSize;
  if PLSize > 0 then
  begin
    Open(cReadWriteMode);
    try
      // FileSize is called inside the try block so the file is always closed
      FileLen := FileSize(fFile);
      // Seek to the end of the executable, then cut off payload and footer
      Seek(fFile, FileLen - PLSize - SizeOf(TPayloadFooter));
      Truncate(fFile);
    finally
      Close;
    end;
  end;
end;

procedure TPayload.SetPayload(const Data; const DataSize: Integer);
var
  Footer: TPayloadFooter;
begin
  // Remove any existing payload so the file holds only the executable
  RemovePayload;
  if DataSize > 0 then
  begin
    Open(cReadWriteMode);
    try
      // Record the sizes of the executable and the payload in the footer
      InitFooter(Footer);
      Footer.ExeSize := FileSize(fFile);
      Footer.DataSize := DataSize;
      // Append the payload followed by the footer
      Seek(fFile, Footer.ExeSize);
      BlockWrite(fFile, Data, DataSize);
      BlockWrite(fFile, Footer, SizeOf(Footer));
    finally
      Close;
    end;
  end;
end;
Listing 8
RemovePayload checks the existing payload's size and proceeds only if a payload is present. If so the file is opened for writing and its size is noted. We then seek to the end of the executable part of the file and truncate it before closing the file. We have calculated the end of the executable section by deducting the payload size and the size of the footer record from the file length. We could also have read the footer and simply used the value of its ExeSize field.
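The offset arithmetic can be tried out on a scratch file. The following sketch is illustrative only: the file name and the three sizes are made up, and 12 is merely an assumed footer size. It writes a block of bytes standing in for executable, payload and footer, then applies the same Seek and Truncate calls as RemovePayload.

```pascal
program TruncateSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  cFileName = 'truncate_sketch.bin';  // hypothetical scratch file
  cExeSize = 10;      // made-up size of the executable part
  cPayloadSize = 5;   // made-up payload size
  cFooterSize = 12;   // assumed value of SizeOf(TPayloadFooter)

var
  F: File;
  Junk: array[0..cExeSize + cPayloadSize + cFooterSize - 1] of Byte;
  FileLen: Integer;
begin
  FillChar(Junk, SizeOf(Junk), 0);
  AssignFile(F, cFileName);
  Rewrite(F, 1);
  try
    BlockWrite(F, Junk, SizeOf(Junk));
    FileLen := FileSize(F);
    WriteLn('Before: ', FileLen, ' bytes');      // 27
    // End of executable = file length - payload size - footer size
    Seek(F, FileLen - cPayloadSize - cFooterSize);
    Truncate(F);  // discards everything from the file pointer onwards
    WriteLn('After: ', FileSize(F), ' bytes');   // 10
  finally
    CloseFile(F);
  end;
  DeleteFile(cFileName);
end.
```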
SetPayload takes two parameters: a data buffer (Data) and the size of the buffer (DataSize). The method begins by using RemovePayload to remove any existing payload, ensuring that the file contains only the executable code. If there is data to store (DataSize is greater than zero) we open the file for writing. A new footer record is then initialized using the InitFooter helper routine, and the sizes of both the executable file and the new payload are stored in it. Finally we append the payload data and the footer record to the file before closing it.
Now we have created the TPayload class it is easy to manipulate payload data. Unfortunately we must read and write the whole of the payload at once, which is not always convenient. An improvement would be to enable random access to the data. That's what we do next.
Random payload access
The "Delphi way" of providing random access to data is to derive a class from TStream and to override its abstract methods – and this is what we will do here. Our new class will be called TPayloadStream. It will detect payload data and provide read / write random access to it.
Not only does this approach provide random access but it also has the added advantage of hiding the details of how the payload is implemented from the user of the class. All the user sees is the familiar TStream interface while all the gory details are hidden in TPayloadStream's implementation.
Listing 9 shows the definition of the new class, along with an enumeration – TPayloadOpenMode – that is used to determine whether a payload stream object is to read or write the payload data. Note that in addition to overriding TStream's abstract methods, TPayloadStream also overrides the virtual SetSize method to enable the user to change the size of the payload. This is necessary because, by default, SetSize does nothing.
type
  TPayloadOpenMode = (
    pomRead,
    pomWrite
  );

  TPayloadStream = class(TStream)
  private
    fMode: TPayloadOpenMode;
    fOldFileMode: Integer;
    fFile: File;
    fDataStart: Integer;
    fDataSize: Integer;
  public
    constructor Create(const FileName: string;
      const Mode: TPayloadOpenMode);
    destructor Destroy; override;
    function Seek(Offset: LongInt; Origin: Word): LongInt; override;
    procedure SetSize(NewSize: LongInt); override;
    function Read(var Buffer; Count: LongInt): LongInt; override;
    function Write(const Buffer; Count: LongInt): LongInt; override;
  end;
Listing 9
The public methods of TPayloadStream are:
- Create – Creates a TPayloadStream and opens the named file either in read or write mode.
- Destroy – Updates the payload footer, closes the file and destroys the stream object.
- Seek – Moves the stream's pointer to the specified position in the payload data, ensuring that the pointer remains within the payload.
- SetSize – Reduces the size of the payload, in write mode only. Requests to enlarge the payload are ignored. Raises an exception when used in read mode. Note that setting the size to zero will remove the payload and the associated footer record.
- Read – Attempts to read a specified number of bytes into a buffer. If there is insufficient data in the payload then just the remaining bytes are read.
- Write – Writes a specified number of bytes from a buffer to the payload, extending the payload if required. Works only in write mode – an exception is raised in read mode.
The class also uses the following private fields:
- fMode – Records whether the stream is open for reading or writing.
- fOldFileMode – Preserves the current Pascal file mode.
- fFile – Pascal file descriptor that records the details of an open file.
- fDataStart – Offset of the start of payload data from the start of the executable file.
- fDataSize – Size of payload data.
We begin our review of the class's implementation by examining Listing 10, which shows the constructor and destructor. Once again we are using classic Pascal un-typed files to perform the underlying physical access to the executable file. However this could easily be changed to use some other file access techniques.
constructor TPayloadStream.Create(const FileName: string;
  const Mode: TPayloadOpenMode);
var
  Footer: TPayloadFooter;
begin
  inherited Create;
  // Open the file in the required mode, saving the current file mode
  fMode := Mode;
  fOldFileMode := FileMode;
  AssignFile(fFile, FileName);
  case fMode of
    pomRead: FileMode := 0;
    pomWrite: FileMode := 2;
  end;
  Reset(fFile, 1);
  // Look for an existing payload
  if ReadFooter(fFile, Footer) then
  begin
    // Payload found: record where it starts and how big it is
    fDataStart := Footer.ExeSize;
    fDataSize := Footer.DataSize;
  end
  else
  begin
    // No payload: any new data will start at the end of the file
    fDataStart := FileSize(fFile);
    fDataSize := 0;
  end;
  // Read from the start of the payload, write from its end
  case fMode of
    pomRead: System.Seek(fFile, fDataStart);
    pomWrite: System.Seek(fFile, fDataStart + fDataSize);
  end;
end;

destructor TPayloadStream.Destroy;
var
  Footer: TPayloadFooter;
begin
  if fMode = pomWrite then
  begin
    if fDataSize > 0 then
    begin
      // Cut off anything beyond the payload, then write a fresh footer
      InitFooter(Footer);
      Footer.ExeSize := fDataStart;
      Footer.DataSize := fDataSize;
      System.Seek(fFile, fDataStart + fDataSize);
      Truncate(fFile);
      BlockWrite(fFile, Footer, SizeOf(Footer));
    end
    else
    begin
      // No payload: leave only the executable code
      System.Seek(fFile, fDataStart);
      Truncate(fFile);
    end;
  end;
  // Close the file and restore the saved file mode
  CloseFile(fFile);
  FileMode := fOldFileMode;
  inherited;
end;
Listing 10
In the constructor the first thing we do is to record the open mode and then open the underlying file in the required mode. Next we try to read a payload footer record, using the ReadFooter function we developed in Listing 3.
If we have found a footer we get the start of the payload data (fDataStart) and the size of the payload (fDataSize) from the footer's ExeSize and DataSize fields respectively. If there is no footer record then we have no payload, so we set fDataStart to refer to just beyond the end of the file and set fDataSize to zero.
Setting fDataStart
fDataStart is the same as the size of the executable file because payloads always start immediately after the executable code.
Finally the constructor sets the file pointer according to the file mode – in read mode we set it to the start of the payload while in write mode we set it to the end.
In the destructor we proceed differently according to whether we are in read mode or write mode:
- In read mode all there is to do is to close the file and restore the previous Pascal file mode.
- In write mode we first check if we actually have a payload (fDataSize > 0). If so, we create a footer record using the InitFooter routine we defined in Listing 2 and record the payload size and start position in the record. We then seek to the end of the new payload data, truncate any data that falls beyond its end (as will be the case when the data size has shrunk), then write the footer. If there is no payload data we truncate the file at the end of the executable code. Finally we close the file and restore the file mode.
That completes the discussion of the class constructor and destructor. Let us now consider how we override the abstract Seek, Read and Write methods. Listing 11 has the details:
function TPayloadStream.Seek(Offset: LongInt;
  Origin: Word): LongInt;
begin
  // Calculate the new position relative to the start of the payload
  case Origin of
    soFromBeginning:
      // Offset is measured forwards from the start of the payload
      if Offset >= 0 then
        Result := Offset
      else
        Result := 0;
    soFromEnd:
      // Offset is measured backwards from the end of the payload
      if Offset <= 0 then
        Result := fDataSize + Offset
      else
        Result := fDataSize;
    else
      // soFromCurrent: Offset is relative to the current position
      Result := FilePos(fFile) - fDataStart + Offset;
  end;
  // Keep the position within the payload
  if Result < 0 then
    Result := 0;
  if Result > fDataSize then
    Result := fDataSize;
  // Move the underlying file pointer
  System.Seek(fFile, fDataStart + Result);
end;

function TPayloadStream.Read(var Buffer;
  Count: LongInt): LongInt;
var
  BytesRead: Integer;
  AvailBytes: Integer;
begin
  // Never read beyond the end of the payload data
  AvailBytes := fDataSize - Position;
  if AvailBytes < Count then
    Count := AvailBytes;
  BlockRead(fFile, Buffer, Count, BytesRead);
  Result := BytesRead;
end;

function TPayloadStream.Write(const Buffer;
  Count: LongInt): LongInt;
var
  BytesWritten: Integer;
  Pos: Integer;
begin
  // Writing is only permitted in write mode
  if fMode <> pomWrite then
    raise EPayloadStream.Create(
      'TPayloadStream can''t write in read mode.');
  BlockWrite(fFile, Buffer, Count, BytesWritten);
  Result := BytesWritten;
  // Record the new payload size if we wrote past the old end
  Pos := FilePos(fFile);
  if Pos - fDataStart > fDataSize then
    fDataSize := Pos - fDataStart;
end;
Listing 11
Seek is the most complicated of the three methods. This is because the FilePos and Seek routines (that we use to get and set the file pointer) operate on the whole file, while our stream positions must be relative to the start of the payload data. We must also ensure that the file pointer cannot be set outside the payload data. The case statement contains the code that calculates the required offset within the payload, depending on the seek origin. The two lines following the case statement constrain the offset within the payload data. Finally we perform the actual seek operation on the underlying file, offset from the start of the payload data. The method returns the new offset relative to the payload.
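The position arithmetic is easier to follow when separated from the file access. The sketch below restates the case statement and the two constraining lines as a pure function. PayloadPos and its parameter names are invented for this illustration. Current stands for the pointer's existing offset within the payload, which Seek obtains from FilePos(fFile) - fDataStart. The checks made inside the individual case branches are omitted here because the final clamp produces the same results.

```pascal
program SeekSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;  // declares soFromBeginning, soFromCurrent and soFromEnd

// Hypothetical helper: returns the new offset within a payload of DataSize
// bytes, given the current offset and a seek request.
function PayloadPos(Offset: LongInt; Origin: Word;
  Current, DataSize: LongInt): LongInt;
begin
  case Origin of
    soFromBeginning: Result := Offset;
    soFromEnd: Result := DataSize + Offset;
  else
    Result := Current + Offset;  // soFromCurrent
  end;
  // Keep the result inside the payload: 0 <= Result <= DataSize
  if Result < 0 then
    Result := 0;
  if Result > DataSize then
    Result := DataSize;
end;

begin
  // With a 100 byte payload and the pointer at offset 40:
  WriteLn(PayloadPos(25, soFromBeginning, 40, 100));   // 25
  WriteLn(PayloadPos(-10, soFromEnd, 40, 100));        // 90
  WriteLn(PayloadPos(30, soFromCurrent, 40, 100));     // 70
  WriteLn(PayloadPos(500, soFromBeginning, 40, 100));  // 100 (clamped)
  WriteLn(PayloadPos(-50, soFromCurrent, 40, 100));    // 0 (clamped)
end.
```

The value passed to System.Seek is then fDataStart plus this result, which is why the file pointer can never stray into the executable code or beyond the payload.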
The Read method must ensure that the read falls wholly within the payload data. We can't assume that all the remaining bytes in the stream can be read, because the payload may be followed by a footer record that is not part of the data. Therefore we calculate the number of available bytes by subtracting the current position in the payload from the size of the payload data. If there is insufficient data to meet the request, the number of bytes to be read is reduced to the number of available bytes.
Note that Read uses TStream's Position property to get the current position in the payload data. This property calls the Seek method which, as we have seen, ensures that the position returned falls within the payload data.
Write is quite simple – we just check we are in write mode and output the data to the underlying file at the current position if so. The number of bytes written is returned. The only complication is that we must check if the write operation took us beyond the end of the current data and record the new data size if so. Should the stream be in read mode then Write raises an exception.
All that remains to do now is to override the SetSize method. Listing 12 provides the implementation.
procedure TPayloadStream.SetSize(NewSize: LongInt);
var
  Pos: Integer;
begin
  // The size can only be changed in write mode
  if fMode <> pomWrite then
    raise EPayloadStream.Create(
      'TPayloadStream can''t change size in read mode.');
  // Only shrinking is supported
  if NewSize < fDataSize then
  begin
    Pos := Position;
    fDataSize := NewSize;
    // Make sure the position is not beyond the new end of the data
    if Pos > fDataSize then
      Position := fDataSize;
  end;
end;
Listing 12
Obviously, we can't change the stream size in read mode, so we raise an exception in this case. In write mode we only record the new size if it is less than the current payload size. In this case we must also check if the current stream position falls beyond the end of the reduced payload and move the position to the end of the truncated data if so. The Position property is used to get and set the stream position. As noted earlier, this property calls our overridden Seek method.
Why can't SetSize increase the payload size?
Prohibiting SetSize from extending the payload data is a design decision I took, because enlarging the data leaves the problem of having to write padding bytes to the payload. What should those bytes be? Zeros? Random data? I think it's reasonable to assume that the payload should only be extended by explicitly appending data to it using the Write method.
If your view is different, then I leave implementation to you as an exercise!
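For readers who do want SetSize to grow the payload, one possible approach is to pad with zeros. The sketch below is only an illustration of that idea and is not part of TPayloadStream. WriteZeros and the file name are hypothetical. The helper writes Count zero bytes at the current position of an untyped file, working in fixed-size chunks so that a large extension does not need a large buffer.

```pascal
program PadSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: writes Count zero bytes at the file pointer.
procedure WriteZeros(var F: File; Count: Integer);
var
  Zeros: array[0..511] of Byte;
  Chunk: Integer;
begin
  FillChar(Zeros, SizeOf(Zeros), 0);
  while Count > 0 do
  begin
    // Write at most one buffer's worth per pass
    Chunk := Count;
    if Chunk > SizeOf(Zeros) then
      Chunk := SizeOf(Zeros);
    BlockWrite(F, Zeros, Chunk);
    Dec(Count, Chunk);
  end;
end;

const
  cFileName = 'pad_sketch.bin';
var
  F: File;
  Data: AnsiString;
begin
  Data := 'abc';
  AssignFile(F, cFileName);
  Rewrite(F, 1);
  try
    BlockWrite(F, Data[1], Length(Data));
    WriteZeros(F, 1000);             // grow the data by 1000 bytes
    WriteLn(FileSize(F), ' bytes');  // 3 + 1000
  finally
    CloseFile(F);
  end;
  DeleteFile(cFileName);
end.
```

A SetSize override built on this idea would presumably seek to fDataStart + fDataSize, write NewSize - fDataSize zeros, update fDataSize and then restore the previous stream position.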
This completes our presentation of the TPayloadStream class.
Demo code
A demo program to accompany this article can be found in the delphidabbler/article-demos Git repository on GitHub.
You can view the code in the article-07 sub-directory. Alternatively download a zip file containing all the demos by going to the repository's landing page and clicking the Clone or download button and selecting Download ZIP.
See the demo's README.md file for details.
The demo does not currently contain the source code for, or an example of using, TPayloadStream.
This source code is merely a proof of concept and is intended only to illustrate this article. It is not designed for use in its current form in finished applications. The code is provided on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or implied.
The demo is open source. See the demo's LICENSE.md file for licensing details.
Feedback
I hope you found this article useful.
If you have any observations, comments, or have found any errors there are two places you can report them.
- For anything to do with the article content, but not the downloadable demo code, please use this website's Issues page on GitHub. Make sure you mention that the issue relates to "article #7".
- For bugs in the demo code see the article-demo project's README.md file for details of how to report them.