MPQ Archives - Incremental MPQ patches

Introduction

Incremental patch files were first observed in the beta version of World of Warcraft - Cataclysm. They are stored as regular files in the MPQ, with the MPQ_FILE_PATCH_FILE bit set in their block table entry. They can be stored either split into file sectors or as single-unit files. Patch files differ from regular MPQ files in the following ways:

The flags in the block table entry contain the MPQ_FILE_PATCH_FILE bit (0x00100000).
The file size in the block table is the size of the patched file, not the size of the patch.
At the beginning of the file data in the MPQ, there is a structure called TPatchInfo.
TPatchInfo is followed by the sector offset table, unless the file is stored as a single unit.
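The first two points can be sketched as a small check on a block table entry. This is only an illustration: BlockEntryView and is_patch_file are hypothetical names, and the struct is a reduced view rather than the real block table layout. Only the flag value is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Flag value quoted in the list above
#define MPQ_FILE_PATCH_FILE 0x00100000u

// Hypothetical, reduced view of a block table entry (not the on-disk layout)
typedef struct {
    uint32_t file_size;   // For patch files: size of the patched file, not of the patch
    uint32_t flags;       // Block table flags
} BlockEntryView;

// Returns 1 when the entry describes an incremental patch file, 0 otherwise
static int is_patch_file(const BlockEntryView *entry)
{
    assert(entry != NULL);
    return (entry->flags & MPQ_FILE_PATCH_FILE) != 0;
}
```

A reader that sees this bit set should not use file_size to size the buffer for the patch itself; the size of the patch comes from TPatchInfo.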

TPatchInfo structure

This structure contains the size of the patch, flags, and the MD5 of the patch. It can be defined as the following C structure:

// Patch file header
struct TPatchInfo
{
    DWORD dwLength;                     // Length of patch info header, in bytes
    DWORD dwFlags;                      // Flags. 0x80000000 = MD5 (?)
    DWORD dwDataSize;                   // Uncompressed size of the patch file
    BYTE  md5[0x10];                    // MD5 of the entire patch file after decompression

    // Followed by the sector offset table (variable length)
};

TPatchInfo is part of the metadata - it contains information needed to read the file from the MPQ.
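As a minimal sketch, the fixed part of TPatchInfo could be read from a raw byte buffer like this. The names PatchInfo, read_le32 and parse_patch_info are hypothetical. The code assumes little-endian DWORDs and assumes that dwLength is never smaller than the 28 bytes of the fixed fields; neither assumption is stated by the structure definition itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATCH_INFO_MIN_SIZE 28u   // 3 DWORDs + 16-byte MD5

typedef struct {
    uint32_t length;      // dwLength
    uint32_t flags;       // dwFlags
    uint32_t data_size;   // dwDataSize
    uint8_t  md5[16];     // md5
} PatchInfo;

// Reads a 32-bit little-endian value (byte order is an assumption)
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Returns 0 on success, -1 if the buffer is too short or dwLength is implausible
static int parse_patch_info(const uint8_t *buf, size_t len, PatchInfo *out)
{
    assert(buf != NULL && out != NULL);
    if (len < PATCH_INFO_MIN_SIZE)
        return -1;

    out->length    = read_le32(buf);
    out->flags     = read_le32(buf + 4);
    out->data_size = read_le32(buf + 8);
    memcpy(out->md5, buf + 12, 16);

    // Assumed sanity check: the header cannot be shorter than its fixed fields
    return (out->length >= PATCH_INFO_MIN_SIZE) ? 0 : -1;
}
```

Parsing field by field instead of casting the buffer to a struct avoids alignment and padding issues on platforms other than the one the format was designed on.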

TPatchHeader structure

Each incremental patch file in a patch MPQ starts with a TPatchHeader. It is meant to be a variable-length structure, defined by the following C code:

// Header for PTCH files 
struct TPatchHeader
{
    //-- PATCH header -----------------------------------
    DWORD dwSignature;                  // 'PTCH'
    DWORD dwSizeOfPatchData;            // Size of the entire patch (decompressed)
    DWORD dwSizeBeforePatch;            // Size of the file before patch
    DWORD dwSizeAfterPatch;             // Size of file after patch
    
    //-- MD5 block --------------------------------------
    DWORD dwMD5;                        // 'MD5_'
    DWORD dwMd5BlockSize;               // Size of the MD5 block, including the signature and size itself
    BYTE md5_before_patch[0x10];        // MD5 of the original (unpatched) file
    BYTE md5_after_patch[0x10];         // MD5 of the patched file

    //-- XFRM block -------------------------------------
    DWORD dwXFRM;                       // 'XFRM'
    DWORD dwXfrmBlockSize;              // Size of the XFRM block, includes XFRM header and patch data
    DWORD dwPatchType;                  // Type of patch ('BSD0' or 'COPY')

    // Followed by the patch data
};
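A hedged sketch of validating this header from a raw buffer follows. PatchHeaderView and parse_patch_header are hypothetical names. The code assumes the fixed 68-byte layout shown above (MD5 block of 40 bytes), little-endian DWORDs, and that the four-character tags appear in the file as ASCII bytes in reading order. A more careful reader would use dwMd5BlockSize to locate the XFRM block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATCH_HEADER_SIZE 68u   // 16 (PATCH part) + 40 (MD5 block) + 12 (XFRM header)

typedef struct {
    uint32_t patch_data_size;   // dwSizeOfPatchData
    uint32_t size_before;       // dwSizeBeforePatch
    uint32_t size_after;        // dwSizeAfterPatch
    uint8_t  md5_before[16];    // md5_before_patch
    uint8_t  md5_after[16];     // md5_after_patch
    uint32_t xfrm_block_size;   // dwXfrmBlockSize
    char     patch_type[5];     // dwPatchType as a NUL-terminated tag
} PatchHeaderView;

// Reads a 32-bit little-endian value (byte order is an assumption)
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Returns 0 on success, -1 if the buffer is too short or a signature is wrong
static int parse_patch_header(const uint8_t *buf, size_t len, PatchHeaderView *out)
{
    assert(buf != NULL && out != NULL);
    if (len < PATCH_HEADER_SIZE)
        return -1;

    // Check the three block signatures at their fixed offsets
    if (memcmp(buf, "PTCH", 4) != 0)      return -1;
    if (memcmp(buf + 16, "MD5_", 4) != 0) return -1;
    if (memcmp(buf + 56, "XFRM", 4) != 0) return -1;

    out->patch_data_size = read_le32(buf + 4);
    out->size_before     = read_le32(buf + 8);
    out->size_after      = read_le32(buf + 12);
    memcpy(out->md5_before, buf + 24, 16);
    memcpy(out->md5_after, buf + 40, 16);
    out->xfrm_block_size = read_le32(buf + 60);
    memcpy(out->patch_type, buf + 64, 4);
    out->patch_type[4] = '\0';
    return 0;
}
```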

Patch data

The patch data immediately follows the patch header. Its format depends on the value of dwPatchType in the patch header, and its size depends on the value of dwXfrmBlockSize. The following patch types are known:

'BSD0' - Blizzard-modified version of BSDIFF40 incremental patch
'BSDP' - Unknown
'COPY' - Plain replace
'COUP' - Unknown
'CPOG' - Unknown
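A minimal sketch of selecting a handler by patch type and deriving the payload size might look as follows. PatchKind, classify_patch and patch_payload_size are hypothetical names. The 12-byte XFRM header size is derived from the three DWORD fields (dwXFRM, dwXfrmBlockSize, dwPatchType) in the structure above, and the tags are assumed to appear as ASCII bytes in reading order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XFRM_HEADER_SIZE 12u   // dwXFRM + dwXfrmBlockSize + dwPatchType

typedef enum { PATCH_BSD0, PATCH_COPY, PATCH_UNSUPPORTED } PatchKind;

// Maps the 4-byte patch type tag to one of the two documented methods
static PatchKind classify_patch(const uint8_t *tag)
{
    assert(tag != NULL);
    if (memcmp(tag, "BSD0", 4) == 0) return PATCH_BSD0;
    if (memcmp(tag, "COPY", 4) == 0) return PATCH_COPY;
    return PATCH_UNSUPPORTED;   // 'BSDP', 'COUP', 'CPOG' and anything else
}

// dwXfrmBlockSize includes the XFRM header, so the payload is what remains.
// Returns 0 on success, -1 if the block is smaller than its own header.
static int patch_payload_size(uint32_t xfrm_block_size, uint32_t *payload_size)
{
    assert(payload_size != NULL);
    if (xfrm_block_size < XFRM_HEADER_SIZE)
        return -1;
    *payload_size = xfrm_block_size - XFRM_HEADER_SIZE;
    return 0;
}
```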

The following paragraphs describe the known patch methods.

BSD0: BSDIFF40-based incremental patch

The patch data starts with a 32-bit value that holds the size of the unpacked data. The BSDIFF40 data follows, compressed with RLE compression. The format of Blizzard's version of the BSDIFF40 data is:

Offset	Size	Meaning
0x0000	0x08 bytes	'BSDIFF40' signature
0x0008	0x08 bytes	Size of CTRL block (in bytes)
0x0010	0x08 bytes	Size of DATA block (in bytes)
0x0018	0x08 bytes	Size of file after applying the patch (in bytes)
0x0020	Variable	CTRL block. Consists of an array of triads; each triad is 0x0C bytes long (3 DWORDs). The length of the CTRL block is stored in the BSDIFF header. Note that the original BSDIFF40 format uses 64 bits for each value in the triad.
Variable	Variable	DATA block. The length of the DATA block is stored in the BSDIFF header.
Variable	Variable	EXTRA block. Consists of the bytes that follow the DATA block, up to the end of the BSDIFF40 patch.
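The layout in the table above can be turned into block offsets with a short routine. This is a sketch under assumptions: BsdiffLayout, read_le64 and bsdiff_layout are hypothetical names, and the three 8-byte sizes are assumed to be plain little-endian values. The input is the already RLE-unpacked BSDIFF40 data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BSDIFF_HEADER_SIZE 32u   // Signature + three 8-byte sizes
#define CTRL_TRIAD_SIZE    12u   // 3 DWORDs per triad in the Blizzard variant

typedef struct {
    uint64_t new_size;     // Size of the file after applying the patch
    size_t   ctrl_off;     // Offset of the CTRL block
    size_t   ctrl_size;
    size_t   data_off;     // Offset of the DATA block
    size_t   data_size;
    size_t   extra_off;    // Offset of the EXTRA block
    size_t   extra_size;   // Everything after DATA, up to the end of the patch
} BsdiffLayout;

// Reads a 64-bit little-endian value (byte order is an assumption)
static uint64_t read_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

// Returns 0 on success, -1 if the signature or the sizes are inconsistent
static int bsdiff_layout(const uint8_t *buf, size_t len, BsdiffLayout *out)
{
    assert(buf != NULL && out != NULL);
    if (len < BSDIFF_HEADER_SIZE || memcmp(buf, "BSDIFF40", 8) != 0)
        return -1;

    uint64_t ctrl = read_le64(buf + 8);
    uint64_t data = read_le64(buf + 16);
    uint64_t body = len - BSDIFF_HEADER_SIZE;

    // CTRL must hold whole triads, and CTRL + DATA must fit in the buffer
    if (ctrl % CTRL_TRIAD_SIZE != 0 || ctrl > body || data > body - ctrl)
        return -1;

    out->new_size   = read_le64(buf + 24);
    out->ctrl_off   = BSDIFF_HEADER_SIZE;
    out->ctrl_size  = (size_t)ctrl;
    out->data_off   = out->ctrl_off + out->ctrl_size;
    out->data_size  = (size_t)data;
    out->extra_off  = out->data_off + out->data_size;
    out->extra_size = len - out->extra_off;
    return 0;
}
```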

COPY: Just take the content and replace the file

As the title says, COPY patch type means that the file data are simply replaced by the data in the patch.

Algorithms

RLE unpacking

void Decompress_RLE(LPBYTE pbDecompressed, DWORD cbDecompressed, LPBYTE pbCompressed, DWORD cbCompressed)
{
    LPBYTE pbDecompressedEnd = pbDecompressed + cbDecompressed;
    LPBYTE pbCompressedEnd = pbCompressed + cbCompressed;
    BYTE RepeatCount; 
    BYTE OneByte;

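    // Pre-fill decompressed buffer with zeros
    memset(pbDecompressed, 0, cbDecompressed);

    // Cut the initial DWORD (size of the unpacked data) from the compressed chunk
    if(cbCompressed < sizeof(DWORD))
        return;
    pbCompressed += sizeof(DWORD);

    // Unpack; stop as soon as either buffer is exhausted
    while(pbCompressed < pbCompressedEnd && pbDecompressed < pbDecompressedEnd)
    {
        OneByte = *pbCompressed++;
        
        // High bit set: copy the next (OneByte & 0x7F) + 1 bytes verbatim
        if(OneByte & 0x80)
        {
            RepeatCount = (OneByte & 0x7F) + 1;
            for(BYTE i = 0; i < RepeatCount; i++)
            {
                if(pbDecompressed == pbDecompressedEnd || pbCompressed == pbCompressedEnd)
                    break;

                *pbDecompressed++ = *pbCompressed++;
            }
        }
        else
        {
            // High bit clear: skip (OneByte + 1) bytes, which stay zero.
            // Never move the output pointer past the end of the buffer.
            if((DWORD)(pbDecompressedEnd - pbDecompressed) <= OneByte)
                break;
            pbDecompressed += (OneByte + 1);
        }
    }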
}
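To make the encoding concrete, here is a portable restatement of the same scheme with a small worked input. rle_unpack is a hypothetical name and uses fixed-width types instead of the Windows typedefs; it is a sketch of the scheme as read from the routine above, not StormLib code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Unpacks src into dst and returns the final output position.
// Control byte with high bit set:   copy (c & 0x7F) + 1 literal bytes.
// Control byte with high bit clear: skip c + 1 output bytes (left as zero).
static size_t rle_unpack(uint8_t *dst, size_t dst_len,
                         const uint8_t *src, size_t src_len)
{
    size_t in = 4;    // Skip the leading 32-bit unpacked-size field
    size_t out = 0;

    assert(dst != NULL && src != NULL);
    memset(dst, 0, dst_len);

    while (in < src_len && out < dst_len) {
        uint8_t c = src[in++];
        if (c & 0x80) {
            size_t n = (size_t)(c & 0x7F) + 1;
            while (n-- > 0 && in < src_len && out < dst_len)
                dst[out++] = src[in++];
        } else {
            size_t skip = (size_t)c + 1;
            // Clamp so the output position never passes the end of dst
            out = (skip > dst_len - out) ? dst_len : out + skip;
        }
    }
    return out;
}
```

For example, the input bytes 06 00 00 00, 81 'A' 'B', 02, 80 'C' unpack to the six bytes 'A' 'B' 00 00 00 'C': a two-byte literal run, a three-byte zero run, and a one-byte literal run.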

Applying BSD0 patch

For more information about applying a BSD0 patch, see the function ApplyMpqPatch_BSD0() in StormLib.
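For orientation only, the general bsdiff apply loop adapted to 32-bit triads can be sketched as below. This is not StormLib's code. bsd0_apply and read_le32 are hypothetical names, and the sketch assumes that each triad holds (bytes to add from DATA, bytes to copy from EXTRA, seek in the old file), that values are little-endian, and that the seek value uses the sign-and-magnitude convention of the original bsdiff format (high bit set means negative). Check these assumptions against ApplyMpqPatch_BSD0() before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Reads a 32-bit little-endian value (byte order is an assumption)
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Builds new_size bytes in out from the old file and the three patch blocks.
// Returns 0 on success, -1 if the patch data is inconsistent.
static int bsd0_apply(const uint8_t *old, size_t old_size,
                      const uint8_t *ctrl, size_t ctrl_size,
                      const uint8_t *data, size_t data_size,
                      const uint8_t *extra, size_t extra_size,
                      uint8_t *out, size_t new_size)
{
    size_t new_pos = 0, data_pos = 0, extra_pos = 0, ctrl_pos = 0;
    int64_t old_pos = 0;

    assert(old && ctrl && data && extra && out);
    while (new_pos < new_size) {
        if (ctrl_size - ctrl_pos < 12)
            return -1;                      // Ran out of triads
        uint32_t add_len  = read_le32(ctrl + ctrl_pos);
        uint32_t copy_len = read_le32(ctrl + ctrl_pos + 4);
        uint32_t seek_raw = read_le32(ctrl + ctrl_pos + 8);
        ctrl_pos += 12;

        // Step 1: new byte = DATA byte + old byte (wrapping add);
        // positions outside the old file contribute zero
        if (add_len > new_size - new_pos || add_len > data_size - data_pos)
            return -1;
        for (uint32_t i = 0; i < add_len; i++) {
            int64_t o = old_pos + i;
            uint8_t base = (o >= 0 && (uint64_t)o < old_size) ? old[o] : 0;
            out[new_pos + i] = (uint8_t)(data[data_pos + i] + base);
        }
        new_pos += add_len; data_pos += add_len; old_pos += add_len;

        // Step 2: copy bytes verbatim from the EXTRA block
        if (copy_len > new_size - new_pos || copy_len > extra_size - extra_pos)
            return -1;
        memcpy(out + new_pos, extra + extra_pos, copy_len);
        new_pos += copy_len; extra_pos += copy_len;

        // Step 3: move the read position in the old file (assumed sign-magnitude)
        if (seek_raw & 0x80000000u)
            old_pos -= (int64_t)(seek_raw & 0x7FFFFFFFu);
        else
            old_pos += (int64_t)seek_raw;
    }
    return 0;
}
```

With the old file "hello", one triad (4, 2, 0), DATA bytes 00 00 00 04 and EXTRA bytes "!!", the result is "help!!": three bytes unchanged, 'l' + 4 = 'p', then two bytes taken from EXTRA.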