How to Make Modern PHP More Modern? With Preprocessing!

Christopher Pitt

Let’s have a bit of fun. A while ago, I experimented with PHP macros, adding Python range syntax. Then, the talented SaraMG mentioned an RFC, and LordKabelo suggested instead adding C#-style getters and setters to PHP.

Aware of how painfully slow it can be for an outsider to suggest and implement a new language feature, I took to my editor…

The code for this tutorial can be found on GitHub. It’s been tested with PHP ^7.1, and the generated code should run on PHP ^5.6|^7.0.


How Do Macros Work Again?

It’s been a while since I’ve talked about macros, and perhaps you’ve never heard of them. To refresh your memory, they take code that looks like this:

macro {
  →(···expression)
} >> {
  ··stringify(···expression)
}

macro {
  T_VARIABLE·A[
    ···range
  ]
} >> {
  eval(
    '$list = ' . →(T_VARIABLE·A) . ';' .
    '$lower = ' . explode('..', →(···range))[0] . ';' .
    '$upper = ' . explode('..', →(···range))[1] . ';' .
    'return array_slice($list, $lower, $upper - $lower);'
  )
}

…and turn custom PHP syntax, like this:

$few = $many[1..3];

…into valid PHP syntax, like this:

$few = eval(
    '$list = ' . '$many' . ';'.
    '$lower = ' . explode('..', '1..3')[0] . ';' .
    '$upper = ' . explode('..', '1..3')[1] . ';' .
    'return array_slice($list, $lower, $upper - $lower);'
);

If you’d like to see how this works, head over to the post I wrote about it.

The trick is to understand how a parser tokenizes a string of code, build a macro pattern, and then apply that pattern recursively to the new syntax.
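
To get a feel for the tokens a macro pattern has to match, here is a minimal sketch using PHP’s built-in tokenizer rather than Yay itself. It assumes the tokenizer extension is available (it is in default builds) and uses a plain, valid statement instead of the custom range syntax:

```php
<?php

// Tokenize a small, valid PHP statement and list each token's name and text.
$tokens = token_get_all('<?php $few = $many[1];');

$described = [];

foreach ($tokens as $token) {
    if (is_array($token)) {
        // Array tokens are [id, text, line number].
        $described[] = token_name($token[0]) . ' ' . json_encode($token[1]);
    } else {
        // Single-character tokens such as "=" or "[" arrive as plain strings.
        $described[] = 'CHAR ' . json_encode($token);
    }
}

print implode("\n", $described) . "\n";
```

Names like T_VARIABLE in the listing are the same names the macro patterns above refer to.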

The macro library isn’t well documented, though. It’s difficult to know exactly what the pattern needs to look like, or what valid syntax to generate in the end. Every new application begs for a tutorial like this to be written, before others can understand what’s really going on.

Building A Base

So, let’s look at the application at hand. We’d like to add getter and setter syntax, resembling that of C#, to PHP. Before we can do that, we need to have a good base of code to work from. Perhaps something in the form of a trait that we can add to classes needing this new functionality.

We need to implement code that will inspect a class definition and create these dynamic getter and setter methods for each special property or comment it sees.

Perhaps we can start by defining a special method name format, and magic __get and __set methods:

namespace App;

trait AccessorTrait
{
  /**
   * @inheritdoc
   *
   * @param string $property
   * @return mixed|null
   */
  public function __get($property)
  {
    if (method_exists($this, "__get_{$property}")) {
      return $this->{"__get_{$property}"}();
    }
  }

  /**
   * @inheritdoc
   *
   * @param string $property
   * @param mixed $value
   */
  public function __set($property, $value)
  {
    if (method_exists($this, "__set_{$property}")) {
      return $this->{"__set_{$property}"}($value);
    }
  }
}

Each method whose name starts with __get_ or __set_ needs to be connected to an as-yet undefined property. We can imagine this syntax:

namespace App;

class Sprocket
{
    private $type {
        get {
            return $this->type;
        }

        set {
            $this->type = strtoupper($value);
        }
    };
}

…being converted to something very much like:

namespace App;

class Sprocket {
    use AccessorTrait;

    private $type;

    private function __get_type() {
        return $this->type;  
    }

    private function __set_type($value) {
        $this->type = strtoupper($value);   
    }
}
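
Before writing any macros, it can help to confirm that the target code behaves as intended. The following is a self-contained sketch that copies the trait and the hand-written class from above (without the namespace, to keep it in one file) and exercises both accessors:

```php
<?php

trait AccessorTrait
{
    public function __get($property)
    {
        if (method_exists($this, "__get_{$property}")) {
            return $this->{"__get_{$property}"}();
        }
    }

    public function __set($property, $value)
    {
        if (method_exists($this, "__set_{$property}")) {
            return $this->{"__set_{$property}"}($value);
        }
    }
}

class Sprocket
{
    use AccessorTrait;

    private $type;

    private function __get_type()
    {
        return $this->type;
    }

    private function __set_type($value)
    {
        $this->type = strtoupper($value);
    }
}

$sprocket = new Sprocket();
$sprocket->type = "acme sprocket"; // routed through __set_type()

print $sprocket->type . "\n"; // ACME SPROCKET
var_dump($sprocket->missing);  // NULL
```

Note that a property without a matching accessor method silently yields null here, because __get falls through without returning anything.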

Defining Macros

Defining the required macros is the hardest part of any of this. The library is neither well documented nor widely used, and it offers only a handful of helpful exception messages, so it’s mostly a lot of trial and error.

I spent a few hours coming up with the following patterns:

macro ·unsafe {
  ·ns()·class {
    ···body
  }
} >> {
  ·class {
    use AccessorTrait;

    ···body
  }
}

macro ·unsafe {
  private T_VARIABLE·var {
    get {
      ···getter
    }

    set {
      ···setter
    }
  };
} >> {
  private T_VARIABLE·var;

  private function ··concat(__get_ ··unvar(T_VARIABLE·var))() {
    ···getter
  }

  private function ··concat(__set_ ··unvar(T_VARIABLE·var))($value) {
    ···setter
  }
}

Ok, let’s look at what these two macros are doing:

  1. We begin by matching class MyClass { ... } and inserting the AccessorTrait we built previously. This provides the __get and __set implementations, which route a property access such as print $class->bar to a method such as __get_bar.
  2. We match the accessor block syntax, and replace it with an ordinary property definition, followed by a couple of individual method definitions. We can wrap the exact contents of the get { ... } and set { ... } blocks within these functions.

At first, when you run this code, you’ll get an error. That’s because the ··unvar function isn’t a standard part of the macro processor. It’s something I had to add, to convert from $type to type:

namespace Yay\DSL\Expanders;

use Yay\Token;
use Yay\TokenStream;

function unvar(TokenStream $ts) : TokenStream {
  $str = str_replace('$', '', (string) $ts);

  return
    TokenStream::fromSequence(
      new Token(
        T_CONSTANT_ENCAPSED_STRING, $str
      )
    )
  ;
}

I was able to copy (almost exactly) the ··stringify expander, which is included in the macro parser. You don’t need to understand much about the internals of Yay in order to see what this is doing. Casting a TokenStream to a string (in this context) gives you the string value of whatever token is currently referenced (in this case, the argument of ··unvar(T_VARIABLE·var)), so you can perform string manipulations on it.

(string) $ts becomes "$type", as opposed to "T_VARIABLE·var".
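
The string manipulation itself can be tried without Yay. This is only a sketch: accessorName is a hypothetical helper, not part of the macro library, and it mimics what ··concat(__get_ ··unvar(T_VARIABLE·var)) is meant to produce:

```php
<?php

/**
 * Hypothetical helper (not part of Yay): builds an accessor method name
 * from a prefix and a variable token's text.
 */
function accessorName(string $prefix, string $variable): string
{
    // Same manipulation as the unvar expander: drop the "$" sigil.
    $name = str_replace('$', '', $variable);

    return $prefix . $name;
}

print accessorName('__get_', '$type') . "\n"; // __get_type
print accessorName('__set_', '$type') . "\n"; // __set_type
```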

Usually, these macros are applied when they are placed inside the script they are meant to apply to. In other words, we could create a script resembling:

<?php

macro ·unsafe {
  ...
} >> {
  ...
}

macro ·unsafe {
  ...
} >> {
  ...
}

namespace App;

trait AccessorTrait
{
  ...
}

class Sprocket
{
  private $type {
    get {
      return $this->type;
    }

    set {
      $this->type = strtoupper($value);
    }
  };
}

… then we could run it using a command like:

vendor/bin/yay src/Sprocket.pre > src/Sprocket.php

Finally, we could use this code (with some Composer PSR-4 autoloading), using:

require __DIR__ . "/vendor/autoload.php";

$sprocket = new App\Sprocket();
$sprocket->type = "acme sprocket";

print $sprocket->type; // ACME SPROCKET

Automating Conversion

As a manual process, this sucks. Who wants to run that bash command every time they change src/Sprocket.pre? Fortunately, we can automate this!

The first step is to define a custom autoloader:

spl_autoload_register(function($class) {
  $definitions = require __DIR__ . "/vendor/composer/autoload_psr4.php";

  foreach ($definitions as $prefix => $paths) {
    $prefixLength = strlen($prefix);

    if (strncmp($prefix, $class, $prefixLength) !== 0) {
      continue;
    }

    $relativeClass = substr($class, $prefixLength);

    foreach ($paths as $path) {
      $php = $path . "/" . str_replace("\\", "/", $relativeClass) . ".php";

      $pre = $path . "/" . str_replace("\\", "/", $relativeClass) . ".pre";

      $relative = ltrim(str_replace(__DIR__, "", $pre), DIRECTORY_SEPARATOR);

      $macros = __DIR__ . "/macros.pre";

      if (file_exists($pre)) {
        // ... convert and load file
      }
    }
  }
}, false, true);

You can save this file as autoload.php, and use files autoloading to include it through Composer’s autoloader, as explained in the documentation.

The first part of this definition comes straight out of the example implementation of the PSR-4 specification. We fetch Composer’s PSR-4 definitions file, and for each prefix, we check whether it matches the class currently being loaded.
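
The prefix matching can be isolated into a small function that is easy to test. This sketch uses a hypothetical helper name (candidatePreFiles) and made-up definitions shaped like Composer’s autoload_psr4.php; it only builds candidate paths and does not touch the filesystem:

```php
<?php

/**
 * Hypothetical helper: lists the .pre files that could define $class,
 * given PSR-4 definitions (prefix => list of base directories).
 */
function candidatePreFiles(string $class, array $definitions): array
{
    $candidates = [];

    foreach ($definitions as $prefix => $paths) {
        $prefixLength = strlen($prefix);

        // Skip prefixes that the class name does not start with.
        if (strncmp($prefix, $class, $prefixLength) !== 0) {
            continue;
        }

        $relativeClass = substr($class, $prefixLength);

        foreach ($paths as $path) {
            $candidates[] = $path . "/" . str_replace("\\", "/", $relativeClass) . ".pre";
        }
    }

    return $candidates;
}

// Made-up definitions for illustration only.
$definitions = ["App\\" => ["/project/src"]];

print_r(candidatePreFiles("App\\Models\\Sprocket", $definitions));
```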

If it matches, we check each potential path until we find a file.pre, in which our custom syntax is defined. Then we get the contents of a macros.pre file (in the project base directory) and create an interim file from the macros.pre contents plus the matched file’s contents. That means the macros are available to the file we pass to Yay. Once Yay has compiled file.pre.interim to file.php, we delete file.pre.interim.

The code for that process is:

if (file_exists($php)) {
  unlink($php);
}

file_put_contents(
  "{$pre}.interim",
  str_replace(
    "<?php",
    file_get_contents($macros),
    file_get_contents($pre)
  )
);

exec("vendor/bin/yay {$pre}.interim >> {$php}");

$comment = "
  # This file is generated, changes you make will be lost.
  # Make your changes in {$relative} instead.
";

file_put_contents(
  $php,
  str_replace(
    "<?php",
    "<?php\n{$comment}",
    file_get_contents($php)
  )
);

unlink("{$pre}.interim");

require_once $php;
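
To see what ends up in the interim file, the str_replace step can be run on in-memory strings. The file contents below are assumptions for illustration; in particular, the sketch assumes macros.pre itself begins with an opening tag, so the result still starts with one:

```php
<?php

// Assumed contents of macros.pre: an opening tag followed by macro definitions.
$macros = "<?php\n\nmacro { /* pattern */ } >> { /* expansion */ }\n";

// Assumed contents of a .pre source file.
$source = "<?php\n\nnamespace App;\n\nclass Sprocket {}\n";

// Same call as the autoloader: the opening tag is swapped for the macro file.
$interim = str_replace("<?php", $macros, $source);

print $interim;
```

Because str_replace replaces every occurrence, a source file that mentions the opening tag anywhere else (inside a string, say) would receive a second copy of the macros.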

Notice those two booleans at the end of the call to spl_autoload_register. The first controls whether spl_autoload_register throws an exception if the autoloader cannot be registered. The second is whether this autoloader should be prepended to the stack. This puts it before Composer’s autoloaders, which means we can convert file.pre before Composer tries to load file.php!
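
The effect of prepending is easy to observe in isolation. In this sketch, two throwaway autoloaders record when they are called, and the class name is hypothetical and deliberately left undefined so that both get consulted:

```php
<?php

$calls = [];

// Registered first, appended to the end of the autoload stack.
spl_autoload_register(function ($class) use (&$calls) {
    $calls[] = "appended";
}, true, false);

// Registered second, but prepended, so it is consulted first.
spl_autoload_register(function ($class) use (&$calls) {
    $calls[] = "prepended";
}, true, true);

// Triggers autoloading for a class nobody defines (hypothetical name).
class_exists("Hypothetical\\MissingClass");

print implode(", ", $calls) . "\n"; // prepended, appended
```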

Creating A Plugin Framework

This automation is great, but it’s wasted if one has to repeat it for every project. What if we could just composer require a dependency (for a new language feature), and it would just work? Let’s do that…

First up, we need to create a new repo, containing the following files:

  • composer.json → autoload the following files
  • functions.php → create macro path functions (so other libraries can add their own macro files dynamically)
  • expanders.php → create expander functions, like ··unvar
  • autoload.php → augment Composer’s autoloader, loading each other library’s macro files into each compiled .pre file

{
  "name": "pre/plugin",
  "require": {
    "php": "^7.0",
    "yay/yay": "dev-master"
  },
  "autoload": {
    "files": [
      "functions.php",
      "expanders.php",
      "autoload.php"
    ]
  },
  "minimum-stability": "dev",
  "prefer-stable": true
}

This is from composer.json

<?php

namespace Pre;

define("GLOBAL_KEY", "PRE_MACRO_PATHS");

/**
 * Creates the list of macros, if it is undefined.
 */
function initMacroPaths() {
  if (!isset($GLOBALS[GLOBAL_KEY])) {
    $GLOBALS[GLOBAL_KEY] = [];
  }
}

/**
 * Adds a path to the list of macro files.
 *
 * @param string $path
 */
function addMacroPath($path) {
  initMacroPaths();
  array_push($GLOBALS[GLOBAL_KEY], $path);
}

/**
 * Removes a path from the list of macro files.
 *
 * @param string $path
 */
function removeMacroPath($path) {
  initMacroPaths();

  $GLOBALS[GLOBAL_KEY] = array_filter(
    $GLOBALS[GLOBAL_KEY],
    function($next) use ($path) {
      return $next !== $path;
    }
  );
}

/**
 * Gets all macro file paths.
 *
 * @return array
 */
function getMacroPaths() {
  initMacroPaths();
  return $GLOBALS[GLOBAL_KEY];
}

This is from functions.php

You may be cringing at the thought of using $GLOBALS as a store for the macro file paths. It’s unimportant, as we could store these paths in any number of other ways. This is just the simplest approach to demonstrate the pattern.
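
As one example of those other ways, the paths could live in a static registry. This is a hypothetical alternative (the MacroPaths class is not part of the plugin) and the paths are made up:

```php
<?php

/**
 * Hypothetical alternative to the $GLOBALS store: a static registry.
 */
final class MacroPaths
{
    /** @var string[] */
    private static $paths = [];

    public static function add($path)
    {
        self::$paths[] = $path;
    }

    public static function remove($path)
    {
        // array_values() re-indexes the list after filtering.
        self::$paths = array_values(array_filter(
            self::$paths,
            function ($next) use ($path) {
                return $next !== $path;
            }
        ));
    }

    public static function all()
    {
        return self::$paths;
    }
}

MacroPaths::add("/plugins/a/macros.pre");
MacroPaths::add("/plugins/b/macros.pre");
MacroPaths::remove("/plugins/a/macros.pre");

print_r(MacroPaths::all());
```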

<?php

namespace Yay\DSL\Expanders;

use Yay\Token;
use Yay\TokenStream;

function unvar(TokenStream $ts) : TokenStream {
  $str = str_replace('$', '', (string) $ts);

  return
    TokenStream::fromSequence(
      new Token(
        T_CONSTANT_ENCAPSED_STRING, $str
      )
    )
  ;
}

This is from expanders.php

<?php

namespace Pre;

if (file_exists(__DIR__ . "/../../autoload.php")) {
  define("BASE_DIR", realpath(__DIR__ . "/../../../"));
}

spl_autoload_register(function($class) {
  $definitions = require BASE_DIR . "/vendor/composer/autoload_psr4.php";

  foreach ($definitions as $prefix => $paths) {
    // ...check $prefixLength

    foreach ($paths as $path) {
      // ...create $php and $pre

      $relative = ltrim(str_replace(BASE_DIR, "", $pre), DIRECTORY_SEPARATOR);

      $macros = BASE_DIR . "/macros.pre";

      if (file_exists($pre)) {
        // ...remove existing PHP file

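        // Fold every registered macro file into the source. This assumes
        // each macro file starts with "<?php", so the tag is still there
        // for the next replacement.
        $contents = file_get_contents($pre);

        foreach (getMacroPaths() as $macroPath) {
          $contents = str_replace(
            "<?php",
            file_get_contents($macroPath),
            $contents
          );
        }

        file_put_contents("{$pre}.interim", $contents);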

        // ...write and include the PHP file
      }
    }
  }
}, false, true);

This is from autoload.php

Now, additional macro plugins can use these functions to hook their own code into the system…

Creating A New Language Feature

With the plugin code built, we can refactor our class accessors to be a stand-alone, automatically applied feature. We need to create a few more files to make this happen:

  • composer.json → needs to require the base plugin repository and autoload the following files
  • macros.pre → macro code for this plugin
  • functions.php → place to hook the accessor macros into the base plugin system
  • src/AccessorsTrait.php → largely unchanged from before

{
    "name": "pre/class-accessors",
    "require": {
        "php": "^7.0",
        "pre/plugin": "dev-master"
    },
    "autoload": {
        "files": [
            "functions.php"
        ],
        "psr-4": {
            "Pre\\": "src"
        }
    },
    "minimum-stability": "dev",
    "prefer-stable": true
}

This is from composer.json

namespace Pre;

addMacroPath(__DIR__ . "/macros.pre");

This is from functions.php

macro ·unsafe {
  ·ns()·class {
      ···body
  }
} >> {
  ·class {
    use \Pre\AccessorsTrait;

    ···body
  }
}

macro ·unsafe {
  private T_VARIABLE·variable {
    get {
      ···getter
    }

    set {
      ···setter
    }
  };
} >> {
  // ...
}

macro ·unsafe {
  private T_VARIABLE·variable {
    set {
      ···setter
    }

    get {
      ···getter
    }
  };
} >> {
  // ...
}

macro ·unsafe {
  private T_VARIABLE·variable {
    set {
      ···setter
    }
  };
} >> {
  // ...
}

macro ·unsafe {
  private T_VARIABLE·variable {
    get {
      ···getter
    }
  };
} >> {
  // ...
}

This is from macros.pre

This macro file is a little more verbose compared to the previous version. There’s probably a more elegant way of handling all the arrangements the accessors could be defined in, but I haven’t found it yet.

Putting It All Together

Now that everything is so nicely packaged, it’s rather straightforward to use the new language feature. Take a look at this quick demonstration!

Demonstration

You can find these plugin repositories on GitHub:

Conclusion

As with all things, this can be abused. Macros are no exception. This code is definitely not production-ready, though it is conceptually cool.

Please don’t be that person who comments about how bad you think the use of this code would be. I’m not actually recommending you use this code, in this form.

Having said that, perhaps you think it’s a cool idea. Can you think of other language features you’d like PHP to get? Maybe you can use the class accessors repository as an example to get you started. Maybe you want to use the plugin repository to automate things, to the point where you can see if your idea has any teeth.

Let us know how it goes in the comments.

Since writing this tutorial, I’ve been frantically working on the underlying libraries. So much so that there’s now a site where this code is hosted and demonstrated: https://preprocess.io. It’s still in an alpha state, but it showcases all the code I’ve spoken about here and then some. There’s also a handy REPL, in case you’d like to try any of the macros.