It does require custom hardware or kernel cooperation for speed (e.g. it needs to do batched MMU operations without flushing the TLB on each 2MB page). Looks like it's got a better read barrier than the Pauseless one; that does, of course, cost extra on stock hardware.
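
To put rough numbers on the batching point, here is a toy counting model, not real kernel or collector code. It only counts how many TLB invalidations a remap of a region would trigger if you flush once per 2MB page versus once per batch. The function names and the batch sizes are made up for illustration, and it ignores everything that makes real flushes expensive (cross-core shootdowns, refill misses, and so on).

```python
PAGE_SIZE = 2 * 1024 * 1024  # 2MB pages, as mentioned above


def pages_for(region_bytes: int) -> int:
    """Number of 2MB pages needed to cover a region (rounded up)."""
    return -(-region_bytes // PAGE_SIZE)  # ceiling division


def flushes_unbatched(num_pages: int) -> int:
    """Hypothetical stock behaviour: one TLB invalidation per remapped page."""
    return num_pages


def flushes_batched(num_pages: int, batch_size: int) -> int:
    """Hypothetical batched behaviour: one TLB invalidation per batch of pages."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return -(-num_pages // batch_size)  # ceiling division


if __name__ == "__main__":
    pages = pages_for(4 * 1024**3)  # a 4 GiB region, chosen arbitrarily
    print("pages:", pages)
    print("flushes, one per page:", flushes_unbatched(pages))
    print("flushes, one batch:", flushes_batched(pages, pages))
```

The point is just the ratio: the per-page count grows linearly with the region size, while the batched count depends only on how many batches the kernel lets you issue.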