Created by: gwideman, Oct 8, 2013 9:08 pm
Revised by: gwideman, Oct 23, 2013 8:37 pm (19 revisions)

Overview

In an effort to discover the true behavior of Windows Python Launcher, this page examines the main source code for py.exe, launcher.c, at bitbucket.org/vinay.sajip/pylauncher.
The source code interacts with the following items, which affect its decisions:
  • builtin_virtual_paths: A list of specific paths to python, hard-coded in launcher.c. These may appear in a script's shebang line, and Launcher treats them specially.
  • builtin_prefixes: A list of specific directories, hard-coded in launcher.c, which Launcher removes from a script's shebang line, if present.
  • commands: A list of key-value pairs, defining "custom virtual shebang-line commands". Launcher reads these from the global and/or user's py.ini file, [commands] section.
    • Launcher may also find commands in Windows PATH. See later for details.
  • installed_pythons: A list of INSTALLED_PYTHON records corresponding to all python installations found recorded in the registry.
    • INSTALLED_PYTHON records:
      • version: 4 characters, eg: 2.7 or 3.3
        • Registry doesn't seem to record, and Launcher doesn't discover, the third version digit
      • bits: 32 or 64
      • executable: Path to python.exe for this version
    • Sorted from highest-numbered version to lowest, with 64-bit before 32-bit of the same version
      • Sort order for multiple X.Y.* releases? Sort order seems not to be defined.
  • magic_values: Hardcoded into Launcher. Translates the "magic number" values in first two bytes of compiled python file into python version.
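The installed_pythons ordering described above can be sketched in Python. This is only an illustration of the stated sort rule (highest version first, 64-bit before 32-bit of the same version), not a translation of launcher.c. The class name InstalledPython, the function sort_installed, and the install paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstalledPython:
    version: str     # "X.Y", e.g. "2.7" or "3.3" (no third digit)
    bits: int        # 32 or 64
    executable: str  # path to python.exe (hypothetical paths below)


def sort_installed(pythons):
    """Highest version first; 64-bit before 32-bit within one version."""
    def key(p):
        major, minor = (int(n) for n in p.version.split("."))
        return (major, minor, p.bits)
    return sorted(pythons, key=key, reverse=True)


installed = sort_installed([
    InstalledPython("2.7", 32, r"C:\Python27\python.exe"),
    InstalledPython("3.3", 32, r"C:\Python33-32\python.exe"),
    InstalledPython("3.3", 64, r"C:\Python33\python.exe"),
    InstalledPython("3.2", 64, r"C:\Python32\python.exe"),
])
for p in installed:
    print(p.version, p.bits)
```

Because the registry records only X.Y, two X.Y.* releases would compare equal under this key, which matches the open question above about their relative order.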

launcher.c source code steps

launcher.c source code version: a6c53b8
  • Decision locations: process()
    • py.exe args: none
    • script shebang-line: n/a
    • Action: locate_python(""). Run (invoke1) python with args: ""
    • Error exit: If no python installed

  • Decision locations: process()
    • py.exe args: -X [script] or -X.Y [script]
    • script shebang-line: ignored
    • Action: Check that arg is valid -X or -X.Y format and X and Y are valid digits. If yes, locate_python("X.Y"), run (invoke1) python with Args: rest-CL. If not X.Y format, fall through...
    • Error exit: If X.Y format, but no such version installed

  • Decision locations: process()
    • py.exe args: -h or --help
    • script shebang-line: ignored
    • Action: locate_python(""). Show py help. Run (invoke1) python with same arguments (so it shows help).
    • Error exit: If no python installed

  • Decision locations: process()
    • py.exe args: script [args] or -arg(s) script [args]
    • script shebang-line: various
    • Action: scriptpath = find first arg not prefixed with '-'. maybe_handle_shebang(scriptpath, all-of-py's args). continue...

  • Decision locations: maybe_handle_shebang
    • py.exe args: ditto
    • script shebang-line: script is compiled
    • Action: Get magic number (first 2 bytes) of file. find_by_magic() translates to X.Y, then locate_python(). Run python (invoke2) with all-of-py's args.
    • Error exit: No, but maybe it should?
    • Comments: If find_by_magic() fails, then resulting path through maybe_handle_shebang() code is haphazard.

  • Decision locations: parse_shebang, maybe_handle_shebang
    • py.exe args: ditto
    • script shebang-line: Starts with one of the builtin virtual paths:
      • "/usr/bin/python"
      • "/usr/local/bin/python"
      • "python"
    • Action: Isolate "version suffix" (if any) joined to end of virtual path. Example: "/usr/bin/python3.2" --> "3.2". Use version suffix (even if "") in locate_python(suffix). Run python (invoke5). Args: ???
    • Error exit: Fails if "version characters" are invalid X.Y, or python version unavailable.
    • Comments: Precludes custom version paths (ie: custom virtual shebang-line commands) based on these same directories, whose command name starts with "python". Eg: "/usr/bin/pythonABC" or even just "pythonABC".

  • Decision locations: parse_shebang, maybe_handle_shebang
    • py.exe args: ditto
    • script shebang-line: Is exactly builtin virtual path using env:
      • "/usr/bin/env python"
    • Action: find_on_path("python"), and run it (invoke4). Args: ???
    • Error exit: If no python on PATH, fall through...
    • Comments: Unix behavior in this case is to run env, with the first arg as the program for env to find on the PATH, and then run. Does not need to be specifically plain "python"

  • Decision locations: parse_shebang, maybe_handle_shebang
    • py.exe args: ditto
    • script shebang-line: Starts with builtin virtual path using env:
      • "/usr/bin/env python"
      Like: "/usr/bin/env python3.2"
    • Action: Proceeds as for other /builtin-virtual-paths/python cases. Ie: Uses locate_python(suffix), (not find_on_path()). Run python (invoke5).
    • Error exit: Fails if "version characters" are invalid X.Y, or python version unavailable.
    • Comments: Odd: 'env python' uses PATH, whereas 'env python2' uses locate_python().

  • Decision locations: parse_shebang, maybe_handle_shebang
    • py.exe args: ditto
    • script shebang-line: Not a builtin virtual path
    • Action: Take first "argument" in shebang-line, remove built-in prefix if present, look up result:
      • in PATH
      • in custom commands defined in py.ini [commands] section
      If found, run command (invoke3). Args: ???
    • Error exit: (no error)
    • Comments: Odd: command that happens to be on PATH cannot be overridden by command definition in py.ini.

  • Decision locations: parse_shebang, maybe_handle_shebang, process
    • script shebang-line: Present, but not a builtin, custom or PATH-findable command
    • Action: locate_python(""); invoke1.
    • Error exit: No python installed



Notes:
  • rest-CL: Rest of py.exe command-line, following parameter(s) that Launcher processed itself.
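The process() cases listed above can be summarized as a small Python classifier. This is a sketch based only on the steps recorded on this page, not a port of launcher.c. The function name classify_args and the case labels are hypothetical, and the handling of a command line containing only '-' options and no script is an assumption.

```python
import re

# -X or -X.Y, where X and Y are single digits, as described for process().
VERSION_ARG = re.compile(r"-(\d)(?:\.(\d))?")


def classify_args(args):
    """Return (case, detail) for the arguments that follow 'py'."""
    if not args:
        return ("default", "")         # locate_python(""), invoke1
    first = args[0]
    if VERSION_ARG.fullmatch(first):
        return ("version", first[1:])  # locate_python("X.Y"), pass rest-CL
    if first in ("-h", "--help"):
        return ("help", "")            # show py help, then python's help
    for arg in args:
        if not arg.startswith("-"):
            return ("script", arg)     # maybe_handle_shebang(scriptpath, ...)
    # Assumption: only '-' options and no script means the default python.
    return ("default", "")


print(classify_args(["-3.3", "script.py"]))
print(classify_args(["-u", "script.py", "x"]))
```

An argument such as "-3.x" does not match the -X.Y pattern, so it falls through to the script search, mirroring the "If not X.Y format, fall through..." step.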

locate_python algorithm

document here

py.ini configuration file format


[defaults]
python=3
python=3.2
python2=2.7
python3=3.2
[commands]
MyPython2=C:\mypythons\2.7.3\python.exe
MySpecialPython=C:\mypythons\3.3\python.exe
Notes:
  • Launcher does not interpret the default settings in multiple iterations. That is to say, specifying 'python=3', and then 'python3=3.2', results in Launcher interpreting a shebang-line containing just 'python' as a request for version 3, not for version 3.2 specifically.
    • Launcher's documentation is incorrect on this point (see the PY_PYTHON example under Potential surprises).
  • Does Launcher care about slash vs backslash???
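The non-iterative lookup of [defaults] can be illustrated in Python. This is a sketch using the standard configparser module as a stand-in for Launcher's own py.ini reader. The function name default_version_for is hypothetical, and the sample file is a reduced version of the one above.

```python
import configparser

SAMPLE_INI = r"""
[defaults]
python=3
python3=3.2

[commands]
MyPython2=C:\mypythons\2.7.3\python.exe
"""


def default_version_for(command, ini_text):
    """One lookup in [defaults]; the result is never looked up again."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(ini_text)
    return cfg.get("defaults", command, fallback="")


# 'python' resolves to "3". It is not chained through 'python3' to "3.2".
print(default_version_for("python", SAMPLE_INI))
```

A chained interpretation would have to feed the result "3" back in as the key "python3". The observed Launcher behavior corresponds to the single lookup shown here.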

Hard-coded Lists

builtin_virtual_paths:
  /usr/bin/env python
  /usr/bin/python
  /usr/local/bin/python
  python
 
builtin_prefixes:
  /usr/bin/env
  /usr/bin/
  /usr/local/bin
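How these two lists drive the shebang handling can be sketched in Python. This follows the steps recorded on this page and is not a port of parse_shebang or maybe_handle_shebang. The function name classify_shebang and the route labels are hypothetical. Two assumptions are made: each prefix is written with its trailing separator, and a valid version suffix is X or X.Y, optionally followed by -32. Arguments following the interpreter on the shebang line are not modeled.

```python
import re

BUILTIN_VIRTUAL_PATHS = [
    "/usr/bin/env python",
    "/usr/bin/python",
    "/usr/local/bin/python",
    "python",
]
# Assumption: each prefix carries its trailing separator, so that stripping
# it leaves a bare command name.
BUILTIN_PREFIXES = ["/usr/bin/env ", "/usr/bin/", "/usr/local/bin/"]
# Assumption: a valid suffix is X or X.Y, optionally followed by -32.
VERSION_SUFFIX = re.compile(r"\d(\.\d)?(-32)?")


def classify_shebang(command):
    """Return (route, detail) for the text that follows '#!'."""
    command = command.strip()
    if command == "/usr/bin/env python":
        return ("find_on_path", "python")
    for vpath in BUILTIN_VIRTUAL_PATHS:
        if command.startswith(vpath):
            # Whatever is joined to the virtual path is the version suffix.
            suffix = re.match(r"\S*", command[len(vpath):]).group(0)
            if suffix and not VERSION_SUFFIX.fullmatch(suffix):
                return ("error", suffix)
            return ("locate_python", suffix)
    for prefix in BUILTIN_PREFIXES:
        if command.startswith(prefix):
            command = command[len(prefix):]
            break
    tokens = command.split()
    if tokens:
        # Looked up on PATH first, then in py.ini [commands].
        return ("path_then_custom_command", tokens[0])
    return ("locate_python", "")


for line in ["/usr/bin/env python", "/usr/bin/env python2",
             "/usr/bin/env pythonSpecial", "/usr/bin/env Specialpython"]:
    print(line, "->", classify_shebang(line))
```

The sketch reproduces the asymmetry noted on this page: only the exact string "/usr/bin/env python" goes to PATH, and any command name beginning with "python" is consumed by the version-suffix check before custom commands are considered.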

Potential surprises

To be verified:
  • Default py command behaves differently than python command.
    When there is no py configuration (ie: no py.ini [defaults] section and no corresponding environment variables) and no shebang-line, the command line 'py script', or just plain 'script', uses locate_python's algorithm, which defaults to selecting version 2. In particular, this scenario uses locate_python rather than PATH, in contrast to 'python script', which does use PATH.
  • Version specifier X.Y-32 but no X.Y-64.
    • A request not specifying -32 means accept latest of either, and choose 64 if both 32 and 64 of same X.Y available
    • You can specify "latest 32", but not "latest 64".
    • So if you have installed 3.1-64 and 3.3-32, there's no way for a script to request and get the 64-bit python.
    • The implication is that scripts ought not have dependencies on 64-bit.
  • Version specificity is limited to X.Y, (not X.Y.Z)
    • In py.exe command line, py.ini [defaults], and shebang-line, versions can only be specified to X.Y, not X.Y.Z
    • locate_python algorithm doesn't appear to necessarily choose highest installed version within X.Y range. (request 3.2 might select 3.2.1 instead of 3.2.2)
    • This area of concern turns out not to be an issue because
      • The msi installer uses registry keys HKLM\SOFTWARE\(Wow6432Node)\Python\PythonCore\X.Y for each installation, so it stores only a single set of data for each X.Y, and apparently allows only a single X.Y version to be installed at a time.
        • Presumably the idea is that X.Y.n+1 is primarily a bug-fix relative to X.Y, and consequently should replace it, rather than both being available and selectable on the same machine.
        • (Not sure of installer algorithm when an attempt is made to install a lower-numbered version when a higher-numbered one is already installed.)
        • Additional same-X.Y Python directory trees could be put in place, and even pointed-to by PATH, and used. However, the msi-created registry keys would not reflect these other Python "installations", and they would not be found by Launcher's locate_python algorithm (though they might be found by Launcher's find_on_path algorithm).
  • There's no way for script or command line to request a version >= X.Y. To me this seems like the commonest scenario, where the script programmer knows that the script uses a python feature introduced in version X.Y. The script doesn't need version precisely X.Y, but it does need X.Y or above. (Probably limited to within the version 2 or the version 3 family.)
  • Custom virtual commands can't start with the word 'python'.
  • Custom virtual commands can't override same-named executables on PATH (including ones that could be found by virtue of PATHEXT).
  • py.exe only became available in 3.3.something. Installing a version of 3.x < 3.3 causes HKCR\.py --> HKCR\Python.File\shell\open\command and friends to point directly to \some\install\of\python, not to py.exe.
  • Launcher documentation page bitbucket.org/vinay.sajip/pylauncher/src/tip/Doc/launcher.rst#rst-header-customizing-default-python-versions says: "If PY_PYTHON=3 and PY_PYTHON3=3.1, the commands python and python3 will both use specifically 3.1". This is mistaken. The effect of these two variables is separate, not cumulative.
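The 32/64-bit surprise listed above can be reproduced with a short Python sketch. It models only the rule as stated on this page: a request without -32 takes the first match in the sorted list, and -32 restricts the match to 32-bit. It is not the actual locate_python code, and the default case of an empty request is deliberately left out. The names INSTALLED, matches and pick are hypothetical.

```python
# (version, bits), sorted highest version first, 64-bit before 32-bit.
INSTALLED = [("3.3", 32), ("3.1", 64), ("3.1", 32)]


def matches(version, wanted):
    """True if installed 'X.Y' satisfies a request for 'X' or 'X.Y'."""
    return version == wanted or version.startswith(wanted + ".")


def pick(request, installed=INSTALLED):
    """Return the first (version, bits) satisfying 'X', 'X.Y' or either + '-32'."""
    want_32 = request.endswith("-32")
    wanted = request[:-3] if want_32 else request
    for version, bits in installed:
        if matches(version, wanted) and (bits == 32 or not want_32):
            return (version, bits)
    return None


print(pick("3"))       # latest 3.x wins, even though it is 32-bit only
print(pick("3.1"))     # 64-bit preferred within the same X.Y
print(pick("3.1-32"))  # explicit 32-bit request
```

With this set of installations, a request for "3" yields the 32-bit 3.3. The only way to reach the 64-bit interpreter is to name 3.1 exactly, since there is no "-64" specifier.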

Issues in handling of shebang line

  • Shebang: /usr/bin/env python
    • Algorithm: PATH
    • Launcher treats '.../env python' like Unix would, by using PATH
  • Shebang: /usr/bin/env python2
    • Algorithm: locate_python
    • But odd that Launcher does not use PATH when processing '.../env pythonX.Y'. (Unlike under Unix).
  • Shebang: /usr/bin/env Specialpython
    • Algorithm: Transforms this to just 'Specialpython'. Look up on PATH, or failing that, custom command
    • But then again, '.../env arg_doesnt_start_with_python' does cause Launcher to perform a PATH search (and if that fails, looks for custom command).
  • Shebang: /usr/bin/env pythonSpecial
    • Algorithm: Fails validate_version
    • Thus pre-empts custom virtual shebang-line commands that begin with 'python'.
  • Shebang: python3.2.1
    • Algorithm: Fails validate_version
    • Attempts to define specific custom command for X.Y.Z would fail if they begin with 'python'
  • Shebang: python3.2-64
    • Algorithm: Fails validate_version


Code clarifications

  • It would be very helpful if comments were attached to all functions describing the expected input args and result values.
    • And for result values within a function to be named according to the meaning of the result.
  • A number of variables are named in ways that are either misleadingly or burdensomely ambiguous:
  • Function or variable: command, cmdline (several scopes)
    • Discussion: The word "command" is ambiguous because in this code it could refer to:
      • User's command line which invoked py
      • Launcher Virtual command: One of several unix-appropriate paths to python which py.exe understands as an instruction to use its locate_python algorithm. The list of these Virtual Command paths is hard-coded into py.exe
      • Custom command: A command defined by the user in one of the py.ini files. To be invoked in a script's shebang line
      • The command parsed from a script's shebang line
      • The command being assembled by py.exe to execute python
    • Because of this overloading of the word "command", all instances of 'command' or 'cmd' could be made more distinct
  • Function or variable: maybe_handle_shebang: is_virt
    • Better name: is_builtin
    • Discussion: This variable does not signify "is virtual command" -- a shebang line may contain a virtual command and this value ends up either True or False. Instead, this value signifies whether the shebang line contains a built-in virtual command.
  • Function or variable: locate_all_pythons
    • Better name: survey_all_pythons
    • Discussion: In this code 'locate' is used in two different ways: (a) "get info from registry and put it into the list of available pythons" and (b) "get an available version of python from the list". I suggest "survey" for the functions which read the registry and build the list, and "find" or "lookup" for those which read the list.
  • Function or variable: locate_pythons_for_key
    • Better name: survey_pythons_for_key
  • Function or variable: find_python_by_version
    • Better name: find_python_by_version
    • Discussion: (no name change)
  • Function or variable: find_existing_python
    • Better name: find_python_by_path
  • Function or variable: find_by_magic
    • Better name: find_python_by_magic
  • Function or variable: locate_python
    • Better name: find_best_python_for_version_request