By David Barr, barr@visi.com


This is one system administrator's view of why LD_LIBRARY_PATH, as frequently used, is bad. It is written from a SunOS 4.x/5.x (and to some extent Linux) point of view, but it also applies to most other UNIXes.


LD_LIBRARY_PATH is an environment variable you set to give the run-time shared library loader (ld.so) an extra set of directories to look in when searching for shared libraries. Multiple directories can be listed, separated with a colon (:). This list is prepended to the existing list of compiled-in loader paths for a given executable, and any system default loader paths.

For security reasons, LD_LIBRARY_PATH is ignored at runtime for executables that have their setuid or setgid bit set. This severely limits the usefulness of LD_LIBRARY_PATH.
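To make the search order concrete, here is a minimal Python sketch of the behavior described above: the directories from LD_LIBRARY_PATH go first, followed by the compiled-in paths and the system defaults, and the variable is dropped entirely for setuid/setgid executables. The function name, its parameters and the example directories are my own illustration, not part of any real loader. Real loaders differ in the details (they may treat empty entries specially, for instance), so treat this as a simplified model only.

```python
def effective_search_path(ld_library_path, compiled_in, system_defaults,
                          is_setuid=False):
    """Return the ordered directory list a loader would search.

    Simplified model of the description in the text, not a real loader.
    """
    extra = []
    if ld_library_path and not is_setuid:
        # Colon-separated list; this sketch simply skips empty entries.
        extra = [d for d in ld_library_path.split(":") if d]
    # LD_LIBRARY_PATH entries are prepended to everything else.
    return extra + list(compiled_in) + list(system_defaults)


if __name__ == "__main__":
    # All directories below are hypothetical examples.
    print(effective_search_path(
        "/opt/widgetman/lib:/usr/openwin/lib",
        compiled_in=["/usr/local/X11R6/lib"],
        system_defaults=["/usr/lib"],
    ))
    # Same call for a setuid executable: the variable is ignored.
    print(effective_search_path(
        "/opt/widgetman/lib:/usr/openwin/lib",
        compiled_in=["/usr/local/X11R6/lib"],
        system_defaults=["/usr/lib"],
        is_setuid=True,
    ))
```

The point to notice is that anything in LD_LIBRARY_PATH wins over the path the program was built with, which is exactly what makes a global setting so dangerous.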

Why was it invented?

There were a couple good reasons why it was invented:

As an often unwanted side effect, LD_LIBRARY_PATH will also be searched at link (ld) stage after directories specified with -L (also if no -L flag is given).

Some good examples of how LD_LIBRARY_PATH is used:

How has it been corrupted?

Too often people use it as a crutch for not doing the right thing (i.e. relying on the compiled-in path). Often programs (even commercial ones) are compiled without any run-time loader paths at all, forcing you to have LD_LIBRARY_PATH set or else the program won't run.

LD_LIBRARY_PATH is one of those insidious things that once it gets set globally for a user, things tend to happen which cause people to rely on it being set. Eventually when LD_LIBRARY_PATH needs to be changed or removed, mass breakage will occur!

How does the shared loader work?

SunOS 4.x uses major and minor revision numbers. If you have a library Xt, then it's named something like libXt.so.4.10 (major version 4, minor 10). If you update the library (to correct a bug, for example), you would install libXt.so.4.11 and applications would automatically use the new version. To do this, the loader must do a readdir() for every directory in the loader path and glob out the correct file name. This is quite expensive, especially if the directories are large, contain symlinks, and/or are located over NFS.

Linux, SunOS 5.x and most other SYSV variants use only major revision numbers. A library Xt is just named something like libXt.so.4. (Linux confuses things by generally using major/minor library file names, but always includes a symlink that is the actual library path referenced. So, for example, a library "libXt.so.6" is actually a symlink to "libXt.so.6.0". The linker/loader actually looks for "libXt.so.6".)

The loader works essentially the same except that you don't have minor library updates (you update the existing library) and the loader just does a stat() for each directory in the loader path. (This is much faster.)
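The difference between the two lookup styles can be sketched in Python. This is only a rough model of what is described above, not real loader code. The function names are made up for illustration, and the rule that the first directory containing any match wins is an assumption on my part. The first function has to read each whole directory to find the highest minor revision; the second only checks for one exact file name per directory.

```python
import os
import re
import tempfile


def find_major_minor(dirs, name, major):
    """SunOS 4.x style: scan each directory for the highest minor revision."""
    pattern = re.compile(r"^lib%s\.so\.%d\.(\d+)$" % (re.escape(name), major))
    for d in dirs:
        try:
            entries = os.listdir(d)  # reads the whole directory, like readdir()
        except OSError:
            continue  # unreadable or missing directory: move on
        best = None
        for entry in entries:
            m = pattern.match(entry)
            if m and (best is None or int(m.group(1)) > best[0]):
                best = (int(m.group(1)), os.path.join(d, entry))
        if best:
            # Assumption: the first directory with a match wins.
            return best[1]
    return None


def find_major_only(dirs, name, major):
    """SunOS 5.x / Linux style: one existence check per directory, like stat()."""
    filename = "lib%s.so.%d" % (name, major)
    for d in dirs:
        candidate = os.path.join(d, filename)
        if os.path.exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for f in ("libXt.so.4.10", "libXt.so.4.11", "libXt.so.4"):
            open(os.path.join(d, f), "w").close()
        print(find_major_minor([d], "Xt", 4))  # picks the 4.11 file
        print(find_major_only([d], "Xt", 4))   # picks the plain .4 file
```

Every extra directory in LD_LIBRARY_PATH adds one more full directory scan in the first style, which is where the cost over NFS comes from.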

The bad old days before separate run-time vs link-time paths

Nowadays you specify the run-time path for an executable at link stage with the -R (or sometimes -rpath) flag to ld. There's also LD_RUN_PATH which is an environment variable which acts to ld just like specifying -R.

Before all this you had only -L, which applied not only during compile time, but during run time as well. There was no way to say "use this directory during compile time" but "use this other directory at run time". There were some rather spectacular failure modes that one could get into because of this. For example, say you are building X11R6 in an NFS automounted directory /home/snoopy/src. X11R6 is made up of shared libraries as well as programs. The programs are compiled against the libraries when they are located in the build tree, not in their final installed location. Since the linker must resolve symbols at link time, you need a -L path that includes the link-time path in addition to the final run-time path of, say, /usr/local/X11R6/lib. Now all the programs which use shared libraries will look first in /home/snoopy/src for their libraries and then in the correct place. Now every time an X11R6 app starts up it NFS automounts its build directory! You probably removed the temporary build directory ages ago, but the loader will still search there. What's worse, if snoopy is down or no longer exists, no X11R6 apps will run! Bummer! Happily this all has been fixed, assuming your OS has a modern linker/loader. It can also be worked around by specifying the final run-time path first, before the build path, in the -L options.

Evil Case Study #1

My first experience with this breakage was under SunOS 4.x, with OpenWindows. For some dumb reason, a few Sun OpenWindows apps were not compiled with correct run-time loader paths, forcing you to have LD_LIBRARY_PATH set all the time. Remember, at this time, in the global OpenWindows startup scripts the system would automatically set your LD_LIBRARY_PATH to be $OPENWINHOME/lib.

Okay, how did it break? Well, it just so happens that this site also had compiled X11R4 from source, in /usr/local/X11R4. Things got really confusing because if you ever wanted to run the X11R4 apps, they would run against the OpenWindows libraries in /usr/openwin/lib, not the libraries in /usr/local/X11R4/lib! Things got even more confusing once X11R5 and then X11R6 came out. Now we had four different and often incompatible versions of a given shared library.

Hm. What do you do? If you set LD_LIBRARY_PATH to put OpenWindows first, then at best it slowed things down (since most people were running X11R5 and X11R6 stuff, searching for libraries in /usr/openwin/lib was a waste). At worst it caused spurious warnings ("ld.so: warning: libX11.x.y has older revision than expected z") or caused apps to break altogether due to incompatibilities. It also confused lots of people who were trying to compile X apps and forgot to use -L. What did I do? I whipped out emacs and binary edited the few OpenWindows apps which didn't have a correct run-time path compiled in, and changed it to the correct location in /usr/openwin/lib. (It should be noted that these tended to be apps which were fixed with system patches. Alas, it seems the guys who built the patched versions didn't have the same environment as the FCS guys.) I then changed all the startup scripts and removed any "setenv LD_LIBRARY_PATH" statements. I even put in an "unsetenv LD_LIBRARY_PATH" in my own .cshrc for good measure.

Evil Case Study #2

(based on a true story).

Due to licensing issues, it's common for commercial apps to ship in binary form a copy of the shared Motif library. Motif is a commercial product, and not all OS's come with it. It's a common toolkit for commercial programs to write applications against. It's also an evolving product, with ongoing bugfixes and new features.

Say application WidgetMan is one such application. In its startup script, it sets LD_LIBRARY_PATH to point to its copy of Motif so it uses that one when it runs. As it happens, WidgetMan is designed to launch other programs too. Unfortunately, when WidgetMan launches other apps, they inherit the LD_LIBRARY_PATH setting, and some Motif-based apps now break when run from WidgetMan because WidgetMan's Motif is incompatible with (but the same library version as) the system Motif library. Bummer!
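One way an application like WidgetMan could avoid passing its setting on is to drop LD_LIBRARY_PATH from the environment it hands to the programs it launches. The Python sketch below shows the idea. WidgetMan is of course not written this way as far as I know; the function names and the /opt/widgetman/lib path are hypothetical, and a child Python process stands in for the launched app.

```python
import os
import subprocess
import sys


def env_without_loader_override(env=None):
    """Copy an environment, dropping LD_LIBRARY_PATH so children don't inherit it."""
    clean = dict(os.environ if env is None else env)
    clean.pop("LD_LIBRARY_PATH", None)
    return clean


def launch_other_app(argv, env=None):
    """Run another program with the cleaned environment and return its output."""
    result = subprocess.run(
        argv,
        env=env_without_loader_override(env),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Pretend we are the launcher, started with a private library directory set.
    os.environ["LD_LIBRARY_PATH"] = "/opt/widgetman/lib"  # hypothetical path
    probe = [sys.executable, "-c",
             "import os; print(os.environ.get('LD_LIBRARY_PATH'))"]
    # The child reports what it sees; the variable has been removed for it.
    print(launch_other_app(probe))
```

The launcher keeps its own setting, but nothing it starts ever sees it, so the system Motif apps get the system Motif library.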

Imagine if you had followed what some clueless commercial install apps tell you to do and set LD_LIBRARY_PATH globally!

Half-hearted attempts to improve things

Some OS's (e.g. Linux) have a configurable loader. You can configure what run-time paths to look in by modifying /etc/ld.so.conf. This is almost as bad as LD_LIBRARY_PATH! Install scripts should never modify this file! This file should contain only the standard library locations as shipped with the OS.

Canonical rules for handling LD_LIBRARY_PATH

  1. Never ever set LD_LIBRARY_PATH globally.

  2. If you must ship binaries that use shared libraries and want to allow your clients to install the program outside a 'standard' location, do one of the following:
  3. If you are forced to set LD_LIBRARY_PATH, do so only as part of a wrapper.
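As a rough illustration of the wrapper rule, here is a minimal Python sketch that sets LD_LIBRARY_PATH for one program only, leaving the user's own environment untouched. A real wrapper is more commonly a short shell script; Python is used here only to keep the example self-contained. The names wrapper_env and run_wrapped and the /opt/widgetman/lib directory are hypothetical, and a child Python process stands in for the real binary.

```python
import os
import subprocess
import sys


def wrapper_env(private_libs, env):
    """Return a copy of env with private_libs prepended to LD_LIBRARY_PATH."""
    new_env = dict(env)
    existing = new_env.get("LD_LIBRARY_PATH")
    new_env["LD_LIBRARY_PATH"] = (
        private_libs + ":" + existing if existing else private_libs
    )
    return new_env


def run_wrapped(program, args, private_libs):
    """Run program so that it alone sees the private library directory."""
    return subprocess.call([program] + list(args),
                           env=wrapper_env(private_libs, os.environ))


if __name__ == "__main__":
    # Stand-in for the real binary: a child Python that reports what it sees.
    status = run_wrapped(
        sys.executable,
        ["-c", "import os; print(os.environ['LD_LIBRARY_PATH'])"],
        "/opt/widgetman/lib",  # hypothetical private library directory
    )
    # The wrapper's own environment was never modified.
    print("exit status:", status)
```

The setting lives exactly as long as the wrapped program does, so removing or changing it later affects only that one package.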

Some software packages make you install a symlink from the standard location pointing to the real location. While this 'works', it does not solve the problem. What if you need to have two versions installed? Not to mention the fact that many vendors seem to choose stupid locations as their 'standard' location (like putting them in '/' or '/usr'). This also typically makes things difficult for network installations, since even though you install an application on a network directory, you need to go around to every computer on the network and make a symlink.

Thoughts on improving LD_LIBRARY_PATH implementations in UNIX